WorldWideScience

Sample records for solving speech-understanding systems

  1. Speech Understanding with a New Implant Technology: A Comparative Study with a New Nonskin Penetrating Baha System

    Directory of Open Access Journals (Sweden)

    Anja Kurz

    2014-01-01

    Objective. To compare hearing and speech understanding between a new, nonskin-penetrating Baha system (Baha Attract) and the current Baha system using a skin-penetrating abutment. Methods. Hearing and speech understanding were measured in 16 experienced Baha users. The transmission path via the abutment was compared to a simulated Baha Attract transmission path by attaching the implantable magnet to the abutment and then adding a sample of artificial skin and the external parts of the Baha Attract system. Four different measurements were performed: bone conduction thresholds directly through the sound processor (BC Direct), aided sound field thresholds, aided speech understanding in quiet, and aided speech understanding in noise. Results. The simulated Baha Attract transmission path introduced an attenuation starting from approximately 5 dB at 1000 Hz and increasing to 20–25 dB above 6000 Hz. However, aided sound field thresholds showed smaller differences, and aided speech understanding in quiet and in noise did not differ significantly between the two transmission paths. Conclusion. The Baha Attract system transmission path introduces predominantly high-frequency attenuation. This attenuation can be partially compensated by adequate fitting of the speech processor. No significant decrease in speech understanding in either quiet or noise was found.

  2. Spoken Language Understanding Systems for Extracting Semantic Information from Speech

    CERN Document Server

    Tur, Gokhan

    2011-01-01

    Spoken language understanding (SLU) is an emerging field between speech and language processing, investigating human/machine and human/human communication by leveraging technologies from signal processing, pattern recognition, machine learning, and artificial intelligence. SLU systems are designed to extract the meaning from speech utterances, and their applications are vast, from voice search in mobile devices to meeting summarization, attracting interest from both commercial and academic sectors. Both human/machine and human/human communications can benefit from the application of SLU, using…

  3. Difficulty understanding speech in noise by the hearing impaired: underlying causes and technological solutions.

    Science.gov (United States)

    Healy, Eric W; Yoho, Sarah E

    2016-08-01

    A primary complaint of hearing-impaired individuals involves poor speech understanding when background noise is present. Hearing aids and cochlear implants often allow good speech understanding in quiet backgrounds. But hearing-impaired individuals are highly noise intolerant, and existing devices are not very effective at combating background noise. As a result, speech understanding in noise is often quite poor. In accord with the significance of the problem, considerable effort has been expended toward understanding and remedying this issue. Fortunately, our understanding of the underlying issues is reasonably good. In sharp contrast, effective solutions have remained elusive. One solution that seems promising involves a single-microphone machine-learning algorithm to extract speech from background noise. Data from our group indicate that the algorithm is capable of producing vast increases in speech understanding by hearing-impaired individuals. This paper will first provide an overview of the speech-in-noise problem and outline why hearing-impaired individuals are so noise intolerant. An overview of our approach to solving this problem will follow.
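
    The single-microphone approach described above is, at its core, a time-frequency masking scheme: a model estimates how speech-dominated each spectrogram bin is and attenuates the rest. The sketch below illustrates the idea with an oracle ideal ratio mask computed from known clean and noise signals (the quantity a learned model would be trained to estimate); it is a minimal illustration, not the authors' algorithm, and all parameter values are assumptions.

    ```python
    # A minimal sketch of time-frequency mask-based speech extraction, assuming
    # an oracle ideal ratio mask; names and parameters here are illustrative.
    import numpy as np
    from scipy.signal import stft, istft

    def ideal_ratio_mask(clean, noise, eps=1e-10):
        """Oracle mask: fraction of energy in each spectrogram bin that is speech.
        A deployed system would estimate this mask with a trained model."""
        _, _, S = stft(clean, nperseg=512)
        _, _, N = stft(noise, nperseg=512)
        return np.abs(S) ** 2 / (np.abs(S) ** 2 + np.abs(N) ** 2 + eps)

    def apply_mask(mixture, mask):
        """Attenuate noise-dominated bins of the mixture, then resynthesize."""
        _, _, X = stft(mixture, nperseg=512)
        _, x_hat = istft(mask * X, nperseg=512)
        return x_hat

    # Toy usage: a noisy tone "utterance" cleaned with the oracle mask.
    rng = np.random.default_rng(0)
    clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000.0)
    noise = 0.5 * rng.standard_normal(16000)
    enhanced = apply_mask(clean + noise, ideal_ratio_mask(clean, noise))
    ```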

  4. Speech Understanding with a New Implant Technology: A Comparative Study with a New Nonskin Penetrating Baha System

    OpenAIRE

    Kurz, Anja; Flynn, Mark; Caversaccio, Marco; Kompis, Martin

    2014-01-01

    Objective. To compare hearing and speech understanding between a new, nonskin penetrating Baha system (Baha Attract) to the current Baha system using a skin-penetrating abutment. Methods. Hearing and speech understanding were measured in 16 experienced Baha users. The transmission path via the abutment was compared to a simulated Baha Attract transmission path by attaching the implantable magnet to the abutment and then by adding a sample of artificial skin and the external parts of the Baha...

  5. A real-time spoken-language system for interactive problem-solving, combining linguistic and statistical technology for improved spoken language understanding

    Science.gov (United States)

    Moore, Robert C.; Cohen, Michael H.

    1993-09-01

    Under this effort, SRI has developed spoken-language technology for interactive problem solving, featuring real-time performance for up to several thousand word vocabularies, high semantic accuracy, habitability within the domain, and robustness to many sources of variability. Although the technology is suitable for many applications, efforts to date have focused on developing an Air Travel Information System (ATIS) prototype application. SRI's ATIS system has been evaluated in four ARPA benchmark evaluations, and has consistently been at or near the top in performance. These achievements are the result of SRI's technical progress in speech recognition, natural-language processing, and speech and natural-language integration.

  6. Toward a Natural Speech Understanding System

    Science.gov (United States)

    1989-10-01

    toward the monolingual English 25 msec value. Miyawaki et al. (1975) investigated the /ra/-/la/ continuum with English and Japanese speakers... Standard Dictionary. In order to evaluate some of the claims of the learning theory of speech recognition, a computer model was developed. The NEXus... discrimination of synthetic vowels. Language and Speech, 1962, 5, 171-189. Funk and Wagnalls New Standard Dictionary of the English Language. New York: Funk and...

  7. Self-regulatory speech during planning and problem-solving in children with SLI and their typically developing peers.

    Science.gov (United States)

    Abdul Aziz, Safiyyah; Fletcher, Janet; Bayliss, Donna M

    2017-05-01

    Past research with children with specific language impairment (SLI) has shown them to have poorer planning and problem-solving ability, and delayed self-regulatory speech (SRS) relative to their typically developing (TD) peers. However, the studies are few in number and are restricted in terms of the number and age range of participants, which limits our understanding of the nature and extent of any delays. Moreover, no study has examined the performance of a significant subset of children with SLI, those who have hyperactive and inattentive behaviours. This cross-sectional study aimed to compare the performance of young children with SLI (aged 4-7 years) with that of their TD peers on a planning and problem-solving task and to examine the use of SRS while performing the task. Within each language group, the performance of children with and without hyperactive and inattentive behaviours was further examined. Children with SLI (n = 91) and TD children (n = 81), with and without hyperactive and inattentive behaviours across the three earliest school years (Kindergarten, Preprimary and Year 1) were video-taped while they completed the Tower of London (TOL), a planning and problem-solving task. Their recorded speech was coded and analysed to look at differences in SRS and its relation to TOL performance across the groups. Children with SLI scored lower on the TOL than TD children. Additionally, children with hyperactive and inattentive behaviours performed worse than those without hyperactive and inattentive behaviours, but only in the SLI group. This suggests that children with SLI with hyperactive and inattentive behaviours experience a double deficit. Children with SLI produced less inaudible muttering than TD children, and showed no reduction in social speech across the first three years of school. Finally, for children with SLI, a higher percentage performed better on the TOL when they used SRS than when they did not. The results point towards a significant delay…

  8. Understanding the nature of apraxia of speech: Theory, analysis, and treatment

    Directory of Open Access Journals (Sweden)

    Kirrie J. Ballard

    2010-08-01

    Researchers have interpreted the behaviours of individuals with acquired apraxia of speech (AOS) as impairment of linguistic phonological processing, motor control, or both. Acoustic, kinematic, and perceptual studies of speech in more recent years have led to significant advances in our understanding of the disorder and wide acceptance that it affects phonetic-motoric planning of speech. However, newly developed methods for studying nonspeech motor control are providing new insights, indicating that the motor control impairment of AOS extends beyond speech and is manifest in nonspeech movements of the oral structures. We present the most recent developments in theory and methods to examine and define the nature of AOS. Theories of the disorder are then related to existing treatment approaches, and the efficacy of these approaches is examined. Directions for development of new treatments are posited. It is proposed that treatment programmes driven by a principled account of how the motor system learns to produce skilled actions will provide the most efficient and effective framework for treating motor-based speech disorders. In turn, well-controlled and theoretically motivated studies of treatment efficacy promise to stimulate further development of theoretical accounts and contribute to our understanding of AOS.

  9. How early do children understand gesture-speech combinations with iconic gestures?

    Science.gov (United States)

    Stanfield, Carmen; Williamson, Rebecca; Ozçalişkan, Seyda

    2014-03-01

    Children understand gesture+speech combinations in which a deictic gesture adds new information to the accompanying speech by age 1;6 (Morford & Goldin-Meadow, 1992; 'push'+point at ball). This study explores how early children understand gesture+speech combinations in which an iconic gesture conveys additional information not found in the accompanying speech (e.g., 'read'+BOOK gesture). Our analysis of two- to four-year-old children's responses in a gesture+speech comprehension task showed that children grasp the meaning of iconic co-speech gestures by age three and continue to improve their understanding with age. Overall, our study highlights the important role gesture plays in language comprehension as children learn to unpack increasingly complex communications addressed to them at the early ages.

  10. Do 6-Month-Olds Understand That Speech Can Communicate?

    Science.gov (United States)

    Vouloumanos, Athena; Martin, Alia; Onishi, Kristine H.

    2014-01-01

    Adults and 12-month-old infants recognize that even unfamiliar speech can communicate information between third parties, suggesting that they can separate the communicative function of speech from its lexical content. But do infants recognize that speech can communicate due to their experience understanding and producing language, or do they…

  11. An analysis of machine translation and speech synthesis in speech-to-speech translation system

    OpenAIRE

    Hashimoto, K.; Yamagishi, J.; Byrne, W.; King, S.; Tokuda, K.

    2011-01-01

    This paper provides an analysis of the impacts of machine translation and speech synthesis on speech-to-speech translation systems. The speech-to-speech translation system consists of three components: speech recognition, machine translation and speech synthesis. Many techniques for integration of speech recognition and machine translation have been proposed. However, speech synthesis has not yet been considered. Therefore, in this paper, we focus on machine translation and speech synthesis, ...

  12. Utility of TMS to understand the neurobiology of speech

    Directory of Open Access Journals (Sweden)

    Takenobu eMurakami

    2013-07-01

    According to a traditional view, speech perception and production are processed largely separately in sensory and motor brain areas. Recent psycholinguistic and neuroimaging studies provide novel evidence that the sensory and motor systems dynamically interact in speech processing, by demonstrating that speech perception and imitation share regional brain activations. However, the exact nature and mechanisms of these sensorimotor interactions are not completely understood yet. Transcranial magnetic stimulation (TMS) has often been used in the cognitive neurosciences, including speech research, as a complementary technique to behavioral and neuroimaging studies. Here we provide an up-to-date review focusing on TMS studies that explored speech perception and imitation. Single-pulse TMS of the primary motor cortex (M1) demonstrated a speech-specific and somatotopically specific increase of excitability of the M1 lip area during speech perception (listening to speech or lip reading). A paired-coil TMS approach showed increases in effective connectivity from brain regions that are involved in speech processing to the M1 lip area when listening to speech. TMS in virtual lesion mode applied to speech processing areas modulated performance of phonological recognition and imitation of perceived speech. In summary, TMS is an innovative tool to investigate processing of speech perception and imitation. TMS studies have provided strong evidence that the sensory system is critically involved in mapping sensory input onto motor output and that the motor system plays an important role in speech perception.

  13. Asymmetric Dynamic Attunement of Speech and Gestures in the Construction of Children's Understanding.

    Science.gov (United States)

    De Jonge-Hoekstra, Lisette; Van der Steen, Steffie; Van Geert, Paul; Cox, Ralf F A

    2016-01-01

    As children learn, they use their speech to express words and their hands to gesture. This study investigates the interplay between real-time gestures and speech as children construct cognitive understanding during a hands-on science task. Twelve children (M = 6, F = 6) from Kindergarten (n = 5) and first grade (n = 7) participated in this study. Each verbal utterance and gesture during the task was coded on a complexity scale derived from dynamic skill theory. To explore the interplay between speech and gestures, we applied a cross recurrence quantification analysis (CRQA) to the two coupled time series of the skill levels of verbalizations and gestures. The analysis focused on (1) the temporal relation between gestures and speech, (2) the relative strength and direction of the interaction between gestures and speech, (3) the relative strength and direction between gestures and speech for different levels of understanding, and (4) relations between CRQA measures and other child characteristics. The results show that older and younger children differ in the (temporal) asymmetry in the gestures-speech interaction. For younger children, the balance leans more toward gestures leading speech in time, while the balance leans more toward speech leading gestures for older children. Secondly, at the group level, speech attracts gestures in a more dynamically stable fashion than vice versa, and this asymmetry in gestures and speech extends to lower and higher understanding levels. Yet, for older children, the mutual coupling between gestures and speech is more dynamically stable regarding the higher understanding levels. Gestures and speech are more synchronized in time as children are older. A higher score on schools' language tests is related to speech attracting gestures more rigidly and more asymmetry between gestures and speech, only for the less difficult understanding levels. A higher score on math or past science tasks is related to less asymmetry between gestures and speech.
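
    The CRQA used above can be illustrated compactly: two coded time series are compared at every pair of time points, and the balance of recurrent points above versus below the main diagonal indicates which series tends to lead the other. The following is a minimal sketch with hypothetical skill-level codes, not the authors' analysis pipeline.

    ```python
    # A minimal cross-recurrence sketch over two coded time series; the toy
    # skill-level codes below are hypothetical, not data from the study.
    import numpy as np

    def cross_recurrence(x, y, radius=0):
        """1 where the two series visit (nearly) the same coded level."""
        return (np.abs(np.subtract.outer(x, y)) <= radius).astype(int)

    def leading_asymmetry(R):
        """Recurrence above vs. below the diagonal; positive values suggest
        the row series tends to lead the column series in time."""
        upper = np.triu(R, k=1).sum()
        lower = np.tril(R, k=-1).sum()
        return (upper - lower) / max(upper + lower, 1)

    gestures = np.array([1, 1, 2, 2, 3, 3, 4])  # hypothetical gesture levels
    speech = np.array([1, 1, 1, 2, 2, 3, 3])    # hypothetical speech levels
    print(leading_asymmetry(cross_recurrence(gestures, speech)))
    ```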

  14. Influence of musical training on understanding voiced and whispered speech in noise.

    Science.gov (United States)

    Ruggles, Dorea R; Freyman, Richard L; Oxenham, Andrew J

    2014-01-01

    This study tested the hypothesis that the previously reported advantage of musicians over non-musicians in understanding speech in noise arises from more efficient or robust coding of periodic voiced speech, particularly in fluctuating backgrounds. Speech intelligibility was measured in listeners with extensive musical training, and in those with very little musical training or experience, using normal (voiced) or whispered (unvoiced) grammatically correct nonsense sentences in noise that was spectrally shaped to match the long-term spectrum of the speech, and was either continuous or gated with a 16-Hz square wave. Performance was also measured in clinical speech-in-noise tests and in pitch discrimination. Musicians exhibited enhanced pitch discrimination, as expected. However, no systematic or statistically significant advantage for musicians over non-musicians was found in understanding either voiced or whispered sentences in either continuous or gated noise. Musicians also showed no statistically significant advantage in the clinical speech-in-noise tests. Overall, the results provide no evidence for a significant difference between young adult musicians and non-musicians in their ability to understand speech in noise.

  15. Auditory and Cognitive Factors Underlying Individual Differences in Aided Speech-Understanding among Older Adults

    Directory of Open Access Journals (Sweden)

    Larry E. Humes

    2013-10-01

    This study was designed to address individual differences in aided speech understanding among a relatively large group of older adults. The group of older adults consisted of 98 adults (50 female and 48 male) ranging in age from 60 to 86 (mean = 69.2). Hearing loss was typical for this age group and about 90% had not worn hearing aids. All subjects completed a battery of tests, including cognitive (6 measures), psychophysical (17 measures), and speech-understanding (9 measures) tests, as well as the Speech, Spatial and Qualities of Hearing (SSQ) self-report scale. Most of the speech-understanding measures made use of competing speech, and the non-speech psychophysical measures were designed to tap phenomena thought to be relevant for the perception of speech in competing speech (e.g., stream segregation, modulation-detection interference). All measures of speech understanding were administered with spectral shaping applied to the speech stimuli to fully restore audibility through at least 4000 Hz. The measures used were demonstrated to be reliable in older adults and, when compared to a reference group of 28 young normal-hearing adults, age-group differences were observed on many of the measures. Principal-components factor analysis was applied successfully to reduce the number of independent and dependent (speech-understanding) measures for a multiple-regression analysis. Doing so yielded one global cognitive-processing factor and five non-speech psychoacoustic factors (hearing loss, dichotic signal detection, multi-burst masking, stream segregation, and modulation detection) as potential predictors. To this set of six potential predictor variables were added subject age, Environmental Sound Identification (ESI), and performance on the text-recognition-threshold (TRT) task (a visual analog of interrupted speech recognition). These variables were used to successfully predict one global aided speech-understanding factor, accounting for about 60% of the variance.
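
    The factor-reduction-plus-regression strategy described above can be sketched in a few lines: reduce the many correlated measures to a small set of components, then regress the speech-understanding factor on them. The sketch below uses random placeholder data, with plain PCA and linear regression standing in for the principal-components factor analysis and multiple-regression analysis reported in the study; all numbers are illustrative.

    ```python
    # A sketch of factor reduction followed by regression, with random
    # placeholder data; PCA and linear regression stand in for the factor
    # analysis and multiple regression reported above.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(1)
    X = rng.standard_normal((98, 23))  # 98 listeners x 23 predictor measures
    y = rng.standard_normal(98)        # global aided speech-understanding factor

    factors = PCA(n_components=6).fit_transform(X)  # a handful of factors
    model = LinearRegression().fit(factors, y)
    print(model.score(factors, y))  # R^2 (the study reports ~0.60 on real data)
    ```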

  16. Understanding the Abstract Role of Speech in Communication at 12 Months

    Science.gov (United States)

    Martin, Alia; Onishi, Kristine H.; Vouloumanos, Athena

    2012-01-01

    Adult humans recognize that even unfamiliar speech can communicate information between third parties, demonstrating an ability to separate communicative function from linguistic content. We examined whether 12-month-old infants understand that speech can communicate before they understand the meanings of specific words. Specifically, we test the…

  17. Alternative Speech Communication System for Persons with Severe Speech Disorders

    Science.gov (United States)

    Selouani, Sid-Ahmed; Sidi Yakoub, Mohammed; O'Shaughnessy, Douglas

    2009-12-01

    Assistive speech-enabled systems are proposed to help both French- and English-speaking persons with various speech disorders. The proposed assistive systems use automatic speech recognition (ASR) and speech synthesis in order to enhance the quality of communication. These systems aim at improving the intelligibility of pathologic speech, making it as natural as possible and close to the original voice of the speaker. The resynthesized utterances use new basic units, a new concatenating algorithm, and a grafting technique to correct the poorly pronounced phonemes. The ASR responses are uttered by the new speech synthesis system in order to convey an intelligible message to listeners. Experiments involving four American speakers with severe dysarthria and two Acadian French speakers with sound substitution disorders (SSDs) are carried out to demonstrate the efficiency of the proposed methods. An improvement in the Perceptual Evaluation of Speech Quality (PESQ) value of 5% and of more than 20% is achieved by the speech synthesis systems that deal with SSD and dysarthria, respectively.

  18. INTEGRATING MACHINE TRANSLATION AND SPEECH SYNTHESIS COMPONENT FOR ENGLISH TO DRAVIDIAN LANGUAGE SPEECH TO SPEECH TRANSLATION SYSTEM

    Directory of Open Access Journals (Sweden)

    J. SANGEETHA

    2015-02-01

    This paper provides an interface between the machine translation and speech synthesis components of an English-to-Tamil speech-to-speech translation system. The speech translation system consists of three modules: automatic speech recognition, machine translation, and text-to-speech synthesis. Many procedures for integrating speech recognition and machine translation have been proposed, but the speech synthesis component has not yet been evaluated. In this paper, we therefore focus on the integration of machine translation and speech synthesis, and report a subjective evaluation to investigate the impact of the speech synthesis and machine translation components and of their integration. Here we implement a hybrid machine translation system (a combination of rule-based and statistical machine translation) and a concatenative syllable-based speech synthesis technique. In order to retain the naturalness and intelligibility of the synthesized speech, Auto Associative Neural Network (AANN) prosody prediction is used in this work. The results of this investigation demonstrate that the naturalness and intelligibility of the synthesized speech are strongly influenced by the fluency and correctness of the translated text.

  19. Development of Trivia Game for speech understanding in background noise.

    Science.gov (United States)

    Schwartz, Kathryn; Ringleb, Stacie I; Sandberg, Hilary; Raymer, Anastasia; Watson, Ginger S

    2015-01-01

    Listening in noise is an everyday activity and poses a challenge for many people. To improve the ability to understand speech in noise, a computerized auditory rehabilitation game was developed. In Trivia Game players are challenged to answer trivia questions spoken aloud. As players progress through the game, the level of background noise increases. A study using Trivia Game was conducted as a proof-of-concept investigation in healthy participants. College students with normal hearing were randomly assigned to a control (n = 13) or a treatment (n = 14) group. Treatment participants played Trivia Game 12 times over a 4-week period. All participants completed objective (auditory-only and audiovisual formats) and subjective listening in noise measures at baseline and 4 weeks later. There were no statistical differences between the groups at baseline. At post-test, the treatment group significantly improved their overall speech understanding in noise in the audiovisual condition and reported significant benefits in their functional listening abilities. Playing Trivia Game improved speech understanding in noise in healthy listeners. Significant findings for the audiovisual condition suggest that participants improved face-reading abilities. Trivia Game may be a platform for investigating changes in speech understanding in individuals with sensory, linguistic and cognitive impairments.

  20. An experimental Dutch keyboard-to-speech system for the speech impaired

    NARCIS (Netherlands)

    Deliege, R.J.H.

    1989-01-01

    An experimental Dutch keyboard-to-speech system has been developed to explore the possibilities and limitations of Dutch speech synthesis in a communication aid for the speech impaired. The system uses diphones and a formant synthesizer chip for speech synthesis. Input to the system is in…

  1. Asymmetric dynamic attunement of speech and gestures in the construction of children’s understanding

    Directory of Open Access Journals (Sweden)

    Lisette eDe Jonge-Hoekstra

    2016-03-01

    As children learn, they use their speech to express words and their hands to gesture. This study investigates the interplay between real-time gestures and speech as children construct cognitive understanding during a hands-on science task. Twelve children (M = 6, F = 6) from Kindergarten (n = 5) and first grade (n = 7) participated in this study. Each verbal utterance and gesture during the task was coded on a complexity scale derived from dynamic skill theory. To explore the interplay between speech and gestures, we applied a cross recurrence quantification analysis (CRQA) to the two coupled time series of the skill levels of verbalizations and gestures. The analysis focused on (1) the temporal relation between gestures and speech, (2) the relative strength and direction of the interaction between gestures and speech, (3) the relative strength and direction between gestures and speech for different levels of understanding, and (4) relations between CRQA measures and other child characteristics. The results show that older and younger children differ in the (temporal) asymmetry in the gestures-speech interaction. For younger children, the balance leans more towards gestures leading speech in time, while the balance leans more towards speech leading gestures for older children. Secondly, at the group level, speech attracts gestures in a more dynamically stable fashion than vice versa, and this asymmetry in gestures and speech extends to lower and higher understanding levels. Yet, for older children, the mutual coupling between gestures and speech is more dynamically stable regarding the higher understanding levels. Gestures and speech are more synchronized in time as children are older. A higher score on schools' language tests is related to speech attracting gestures more rigidly and more asymmetry between gestures and speech, only for the less difficult understanding levels. A higher score on math or past science tasks is related to less asymmetry between gestures and speech.

  2. Metaheuristic applications to speech enhancement

    CERN Document Server

    Kunche, Prajna

    2016-01-01

    This book serves as a basic reference for those interested in the application of metaheuristics to speech enhancement. The major goal of the book is to explain the basic concepts of optimization methods and their use in heuristic optimization for speech enhancement to scientists, practicing engineers, and academic researchers in speech processing. The authors discuss why it has been a challenging problem for researchers to develop new enhancement algorithms that improve the quality and intelligibility of degraded speech. They present powerful optimization methods for speech enhancement that can help to solve noise reduction problems. Readers will be able to understand the fundamentals of speech processing as well as the optimization techniques, learn how speech enhancement algorithms are implemented by utilizing optimization methods, and be given the tools to develop new algorithms. The authors also provide a comprehensive literature survey of the topic.
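
    As a concrete example of the kind of heuristic optimization such a book covers, the sketch below shows a bare-bones particle swarm optimizer tuning a single enhancement parameter (for instance, an over-subtraction factor in spectral subtraction). The cost function here is a hypothetical placeholder; a real system would score the quality or intelligibility of the enhanced speech.

    ```python
    # A bare-bones particle swarm optimizer; the cost function below is a
    # hypothetical stand-in for an objective scoring enhanced speech.
    import numpy as np

    def pso(cost, dim, n_particles=20, iters=100, lo=0.0, hi=5.0, seed=0):
        """Minimize cost over the box [lo, hi]^dim with a basic PSO."""
        rng = np.random.default_rng(seed)
        x = rng.uniform(lo, hi, (n_particles, dim))
        v = np.zeros_like(x)
        pbest, pcost = x.copy(), np.array([cost(p) for p in x])
        g = pbest[pcost.argmin()].copy()
        for _ in range(iters):
            r1, r2 = rng.random((2, n_particles, dim))
            v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (g - x)
            x = np.clip(x + v, lo, hi)
            c = np.array([cost(p) for p in x])
            better = c < pcost
            pbest[better], pcost[better] = x[better], c[better]
            g = pbest[pcost.argmin()].copy()
        return g

    # Toy usage: recover an assumed optimal over-subtraction factor of 2.5.
    print(pso(lambda p: (p[0] - 2.5) ** 2, dim=1))
    ```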

  3. Music training improves the ability to understand speech-in-noise in older adults

    OpenAIRE

    Belleville, Sylvie; Zendel, Benjamin; West, Greg; Peretz, Isabelle

    2017-01-01

    It is well known that hearing abilities decline with age, and one of the most commonly reported hearing difficulties in older adults is a reduced ability to understand speech in noisy environments. Older musicians have an enhanced ability to understand speech in noise, and this has been associated with enhanced brain responses related to both speech processing and the deployment of attention; however, the causal impact of music lessons in older adults is poorly understood. A sample of...

  4. Characterization of authorship speeches in classroom

    Directory of Open Access Journals (Sweden)

    Daniella de Almeida Santos

    2007-08-01

    Our paper intends to discuss how the teacher's speech can interfere in the construction of arguments on the part of the students when they are involved with the task of solving an experimental problem in science classes. Thus, we wanted to understand how teacher and students relate to each other in a discursive movement to structure the meaning of the experimental data obtained. With that concern, our focus is on the processes of speech authorship, both the students' and the teacher's, in the episodes in which the actors of the teaching and learning process organize their speeches, mediated by the experimental activity.

  5. Ability to solve riddles in patients with speech and language impairments after stroke.

    Science.gov (United States)

    Savić, Goran

    2016-01-01

    Successful riddle solving requires recognition of the meaning of words, attention, concentration, memory, connectivity and analysis of riddle content, and sufficiently developed associative thinking. The aim of the study was to determine the ability to solve riddles in stroke patients who do or do not have speech and language disorders (SLDs), to determine the presence of SLDs in relation to the lesion localization, and to define the relationship between riddle solving and functional impairment of a body side. The sample consisted of 88 patients. The data used included age, sex, educational level, time of stroke onset, presence of an SLD, lesion localization, and functional damage of the body side. The patients were presented with a task of solving 10 riddles. A significant SLD was present in 38.60% of the patients. Brain lesions were found distributed at 46 different brain sites. Patients with different lesion localizations had different success in solving riddles. Patients with perisylvian cortex brain lesions, or patients with Wernicke and global aphasia, had the poorest results. The group with SLDs had an average riddle-solving success of 26.76% (p = 0.000). The group with right-sided functional impairments had an average success of 37.14%, and the group with functional impairments of the left side of the body 56.88% (p = 0.002). Most patients with SLDs had a low ability to solve riddles. Most of the patients with left brain lesions and perisylvian cortex damage demonstrated lower ability in solving riddles relative to patients with right hemisphere lesions.

  6. Ability to solve riddles in patients with speech and language impairments after stroke

    Directory of Open Access Journals (Sweden)

    Savić Goran

    2016-01-01

    Introduction. Successful riddle solving requires recognition of the meaning of words, attention, concentration, memory, connectivity and analysis of riddle content, and sufficiently developed associative thinking. Objective. The aim of the study was to determine the ability to solve riddles in stroke patients who do or do not have speech and language disorders (SLDs), to determine the presence of SLDs in relation to the lesion localization, and to define the relationship between riddle solving and functional impairment of a body side. Methods. The sample consisted of 88 patients. The data used included age, sex, educational level, time of stroke onset, presence of an SLD, lesion localization, and functional damage of the body side. The patients were presented with a task of solving 10 riddles. Results. A significant SLD was present in 38.60% of the patients. Brain lesions were found distributed at 46 different brain sites. Patients with different lesion localizations had different success in solving riddles. Patients with perisylvian cortex brain lesions, or patients with Wernicke and global aphasia, had the poorest results. The group with SLDs had an average riddle-solving success of 26.76% (p = 0.000). The group with right-sided functional impairments had an average success of 37.14%, and the group with functional impairments of the left side of the body 56.88% (p = 0.002). Conclusion. Most patients with SLDs had a low ability to solve riddles. Most of the patients with left brain lesions and perisylvian cortex damage demonstrated lower ability in solving riddles relative to patients with right hemisphere lesions.

  7. Speech understanding in background noise with the two-microphone adaptive beamformer BEAM in the Nucleus Freedom Cochlear Implant System.

    Science.gov (United States)

    Spriet, Ann; Van Deun, Lieselot; Eftaxiadis, Kyriaky; Laneau, Johan; Moonen, Marc; van Dijk, Bas; van Wieringen, Astrid; Wouters, Jan

    2007-02-01

    This paper evaluates the benefit of the two-microphone adaptive beamformer BEAM in the Nucleus Freedom cochlear implant (CI) system for speech understanding in background noise by CI users. A double-blind evaluation of the two-microphone adaptive beamformer BEAM and a hardware directional microphone was carried out with five adult Nucleus CI users. The test procedure consisted of a pre- and post-test in the lab and a 2-wk trial period at home. In the pre- and post-test, the speech reception threshold (SRT) with sentences and the percentage correct phoneme scores for CVC words were measured in quiet and background noise at different signal-to-noise ratios. Performance was assessed for two different noise configurations (with a single noise source and with three noise sources) and two different noise materials (stationary speech-weighted noise and multitalker babble). During the 2-wk trial period at home, the CI users evaluated the noise reduction performance in different listening conditions by means of the SSQ questionnaire. In addition to the perceptual evaluation, the noise reduction performance of the beamformer was measured physically as a function of the direction of the noise source. Significant improvements of both the SRT in noise (average improvement of 5-16 dB) and the percentage correct phoneme scores (average improvement of 10-41%) were observed with BEAM compared to the standard hardware directional microphone. In addition, the SSQ questionnaire and subjective evaluation in controlled and real-life scenarios suggested a possible preference for the beamformer in noisy environments. The evaluation demonstrates that the adaptive noise reduction algorithm BEAM in the Nucleus Freedom CI-system may significantly increase the speech perception by cochlear implantees in noisy listening conditions. This is the first monolateral (adaptive) noise reduction strategy actually implemented in a mainstream commercial CI.
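
    Conceptually, a two-microphone adaptive beamformer of the kind evaluated above combines a fixed look-direction beam with an adaptive filter that cancels noise leaking in from other directions. The sketch below shows a minimal Griffiths-Jim-style adaptive noise canceller with a normalized LMS update; it illustrates the general principle only and is not the proprietary BEAM algorithm. Parameter values (32 taps, step size 0.1) are assumptions.

    ```python
    # A minimal two-microphone adaptive noise canceller (Griffiths-Jim
    # flavour); illustrative of the general principle, not the BEAM algorithm.
    import numpy as np

    def nlms_beamformer(mic1, mic2, taps=32, mu=0.1, eps=1e-8):
        """Sum channel keeps the frontal target; the difference channel is a
        noise reference that an NLMS filter subtracts from the sum."""
        target = 0.5 * (mic1 + mic2)      # fixed beamformer output
        noise_ref = 0.5 * (mic1 - mic2)   # blocks a frontal target
        w = np.zeros(taps)
        out = np.zeros_like(target)
        for n in range(taps, len(target)):
            u = noise_ref[n - taps:n][::-1]  # most recent samples first
            y = w @ u                        # noise estimate
            e = target[n] - y                # enhanced output sample
            w += mu * e * u / (u @ u + eps)  # normalized LMS update
            out[n] = e
        return out
    ```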

  8. Artificial intelligence, expert systems, computer vision, and natural language processing

    Science.gov (United States)

    Gevarter, W. B.

    1984-01-01

    An overview of artificial intelligence (AI), its core ingredients, and its applications is presented. The knowledge representation, logic, problem solving approaches, languages, and computers pertaining to AI are examined, and the state of the art in AI is reviewed. The use of AI in expert systems, computer vision, natural language processing, speech recognition and understanding, speech synthesis, problem solving, and planning is examined. Basic AI topics, including automation, search-oriented problem solving, knowledge representation, and computational logic, are discussed.

  9. Multimodal Speech Capture System for Speech Rehabilitation and Learning.

    Science.gov (United States)

    Sebkhi, Nordine; Desai, Dhyey; Islam, Mohammad; Lu, Jun; Wilson, Kimberly; Ghovanloo, Maysam

    2017-11-01

    Speech-language pathologists (SLPs) are trained to correct articulation of people diagnosed with motor speech disorders by analyzing articulators' motion and assessing speech outcome while patients speak. To assist SLPs in this task, we are presenting the multimodal speech capture system (MSCS) that records and displays kinematics of key speech articulators, the tongue and lips, along with voice, using unobtrusive methods. Collected speech modalities, tongue motion, lips gestures, and voice are visualized not only in real-time to provide patients with instant feedback but also offline to allow SLPs to perform post-analysis of articulators' motion, particularly the tongue, with its prominent but hardly visible role in articulation. We describe the MSCS hardware and software components, and demonstrate its basic visualization capabilities by a healthy individual repeating the words "Hello World." A proof-of-concept prototype has been successfully developed for this purpose, and will be used in future clinical studies to evaluate its potential impact on accelerating speech rehabilitation by enabling patients to speak naturally. Pattern matching algorithms to be applied to the collected data can provide patients with quantitative and objective feedback on their speech performance, unlike current methods that are mostly subjective, and may vary from one SLP to another.

  10. Plasticity in the Human Speech Motor System Drives Changes in Speech Perception

    Science.gov (United States)

    Lametti, Daniel R.; Rochet-Capellan, Amélie; Neufeld, Emily; Shiller, Douglas M.

    2014-01-01

    Recent studies of human speech motor learning suggest that learning is accompanied by changes in auditory perception. But what drives the perceptual change? Is it a consequence of changes in the motor system? Or is it a result of sensory inflow during learning? Here, subjects participated in a speech motor-learning task involving adaptation to altered auditory feedback and they were subsequently tested for perceptual change. In two separate experiments, involving two different auditory perceptual continua, we show that changes in the speech motor system that accompany learning drive changes in auditory speech perception. Specifically, we obtained changes in speech perception when adaptation to altered auditory feedback led to speech production that fell into the phonetic range of the speech perceptual tests. However, a similar change in perception was not observed when the auditory feedback that subjects received during learning fell into the phonetic range of the perceptual tests. This indicates that the central motor outflow associated with vocal sensorimotor adaptation drives changes to the perceptual classification of speech sounds. PMID:25080594

  11. When cognition kicks in: Working memory and speech understanding in noise

    Directory of Open Access Journals (Sweden)

    Jerker Ronnberg

    2010-01-01

    Perceptual load and cognitive load can be separately manipulated and dissociated in their effects on speech understanding in noise. The Ease of Language Understanding model assumes a theoretical position where perceptual task characteristics interact with the individual's implicit capacities to extract the phonological elements of speech. Phonological precision and speed of lexical access are important determinants for listening in adverse conditions. If there are mismatches between the phonological elements perceived and phonological representations in long-term memory, explicit working memory (WM)-related capacities will be continually invoked to reconstruct and infer the contents of the ongoing discourse. Whether this induces a high cognitive load or not will in turn depend on the individual's storage and processing capacities in WM. Data suggest that modulated noise maskers may serve as triggers for speech maskers and therefore induce a WM, explicit mode of processing. Individuals with high WM capacity benefit more than low-WM-capacity individuals from fast amplitude compression at low or negative input speech-to-noise ratios. The general conclusion is that there is an overarching interaction between the focal purpose of processing in the primary listening task and the extent to which a secondary, distracting task taps into these processes.

  12. Extensions to the Speech Disorders Classification System (SDCS)

    Science.gov (United States)

    Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.

    2010-01-01

    This report describes three extensions to a classification system for paediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). Part I describes a classification extension to the SDCS to differentiate motor speech disorders from speech delay and to differentiate among three sub-types of motor speech disorders.…

  13. Research on the optoacoustic communication system for speech transmission by variable laser-pulse repetition rates

    Science.gov (United States)

    Jiang, Hongyan; Qiu, Hongbing; He, Ning; Liao, Xin

    2018-06-01

    For optoacoustic communication from in-air platforms to submerged apparatus, a method based on speech recognition and variable laser-pulse repetition rates is proposed, which realizes character encoding and transmission for speech. First, the theory and spectrum characteristics of laser-generated underwater sound are analyzed; next, character conversion and encoding for speech, as well as the pattern of codes for laser modulation, are studied; finally, experiments to verify the system design are carried out. Results show that the optoacoustic system, in which laser modulation is controlled by speech-to-character baseband codes, improves flexibility in the receiving location of underwater targets as well as real-time performance in information transmission. In the overwater transmitter, a pulse laser is triggered by speech signals at several repetition rates randomly selected in the range of one to fifty Hz; in the underwater receiver, the laser pulse repetition rate and data can be recovered from the preamble and information codes of the corresponding laser-generated sound. When the energy of the laser pulse is appropriate, real-time transmission of speaker-independent speech can be realized in this way, which addresses the problem of limited underwater bandwidth and provides a technical approach for air-sea communication.
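
    The character-encoding step described above can be pictured as a codebook mapping recognized characters to pulse repetition rates, with a preamble rate marking message boundaries. The sketch below uses an entirely hypothetical rate assignment within the 1-50 Hz range mentioned in the abstract; the actual codes used by the authors are not given.

    ```python
    # Hypothetical codebook: each character maps to a laser-pulse repetition
    # rate (Hz) within the 1-50 Hz range; all rate assignments are illustrative.
    import string

    PREAMBLE_HZ = 1.0
    RATES = {c: 2.0 + i for i, c in enumerate(string.ascii_lowercase + " ")}

    def encode(text):
        """Turn recognized speech text into a schedule of pulse rates."""
        return [PREAMBLE_HZ] + [RATES[c] for c in text.lower() if c in RATES]

    def decode(rates):
        """Recover the text from the rates following the preamble."""
        inv = {v: k for k, v in RATES.items()}
        return "".join(inv.get(r, "") for r in rates[1:])

    print(decode(encode("sos")))  # -> "sos"
    ```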

  14. Perception of synthetic speech produced automatically by rule: Intelligibility of eight text-to-speech systems.

    Science.gov (United States)

    Greene, Beth G; Logan, John S; Pisoni, David B

    1986-03-01

    We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered.

  15. Perception of synthetic speech produced automatically by rule: Intelligibility of eight text-to-speech systems

    Science.gov (United States)

    GREENE, BETH G.; LOGAN, JOHN S.; PISONI, DAVID B.

    2012-01-01

    We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered. PMID:23225916

  16. Comprehension of synthetic speech and digitized natural speech by adults with aphasia.

    Science.gov (United States)

    Hux, Karen; Knollman-Porter, Kelly; Brown, Jessica; Wallace, Sarah E

    2017-09-01

    Using text-to-speech technology to provide simultaneous written and auditory content presentation may help compensate for chronic reading challenges if people with aphasia can understand synthetic speech output; however, inherent auditory comprehension challenges experienced by people with aphasia may make understanding synthetic speech difficult. This study's purpose was to compare the preferences and auditory comprehension accuracy of people with aphasia when listening to sentences generated with digitized natural speech, Alex synthetic speech (i.e., Macintosh platform), or David synthetic speech (i.e., Windows platform). The methodology required each of 20 participants with aphasia to select one of four images corresponding in meaning to each of 60 sentences comprising three stimulus sets. Results revealed significantly better accuracy given digitized natural speech than either synthetic speech option; however, individual participant performance analyses revealed three patterns: (a) comparable accuracy regardless of speech condition for 30% of participants, (b) comparable accuracy between digitized natural speech and one, but not both, synthetic speech option for 45% of participants, and (c) greater accuracy with digitized natural speech than with either synthetic speech option for remaining participants. Ranking and Likert-scale rating data revealed a preference for digitized natural speech and David synthetic speech over Alex synthetic speech. Results suggest many individuals with aphasia can comprehend synthetic speech options available on popular operating systems. Further examination of synthetic speech use to support reading comprehension through text-to-speech technology is thus warranted. Copyright © 2017 Elsevier Inc. All rights reserved.

  17. Speech Acquisition and Automatic Speech Recognition for Integrated Spacesuit Audio Systems

    Science.gov (United States)

    Huang, Yiteng; Chen, Jingdong; Chen, Shaoyan

    2010-01-01

    A voice-command human-machine interface system has been developed for spacesuit extravehicular activity (EVA) missions. A multichannel acoustic signal processing method has been created for distant speech acquisition in noisy and reverberant environments. This technology reduces noise by exploiting differences in the statistical nature of signal (i.e., speech) and noise that exist in the spatial and temporal domains. As a result, the automatic speech recognition (ASR) accuracy can be improved to the level at which crewmembers would find the speech interface useful. The developed speech human/machine interface will enable both crewmember usability and operational efficiency. It offers a fast rate of data/text entry in a small, lightweight package, and it frees the hands and eyes of a suited crewmember. The system components and steps include beamforming/multichannel noise reduction, single-channel noise reduction, speech feature extraction, feature transformation and normalization, feature compression, model adaptation, ASR HMM (Hidden Markov Model) training, and ASR decoding. A state-of-the-art phoneme recognizer can obtain an accuracy rate of 65 percent when the training and testing data are free of noise. When it is used in spacesuits, the rate drops to about 33 percent. With the developed microphone array speech-processing technologies, the performance is improved and the phoneme recognition accuracy rate rises to 44 percent. The recognizer can be further improved by combining the microphone array and HMM model adaptation techniques and using speech samples collected from inside spacesuits. In addition, arithmetic complexity models for the major HMM-based ASR components were developed. They can help real-time ASR system designers select proper tasks when in the face of constraints in computational resources.
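
    The front end of a pipeline like the one listed above turns each audio frame into a compact spectral feature vector before HMM training and decoding. The sketch below computes minimal log-mel filterbank features; it is a generic illustration under assumed parameter values (16 kHz audio, 512-point FFT, 23 filters), not the flight system's implementation.

    ```python
    # Minimal log-mel filterbank front end under assumed parameters;
    # illustrative of generic ASR feature extraction only.
    import numpy as np

    def log_mel_features(x, fs=16000, n_fft=512, hop=160, n_mels=23):
        # Frame the signal and take magnitude spectra of windowed frames.
        frames = np.lib.stride_tricks.sliding_window_view(x, n_fft)[::hop]
        spec = np.abs(np.fft.rfft(frames * np.hamming(n_fft), axis=1))
        # Triangular filters spaced evenly on the mel scale.
        mel = lambda f: 2595 * np.log10(1 + f / 700)
        imel = lambda m: 700 * (10 ** (m / 2595) - 1)
        edges = imel(np.linspace(mel(0), mel(fs / 2), n_mels + 2))
        bins = np.floor((n_fft + 1) * edges / fs).astype(int)
        fb = np.zeros((n_mels, n_fft // 2 + 1))
        for i in range(n_mels):
            l, c, r = bins[i], bins[i + 1], bins[i + 2]
            if c > l:
                fb[i, l:c] = np.linspace(0, 1, c - l, endpoint=False)
            if r > c:
                fb[i, c:r] = np.linspace(1, 0, r - c)
        return np.log(spec @ fb.T + 1e-10)

    # One second of noise -> (frames, 23) feature matrix.
    print(log_mel_features(np.random.default_rng(0).standard_normal(16000)).shape)
    ```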

  18. Can You Understand Me? Speaking Robots and Accented Speech

    Science.gov (United States)

    Moussalli, Souheila; Cardoso, Walcir

    2017-01-01

    The results of our previous research on the pedagogical use of Speaking Robots (SRs) revealed positive effects on motivating students to practice their oral skills in a stress-free environment. However, our findings indicated that the SR was sometimes unable to understand students' foreign accented speech. In this paper, we report the results of a…

  19. An Investigation of Secondary Teachers’ Understanding and Belief on Mathematical Problem Solving

    Science.gov (United States)

    Yuli Eko Siswono, Tatag; Wachidul Kohar, Ahmad; Kurniasari, Ika; Puji Astuti, Yuliani

    2016-02-01

    Weaknesses in problem solving among Indonesian students, as reported by recent international surveys, give rise to questions about how Indonesian teachers bring the idea of problem solving into mathematics lessons. An explorative study was undertaken to investigate how secondary teachers who teach mathematics at the junior high school level understand mathematical problem solving and what beliefs they hold about it. Participants were teachers from four cities in East Java province, comprising 45 state teachers and 25 private teachers. Data were obtained through questionnaires and a written test. The results of this study point out that the teachers understand pedagogical problem-solving knowledge well, as indicated by the high scores of observed teachers' responses showing understanding of problem solving as instruction as well as implementation of problem solving in teaching practice. However, they understand problem-solving content knowledge, such as problem-solving strategies and the meaning of a problem itself, less well. Regarding teachers' difficulties, teachers admitted to most frequently failing in (1) determining a precise mathematical model or strategy when carrying out problem-solving steps, which is supported by test-result data that revealed transformation errors as the most frequently observed errors in teachers' work, and (2) choosing a suitable real situation when designing a context-based problem-solving task. Meanwhile, analysis of teachers' beliefs on problem solving shows that teachers tend to view both mathematics and how students should learn mathematics from a static perspective, while they tend to believe in applying the idea of problem solving as a dynamic approach when teaching mathematics.

  20. Shape understanding system machine understanding and human understanding

    CERN Document Server

    Les, Zbigniew

    2015-01-01

    This is the third book presenting selected results of research on the further development of the shape understanding system (SUS) carried out by the authors in the newly founded Queen Jadwiga Research Institute of Understanding. In this book the new term Machine Understanding is introduced, referring to a new area of research aiming to investigate the possibility of building machines with the ability to understand. It is argued that SUS needs to some extent to mimic human understanding, and for this reason machines are evaluated according to the rules applied for the evaluation of human understanding. The book shows how to formulate problems and how to test whether the machine is able to solve them.

  1. The Contribution of Cognitive Factors to Individual Differences in Understanding Noise-Vocoded Speech in Young and Older Adults

    Directory of Open Access Journals (Sweden)

    Stephanie Rosemann

    2017-06-01

    Noise-vocoded speech is commonly used to simulate the sensation after cochlear implantation, as it consists of spectrally degraded speech. High individual variability exists in learning to understand both noise-vocoded speech and speech perceived through a cochlear implant (CI). This variability is partly ascribed to differing cognitive abilities like working memory, verbal skills, or attention. Although clinically highly relevant, up to now no consensus has been achieved about which cognitive factors exactly predict the intelligibility of speech in noise-vocoded situations in healthy subjects or in patients after cochlear implantation. We aimed to establish a test battery that can be used to predict speech understanding in patients prior to receiving a CI. Young and old healthy listeners completed a noise-vocoded speech test in addition to cognitive tests tapping verbal memory, working memory, lexicon and retrieval skills, as well as cognitive flexibility and attention. Partial-least-squares analysis revealed that six variables were important to significantly predict vocoded-speech performance. These were the ability to perceive visually degraded speech tested by the Text Reception Threshold, vocabulary size assessed with the Multiple Choice Word Test, working memory gauged with the Operation Span Test, verbal learning and recall of the Verbal Learning and Retention Test, and task-switching abilities tested by the Comprehensive Trail-Making Test. Thus, these cognitive abilities explain individual differences in noise-vocoded speech understanding and should be considered when aiming to predict hearing-aid outcome.
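
    The predictive modelling step described above can be sketched with an off-the-shelf partial-least-squares regression: cognitive predictors in, vocoded-speech performance out. The data below are random placeholders standing in for the predictors identified in the study; the component count is an assumption.

    ```python
    # Partial-least-squares sketch with random placeholder data standing in
    # for the cognitive predictors named above.
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression

    rng = np.random.default_rng(2)
    X = rng.standard_normal((60, 6))  # 6 cognitive predictors per listener
    y = X @ rng.standard_normal(6) + 0.5 * rng.standard_normal(60)

    pls = PLSRegression(n_components=2).fit(X, y)
    print(pls.score(X, y))  # share of vocoded-speech variance explained (R^2)
    ```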

  2. Source-system windowing for speech analysis

    NARCIS (Netherlands)

    Yegnanarayana, B.; Satyanarayana Murthy, P.; Eggen, J.H.

    1993-01-01

    In this paper we propose a speech-analysis method to bring out characteristics of the vocal tract system in short segments which are much less than a pitch period. The method performs windowing in the source and system components of the speech signal and recombines them to obtain a signal reflecting...

  3. Understanding the Linguistic Characteristics of the Great Speeches

    OpenAIRE

    Mouritzen, Kristian

    2016-01-01

    This dissertation attempts to find the common traits of great speeches. It does so by closely examining the language of some of the most well-known speeches in the world. These speeches are presented in the book Speeches that Changed the World (2006) by Simon Sebag Montefiore. The dissertation specifically looks at four variables: the beginnings and endings of the speeches, the use of passive voice, the use of personal pronouns, and the difficulty of the language. These four variables are based on...

  4. Speech recognition systems on the Cell Broadband Engine

    Energy Technology Data Exchange (ETDEWEB)

    Liu, Y; Jones, H; Vaidya, S; Perrone, M; Tydlitat, B; Nanda, A

    2007-04-20

    In this paper we describe our design, implementation, and first results of a prototype connected-phoneme-based speech recognition system on the Cell Broadband Engine™ (Cell/B.E.). Automatic speech recognition decodes speech samples into plain text (other representations are possible) and must process samples at real-time rates. Fortunately, the computational tasks involved in this pipeline are highly data-parallel and can receive significant hardware acceleration from vector-streaming architectures such as the Cell/B.E. Identifying and exploiting these parallelism opportunities is challenging, but also critical to improving system performance. We observed, from our initial performance timings, that a single Cell/B.E. processor can recognize speech from thousands of simultaneous voice channels in real time, a channel density that is orders of magnitude greater than the capacity of existing software speech recognizers based on CPUs (central processing units). This result emphasizes the potential for Cell/B.E.-based speech recognition and will likely lead to the future development of production speech systems using Cell/B.E. clusters.
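
    A typical data-parallel hot spot in such a recognizer is acoustic scoring: evaluating every feature frame against every Gaussian in the acoustic model. The sketch below expresses this as one dense batched computation, the kind of kernel that vectorizes well on architectures like the Cell/B.E.; it is illustrative only and not the authors' implementation.

    ```python
    # Batched acoustic scoring as a dense, data-parallel kernel; an
    # illustration of the general workload, not the prototype described above.
    import numpy as np

    def log_gaussian_scores(feats, means, inv_vars, log_norm):
        """Log-likelihood of every frame under every diagonal-covariance
        Gaussian, computed as one batched operation."""
        diff2 = (feats[:, None, :] - means[None, :, :]) ** 2  # (F, G, D)
        return log_norm - 0.5 * np.einsum('fgd,gd->fg', diff2, inv_vars)

    # Toy usage: 100 frames of 39-dim features against 512 Gaussians.
    rng = np.random.default_rng(0)
    scores = log_gaussian_scores(rng.standard_normal((100, 39)),
                                 rng.standard_normal((512, 39)),
                                 np.ones((512, 39)),
                                 np.zeros(512))
    print(scores.shape)  # (100, 512)
    ```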

  5. Speect: a multilingual text-to-speech system

    CSIR Research Space (South Africa)

    Louw, JA

    2008-11-01

    Full Text Available This paper introduces a new multilingual text-to-speech system, which we call Speect (Speech synthesis with extensible architecture), aiming to address the shortcomings of using Festival as a research system and Flite as a deployment system in a...

  6. Done Wrong or Said Wrong? Young Children Understand the Normative Directions of Fit of Different Speech Acts

    Science.gov (United States)

    Rakoczy, Hannes; Tomasello, Michael

    2009-01-01

    Young children use and comprehend different kinds of speech acts from the beginning of their communicative development. But it is not clear how they understand the conventional and normative structure of such speech acts. In particular, imperative speech acts have a world-to-word direction of fit, such that their fulfillment means that the world…

  7. Investigation of the relationship between students' problem solving and conceptual understanding of electricity

    Science.gov (United States)

    Cobanoglu Aktan, Derya

    The purpose of this study was to investigate the relationship between students' qualitative problem solving and conceptual understanding of electricity. For the analysis, data were collected from observations of group problem solving, from homework artifacts, and from semi-structured interviews. The data for six undergraduate students were analyzed by qualitative research methods. The students in the study were found to use tools (such as computer simulations and formulas) differently from one another, and they made different levels of interpretation of the electricity representations. Consequently, each student had different problem-solving strategies. The students exhibited a wide range of levels of understanding of the electricity concepts. It was found that students' conceptual understandings and their problem-solving strategies were closely linked with one another. The students who tended to use multiple tools to make high-level interpretations of representations to arrive at a single solution exhibited a higher level of understanding than the students who tended to use tools to make low-level interpretations to reach a solution. This study demonstrates a relationship between conceptual understanding and problem-solving strategies. Similar to the results of the existing research on students' quantitative problem solving, it was found that students were able to give correct answers to some problems without fully understanding the concepts behind the problem. However, some problems required a conceptual understanding in order for a student to arrive at a correct answer. An implication of this study is that careful selection of qualitative questions is necessary for capturing high levels of conceptual understanding. Additionally, conceptual understanding among some types of problem solvers can be improved by activities or tasks that help them reflect on their problem-solving strategies and the tools they use.

  8. Contributions of speech science to the technology of man-machine voice interactions

    Science.gov (United States)

    Lea, Wayne A.

    1977-01-01

    Research in speech understanding was reviewed. Plans which include prosodics research, phonological rules for speech understanding systems, and continued interdisciplinary phonetics research are discussed. Improved acoustic phonetic analysis capabilities in speech recognizers are suggested.

  9. Understanding Political Influence in Modern-Era Conflict: A Qualitative Historical Analysis of Hassan Nasrallah’s Speeches

    Directory of Open Access Journals (Sweden)

    Reem Abu-Lughod

    2012-09-01

    Full Text Available This research examines and closely analyzes speeches delivered by Hezbollah’s secretary general and spokesman, Hassan Nasrallah, from a content analysis perspective. We reveal that three significant political events that occurred in Lebanon were impacted by the intensity of speeches delivered by Nasrallah: the 2006 War, the Doha Agreement, and the 2008 prisoner exchange. Data were collected from transcribed speeches and analyzed using a qualitative historical analysis. Furthermore, we use latent analysis to assess the underlying implications of Nasrallah’s speeches and identify the themes he uses to influence his audience.

  10. Preliminary Analysis of Automatic Speech Recognition and Synthesis Technology.

    Science.gov (United States)

    1983-05-01

    ...speech. Private industry, which sees a major market for improved speech recognition systems, is attempting to solve the problems involved in... manufacturer is able to market such a recognition system. A second requirement for the spotting of keywords in distress signals concerns the need for a...

  11. Influence of directionality and maximal power output on speech understanding with bone anchored hearing implants in single sided deafness

    OpenAIRE

    Krempaska, Silvia; Koval, Juraj; Schmid, Christoph; Pfiffner, Flurin; Kurz, Anja; Kompis, Martin

    2014-01-01

    Bone-anchored hearing implants (BAHI) are routinely used to alleviate the effects of the acoustic head shadow in single-sided sensorineural deafness (SSD). In this study, the influence of the directional microphone setting and the maximum power output of the BAHI sound processor on speech understanding in noise in a laboratory setting was investigated. Eight adult BAHI users with SSD participated in this pilot study. Speech understanding in noise was measured using a new Slovak speech-in-noi...

  12. Design and realisation of an audiovisual speech activity detector

    NARCIS (Netherlands)

    Van Bree, K.C.

    2006-01-01

    For many speech telecommunication technologies a robust speech activity detector is important. An audio-only speech detector will give false positives when the interfering signal is speech or has speech characteristics. The video modality is suitable for solving this problem. In this report the approach...

  13. A Development of a System Enables Character Input and PC Operation via Voice for a Physically Disabled Person with a Speech Impediment

    Science.gov (United States)

    Tanioka, Toshimasa; Egashira, Hiroyuki; Takata, Mayumi; Okazaki, Yasuhisa; Watanabe, Kenzi; Kondo, Hiroki

    We have designed and implemented a PC operation support system that enables a physically disabled person with a speech impediment to operate a computer by voice. Voice operation is an effective method for a physically disabled person with involuntary movement of the limbs and the head. We have applied a commercial speech recognition engine to develop our system for practical purposes. Adopting a commercial engine reduces development cost and will help make our system useful to other people with speech impediments. We have customized the commercial speech recognition engine so that it can recognize the utterances of a person with a speech impediment. We have restricted the words that the recognition engine recognizes and separated target words from words with similar pronunciations to avoid misrecognition. The huge number of words registered in commercial speech recognition engines causes frequent misrecognition of utterances by people with speech impediments, because their utterances are unclear and unstable. We have solved this problem by narrowing the choice of inputs down to a small number and also by registering the users' ambiguous pronunciations in addition to the original ones. To realize full character input and PC operation with a small number of words, we have designed multiple input modes with categorized dictionaries and have introduced two-step input in each mode except numeral input, enabling correct operation with a small vocabulary. The system we have developed is at a practical level. The first author of this paper is physically disabled with a speech impediment. Using this system, he has been able not only to input characters into the PC but also to operate the Windows system smoothly, and he uses it in his daily life; this paper was written by him with this system. At present, the speech recognition is customized to him. It is, however, possible to customize it for other users by changing the words and registering new pronunciations according to each user's utterances.
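
    The mode-plus-dictionary design described above can be sketched in a few lines. Everything here is hypothetical: the mode names, vocabulary, and alias spellings stand in for the per-user dictionaries the authors registered in their commercial engine, and a plain dictionary lookup stands in for the engine's recognition step.

        # Small per-mode dictionaries keep the active vocabulary tiny, and
        # user-specific alias pronunciations absorb unclear utterances.
        MODES = {
            "letter":  {"a": "a", "b": "b", "c": "c"},
            "number":  {"one": "1", "two": "2", "three": "3"},
            "control": {"enter": "<ENTER>", "back": "<BACKSPACE>"},
        }
        ALIASES = {"wun": "one", "bee": "b", "free": "three"}

        def interpret(mode_word, item_word):
            # Two-step input: the first utterance selects a mode (and its small
            # dictionary); the second selects an item within that mode.
            item = ALIASES.get(item_word, item_word)
            return MODES.get(mode_word, {}).get(item)

        print(interpret("number", "wun"))    # -> "1"
        print(interpret("control", "back"))  # -> "<BACKSPACE>"

    Restricting the active dictionary per mode is what makes recognition of unclear speech feasible: the engine only has to discriminate among a handful of deliberately dissimilar words at each step.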

  14. Improving Understanding of Emotional Speech Acoustic Content

    Science.gov (United States)

    Tinnemore, Anna

    Children with cochlear implants show deficits in identifying the emotional intent of utterances without facial or body language cues. A known limitation of cochlear implants is the inability to accurately portray the fundamental frequency contour of speech, which carries the majority of the information needed to identify emotional intent. Without reliable access to the fundamental frequency, other identifiable cues to vocal emotion could be used to guide therapies for training children with cochlear implants to better identify vocal emotion. The current study analyzed recordings of adults speaking neutral sentences with a set array of emotions in a child-directed and an adult-directed manner. The goal was to identify acoustic cues that contribute to emotion identification that may be enhanced in child-directed speech but are also present in adult-directed speech. Results of this study showed significant differences in the variation of the fundamental frequency, the variation of intensity, and the rate of speech among emotions and between intended audiences.
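
    The three cues found to differ (F0 variation, intensity variation, speaking rate) are straightforward to estimate from a recording. A rough sketch using librosa follows; the pitch range and the use of onset density as a speaking-rate proxy are assumptions of this sketch, not the study's method.

        import numpy as np
        import librosa

        def emotion_cues(y, sr):
            # F0 track via probabilistic YIN; NaNs mark unvoiced frames
            f0, _, _ = librosa.pyin(y, fmin=80, fmax=500, sr=sr)
            f0_variation = np.nanstd(f0)
            # Intensity variation from frame-level RMS energy, in dB
            rms = librosa.feature.rms(y=y)[0]
            intensity_variation = np.std(20 * np.log10(rms + 1e-10))
            # Crude speaking-rate proxy: acoustic onsets per second
            onsets = librosa.onset.onset_detect(y=y, sr=sr)
            rate = len(onsets) / (len(y) / sr)
            return f0_variation, intensity_variation, rate

    Comparing these three numbers across emotion categories and audience types (child- vs. adult-directed) is the kind of analysis the study reports.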

  15. Requirements for the evaluation of computational speech segregation systems

    DEFF Research Database (Denmark)

    May, Tobias; Dau, Torsten

    2014-01-01

    Recent studies on computational speech segregation reported improved speech intelligibility in noise when estimating and applying an ideal binary mask with supervised learning algorithms. However, an important requirement for such systems in technical applications is their robustness to acoustic...... associated with perceptual attributes in speech segregation. The results could help establish a framework for a systematic evaluation of future segregation systems....
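
    The ideal binary mask referred to above can be computed directly when the clean speech and the noise are available separately. A minimal sketch, with an illustrative local SNR criterion of 0 dB:

        import numpy as np
        from scipy.signal import stft, istft

        def ideal_binary_mask(speech, noise, fs=16000, lc_db=0.0, nperseg=512):
            # Local SNR in each time-frequency unit; keep units above the criterion
            _, _, S = stft(speech, fs, nperseg=nperseg)
            _, _, N = stft(noise, fs, nperseg=nperseg)
            snr_db = 20 * np.log10((np.abs(S) + 1e-12) / (np.abs(N) + 1e-12))
            mask = (snr_db > lc_db).astype(float)
            # Apply the mask to the mixture and resynthesize
            _, _, Y = stft(speech + noise, fs, nperseg=nperseg)
            _, enhanced = istft(Y * mask, fs, nperseg=nperseg)
            return mask, enhanced

    A segregation system has to estimate this mask from the mixture alone; how estimation errors behave across acoustic conditions is exactly the robustness question the abstract raises.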

  16. Impact of Hearing Aid Technology on Outcomes in Daily Life II: Speech Understanding and Listening Effort.

    Science.gov (United States)

    Johnson, Jani A; Xu, Jingjing; Cox, Robyn M

    2016-01-01

    Modern hearing aid (HA) devices include a collection of acoustic signal-processing features designed to improve listening outcomes in a variety of daily auditory environments. Manufacturers market these features at successive levels of technological sophistication. The features included in costlier premium hearing devices are designed to result in further improvements to daily listening outcomes compared with the features included in basic hearing devices. However, independent research has not substantiated such improvements. This research was designed to explore differences in speech-understanding and listening-effort outcomes for older adults using premium-feature and basic-feature HAs in their daily lives. For this participant-blinded, repeated, crossover trial, 45 older adults (mean age 70.3 years) with mild-to-moderate sensorineural hearing loss wore each of four pairs of bilaterally fitted HAs for 1 month. HAs were premium- and basic-feature devices from two major brands. After each 1-month trial, participants' speech-understanding and listening-effort outcomes were evaluated in the laboratory and in daily life. Three types of speech-understanding and listening-effort data were collected: measures of laboratory performance, responses to standardized self-report questionnaires, and participant diary entries about daily communication. The only statistically significant superiority for the premium-feature HAs occurred for listening effort in the loud laboratory condition and was demonstrated for only one of the tested brands. The predominant complaint of older adults with mild-to-moderate hearing impairment is difficulty understanding speech in various settings. The combined results of all the outcome measures used in this research suggest that, when fitted using scientifically based practices, both premium- and basic-feature HAs are capable of providing considerable, but essentially equivalent, improvements to speech understanding and listening effort in daily life.

  17. Variable Span Filters for Speech Enhancement

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Benesty, Jacob; Christensen, Mads Græsbøll

    2016-01-01

    In this work, we consider enhancement of multichannel speech recordings. Linear filtering and subspace approaches have been considered previously for solving the problem. The current linear filtering methods, although many variants exist, have limited control of noise reduction and speech...

  18. Automatic speech recognition (ASR) based approach for speech therapy of aphasic patients: A review

    Science.gov (United States)

    Jamal, Norezmi; Shanta, Shahnoor; Mahmud, Farhanahani; Sha'abani, MNAH

    2017-09-01

    This paper reviews state-of-the-art automatic speech recognition (ASR) based approaches for the speech therapy of aphasic patients. Aphasia is a condition in which the affected person suffers from a speech and language disorder resulting from a stroke or brain injury. Since there is a growing body of evidence indicating the possibility of improving the symptoms at an early stage, ASR based solutions are increasingly being researched for speech and language therapy. ASR is a technology that converts human speech into transcribed text by matching it against the system's library. This is particularly useful in speech rehabilitation therapies as it provides accurate, real-time evaluation of speech input from an individual with a speech disorder. ASR based approaches for speech therapy recognize the speech input from the aphasic patient and provide real-time feedback on their mistakes. However, the accuracy of ASR depends on many factors, such as phoneme recognition, speech continuity, speaker and environmental differences, as well as our depth of knowledge of human language understanding. Hence, the review examines recent developments in ASR technologies and their performance for individuals with speech and language disorders.

  19. Speech understanding in noise with integrated in-ear and muff-style hearing protection systems

    Directory of Open Access Journals (Sweden)

    Sharon M Abel

    2011-01-01

    Full Text Available Integrated hearing protection systems are designed to enhance free field and radio communications during military operations while protecting against the damaging effects of high-level noise exposure. A study was conducted to compare the effect of increasing the radio volume on the intelligibility of speech over the radios of two candidate systems, in-ear and muff-style, in 85-dBA speech babble noise presented free field. Twenty normal-hearing, English-fluent subjects, half male and half female, were tested in same gender pairs. Alternating as talker and listener, their task was to discriminate consonant-vowel-consonant syllables that contrasted either the initial or final consonant. Percent correct consonant discrimination increased with increases in the radio volume. At the highest volume, subjects achieved 79% with the in-ear device but only 69% with the muff-style device, averaged across the gender of listener/talker pairs and consonant position. Although there was no main effect of gender, female listener/talkers showed a 10% advantage for the final consonant and male listener/talkers showed a 1% advantage for the initial consonant. These results indicate that normal hearing users can achieve reasonably high radio communication scores with integrated in-ear hearing protection in moderately high-level noise that provides both energetic and informational masking. The adequacy of the range of available radio volumes for users with hearing loss has yet to be determined.

  20. Automatic Speech Recognition Systems for the Evaluation of Voice and Speech Disorders in Head and Neck Cancer

    Directory of Open Access Journals (Sweden)

    Andreas Maier

    2010-01-01

    Full Text Available In patients suffering from head and neck cancer, speech intelligibility is often restricted. For assessment and outcome measurements, automatic speech recognition systems have previously been shown to be appropriate for objective and quick evaluation of intelligibility. In this study we investigate the applicability of the method to speech disorders caused by head and neck cancer. Intelligibility was quantified by speech recognition on recordings of a standard text read by 41 German laryngectomized patients with cancer of the larynx or hypopharynx and 49 German patients who had suffered from oral cancer. The speech recognizer provides the percentage of correctly recognized words of a sequence, that is, the word recognition rate. Automatic evaluation was compared to perceptual ratings by a panel of experts and to an age-matched control group. Both patient groups showed significantly lower word recognition rates than the control group. Automatic speech recognition yielded word recognition rates that corresponded with the experts' evaluation of intelligibility at a significant level. Automatic speech recognition thus serves as a good, low-effort means to objectify and quantify the most important aspect of pathologic speech: intelligibility. The system was successfully applied to voice and speech disorders.
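
    The word recognition rate used as the outcome measure above is simple to compute once the recognizer's transcript is aligned against the reference text. A sketch using Python's standard-library sequence matcher as the aligner (an approximation of the alignment a dedicated scoring tool would perform):

        from difflib import SequenceMatcher

        def word_recognition_rate(reference, hypothesis):
            # Percentage of reference words that the recognizer got right
            ref = reference.lower().split()
            hyp = hypothesis.lower().split()
            blocks = SequenceMatcher(None, ref, hyp).get_matching_blocks()
            correct = sum(block.size for block in blocks)
            return 100.0 * correct / len(ref)

        print(word_recognition_rate("der nordwind und die sonne",
                                    "der wind und sonne"))  # -> 60.0

    Lower rates on patient recordings than on controls, tracking expert intelligibility ratings, is the study's central finding.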

  1. From birdsong to human speech recognition: bayesian inference on a hierarchy of nonlinear dynamical systems.

    Science.gov (United States)

    Yildiz, Izzet B; von Kriegstein, Katharina; Kiebel, Stefan J

    2013-01-01

    Our knowledge about the computational mechanisms underlying human learning and recognition of sound sequences, especially speech, is still very limited. One difficulty in deciphering the exact means by which humans recognize speech is that there are scarce experimental findings at a neuronal, microscopic level. Here, we show that our neuronal-computational understanding of speech learning and recognition may be vastly improved by looking at an animal model, i.e., the songbird, which faces the same challenge as humans: to learn and decode complex auditory input, in an online fashion. Motivated by striking similarities between the human and songbird neural recognition systems at the macroscopic level, we assumed that the human brain uses the same computational principles at a microscopic level and translated a birdsong model into a novel human sound learning and recognition model with an emphasis on speech. We show that the resulting Bayesian model with a hierarchy of nonlinear dynamical systems can learn speech samples such as words rapidly and recognize them robustly, even in adverse conditions. In addition, we show that recognition can be performed even when words are spoken by different speakers and with different accents, an everyday situation in which current state-of-the-art speech recognition models often fail. The model can also be used to qualitatively explain behavioral data on human speech learning and derive predictions for future experiments.

  2. From birdsong to human speech recognition: bayesian inference on a hierarchy of nonlinear dynamical systems.

    Directory of Open Access Journals (Sweden)

    Izzet B Yildiz

    Full Text Available Our knowledge about the computational mechanisms underlying human learning and recognition of sound sequences, especially speech, is still very limited. One difficulty in deciphering the exact means by which humans recognize speech is that there are scarce experimental findings at a neuronal, microscopic level. Here, we show that our neuronal-computational understanding of speech learning and recognition may be vastly improved by looking at an animal model, i.e., the songbird, which faces the same challenge as humans: to learn and decode complex auditory input, in an online fashion. Motivated by striking similarities between the human and songbird neural recognition systems at the macroscopic level, we assumed that the human brain uses the same computational principles at a microscopic level and translated a birdsong model into a novel human sound learning and recognition model with an emphasis on speech. We show that the resulting Bayesian model with a hierarchy of nonlinear dynamical systems can learn speech samples such as words rapidly and recognize them robustly, even in adverse conditions. In addition, we show that recognition can be performed even when words are spoken by different speakers and with different accents, an everyday situation in which current state-of-the-art speech recognition models often fail. The model can also be used to qualitatively explain behavioral data on human speech learning and derive predictions for future experiments.

  3. Contribution of auditory working memory to speech understanding in mandarin-speaking cochlear implant users.

    Science.gov (United States)

    Tao, Duoduo; Deng, Rui; Jiang, Ye; Galvin, John J; Fu, Qian-Jie; Chen, Bing

    2014-01-01

    To investigate how auditory working memory relates to speech perception performance by Mandarin-speaking cochlear implant (CI) users. Auditory working memory and speech perception were measured in Mandarin-speaking CI and normal-hearing (NH) participants. Working memory capacity was measured using forward digit span and backward digit span; working memory efficiency was measured using articulation rate. Speech perception was assessed with: (a) word-in-sentence recognition in quiet, (b) word-in-sentence recognition in speech-shaped steady noise at a +5 dB signal-to-noise ratio, (c) Chinese disyllable recognition in quiet, and (d) Chinese lexical tone recognition in quiet. Self-reported school rank regarding performance in schoolwork was also collected. There was large inter-subject variability in auditory working memory and speech performance for CI participants. Working memory and speech performance were significantly poorer for CI than for NH participants. All three working memory measures were strongly correlated with each other for both CI and NH participants. Partial correlation analyses were performed on the CI data while controlling for demographic variables. Working memory efficiency was significantly correlated only with sentence recognition in quiet when working memory capacity was partialled out. Working memory capacity was correlated with disyllable recognition and school rank when efficiency was partialled out. There was no correlation between working memory and lexical tone recognition in the present CI participants. Mandarin-speaking CI users experience significant deficits in auditory working memory and speech performance compared with NH listeners. The present data suggest that auditory working memory may contribute to CI users' difficulties in speech understanding. The present pattern of results with Mandarin-speaking CI users is consistent with previous auditory working memory studies with English-speaking CI users, suggesting that the lexical importance...

  4. Associations between speech understanding and auditory and visual tests of verbal working memory: effects of linguistic complexity, task, age, and hearing loss.

    Science.gov (United States)

    Smith, Sherri L; Pichora-Fuller, M Kathleen

    2015-01-01

    Listeners with hearing loss commonly report having difficulty understanding speech, particularly in noisy environments. Their difficulties could be due to auditory and cognitive processing problems. Performance on speech-in-noise tests has been correlated with reading working memory span (RWMS), a measure often chosen to avoid the effects of hearing loss. If the goal is to assess the cognitive consequences of listeners' auditory processing abilities, however, then listening working memory span (LWMS) could be a more informative measure. Some studies have examined the effects of different degrees and types of masking on working memory, but less is known about the demands placed on working memory depending on the linguistic complexity of the target speech or the task used to measure speech understanding in listeners with hearing loss. Compared to RWMS, LWMS measures using different speech targets and maskers may provide a more ecologically valid approach. To examine the contributions of RWMS and LWMS to speech understanding, we administered two working memory measures (a traditional RWMS measure and a new LWMS measure), and a battery of tests varying in the linguistic complexity of the speech materials, the presence of babble masking, and the task. Participants were a group of younger listeners with normal hearing and two groups of older listeners with hearing loss (n = 24 per group). There was a significant group difference and a wider range in performance on LWMS than on RWMS. There was a significant correlation between both working memory measures only for the oldest listeners with hearing loss. Notably, there were only a few significant correlations among the working memory and speech understanding measures. These findings suggest that working memory measures reflect individual differences that are distinct from those tapped by these measures of speech understanding.

  5. Associations between speech understanding and auditory and visual tests of verbal working memory: effects of linguistic complexity, task, age, and hearing loss

    Science.gov (United States)

    Smith, Sherri L.; Pichora-Fuller, M. Kathleen

    2015-01-01

    Listeners with hearing loss commonly report having difficulty understanding speech, particularly in noisy environments. Their difficulties could be due to auditory and cognitive processing problems. Performance on speech-in-noise tests has been correlated with reading working memory span (RWMS), a measure often chosen to avoid the effects of hearing loss. If the goal is to assess the cognitive consequences of listeners’ auditory processing abilities, however, then listening working memory span (LWMS) could be a more informative measure. Some studies have examined the effects of different degrees and types of masking on working memory, but less is known about the demands placed on working memory depending on the linguistic complexity of the target speech or the task used to measure speech understanding in listeners with hearing loss. Compared to RWMS, LWMS measures using different speech targets and maskers may provide a more ecologically valid approach. To examine the contributions of RWMS and LWMS to speech understanding, we administered two working memory measures (a traditional RWMS measure and a new LWMS measure), and a battery of tests varying in the linguistic complexity of the speech materials, the presence of babble masking, and the task. Participants were a group of younger listeners with normal hearing and two groups of older listeners with hearing loss (n = 24 per group). There was a significant group difference and a wider range in performance on LWMS than on RWMS. There was a significant correlation between both working memory measures only for the oldest listeners with hearing loss. Notably, there were only a few significant correlations among the working memory and speech understanding measures. These findings suggest that working memory measures reflect individual differences that are distinct from those tapped by these measures of speech understanding. PMID:26441769

  6. Frontal and temporal contributions to understanding the iconic co-speech gestures that accompany speech.

    Science.gov (United States)

    Dick, Anthony Steven; Mok, Eva H; Raja Beharelle, Anjali; Goldin-Meadow, Susan; Small, Steven L

    2014-03-01

    In everyday conversation, listeners often rely on a speaker's gestures to clarify any ambiguities in the verbal message. Using fMRI during naturalistic story comprehension, we examined which brain regions in the listener are sensitive to speakers' iconic gestures. We focused on iconic gestures that contribute information not found in the speaker's talk, compared with those that convey information redundant with the speaker's talk. We found that three regions - the triangular (IFGTr) and opercular (IFGOp) portions of the left inferior frontal gyrus, and the left posterior middle temporal gyrus (MTGp) - responded more strongly when gestures added information to nonspecific language, compared with when they conveyed the same information in more specific language; in other words, when gesture disambiguated speech as opposed to reinforcing it. An increased BOLD response was not found in these regions when the nonspecific language was produced without gesture, suggesting that IFGTr, IFGOp, and MTGp are involved in integrating semantic information across gesture and speech. In addition, we found that activity in the posterior superior temporal sulcus (STSp), previously thought to be involved in gesture-speech integration, was not sensitive to the gesture-speech relation. Together, these findings clarify the neurobiology of gesture-speech integration and contribute to an emerging picture of how listeners glean meaning from gestures that accompany speech. Copyright © 2012 Wiley Periodicals, Inc.

  7. Students’ Relational Understanding in Quadrilateral Problem Solving Based on Adversity Quotient

    Science.gov (United States)

    Safitri, A. N.; Juniati, D.; Masriyah

    2018-01-01

    This qualitative study aims to describe students' relational understanding in solving mathematics problems, viewed from the perspective of Adversity Quotient (AQ). The research subjects were three 7th-grade junior high school students, one from each AQ category: quitter, camper, and climber. Data were collected through problem-solving tasks and interviews. The results showed that (1) at the stage of understanding the problem, the subjects were able to state and write down what is known and asked, and to mention the concepts associated with the quadrilateral problem; (2) all three subjects devised a plan by linking concepts related to the quadrilateral problem; (3) all three subjects were able to solve the problem; and (4) all three subjects were able to look back at their answers. Thus the three subjects were able to understand the problem, devise a plan, carry out the plan and look back. However, the quitter and camper subjects were not able to give a reason for the steps they had taken.

  8. Speech-language pathology students' self-reports on voice training: easier to understand or to do?

    Science.gov (United States)

    Lindhe, Christina; Hartelius, Lena

    2009-01-01

    The aim of the study was to describe subjective ratings of the course 'Training of the student's own voice and speech' from a student-centred perspective. A questionnaire was completed after each of the six individual sessions. Six speech and language pathology (SLP) students rated how they perceived the practical exercises in terms of doing and understanding. The results showed that five of the six participants rated the exercises as significantly easier to understand than to do. The exercises were also rated as easier to do over time. Results are interpreted within a theoretical framework of approaches to learning. The findings support the importance of both the physical and reflective aspects of the voice training process.

  9. Computer-assisted CI fitting: Is the learning capacity of the intelligent agent FOX beneficial for speech understanding?

    Science.gov (United States)

    Meeuws, Matthias; Pascoal, David; Bermejo, Iñigo; Artaso, Miguel; De Ceulaer, Geert; Govaerts, Paul J

    2017-07-01

    The software application FOX ('Fitting to Outcome eXpert') is an intelligent agent that assists in the programming of cochlear implant (CI) processors. The current version utilizes a mixture of deterministic and probabilistic logic which is able to improve over time through a learning effect. This study aimed to assess whether this learning capacity yields measurable improvements in speech understanding. A retrospective study was performed on 25 consecutive CI recipients, with a median CI experience of 10 years, who came for their annual CI follow-up fitting session. All subjects were assessed by means of speech audiometry with open-set monosyllables at 40, 55, 70, and 85 dB SPL in quiet with their home MAP. Other psychoacoustic tests were executed depending on the audiologist's clinical judgment. The home MAP and the corresponding test results were entered into FOX. If FOX suggested MAP changes, they were implemented and another speech audiometry was performed with the new MAP. FOX suggested MAP changes in 21 subjects (84%). The within-subject comparison showed a significant median improvement of 10, 3, 1, and 7% at 40, 55, 70, and 85 dB SPL, respectively. All but two subjects showed an instantaneous improvement in their mean speech audiometric score. Persons with long-term CI use who received a FOX-assisted CI fitting at least 6 months earlier display improved speech understanding after MAP modifications recommended by the current version of FOX. This can be explained only by intrinsic improvements in FOX's algorithms resulting from learning. Such learning is an inherent feature of artificial intelligence, and it may yield measurable benefit in speech understanding even in long-term CI recipients.

  10. Source Separation via Spectral Masking for Speech Recognition Systems

    Directory of Open Access Journals (Sweden)

    Gustavo Fernandes Rodrigues

    2012-12-01

    Full Text Available In this paper we present an insight into the use of spectral masking techniques in the time-frequency domain as a preprocessing step for speech signal recognition. Speech recognition systems have their performance negatively affected in noisy environments or in the presence of other speech signals. The limits of these masking techniques for different levels of the signal-to-noise ratio are discussed. We show the robustness of the spectral masking techniques against four types of noise: white, pink, brown and human speech noise (bubble noise). The main contribution of this work is to analyze the performance limits of recognition systems using spectral masking. We obtain an increase of 18% in the speech hit rate when the speech signals were corrupted by other speech signals or bubble noise, at signal-to-noise ratios of approximately 1, 10 and 20 dB. On the other hand, applying the ideal binary masks to mixtures corrupted by white, pink and brown noise results in an average increase of 9% in the speech hit rate at the same signal-to-noise ratios. The experimental results suggest that the spectral masking techniques are more suitable for bubble noise, which is produced by human speech, than for white, pink and brown noise.
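
    Evaluations like this require mixing clean speech with each noise type at prescribed SNRs (here roughly 1, 10 and 20 dB). A small utility for that step, assuming equal-length mono signals:

        import numpy as np

        def mix_at_snr(speech, noise, snr_db):
            # Scale the noise so the mixture has the requested global SNR
            noise = noise[:len(speech)]
            p_speech = np.mean(speech ** 2)
            p_noise = np.mean(noise ** 2) + 1e-12
            gain = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
            return speech + gain * noise

    The masked mixtures are then fed to the recognizer, and the hit rate with and without masking is compared per noise type and SNR.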

  11. Speech Motor Control in Fluent and Dysfluent Speech Production of an Individual with Apraxia of Speech and Broca's Aphasia

    Science.gov (United States)

    van Lieshout, Pascal H. H. M.; Bose, Arpita; Square, Paula A.; Steele, Catriona M.

    2007-01-01

    Apraxia of speech (AOS) is typically described as a motor-speech disorder with clinically well-defined symptoms, but without a clear understanding of the underlying problems in motor control. A number of studies have compared the speech of subjects with AOS to the fluent speech of controls, but only a few have included speech movement data and if…

  12. Automatic Speech Recognition Systems for the Evaluation of Voice and Speech Disorders in Head and Neck Cancer

    OpenAIRE

    Andreas Maier; Tino Haderlein; Florian Stelzle; Elmar Nöth; Emeka Nkenke; Frank Rosanowski; Anne Schützenberger; Maria Schuster

    2010-01-01

    In patients suffering from head and neck cancer, speech intelligibility is often restricted. For assessment and outcome measurements, automatic speech recognition systems have previously been shown to be appropriate for objective and quick evaluation of intelligibility. In this study we investigate the applicability of the method to speech disorders caused by head and neck cancer. Intelligibility was quantified by speech recognition on recordings of a standard text read by 41 German laryngect...

  13. Speech and nonspeech: What are we talking about?

    Science.gov (United States)

    Maas, Edwin

    2017-08-01

    Understanding of the behavioural, cognitive and neural underpinnings of speech production is of interest theoretically, and is important for understanding disorders of speech production and how to assess and treat such disorders in the clinic. This paper addresses two claims about the neuromotor control of speech production: (1) speech is subserved by a distinct, specialised motor control system and (2) speech is holistic and cannot be decomposed into smaller primitives. Both claims have gained traction in recent literature, and are central to a task-dependent model of speech motor control. The purpose of this paper is to stimulate thinking about speech production, its disorders and the clinical implications of these claims. The paper poses several conceptual and empirical challenges for these claims - including the critical importance of defining speech. The emerging conclusion is that a task-dependent model is called into question as its two central claims are founded on ill-defined and inconsistently applied concepts. The paper concludes with discussion of methodological and clinical implications, including the potential utility of diadochokinetic (DDK) tasks in assessment of motor speech disorders and the contraindication of nonspeech oral motor exercises to improve speech function.

  14. Political Discourse Analysis Through Solving Problems of Graph Theory

    Directory of Open Access Journals (Sweden)

    Monica Patrut

    2010-03-01

    Full Text Available In this article, we show how, using graph theory, we can make a content analysis of political discourse. Assumptions of this analysis are:
    - we have a corpus of speech of each party or candidate;
    - we consider that speech conveys economic, political, socio-cultural values, these taking the form of words or word families;
    - we consider that there are interdependences between the values of a political discourse; they are given by the co-occurrence of two values, as words in the text, within a well defined fragment, or they are determined by the internal logic of political discourse;
    - established links between values in a political speech have associated positive numbers indicating the "power" of those links; these "powers" are defined according to both the number of co-occurrences of values, and the internal logic of the discourse where they occur.
    In this context we intend to highlight the following:
    (a) which is the dominant value in a political speech;
    (b) which groups of values have ties between them and have no connection with the rest;
    (c) which is the order in which political values should be set in order to obtain an equivalent but more synthetic speech compared to the already given one;
    (d) which are the links between values that form the "core" of the political speech.
    To solve these problems, we use the Political Analyst program. We then present the introductory graph-theory concepts needed to understand the analysis performed by the software, and finally the operation of the program itself. This paper extends the previous paper [6].
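
    Questions (a), (b) and (d) map directly onto standard graph computations: weighted degree centrality, connected components, and a maximum spanning tree over the link "powers". A sketch with networkx, using hypothetical values and co-occurrence weights rather than data from an actual speech:

        import networkx as nx

        # Hypothetical value co-occurrence weights extracted from one speech
        edges = [("economy", "jobs", 5), ("economy", "taxes", 3),
                 ("jobs", "taxes", 2), ("security", "borders", 4),
                 ("economy", "security", 1)]

        G = nx.Graph()
        G.add_weighted_edges_from(edges)

        # (a) dominant value: the node with the highest total link "power"
        dominant = max(G.degree(weight="weight"), key=lambda kv: kv[1])[0]
        # (b) groups of values tied to each other and separate from the rest
        groups = list(nx.connected_components(G))
        # (d) the "core" of the speech: the strongest links that still
        #     connect every value (a maximum spanning tree)
        core = nx.maximum_spanning_tree(G, weight="weight")

        print(dominant)   # economy
        print(groups)     # one component here; more if weak ties are absent
        print(sorted(core.edges(data="weight")))

    Question (c), ordering the values for a more synthetic but equivalent speech, can be read off a traversal of the spanning tree outward from the dominant value.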

  15. Primate vocal communication: a useful tool for understanding human speech and language evolution?

    Science.gov (United States)

    Fedurek, Pawel; Slocombe, Katie E

    2011-04-01

    Language is a uniquely human trait, and questions of how and why it evolved have been intriguing scientists for years. Nonhuman primates (primates) are our closest living relatives, and their behavior can be used to estimate the capacities of our extinct ancestors. As humans and many primate species rely on vocalizations as their primary mode of communication, the vocal behavior of primates has been an obvious target for studies investigating the evolutionary roots of human speech and language. By studying the similarities and differences between human and primate vocalizations, comparative research has the potential to clarify the evolutionary processes that shaped human speech and language. This review examines some of the seminal and recent studies that contribute to our knowledge regarding the link between primate calls and human language and speech. We focus on three main aspects of primate vocal behavior: functional reference, call combinations, and vocal learning. Studies in these areas indicate that despite important differences, primate vocal communication exhibits some key features characterizing human language. They also indicate, however, that some critical aspects of speech, such as vocal plasticity, are not shared with our primate cousins. We conclude that comparative research on primate vocal behavior is a very promising tool for deepening our understanding of the evolution of human speech and language, but much is still to be done as many aspects of monkey and ape vocalizations remain largely unexplored.

  16. Junior High School Students’ Understanding and Problem Solving Skills on the Topics of Line and Angles

    Science.gov (United States)

    Irsal, I. L.; Jupri, A.; Prabawanto, S.

    2017-09-01

    Lines and angles are important topics for developing geometry skills as well as broader mathematical skills such as understanding and problem solving. However, Indonesian researchers have found that Indonesian students' understanding and problem-solving skills in these topics remain low. This finding motivated an investigation of students' understanding and problem-solving skills on the topic of lines and angles. The study used a descriptive-qualitative approach, with an individual written test (essay) and interviews. Seventy-two 8th-grade students from a junior high school in Lembang took the written test, and 18 of them were interviewed. The results showed that most students had a good instrumental understanding of lines and angles in the same area, but almost all students had a low instrumental understanding of lines and angles in different areas. Almost all students had a low relational understanding. Almost all students also had low problem-solving skills, especially in devising and using a strategy to solve the problem and in looking back at their answers. These results point to the need for a meaningful learning strategy that helps students build their understanding and develop their problem-solving skills independently.

  17. Neuroscience-inspired computational systems for speech recognition under noisy conditions

    Science.gov (United States)

    Schafer, Phillip B.

    ...advantage of the neural representation's invariance in noise. The scheme centers on a speech similarity measure based on the longest common subsequence between spike sequences. The combined encoding and decoding scheme outperforms a benchmark system in extremely noisy acoustic conditions. Finally, I consider methods for decoding spike representations of continuous speech. To help guide the alignment of templates to words, I design a syllable detection scheme that robustly marks the locations of syllabic nuclei. The scheme combines SVM-based training with a peak selection algorithm designed to improve noise tolerance. By incorporating syllable information into the ASR system, I obtain strong recognition results in noisy conditions, although the performance in noiseless conditions is below the state of the art. The work presented here constitutes a novel approach to the problem of ASR that can be applied in the many challenging acoustic environments in which we use computer technologies today. The proposed spike-based processing methods can potentially be exploited in efficient hardware implementations and could significantly reduce the computational costs of ASR. The work also provides a framework for understanding the advantages of spike-based acoustic coding in the human brain.
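
    The longest-common-subsequence similarity at the heart of the decoder is a classic dynamic program. A minimal sketch, treating a spike pattern as a sequence of neuron indices in firing order (the representation details are assumptions of this sketch, not the dissertation's exact encoding):

        def lcs_length(a, b):
            # Longest common subsequence between two sequences
            m, n = len(a), len(b)
            dp = [[0] * (n + 1) for _ in range(m + 1)]
            for i in range(1, m + 1):
                for j in range(1, n + 1):
                    if a[i - 1] == b[j - 1]:
                        dp[i][j] = dp[i - 1][j - 1] + 1
                    else:
                        dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
            return dp[m][n]

        def spike_similarity(a, b):
            # Normalized so that identical sequences score 1.0
            return 2.0 * lcs_length(a, b) / (len(a) + len(b))

        template = [3, 7, 7, 1, 4, 9]      # stored word template
        observed = [3, 7, 1, 1, 4, 9, 2]   # noisy observation
        print(spike_similarity(template, observed))

    Because a subsequence match tolerates insertions and deletions, the measure degrades gracefully when noise adds or removes spikes, which is the property exploited for noise robustness.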

  18. Toward A Dual-Learning Systems Model of Speech Category Learning

    Directory of Open Access Journals (Sweden)

    Bharath eChandrasekaran

    2014-07-01

    Full Text Available More than two decades of work in vision posits the existence of dual learning systems for category learning. The reflective system uses working memory to develop and test rules for classifying in an explicit fashion, while the reflexive system operates by implicitly associating perception with actions that lead to reinforcement. Dual-learning-systems models hypothesize that in learning natural categories, learners initially use the reflective system and, with practice, transfer control to the reflexive system. The role of reflective and reflexive systems in auditory category learning, and more specifically in speech category learning, has not been systematically examined. In this article we describe a neurobiologically constrained dual-learning-systems theoretical framework that is currently being developed for speech category learning, and review recent applications of this framework. Using behavioral and computational modeling approaches, we provide evidence that speech category learning is predominantly mediated by the reflexive learning system. In one application, we explore the effects of normal aging on non-speech and speech category learning. We find an age-related deficit in reflective-optimal but not reflexive-optimal auditory category learning. Prominently, we find a large age-related deficit in speech learning. The computational modeling suggests that older adults are less likely to transition from simple, reflective, uni-dimensional rules to more complex, reflexive, multi-dimensional rules. In a second application we summarize a recent study examining auditory category learning in individuals with elevated depressive symptoms. We find a deficit in reflective-optimal and an enhancement in reflexive-optimal auditory category learning. Interestingly, individuals with elevated depressive symptoms also show an advantage in learning speech categories. We end with a brief summary and a description of a number of future directions.

  19. Speech enhancement theory and practice

    CERN Document Server

    Loizou, Philipos C

    2013-01-01

    With the proliferation of mobile devices and hearing devices, including hearing aids and cochlear implants, there is a growing and pressing need to design algorithms that can improve speech intelligibility without sacrificing quality. Responding to this need, Speech Enhancement: Theory and Practice, Second Edition introduces readers to the basic problems of speech enhancement and the various algorithms proposed to solve these problems. Updated and expanded, this second edition of the bestselling textbook broadens its scope to include evaluation measures and enhancement algorithms aimed at impr
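
    As a flavor of the algorithm families such a text covers, here is a minimal magnitude spectral subtraction enhancer, one of the most basic approaches to the problem. The noise-estimation assumption (speech-free initial frames) and parameter values are illustrative only:

        import numpy as np
        from scipy.signal import stft, istft

        def spectral_subtraction(noisy, fs=16000, noise_frames=10, floor=0.02):
            # Estimate the noise magnitude spectrum from initial frames assumed
            # to contain no speech, subtract it, and floor the result to limit
            # "musical noise" artifacts.
            _, _, Y = stft(noisy, fs, nperseg=512)
            mag, phase = np.abs(Y), np.angle(Y)
            noise_mag = mag[:, :noise_frames].mean(axis=1, keepdims=True)
            clean_mag = np.maximum(mag - noise_mag, floor * mag)
            _, enhanced = istft(clean_mag * np.exp(1j * phase), fs, nperseg=512)
            return enhanced

    Improving intelligibility, not just quality, is the hard part: simple subtraction like this typically improves perceived quality while leaving intelligibility unchanged or worse, which motivates the intelligibility-oriented algorithms and evaluation measures the book emphasizes.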

  20. Speech understanding in noise with an eyeglass hearing aid: asymmetric fitting and the head shadow benefit of anterior microphones.

    Science.gov (United States)

    Mens, Lucas H M

    2011-01-01

    To test speech understanding in noise using array microphones integrated in an eyeglass device and to test if microphones placed anteriorly at the temple provide better directivity than above the pinna. Sentences were presented from the front and uncorrelated noise from 45, 135, 225 and 315°. Fifteen hearing impaired participants with a significant speech discrimination loss were included, as well as 5 normal hearing listeners. The device (Varibel) improved speech understanding in noise compared to most conventional directional devices with a directional benefit of 5.3 dB in the asymmetric fit mode, which was not significantly different from the bilateral fully directional mode (6.3 dB). Anterior microphones outperformed microphones at a conventional position above the pinna by 2.6 dB. By integrating microphones in an eyeglass frame, a long array can be used resulting in a higher directionality index and improved speech understanding in noise. An asymmetric fit did not significantly reduce performance and can be considered to increase acceptance and environmental awareness. Directional microphones at the temple seemed to profit more from the head shadow than above the pinna, better suppressing noise from behind the listener.
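
    The directivity advantage of a longer microphone array can be seen in a delay-and-sum beam pattern. The sketch below compares a two-microphone array (roughly hearing-aid sized) with an eight-microphone array of the kind an eyeglass temple can accommodate; the geometry and frequency are illustrative, not the Varibel's actual design:

        import numpy as np

        def endfire_response(n_mics, spacing_m, freq_hz, angles_rad, c=343.0):
            # Delay-and-sum beam pattern of a uniform linear array steered
            # to endfire (toward the talker), vs. plane-wave arrival angle
            k = 2 * np.pi * freq_hz / c
            x = np.arange(n_mics) * spacing_m
            phases = k * np.outer(np.cos(angles_rad) - 1.0, x)
            return np.abs(np.exp(1j * phases).sum(axis=1)) / n_mics

        angles = np.linspace(0.0, np.pi, 181)
        short_array = endfire_response(2, 0.01, 2000, angles)
        long_array = endfire_response(8, 0.01, 2000, angles)
        # The 8-mic pattern has a much narrower main lobe toward 0 rad,
        # i.e., a higher directivity index and stronger suppression of
        # noise arriving from behind the listener.

    The longer aperture is precisely what the eyeglass frame makes possible, and it is the basis of the higher directionality index reported above.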

  1. Causal inference of asynchronous audiovisual speech

    Directory of Open Access Journals (Sweden)

    John F Magnotti

    2013-11-01

    Full Text Available During speech perception, humans integrate auditory information from the voice with visual information from the face. This multisensory integration increases perceptual precision, but only if the two cues come from the same talker; this requirement has been largely ignored by current models of speech perception. We describe a generative model of multisensory speech perception that includes this critical step of determining the likelihood that the voice and face information have a common cause. A key feature of the model is that it is based on a principled analysis of how an observer should solve this causal inference problem using the asynchrony between two cues and the reliability of the cues. This allows the model to make predictions about the behavior of subjects performing a synchrony judgment task, predictive power that does not exist in other approaches, such as post hoc fitting of Gaussian curves to behavioral data. We tested the model predictions against the performance of 37 subjects performing a synchrony judgment task viewing audiovisual speech under a variety of manipulations, including varying asynchronies, intelligibility, and visual cue reliability. The causal inference model outperformed the Gaussian model across two experiments, providing a better fit to the behavioral data with fewer parameters. Because the causal inference model is derived from a principled understanding of the task, model parameters are directly interpretable in terms of stimulus and subject properties.
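
    The causal inference step itself is compact: compare the likelihood of the measured audiovisual asynchrony under a common-cause model against an independent-causes model, weighted by a prior. A schematic sketch of this model class follows; the Gaussian forms and all parameter values are illustrative, not the fitted values from the study:

        import numpy as np
        from scipy.stats import norm

        def p_common_cause(asynchrony_ms, sigma_common=70.0,
                           sigma_separate=300.0, prior_common=0.5):
            # Likelihood of the asynchrony if voice and face share one cause
            # (tightly synchronized) vs. if they are unrelated (loosely spread)
            like_c1 = norm.pdf(asynchrony_ms, loc=0.0, scale=sigma_common)
            like_c2 = norm.pdf(asynchrony_ms, loc=0.0, scale=sigma_separate)
            post = like_c1 * prior_common
            return post / (post + like_c2 * (1.0 - prior_common))

        for dt in (0, 100, 250, 500):
            print(dt, round(p_common_cause(dt), 3))

    As the asynchrony grows, the posterior probability of a common cause falls, reproducing the qualitative shape of synchrony judgments; fitting the widths and the prior per subject is what makes the parameters interpretable in terms of stimulus and subject properties.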

  2. [Modeling developmental aspects of sensorimotor control of speech production].

    Science.gov (United States)

    Kröger, B J; Birkholz, P; Neuschaefer-Rube, C

    2007-05-01

    Detailed knowledge of the neurophysiology of speech acquisition is important for understanding the developmental aspects of speech perception and production and for understanding developmental disorders of speech perception and production. A computer-implemented neural model of sensorimotor control of speech production was developed. The model is capable of demonstrating in detail the neural functions of different cortical areas during speech production. (i) Two sensory and two motor maps or neural representations, together with the appertaining neural mappings or projections, establish the sensorimotor feedback control system. These maps and mappings are already formed and trained during the prelinguistic phase of speech acquisition. (ii) The feedforward sensorimotor control system comprises the lexical map (representations of the sounds, syllables, and words of the first language) and the mappings from the lexical to the sensory and motor maps. The training of the appertaining mappings forms the linguistic phase of speech acquisition. (iii) Three prelinguistic learning phases - i.e. silent mouthing, quasi-stationary vocalic articulation, and realisation of articulatory protogestures - can be defined on the basis of our simulation studies using the computational neural model. These learning phases can be associated with temporal phases of prelinguistic speech acquisition obtained from natural data. The neural model illuminates the detailed function of specific cortical areas during speech production. In particular, it can be shown that developmental disorders of speech production may result from a delayed or incorrect process within one of the prelinguistic learning phases defined by the neural model.

  3. Digitized Ethnic Hate Speech: Understanding Effects of Digital Media Hate Speech on Citizen Journalism in Kenya

    Directory of Open Access Journals (Sweden)

    Stephen Gichuhi Kimotho

    2016-06-01

    Full Text Available Ethnicity in Kenya permeates all spheres of life. However, it is in politics that ethnicity is most visible. Election time in Kenya often leads to ethnic competition and hatred, often expressed through various media. Ethnic hate speech characterized the 2007 general elections in party rallies and through text messages, emails, posters and leaflets. This resulted in widespread skirmishes that left over 1200 people dead and many displaced (KNHRC, 2008). In 2013, however, the new battle zone was the war of words on social media platforms. More than at any other time in Kenyan history, Kenyans poured vitriolic ethnic hate speech through digital media like Facebook, Twitter and blogs. Although scholars have studied the role and effects of mainstream media like television and radio in proliferating ethnic hate speech in Kenya (Michael Chege, 2008; Goldstein & Rotich, 2008a; Ismail & Deane, 2008; Jacqueline Klopp & Prisca Kamungi, 2007), little has been done in regard to social media. This paper investigated the nature of digitized hate speech by describing the forms of ethnic hate speech on social media in Kenya, and its effects on Kenyans' perception of ethnic entities, on ethnic conflict and on the ethics of citizen journalism. This study adopted a descriptive interpretive design and utilized Austin's Speech Act Theory, which explains the use of language to achieve desired purposes and direct behaviour (Tarhom & Miracle, 2013). Content published between January and April 2013 from six purposefully identified blogs was analysed. Questionnaires were used to collect data from university students, as they form a good sample of the Kenyan population, are most active on social media and are drawn from all parts of the country. Qualitative data were analysed using NVIVO 10 software, while responses from the questionnaire were analysed using IBM SPSS version 21. The findings indicated that Facebook and Twitter were the main platforms used to...

  4. Citizens as Censors: Understanding the Limits of Free Speech in India

    OpenAIRE

    Tjäder, Henriette

    2016-01-01

    This thesis aims to provide an understanding of the phenomenon of citizen censorship in India and its implications for free speech. It is especially concerned with public protests where groups of citizens demand government action in order to ban or censor controversial material. These groups tend to invoke feelings of offense or hurt religious sentiments as a justification for restriction. The point of departure of this thesis is research on social movement outcomes and the history of Indian ...

  5. Speech-driven environmental control systems--a qualitative analysis of users' perceptions.

    Science.gov (United States)

    Judge, Simon; Robertson, Zoë; Hawley, Mark; Enderby, Pam

    2009-05-01

    To explore users' experiences and perceptions of speech-driven environmental control systems (SPECS) as part of a larger project aiming to develop a new SPECS. The motivation for this part of the project was to add to the evidence base for the use of SPECS and to determine the key design specifications for a new speech-driven system from a user's perspective. Semi-structured interviews were conducted with 12 users of SPECS from around the United Kingdom. These interviews were transcribed and analysed using a qualitative method based on framework analysis. Reliability is the main influence on the use of SPECS. All the participants gave examples of occasions when their speech-driven system was unreliable; in some instances, this unreliability was reported as not being a problem (e.g., for changing television channels); however, it was perceived as a problem for more safety critical functions (e.g., opening a door). Reliability was cited by participants as the reason for using a switch-operated system as back up. Benefits of speech-driven systems focused on speech operation enabling access when other methods were not possible; quicker operation and better aesthetic considerations. Overall, there was a perception of increased independence from the use of speech-driven environmental control. In general, speech was considered a useful method of operating environmental controls by the participants interviewed; however, their perceptions regarding reliability often influenced their decision to have backup or alternative systems for certain functions.

  6. Reestablishing speech understanding through musical ear training after cochlear implantation: a study of the potential cortical plasticity in the brain

    DEFF Research Database (Denmark)

    Petersen, Bjørn; Mortensen, Malene V; Gjedde, Albert

    2009-01-01

    Cochlear implants (CIs) provide impressive speech perception for persons with severe hearing loss, but many CI recipients fail in perceiving speech prosody and music. Successful rehabilitation depends on cortical plasticity in the brain and postoperative measures. The present study evaluates the behavioral and neurologic effects of musical ear training on CI users' speech and music perception. The goal is to find and work out musical methods to improve CI users' auditory capabilities and, in a longer perspective, provide an efficient strategy for improving speech understanding for both adults...

  7. Will smart surveillance systems listen, understand and speak Slovene?

    Directory of Open Access Journals (Sweden)

    Simon Dobrišek

    2013-12-01

    Full Text Available The paper deals with the spoken language technologies that could enable so-called smart (intelligent) surveillance systems to listen, understand and speak Slovenian in the near future. Advanced computational methods of artificial perception and pattern recognition enable such systems to be, at least to some extent, aware of the environment, the presence of people and other phenomena that could be subject to surveillance. Speech is one such phenomenon that has the potential to be a key source of information in certain security situations. Technologies that enable automatic recognition of speech, of speakers, and of a speaker's psychophysical state through computer analysis of acoustic speech signals provide an entirely new dimension to the development of smart surveillance systems. Automatic recognition of spoken threats, of screaming and crying for help, and of a suspicious psychophysical state of a speaker gives such systems a degree of intelligent behaviour. The paper investigates the current state of development of these technologies, the requirements for and feasibility of using these systems with the Slovenian spoken language, and different possible security application scenarios. It also addresses the broader legal and ethical issues raised by the development and use of such technologies, especially as audio surveillance is one of the most sensitive issues of privacy protection.

  8. Electronic Control System Of Home Appliances Using Speech Command Words

    Directory of Open Access Journals (Sweden)

    Aye Min Soe

    2015-06-01

    Full Text Available Abstract The main idea of this paper is to develop a speech recognition system through which smart home appliances are controlled by spoken words. The spoken words chosen for recognition are "Fan On", "Fan Off", "Light On", "Light Off", "TV On" and "TV Off". The input of the system takes speech signals to control home appliances. The proposed system has two main parts: speech recognition and an electronic control system for the smart home appliances. Speech recognition is implemented in the MATLAB environment and contains two main modules: feature extraction and feature matching. Mel Frequency Cepstral Coefficients (MFCC) are used for feature extraction, and a Vector Quantization (VQ) approach using a clustering algorithm is applied for feature matching. In the electrical home appliances control system, an RF module carries the command signal from the PC to the microcontroller wirelessly. The microcontroller is connected to a driver circuit for the relay and motor. The input commands are recognized very well, and the system performs well in controlling home appliances by spoken words.
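
    A minimal sketch of the MFCC-plus-VQ pipeline the abstract describes, here in Python with librosa and scikit-learn standing in for the authors' MATLAB implementation; the file paths, codebook size, and function names are illustrative assumptions, not the paper's code.

    ```python
    # Sketch only: librosa/scikit-learn stand-ins for the MATLAB pipeline above.
    import numpy as np
    import librosa
    from sklearn.cluster import KMeans

    def mfcc_frames(wav_path, sr=16000, n_mfcc=13):
        """Load one utterance and return its MFCC frames (n_frames x n_mfcc)."""
        y, _ = librosa.load(wav_path, sr=sr)
        return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

    def train_codebook(train_paths, n_codewords=16):
        """Cluster one command's MFCC frames into a VQ codebook."""
        frames = np.vstack([mfcc_frames(p) for p in train_paths])
        return KMeans(n_clusters=n_codewords, n_init=10).fit(frames).cluster_centers_

    def vq_distortion(frames, codebook):
        """Mean distance from each frame to its nearest codeword."""
        d = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
        return d.min(axis=1).mean()

    def recognize(wav_path, codebooks):
        """Pick the command ('Fan On', 'TV Off', ...) with the lowest distortion."""
        frames = mfcc_frames(wav_path)
        return min(codebooks, key=lambda cmd: vq_distortion(frames, codebooks[cmd]))
    ```

    The recognized command string would then be sent over the RF link to the microcontroller driving the relays.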

  9. Do Children Understand the Basic Relationship Between Speech and Writing? The Mow-Motorcycle Test.

    Science.gov (United States)

    Rozin, Paul; And Others

    School children (N=218) who have not yet attained moderate reading fluency were tested for their awareness of a fundamental relationship between our writing system and speech: that the sounds of speech are represented in writing. Children were shown a long and short word written on a card (e.g., mow and motorcycle), and asked which word…

  10. Intelligibility of speech of children with speech and sound disorders

    OpenAIRE

    Ivetac, Tina

    2014-01-01

    The purpose of this study is to examine speech intelligibility of children with primary speech and sound disorders aged 3 to 6 years in everyday life. The research problem is based on the degree to which parents or guardians, immediate family members (sister, brother, grandparents), extended family members (aunt, uncle, cousin), child's friends, other acquaintances, child's teachers and strangers understand the speech of children with speech sound disorders. We examined whether the level ...

  11. SOME EXAMPLES OF APPLIED SYSTEMS WITH SPEECH INTERFACE

    Directory of Open Access Journals (Sweden)

    V. A. Zhitko

    2017-01-01

    Full Text Available Three examples of applied systems with a speech interface are considered in the article. The first two provide the end user with the opportunity to ask a question verbally and to hear the system's response, adding to the traditional I/O via keyboard and computer screen. The third example, the «IntonTrainer» system, provides the user with the possibility of voice interaction and is designed for in-depth self-learning of the intonation of oral speech.

  12. Didactic speech synthesizer – acoustic module, formants model

    OpenAIRE

    Teixeira, João Paulo; Fernandes, Anildo

    2013-01-01

    Text-to-speech synthesis is the main subject treated in this work. The constitution of a generic text-to-speech conversion system is presented, the functions of its various modules are explained, and the development techniques using the formant model are described. The development of a didactic formant synthesiser in the Matlab environment is also described. This didactic synthesiser is intended to support a didactic understanding of the formant model of speech production.
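
    For readers who want the formant model made concrete, here is a toy vowel synthesiser in the same spirit: a glottal pulse train filtered through cascaded second-order resonators placed at the formant frequencies. The original synthesiser runs in Matlab; this Python version and its formant values (rough textbook targets for /a/) are illustrative.

    ```python
    # Toy cascade formant synthesiser: pulse train -> resonators -> vowel.
    import numpy as np
    from scipy.signal import lfilter

    def resonator(x, freq, bw, fs):
        """Second-order IIR resonator centred at `freq` Hz with bandwidth `bw` Hz."""
        r = np.exp(-np.pi * bw / fs)
        theta = 2 * np.pi * freq / fs
        a = [1.0, -2 * r * np.cos(theta), r * r]
        return lfilter([1.0], a, x)

    def synthesize_vowel(formants, bandwidths, f0=120, dur=0.5, fs=16000):
        """Excite the resonator cascade with an impulse train at pitch f0."""
        n = int(dur * fs)
        source = np.zeros(n)
        source[::int(fs / f0)] = 1.0              # crude glottal pulse train
        out = source
        for f, bw in zip(formants, bandwidths):
            out = resonator(out, f, bw, fs)
        return out / np.max(np.abs(out))          # normalise amplitude

    # Rough formant targets for the vowel /a/ (illustrative values)
    vowel_a = synthesize_vowel([730, 1090, 2440], [90, 110, 170])
    ```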

  13. Indonesian Text-To-Speech System Using Diphone Concatenative Synthesis

    Directory of Open Access Journals (Sweden)

    Sutarman

    2015-02-01

    Full Text Available In this paper, we describe the design and development of an Indonesian diphone database for concatenative synthesis, using segments of recorded speech to convert text to speech and save it as an audio file such as WAV or MP3. Developing the Indonesian diphone database involves several steps. First, build the diphone inventory: create a list of sample words covering the diphones, prioritising diphones located in the middle of a word, otherwise at the beginning or end; record the sample words; segment the recordings; and create the diphones with the tool Diphone Studio 1.3. Second, develop the system using Microsoft Visual Delphi 6.0, including conversion from input numbers, acronyms, words, and sentences into diphone representations. Two kinds of conversion processes are involved in the Indonesian text-to-speech system: one converts the text to be sounded into phonemes, and the other converts the phonemes to speech. The method used in this research is diphone concatenative synthesis, in which recorded sound segments are collected; every segment consists of a diphone (two phonemes). This synthesiser can produce voice with a high level of naturalness. The Indonesian text-to-speech system can differentiate special phonemes as in 'Beda' and 'Bedak', but samples of other specific words need to be added to the system. The system can also handle texts with abbreviations, and there is a facility to add such words.
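
    The concatenation step itself is simple enough to sketch. Below, the diphone inventory is assumed to be a dict mapping names like "b-e" to waveform arrays already cut from the recordings; the naming scheme, crossfade length, and helper names are illustrative assumptions, not taken from the paper (which was implemented in Delphi).

    ```python
    # Sketch of diphone concatenation with a short crossfade at each seam.
    import numpy as np

    def phonemes_to_diphones(phonemes):
        """['b','e','d','a'] -> ['b-e', 'e-d', 'd-a']."""
        return [f"{a}-{b}" for a, b in zip(phonemes, phonemes[1:])]

    def concatenate(diphone_names, inventory, fs=16000, xfade_ms=10):
        """Join diphone waveforms, linearly crossfading at every boundary."""
        n = int(fs * xfade_ms / 1000)
        fade_in, fade_out = np.linspace(0, 1, n), np.linspace(1, 0, n)
        out = inventory[diphone_names[0]].copy()
        for name in diphone_names[1:]:
            seg = inventory[name]
            out[-n:] = out[-n:] * fade_out + seg[:n] * fade_in
            out = np.concatenate([out, seg[n:]])
        return out
    ```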

  14. Perceptual and Acoustic Reliability Estimates for the Speech Disorders Classification System (SDCS)

    Science.gov (United States)

    Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.

    2010-01-01

    A companion paper describes three extensions to a classification system for paediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). The SDCS uses perceptual and acoustic data reduction methods to obtain information on a speaker's speech, prosody, and voice. The present paper provides reliability estimates for…

  15. Performance Assessment of Dynaspeak Speech Recognition System on Inflight Databases

    National Research Council Canada - National Science Library

    Barry, Timothy

    2004-01-01

    .... To aid in the assessment of various commercially available speech recognition systems, several aircraft speech databases have been developed at the Air Force Research Laboratory's Human Effectiveness Directorate...

  16. A diphone-based speech-synthesis system for British English

    NARCIS (Netherlands)

    Pijper, de J.R.

    1987-01-01

    This article describes a keyboard-to-speech system for British English synthetic speech based on diphones. It concentrates on the development and composition of the diphone inventory and briefly describes a computer program which makes it possible to quickly concatenate diphones and synthesise speech.

  17. Recognizing speech in a novel accent: the motor theory of speech perception reframed.

    Science.gov (United States)

    Moulin-Frier, Clément; Arbib, Michael A

    2013-08-01

    The motor theory of speech perception holds that we perceive the speech of another in terms of a motor representation of that speech. However, when we have learned to recognize a foreign accent, it seems plausible that recognition of a word rarely involves reconstruction of the speech gestures of the speaker rather than the listener. To better assess the motor theory and this observation, we proceed in three stages. Part 1 places the motor theory of speech perception in a larger framework based on our earlier models of the adaptive formation of mirror neurons for grasping, and for viewing extensions of that mirror system as part of a larger system for neuro-linguistic processing, augmented by the present consideration of recognizing speech in a novel accent. Part 2 then offers a novel computational model of how a listener comes to understand the speech of someone speaking the listener's native language with a foreign accent. The core tenet of the model is that the listener uses hypotheses about the word the speaker is currently uttering to update probabilities linking the sound produced by the speaker to phonemes in the native language repertoire of the listener. This, on average, improves the recognition of later words. This model is neutral regarding the nature of the representations it uses (motor vs. auditory). It serves as a reference point for the discussion in Part 3, which proposes a dual-stream neuro-linguistic architecture to revisit claims for and against the motor theory of speech perception and the relevance of mirror neurons, and extracts some implications for the reframing of the motor theory.
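
    The model's core tenet lends itself to a toy sketch: word hypotheses are used to strengthen the links between the sounds a speaker produces and the listener's native phonemes. The representation and update rule below are simplified illustrations, not the paper's implementation (which is, as noted, neutral about motor versus auditory representations).

    ```python
    # Toy accent adaptation: word hypotheses update sound->phoneme probabilities.
    from collections import defaultdict

    class AccentAdapter:
        def __init__(self):
            # counts[sound][phoneme], initialised to 1.0 as a uniform prior
            self.counts = defaultdict(lambda: defaultdict(lambda: 1.0))

        def p(self, sound, phoneme):
            """Current probability that `sound` realises `phoneme`."""
            row = self.counts[sound]
            value = row[phoneme]
            return value / sum(row.values())

        def update(self, heard_sounds, hypothesized_phonemes):
            """After hypothesising a word, strengthen each sound->phoneme link."""
            for s, ph in zip(heard_sounds, hypothesized_phonemes):
                self.counts[s][ph] += 1.0

    # e.g. hearing an accented [z] where the hypothesised word 'this' has /dh/
    adapter = AccentAdapter()
    adapter.update(["z", "i", "s"], ["dh", "ih", "s"])
    ```

    With repeated updates, later words containing /dh/ pronounced as [z] are recognised more readily, matching the model's claim that recognition of later words improves on average.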

  18. SPEECH VISUALIZATION SYSTEM AS A BASIS FOR SPEECH TRAINING AND COMMUNICATION AIDS

    Directory of Open Access Journals (Sweden)

    Oliana KRSTEVA

    1997-09-01

    Full Text Available One receives much more information through the visual sense than through the tactile one. However, most visual aids for hearing-impaired persons are not wearable, because it is difficult to make them compact, and permanently masking the user's vision is not desirable. Generally, it is difficult to obtain integrated patterns by a single mathematical transform of signals, such as a Fourier transform. To obtain an integrated pattern, speech parameters should be carefully extracted by analyses suited to each parameter, and a visual pattern that can be intuitively understood by anyone must be synthesized from them. Successful integration of speech parameters will not disturb understanding of individual features, so that the system can be used for speech training and communication.

  19. High profile students’ growth of mathematical understanding in solving linear programming problems

    Science.gov (United States)

    Utomo; Kusmayadi, TA; Pramudya, I.

    2018-04-01

    Linear programming has an important role in human life. It is taught at senior high school and college levels and is applied in economics, transportation, the military, and elsewhere; mastering linear programming is therefore useful preparation for life. This research describes the growth of mathematical understanding in solving linear programming problems based on the Pirie-Kieren model of the growth of understanding, and therefore used a qualitative approach. The subjects were grade XI students in Salatiga city: two students with high profiles, chosen by the researcher based on the growth of understanding shown in a classroom test result, with marks on the prerequisite material of ≥ 75. Both subjects were interviewed by the researcher to examine their growth of mathematical understanding in solving linear programming problems. The findings showed that the subjects often folded back to the primitive knowing level before moving forward to the next level, because their primitive understanding was not comprehensive.

  20. Idaho's Three-Tiered System for Speech-Language Paratherapist Training and Utilization.

    Science.gov (United States)

    Longhurst, Thomas M.

    1997-01-01

    Discusses the development and current implementation of Idaho's three-tiered system of speech-language paratherapists. Support personnel providing speech-language services to learners with special communication needs in educational settings must obtain one of three certification levels: (1) speech-language aide, (2) associate degree…

  1. General Systems Theory: Application To The Design Of Speech Communication Courses

    Science.gov (United States)

    Tucker, Raymond K.

    1971-01-01

    General systems theory can be applied to problems in the teaching of speech communication courses. The author describes general systems theory as applied to the design, conduct, and evaluation of speech communication courses. (Author/MS)

  2. Automatic Speech Acquisition and Recognition for Spacesuit Audio Systems

    Science.gov (United States)

    Ye, Sherry

    2015-01-01

    NASA has a widely recognized but unmet need for novel human-machine interface technologies that can facilitate communication during astronaut extravehicular activities (EVAs), when loud noises and strong reverberations inside spacesuits make communication challenging. WeVoice, Inc., has developed a multichannel signal-processing method for speech acquisition in noisy and reverberant environments that enables automatic speech recognition (ASR) technology inside spacesuits. The technology reduces noise by exploiting differences between the statistical nature of signals (i.e., speech) and noise that exists in the spatial and temporal domains. As a result, ASR accuracy can be improved to the level at which crewmembers will find the speech interface useful. System components and features include beam forming/multichannel noise reduction, single-channel noise reduction, speech feature extraction, feature transformation and normalization, feature compression, and ASR decoding. Arithmetic complexity models were developed and will help designers of real-time ASR systems select proper tasks when confronted with constraints in computational resources. In Phase I of the project, WeVoice validated the technology. The company further refined the technology in Phase II and developed a prototype for testing and use by suited astronauts.
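
    The beamforming stage can be illustrated with a minimal delay-and-sum sketch: each microphone signal is time-aligned toward the talker and the channels are averaged, reinforcing speech from the look direction relative to diffuse noise and reverberation. WeVoice's actual algorithms are more sophisticated and not public; the geometry and parameter names here are assumptions.

    ```python
    # Minimal delay-and-sum beamformer (frequency-domain fractional delays).
    import numpy as np

    def delay_and_sum(channels, mic_positions, look_dir, fs, c=343.0):
        """channels: (n_mics, n_samples); mic_positions: (n_mics, 3) in metres;
        look_dir: unit vector pointing from the array toward the talker."""
        n = channels.shape[1]
        freqs = np.fft.rfftfreq(n, d=1.0 / fs)
        out = np.zeros(n)
        for ch, pos in zip(channels, mic_positions):
            tau = pos @ look_dir / c   # mics nearer the talker hear it earlier
            # delaying each channel by its own advance re-aligns the wavefront
            out += np.fft.irfft(np.fft.rfft(ch) * np.exp(-2j * np.pi * freqs * tau), n=n)
        return out / len(channels)
    ```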

  3. Pattern Recognition Methods and Features Selection for Speech Emotion Recognition System.

    Science.gov (United States)

    Partila, Pavol; Voznak, Miroslav; Tovarek, Jaromir

    2015-01-01

    The impact of the classification method and feature selection on speech emotion recognition accuracy is discussed in this paper. Selecting the correct parameters in combination with the classifier is an important part of reducing the computational complexity of the system. This step is necessary especially for systems that will be deployed in real-time applications. The motivation for developing and improving speech emotion recognition systems is their wide applicability in today's automatic voice-controlled systems. The Berlin database of emotional recordings was used in this experiment. Classification accuracy of artificial neural networks, k-nearest neighbours, and Gaussian mixture models is measured considering the selection of prosodic, spectral, and voice quality features. The purpose was to find an optimal combination of methods and group of features for stress detection in human speech. The research contribution lies in the design of an accurate and efficient speech emotion recognition system.
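
    A scikit-learn sketch of the kind of classifier and feature-selection sweep the paper describes is given below; the feature matrix and labels are placeholders, and SelectKBest stands in for whatever selection procedure the authors actually used.

    ```python
    # Sketch: compare classifiers across feature-subset sizes (placeholder data).
    import numpy as np
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline

    # X: utterance-level prosodic/spectral/voice-quality features; y: emotion labels
    X, y = np.random.rand(200, 40), np.random.randint(0, 4, 200)   # placeholders

    for name, clf in [("k-NN", KNeighborsClassifier(5)),
                      ("ANN", MLPClassifier(max_iter=1000))]:
        for k in (10, 20, 40):                 # how many features survive selection
            pipe = make_pipeline(SelectKBest(f_classif, k=k), clf)
            acc = cross_val_score(pipe, X, y, cv=5).mean()
            print(f"{name}, {k} features: {acc:.2f}")
    ```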

  4. A speech-controlled environmental control system for people with severe dysarthria.

    Science.gov (United States)

    Hawley, Mark S; Enderby, Pam; Green, Phil; Cunningham, Stuart; Brownsell, Simon; Carmichael, James; Parker, Mark; Hatzis, Athanassios; O'Neill, Peter; Palmer, Rebecca

    2007-06-01

    Automatic speech recognition (ASR) can provide a rapid means of controlling electronic assistive technology. Off-the-shelf ASR systems function poorly for users with severe dysarthria because of the increased variability of their articulations. We have developed a limited vocabulary speaker dependent speech recognition application which has greater tolerance to variability of speech, coupled with a computerised training package which assists dysarthric speakers to improve the consistency of their vocalisations and provides more data for recogniser training. These applications, and their implementation as the interface for a speech-controlled environmental control system (ECS), are described. The results of field trials to evaluate the training program and the speech-controlled ECS are presented. The user-training phase increased the recognition rate from 88.5% to 95.4% (p<0.001). Recognition rates were good for people with even the most severe dysarthria in everyday usage in the home (mean word recognition rate 86.9%). Speech-controlled ECS were less accurate (mean task completion accuracy 78.6% versus 94.8%) but were faster to use than switch-scanning systems, even taking into account the need to repeat unsuccessful operations (mean task completion time 7.7s versus 16.9s, p<0.001). It is concluded that a speech-controlled ECS is a viable alternative to switch-scanning systems for some people with severe dysarthria and would lead, in many cases, to more efficient control of the home.

  5. Speech-associated gestures, Broca’s area, and the human mirror system

    Science.gov (United States)

    Skipper, Jeremy I.; Goldin-Meadow, Susan; Nusbaum, Howard C.; Small, Steven L

    2009-01-01

    Speech-associated gestures are hand and arm movements that not only convey semantic information to listeners but are themselves actions. Broca’s area has been assumed to play an important role both in semantic retrieval or selection (as part of a language comprehension system) and in action recognition (as part of a “mirror” or “observation–execution matching” system). We asked whether the role that Broca’s area plays in processing speech-associated gestures is consistent with the semantic retrieval/selection account (predicting relatively weak interactions between Broca’s area and other cortical areas because the meaningful information that speech-associated gestures convey reduces semantic ambiguity and thus reduces the need for semantic retrieval/selection) or the action recognition account (predicting strong interactions between Broca’s area and other cortical areas because speech-associated gestures are goal-directed actions that are “mirrored”). We compared the functional connectivity of Broca’s area with other cortical areas when participants listened to stories while watching meaningful speech-associated gestures, speech-irrelevant self-grooming hand movements, or no hand movements. A network analysis of neuroimaging data showed that interactions involving Broca’s area and other cortical areas were weakest when spoken language was accompanied by meaningful speech-associated gestures, and strongest when spoken language was accompanied by self-grooming hand movements or by no hand movements at all. Results are discussed with respect to the role that the human mirror system plays in processing speech-associated movements. PMID:17533001

  6. Musicians do not benefit from differences in fundamental frequency when listening to speech in competing speech backgrounds

    DEFF Research Database (Denmark)

    Madsen, Sara Miay Kim; Whiteford, Kelly L.; Oxenham, Andrew J.

    2017-01-01

    Recent studies disagree on whether musicians have an advantage over non-musicians in understanding speech in noise. However, it has been suggested that musicians may be able to use differences in fundamental frequency (F0) to better understand target speech in the presence of interfering talkers. Here we studied a relatively large (N=60) cohort of young adults, equally divided between nonmusicians and highly trained musicians, to test whether the musicians were better able to understand speech either in noise or in a two-talker competing speech masker. The target speech and competing speech were presented with either their natural F0 contours or on a monotone F0, and the F0 difference between the target and masker was systematically varied. As expected, speech intelligibility improved with increasing F0 difference between the target and the two-talker masker for both natural and monotone...

  7. The Role of Corticostriatal Systems in Speech Category Learning.

    Science.gov (United States)

    Yi, Han-Gyol; Maddox, W Todd; Mumford, Jeanette A; Chandrasekaran, Bharath

    2016-04-01

    One of the most difficult category learning problems for humans is learning nonnative speech categories. While feedback-based category training can enhance speech learning, the mechanisms underlying these benefits are unclear. In this functional magnetic resonance imaging study, we investigated neural and computational mechanisms underlying feedback-dependent speech category learning in adults. Positive feedback activated a large corticostriatal network including the dorsolateral prefrontal cortex, inferior parietal lobule, middle temporal gyrus, caudate, putamen, and the ventral striatum. Successful learning was contingent upon the activity of domain-general category learning systems: the fast-learning reflective system, involving the dorsolateral prefrontal cortex that develops and tests explicit rules based on the feedback content, and the slow-learning reflexive system, involving the putamen in which the stimuli are implicitly associated with category responses based on the reward value in feedback. Computational modeling of response strategies revealed significant use of reflective strategies early in training and greater use of reflexive strategies later in training. Reflexive strategy use was associated with increased activation in the putamen. Our results demonstrate a critical role for the reflexive corticostriatal learning system as a function of response strategy and proficiency during speech category learning. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  8. Digital speech processing using Matlab

    CERN Document Server

    Gopi, E S

    2014-01-01

    Digital Speech Processing Using Matlab deals with digital speech pattern recognition, speech production model, speech feature extraction, and speech compression. The book is written in a manner that is suitable for beginners pursuing basic research in digital speech processing. Matlab illustrations are provided for most topics to enable better understanding of concepts. This book also deals with the basic pattern recognition techniques (illustrated with speech signals using Matlab) such as PCA, LDA, ICA, SVM, HMM, GMM, BPN, and KSOM.

  9. Speech-To-Text Conversion STT System Using Hidden Markov Model HMM

    Directory of Open Access Journals (Sweden)

    Su Myat Mon

    2015-06-01

    Full Text Available Abstract Speech is the easiest way to communicate with each other. Speech processing is widely used in many applications like security devices, household appliances, cellular phones, ATM machines, and computers. Human-computer interfaces have been developed so that those with disabilities can communicate or interact conveniently. Speech-to-Text Conversion (STT) systems have many benefits for deaf or nonspeaking people and find applications in daily life. Accordingly, the aim of the system is to convert input speech signals into text output for deaf or nonspeaking students in the educational field. This paper presents an approach to extract features using Mel Frequency Cepstral Coefficients (MFCC) from the speech signals of isolated spoken words, and a Hidden Markov Model (HMM) method is applied to train and test the audio files to obtain the recognized spoken word. The speech database is created using MATLAB. The original speech signals are preprocessed, and these speech samples are reduced to feature vectors which are used as the observation sequences of the HMM recognizer. The feature vectors are analyzed in the HMM depending on the number of states.
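
    The MFCC-plus-HMM pipeline the paper describes can be sketched with librosa and hmmlearn standing in for the MATLAB implementation; the file paths, state counts, and helper names below are illustrative assumptions.

    ```python
    # Sketch: one Gaussian HMM per vocabulary word; pick the best-scoring model.
    import numpy as np
    import librosa
    from hmmlearn import hmm

    def mfcc_seq(path, sr=16000):
        y, _ = librosa.load(path, sr=sr)
        return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T   # (frames, 13)

    def train_word_model(training_paths, n_states=5):
        """Fit a Gaussian HMM to all training utterances of one word."""
        seqs = [mfcc_seq(p) for p in training_paths]
        model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag",
                                n_iter=50)
        model.fit(np.vstack(seqs), lengths=[len(s) for s in seqs])
        return model

    def recognize(path, word_models):
        """Score the utterance under every word's HMM; return the best word."""
        feats = mfcc_seq(path)
        return max(word_models, key=lambda w: word_models[w].score(feats))
    ```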

  10. Pattern Recognition Methods and Features Selection for Speech Emotion Recognition System

    Directory of Open Access Journals (Sweden)

    Pavol Partila

    2015-01-01

    Full Text Available The impact of the classification method and feature selection on speech emotion recognition accuracy is discussed in this paper. Selecting the correct parameters in combination with the classifier is an important part of reducing the computational complexity of the system. This step is necessary especially for systems that will be deployed in real-time applications. The motivation for developing and improving speech emotion recognition systems is their wide applicability in today's automatic voice-controlled systems. The Berlin database of emotional recordings was used in this experiment. Classification accuracy of artificial neural networks, k-nearest neighbours, and Gaussian mixture models is measured considering the selection of prosodic, spectral, and voice quality features. The purpose was to find an optimal combination of methods and group of features for stress detection in human speech. The research contribution lies in the design of an accurate and efficient speech emotion recognition system.

  11. Introduction and Overview of the Vicens-Reddy Speech Recognition System.

    Science.gov (United States)

    Kameny, Iris; Ritea, H.

    The Vicens-Reddy System is unique in the sense that it approaches the problem of speech recognition as a whole, rather than treating particular aspects of the problem as in previous attempts. For example, where earlier systems treated only segmentation of speech into phoneme groups, or detected phonemes in a given context, the Vicens-Reddy System…

  12. Modeling speech intelligibility in adverse conditions

    DEFF Research Database (Denmark)

    Dau, Torsten

    2012-01-01

    Hearing-impaired listeners often have difficulty understanding speech when more than one person is talking, even when reduced audibility has been fully compensated for by a hearing aid. The reasons for these difficulties are not well understood. This presentation highlights recent concepts of the monaural and binaural signal processing strategies employed by the normal as well as the impaired auditory system. Jørgensen and Dau [(2011). J. Acoust. Soc. Am. 130, 1475-1487] proposed the speech-based envelope power spectrum model (sEPSM) in an attempt to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII) in conditions with nonlinearly processed speech. Instead of considering the reduction of the temporal modulation energy as the intelligibility metric, as assumed in the STI, the sEPSM applies the signal-to-noise ratio in the envelope domain (SNRenv). This metric was shown to be the key for predicting...
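
    The SNRenv idea can be illustrated in a few lines: compare the envelope power of the noisy speech with that of the noise alone, and take the ratio of the inferred speech envelope power to the noise envelope power. The single-band toy below omits the model's auditory and modulation filterbanks.

    ```python
    # Toy, single-band illustration of SNRenv (the real sEPSM is filterbank-based).
    import numpy as np
    from scipy.signal import hilbert

    def envelope_power(x):
        """AC power of the Hilbert envelope, normalised by its DC power."""
        env = np.abs(hilbert(x))
        return np.var(env) / np.mean(env) ** 2

    def snr_env_db(noisy_speech, noise):
        """Envelope-domain SNR: speech envelope power is (mixture - noise)."""
        p_mix, p_noise = envelope_power(noisy_speech), envelope_power(noise)
        p_speech = max(p_mix - p_noise, 1e-10)   # floor keeps the ratio defined
        return 10 * np.log10(p_speech / p_noise)
    ```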

  13. Some Applications of Algebraic System Solving

    Science.gov (United States)

    Roanes-Lozano, Eugenio

    2011-01-01

    Technology and, in particular, computer algebra systems, allows us to change both the way we teach mathematics and the mathematical curriculum. Curiously enough, unlike what happens with linear system solving, algebraic system solving is not widely known. The aim of this paper is to show that, although the theory lying behind the "exact…

  14. Understanding the role of speech production in reading: Evidence for a print-to-speech neural network using graphical analysis.

    Science.gov (United States)

    Cummine, Jacqueline; Cribben, Ivor; Luu, Connie; Kim, Esther; Bahktiari, Reyhaneh; Georgiou, George; Boliek, Carol A

    2016-05-01

    The neural circuitry associated with language processing is complex and dynamic. Graphical models are useful for studying complex neural networks as this method provides information about unique connectivity between regions within the context of the entire network of interest. Here, the authors explored the neural networks during covert reading to determine the role of feedforward and feedback loops in covert speech production. Brain activity of skilled adult readers was assessed in real word and pseudoword reading tasks with functional MRI (fMRI). The authors provide evidence for activity coherence in the feedforward system (inferior frontal gyrus-supplementary motor area) during real word reading and in the feedback system (supramarginal gyrus-precentral gyrus) during pseudoword reading. Graphical models provided evidence of an extensive, highly connected, neural network when individuals read real words that relied on coordination of the feedforward system. In contrast, when individuals read pseudowords the authors found a limited/restricted network that relied on coordination of the feedback system. Together, these results underscore the importance of considering multiple pathways and articulatory loops during language tasks and provide evidence for a print-to-speech neural network. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  15. Low-Frequency Cortical Entrainment to Speech Reflects Phoneme-Level Processing.

    Science.gov (United States)

    Di Liberto, Giovanni M; O'Sullivan, James A; Lalor, Edmund C

    2015-10-05

    The human ability to understand speech is underpinned by a hierarchical auditory system whose successive stages process increasingly complex attributes of the acoustic input. It has been suggested that to produce categorical speech perception, this system must elicit consistent neural responses to speech tokens (e.g., phonemes) despite variations in their acoustics. Here, using electroencephalography (EEG), we provide evidence for this categorical phoneme-level speech processing by showing that the relationship between continuous speech and neural activity is best described when that speech is represented using both low-level spectrotemporal information and categorical labeling of phonetic features. Furthermore, the mapping between phonemes and EEG becomes more discriminative for phonetic features at longer latencies, in line with what one might expect from a hierarchical system. Importantly, these effects are not seen for time-reversed speech. These findings may form the basis for future research on natural language processing in specific cohorts of interest and for broader insights into how brains transform acoustic input into meaning. Copyright © 2015 Elsevier Ltd. All rights reserved.
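
    The mapping between continuous speech representations and EEG in studies like this one is commonly estimated with a lagged linear (ridge) regression, a temporal response function; the sketch below shows that idea with toy dimensions and is not the authors' exact analysis code.

    ```python
    # Sketch: ridge regression from time-lagged speech features to one EEG channel.
    import numpy as np

    def lagged_design(stim, max_lag):
        """Stack time-lagged copies of the stimulus features as predictors."""
        n, d = stim.shape
        X = np.zeros((n, d * max_lag))
        for lag in range(max_lag):
            X[lag:, lag * d:(lag + 1) * d] = stim[:n - lag]
        return X

    def fit_trf(stim, eeg, max_lag=32, ridge=1e3):
        """stim: (n_samples, n_features); eeg: (n_samples,)."""
        X = lagged_design(stim, max_lag)
        return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ eeg)

    def prediction_accuracy(stim, eeg, w, max_lag=32):
        """Correlation between predicted and measured EEG, the usual metric."""
        pred = lagged_design(stim, max_lag) @ w
        return np.corrcoef(pred, eeg)[0, 1]
    ```

    Comparing this accuracy for a spectrogram-only representation against a spectrogram-plus-phonetic-features representation mirrors the paper's key contrast.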

  16. A System for Detecting Miscues in Dyslexic Read Speech

    DEFF Research Database (Denmark)

    Rasmussen, Morten Højfeldt; Tan, Zheng-Hua; Lindberg, Børge

    2009-01-01

    While miscue detection in general is a well explored research field, little attention has so far been paid to miscue detection in dyslexic read speech. This domain differs substantially from the domains that are commonly researched; for example, dyslexic read speech includes frequent regressions. Results show that the system detects miscues at a false alarm rate of 5.3% and a miscue detection rate of 40.1%. These results are worse than those of current state-of-the-art reading tutors, perhaps indicating that dyslexic read speech is a challenge to handle.

  17. Dynamic Programming Algorithms in Speech Recognition

    Directory of Open Access Journals (Sweden)

    Titus Felix FURTUNA

    2008-01-01

    Full Text Available In an isolated-word speech recognition system, recognition requires comparing the input word signal with the various words of the dictionary. The problem can be solved efficiently by a dynamic comparison algorithm whose goal is to put the temporal scales of the two words into optimal correspondence. One algorithm of this type is Dynamic Time Warping. This paper presents two alternative implementations of the algorithm designed for recognition of isolated words.
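
    Since the article centres on DTW, a compact implementation of the distance is worth showing; inputs are per-frame feature sequences (e.g., rows of MFCCs), and the quadratic-time formulation below is the textbook version rather than either of the article's specific implementations.

    ```python
    # Textbook dynamic time warping distance between two feature sequences.
    import numpy as np

    def dtw_distance(a, b):
        """Minimal cumulative frame distance over all monotonic alignments."""
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = np.linalg.norm(a[i - 1] - b[j - 1])
                # extend the cheapest of match, insertion, or deletion
                D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
        return D[n, m]
    ```

    An isolated-word recognizer then returns the dictionary entry whose template minimises dtw_distance(template, input_word).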

  18. The impact of problem solving strategy with online feedback on students’ conceptual understanding

    Science.gov (United States)

    Pratiwi, H. Y.; Winarko, W.; Ayu, H. D.

    2018-04-01

    The study aimed to determine the impact of implementing a problem solving strategy with online feedback on students' concept understanding. The study used a quasi-experimental, post-test-only control group design. The participants were all Physics Education students of Kanjuruhan University, year 2015, divided into two groups: 30 students in the experimental class and the remaining 30 in the control class. Concept understanding was measured by a concept understanding test on the multiple integral lesson. The test results satisfied the prerequisite checks for normality and homogeneity of distribution, and the hypothesis was then examined by t-test. The results show a difference in concept understanding between the experimental and control classes: students taught using the problem solving strategy with online feedback scored higher than those taught conventionally, with average scores of 72.10 for the experimental class and 52.27 for the control class.

  19. Music and Speech Perception in Children Using Sung Speech.

    Science.gov (United States)

    Nie, Yingjiu; Galvin, John J; Morikawa, Michael; André, Victoria; Wheeler, Harley; Fu, Qian-Jie

    2018-01-01

    This study examined music and speech perception in normal-hearing children with some or no musical training. Thirty children (mean age = 11.3 years), 15 with and 15 without formal music training participated in the study. Music perception was measured using a melodic contour identification (MCI) task; stimuli were a piano sample or sung speech with a fixed timbre (same word for each note) or a mixed timbre (different words for each note). Speech perception was measured in quiet and in steady noise using a matrix-styled sentence recognition task; stimuli were naturally intonated speech or sung speech with a fixed pitch (same note for each word) or a mixed pitch (different notes for each word). Significant musician advantages were observed for MCI and speech in noise but not for speech in quiet. MCI performance was significantly poorer with the mixed timbre stimuli. Speech performance in noise was significantly poorer with the fixed or mixed pitch stimuli than with spoken speech. Across all subjects, age at testing and MCI performance were significantly correlated with speech performance in noise. MCI and speech performance in quiet was significantly poorer for children than for adults from a related study using the same stimuli and tasks; speech performance in noise was significantly poorer for young than for older children. Long-term music training appeared to benefit melodic pitch perception and speech understanding in noise in these pediatric listeners.

  20. The Diagnosis and Understanding of Apraxia of Speech: Why Including Neurodegenerative Etiologies May Be Important

    Science.gov (United States)

    Duffy, Joseph R.; Josephs, Keith A.

    2012-01-01

    Purpose: To discuss apraxia of speech (AOS) as it occurs in neurodegenerative disease (progressive AOS [PAOS]) and how its careful study may contribute to general concepts of AOS and help refine its diagnostic criteria. Method: The article summarizes our current understanding of the clinical features and neuroanatomical and pathologic correlates…

  1. FUSING SPEECH SIGNAL AND PALMPRINT FEATURES FOR A SECURED AUTHENTICATION SYSTEM

    Directory of Open Access Journals (Sweden)

    P.K. Mahesh

    2011-11-01

    Full Text Available In biometric authentication applications, personal identification is regarded as an effective method for automatically recognizing, with high confidence, a person's identity. Multimodal biometric systems typically give better performance than a single biometric modality. This paper proposes a multimodal biometric system for identity verification using two traits, i.e., speech signal and palmprint. Integrating palmprint and speech information increases the robustness of person authentication. The proposed system is designed for applications where the training data contain a speech signal and palmprint. It is well known that the performance of person authentication using only a speech signal or palmprint deteriorates as features change with time. The final decision is made by fusion at the matching score level, an architecture in which feature vectors are created independently for query measures and are then compared to the enrolment templates stored during database preparation.
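
    Matching-score-level fusion reduces to a small amount of arithmetic once each matcher has produced a score; the sketch below uses min-max normalisation and a weighted sum, with the weight and threshold as placeholder values rather than the paper's tuned ones.

    ```python
    # Sketch of weighted-sum fusion at the matching score level.
    def min_max_normalise(score, lo, hi):
        """Map a raw matcher score into [0, 1] given its observed range."""
        return (score - lo) / (hi - lo)

    def fuse(speech_score, palm_score, w=0.5, threshold=0.6):
        """Accept the identity claim if the fused score clears the threshold."""
        fused = w * speech_score + (1 - w) * palm_score
        return fused >= threshold, fused

    # e.g. a speech matcher scoring 0-100 and a palmprint matcher scoring 0-1
    accept, fused = fuse(min_max_normalise(72, 0, 100),
                         min_max_normalise(0.81, 0.0, 1.0))
    ```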

  2. How is the McGurk effect modulated by Cued Speech in deaf and hearing adults?

    OpenAIRE

    Bayard, Clémence; Colin, Cécile; Leybaert, Jacqueline

    2014-01-01

    Speech perception for both hearing and deaf people involves an integrative process between auditory and lip-reading information. In order to disambiguate information from the lips, manual cues from Cued Speech may be added. Cued Speech (CS) is a system of manual aids developed to help deaf people to clearly and completely understand speech visually (Cornett, 1967). Within this system, both labial and manual information, as lone input sources, remain ambiguous. Perceivers, therefore, have to combine ...

  3. Translation of the Speech Therapy Programs in the Logomon Assisted Therapy System

    Directory of Open Access Journals (Sweden)

    SCHIPOR, D. M.

    2010-05-01

    Full Text Available This interdisciplinary research was developed with a view to creating and implementing an intelligent informatics system for the treatment of dyslalic disorders specific to the Romanian language (a CBTS system - computer-based speech therapy), as a complementary, customised, and client-oriented speech therapy method. The rules of the logotherapeutic guide have been expressed as pseudocode programs in order to allow greater flexibility in expressing logotherapeutic procedures in an informatics system. The pseudocode logopedic programs comprise the succession of stages of the therapeutic program from a speech therapy perspective, and based on what the expert system can achieve. The LOGOMON system is conceived to assist the speech therapist and the child during the entire therapeutic period, recording the main data related to the child, which proved to be useful in diagnosis and treatment. The experimental validation of the system proved that assisted therapy contributes to the improvement of classical therapy, producing optimal results in correcting the dyslalic person's speech.

  4. Automated recognition of helium speech. Phase I: Investigation of microprocessor based analysis/synthesis system

    Science.gov (United States)

    Jelinek, H. J.

    1986-01-01

    This is the Final Report of Electronic Design Associates on its Phase I SBIR project. The purpose of this project is to develop a method for correcting helium speech, as experienced in diver-surface communication. The goal of the Phase I study was to design, prototype, and evaluate a real-time helium speech corrector system based upon digital signal processing techniques. The general approach was to develop hardware (an IBM PC board) to digitize helium speech and software (a LAMBDA computer based simulation) to translate the speech. As planned in the study proposal, this initial prototype may now be used to assess the expected performance of a self-contained real-time system which uses an identical algorithm. The Final Report details the work carried out to produce the prototype system. Four major project tasks were carried out: a signal processing scheme for converting helium speech to normal-sounding speech was devised; the scheme was simulated on a general-purpose (LAMBDA) computer, actual helium speech was supplied to the simulation, and the converted speech was generated; an IBM PC-based 14-bit data input/output board was designed and built; and a bibliography of references on speech processing was compiled.
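
    One classic correction strategy, sketched per analysis frame below: estimate the vocal-tract envelope with LPC, compress the pole (formant) frequencies toward their normal positions, and refilter the excitation. This illustrates the general technique with librosa/scipy, not the project's own analysis/synthesis design, and the ratio would need tuning to diving depth.

    ```python
    # Per-frame LPC formant compression as a helium-speech correction sketch.
    import numpy as np
    import scipy.signal as sig
    import librosa

    def correct_frame(frame, ratio=0.6, order=12):
        """Lower a frame's formants by `ratio` while keeping its excitation."""
        a = librosa.lpc(frame, order=order)          # all-pole vocal-tract model
        excitation = sig.lfilter(a, [1.0], frame)    # inverse filter
        poles = np.roots(a)
        # scale each complex pole's angle (its formant frequency); keep radii
        warped = [p if abs(p.imag) < 1e-12
                  else abs(p) * np.exp(1j * np.angle(p) * ratio)
                  for p in poles]
        a_warped = np.real(np.poly(warped))
        return sig.lfilter([1.0], a_warped, excitation)   # resynthesise frame
    ```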

  5. Connected digit speech recognition system for Malayalam language

    Indian Academy of Sciences (India)

    Connected digit speech recognition is important in many applications such as automated banking systems, catalogue dialing, and automatic data entry. This paper presents an optimal speaker-independent connected digit recognizer for the Malayalam language. The system employs Perceptual ...

  6. Verbal Problem-Solving Difficulties in Autism Spectrum Disorders and Atypical Language Development

    Science.gov (United States)

    Alderson-Day, Ben

    2018-01-01

    Children with autism spectrum disorders (ASDs) adopt less efficient strategies than typically developing (TD) peers on the Twenty Questions Task (TQT), a measure of verbal problem-solving skills. Although problems with the TQT are typically associated with executive dysfunction, they have also been reported in children who are deaf, suggesting a role for atypical language development. To test the contribution of language history to ASD problem solving, TQT performance was compared in children with high-functioning autism (HFA), children with Asperger syndrome (AS) and TD children. The HFA group used significantly less efficient strategies than both AS and TD children. No group differences were evident on tests of question understanding, planning or verbal fluency. Potential explanations for differences in verbal problem-solving skill are discussed with reference to the development of inner speech and use of visual strategies in ASD. PMID:25346354

  7. Speech understanding and directional hearing for hearing-impaired subjects with in-the-ear and behind-the-ear hearing aids

    NARCIS (Netherlands)

    Leeuw, A. R.; Dreschler, W. A.

    1987-01-01

    With respect to acoustical properties, in-the-ear (ITE) aids should give better speech understanding and directional hearing than behind-the-ear (BTE) aids. Also, hearing-impaired subjects often prefer ITEs. A study was performed to assess objectively the improvement in speech understanding and directional hearing.

  8. Speech and Communication Disorders

    Science.gov (United States)

    ... to being completely unable to speak or understand speech. Causes include: hearing disorders and deafness; voice problems, ... or those caused by cleft lip or palate; speech problems like stuttering; developmental disabilities; learning disorders; autism ...

  9. Speech Compression

    Directory of Open Access Journals (Sweden)

    Jerry D. Gibson

    2016-06-01

    Full Text Available Speech compression is a key technology underlying digital cellular communications, VoIP, voicemail, and voice response systems. We trace the evolution of speech coding based on the linear prediction model, highlight the key milestones in speech coding, and outline the structures of the most important speech coding standards. Current challenges, future research directions, fundamental limits on performance, and the critical open problem of speech coding for emergency first responders are all discussed.
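
    The linear prediction model at the heart of these coders is easy to demonstrate as an analysis/synthesis round trip: fit an all-pole predictor per frame, keep its coefficients plus the low-energy residual, and refilter to reconstruct. The sketch below is a generic illustration, not any particular standard.

    ```python
    # LPC analysis/synthesis round trip (a real coder quantises both outputs).
    import scipy.signal as sig
    import librosa

    def lpc_analysis(frame, order=10):
        """Return predictor coefficients [1, a1..ap] and the prediction residual."""
        a = librosa.lpc(frame, order=order)
        residual = sig.lfilter(a, [1.0], frame)   # whitened excitation
        return a, residual

    def lpc_synthesis(a, residual):
        """Rebuild the frame by driving the all-pole filter with the residual."""
        return sig.lfilter([1.0], a, residual)
    ```

    A CELP-style coder would transmit quantised coefficients (typically as line spectral frequencies) and a codebook index approximating the residual, rather than the residual itself.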

  10. Systems Acquisition Manager’s Guide for the Use of Models and Simulations

    Science.gov (United States)

    1994-09-01

    are many others that deserve recognition , but in fairness to all, there are too many to mention. The three research fellows would like to thank all of...information or user accounts: Modeling and Simulation Information System ATTN: Administrative Supr 1901 N. Beauregard St., Suite 510, Alexa . VA 22311...intelli- gence (e.g., understanding visual images, understanding speech and written text, problem 1-1 solving and medical diagnosis). (Krueger) Brawler

  11. Evaluating the speech output component of a smart-home system

    NARCIS (Netherlands)

    Möller, S.; Krebber, J.; Smeele, P.

    2006-01-01

    This paper describes four experiments which have been carried out to evaluate the speech output component of the INSPIRE spoken dialogue system, providing speech control for different devices located in a "smart" home environment. The aim is to quantify the impact of different factors on the

  12. Speech Planning Happens before Speech Execution: Online Reaction Time Methods in the Study of Apraxia of Speech

    Science.gov (United States)

    Maas, Edwin; Mailend, Marja-Liisa

    2012-01-01

    Purpose: The purpose of this article is to present an argument for the use of online reaction time (RT) methods to the study of apraxia of speech (AOS) and to review the existing small literature in this area and the contributions it has made to our fundamental understanding of speech planning (deficits) in AOS. Method: Following a brief…

  13. Why do I hear but not understand? Stochastic undersampling as a model of degraded neural encoding of speech

    Directory of Open Access Journals (Sweden)

    Enrique A Lopez-Poveda

    2014-10-01

    Full Text Available Hearing impairment is a serious disease with increasing prevalence. It is defined based on increased audiometric thresholds but increased thresholds are only partly responsible for the greater difficulty understanding speech in noisy environments experienced by some older listeners or by hearing-impaired listeners. Identifying the additional factors and mechanisms that impair intelligibility is fundamental to understanding hearing impairment but these factors remain uncertain. Traditionally, these additional factors have been sought in the way the speech spectrum is encoded in the pattern of impaired mechanical cochlear responses. Recent studies, however, are steering the focus toward impaired encoding of the speech waveform in the auditory nerve. In our recent work, we gave evidence that a significant factor might be the loss of afferent auditory nerve fibers, a pathology that comes with aging or noise overexposure. Our approach was based on a signal-processing analogy whereby the auditory nerve may be regarded as a stochastic sampler of the sound waveform and deafferentation may be described in terms of waveform undersampling. We showed that stochastic undersampling simultaneously degrades the encoding of soft and rapid waveform features, and that this degrades speech intelligibility in noise more than in quiet without significant increases in audiometric thresholds. Here, we review our recent work in a broader context and argue that the stochastic undersampling analogy may be extended to study the perceptual consequences of various different hearing pathologies and their treatment.
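
    The sampling analogy can be made concrete with a toy "auditory nerve": each fiber fires with probability proportional to the rectified waveform amplitude, so soft features attract few spikes, and halving the fiber count halves the expected sampling density. This is an illustrative caricature of the authors' model, with invented parameter names.

    ```python
    # Toy stochastic sampler: fewer 'fibers' -> sparser encoding of the waveform.
    import numpy as np

    def stochastic_sample(x, fs, n_fibers, max_rate=200.0):
        """Return spike times and sampled amplitudes for n_fibers model fibers."""
        p = np.clip(x, 0.0, None)                      # half-wave rectified drive
        p = p / p.max() * (max_rate / fs) * n_fibers   # expected spikes per sample
        spikes = np.random.rand(len(x)) < np.clip(p, 0.0, 1.0)
        return np.nonzero(spikes)[0] / fs, x[spikes]
    ```

    Reconstructing the waveform from the spikes of, say, 10 versus 100 fibers shows how deafferentation degrades soft and rapid features first.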

  14. The National Outcomes Measurement System for Pediatric Speech-Language Pathology

    Science.gov (United States)

    Mullen, Robert; Schooling, Tracy

    2010-01-01

    Purpose: The American Speech-Language-Hearing Association's (ASHA's) National Outcomes Measurement System (NOMS) was developed in the late 1990s. The primary purpose was to serve as a source of data for speech-language pathologists (SLPs) who found themselves called on to provide empirical evidence of the functional outcomes associated with their…

  15. Musical Experience and the Aging Auditory System: Implications for Cognitive Abilities and Hearing Speech in Noise

    Science.gov (United States)

    Parbery-Clark, Alexandra; Strait, Dana L.; Anderson, Samira; Hittner, Emily; Kraus, Nina

    2011-01-01

    Much of our daily communication occurs in the presence of background noise, compromising our ability to hear. While understanding speech in noise is a challenge for everyone, it becomes increasingly difficult as we age. Although aging is generally accompanied by hearing loss, this perceptual decline cannot fully account for the difficulties experienced by older adults for hearing in noise. Decreased cognitive skills concurrent with reduced perceptual acuity are thought to contribute to the difficulty older adults experience understanding speech in noise. Given that musical experience positively impacts speech perception in noise in young adults (ages 18–30), we asked whether musical experience benefits an older cohort of musicians (ages 45–65), potentially offsetting the age-related decline in speech-in-noise perceptual abilities and associated cognitive function (i.e., working memory). Consistent with performance in young adults, older musicians demonstrated enhanced speech-in-noise perception relative to nonmusicians along with greater auditory, but not visual, working memory capacity. By demonstrating that speech-in-noise perception and related cognitive function are enhanced in older musicians, our results imply that musical training may reduce the impact of age-related auditory decline. PMID:21589653

  16. Musical experience and the aging auditory system: implications for cognitive abilities and hearing speech in noise.

    Science.gov (United States)

    Parbery-Clark, Alexandra; Strait, Dana L; Anderson, Samira; Hittner, Emily; Kraus, Nina

    2011-05-11

    Much of our daily communication occurs in the presence of background noise, compromising our ability to hear. While understanding speech in noise is a challenge for everyone, it becomes increasingly difficult as we age. Although aging is generally accompanied by hearing loss, this perceptual decline cannot fully account for the difficulties experienced by older adults for hearing in noise. Decreased cognitive skills concurrent with reduced perceptual acuity are thought to contribute to the difficulty older adults experience understanding speech in noise. Given that musical experience positively impacts speech perception in noise in young adults (ages 18-30), we asked whether musical experience benefits an older cohort of musicians (ages 45-65), potentially offsetting the age-related decline in speech-in-noise perceptual abilities and associated cognitive function (i.e., working memory). Consistent with performance in young adults, older musicians demonstrated enhanced speech-in-noise perception relative to nonmusicians along with greater auditory, but not visual, working memory capacity. By demonstrating that speech-in-noise perception and related cognitive function are enhanced in older musicians, our results imply that musical training may reduce the impact of age-related auditory decline.

  17. Musical experience and the aging auditory system: implications for cognitive abilities and hearing speech in noise.

    Directory of Open Access Journals (Sweden)

    Alexandra Parbery-Clark

    Full Text Available Much of our daily communication occurs in the presence of background noise, compromising our ability to hear. While understanding speech in noise is a challenge for everyone, it becomes increasingly difficult as we age. Although aging is generally accompanied by hearing loss, this perceptual decline cannot fully account for the difficulties experienced by older adults for hearing in noise. Decreased cognitive skills concurrent with reduced perceptual acuity are thought to contribute to the difficulty older adults experience understanding speech in noise. Given that musical experience positively impacts speech perception in noise in young adults (ages 18-30), we asked whether musical experience benefits an older cohort of musicians (ages 45-65), potentially offsetting the age-related decline in speech-in-noise perceptual abilities and associated cognitive function (i.e., working memory). Consistent with performance in young adults, older musicians demonstrated enhanced speech-in-noise perception relative to nonmusicians along with greater auditory, but not visual, working memory capacity. By demonstrating that speech-in-noise perception and related cognitive function are enhanced in older musicians, our results imply that musical training may reduce the impact of age-related auditory decline.

  18. Combined Electric and Acoustic Stimulation With Hearing Preservation: Effect of Cochlear Implant Low-Frequency Cutoff on Speech Understanding and Perceived Listening Difficulty.

    Science.gov (United States)

    Gifford, René H; Davis, Timothy J; Sunderhaus, Linsey W; Menapace, Christine; Buck, Barbara; Crosson, Jillian; O'Neill, Lori; Beiter, Anne; Segel, Phil

    The primary objective of this study was to assess the effect of electric and acoustic overlap for speech understanding in typical listening conditions using semidiffuse noise. This study used a within-subjects, repeated measures design including 11 experienced adult implant recipients (13 ears) with functional residual hearing in the implanted and nonimplanted ear. The aided acoustic bandwidth was fixed and the low-frequency cutoff for the cochlear implant (CI) was varied systematically. Assessments were completed in the R-SPACE sound-simulation system which includes a semidiffuse restaurant noise originating from eight loudspeakers placed circumferentially about the subject's head. AzBio sentences were presented at 67 dBA with signal to noise ratio varying between +10 and 0 dB determined individually to yield approximately 50 to 60% correct for the CI-alone condition with full CI bandwidth. Listening conditions for all subjects included CI alone, bimodal (CI + contralateral hearing aid), and bilateral-aided electric and acoustic stimulation (EAS; CI + bilateral hearing aid). Low-frequency cutoffs both below and above the original "clinical software recommendation" frequency were tested for all patients, in all conditions. Subjects estimated listening difficulty for all conditions using listener ratings based on a visual analog scale. Three primary findings were that (1) there was statistically significant benefit of preserved acoustic hearing in the implanted ear for most overlap conditions, (2) the default clinical software recommendation rarely yielded the highest level of speech recognition (1 of 13 ears), and (3) greater EAS overlap than that provided by the clinical recommendation yielded significant improvements in speech understanding. For standard-electrode CI recipients with preserved hearing, spectral overlap of acoustic and electric stimuli yielded significantly better speech understanding and less listening effort in a laboratory-based, restaurant

  19. The influence of age, hearing, and working memory on the speech comprehension benefit derived from an automatic speech recognition system

    NARCIS (Netherlands)

    Zekveld, A.A.; Kramer, S.E.; Kessens, J.M.; Vlaming, M.S.M.G.; Houtgast, T.

    2009-01-01

    Objective: The aim of the current study was to examine whether partly incorrect subtitles that are automatically generated by an Automatic Speech Recognition (ASR) system, improve speech comprehension by listeners with hearing impairment. In an earlier study (Zekveld et al. 2008), we showed that

  20. College Students' Perceptions of the C-Print Speech-to-Text Transcription System.

    Science.gov (United States)

    Elliot, L B; Stinson, M S; McKee, B G; Everhart, V S; Francis, P J

    2001-01-01

    C-Print is a real-time speech-to-text transcription system used as a support service with deaf students in mainstreamed classes. Questionnaires were administered to 36 college students in 32 courses in which the C-Print system was used in addition to interpreting and note taking. Twenty-two of these students were also interviewed. Questionnaire items included student ratings of lecture comprehension. Student ratings indicated good comprehension with C-Print, and the mean rating was significantly higher than that for understanding of the interpreter. Students also rated the hard copy printout provided by C-Print as helpful, and they reported that they used these notes more frequently than the handwritten notes from a paid student note taker. Interview results were consistent with those for the questionnaire. Questionnaire and interview responses regarding use of C-Print as the only support service indicated that this arrangement would be acceptable to many students, but not to others. Communication characteristics were related to responses to the questionnaire. Students who were relatively proficient in reading and writing English, and in speech-reading, responded more favorably to C-Print.

  1. An analysis of the masking of speech by competing speech using self-report data.

    Science.gov (United States)

    Agus, Trevor R; Akeroyd, Michael A; Noble, William; Bhullar, Navjot

    2009-01-01

    Many of the items in the "Speech, Spatial, and Qualities of Hearing" scale questionnaire [S. Gatehouse and W. Noble, Int. J. Audiol. 43, 85-99 (2004)] are concerned with speech understanding in a variety of backgrounds, both speech and nonspeech. To study if this self-report data reflected informational masking, previously collected data on 414 people were analyzed. The lowest scores (greatest difficulties) were found for the two items in which there were two speech targets, with successively higher scores for competing speech (six items), energetic masking (one item), and no masking (three items). The results suggest significant masking by competing speech in everyday listening situations.

  2. Hate speech

    Directory of Open Access Journals (Sweden)

    Anne Birgitta Nilsen

    2014-12-01

    Full Text Available The manifesto of the Norwegian terrorist Anders Behring Breivik is based on the “Eurabia” conspiracy theory. This theory is a key starting point for hate speech amongst many right-wing extremists in Europe, but also has ramifications beyond these environments. In brief, proponents of the Eurabia theory claim that Muslims are occupying Europe and destroying Western culture, with the assistance of the EU and European governments. By contrast, members of Al-Qaeda and other extreme Islamists promote the conspiracy theory “the Crusade” in their hate speech directed against the West. Proponents of the latter theory argue that the West is leading a crusade to eradicate Islam and Muslims, a crusade that is similarly facilitated by their governments. This article presents analyses of texts written by right-wing extremists and Muslim extremists in an effort to shed light on how hate speech promulgates conspiracy theories in order to spread hatred and intolerance. The aim of the article is to contribute to a more thorough understanding of hate speech’s nature by applying rhetorical analysis. Rhetorical analysis is chosen because it offers a means of understanding the persuasive power of speech. It is thus a suitable tool to describe how hate speech works to convince and persuade. The concepts from rhetorical theory used in this article are ethos, logos and pathos. The concept of ethos is used to pinpoint factors that contributed to Osama bin Laden's impact, namely factors that lent credibility to his promotion of the conspiracy theory of the Crusade. In particular, Bin Laden projected common sense, good morals and good will towards his audience. He seemed to have coherent and relevant arguments; he appeared to possess moral credibility; and his use of language demonstrated that he wanted the best for his audience. The concept of pathos is used to define hate speech, since hate speech targets its audience's emotions. In hate speech it is the

  3. Implementation of basic chemistry experiment based on metacognition to increase problem-solving and build concept understanding

    Science.gov (United States)

    Zuhaida, A.

    2018-04-01

    Implementation of laboratory experiments has three goals: 1) developing basic experimental skills; 2) developing problem-solving skills with a scientific approach; and 3) improving understanding of the subject matter. In carrying out experiments, students show weaknesses in observing, identifying problems, managing information, analyzing, and evaluating, all of which fall under the indicators of metacognition. The objective of this research was to implement a Basic Chemistry Experiment based on metacognition in order to increase problem-solving skills and build concept understanding for students of the Science Education Department. The research used a quasi-experimental method with a pretest-posttest control group design. Problem-solving skills were measured through performance assessments using rubrics applied to problem-solving reports and results presentations. Concept mastery was measured through a description test. The results of the research: (1) students' problem-solving skills improved, in the very high category; (2) students' concept understanding increased more than with the conventional experiment, with N-gain in the medium category; and (3) students responded positively to the learning implementation. The contribution of this research is to extend the implementation of practical learning to other subjects and to improve students' competence in science.

  4. Blind speech separation system for humanoid robot with FastICA for audio filtering and separation

    Science.gov (United States)

    Budiharto, Widodo; Santoso Gunawan, Alexander Agung

    2016-07-01

    Nowadays, there are many developments in building intelligent humanoid robots, mainly in order to handle voice and image. In this research, we propose a blind speech separation system using FastICA for audio filtering and separation that can be used in education or entertainment. Our main problem is to separate multiple speech sources and to filter out irrelevant noise. After the speech separation step, the results are integrated with our previous speech and face recognition system, which is based on the Bioloid GP robot with a Raspberry Pi 2 as controller. The experimental results show that the accuracy of our blind speech separation system is about 88% in command and query recognition cases.
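
    For readers unfamiliar with the technique, the sketch below shows the core separation step on synthetic data using scikit-learn's FastICA; the mixing matrix, sample rate, and "speech-like" sources are invented for illustration and this is not the authors' implementation.

```python
# Sketch: blind separation of two mixed sources with FastICA (scikit-learn).
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
fs = 16000                                   # assumed sample rate (Hz)
t = np.arange(0, 2.0, 1.0 / fs)

# Two synthetic sources standing in for two talkers.
s1 = np.sign(np.sin(2 * np.pi * 3 * t)) * np.sin(2 * np.pi * 220 * t)
s2 = np.sin(2 * np.pi * 137 * t + 2 * np.cos(2 * np.pi * 0.5 * t))
S = np.c_[s1, s2] + 0.02 * rng.standard_normal((t.size, 2))  # sensor noise

A = np.array([[1.0, 0.6],                    # instantaneous mixing: what two
              [0.4, 1.0]])                   # microphones might observe
X = S @ A.T

ica = FastICA(n_components=2, random_state=0)
S_hat = ica.fit_transform(X)                 # sources, up to permutation/scale
print("estimated mixing matrix:\n", ica.mixing_)
```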

  5. Network speech systems technology program

    Science.gov (United States)

    Weinstein, C. J.

    1981-09-01

    This report documents work performed during FY 1981 on the DCA-sponsored Network Speech Systems Technology Program. The two areas of work reported are: (1) communication system studies in support of the evolving Defense Switched Network (DSN) and (2) design and implementation of satellite/terrestrial interfaces for the Experimental Integrated Switched Network (EISN). The system studies focus on the development and evaluation of economical and endurable network routing procedures. Satellite/terrestrial interface development includes circuit-switched and packet-switched connections to the experimental wideband satellite network. Efforts in planning and coordination of EISN experiments are reported in detail in a separate EISN Experiment Plan.

  6. Verbal problem-solving difficulties in autism spectrum disorders and atypical language development.

    Science.gov (United States)

    Alderson-Day, Ben

    2014-12-01

    Children with autism spectrum disorders (ASDs) adopt less efficient strategies than typically developing (TD) peers on the Twenty Questions Task (TQT), a measure of verbal problem-solving skills. Although problems with the TQT are typically associated with executive dysfunction, they have also been reported in children who are deaf, suggesting a role for atypical language development. To test the contribution of language history to ASD problem solving, TQT performance was compared in children with high-functioning autism (HFA), children with Asperger syndrome (AS) and TD children. The HFA group used significantly less efficient strategies than both AS and TD children. No group differences were evident on tests of question understanding, planning or verbal fluency. Potential explanations for differences in verbal problem-solving skill are discussed with reference to the development of inner speech and use of visual strategies in ASD. © 2014 International Society for Autism Research, Wiley Periodicals, Inc.

  7. Using Systemic Problem Solving (SPS) to Assess Student ...

    African Journals Online (AJOL)

    This paper focuses on the uses of systemic problem solving in chemistry at the tertiary level. Traditional problem solving (TPS) is a useful tool to help teachers examine recall of information, comprehension, and application. However, systemic problem solving (SPS) can challenge students and probe higher cognitive skills ...

  8. Principles of speech coding

    CERN Document Server

    Ogunfunmi, Tokunbo

    2010-01-01

    It is becoming increasingly apparent that all forms of communication, including voice, will be transmitted through packet-switched networks based on the Internet Protocol (IP). Therefore, the design of modern devices that rely on speech interfaces, such as cell phones and PDAs, requires a complete and up-to-date understanding of the basics of speech coding. It outlines key signal processing algorithms used to mitigate impairments to speech quality in VoIP networks. Offering a detailed yet easily accessible introduction to the field, Principles of Speech Coding provides an in-depth examination of the

  9. The influence of age, hearing, and working memory on the speech comprehension benefit derived from an automatic speech recognition system.

    Science.gov (United States)

    Zekveld, Adriana A; Kramer, Sophia E; Kessens, Judith M; Vlaming, Marcel S M G; Houtgast, Tammo

    2009-04-01

    The aim of the current study was to examine whether partly incorrect subtitles that are automatically generated by an Automatic Speech Recognition (ASR) system, improve speech comprehension by listeners with hearing impairment. In an earlier study (Zekveld et al. 2008), we showed that speech comprehension in noise by young listeners with normal hearing improves when presenting partly incorrect, automatically generated subtitles. The current study focused on the effects of age, hearing loss, visual working memory capacity, and linguistic skills on the benefit obtained from automatically generated subtitles during listening to speech in noise. In order to investigate the effects of age and hearing loss, three groups of participants were included: 22 young persons with normal hearing (YNH, mean age = 21 years), 22 middle-aged adults with normal hearing (MA-NH, mean age = 55 years) and 30 middle-aged adults with hearing impairment (MA-HI, mean age = 57 years). The benefit from automatic subtitling was measured by Speech Reception Threshold (SRT) tests (Plomp & Mimpen, 1979). Both unimodal auditory and bimodal audiovisual SRT tests were performed. In the audiovisual tests, the subtitles were presented simultaneously with the speech, whereas in the auditory test, only speech was presented. The difference between the auditory and audiovisual SRT was defined as the audiovisual benefit. Participants additionally rated the listening effort. We examined the influences of ASR accuracy level and text delay on the audiovisual benefit and the listening effort using a repeated measures General Linear Model analysis. In a correlation analysis, we evaluated the relationships between age, auditory SRT, visual working memory capacity and the audiovisual benefit and listening effort. The automatically generated subtitles improved speech comprehension in noise for all ASR accuracies and delays covered by the current study. Higher ASR accuracy levels resulted in more benefit obtained
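
    The benefit measure defined in this record is simple enough to state in code. A minimal sketch, with invented SRT values (lower SRTs are better):

```python
# Sketch: the audiovisual benefit as defined above is the auditory SRT minus
# the audiovisual SRT. The example values below are invented.
def audiovisual_benefit(srt_auditory_db: float, srt_audiovisual_db: float) -> float:
    """Benefit (dB) from adding subtitles; positive means subtitles helped."""
    return srt_auditory_db - srt_audiovisual_db

print(audiovisual_benefit(srt_auditory_db=-2.0, srt_audiovisual_db=-5.5))  # 3.5 dB
```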

  10. Dual silent communication system development based on subvocal speech and Raspberry Pi

    Directory of Open Access Journals (Sweden)

    José Daniel Ramírez-Corzo

    2016-09-01

    Additionally, in this article we present the implementation of the subvocal speech signal recording system. The average accuracy was 72.5 %, over a total of 50 words per class, i.e., 200 signals. Finally, we demonstrate that a silent communication system using subvocal speech signals can be implemented on the Raspberry Pi.

  11. Imaging for understanding speech communication: Advances and challenges

    Science.gov (United States)

    Narayanan, Shrikanth

    2005-04-01

    Research in speech communication has relied on a variety of instrumentation methods to illuminate details of speech production and perception. One longstanding challenge has been the ability to examine real-time changes in the shaping of the vocal tract; a goal that has been furthered by imaging techniques such as ultrasound, movement tracking, and magnetic resonance imaging. The spatial and temporal resolution afforded by these techniques, however, has limited the scope of the investigations that could be carried out. In this talk, we focus on some recent advances in magnetic resonance imaging that allow us to perform near real-time investigations on the dynamics of vocal tract shaping during speech. Examples include Demolin et al. (2000) (4-5 images/second, ultra-fast turbo spin echo) and Mady et al. (2001, 2002) (8 images/second, T1 fast gradient echo). A recent study by Narayanan et al. (2004) that used a spiral readout scheme to accelerate image acquisition has allowed for image reconstruction rates of 24 images/second. While these developments offer exciting prospects, a number of challenges lie ahead, including: (1) improving image acquisition protocols, hardware for enhancing signal-to-noise ratio, and optimizing spatial sampling; (2) acquiring quality synchronized audio; and (3) analyzing and modeling image data including cross-modality registration. [Work supported by NIH and NSF.]

  12. Description of Student’s Metacognitive Ability in Understanding and Solving Mathematics Problem

    Science.gov (United States)

    Ahmad, Herlina; Febryanti, Fatimah; Febryanti, Fatimah; Muthmainnah

    2018-01-01

    This qualitative research aimed to describe students' metacognitive ability in understanding and solving mathematics problems. The subjects were first-year students in the computer and networking department of SMK Mega Link Majene, selected by purposive sampling. Data were collected with a test of student achievement and an interview guide; observation was used to ascertain that the teaching model applied by the teacher was one aimed at developing metacognition. Data were analyzed through data reduction, presentation, and the drawing of conclusions. The overall findings show that students' metacognitive ability generally does not develop optimally. This is attributed to the limited scope of the materials and to cognitive teaching strategies relying on verbal presentation rather than continuous training on cognitive tasks such as understanding and solving problems.

  13. Speech activity detection for the automated speaker recognition system of critical use

    Directory of Open Access Journals (Sweden)

    M. M. Bykov

    2017-06-01

    Full Text Available In the article, the authors develop a speech activity detection method for an automated speaker recognition system of critical use, with wavelet parameterization of the speech signal and classification of "speech"/"pause" intervals using a curvilinear neural network. The proposed wavelet-parameterization method allows choosing the optimal parameters of the wavelet transform in accordance with a user-specified error of representation of the speech signal. The method also allows estimating the loss of information for the selected parameters of the continuous wavelet transform (CWT), which made it possible to reduce the number of scale coefficients of the CWT of the speech signal by an order of magnitude while keeping the distortion of the local CWT spectrum within the allowable range. An algorithm for detecting speech activity with a curvilinear neural network classifier is also proposed; it shows high-quality segmentation of speech signals into "speech"/"pause" intervals and is robust to narrowband and technogenic noise in the speech signal owing to the inherent properties of the curvilinear neural network.
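
    As a rough illustration of the wavelet-parameterized detection idea (not the authors' algorithm, and with an energy threshold standing in for their neural-network classifier), one might frame the signal and label frames as follows:

```python
# Sketch: wavelet-energy voice activity detection with "speech"/"pause" labels.
# Frame length, wavelet, and threshold ratio are illustrative assumptions.
import numpy as np
import pywt

def wavelet_vad(x, frame_len=512, wavelet="db4", level=4, thresh_ratio=2.0):
    n_frames = len(x) // frame_len
    noise_floor = None
    labels = []
    for i in range(n_frames):
        frame = x[i * frame_len:(i + 1) * frame_len]
        coeffs = pywt.wavedec(frame, wavelet, level=level)
        energy = sum(float(np.sum(c ** 2)) for c in coeffs[1:])   # detail bands
        if noise_floor is None:
            noise_floor = energy              # assume the first frame is pause
        labels.append("speech" if energy > thresh_ratio * noise_floor else "pause")
        if labels[-1] == "pause":             # slowly track the noise floor
            noise_floor = 0.9 * noise_floor + 0.1 * energy
    return labels

rng = np.random.default_rng(0)
x = np.concatenate([0.01 * rng.standard_normal(4096),        # "pause"
                    np.sin(2 * np.pi * 0.05 * np.arange(4096))])  # "speech"
print(wavelet_vad(x))
```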

  14. Automatic Emotion Recognition in Speech: Possibilities and Significance

    Directory of Open Access Journals (Sweden)

    Milana Bojanić

    2009-12-01

    Full Text Available Automatic speech recognition and spoken language understanding are crucial steps towards natural human-machine interaction. The main task of the speech communication process is the recognition of the word sequence, but the recognition of prosody, emotion and stress tags may be of particular importance as well. This paper discusses the possibilities of recognizing emotion from the speech signal in order to improve ASR, and also provides an analysis of acoustic features that can be used for the detection of a speaker's emotion and stress. The paper also provides a short overview of emotion and stress classification techniques. The importance and place of emotional speech recognition are shown in the domain of human-computer interactive systems and the transaction communication model. Directions for future work are given at the end of this work.
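
    A hedged sketch of the kind of acoustic front-end such systems typically use, combining MFCCs with simple prosodic statistics; the exact feature set and classifier here are illustrative assumptions, not the paper's recipe:

```python
# Sketch: MFCC + prosody feature vector for emotion/stress classification.
import numpy as np
import librosa
from sklearn.svm import SVC

def emotion_features(y, sr):
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr)        # pitch contour
    rms = librosa.feature.rms(y=y)[0]                    # energy contour
    return np.hstack([mfcc.mean(axis=1), mfcc.std(axis=1),
                      [np.nanmean(f0), np.nanstd(f0), rms.mean(), rms.std()]])

sr = 22050
t = np.linspace(0, 1, sr, endpoint=False)
y = (0.6 * np.sin(2 * np.pi * 200 * t)).astype(np.float32)   # toy "utterance"
print(emotion_features(y, sr).shape)                     # one feature vector

# Training would pair such vectors with emotion labels, e.g.:
# clf = SVC(kernel="rbf").fit(X_train, labels)
```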

  15. Speech processing system demonstrated by positron emission tomography (PET). A review of the literature

    International Nuclear Information System (INIS)

    Hirano, Shigeru; Naito, Yasushi; Kojima, Hisayoshi

    1996-01-01

    We review the literature on speech processing in the central nervous system as demonstrated by positron emission tomography (PET). Activation studies using PET have proved to be a useful and non-invasive method of investigating the speech processing system in normal subjects. In speech recognition, the auditory association areas and the lexico-semantic areas known as Wernicke's area play important roles. Broca's area, the motor areas, the supplementary motor cortices and the prefrontal area have been shown to be related to speech output. Visual speech stimulation activates not only the visual association areas but also the temporal region and prefrontal area, especially in lexico-semantic processing. Higher-level speech processing, such as conversation, which includes auditory processing, vocalization and thinking, activates broad areas in both hemispheres. This paper also discusses problems to be resolved in the future. (author) 42 refs

  16. End-User Recommendations on LOGOMON - a Computer Based Speech Therapy System for Romanian Language

    Directory of Open Access Journals (Sweden)

    SCHIPOR, O. A.

    2010-11-01

    Full Text Available In this paper we highlight the relations between LOGOMON - a Computer Based Speech Therapy System - and the training steps for dyslalia, a speech disorder that affects the pronunciation of one or more sounds. The presentation of the system is complemented by a study of end-user (i.e., teacher and parent) attitudes towards speech-assisted therapy in general and the LOGOMON system in particular. The results of this study allow the improvement of our CBST system, because the information obtained can serve as a source of adaptability to the different expectations of the beneficiaries.

  17. Automatic speech recognition used for evaluation of text-to-speech systems

    Czech Academy of Sciences Publication Activity Database

    Vích, Robert; Nouza, J.; Vondra, Martin

    -, no. 5042 (2008), pp. 136-148, ISSN 0302-9743. R&D Projects: GA AV ČR 1ET301710509; GA AV ČR 1QS108040569. Institutional research plan: CEZ:AV0Z20670512. Keywords: speech recognition * speech processing. Subject RIV: JA - Electronics; Optoelectronics, Electrical Engineering

  18. Modeling auditory processing and speech perception in hearing-impaired listeners

    DEFF Research Database (Denmark)

    Jepsen, Morten Løve

    A better understanding of how the human auditory system represents and analyzes sounds and how hearing impairment affects such processing is of great interest for researchers in the fields of auditory neuroscience, audiology, and speech communication, as well as for applications in hearing-instrument and speech technology. In this thesis, the primary focus was on the development and evaluation of a computational model of human auditory signal-processing and perception. The model was initially designed to simulate the normal-hearing auditory system, with particular focus on the nonlinear processing ... in a diagnostic rhyme test. The framework was constructed such that discrimination errors originating from the front-end and the back-end were separated. The front-end was fitted to individual listeners with cochlear hearing loss according to non-speech data, and speech data were obtained in the same listeners ...

  19. Is the Speech Transmission Index (STI) a robust measure of sound system speech intelligibility performance?

    Science.gov (United States)

    Mapp, Peter

    2002-11-01

    Although RaSTI is a good indicator of the speech intelligibility capability of auditoria and similar spaces, during the past 2-3 years it has been shown that RaSTI is not a robust predictor of sound system intelligibility performance. Instead, it is now recommended, within both national and international codes and standards, that full STI measurement and analysis be employed. However, new research is reported that indicates that STI is not as flawless, nor as robust, as many believe. The paper highlights a number of potential error mechanisms. It is shown that the measurement technique and signal excitation stimulus can have a significant effect on the overall result and accuracy, particularly where DSP-based equipment is employed. It is also shown that, in its current state of development, STI is not capable of appropriately accounting for a number of fundamental speech and system attributes, including typical sound system frequency response variations and anomalies. This is particularly shown to be the case when a system is operating under reverberant conditions. Comparisons between actual system measurements and corresponding word score data are reported, with errors of up to 50%. The implications for VA and PA system performance verification will be discussed.
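
    For context, the STI itself is computed from measured modulation indices m across octave bands and modulation frequencies. A simplified version of the standard computation (omitting the auditory masking and redundancy corrections of IEC 60268-16) looks like this:

```python
# Sketch: simplified STI from a measured modulation transfer function.
# m[k, f] is the modulation index for octave band k and modulation frequency f.
import numpy as np

BAND_WEIGHTS = np.array([0.13, 0.14, 0.11, 0.12, 0.19, 0.17, 0.14])  # 125 Hz..8 kHz

def sti_from_mtf(m):
    snr_app = 10.0 * np.log10(m / (1.0 - m))    # apparent SNR per (band, mod freq)
    snr_app = np.clip(snr_app, -15.0, 15.0)     # limit to +/-15 dB
    ti = (snr_app + 15.0) / 30.0                # transmission index in [0, 1]
    mti = ti.mean(axis=1)                       # average over the 14 mod freqs
    return float(BAND_WEIGHTS @ mti)            # octave-band weighted sum

m = np.full((7, 14), 0.7)                       # a flat, fairly good MTF
print(round(sti_from_mtf(m), 2))                # ~0.62
```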

  20. Optimal speech motor control and token-to-token variability: a Bayesian modeling approach.

    Science.gov (United States)

    Patri, Jean-François; Diard, Julien; Perrier, Pascal

    2015-12-01

    The remarkable capacity of the speech motor system to adapt to various speech conditions is due to an excess of degrees of freedom, which enables producing similar acoustical properties with different sets of control strategies. To explain how the central nervous system selects one of the possible strategies, a common approach, in line with optimal motor control theories, is to model speech motor planning as the solution of an optimality problem based on cost functions. Despite the success of this approach, one of its drawbacks is the intrinsic contradiction between the concept of optimality and the observed experimental intra-speaker token-to-token variability. The present paper proposes an alternative approach by formulating feedforward optimal control in a probabilistic Bayesian modeling framework. This is illustrated by controlling a biomechanical model of the vocal tract for speech production and by comparing it with an existing optimal control model (GEPPETO). The essential elements of this optimal control model are presented first. From them the Bayesian model is constructed in a progressive way. Performance of the Bayesian model is evaluated based on computer simulations and compared to the optimal control model. This approach is shown to be appropriate for solving the speech planning problem while accounting for variability in a principled way.
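
    The paper's conceptual move can be miniaturized: instead of returning the single cost-minimizing command, sample commands from a distribution whose density decays with cost, so repeated tokens vary. A toy one-dimensional sketch with an invented cost function:

```python
# Sketch: deterministic optimal control vs. sampling from a cost-shaped
# posterior, so token-to-token variability emerges naturally.
import numpy as np

rng = np.random.default_rng(1)
u = np.linspace(-3, 3, 601)                    # candidate motor commands
cost = (u - 0.8) ** 2 + 0.3 * np.abs(u)        # invented accuracy + effort terms

u_optimal = u[np.argmin(cost)]                 # classical optimal control answer

post = np.exp(-cost / 0.2)                     # Bayesian view: p(u) ~ exp(-cost/T)
post /= post.sum()
tokens = rng.choice(u, size=10, p=post)        # repeated productions vary

print("optimal:", round(float(u_optimal), 2), "tokens:", np.round(tokens, 2))
```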

  1. Understanding speech when wearing communication headsets and hearing protectors with subband processing.

    Science.gov (United States)

    Brammer, Anthony J; Yu, Gongqiang; Bernstein, Eric R; Cherniack, Martin G; Peterson, Donald R; Tufts, Jennifer B

    2014-08-01

    An adaptive, delayless, subband feed-forward control structure is employed to improve the speech signal-to-noise ratio (SNR) in the communication channel of a circumaural headset/hearing protector (HPD) from 90 Hz to 11.3 kHz, and to provide active noise control (ANC) from 50 to 800 Hz to complement the passive attenuation of the HPD. The task involves optimizing the speech SNR for each communication channel subband, subject to limiting the maximum sound level at the ear, maintaining a speech SNR preferred by users, and reducing large inter-band gain differences to improve speech quality. The performance of a proof-of-concept device has been evaluated in a pseudo-diffuse sound field when worn by human subjects under conditions of environmental noise and speech that do not pose a risk to hearing, and by simulation for other conditions. For the environmental noises employed in this study, subband speech SNR control combined with subband ANC produced greater improvement in word scores than subband ANC alone, and improved the consistency of word scores across subjects. The simulation employed a subject-specific linear model, and predicted that word scores are maintained in excess of 90% for sound levels outside the HPD of up to ∼115 dBA.
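
    A bare-bones, offline sketch of the per-subband SNR management idea (the device described above does this adaptively and delaylessly; the band powers and limits below are invented):

```python
# Sketch: boost weak subbands toward a target speech SNR, with a gain cap.
import numpy as np

def subband_snr_gains(P_speech, P_noise, target_snr_db=15.0, max_gain_db=20.0):
    snr_db = 10.0 * np.log10(P_speech / np.maximum(P_noise, 1e-12))
    gain_db = np.clip(target_snr_db - snr_db, 0.0, max_gain_db)  # boost only
    return 10.0 ** (gain_db / 20.0)             # linear amplitude gains

P_s = np.array([1.0, 0.5, 0.2, 0.05])           # per-band speech power (made up)
P_n = np.array([0.1, 0.2, 0.3, 0.4])            # per-band noise power (made up)
print(np.round(subband_snr_gains(P_s, P_n), 1))
```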

  2. Speech interaction strategies for a humanoid assistant

    Directory of Open Access Journals (Sweden)

    Stüker Sebastian

    2018-01-01

    Full Text Available The goal of SecondHands, a H2020 project, is to design a robot that can offer help to a maintenance technician in a proactive manner. The robot is to act as a second pair of hands that can assist the technician when he is in need of help. In order for the robot to be of real help to the technician, it needs to understand his needs and follow his commands; interaction via speech is a crucial part of this. Because the technician often needs to speak to the robot while under stress and performing strenuous physical labor, the classical turn-based interaction schemes need to be transformed into dialogue systems that perform stream processing, anticipating user intentions and correcting themselves as more information becomes available, in order to respond rapidly. To meet these demands, we are developing low-latency streaming automatic speech recognition systems in combination with recurrent-neural-network-based Natural Language Understanding systems that perform slot filling and intent recognition, so that the robot can provide assistance rapidly, partly based on speculative classifications that are then refined as more speech becomes available.

  3. An analysis of the masking of speech by competing speech using self-report data (L)

    OpenAIRE

    Agus, Trevor R.; Akeroyd, Michael A.; Noble, William; Bhullar, Navjot

    2009-01-01

    Many of the items in the “Speech, Spatial, and Qualities of Hearing” scale questionnaire [S. Gatehouse and W. Noble, Int. J. Audiol.43, 85–99 (2004)] are concerned with speech understanding in a variety of backgrounds, both speech and nonspeech. To study if this self-report data reflected informational masking, previously collected data on 414 people were analyzed. The lowest scores (greatest difficulties) were found for the two items in which there were two speech targets, with successively ...

  4. Confluent-Functional solving systems

    Directory of Open Access Journals (Sweden)

    V.N. Koval

    2001-08-01

    Full Text Available The paper proposes a statistical knowledge-acquisition approach. It considers solving systems that are able to find unknown structural dependences between situational and transforming variables on the basis of statistically analyzed input information. Situational variables describe features, states and relations between environment objects. Transforming variables describe transforming influences exerted by a goal-oriented system on an environment. Unknown environment rules are simulated by a system of structural equations associating situational and transforming variables.

  5. How may the basal ganglia contribute to auditory categorization and speech perception?

    Directory of Open Access Journals (Sweden)

    Sung-Joo eLim

    2014-08-01

    Full Text Available Listeners must accomplish two complementary perceptual feats in extracting a message from speech. They must discriminate linguistically-relevant acoustic variability and generalize across irrelevant variability. Said another way, they must categorize speech. Since the mapping of acoustic variability is language-specific, these categories must be learned from experience. Thus, understanding how, in general, the auditory system acquires and represents categories can inform us about the toolbox of mechanisms available to speech perception. This perspective invites consideration of findings from cognitive neuroscience literatures outside of the speech domain as a means of constraining models of speech perception. Although neurobiological models of speech perception have mainly focused on cerebral cortex, research outside the speech domain is consistent with the possibility of significant subcortical contributions in category learning. Here, we review the functional role of one such structure, the basal ganglia. We examine research from animal electrophysiology, human neuroimaging, and behavior to consider characteristics of basal ganglia processing that may be advantageous for speech category learning. We also present emerging evidence for a direct role for basal ganglia in learning auditory categories in a complex, naturalistic task intended to model the incidental manner in which speech categories are acquired. To conclude, we highlight new research questions that arise in incorporating the broader neuroscience research literature in modeling speech perception, and suggest how understanding contributions of the basal ganglia can inform attempts to optimize training protocols for learning non-native speech categories in adulthood.

  6. Accountability Steps for Highly Reluctant Speech: Tiered-Services Consultation in a Head Start Classroom

    Science.gov (United States)

    Howe, Heather; Barnett, David

    2013-01-01

    This consultation description reports parent and teacher problem solving for a preschool child with no typical speech directed to teachers or peers, and, by parent report, normal speech at home. This child's initial pattern of speech was similar to selective mutism, a low-incidence disorder often first detected during the preschool years, but…

  7. LIBERDADE DE EXPRESSÃO E DISCURSO DO ÓDIO NO BRASIL / FREE SPEECH AND HATE SPEECH IN BRAZIL

    Directory of Open Access Journals (Sweden)

    Nevita Maria Pessoa de Aquino Franca Luna

    2014-12-01

    Full Text Available The purpose of this article is to analyze the restriction of free speech when it comes close to hate speech. In this perspective, the aim of this study is to answer the question: what understanding has the Brazilian Supreme Court adopted in cases involving the conflict between free speech and hate speech? The methodology combines a bibliographic review of the theoretical assumptions of the research (the concepts of free speech and hate speech, and the understanding of the rights of defense of traditionally discriminated minorities) with empirical research (documentary and jurisprudential analysis of cases judged by the American, German and Brazilian courts). Firstly, free speech is discussed, defining its meaning, content and purpose. Then, hate speech is identified as an inhibitor of free speech, offending members of traditionally discriminated minorities who are outnumbered or in a situation of cultural, socioeconomic or political subordination. Subsequently, some aspects of the American (negative freedom) and German (positive freedom) models are discussed, to demonstrate that different cultures adopt different legal solutions. In the end, it is concluded that the Brazilian understanding approximates the German doctrine, based on the analysis of landmark cases such as those of the publisher Siegfried Ellwanger (2003) and the samba school Unidos do Viradouro (2008). The Brazilian comprehension, that of a multicultural country made up of different ethnicities, leads to a new process of defending minorities which, despite involving the collision of fundamental rights (dignity, equality and freedom), is still restrained by the incompatible barriers of a contemporary pluralistic democracy.

  8. Voice Activity Detection. Fundamentals and Speech Recognition System Robustness

    OpenAIRE

    Ramirez, J.; Gorriz, J. M.; Segura, J. C.

    2007-01-01

    This chapter gives an overview of the main challenges in robust speech detection and a review of the state of the art and applications. VADs are frequently used in a number of applications, including speech coding, speech enhancement and speech recognition. A precise VAD extracts a set of discriminative speech features from the noisy speech and formulates the decision in terms of a well-defined rule. The chapter summarizes three robust VAD methods that yield high speech/non-speech discri...

  9. The natural statistics of audiovisual speech.

    Directory of Open Access Journals (Sweden)

    Chandramouli Chandrasekaran

    2009-07-01

    Full Text Available Humans, like other animals, are exposed to a continuous stream of signals, which are dynamic, multimodal, extended, and time varying in nature. This complex input space must be transduced and sampled by our sensory systems and transmitted to the brain where it can guide the selection of appropriate actions. To simplify this process, it's been suggested that the brain exploits statistical regularities in the stimulus space. Tests of this idea have largely been confined to unimodal signals and natural scenes. One important class of multisensory signals for which a quantitative input space characterization is unavailable is human speech. We do not understand what signals our brain has to actively piece together from an audiovisual speech stream to arrive at a percept versus what is already embedded in the signal structure of the stream itself. In essence, we do not have a clear understanding of the natural statistics of audiovisual speech. In the present study, we identified the following major statistical features of audiovisual speech. First, we observed robust correlations and close temporal correspondence between the area of the mouth opening and the acoustic envelope. Second, we found the strongest correlation between the area of the mouth opening and vocal tract resonances. Third, we observed that both area of the mouth opening and the voice envelope are temporally modulated in the 2-7 Hz frequency range. Finally, we show that the timing of mouth movements relative to the onset of the voice is consistently between 100 and 300 ms. We interpret these data in the context of recent neural theories of speech which suggest that speech communication is a reciprocally coupled, multisensory event, whereby the outputs of the signaler are matched to the neural processes of the receiver.
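
    Two of these measurements are easy to reproduce in outline: the mouth-envelope correlation and the modulation spectrum of the envelope. A sketch on synthetic traces (the 4 Hz modulation rate and 150 ms mouth lead are assumptions for illustration):

```python
# Sketch: correlate a mouth-area trace with the acoustic envelope and check
# the envelope's modulation spectrum for syllable-rate (2-7 Hz) energy.
import numpy as np

fs = 100.0                                       # both traces sampled at 100 Hz
t = np.arange(0, 10, 1 / fs)
envelope = 1 + np.sin(2 * np.pi * 4 * t)         # 4 Hz syllable-rate modulation
mouth_area = np.roll(envelope, -int(0.15 * fs))  # mouth leads voice by ~150 ms

r = np.corrcoef(mouth_area, envelope)[0, 1]
spectrum = np.abs(np.fft.rfft(envelope - envelope.mean()))
freqs = np.fft.rfftfreq(envelope.size, 1 / fs)
print(f"corr = {r:.2f}, dominant modulation = {freqs[np.argmax(spectrum)]:.1f} Hz")
```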

  10. A causal test of the motor theory of speech perception: a case of impaired speech production and spared speech perception.

    Science.gov (United States)

    Stasenko, Alena; Bonn, Cory; Teghipco, Alex; Garcea, Frank E; Sweet, Catherine; Dombovy, Mary; McDonough, Joyce; Mahon, Bradford Z

    2015-01-01

    The debate about the causal role of the motor system in speech perception has been reignited by demonstrations that motor processes are engaged during the processing of speech sounds. Here, we evaluate which aspects of auditory speech processing are affected, and which are not, in a stroke patient with dysfunction of the speech motor system. We found that the patient showed a normal phonemic categorical boundary when discriminating two non-words that differ by a minimal pair (e.g., ADA-AGA). However, using the same stimuli, the patient was unable to identify or label the non-word stimuli (using a button-press response). A control task showed that he could identify speech sounds by speaker gender, ruling out a general labelling impairment. These data suggest that while the motor system is not causally involved in perception of the speech signal, it may be used when other cues (e.g., meaning, context) are not available.

  11. Training of ultra-fast speech comprehension induces functional reorganization of the central-visual system in late-blind humans

    Directory of Open Access Journals (Sweden)

    Susanne eDietrich

    2013-10-01

    Full Text Available Individuals suffering from vision loss of a peripheral origin may learn to understand spoken language at a rate of up to about 22 syllables (syl per seconds (s – exceeding by far the maximum performance level of untrained listeners (ca. 8 syl/s. Previous findings indicate the central-visual system to contribute to the processing of accelerated speech in blind subjects. As an extension, the present training study addresses the issue whether acquisition of ultra-fast (18 syl/s speech perception skills induces de novo central-visual hemodynamic activation in late-blind participants. Furthermore, we asked to what extent subjects with normal or residual vision can improve understanding of accelerated verbal utterances by means of specific training measures. To these ends, functional magnetic resonance imaging (fMRI was performed while subjects were listening to forward and reversed sentence utterances of moderately fast and ultra-fast syllable rates (8 or 18 syl/s prior to and after a training period of ca. six months. Four of six participants showed – independently from residual visual functions – considerable enhancement of ultra-fast speech perception (about 70 percentage points correctly repeated words whereas behavioral performance did not change in the two remaining participants. Only subjects with very low visual acuity displayed training-induced hemodynamic activation of the central-visual system. By contrast, participants with moderately impaired or even normal visual acuity showed, instead, increased right-hemispheric frontal or bilateral anterior temporal lobe responses after training. All subjects with significant training effects displayed a concomitant increase of hemodynamic activation of left-hemispheric SMA. In spite of similar behavioral performance, trained experts appear to use distinct strategies of ultra-fast speech processing depending on whether the occipital cortex is still deployed for visual processing.

  12. BILINGUAL MULTIMODAL SYSTEM FOR TEXT-TO-AUDIOVISUAL SPEECH AND SIGN LANGUAGE SYNTHESIS

    Directory of Open Access Journals (Sweden)

    A. A. Karpov

    2014-09-01

    Full Text Available We present a conceptual model, architecture and software of a multimodal system for audio-visual speech and sign language synthesis by the input text. The main components of the developed multimodal synthesis system (signing avatar are: automatic text processor for input text analysis; simulation 3D model of human's head; computer text-to-speech synthesizer; a system for audio-visual speech synthesis; simulation 3D model of human’s hands and upper body; multimodal user interface integrating all the components for generation of audio, visual and signed speech. The proposed system performs automatic translation of input textual information into speech (audio information and gestures (video information, information fusion and its output in the form of multimedia information. A user can input any grammatically correct text in Russian or Czech languages to the system; it is analyzed by the text processor to detect sentences, words and characters. Then this textual information is converted into symbols of the sign language notation. We apply international «Hamburg Notation System» - HamNoSys, which describes the main differential features of each manual sign: hand shape, hand orientation, place and type of movement. On their basis the 3D signing avatar displays the elements of the sign language. The virtual 3D model of human’s head and upper body has been created using VRML virtual reality modeling language, and it is controlled by the software based on OpenGL graphical library. The developed multimodal synthesis system is a universal one since it is oriented for both regular users and disabled people (in particular, for the hard-of-hearing and visually impaired, and it serves for multimedia output (by audio and visual modalities of input textual information.

  13. Improving the speech intelligibility in classrooms

    Science.gov (United States)

    Lam, Choi Ling Coriolanus

    One of the major acoustical concerns in classrooms is the establishment of effective verbal communication between teachers and students. Non-optimal acoustical conditions, resulting in reduced verbal communication, can cause two main problems. First, they can lead to reduce learning efficiency. Second, they can also cause fatigue, stress, vocal strain and health problems, such as headaches and sore throats, among teachers who are forced to compensate for poor acoustical conditions by raising their voices. Besides, inadequate acoustical conditions can induce the usage of public address system. Improper usage of such amplifiers or loudspeakers can lead to impairment of students' hearing systems. The social costs of poor classroom acoustics will be large to impair the learning of children. This invisible problem has far reaching implications for learning, but is easily solved. Many researches have been carried out that they have accurately and concisely summarized the research findings on classrooms acoustics. Though, there is still a number of challenging questions remaining unanswered. Most objective indices for speech intelligibility are essentially based on studies of western languages. Even several studies of tonal languages as Mandarin have been conducted, there is much less on Cantonese. In this research, measurements have been done in unoccupied rooms to investigate the acoustical parameters and characteristics of the classrooms. The speech intelligibility tests, which based on English, Mandarin and Cantonese, and the survey were carried out on students aged from 5 years old to 22 years old. It aims to investigate the differences in intelligibility between English, Mandarin and Cantonese of the classrooms in Hong Kong. The significance on speech transmission index (STI) related to Phonetically Balanced (PB) word scores will further be developed. Together with developed empirical relationship between the speech intelligibility in classrooms with the variations

  14. Enhancement of speech signals - with a focus on voiced speech models

    DEFF Research Database (Denmark)

    Nørholm, Sidsel Marie

    This thesis deals with speech enhancement, i.e., noise reduction in speech signals. This has applications in, e.g., hearing aids and teleconference systems. We consider a signal-driven approach to speech enhancement where a model of the speech is assumed and filters are generated based on this model. The basic model used in this thesis is the harmonic model, which is a commonly used model for describing the voiced part of the speech signal. We show that it can be beneficial to extend the model to take inharmonicities or the non-stationarity of speech into account. Extending the model ...
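
    The harmonic model itself is compact: a voiced frame is modeled as a sum of sinusoids at multiples of the fundamental f0, and the amplitudes can be estimated by least squares. A minimal sketch, assuming f0 is known:

```python
# Sketch: least-squares fit of a harmonic model to one voiced frame.
import numpy as np

def fit_harmonic_model(x, f0, fs, n_harmonics):
    n = np.arange(len(x))
    # Design matrix with a cosine/sine pair for each harmonic of f0.
    Z = np.hstack([np.c_[np.cos(2 * np.pi * f0 * k * n / fs),
                         np.sin(2 * np.pi * f0 * k * n / fs)]
                   for k in range(1, n_harmonics + 1)])
    coef, *_ = np.linalg.lstsq(Z, x, rcond=None)
    return Z @ coef                              # model-based reconstruction

fs, f0 = 8000, 150.0
n = np.arange(240)                               # one 30 ms frame
x = (np.cos(2 * np.pi * 150 * n / fs) + 0.4 * np.cos(2 * np.pi * 300 * n / fs)
     + 0.05 * np.random.default_rng(2).standard_normal(n.size))
x_hat = fit_harmonic_model(x, f0, fs, n_harmonics=5)
print("residual power:", float(np.mean((x - x_hat) ** 2)))
```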

  15. On the use of the distortion-sensitivity approach in examining the role of linguistic abilities in speech understanding in noise.

    Science.gov (United States)

    Goverts, S Theo; Huysmans, Elke; Kramer, Sophia E; de Groot, Annette M B; Houtgast, Tammo

    2011-12-01

    Researchers have used the distortion-sensitivity approach in the psychoacoustical domain to investigate the role of auditory processing abilities in speech perception in noise (van Schijndel, Houtgast, & Festen, 2001; Goverts & Houtgast, 2010). In this study, the authors examined the potential applicability of the distortion-sensitivity approach for investigating the role of linguistic abilities in speech understanding in noise. The authors applied the distortion-sensitivity approach by measuring the processing of visually presented masked text in a condition with manipulated syntactic, lexical, and semantic cues and while using the Text Reception Threshold (George et al., 2007; Kramer, Zekveld, & Houtgast, 2009; Zekveld, George, Kramer, Goverts, & Houtgast, 2007) method. Two groups that differed in linguistic abilities were studied: 13 native and 10 non-native speakers of Dutch, all typically hearing university students. As expected, the non-native subjects showed substantially reduced performance. The results of the distortion-sensitivity approach yielded differentiated results on the use of specific linguistic cues in the 2 groups. The results show the potential value of the distortion-sensitivity approach in studying the role of linguistic abilities in speech understanding in noise of individuals with hearing impairment.

  16. The impact of cochlear implantation on speech understanding, subjective hearing performance, and tinnitus perception in patients with unilateral severe to profound hearing loss.

    Science.gov (United States)

    Távora-Vieira, Dayse; Marino, Roberta; Acharya, Aanand; Rajan, Gunesh P

    2015-03-01

    This study aimed to determine the impact of cochlear implantation on speech understanding in noise, subjective perception of hearing, and tinnitus perception of adult patients with unilateral severe to profound hearing loss and to investigate whether duration of deafness and age at implantation would influence the outcomes. In addition, this article describes the auditory training protocol used for unilaterally deaf patients. This is a prospective study of subjects undergoing cochlear implantation for unilateral deafness with or without associated tinnitus. Speech perception in noise was tested using the Bamford-Kowal-Bench speech-in-noise test presented at 65 dB SPL. The Speech, Spatial, and Qualities of Hearing Scale and the Abbreviated Profile of Hearing Aid Benefit were used to evaluate the subjective perception of hearing with a cochlear implant and quality of life. Tinnitus disturbance was measured using the Tinnitus Reaction Questionnaire. Data were collected before cochlear implantation and 3, 6, 12, and 24 months after implantation. Twenty-eight postlingual unilaterally deaf adults with or without tinnitus were implanted. There was a significant improvement in speech perception in noise across time in all spatial configurations. There was an overall significant improvement on the subjective perception of hearing and quality of life. Tinnitus disturbance reduced significantly across time. Age at implantation and duration of deafness did not influence the outcomes significantly. Cochlear implantation provided significant improvement in speech understanding in challenging situations, subjective perception of hearing performance, and quality of life. Cochlear implantation also resulted in reduced tinnitus disturbance. Age at implantation and duration of deafness did not seem to influence the outcomes.

  17. The Effectiveness of Clear Speech as a Masker

    Science.gov (United States)

    Calandruccio, Lauren; Van Engen, Kristin; Dhar, Sumitrajit; Bradlow, Ann R.

    2010-01-01

    Purpose: It is established that speaking clearly is an effective means of enhancing intelligibility. Because any signal-processing scheme modeled after known acoustic-phonetic features of clear speech will likely affect both target and competing speech, it is important to understand how speech recognition is affected when a competing speech signal…

  18. Hate Speech Detection: A Solved Problem? The Challenging Case of Long Tail on Twitter

    OpenAIRE

    Zhang, Ziqi; Luo, Lei

    2018-01-01

    In recent years, the increasing propagation of hate speech on social media and the urgent need for effective counter-measures have drawn significant investment from governments, companies, and empirical research. Despite a large number of emerging, scientific studies to address the problem, the performance of existing automated methods at identifying specific types of hate speech - as opposed to identifying non-hate -is still very unsatisfactory, and the reasons behind are poorly understood. ...

  19. Systemic multimodal approach to speech therapy treatment in autistic children.

    Science.gov (United States)

    Tamas, Daniela; Marković, Slavica; Milankov, Vesela

    2013-01-01

    The conditions under which speech therapy is applied to autistic children often do not accord with the characteristics of thinking and learning in people with autism. A systemic multimodal approach means motivating autistic people to develop their language and speech skills through a procedure that allows them to relive personal experience of the contents presented in their natural social environment. This research was aimed at evaluating the efficiency of speech treatment based on a systemic multimodal approach to working with autistic children. The study sample consisted of 34 children, aged 8 to 16 years, diagnosed with different autistic disorders, whose results showed a moderate to severe clinical picture of autism on the Childhood Autism Rating Scale. The instruments applied for the evaluation of ability were the Childhood Autism Rating Scale and the Ganzberg II test. The subjects were divided into two groups according to the type of treatment: children covered by continuing treatment with the systemic multimodal approach, and children covered by classical speech treatment. It is shown that the systemic multimodal approach in teaching autistic children stimulates communication, socialization, self-care and work, and that the progress achieved in these areas of functioning was retained over the long term. By applying the systemic multimodal approach with autistic children and comparing their achievements on tests applied before, during and after its application, it is concluded that a certain improvement in functionality was achieved within the diagnosed category. The results point to a possible direction for the creation of new methods, plans and programs for work with autistic children based on empirical and interactive learning.

  20. Solving the stability-accuracy-diversity dilemma of recommender systems

    Science.gov (United States)

    Hou, Lei; Liu, Kecheng; Liu, Jianguo; Zhang, Runtong

    2017-02-01

    Recommender systems are of great significance in predicting potentially interesting items based on the target user's historical selections. However, the recommendation list for a specific user has been found to change vastly when the system changes, due to the unstable quantification of item similarities, which is defined as the recommendation stability problem. Improving the similarity stability and recommendation stability is crucial for user experience enhancement and a better understanding of user interests. While the stability as well as accuracy of recommendation could be guaranteed by recommending only popular items, studies have been addressing the necessity of diversity, which requires the system to recommend unpopular items. By ranking the similarities in terms of stability and considering only the most stable ones, we present a top-n-stability method based on the Heat Conduction algorithm (denoted as TNS-HC henceforth) for solving the stability-accuracy-diversity dilemma. Experiments on four benchmark data sets indicate that the TNS-HC algorithm could significantly improve the recommendation stability and accuracy simultaneously and still retain the high-diversity nature of the Heat Conduction algorithm. Furthermore, we compare the performance of the TNS-HC algorithm with a number of benchmark recommendation algorithms. The result suggests that the TNS-HC algorithm is more efficient in solving the stability-accuracy-diversity triple dilemma of recommender systems.
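
    The Heat Conduction scoring step that TNS-HC builds on can be sketched in a few lines on a toy user-item matrix; the stability-based top-n pruning of the similarity matrix is only indicated by a comment, since estimating stability requires repeated retraining:

```python
# Sketch: Heat Conduction (HC) item scoring on a bipartite user-item graph.
import numpy as np

A = np.array([[1, 1, 0, 0],            # users x items adjacency (toy data)
              [1, 0, 1, 0],
              [0, 1, 1, 1]], dtype=float)

k_user = A.sum(axis=1)                 # user degrees
k_item = A.sum(axis=0)                 # item degrees

# HC weights: W[a, b] = (1/k_item[a]) * sum_u A[u, a] * A[u, b] / k_user[u]
W = (A / k_user[:, None]).T @ A / k_item[:, None]
# TNS-HC would keep, per item, only the top-n most *stable* entries of W here.

target_user = 0
f0 = A[target_user]                    # initial resource: the user's selections
scores = W @ f0
scores[A[target_user] == 1] = -np.inf  # do not re-recommend collected items
print("recommend item:", int(np.argmax(scores)))
```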

  1. Methods for eliciting, annotating, and analyzing databases for child speech development.

    Science.gov (United States)

    Beckman, Mary E; Plummer, Andrew R; Munson, Benjamin; Reidy, Patrick F

    2017-09-01

    Methods from automatic speech recognition (ASR), such as segmentation and forced alignment, have facilitated the rapid annotation and analysis of very large adult speech databases and databases of caregiver-infant interaction, enabling advances in speech science that were unimaginable just a few decades ago. This paper centers on two main problems that must be addressed in order to have analogous resources for developing and exploiting databases of young children's speech. The first problem is to understand and appreciate the differences between adult and child speech that cause ASR models developed for adult speech to fail when applied to child speech. These differences include the fact that children's vocal tracts are smaller than those of adult males and also changing rapidly in size and shape over the course of development, leading to between-talker variability across age groups that dwarfs the between-talker differences between adult men and women. Moreover, children do not achieve fully adult-like speech motor control until they are young adults, and their vocabularies and phonological proficiency are developing as well, leading to considerably more within-talker variability as well as more between-talker variability. The second problem then is to determine what annotation schemas and analysis techniques can most usefully capture relevant aspects of this variability. Indeed, standard acoustic characterizations applied to child speech reveal that adult-centered annotation schemas fail to capture phenomena such as the emergence of covert contrasts in children's developing phonological systems, while also revealing children's nonuniform progression toward community speech norms as they acquire the phonological systems of their native languages. Both problems point to the need for more basic research into the growth and development of the articulatory system (as well as of the lexicon and phonological system) that is oriented explicitly toward the construction of
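
    As a small illustration of the alignment machinery referred to above, dynamic time warping of MFCC sequences maps frames of one utterance onto another; forced alignment proper additionally uses acoustic models and a transcript, which is beyond this sketch (the signals below are synthetic stand-ins):

```python
# Sketch: DTW alignment of two time-warped versions of the same signal.
import numpy as np
import librosa

sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
y_ref = np.sin(2 * np.pi * (120 + 80 * t) * t).astype(np.float32)       # glide
y_new = np.sin(2 * np.pi * (120 + 80 * t ** 1.5) * t).astype(np.float32)  # warped

X = librosa.feature.mfcc(y=y_ref, sr=sr, n_mfcc=13)
Y = librosa.feature.mfcc(y=y_new, sr=sr, n_mfcc=13)
D, wp = librosa.sequence.dtw(X=X, Y=Y, metric="euclidean")  # cost, warping path
print("warping path length:", len(wp))
```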

  2. Speech coding

    Energy Technology Data Exchange (ETDEWEB)

    Ravishankar, C., Hughes Network Systems, Germantown, MD

    1998-05-08

    Speech is the predominant means of communication between human beings, and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage of the speech signal getting corrupted by noise, cross-talk and distortion. Long haul transmissions, which use repeaters to compensate for the loss in signal strength on transmission links, also increase the associated noise and distortion. On the other hand, digital transmission is relatively immune to noise, cross-talk and distortion, primarily because of the capability to faithfully regenerate the digital signal at each repeater purely based on a binary decision. Hence the end-to-end performance of the digital link essentially becomes independent of the length and operating frequency bands of the link, and from a transmission point of view digital transmission has been the preferred approach due to its higher immunity to noise. The need to carry digital speech became extremely important from a service provision point of view as well. Modern requirements have introduced the need for robust, flexible and secure services that can carry a multitude of signal types (such as voice, data and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term Speech Coding often refers to techniques that represent or code speech signals either directly as a waveform or as a set of parameters obtained by analyzing the speech signal. In either case, the codes are transmitted to the distant end where speech is reconstructed or synthesized using the received set of codes. A more generic term that is applicable to these techniques and that is often used interchangeably with speech coding is the term voice coding. This term is more generic in the sense that the
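
    A classic concrete instance of waveform coding in this spirit is mu-law companding, the logarithmic quantization used in G.711 telephony; the sketch below is a textbook formulation rather than a bit-exact G.711 codec:

```python
# Sketch: mu-law companding quantizes speech on a logarithmic scale so quiet
# samples keep proportionally more resolution than a uniform quantizer gives.
import numpy as np

MU = 255.0

def mulaw_encode(x, bits=8):
    """Map x in [-1, 1] to unsigned integer codes."""
    y = np.sign(x) * np.log1p(MU * np.abs(x)) / np.log1p(MU)
    levels = 2 ** bits
    return np.clip(((y + 1) / 2 * (levels - 1)).round(), 0, levels - 1).astype(int)

def mulaw_decode(codes, bits=8):
    levels = 2 ** bits
    y = codes / (levels - 1) * 2 - 1
    return np.sign(y) * ((1 + MU) ** np.abs(y) - 1) / MU

x = np.sin(2 * np.pi * np.linspace(0, 1, 100)) * 0.1   # quiet test tone
x_hat = mulaw_decode(mulaw_encode(x))
print("max reconstruction error:", float(np.max(np.abs(x - x_hat))))
```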

  3. Speech Understanding Systems. Summary of Results of the Five-Year Research Effort at Carnegie-Mellon University

    Science.gov (United States)

    1977-08-01

  4. Binaural Speech Understanding With Bilateral Cochlear Implants in Reverberation.

    Science.gov (United States)

    Kokkinakis, Kostas

    2018-03-08

    demonstrate the impeding properties of reverberation on binaural speech understanding. In addition, results indicate that CI recipients who struggle in everyday listening environments are also more likely to benefit less from their bilateral processors in highly reverberant environments.

  5. Understanding catastrophizing from a misdirected problem-solving perspective.

    Science.gov (United States)

    Flink, Ida K; Boersma, Katja; MacDonald, Shane; Linton, Steven J

    2012-05-01

    The aim is to explore pain catastrophizing from a problem-solving perspective. The links between catastrophizing, problem framing, and problem-solving behaviour are examined through two possible models of mediation as inferred by two contemporary and complementary theoretical models, the misdirected problem solving model (Eccleston & Crombez, 2007) and the fear-anxiety-avoidance model (Asmundson, Norton, & Vlaeyen, 2004). In this prospective study, a general population sample (n= 173) with perceived problems with spinal pain filled out questionnaires twice; catastrophizing and problem framing were assessed on the first occasion and health care seeking (as a proxy for medically oriented problem solving) was assessed 7 months later. Two different approaches were used to explore whether the data supported any of the proposed models of mediation. First, multiple regressions were used according to traditional recommendations for mediation analyses. Second, a bootstrapping method (n= 1000 bootstrap resamples) was used to explore the significance of the indirect effects in both possible models of mediation. The results verified the concepts included in the misdirected problem solving model. However, the direction of the relations was more in line with the fear-anxiety-avoidance model. More specifically, the mediation analyses provided support for viewing catastrophizing as a mediator of the relation between biomedical problem framing and medically oriented problem-solving behaviour. These findings provide support for viewing catastrophizing from a problem-solving perspective and imply a need to examine and address problem framing and catastrophizing in back pain patients. ©2011 The British Psychological Society.

  6. Speech understanding in noise with an eyeglass hearing aid: asymmetric fitting and the head shadow benefit of anterior microphones.

    NARCIS (Netherlands)

    Mens, L.H.M.

    2011-01-01

    OBJECTIVE: To test speech understanding in noise using array microphones integrated in an eyeglass device and to test if microphones placed anteriorly at the temple provide better directivity than above the pinna. DESIGN: Sentences were presented from the front and uncorrelated noise from 45, 135,

  7. 78 FR 49693 - Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...

    Science.gov (United States)

    2013-08-15

    ...-Speech Services for Individuals with Hearing and Speech Disabilities, Report and Order (Order), document...] Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services; Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and Speech Disabilities...

  8. Normal Aspects of Speech, Hearing, and Language.

    Science.gov (United States)

    Minifie, Fred. D., Ed.; And Others

    This book is written as a guide to the understanding of the processes involved in human speech communication. Ten authorities contributed material to provide an introduction to the physiological aspects of speech production and reception, the acoustical aspects of speech production and transmission, the psychophysics of sound reception, the nature…

  9. The Different Patterns of Gesture between Genders in Mathematical Problem Solving of Geometry

    Science.gov (United States)

    Harisman, Y.; Noto, M. S.; Bakar, M. T.; Amam, A.

    2017-02-01

    This article discusses students' gestures across genders when answering geometry problems. Gesture serves to check aspects of students' understanding that cannot be determined from their writing alone. This study is qualitative research; seven questions were given to two eighth-grade junior high school students of equal ability. The data were collected from a mathematical problem-solving test, video recordings of the students' presentations, and interviews in which questions were asked to check their understanding of the geometry problems, during which the researchers observed the students' gestures. The results revealed patterns of gesture in the students' conversation and prosodic cues, such as tone, intonation, speech rate and pause. Female students tended to give indecisive gestures, for instance bowing, hesitating, appearing embarrassed, nodding many times when shifting cognitive comprehension, leaning forward, and asking the interviewer questions when they encountered tough problems. Male students, in contrast, showed gestures such as playing with their fingers, focusing on the questions, taking longer to answer hard questions, and staying calm when shifting cognitive comprehension. We suggest observing a larger sample and focusing on the consistency of students' gestures in showing their understanding while solving the given problems.

  10. Religion, hate speech, and non-domination

    OpenAIRE

    Bonotti, Matteo

    2017-01-01

    In this paper I argue that one way of explaining what is wrong with hate speech is by critically assessing what kind of freedom free speech involves and, relatedly, what kind of freedom hate speech undermines. More specifically, I argue that the main arguments for freedom of speech (e.g. from truth, from autonomy, and from democracy) rely on a “positive” conception of freedom intended as autonomy and self-mastery (Berlin, 2006), and can only partially help us to understand what is wrong with ...

  11. Recovering With Acquired Apraxia of Speech: The First 2 Years.

    Science.gov (United States)

    Haley, Katarina L; Shafer, Jennifer N; Harmon, Tyson G; Jacks, Adam

    2016-12-01

    This study was intended to document speech recovery for 1 person with acquired apraxia of speech quantitatively and on the basis of her lived experience. The second author sustained a traumatic brain injury that resulted in acquired apraxia of speech. Over a 2-year period, she documented her recovery through 22 video-recorded monologues. We analyzed these monologues using a combination of auditory perceptual, acoustic, and qualitative methods. Recovery was evident for all quantitative variables examined. For speech sound production, the recovery was most prominent during the first 3 months, but slower improvement was evident for many months. Measures of speaking rate, fluency, and prosody changed more gradually throughout the entire period. A qualitative analysis of topics addressed in the monologues was consistent with the quantitative speech recovery and indicated a subjective dynamic relationship between accuracy and rate, an observation that several factors made speech sound production variable, and a persisting need for cognitive effort while speaking. Speech features improved over an extended time, but the recovery trajectories differed, indicating dynamic reorganization of the underlying speech production system. The relationship among speech dimensions should be examined in other cases and in population samples. The combination of quantitative and qualitative analysis methods offers advantages for understanding clinically relevant aspects of recovery.

  12. The Impact of the Picture Exchange Communication System on Requesting and Speech Development in Preschoolers with Autism Spectrum Disorders and Similar Characteristics

    Science.gov (United States)

    Ganz, Jennifer B.; Simpson, Richard L.; Corbin-Newsome, Jawanda

    2008-01-01

    By definition children with autism spectrum disorders (ASD) experience difficulty understanding and using language. Accordingly, visual and picture-based strategies such as the Picture Exchange Communication System (PECS) show promise in ameliorating speech and language deficits. This study reports the results of a multiple baseline across…

  13. Implementation of Hybrid Speech Dereverberation Systems and Proposing Dual Microphone Farsi Database in Order to Evaluating Enhancement Systems

    Directory of Open Access Journals (Sweden)

    Farhad Faghani

    2013-01-01

    Full Text Available In various applications, such as speech recognition and automatic teleconferencing, the recorded speech signals may be corrupted by both noise and reverberation. Reverberation causes a noticeable degradation in speech intelligibility and quality. In this research, reverberation is first described. Some dereverberation enhancement algorithms use only one microphone; they mostly rely on inverse filtering and spectral subtraction as sub-systems. On the other hand, there are many multi-microphone speech enhancement systems, among which the delay-and-sum beamformer is the most famous. Several efficient approaches have also been reported that use the linear prediction (LP) residual signal, inverse filtering, and phase error. Despite the improvements gained by the use of several input microphones, the trade-off between these gains and the complexity and computational cost imposed by additional microphones has led many researchers to focus on dual-microphone systems. Accordingly, a review of microphone array signal processing is given and an arrangement for two-microphone systems is proposed. Since we want to evaluate these algorithms on Farsi speech signals, the problem of speech intelligibility assessment is explained and a Farsi word list for the Diagnostic Rhyme Test (DRT) is presented. The structure of the presented word list is similar to that of the English DRT words. In this research, after a brief study of the above-mentioned methods, we propose and implement some hybrid techniques that benefit from the advantages of several methods and achieve significant improvement in the output signals. It is shown that the proposed method performs better than state-of-the-art dereverberation algorithms.
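
    Since the delay-and-sum beamformer is singled out above as the best-known multi-microphone system, a minimal frequency-domain sketch may help. The far-field geometry, sign convention and array layout here are illustrative assumptions, not details from the paper.

        import numpy as np

        def delay_and_sum(mics, fs, positions, angle, c=343.0):
            # mics:      (n_mics, n_samples) simultaneously recorded channels
            # positions: (n_mics,) microphone coordinates along the array axis, metres
            # angle:     steering direction in radians from broadside
            n_mics, n_samples = mics.shape
            freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
            out = np.zeros(len(freqs), dtype=complex)
            for m in range(n_mics):
                tau = positions[m] * np.sin(angle) / c   # far-field delay at mic m
                # Compensate the delay as a phase shift, then sum coherently.
                out += np.fft.rfft(mics[m]) * np.exp(2j * np.pi * freqs * tau)
            return np.fft.irfft(out / n_mics, n=n_samples)

    Signals arriving from the steering direction add coherently while reflections from other directions are attenuated, which is why this structure recurs as a building block in the hybrid systems discussed above.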

  14. Exploring Students' Understanding of Ordinary Differential Equations Using Computer Algebraic System (CAS)

    Science.gov (United States)

    Maat, Siti Mistima; Zakaria, Effandi

    2011-01-01

    Ordinary differential equations (ODEs) are one of the important topics in engineering mathematics that lead to the understanding of technical concepts among students. This study was conducted to explore the students' understanding of ODEs when they solve ODE questions using a traditional method as well as a computer algebraic system, particularly…

  15. Advocate: A Distributed Architecture for Speech-to-Speech Translation

    Science.gov (United States)

    2009-01-01

    tecture, are either wrapped natural-language processing (NLP) components or objects developed from scratch using the architecture's API. GATE is ... framework, we put together a demonstration Arabic-to-English speech translation system using both internally developed (Arabic speech recognition and MT ... conditions of our Arabic S2S demonstration system described earlier. Once again, the data size was varied and eighty identical requests were

  16. A Russian Keyword Spotting System Based on Large Vocabulary Continuous Speech Recognition and Linguistic Knowledge

    Directory of Open Access Journals (Sweden)

    Valentin Smirnov

    2016-01-01

    Full Text Available The paper describes the key concepts of a word-spotting system for Russian based on large vocabulary continuous speech recognition. Key algorithms and system settings are described, including the pronunciation variation algorithm, and experimental results on real-life telecom data are provided. A description of the system architecture and the user interface is given. The system is based on the CMU Sphinx open-source speech recognition platform and on the linguistic models and algorithms developed by Speech Drive LLC. The effective combination of baseline statistical methods, real-world training data, and the intensive use of linguistic knowledge led to a quality result applicable to industrial use.
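
    As the system is built on the CMU Sphinx platform, the keyword-spotting mode of its open-source pocketsphinx decoder gives a feel for the mechanics. The sketch below uses the stock English models shipped with the pocketsphinx pip package (the Russian models and Speech Drive algorithms from the paper are proprietary), and the API names follow the 0.1.x Python wrapper; they may differ in newer releases.

        import os
        from pocketsphinx import Decoder, get_model_path

        model_path = get_model_path()
        config = Decoder.default_config()
        config.set_string('-hmm', os.path.join(model_path, 'en-us'))          # acoustic model
        config.set_string('-dict', os.path.join(model_path, 'cmudict-en-us.dict'))
        config.set_string('-kws', 'keywords.list')  # one keyphrase per line, with a
                                                    # detection threshold, e.g.
                                                    # "account balance /1e-20/"
        decoder = Decoder(config)
        decoder.start_utt()
        with open('call.raw', 'rb') as f:           # 16 kHz, 16-bit mono PCM audio
            decoder.process_raw(f.read(), False, True)
        decoder.end_utt()
        if decoder.hyp() is not None:
            print('spotted:', decoder.hyp().hypstr)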

  17. Man-system interface based on automatic speech recognition: integration to a virtual control desk

    Energy Technology Data Exchange (ETDEWEB)

    Jorge, Carlos Alexandre F.; Mol, Antonio Carlos A.; Pereira, Claudio M.N.A.; Aghina, Mauricio Alves C., E-mail: calexandre@ien.gov.b, E-mail: mol@ien.gov.b, E-mail: cmnap@ien.gov.b, E-mail: mag@ien.gov.b [Instituto de Engenharia Nuclear (IEN/CNEN-RJ), Rio de Janeiro, RJ (Brazil); Nomiya, Diogo V., E-mail: diogonomiya@gmail.co [Universidade Federal do Rio de Janeiro (UFRJ), RJ (Brazil)

    2009-07-01

    This work reports the implementation of a man-system interface based on automatic speech recognition, and its integration into a virtual nuclear power plant control desk. The latter aims to reproduce a real control desk using virtual reality technology, for operator training and ergonomic evaluation purposes. An automatic speech recognition system was developed to serve as a new interface with users, substituting for the computer keyboard and mouse: users can operate the virtual control desk in front of a computer monitor or a projection screen through spoken commands. The automatic speech recognition interface is based on a well-known signal processing technique named cepstral analysis, and on artificial neural networks. The speech recognition interface is described, along with its integration with the virtual control desk, and results are presented. (author)
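
    The recipe named in the abstract, cepstral features feeding an artificial neural network, can be sketched in a few lines. Everything here (librosa and scikit-learn in place of the authors' implementation, the file names, the command set) is an illustrative assumption; a real system needs many training utterances per command.

        import numpy as np
        import librosa
        from sklearn.neural_network import MLPClassifier

        COMMANDS = ['open_valve', 'close_valve', 'raise_rods']   # hypothetical commands

        def utterance_features(wav_path):
            # Cepstral analysis step: 13 MFCCs, averaged over time to a fixed-size vector.
            y, sr = librosa.load(wav_path, sr=16000)
            return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)

        # One recording per command here purely for brevity.
        X = np.stack([utterance_features(f'cmd_{i}.wav') for i in range(len(COMMANDS))])
        labels = np.arange(len(COMMANDS))
        net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000).fit(X, labels)

        spoken = utterance_features('operator_input.wav').reshape(1, -1)
        print('recognized command:', COMMANDS[net.predict(spoken)[0]])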

  18. Man-system interface based on automatic speech recognition: integration to a virtual control desk

    International Nuclear Information System (INIS)

    Jorge, Carlos Alexandre F.; Mol, Antonio Carlos A.; Pereira, Claudio M.N.A.; Aghina, Mauricio Alves C.; Nomiya, Diogo V.

    2009-01-01

    This work reports the implementation of a man-system interface based on automatic speech recognition, and its integration into a virtual nuclear power plant control desk. The latter aims to reproduce a real control desk using virtual reality technology, for operator training and ergonomic evaluation purposes. An automatic speech recognition system was developed to serve as a new interface with users, substituting for the computer keyboard and mouse: users can operate the virtual control desk in front of a computer monitor or a projection screen through spoken commands. The automatic speech recognition interface is based on a well-known signal processing technique named cepstral analysis, and on artificial neural networks. The speech recognition interface is described, along with its integration with the virtual control desk, and results are presented. (author)

  19. Eyes and ears: Using eye tracking and pupillometry to understand challenges to speech recognition.

    Science.gov (United States)

    Van Engen, Kristin J; McLaughlin, Drew J

    2018-05-04

    Although human speech recognition is often experienced as relatively effortless, a number of common challenges can render the task more difficult. Such challenges may originate in talkers (e.g., unfamiliar accents, varying speech styles), the environment (e.g. noise), or in listeners themselves (e.g., hearing loss, aging, different native language backgrounds). Each of these challenges can reduce the intelligibility of spoken language, but even when intelligibility remains high, they can place greater processing demands on listeners. Noisy conditions, for example, can lead to poorer recall for speech, even when it has been correctly understood. Speech intelligibility measures, memory tasks, and subjective reports of listener difficulty all provide critical information about the effects of such challenges on speech recognition. Eye tracking and pupillometry complement these methods by providing objective physiological measures of online cognitive processing during listening. Eye tracking records the moment-to-moment direction of listeners' visual attention, which is closely time-locked to unfolding speech signals, and pupillometry measures the moment-to-moment size of listeners' pupils, which dilate in response to increased cognitive load. In this paper, we review the uses of these two methods for studying challenges to speech recognition. Copyright © 2018. Published by Elsevier B.V.

  20. Speech and Language Disturbances in Neurology Practice

    Directory of Open Access Journals (Sweden)

    Oğuz Tanrıdağ

    2009-12-01

    Full Text Available Despite the well-known facts discerned from interesting cases of speech and language disturbances over thousands of years, and despite the scientific background and limitless discussions of nearly 150 years, this field has been considered one of the least important subjects in the neurological sciences. In this review, we first analyze the possible causes of this "stepchild" attitude towards the subject, and we then summarize the practical aspects of speech and language disturbances. Our underlying expectation with this review is to explain the facts concerning those disturbances that might offer us opportunities to better understand the nervous system and the affected patients.

  1. Part-of-speech effects on text-to-speech synthesis

    CSIR Research Space (South Africa)

    Schlunz, GI

    2010-11-01

    Full Text Available One of the goals of text-to-speech (TTS) systems is to produce natural-sounding synthesised speech. Towards this end various natural language processing (NLP) tasks are performed to model the prosodic aspects of the TTS voice. One of the fundamental...

  2. The Interpersonal Metafunction Analysis of Barack Obama's Victory Speech

    Science.gov (United States)

    Ye, Ruijuan

    2010-01-01

    This paper presents a tentative interpersonal metafunction analysis of Barack Obama's victory speech, which aims to help readers understand and evaluate the speech regarding its suitability, and thus to provide some guidance for readers to make better speeches. This study has promising implications for speeches as…

  3. Understanding Freedom of Speech in America: The Origin & Evolution of the 1st Amendment.

    Science.gov (United States)

    Barnes, Judy

    In this booklet the content and implications of the First Amendment are analyzed. Historical origins of free speech from ancient Greece to England before the discovery of America, free speech in colonial America, and the Bill of Rights and its meaning for free speech are outlined. The evolution of the First Amendment is described, and the…

  4. Speech-to-Speech Relay Service

    Science.gov (United States)

    Consumer Guide: Speech-to-Speech Relay Service. Speech-to-Speech (STS) is one form of Telecommunications Relay Service (TRS). TRS is a service that allows persons with hearing and speech disabilities ...

  5. DEVELOPMENT OF AUTOMATED SPEECH RECOGNITION SYSTEM FOR EGYPTIAN ARABIC PHONE CONVERSATIONS

    Directory of Open Access Journals (Sweden)

    A. N. Romanenko

    2016-07-01

    Full Text Available The paper deals with the description of several speech recognition systems for Egyptian Colloquial Arabic. The research is based on the CALLHOME Egyptian corpus. Both systems are described: a classic one, based on Hidden Markov and Gaussian Mixture Models, and a state-of-the-art one with deep neural network acoustic models. We demonstrate the contribution of speaker-dependent bottleneck features, for whose extraction three neural-network-based extractors were trained on datasets in several languages: Russian, English and different Arabic dialects. We have also studied the possibility of applying a small Modern Standard Arabic (MSA) corpus to derive phonetic transcriptions. The experiments have shown that applying the extractor trained on the Russian dataset significantly increases the quality of Arabic speech recognition, whereas using phonetic transcriptions based on Modern Standard Arabic decreases recognition quality. Nevertheless, the system's results remain applicable in practice. In addition, we have studied the application of the obtained models to the keyword search problem. The systems obtained demonstrate good results compared to those published before. Some ways to improve speech recognition are offered.

  6. Co-Thought and Co-Speech Gestures Are Generated by the Same Action Generation Process

    Science.gov (United States)

    Chu, Mingyuan; Kita, Sotaro

    2016-01-01

    People spontaneously gesture when they speak (co-speech gestures) and when they solve problems silently (co-thought gestures). In this study, we first explored the relationship between these 2 types of gestures and found that individuals who produced co-thought gestures more frequently also produced co-speech gestures more frequently (Experiments…

  7. Clock Math — a System for Solving SLEs Exactly

    Directory of Open Access Journals (Sweden)

    Jakub Hladík

    2013-01-01

    Full Text Available In this paper, we present a GPU-accelerated hybrid system that solves ill-conditioned systems of linear equations exactly; exactly means without the rounding errors of floating-point arithmetic, since only integer arithmetic is used. First, we scale the floating-point numbers up to integers, then we solve dozens of SLEs within different modular arithmetics, and finally we assemble the sub-solutions back using the Chinese remainder theorem. This approach effectively bypasses current CPU floating-point limitations. The system is capable of solving Hilbert's matrix without losing a single bit of precision, and with a significant speedup compared to existing CPU solvers.
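
    The pipeline described, solving the same system in several prime moduli and reassembling with the Chinese remainder theorem, can be sketched in pure Python. For brevity the sketch assumes the solution is integral; the paper's system additionally scales floats to integers first, recovers rationals, and runs the per-modulus solves on the GPU.

        from functools import reduce

        def solve_mod_p(A, b, p):
            # Gauss-Jordan elimination over GF(p); p must be prime.
            n = len(A)
            M = [[A[i][j] % p for j in range(n)] + [b[i] % p] for i in range(n)]
            for col in range(n):
                piv = next(r for r in range(col, n) if M[r][col])   # nonzero pivot
                M[col], M[piv] = M[piv], M[col]
                inv = pow(M[col][col], p - 2, p)                    # Fermat inverse
                M[col] = [v * inv % p for v in M[col]]
                for r in range(n):
                    if r != col and M[r][col]:
                        f = M[r][col]
                        M[r] = [(v - f * w) % p for v, w in zip(M[r], M[col])]
            return [row[n] for row in M]

        def crt(residues, primes):
            # Chinese remainder theorem, mapped to the symmetric range around zero.
            N = reduce(lambda a, b: a * b, primes)
            x = sum(r * (N // p) * pow(N // p, -1, p)               # Python >= 3.8
                    for r, p in zip(residues, primes)) % N
            return x - N if x > N // 2 else x

        primes = [10007, 10009, 10037]
        A, b = [[2, 1], [1, 3]], [4, 7]                             # exact solution (1, 2)
        per_prime = [solve_mod_p(A, b, p) for p in primes]
        print([crt([s[i] for s in per_prime], primes) for i in range(2)])   # [1, 2]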

  8. Evaluation of Speech Recognition of Cochlear Implant Recipients Using Adaptive, Digital Remote Microphone Technology and a Speech Enhancement Sound Processing Algorithm.

    Science.gov (United States)

    Wolfe, Jace; Morais, Mila; Schafer, Erin; Agrawal, Smita; Koch, Dawn

    2015-05-01

    Cochlear implant recipients often experience difficulty with understanding speech in the presence of noise. Cochlear implant manufacturers have developed sound processing algorithms designed to improve speech recognition in noise, and research has shown these technologies to be effective. Remote microphone technology utilizing adaptive, digital wireless radio transmission has also been shown to provide significant improvement in speech recognition in noise. There are no studies examining the potential improvement in speech recognition in noise when these two technologies are used simultaneously. The goal of this study was to evaluate the potential benefits and limitations associated with the simultaneous use of a sound processing algorithm designed to improve performance in noise (Advanced Bionics ClearVoice) and a remote microphone system that incorporates adaptive, digital wireless radio transmission (Phonak Roger). A two-by-two repeated-measures design was used to examine performance differences obtained without these technologies compared to the use of each technology separately as well as the simultaneous use of both technologies. Eleven Advanced Bionics (AB) cochlear implant recipients, ages 11 to 68 yr, participated. AzBio sentence recognition was measured in quiet and in the presence of classroom noise ranging in level from 50 to 80 dBA in 5-dB steps. Performance was evaluated in four conditions: (1) no ClearVoice and no Roger, (2) ClearVoice enabled without the use of Roger, (3) ClearVoice disabled with Roger enabled, and (4) simultaneous use of ClearVoice and Roger. Speech recognition in quiet was better than speech recognition in noise for all conditions. Use of ClearVoice and Roger each provided significant improvement in speech recognition in noise. The best performance in noise was obtained with the simultaneous use of ClearVoice and Roger. ClearVoice and Roger technology each improve speech recognition in noise, particularly when used at the same time.

  9. STATE-OF-THE-ART TASKS AND ACHIEVEMENTS OF PARALINGUISTIC SPEECH ANALYSIS SYSTEMS

    Directory of Open Access Journals (Sweden)

    A. A. Karpov

    2016-07-01

    Full Text Available We present an analytical survey of state-of-the-art tasks in the area of computational paralinguistics, as well as recent achievements of automatic systems for the paralinguistic analysis of conversational speech. Paralinguistics studies non-verbal aspects of human communication and speech such as natural emotions, accents, psycho-physiological states, pronunciation features, speaker's voice parameters, etc. We describe the architecture of a baseline computer system for acoustic paralinguistic analysis, its main components and useful speech processing methods. We present some information on an international contest called the Computational Paralinguistics Challenge (ComParE), which has been held each year since 2009 in the framework of the International conference INTERSPEECH organized by the International Speech Communication Association. We present the sub-challenges (tasks) that were proposed at the ComParE Challenges in 2009-2016, and analyze the winning computer systems for each sub-challenge and the obtained results. The last completed ComParE-2015 Challenge was organized in September 2015 in Germany and proposed 3 sub-challenges: (1) the Degree of Nativeness (DN) sub-challenge, determination of the nativeness degree of speakers based on acoustics; (2) the Parkinson's Condition (PC) sub-challenge, recognition of a degree of Parkinson's condition based on speech analysis; (3) the Eating Condition (EC) sub-challenge, determination of the eating condition during speaking or a dialogue, and classification of the consumed food type (one of seven classes of food) by the speaker. In the last sub-challenge (EC), the winner was a joint Turkish-Russian team consisting of the authors of the given paper. We have developed the most efficient computer-based system for detection and classification of the corresponding (EC) acoustic paralinguistic events. The paper deals with the architecture of this system, its main modules and methods, as well as the description of the used training and evaluation

  10. The semantic system is involved in mathematical problem solving.

    Science.gov (United States)

    Zhou, Xinlin; Li, Mengyi; Li, Leinian; Zhang, Yiyun; Cui, Jiaxin; Liu, Jie; Chen, Chuansheng

    2018-02-01

    Numerous studies have shown that the brain regions around bilateral intraparietal cortex are critical for number processing and arithmetical computation. However, the neural circuits for more advanced mathematics such as mathematical problem solving (with little routine arithmetical computation) remain unclear. Using functional magnetic resonance imaging (fMRI), this study (N = 24 undergraduate students) compared neural bases of mathematical problem solving (i.e., number series completion, mathematical word problem solving, and geometric problem solving) and arithmetical computation. Direct subject- and item-wise comparisons revealed that mathematical problem solving typically had greater activation than arithmetical computation in all 7 regions of the semantic system (which was based on a meta-analysis of 120 functional neuroimaging studies on semantic processing). Arithmetical computation typically had greater activation in the supplementary motor area and left precentral gyrus. The results suggest that the semantic system in the brain supports mathematical problem solving. Copyright © 2017 Elsevier Inc. All rights reserved.

  11. Geo-Sandbox: An Interactive Geoscience Training Tool with Analytics to Better Understand Student Problem Solving Approaches

    Science.gov (United States)

    Butt, N.; Pidlisecky, A.; Ganshorn, H.; Cockett, R.

    2015-12-01

    The software company 3 Point Science has developed three interactive learning programs designed to teach, test and practice visualization skills and geoscience concepts. A study was conducted with 21 geoscience students at the University of Calgary who participated in 2 hour sessions of software interaction and written pre and post-tests. Computer and SMART touch table interfaces were used to analyze user interaction, problem solving methods and visualization skills. By understanding and pinpointing user problem solving methods it is possible to reconstruct viewpoints and thought processes. This could allow us to give personalized feedback in real time, informing the user of problem solving tips and possible misconceptions.

  12. THE PHYSICAL LABORATORY ACTIVITIES WITH PROBLEM SOLVING APPROACH TO INCREASE CRITICAL THINKING SKILL AND UNDERSTANDING STUDENT CONCEPT

    Directory of Open Access Journals (Sweden)

    Eli Trisnowati

    2017-10-01

    Full Text Available This study aims to describe the improvement in students' critical thinking skills and concept understanding achieved by implementing a problem-solving approach in laboratory activities. The study was conducted over four meetings. The tryout subjects were 31 grade X students of MAN Yogyakarta III. The research used a quasi-experimental method with a pretest-posttest design. The data were collected using multiple-choice tests with an assessment rubric and observation sheets, and were analyzed using multivariate analysis. Based on the results, the standard gain values of conceptual understanding and critical thinking skills for grade X students who learned with a student worksheet based on the problem-solving approach (the treatment class) are higher than those of students who learned without such a worksheet (the control class).

  13. Charisma in business speeches

    DEFF Research Database (Denmark)

    Niebuhr, Oliver; Brem, Alexander; Novák-Tót, Eszter

    2016-01-01

    to business speeches. Consistent with the public opinion, our findings are indicative of Steve Jobs being a more charismatic speaker than Mark Zuckerberg. Beyond previous studies, our data suggest that rhythm and emphatic accentuation are also involved in conveying charisma. Furthermore, the differences...... between Steve Jobs and Mark Zuckerberg and the investor- and customer-related sections of their speeches support the modern understanding of charisma as a gradual, multiparametric, and context-sensitive concept....

  14. Understanding and quantifying cognitive complexity level in mathematical problem solving items

    Directory of Open Access Journals (Sweden)

    SUSAN E. EMBRETSON

    2008-09-01

    Full Text Available The linear logistic test model (LLTM; Fischer, 1973) has been applied to a wide variety of new tests. When the LLTM application involves item complexity variables that are both theoretically interesting and empirically supported, several advantages can result. These advantages include elaborating construct validity at the item level, defining variables for test design, predicting parameters of new items, item banking by sources of complexity and providing a basis for item design and item generation. However, despite the many advantages of applying LLTM to test items, it has been applied less often to understand the sources of complexity for large-scale operational test items. Instead, previously calibrated item parameters are modeled using regression techniques because raw item response data often cannot be made available. In the current study, both LLTM and regression modeling are applied to mathematical problem solving items from a widely used test. The findings from the two methods are compared and contrasted for their implications for continued development of ability and achievement tests based on mathematical problem solving items.

  15. THE ONTOGENESIS OF SPEECH DEVELOPMENT

    Directory of Open Access Journals (Sweden)

    T. E. Braudo

    2017-01-01

    Full Text Available The purpose of this article is to acquaint specialists working with children having developmental disorders with the age-related norms for speech development. Many well-known linguists and psychologists have studied speech ontogenesis (logogenesis). Speech is a higher mental function which integrates many functional systems. Speech development in infants during the first months after birth is ensured by innate hearing and the emerging ability to fix the gaze on the face of an adult. Innate emotional reactions also develop during this period, turning into nonverbal forms of communication. At about 6 months a baby starts to pronounce some syllables; at 7–9 months it repeats various sound combinations pronounced by adults. At 10–11 months a baby begins to react to words referred to him/her. The first words usually appear at the age of 1 year; this is the start of the stage of active speech development. At this time it is acceptable if a child confuses or rearranges sounds, distorts or misses them. By the age of 1.5 years a child begins to understand abstract explanations of adults. Significant vocabulary enlargement occurs between 2 and 3 years; the grammatical structures of the language are formed during this period (a child starts to use phrases and sentences). Preschool age (3–7 y. o.) is characterized by incorrect, but steadily improving, pronunciation of sounds and phonemic perception. The vocabulary increases; abstract speech and retelling are being formed. Children over 7 y. o. continue to improve grammar, writing and reading skills. The described stages may not have strict age boundaries, since they depend not only on the environment but also on the child's mental constitution, heredity and character.

  16. ISOLATED SPEECH RECOGNITION SYSTEM FOR TAMIL LANGUAGE USING STATISTICAL PATTERN MATCHING AND MACHINE LEARNING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    VIMALA C.

    2015-05-01

    Full Text Available In recent years, speech technology has become a vital part of our daily lives. Various techniques have been proposed for developing Automatic Speech Recognition (ASR) systems and have achieved great success in many applications. Among them, template matching techniques like Dynamic Time Warping (DTW), statistical pattern matching techniques such as the Hidden Markov Model (HMM) and Gaussian Mixture Models (GMM), and machine learning techniques such as Neural Networks (NN), Support Vector Machines (SVM), and Decision Trees (DT) are the most popular. The main objective of this paper is to design and develop a speaker-independent isolated speech recognition system for the Tamil language using the above speech recognition techniques. The background of ASR systems, the steps involved in ASR, the merits and demerits of the conventional and machine learning algorithms, and the observations made on the basis of the experiments are presented in this paper. For the developed system, the highest word recognition accuracy is achieved with the HMM technique: it offered 100% accuracy during the training process and 97.92% during testing.
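
    Of the techniques listed, DTW is the easiest to show end to end: an unknown utterance is assigned to the word whose stored template aligns with it at the lowest warped cost. The sketch assumes frame-level feature sequences (e.g., MFCC frames) are already extracted; it is an illustration, not the authors' implementation.

        import numpy as np

        def dtw_distance(X, Y):
            # X: (n, d) and Y: (m, d) feature sequences; returns alignment cost.
            n, m = len(X), len(Y)
            D = np.full((n + 1, m + 1), np.inf)
            D[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    cost = np.linalg.norm(X[i - 1] - Y[j - 1])   # frame distance
                    D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
            return D[n, m] / (n + m)                             # length-normalized

        def recognize(test, templates):
            # templates: dict mapping word -> list of reference feature sequences.
            return min(templates,
                       key=lambda w: min(dtw_distance(test, ref) for ref in templates[w]))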

  17. Galerkin projection methods for solving multiple related linear systems

    Energy Technology Data Exchange (ETDEWEB)

    Chan, T.F.; Ng, M.; Wan, W.L.

    1996-12-31

    We consider using Galerkin projection methods for solving multiple related linear systems A^(i) x^(i) = b^(i) for 1 ≤ i ≤ s, where the A^(i) and b^(i) are different in general. We start with the special case where A^(i) = A and A is symmetric positive definite. The method generates a Krylov subspace from a set of direction vectors obtained by solving one of the systems, called the seed system, by the CG method, and then projects the residuals of the other systems orthogonally onto the generated Krylov subspace to get the approximate solutions. The whole process is repeated with another unsolved system as a seed until all the systems are solved. We observe in practice a super-convergence behaviour of the CG process of the seed system when compared with the usual CG process. We also observe that only a small number of restarts is required to solve all the systems if the right-hand sides are close to each other. These two features together make the method particularly effective. In this talk, we give theoretical proof to justify these observations. Furthermore, we combine the advantages of this method and the block CG method and propose a block extension of this single seed method. The above procedure can actually be modified for solving multiple linear systems A^(i) x^(i) = b^(i), where the A^(i) are now different. We can also extend the previous analytical results to this more general case. Applications of this method to multiple related linear systems arising from image restoration and recursive least squares computations are considered as examples.
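
    A minimal dense sketch of the single-seed idea for the special case A^(i) = A with A symmetric positive definite: CG on the seed right-hand side yields an orthonormal basis of the Krylov subspace (the normalized residuals), and each remaining right-hand side is then handled by the Galerkin condition on that subspace (residual orthogonal to the subspace). The test matrix and sizes are illustrative.

        import numpy as np

        def cg_with_basis(A, b, k):
            # Plain CG on the seed system; the normalized residuals form an
            # orthonormal Krylov basis in exact arithmetic.
            n = len(b)
            V = np.zeros((n, k))
            x, r = np.zeros(n), b.copy()
            p = r.copy()
            for j in range(k):
                V[:, j] = r / np.linalg.norm(r)
                Ap = A @ p
                alpha = (r @ r) / (p @ Ap)
                x += alpha * p
                r_new = r - alpha * Ap
                p = r_new + ((r_new @ r_new) / (r @ r)) * p
                r = r_new
            return x, V

        def galerkin_project(A, V, b):
            # Approximate solution of A x = b in span(V): solve (V^T A V) y = V^T b.
            return V @ np.linalg.solve(V.T @ A @ V, V.T @ b)

        rng = np.random.default_rng(0)
        Q = rng.standard_normal((50, 50))
        A = Q @ Q.T + 5 * np.eye(50)                    # SPD test matrix
        b_seed, b_other = rng.standard_normal(50), rng.standard_normal(50)
        x_seed, V = cg_with_basis(A, b_seed, k=15)
        x0 = galerkin_project(A, V, b_other)            # cheap approximate solution
        print(np.linalg.norm(A @ x0 - b_other) / np.linalg.norm(b_other))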

  18. A convex optimization approach for solving large scale linear systems

    Directory of Open Access Journals (Sweden)

    Debora Cores

    2017-01-01

    Full Text Available The well-known Conjugate Gradient (CG) method minimizes a strictly convex quadratic function to solve a large-scale linear system of equations when the coefficient matrix is symmetric and positive definite. In this work we present and analyze a non-quadratic convex function for solving any large-scale linear system of equations, regardless of the characteristics of the coefficient matrix. For finding the global minimizers of this new convex function, any low-cost iterative optimization technique could be applied. In particular, we propose to use the low-cost, globally convergent Spectral Projected Gradient (SPG) method, which allows us to extend this optimization approach to solving consistent square and rectangular linear systems, as well as linear feasibility problems, with and without convex constraints and with and without preconditioning strategies. Our numerical results indicate that the new scheme outperforms state-of-the-art iterative techniques for solving linear systems when the symmetric part of the coefficient matrix is indefinite, and also for solving linear feasibility problems.
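
    The abstract does not reproduce the paper's particular convex function, so the sketch below substitutes a smooth pseudo-Huber merit f(x) = Σ(√(1 + r_i²) − 1) with r = Ax − b, whose global minimizers solve any consistent system, and minimizes it with Barzilai-Borwein (spectral) gradient steps, the unconstrained core of the SPG method. A faithful implementation would add SPG's nonmonotone line search and the projection onto constraints.

        import numpy as np

        def spectral_solve(A, b, iters=300):
            # Minimize the illustrative convex merit; its gradient is bounded,
            # which keeps the raw BB iteration well behaved on this example.
            x = np.zeros(A.shape[1])
            step, x_old, g_old = 1e-2, None, None
            for _ in range(iters):
                r = A @ x - b
                g = A.T @ (r / np.sqrt(1.0 + r * r))    # gradient of the merit
                if g_old is not None:
                    s, y = x - x_old, g - g_old
                    if s @ y > 0:
                        step = (s @ s) / (s @ y)        # Barzilai-Borwein step length
                x_old, g_old = x, g
                x = x - step * g
            return x

        rng = np.random.default_rng(1)
        A = rng.standard_normal((30, 30)) + 6 * np.eye(30)   # nonsymmetric, nonsingular
        x_true = rng.standard_normal(30)
        print(np.linalg.norm(spectral_solve(A, A @ x_true) - x_true))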

  19. Differences in early speech patterns between Parkinson variant of multiple system atrophy and Parkinson's disease.

    Science.gov (United States)

    Huh, Young Eun; Park, Jongkyu; Suh, Mee Kyung; Lee, Sang Eun; Kim, Jumin; Jeong, Yuri; Kim, Hee-Tae; Cho, Jin Whan

    2015-08-01

    In Parkinson variant of multiple system atrophy (MSA-P), patterns of early speech impairment and their distinguishing features from Parkinson's disease (PD) require further exploration. Here, we compared speech data among patients with early-stage MSA-P, PD, and healthy subjects using quantitative acoustic and perceptual analyses. Variables were analyzed for men and women in view of gender-specific features of speech. Acoustic analysis revealed that male patients with MSA-P exhibited more profound speech abnormalities than those with PD, regarding increased voice pitch, prolonged pause time, and reduced speech rate. This might be due to widespread pathology of MSA-P in nigrostriatal or extra-striatal structures related to speech production. Although several perceptual measures were mildly impaired in MSA-P and PD patients, none of these parameters showed a significant difference between patient groups. Detailed speech analysis using acoustic measures may help distinguish between MSA-P and PD early in the disease process. Copyright © 2015 Elsevier Inc. All rights reserved.

  20. Problem solving using soft systems methodology.

    Science.gov (United States)

    Land, L

    This article outlines a method of problem solving which considers holistic solutions to complex problems. Soft systems methodology allows people involved in the problem situation to have control over the decision-making process.

  1. Speech control interface for Eurocontrol’s LINK2000+ system

    Directory of Open Access Journals (Sweden)

    Dan-Cristian ION

    2012-06-01

    Full Text Available This paper continues recent research of the authors, considering the use of speech recognition in air traffic control. It proposes the use of a voice control interface for Eurocontrol’s LINK2000+ system, offering an alternative means to improve air transport safety and efficiency.

  2. Speech overlap detection in a two-pass speaker diarization system

    NARCIS (Netherlands)

    Huijbregts, M.A.H.; Leeuwen, D.A. van; Jong, F. M. G de

    2009-01-01

    In this paper we present the two-pass speaker diarization system that we developed for the NIST RT09s evaluation. In the first pass of our system a model for speech overlap detection is generated automatically. This model is used in two ways to reduce the diarization errors due to overlapping speech.

  3. Speech overlap detection in a two-pass speaker diarization system

    NARCIS (Netherlands)

    Huijbregts, M.; Leeuwen, D.A. van; Jong, F.M.G. de

    2009-01-01

    In this paper we present the two-pass speaker diarization system that we developed for the NIST RT09s evaluation. In the first pass of our system a model for speech overlap detection is generated automatically. This model is used in two ways to reduce the diarization errors due to overlapping speech.

  4. [Improving speech comprehension using a new cochlear implant speech processor].

    Science.gov (United States)

    Müller-Deile, J; Kortmann, T; Hoppe, U; Hessel, H; Morsnowski, A

    2009-06-01

    The aim of this multicenter clinical field study was to assess the benefits of the new Freedom 24 sound processor for cochlear implant (CI) users implanted with the Nucleus 24 cochlear implant system. The study included 48 postlingually profoundly deaf experienced CI users who demonstrated speech comprehension performance with their current speech processor on the Oldenburg sentence test (OLSA) in quiet conditions of at least 80% correct scores and who were able to perform adaptive speech threshold testing using the OLSA in noisy conditions. Following baseline measures of speech comprehension performance with their current speech processor, subjects were upgraded to the Freedom 24 speech processor. After a take-home trial period of at least 2 weeks, subject performance was evaluated by measuring the speech reception threshold with the Freiburg multisyllabic word test and speech intelligibility with the Freiburg monosyllabic word test at 50 dB and 70 dB in the sound field. The results demonstrated highly significant benefits for speech comprehension with the new speech processor. Significant benefits for speech comprehension were also demonstrated with the new speech processor when tested in competing background noise.In contrast, use of the Abbreviated Profile of Hearing Aid Benefit (APHAB) did not prove to be a suitably sensitive assessment tool for comparative subjective self-assessment of hearing benefits with each processor. Use of the preprocessing algorithm known as adaptive dynamic range optimization (ADRO) in the Freedom 24 led to additional improvements over the standard upgrade map for speech comprehension in quiet and showed equivalent performance in noise. Through use of the preprocessing beam-forming algorithm BEAM, subjects demonstrated a highly significant improved signal-to-noise ratio for speech comprehension thresholds (i.e., signal-to-noise ratio for 50% speech comprehension scores) when tested with an adaptive procedure using the Oldenburg
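
    The adaptive threshold testing mentioned above (OLSA sentences in noise) typically raises the SNR after a miss and lowers it after a hit, so that the track converges on the 50%-correct point. The sketch below shows the bare 1-down/1-up logic with a simulated listener; the step sizes, the scoring of the OLSA word matrix, and the number of reversals averaged are all simplified assumptions.

        import math
        import random

        def adaptive_srt(trial, start_snr=0.0, step_db=2.0, n_trials=30):
            # trial(snr) -> True if the sentence was repeated correctly at that SNR.
            snr, last, reversals = start_snr, None, []
            for _ in range(n_trials):
                correct = trial(snr)
                if last is not None and correct != last:
                    reversals.append(snr)                 # track direction changes
                snr += -step_db if correct else step_db   # harder after a hit
                last = correct
            return sum(reversals[-6:]) / len(reversals[-6:])   # late reversals ~ SRT

        def simulated_listener(snr, srt=-7.0, slope=1.0):
            # Logistic psychometric function: 50% correct exactly at the true SRT.
            return random.random() < 1.0 / (1.0 + math.exp(-slope * (snr - srt)))

        print('estimated SRT (dB SNR):', adaptive_srt(simulated_listener))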

  5. Development of a System for Automatic Recognition of Speech

    Directory of Open Access Journals (Sweden)

    Roman Jarina

    2003-01-01

    Full Text Available The article gives a review of research on the processing and automatic recognition of speech signals (ARR) at the Department of Telecommunications of the Faculty of Electrical Engineering, University of Žilina. On-going research is oriented to speech parametrization using 2-dimensional cepstral analysis, and to the application of HMMs and neural networks for speech recognition in the Slovak language. The article summarizes the achieved results and outlines the future orientation of our research in automatic speech recognition.

  6. Effects of single-channel phonemic compression schemes on the understanding of speech by hearing-impaired listeners

    NARCIS (Netherlands)

    Goedegebure, A.; Hulshof, M.; Maas, R. J.; Dreschler, W. A.; Verschuure, H.

    2001-01-01

    The effect of digital processing on speech intelligibility was studied in hearing-impaired listeners with moderate to severe high-frequency losses. The amount of smoothed phonemic compression in a high-frequency channel was varied using wide-band control. Two alternative systems were tested to

  7. Multistage Spectral Relaxation Method for Solving the Hyperchaotic Complex Systems

    Directory of Open Access Journals (Sweden)

    Hassan Saberi Nik

    2014-01-01

    Full Text Available We present an application of a pseudospectral method for solving hyperchaotic complex systems. The proposed method, called the multistage spectral relaxation method (MSRM), is based on a technique of extending Gauss-Seidel-type relaxation ideas to systems of nonlinear differential equations and using Chebyshev pseudospectral methods to solve the resulting systems on a sequence of multiple intervals. In this new application, the MSRM is used to solve famous hyperchaotic complex systems such as the hyperchaotic complex Lorenz system and the complex permanent magnet synchronous motor. We compare this approach to the Runge-Kutta-based ode45 solver to show that the MSRM gives accurate results.

  8. Schizophrenia, Narrative, and Neurocognition: The Utility of Life-Stories in Understanding Social Problem-Solving Skills.

    Science.gov (United States)

    Moe, Aubrey M; Breitborde, Nicholas J K; Bourassa, Kyle J; Gallagher, Colin J; Shakeel, Mohammed K; Docherty, Nancy M

    2018-01-22

    Schizophrenia researchers have focused on phenomenological aspects of the disorder to better understand its underlying nature. In particular, development of personal narratives-that is, the complexity with which people form, organize, and articulate their "life stories"-has recently been investigated in individuals with schizophrenia. However, less is known about how aspects of narrative relate to indicators of neurocognitive and social functioning. The objective of the present study was to investigate the association of linguistic complexity of life-story narratives to measures of cognitive and social problem-solving abilities among people with schizophrenia. Thirty-two individuals with a diagnosis of schizophrenia completed a research battery consisting of clinical interviews, a life-story narrative, neurocognitive testing, and a measure assessing multiple aspects of social problem solving. Narrative interviews were assessed for linguistic complexity using computerized technology. The results indicate differential relationships of linguistic complexity and neurocognition to domains of social problem-solving skills. More specifically, although neurocognition predicted how well one could both describe and enact a solution to a social problem, linguistic complexity alone was associated with accurately recognizing that a social problem had occurred. In addition, linguistic complexity appears to be a cognitive factor that is discernible from other broader measures of neurocognition. Linguistic complexity may be more relevant in understanding earlier steps of the social problem-solving process than more traditional, broad measures of cognition, and thus is relevant in conceptualizing treatment targets. These findings also support the relevance of developing narrative-focused psychotherapies. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  9. Cortical activity patterns predict robust speech discrimination ability in noise

    Science.gov (United States)

    Shetake, Jai A.; Wolf, Jordan T.; Cheung, Ryan J.; Engineer, Crystal T.; Ram, Satyananda K.; Kilgard, Michael P.

    2012-01-01

    The neural mechanisms that support speech discrimination in noisy conditions are poorly understood. In quiet conditions, spike timing information appears to be used in the discrimination of speech sounds. In this study, we evaluated the hypothesis that spike timing is also used to distinguish between speech sounds in noisy conditions that significantly degrade neural responses to speech sounds. We tested speech sound discrimination in rats and recorded primary auditory cortex (A1) responses to speech sounds in background noise of different intensities and spectral compositions. Our behavioral results indicate that rats, like humans, are able to accurately discriminate consonant sounds even in the presence of background noise that is as loud as the speech signal. Our neural recordings confirm that speech sounds evoke degraded but detectable responses in noise. Finally, we developed a novel neural classifier that mimics behavioral discrimination. The classifier discriminates between speech sounds by comparing the A1 spatiotemporal activity patterns evoked on single trials with the average spatiotemporal patterns evoked by known sounds. Unlike classifiers in most previous studies, this classifier is not provided with the stimulus onset time. Neural activity analyzed with the use of relative spike timing was well correlated with behavioral speech discrimination in quiet and in noise. Spike timing information integrated over longer intervals was required to accurately predict rat behavioral speech discrimination in noisy conditions. The similarity of neural and behavioral discrimination of speech in noise suggests that humans and rats may employ similar brain mechanisms to solve this problem. PMID:22098331
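
    The classifier described, matching a single trial's spatiotemporal A1 pattern against the average pattern each known sound evokes, reduces to nearest-template search. The sketch below uses Euclidean distance on neurons-by-time-bins matrices; note that it does not reproduce the paper's key twist of operating without the stimulus onset time.

        import numpy as np

        def build_templates(trials_by_sound):
            # trials_by_sound: dict sound -> (n_trials, n_neurons, n_bins) spike counts.
            return {sound: trials.mean(axis=0)
                    for sound, trials in trials_by_sound.items()}

        def classify_trial(trial, templates):
            # Assign the single-trial pattern to the closest trial-averaged pattern.
            return min(templates, key=lambda s: np.linalg.norm(trial - templates[s]))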

  10. Development of a speech-based dialogue system for report dictation and machine control in the endoscopic laboratory.

    Science.gov (United States)

    Molnar, B; Gergely, J; Toth, G; Pronai, L; Zagoni, T; Papik, K; Tulassay, Z

    2000-01-01

    Reporting and machine control based on speech technology can enhance work efficiency in the gastrointestinal endoscopy laboratory. The status and activation of endoscopy laboratory equipment were described as a multivariate parameter and function system. Speech recognition, text evaluation and action definition engines were installed. Special programs were developed for the grammatical analysis of command sentences, and a rule-based expert system for the definition of machine answers. A speech backup engine provides feedback to the user. Techniques were applied based on the "Hidden Markov" model of discrete word, user-independent speech recognition and on phoneme-based speech synthesis. Speech samples were collected from three male low-tone investigators. The dictation module and machine control modules were incorporated in a personal computer (PC) simulation program. Altogether 100 unidentified patient records were analyzed. The sentences were grouped according to keywords, which indicate the main topics of a gastrointestinal endoscopy report. They were: "endoscope", "esophagus", "cardia", "fundus", "corpus", "antrum", "pylorus", "bulbus", and "postbulbar section", in addition to the major pathological findings: "erosion", "ulceration", and "malignancy". "Biopsy" and "diagnosis" were also included. We implemented wireless speech communication control commands for equipment including an endoscopy unit, video, monitor, printer, and PC. The recognition rate was 95%. Speech technology may soon become an integrated part of our daily routine in the endoscopy laboratory. A central speech and laboratory computer could be the most efficient alternative to having separate speech recognition units in all items of equipment.

  11. Understanding adults’ strong problem-solving skills based on PIAAC

    OpenAIRE

    Hämäläinen, Raija; De Wever, Bram; Nissinen, Kari; Cincinnato, Sebastiano

    2017-01-01

    Purpose Research has shown that the problem-solving skills of adults with a vocational education and training (VET) background in technology-rich environments (TREs) are often inadequate. However, some adults with a VET background do have sound problem-solving skills. The present study aims to provide insight into the socio-demographic, work-related and everyday life factors that are associated with a strong problem-solving performance. Design/methodology/approach The study builds...

  12. Family-Centered Services for Children with ASD and Limited Speech: The Experiences of Parents and Speech-Language Pathologists

    Science.gov (United States)

    Mandak, Kelsey; Light, Janice

    2018-01-01

    Although family-centered services have long been discussed as essential in providing successful services to families of children with autism spectrum disorder (ASD), ideal implementation is often lacking. This study aimed to increase understanding of how families with children with ASD and limited speech receive services from speech-language…

  13. Speech profile of patients undergoing primary palatoplasty.

    Science.gov (United States)

    Menegueti, Katia Ignacio; Mangilli, Laura Davison; Alonso, Nivaldo; Andrade, Claudia Regina Furquim de

    2017-10-26

    To characterize the profile and speech characteristics of patients undergoing primary palatoplasty in a Brazilian university hospital, considering the time of intervention (early, before two years of age; late, after two years of age). Participants were 97 patients of both genders with cleft palate and/or cleft lip and palate, assigned to the Speech-Language Pathology Department, who had been submitted to primary palatoplasty and presented no prior history of speech-language therapy. Patients were divided into two groups: early intervention group (EIG) - 43 patients undergoing primary palatoplasty before 2 years of age, and late intervention group (LIG) - 54 patients undergoing primary palatoplasty after 2 years of age. All patients underwent speech-language pathology assessment. The following parameters were assessed: resonance classification, presence of nasal turbulence, presence of weak intraoral air pressure, presence of audible nasal air emission, speech understandability, and compensatory articulation disorder (CAD). At a statistical significance level of 5% (p≤0.05), no significant difference was observed between the groups in the following parameters: resonance classification (p=0.067); level of hypernasality (p=0.113); presence of nasal turbulence (p=0.179); presence of weak intraoral air pressure (p=0.152); presence of nasal air emission (p=0.369); and speech understandability (p=0.113). The groups differed with respect to the presence of compensatory articulation disorders (p=0.020), with the LIG presenting a higher occurrence of altered phonemes. It was possible to assess the general profile and speech characteristics of the study participants. Patients submitted to early primary palatoplasty present a better speech profile.

  14. Comparative efficacy of the picture exchange communication system (PECS) versus a speech-generating device: effects on social-communicative skills and speech development.

    Science.gov (United States)

    Boesch, Miriam C; Wendt, Oliver; Subramanian, Anu; Hsu, Ning

    2013-09-01

    The Picture Exchange Communication System (PECS) and a speech-generating device (SGD) were compared in a study with a multiple baseline, alternating treatment design. The effectiveness of these methods in increasing social-communicative behavior and natural speech production were assessed with three elementary school-aged children with severe autism who demonstrated extremely limited functional communication skills. Results for social-communicative behavior were mixed for all participants in both treatment conditions. Relatively little difference was observed between PECS and SGD conditions. Although findings were inconclusive, data patterns suggest that Phase II of the PECS training protocol is conducive to encouraging social-communicative behavior. Data for speech outcomes did not reveal any increases across participants, and no differences between treatment conditions were observed.

  15. The Architecture of Children's Use of Language and Tools When Problem Solving Collaboratively with Robotics

    Science.gov (United States)

    Mills, Kathy A.; Chandra, Vinesh; Park, Ji Yong

    2013-01-01

    This paper demonstrates, following Vygotsky, that language and tool use has a critical role in the collaborative problem-solving behaviour of school-age children. It reports original ethnographic classroom research examining the convergence of speech and practical activity in children's collaborative problem solving with robotics programming…

  16. VoIP Speech Encryption System Using Stream Cipher with Chaotic ...

    African Journals Online (AJOL)

    2018-03-22

    The technologies of the Internet do not provide any security mechanism by themselves, and there is ... VoIP systems connect both digital (e.g., PC, PDA) and analog (e.g., telephone) devices ... the protection of speech through traditional encryption schemes ...
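
    The record's snippet is too truncated to recover the paper's exact scheme, but the general technique its title names, a stream cipher keyed by a chaotic map, can be sketched as follows. The logistic map, its parameters, and the byte extraction below are illustrative assumptions, not the cited authors' design:

        import numpy as np

        def logistic_keystream(seed: float, r: float, n: int) -> np.ndarray:
            """Generate n keystream bytes from a logistic map x -> r*x*(1-x).

            The map, parameters, and quantization are illustrative assumptions,
            not the scheme from the cited paper.
            """
            x = seed
            out = np.empty(n, dtype=np.uint8)
            for i in range(n):
                x = r * x * (1.0 - x)
                out[i] = int(x * 256) & 0xFF  # quantize the chaotic state to a byte
            return out

        # XOR-encrypt a buffer of 8-bit speech samples; decryption is identical.
        samples = np.random.randint(0, 256, 1024).astype(np.uint8)
        ks = logistic_keystream(seed=0.61803, r=3.9999, n=samples.size)
        cipher = samples ^ ks
        assert np.array_equal(cipher ^ ks, samples)

    Because XOR is its own inverse, both VoIP endpoints only need to share the seed and map parameters to encrypt and decrypt the sample stream.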

  17. Reflective Learning and Prospective Teachers' Conceptual Understanding, Critical Thinking, Problem Solving, and Mathematical Communication Skills

    Science.gov (United States)

    Junsay, Merle L.

    2016-01-01

    This is a quasi-experimental study that explored the effects of reflective learning on prospective teachers' conceptual understanding, critical thinking, problem solving, and mathematical communication skills and the relationship of these variables. It involved 60 prospective teachers from two basic mathematics classes of an institution of higher…

  18. Cleft Audit Protocol for Speech (CAPS-A): A Comprehensive Training Package for Speech Analysis

    Science.gov (United States)

    Sell, D.; John, A.; Harding-Bell, A.; Sweeney, T.; Hegarty, F.; Freeman, J.

    2009-01-01

    Background: The previous literature has largely focused on speech analysis systems and ignored process issues, such as the nature of adequate speech samples, data acquisition, recording and playback. Although there has been recognition of the need for training on tools used in speech analysis associated with cleft palate, little attention has been…

  19. Radiological evaluation of esophageal speech on total laryngectomee

    International Nuclear Information System (INIS)

    Chung, Tae Sub; Suh, Jung Ho; Kim, Dong Ik; Kim, Gwi Eon; Hong, Won Phy; Lee, Won Sang

    1988-01-01

    Total laryngectomees require some form of alaryngeal speech for communication. Generally, esophageal speech is regarded as the most accessible and comfortable alaryngeal speech technique. However, esophageal speech is difficult to train, so many patients are unable to attain it for communication. To understand the mechanism of esophageal speech in total laryngectomees, evaluation of anatomical changes of the pharyngoesophageal segment is very important. We used video fluoroscopy to evaluate the pharyngoesophageal segment during esophageal speech. Eighteen total laryngectomees were evaluated with video fluoroscopy from Dec. 1986 to May 1987 at Y.U.M.C. Our results were as follows: 1. The pseudoglottis is the most important factor for esophageal speech; it was visualized in 7 of the 8 cases in the excellent esophageal speech group. 2. The two cases with a longer A-P diameter at the pseudoglottis had better quality esophageal speech than the others. 3. Two cases with mucosal vibration at the pharyngoesophageal segment could produce excellent esophageal speech. 4. The causes of failed esophageal speech were poor aerophagia in 6 cases, absence of a pseudoglottis in 4 cases, and poor air ejection in 3 cases. 5. Aerophagia synchronized with diaphragmatic motion in 8 cases of excellent esophageal speech.

  20. Application of Interpersonal Meaning in Hillary’s and Trump’s Election Speeches

    Directory of Open Access Journals (Sweden)

    Kuang Ping

    2017-12-01

    Presidential election speeches, as a significant part of Western political life, deserve attention. This paper focuses on the use of interpersonal meaning in political speeches. Nine texts selected from the Internet are analyzed from the perspectives of mood, modality, personal pronouns and the tense system, based on Halliday's Systemic Functional Grammar. The aim is to study how interpersonal meaning is realized through language by making a contrastive analysis of the speeches given by Hillary and Trump. After a detailed analysis, the paper comes to the following conclusions: (1) As for mood, Trump and Hillary mainly employ the declarative to deliver messages and make statements; the imperative is used to motivate the audiences and narrow the gap between the candidates and the audiences; and the interrogative is used to make the audiences concentrate on the content of the speeches. (2) With respect to the modality system, the median modal operator holds the dominant position in both Trump's and Hillary's speeches, making the speeches less aggressive; in this respect, Trump does better than Hillary. (3) In regard to personal pronouns, the plural form of the first person pronoun is mainly employed by the two candidates to close the relationship with the audiences. (4) Regarding the tense system, the simple present tense is mostly used to establish intimacy between the audiences and the candidates. Two influential factors are then discussed: the candidates' personal backgrounds and their language levels. This paper is helpful for people to deeply understand the two candidates' language differences.

  1. Speech and audio processing for coding, enhancement and recognition

    CERN Document Server

    Togneri, Roberto; Narasimha, Madihally

    2015-01-01

    This book describes the basic principles underlying the generation, coding, transmission and enhancement of speech and audio signals, including advanced statistical and machine learning techniques for speech and speaker recognition, with an overview of the key innovations in these areas. Key research undertaken in speech coding, speech enhancement, speech recognition, emotion recognition and speaker diarization is also presented, along with recent advances and new paradigms in these areas. The book offers readers a single-source reference on the significant applications of speech and audio processing to speech coding, speech enhancement and speech/speaker recognition; enables readers involved in algorithm development and implementation issues for speech coding to understand the historical development and future challenges in speech coding research; and discusses speech coding methods yielding bit-streams that are multi-rate and scalable for Voice-over-IP (VoIP) networks…

  2. SynFace—Speech-Driven Facial Animation for Virtual Speech-Reading Support

    Directory of Open Access Journals (Sweden)

    Giampiero Salvi

    2009-01-01

    This paper describes SynFace, a supportive technology that aims at enhancing audio-based spoken communication in adverse acoustic conditions by providing the missing visual information in the form of an animated talking head. Firstly, we describe the system architecture, consisting of a 3D animated face model controlled from the speech input by a specifically optimised phonetic recogniser. Secondly, we report on speech intelligibility experiments with focus on multilinguality and robustness to audio quality. The system, already available for Swedish, English, and Flemish, was optimised for German and for the Swedish wide-band speech quality available in TV, radio, and Internet communication. Lastly, the paper covers experiments with nonverbal motions driven from the speech signal. It is shown that turn-taking gestures can be used to affect the flow of human-human dialogues. We have focused specifically on two categories of cues that may be extracted from the acoustic signal: prominence/emphasis and interactional cues (turn-taking/back-channelling).

  3. New approaches to characterizing and understanding biofouling of spiral wound membrane systems

    KAUST Repository

    van Loosdrecht, Mark C.M.

    2012-06-01

    Historically, biofouling research on spiral wound membrane systems has typically been problem-solving oriented. Membrane modules are studied as black-box systems, investigated by autopsies. Biofouling is not a simple process: many factors influence each other in a non-linear fashion. These features make biofouling a subject that is not easy to study using a fundamental scientific approach. Nevertheless, to solve or minimize the negative impacts of biofouling, a clear understanding of the interacting basic principles is needed. Recent research into microbiological characterization of biofouling, small-scale test units, application of in situ visualization methods, and model approaches allows such an integrated study of biofouling. © IWA Publishing 2012.

  4. New approaches to characterizing and understanding biofouling of spiral wound membrane systems

    KAUST Repository

    van Loosdrecht, Mark C.M.; Bereschenko, Ludmilla A.; Radu, Andrea I.; Kruithof, Joop C.; Picioreanu, Cristian; Johns, Michael L.; Vrouwenvelder, Johannes S.

    2012-01-01

    Historically, biofouling research on spiral wound membrane systems has typically been problem-solving oriented. Membrane modules are studied as black-box systems, investigated by autopsies. Biofouling is not a simple process: many factors influence each other in a non-linear fashion. These features make biofouling a subject that is not easy to study using a fundamental scientific approach. Nevertheless, to solve or minimize the negative impacts of biofouling, a clear understanding of the interacting basic principles is needed. Recent research into microbiological characterization of biofouling, small-scale test units, application of in situ visualization methods, and model approaches allows such an integrated study of biofouling. © IWA Publishing 2012.

  5. Study of solving a Toda dynamic system with loop algebra

    International Nuclear Information System (INIS)

    Zhu Qiao; Yang Zhanying; Shi Kangjie; Wen Junqing

    2006-01-01

    The authors construct a Toda system with a loop algebra and prove that the Lax equation dL/dt = [L, M] can be solved by means of solving a regular Riemann-Hilbert problem. In their system, the matrix M of the Lax pair is antisymmetric, while L = L+ + M, where L+ is a quasi-upper-triangular matrix of the loop algebra. To check the result, the authors exactly solve a Riemann-Hilbert problem under a given initial condition as an example. (authors)
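
    For orientation, the mechanics behind this record's method can be sketched in standard Lax-pair notation (this is textbook isospectral theory, not the authors' specific loop-algebra construction):

        \frac{dL}{dt} = [L, M] = LM - ML,
        \qquad
        L(t) = g(t)\, L(0)\, g(t)^{-1},

    so the spectrum of L(t) is conserved along the flow; obtaining the conjugating factor g(t) from a factorization such as \exp\bigl(t\, L(0)\bigr) = g_-(t)^{-1} g_+(t) is precisely a (regular) Riemann-Hilbert problem, and solving it integrates the system.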

  6. ON INTEGRATED COURSE “SOCIAL AND SPEECH COMMUNICATIONS” FOR STUDENTS OF ART HIGHER EDUCATIONAL ESTABLISHMENT

    Directory of Open Access Journals (Sweden)

    Elena Nicolaevna Klemenova

    2013-11-01

    The article describes the experience of teaching the course "Social and Speech Communication". As a result of the training, students are expected to master an arsenal of means for effective communication, grounded in linguistic communication and its bearer, the language personality; to gain knowledge about the complex processes of information exchange; to discover the psychological peculiarities of verbal and non-verbal communication; and to learn how to communicate in order to solve professional and personal problems. Fluent command of all kinds of speech activity, correct and intelligent communication in various spheres and settings, and linguistic analysis of speech events, including their esthetic value, represent the unity of the systemic and individual approaches in the humanitarian training of future architects, designers and managers. DOI: http://dx.doi.org/10.12731/2218-7405-2013-7-43

  7. Speech Intelligibility Evaluation for Mobile Phones

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Cubick, Jens; Dau, Torsten

    2015-01-01

    In the development process of modern telecommunication systems, such as mobile phones, it is common practice to use computer models to objectively evaluate the transmission quality of the system, instead of time-consuming perceptual listening tests. Such models have typically focused on the quality of the transmitted speech, while little or no attention has been provided to speech intelligibility. The present study investigated to what extent three state-of-the-art speech intelligibility models could predict the intelligibility of noisy speech transmitted through mobile phones. Sentences from the Danish Dantale II speech material were mixed with three different kinds of background noise, transmitted through three different mobile phones, and recorded at the receiver via a local network simulator. The speech intelligibility of the transmitted sentences was assessed by six normal-hearing listeners…

  8. Speech Recognition for the iCub Platform

    Directory of Open Access Journals (Sweden)

    Bertrand Higy

    2018-02-01

    This paper describes open source software (available at https://github.com/robotology/natural-speech) to build automatic speech recognition (ASR) systems and run them within the YARP platform. The toolkit is designed (i) to allow non-ASR experts to easily create their own ASR system and run it on iCub and (ii) to build deep learning-based models specifically addressing the main challenges an ASR system faces in the context of verbal human–iCub interactions. The toolkit mostly consists of Python and C++ code and shell scripts integrated in YARP. As an additional contribution, a second codebase (written in Matlab) is provided for more expert ASR users who want to experiment with bio-inspired and developmental learning-inspired ASR systems. Specifically, we provide code for two distinct kinds of speech recognition: "articulatory" and "unsupervised" speech recognition. The first is largely inspired by influential neurobiological theories of speech perception which assume speech perception to be mediated by brain motor cortex activities; our articulatory systems have been shown to outperform strong deep learning-based baselines. The second type of recognition system, the "unsupervised" one, does not use any supervised information (contrary to most ASR systems, including our articulatory systems); to some extent, it mimics an infant who has to discover the basic speech units of a language by herself. In addition, we provide resources consisting of pre-trained deep learning models for ASR, and a 2.5-h speech dataset of spoken commands, the VoCub dataset, which can be used to adapt an ASR system to the typical acoustic environments in which iCub operates.

  9. Speech Enhancement by MAP Spectral Amplitude Estimation Using a Super-Gaussian Speech Model

    Directory of Open Access Journals (Sweden)

    Lotter Thomas

    2005-01-01

    This contribution presents two spectral amplitude estimators for acoustical background noise suppression based on maximum a posteriori estimation and super-Gaussian statistical modelling of the speech DFT amplitudes. The probability density function of the speech spectral amplitude is modelled with a simple parametric function, which allows a high approximation accuracy for Laplace- or Gamma-distributed real and imaginary parts of the speech DFT coefficients. Also, the statistical model can be adapted to optimally fit the distribution of the speech spectral amplitudes for a specific noise reduction system. Based on the super-Gaussian statistical model, computationally efficient maximum a posteriori speech estimators are derived, which outperform the commonly applied Ephraim-Malah algorithm.
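
    As a reference point, the "simple parametric function" in this line of work is typically a two-parameter amplitude prior of the following shape (the parameter names and normalization here are assumptions, written out for orientation rather than copied from the paper):

        p(A) = \frac{\mu^{\nu+1}}{\Gamma(\nu+1)}
               \, \frac{A^{\nu}}{\sigma_s^{\nu+1}}
               \, \exp\!\Bigl(-\mu \frac{A}{\sigma_s}\Bigr),
        \qquad A \ge 0,

    where \sigma_s is the speech spectral standard deviation and (\nu, \mu) shape the density toward Laplace- or Gamma-like behaviour. The MAP estimator then maximizes p(A \mid Y) \propto p(Y \mid A)\, p(A) over A independently in each DFT bin.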

  10. The relationship of speech intelligibility with hearing sensitivity, cognition, and perceived hearing difficulties varies for different speech perception tests

    Science.gov (United States)

    Heinrich, Antje; Henshaw, Helen; Ferguson, Melanie A.

    2015-01-01

    Listeners vary in their ability to understand speech in noisy environments. Hearing sensitivity, as measured by pure-tone audiometry, can only partly explain these results, and cognition has emerged as another key concept. Although cognition relates to speech perception, the exact nature of the relationship remains to be fully understood. This study investigates how different aspects of cognition, particularly working memory and attention, relate to speech intelligibility for various tests. Perceptual accuracy of speech perception represents just one aspect of functioning in a listening environment. Activity and participation limits imposed by hearing loss, in addition to the demands of a listening environment, are also important and may be better captured by self-report questionnaires. Understanding how speech perception relates to self-reported aspects of listening forms the second focus of the study. Forty-four listeners aged between 50 and 74 years with mild sensorineural hearing loss were tested on speech perception tests differing in complexity from low (phoneme discrimination in quiet), to medium (digit triplet perception in speech-shaped noise), to high (sentence perception in modulated noise); cognitive tests of attention, memory, and non-verbal intelligence quotient; and self-report questionnaires of general health-related and hearing-specific quality of life. Hearing sensitivity and cognition related to intelligibility differently depending on the speech test: neither was important for phoneme discrimination, hearing sensitivity alone was important for digit triplet perception, and hearing and cognition together played a role in sentence perception. Self-reported aspects of auditory functioning were correlated with speech intelligibility to different degrees, with digit triplets in noise showing the richest pattern. The results suggest that intelligibility tests can vary in their auditory and cognitive demands and their sensitivity to the challenges that auditory environments pose…

  11. The relationship of speech intelligibility with hearing sensitivity, cognition, and perceived hearing difficulties varies for different speech perception tests

    Directory of Open Access Journals (Sweden)

    Antje eHeinrich

    2015-06-01

    Listeners vary in their ability to understand speech in noisy environments. Hearing sensitivity, as measured by pure-tone audiometry, can only partly explain these results, and cognition has emerged as another key concept. Although cognition relates to speech perception, the exact nature of the relationship remains to be fully understood. This study investigates how different aspects of cognition, particularly working memory and attention, relate to speech intelligibility for various tests. Perceptual accuracy of speech perception represents just one aspect of functioning in a listening environment. Activity and participation limits imposed by hearing loss, in addition to the demands of a listening environment, are also important and may be better captured by self-report questionnaires. Understanding how speech perception relates to self-reported aspects of listening forms the second focus of the study. Forty-four listeners aged between 50 and 74 years with mild sensorineural hearing loss (SNHL) were tested on speech perception tests differing in complexity from low (phoneme discrimination in quiet), to medium (digit triplet perception in speech-shaped noise), to high (sentence perception in modulated noise); cognitive tests of attention, memory, and nonverbal IQ; and self-report questionnaires of general health-related and hearing-specific quality of life. Hearing sensitivity and cognition related to intelligibility differently depending on the speech test: neither was important for phoneme discrimination, hearing sensitivity alone was important for digit triplet perception, and hearing and cognition together played a role in sentence perception. Self-reported aspects of auditory functioning were correlated with speech intelligibility to different degrees, with digit triplets in noise showing the richest pattern. The results suggest that intelligibility tests can vary in their auditory and cognitive demands and their sensitivity to the challenges that auditory environments pose…

  12. Multiple Problem-Solving Strategies Provide Insight into Students' Understanding of Open-Ended Linear Programming Problems

    Science.gov (United States)

    Sole, Marla A.

    2016-01-01

    Open-ended questions that can be solved using different strategies help students learn and integrate content, and provide teachers with greater insights into students' unique capabilities and levels of understanding. This article provides a problem that was modified to allow for multiple approaches. Students tended to employ high-powered, complex,…

  13. Digitized Ethnic Hate Speech: Understanding Effects of Digital Media Hate Speech on Citizen Journalism in Kenya

    Science.gov (United States)

    Kimotho, Stephen Gichuhi; Nyaga, Rahab Njeri

    2016-01-01

    Ethnicity in Kenya permeates all spheres of life. However, it is in politics that ethnicity is most visible. Election time in Kenya often leads to ethnic competition and hatred, often expressed through various media. Ethnic hate speech characterized the 2007 general elections in party rallies and through text messages, emails, posters and…

  14. Segmental intelligibility of synthetic speech produced by rule.

    Science.gov (United States)

    Logan, J S; Greene, B G; Pisoni, D B

    1989-08-01

    This paper reports the results of an investigation that employed the modified rhyme test (MRT) to measure the segmental intelligibility of synthetic speech generated automatically by rule. Synthetic speech produced by ten text-to-speech systems was studied and compared to natural speech. A variation of the standard MRT was also used to study the effects of response set size on perceptual confusions. Results indicated that the segmental intelligibility scores formed a continuum. Several systems displayed very high levels of performance that were close to or equal to scores obtained with natural speech; other systems displayed substantially worse performance compared to natural speech. The overall performance of the best system, DECtalk--Paul, was equivalent to the data obtained with natural speech for consonants in syllable-initial position. The findings from this study are discussed in terms of the use of a set of standardized procedures for measuring intelligibility of synthetic speech under controlled laboratory conditions. Recent work investigating the perception of synthetic speech under more severe conditions in which greater demands are made on the listener's processing resources is also considered. The wide range of intelligibility scores obtained in the present study demonstrates important differences in perception and suggests that not all synthetic speech is perceptually equivalent to the listener.
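
    Since the MRT is a six-alternative forced-choice test, raw percent-correct scores are often corrected for guessing before systems are compared. A minimal sketch of that standard adjustment (the correction formula is the usual forced-choice chance correction, not something specified in this record):

        def mrt_corrected_score(proportion_correct: float, n_alternatives: int = 6) -> float:
            """Correct a forced-choice proportion-correct score for guessing."""
            chance = 1.0 / n_alternatives
            return (proportion_correct - chance) / (1.0 - chance)

        # e.g. 90% observed on the 6-alternative MRT -> 88% after chance correction
        print(mrt_corrected_score(0.90))  # 0.88

    The correction rescales scores so that pure guessing maps to 0 and perfect performance to 1, which makes intelligibility continua such as the one reported here comparable across response-set sizes.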

  15. Segmental intelligibility of synthetic speech produced by rule

    Science.gov (United States)

    Logan, John S.; Greene, Beth G.; Pisoni, David B.

    2012-01-01

    This paper reports the results of an investigation that employed the modified rhyme test (MRT) to measure the segmental intelligibility of synthetic speech generated automatically by rule. Synthetic speech produced by ten text-to-speech systems was studied and compared to natural speech. A variation of the standard MRT was also used to study the effects of response set size on perceptual confusions. Results indicated that the segmental intelligibility scores formed a continuum. Several systems displayed very high levels of performance that were close to or equal to scores obtained with natural speech; other systems displayed substantially worse performance compared to natural speech. The overall performance of the best system, DECtalk—Paul, was equivalent to the data obtained with natural speech for consonants in syllable-initial position. The findings from this study are discussed in terms of the use of a set of standardized procedures for measuring intelligibility of synthetic speech under controlled laboratory conditions. Recent work investigating the perception of synthetic speech under more severe conditions in which greater demands are made on the listener's processing resources is also considered. The wide range of intelligibility scores obtained in the present study demonstrates important differences in perception and suggests that not all synthetic speech is perceptually equivalent to the listener.

  16. Speech emotion recognition methods: A literature review

    Science.gov (United States)

    Basharirad, Babak; Moradhaseli, Mohammadreza

    2017-10-01

    Recently, attention to research on emotional speech signals has increased in the field of human-machine interfaces due to the availability of high computation capability. Many systems have been proposed in the literature to identify emotional states through speech. Selection of suitable feature sets, design of proper classification methods, and preparation of an appropriate dataset are the main key issues of speech emotion recognition systems. This paper critically analyzes the currently available approaches to speech emotion recognition based on three evaluation parameters (feature set, classification of features, and accuracy). In addition, it evaluates the performance and limitations of available methods, and highlights promising directions for improving speech emotion recognition systems.

  17. Speech perception in noise in unilateral hearing loss

    OpenAIRE

    Mondelli, Maria Fernanda Capoani Garcia; dos Santos, Marina de Marchi; José, Maria Renata

    2016-01-01

    INTRODUCTION: Unilateral hearing loss is characterized by a decrease of hearing in one ear only. In the presence of ambient noise, individuals with unilateral hearing loss are faced with greater difficulties understanding speech than normal listeners. OBJECTIVE: To evaluate the speech perception of individuals with unilateral hearing loss with and without competing noise, before and after the hearing aid fitting process. METHODS: The study included 30 adults…

  18. Indian accent text-to-speech system for web browsing

    Indian Academy of Sciences (India)

    This paper describes a 'web reader' which 'reads out' the textual contents of a selected web page in Hindi or in English with Indian accent. The content of the page is downloaded and parsed into suitable textual form. It is then passed on to an indigenously developed text-to-speech system for Hindi/Indian English, ...

  19. The relationship between the intelligibility of time-compressed speech and speech in noise in young and elderly listeners

    Science.gov (United States)

    Versfeld, Niek J.; Dreschler, Wouter A.

    2002-01-01

    A conventional measure to determine the ability to understand speech in noisy backgrounds is the so-called speech reception threshold (SRT) for sentences. It yields the signal-to-noise ratio (in dB) for which half of the sentences are correctly perceived. The SRT defines to what degree speech must be audible to a listener in order to become just intelligible. There are indications that elderly listeners have greater difficulty in understanding speech in adverse listening conditions than young listeners. This may be partly due to the differences in hearing sensitivity (presbycusis), hence audibility, but other factors, such as temporal acuity, may also play a significant role. A potential measure for the temporal acuity may be the threshold to which speech can be accelerated, or compressed in time. A new test is introduced where the speech rate is varied adaptively. In analogy to the SRT, the time-compression threshold (or TCT) then is defined as the speech rate (expressed in syllables per second) for which half of the sentences are correctly perceived. In experiment I, the TCT test is introduced and normative data are provided. In experiment II, four groups of subjects (young and elderly normal-hearing and hearing-impaired subjects) participated, and the SRT's in stationary and fluctuating speech-shaped noise were determined, as well as the TCT. The results show that the SRT in fluctuating noise and the TCT are highly correlated. All tests indicate that, even after correction for the hearing loss, elderly normal-hearing subjects perform worse than young normal-hearing subjects. The results indicate that the use of the TCT test or the SRT test in fluctuating noise is preferred over the SRT test in stationary noise.
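
    Both the SRT and the new TCT come from adaptive tracking toward the 50%-correct point. A minimal simulation sketch of such a procedure (the one-up/one-down rule, step size, and toy psychometric function below are illustrative assumptions, not the paper's exact protocol):

        import random

        def adaptive_track(threshold: float, start: float, step: float, trials: int = 40) -> float:
            """One-up/one-down staircase converging on the 50%-correct point.

            `threshold` is the simulated listener's true 50% point (e.g. a speech
            rate in syllables/s for the TCT); the track makes the task harder
            after each correct response and easier after each error.
            """
            level = start
            levels = []
            for _ in range(trials):
                # Toy psychometric function: p = 0.5 exactly at `threshold`.
                p_correct = 1.0 / (1.0 + 10 ** ((level - threshold) / 2.0))
                if random.random() < p_correct:
                    level += step   # correct -> harder (faster speech)
                else:
                    level -= step   # wrong -> easier (slower speech)
                levels.append(level)
            return sum(levels[-10:]) / 10  # crude estimate: mean of the last 10 levels

        random.seed(0)
        print(adaptive_track(threshold=6.0, start=3.0, step=0.5))  # converges near 6 syllables/s

    The same skeleton yields an SRT when the tracked variable is signal-to-noise ratio in dB instead of speech rate, which is why the two measures can be compared listener by listener.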

  20. Improvements in speech understanding with wireless binaural broadband digital hearing instruments in adults with sensorineural hearing loss.

    Science.gov (United States)

    Kreisman, Brian M; Mazevski, Annette G; Schum, Donald J; Sockalingam, Ravichandran

    2010-03-01

    This investigation examined whether speech intelligibility in noise can be improved using a new, binaural broadband hearing instrument system. Participants were 36 adults with symmetrical, sensorineural hearing loss (18 experienced hearing instrument users and 18 without prior experience). Participants were fit binaurally in a planned comparison, randomized crossover design study with binaural broadband hearing instruments and advanced digital hearing instruments. Following an adjustment period with each device, participants underwent two speech-in-noise tests: the QuickSIN and the Hearing in Noise Test (HINT). Results suggested significantly better performance on the QuickSIN and the HINT measures with the binaural broadband hearing instruments, when compared with the advanced digital hearing instruments and unaided, across and within all noise conditions.

  1. The Relationship Between Spectral Modulation Detection and Speech Recognition: Adult Versus Pediatric Cochlear Implant Recipients.

    Science.gov (United States)

    Gifford, René H; Noble, Jack H; Camarata, Stephen M; Sunderhaus, Linsey W; Dwyer, Robert T; Dawant, Benoit M; Dietrich, Mary S; Labadie, Robert F

    2018-01-01

    Adult cochlear implant (CI) recipients demonstrate a reliable relationship between spectral modulation detection and speech understanding. Prior studies documenting this relationship have focused on postlingually deafened adult CI recipients, leaving an open question regarding the relationship between spectral resolution and speech understanding for adults and children with prelingual onset of deafness. Here, we report CI performance on the measures of speech recognition and spectral modulation detection for 578 CI recipients including 477 postlingual adults, 65 prelingual adults, and 36 prelingual pediatric CI users. The results demonstrated a significant correlation between spectral modulation detection and various measures of speech understanding for 542 adult CI recipients. For 36 pediatric CI recipients, however, there was no significant correlation between spectral modulation detection and speech understanding in quiet or in noise nor was spectral modulation detection significantly correlated with listener age or age at implantation. These findings suggest that pediatric CI recipients might not depend upon spectral resolution for speech understanding in the same manner as adult CI recipients. It is possible that pediatric CI users are making use of different cues, such as those contained within the temporal envelope, to achieve high levels of speech understanding. Further investigation is warranted to investigate the relationship between spectral and temporal resolution and speech recognition to describe the underlying mechanisms driving peripheral auditory processing in pediatric CI users.

  2. Assessment of speech intelligibility in background noise and reverberation

    DEFF Research Database (Denmark)

    Nielsen, Jens Bo

    Reliable methods for assessing speech intelligibility are essential within hearing research, audiology, and related areas. Such methods can be used for obtaining a better understanding of how speech intelligibility is affected by, e.g., various environmental factors or different types of hearing impairment. In this thesis, two sentence-based tests for speech intelligibility in Danish were developed. The first test is the Conversational Language Understanding Evaluation (CLUE), which is based on the principles of the original American-English Hearing in Noise Test (HINT). The second test is a modified version of CLUE where the speech material and the scoring rules have been reconsidered. An extensive validation of the modified test was conducted with both normal-hearing and hearing-impaired listeners. The validation showed that the test produces reliable results for both groups of listeners…

  3. The speech perception skills of children with and without speech sound disorder.

    Science.gov (United States)

    Hearnshaw, Stephanie; Baker, Elise; Munro, Natalie

    To investigate whether Australian-English speaking children with and without speech sound disorder (SSD) differ in their overall speech perception accuracy; additionally, to investigate differences in the perception of specific phonemes and the association between speech perception and speech production skills. Twenty-five Australian-English speaking children aged 48-60 months participated in this study. The SSD group included 12 children and the typically developing (TD) group included 13 children. Children completed routine speech and language assessments in addition to an experimental Australian-English lexical and phonetic judgement task based on Rvachew's Speech Assessment and Interactive Learning System (SAILS) program (Rvachew, 2009). This task included eight words across four word-initial phonemes: /k, ɹ, ʃ, s/. Children with SSD showed significantly poorer perceptual accuracy on the lexical and phonetic judgement task compared with TD peers. The phonemes /ɹ/ and /s/ were most frequently perceived in error across both groups. Additionally, the phoneme /ɹ/ was most commonly produced in error. There was also a positive correlation between overall speech perception and speech production scores. Children with SSD perceived speech less accurately than their typically developing peers. The findings suggest that an Australian-English variation of a lexical and phonetic judgement task similar to the SAILS program is promising and worthy of a larger-scale study.

  4. Neural Entrainment to Speech Modulates Speech Intelligibility

    NARCIS (Netherlands)

    Riecke, Lars; Formisano, Elia; Sorger, Bettina; Baskent, Deniz; Gaudrain, Etienne

    2018-01-01

    Speech is crucial for communication in everyday life. Speech-brain entrainment, the alignment of neural activity to the slow temporal fluctuations (envelope) of acoustic speech input, is a ubiquitous element of current theories of speech processing. Associations between speech-brain entrainment and…

  5. Experimental quantum computing to solve systems of linear equations.

    Science.gov (United States)

    Cai, X-D; Weedbrook, C; Su, Z-E; Chen, M-C; Gu, Mile; Zhu, M-J; Li, Li; Liu, Nai-Le; Lu, Chao-Yang; Pan, Jian-Wei

    2013-06-07

    Solving linear systems of equations is ubiquitous in all areas of science and engineering. With rapidly growing data sets, such a task can be intractable for classical computers, as the best known classical algorithms require a time proportional to the number of variables N. A recently proposed quantum algorithm shows that quantum computers could solve linear systems in a time scale of order log(N), giving an exponential speedup over classical computers. Here we realize the simplest instance of this algorithm, solving 2×2 linear equations for various input vectors on a quantum computer. We use four quantum bits and four controlled logic gates to implement every subroutine required, demonstrating the working principle of this algorithm.
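
    For intuition, the classical counterpart of the demonstrated instance is tiny. A sketch (the matrix and right-hand side below are made-up illustrative values, not those used in the experiment), noting that the quantum algorithm outputs the solution only as the amplitudes of a normalized state |x⟩:

        import numpy as np

        # Hypothetical 2x2 system A x = b (values are illustrative only).
        A = np.array([[1.5, 0.5],
                      [0.5, 1.5]])
        b = np.array([1.0, 0.0])

        x = np.linalg.solve(A, b)          # classical solution, poly(N) time in general
        x_state = x / np.linalg.norm(x)    # what the quantum algorithm encodes: |x> up to norm

        print(x, x_state)

    The quantum speedup claimed in the record refers to preparing |x⟩ in time roughly logarithmic in the number of variables N, rather than reading out every component of x, which would again cost time linear in N.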

  6. Applied information system-based in enhancing students' understanding towards higher order thinking (HOTS)

    Science.gov (United States)

    Hua, Ang Kean; Ping, Owi Wei

    2017-05-01

    The application of information and communications technology (ICT) has become more important in our daily life, especially in the educational field. Teachers are encouraged to use information systems in teaching mathematics courses. The Higher Order Thinking Skills (HOTS) approach cannot be conveyed through chalk-and-talk methods alone; it requires students to analyze, evaluate, and create on their own. The aim of this research study was to evaluate the effectiveness of an information-system-based application in enhancing students' understanding of HOTS questions. A mixed-methods (quantitative and qualitative) approach was applied in collecting data, involving only Standard Five students and their teachers in Sabak Bernam, Selangor. Pre- and post-tests were held before and after using the information-system-based teaching to evaluate the students' understanding. The post-test results indicate a significant improvement, which shows that the information-system-based approach is able to enhance students' understanding of HOTS questions and how to solve them. Several factors influenced the students, such as students' attitudes, the teacher's ability to engage them, school facilities, and the computer-based approach. Teachers play an important role in attracting students to learn. Therefore, schools should provide a conducive learning environment and good facilities so that students are able to access more information and are always exposed to new knowledge. In conclusion, information-system-based teaching can enhance students' understanding of HOTS questions and their ability to solve them.

  7. Development of a problem solving evaluation instrument; untangling of specific problem solving assets

    Science.gov (United States)

    Adams, Wendy Kristine

    The purpose of my research was to produce a problem solving evaluation tool for physics. To do this it was necessary to gain a thorough understanding of how students solve problems. Although physics educators highly value problem solving and have put extensive effort into understanding successful problem solving, there is currently no efficient way to evaluate problem solving skill. Attempts have been made in the past; however, knowledge of the principles required to solve the subject problem is so critical that it completely overshadows any other skills students may use when solving a problem. The work presented here is unique because the evaluation tool removes the requirement that the student already have a grasp of physics concepts. It is also unique in drawing on a wide range of people and a wide range of tasks for evaluation, an important design feature that helps the component skills emerge more clearly. This dissertation includes an extensive literature review of problem solving in physics, math, education and cognitive science, as well as descriptions of studies involving student use of interactive computer simulations, the design and validation of a beliefs-about-physics survey, and finally the design of the problem solving evaluation tool. I have successfully developed and validated a problem solving evaluation tool that identifies 44 separate assets (skills) necessary for solving problems. Rigorous validation studies, including work with an independent interviewer, show that the assets identified by this content-free evaluation tool are the same assets that students use to solve problems in mechanics and quantum mechanics. Understanding this set of component assets will help teachers and researchers address problem solving within the classroom.

  8. The motor theory of speech perception revisited.

    Science.gov (United States)

    Massaro, Dominic W; Chen, Trevor H

    2008-04-01

    Galantucci, Fowler, and Turvey (2006) have claimed that perceiving speech is perceiving gestures and that the motor system is recruited for perceiving speech. We make the counterargument that perceiving speech is not perceiving gestures, that the motor system is not recruited for perceiving speech, and that speech perception can be adequately described by a prototypical pattern recognition model, the fuzzy logical model of perception (FLMP). Empirical evidence taken as support for gesture and motor theory is reconsidered in more detail and in the framework of the FLMP. Additional theoretical and logical arguments are made to challenge gesture and motor theory.

  9. The knowledge and attitudes of occupational therapy, physiotherapy and speech-language therapy students, regarding the speech-language therapist's role in the hospital stroke rehabilitation team.

    Science.gov (United States)

    Felsher, L; Ross, E

    1994-01-01

    The purpose of the present study was to survey and compare the knowledge and attitudes of final year occupational therapy, physiotherapy and speech-language therapy students, concerning the role of the speech-language therapist as a member of the stroke rehabilitation team in the hospital setting. In order to achieve this aim, a questionnaire was administered to final year students in these three disciplines, and included questions on most areas of stroke rehabilitation with which the speech-language therapist might be involved, as well as the concepts of rehabilitation and teamwork in relation to stroke rehabilitation. Results suggested a fairly good understanding of the concepts of rehabilitation and teamwork. Students appeared to have a greater understanding of those disorders following a stroke, with which the speech-language therapist is commonly involved, such as Aphasia, Dysarthria, Verbal Apraxia and Dysphagia. However, students appeared to show less understanding of those disorders post-stroke, for which the speech-language therapist's role is less well defined, such as Agraphia, Alexia and Amnesia. In addition, a high percentage of role duplication/overlapping in several aspects of stroke rehabilitation, such as family and social support, was found. Several implications for facilitating communication, collaboration and understanding between paramedical professions, as well as for further research are also provided.

  10. Augmentative and Alternative Communication in Autism: A Comparison of the Picture Exchange Communication System and Speech-Output Technology

    Science.gov (United States)

    Boesch, Miriam Chacon

    2011-01-01

    The purpose of this comparative efficacy study was to investigate the Picture Exchange Communication System (PECS) and a speech-generating device (SGD) in developing requesting skills, social-communicative behavior, and speech for three elementary-age children with severe autism and little to no functional speech. Requesting was selected as the…

  11. Speech Clarity Index (Ψ): A Distance-Based Speech Quality Indicator and Recognition Rate Prediction for Dysarthric Speakers with Cerebral Palsy

    Science.gov (United States)

    Kayasith, Prakasith; Theeramunkong, Thanaruk

    It is a tedious and subjective task to measure the severity of dysarthria by manually evaluating a speaker's speech using the available standard assessment methods based on human perception. This paper presents an automated approach to assess the speech quality of a dysarthric speaker with cerebral palsy. With the consideration of two complementary factors, speech consistency and speech distinction, a speech quality indicator called the speech clarity index (Ψ) is proposed as a measure of the speaker's ability to produce consistent speech signals for a given word and distinguishable speech signals for different words. As an application, it can be used to assess speech quality and forecast the speech recognition rate for an individual dysarthric speaker before exhaustive implementation of an automatic speech recognition system for that speaker. The effectiveness of Ψ as a speech recognition rate predictor is evaluated by rank-order inconsistency, correlation coefficient, and root-mean-square of difference. The evaluations were done by comparing its predicted recognition rates with those predicted by the standard methods, the articulatory and intelligibility tests, based on two recognition systems (HMM and ANN). The results show that Ψ is a promising indicator for predicting the recognition rate of dysarthric speech. All experiments were done on a speech corpus composed of speech data from eight normal speakers and eight dysarthric speakers.
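
    A schematic reading of the two factors, under the assumption (a guess at the general shape, not the paper's formula) that clarity grows with inter-word distinction and shrinks with intra-word inconsistency over acoustic feature vectors:

        import numpy as np

        def clarity_index(tokens_by_word: dict) -> float:
            """Toy clarity score: mean inter-word distance / mean intra-word distance.

            `tokens_by_word` maps each word to an array of shape (n_tokens, n_features),
            e.g. averaged MFCC vectors per utterance. Higher = more consistent within
            words and more distinct across words. Illustrative only.
            """
            centroids = {w: t.mean(axis=0) for w, t in tokens_by_word.items()}
            intra = np.mean([np.linalg.norm(t - centroids[w], axis=1).mean()
                             for w, t in tokens_by_word.items()])
            words = list(tokens_by_word)
            inter = np.mean([np.linalg.norm(centroids[a] - centroids[b])
                             for i, a in enumerate(words) for b in words[i + 1:]])
            return float(inter / (intra + 1e-9))

        rng = np.random.default_rng(0)
        data = {w: rng.normal(loc=i, scale=0.3, size=(5, 12))
                for i, w in enumerate(["ba", "da", "ga"])}
        print(clarity_index(data))  # larger for clearer (more separable) speech

    Any recognizer, HMM- or ANN-based, will find well-separated, low-variance word classes easier, which is the intuition behind using such an index to forecast recognition rates.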

  12. Speech Intelligibility Potential of General and Specialized Deep Neural Network Based Speech Enhancement Systems

    DEFF Research Database (Denmark)

    Kolbæk, Morten; Tan, Zheng-Hua; Jensen, Jesper

    2017-01-01

    In this paper, we study aspects of single-microphone speech enhancement (SE) based on deep neural networks (DNNs). Specifically, we explore the generalizability of state-of-the-art DNN-based SE systems with respect to the background noise type, the gender of the target speaker … general. Finally, we compare how a DNN-based SE system trained to be noise-type general, speaker general, and SNR general performs relative to a state-of-the-art short-time spectral amplitude minimum mean square error (STSA-MMSE) based SE algorithm. We show that DNN-based SE systems, when trained … a state-of-the-art STSA-MMSE based SE method, when tested using a range of unseen speakers and noise types. Finally, a listening test using several DNN-based SE systems tested in unseen speaker conditions shows that these systems can improve SI for some SNR and noise type configurations but degrade SI …
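
    A minimal sketch of the general recipe such systems follow, mask regression on spectral features (the layer sizes, toy data, and ideal-ratio-mask target below are stand-ins, not the systems compared in the record):

        import torch
        import torch.nn as nn

        # Toy stand-in features: log-magnitude spectra of noisy speech (batch, bins)
        # and oracle ideal ratio masks as regression targets.
        noisy = torch.randn(256, 257)
        irm = torch.rand(256, 257)

        net = nn.Sequential(
            nn.Linear(257, 512), nn.ReLU(),
            nn.Linear(512, 512), nn.ReLU(),
            nn.Linear(512, 257), nn.Sigmoid(),  # mask in [0, 1] per frequency bin
        )
        opt = torch.optim.Adam(net.parameters(), lr=1e-3)
        loss_fn = nn.MSELoss()

        for step in range(100):                 # toy training loop on toy data
            opt.zero_grad()
            loss = loss_fn(net(noisy), irm)
            loss.backward()
            opt.step()

        # Enhancement: apply the estimated mask to the noisy magnitude spectrum.
        enhanced = net(noisy).detach() * noisy.exp()  # assumes log-magnitude input features

    The generalizability questions the record raises then amount to how this training set is composed: one noise type, speaker, and SNR versus many of each.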

  13. The Military Language Tutor (MILT) Program: An Advanced Authoring System.

    Science.gov (United States)

    Kaplan, Jonathan D.; Sabol, Mark A.; Wisher, Robert A.; Seidel, Robert J.

    1998-01-01

    Discusses the Military Language Tutor (MILT), a language-tutor authoring system, examining the development of a proof-of-principle version of MILT's two-dimensional Arabic microworld, which uses speech input to control an animated agent in solving an authored problem, and describing an evaluation of the speech-driven microworld at Fort Campbell,…

  14. Primary progressive aphasia and apraxia of speech.

    Science.gov (United States)

    Jung, Youngsin; Duffy, Joseph R; Josephs, Keith A

    2013-09-01

    Primary progressive aphasia is a neurodegenerative syndrome characterized by progressive language dysfunction. The majority of primary progressive aphasia cases can be classified into three subtypes: nonfluent/agrammatic, semantic, and logopenic variants. Each variant presents with unique clinical features, and is associated with distinctive underlying pathology and neuroimaging findings. Unlike primary progressive aphasia, apraxia of speech is a disorder that involves inaccurate production of sounds secondary to impaired planning or programming of speech movements. Primary progressive apraxia of speech is a neurodegenerative form of apraxia of speech, and it should be distinguished from primary progressive aphasia given its discrete clinicopathological presentation. Recently, there have been substantial advances in our understanding of these speech and language disorders. The clinical, neuroimaging, and histopathological features of primary progressive aphasia and apraxia of speech are reviewed in this article. The distinctions among these disorders for accurate diagnosis are increasingly important from a prognostic and therapeutic standpoint.

  15. Performance Evaluation of Speech Recognition Systems as a Next-Generation Pilot-Vehicle Interface Technology

    Science.gov (United States)

    Arthur, Jarvis J., III; Shelton, Kevin J.; Prinzel, Lawrence J., III; Bailey, Randall E.

    2016-01-01

    During the flight trials known as the Gulfstream-V Synthetic Vision Systems Integrated Technology Evaluation (GV-SITE), a Speech Recognition System (SRS) was used by the evaluation pilots. The SRS was intended to be an intuitive interface for display control (rather than knobs, buttons, etc.). This paper describes the performance of a current state-of-the-art SRS. The commercially available technology was evaluated as an application for possible inclusion in commercial aircraft flight decks as a crew-to-vehicle interface, specifically as an interface from aircrew to the onboard displays, controls, and flight management tasks. A flight test of an SRS as well as a laboratory test were conducted.

  16. Hidden Markov models in automatic speech recognition

    Science.gov (United States)

    Wrzoskowicz, Adam

    1993-11-01

    This article describes a method for constructing an automatic speech recognition system based on hidden Markov models (HMMs). The author discusses the basic concepts of HMM theory and the application of these models to the analysis and recognition of speech signals. The author provides algorithms which make it possible to train the ASR system and recognize signals on the basis of distinct stochastic models of selected speech sound classes. The author describes the specific components of the system and the procedures used to model and recognize speech. The author discusses problems associated with the choice of optimal signal detection and parameterization characteristics and their effect on the performance of the system. The author presents different options for the choice of speech signal segments and their consequences for the ASR process. The author gives special attention to the use of lexical, syntactic, and semantic information for the purpose of improving the quality and efficiency of the system. The author also describes an ASR system developed by the Speech Acoustics Laboratory of the IBPT PAS. The author discusses the results of experiments on the effect of noise on the performance of the ASR system and describes methods of constructing HMMs designed to operate in a noisy environment. The author also describes a language for human-robot communication, defined as a complex multilevel network built from an HMM model of speech sounds geared towards Polish inflections. The author also added mandatory lexical and syntactic rules to the system for its communications vocabulary.
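
    To make the recognition step concrete, here is a compact sketch of Viterbi decoding over a discrete-observation HMM (the states, symbols, and probabilities are toy values, not those of the described system):

        import numpy as np

        def viterbi(obs, pi, A, B):
            """Most likely state path for a discrete-observation HMM.

            obs: observation indices; pi: initial state probs (S,);
            A: transition matrix (S, S); B: emission matrix (S, V).
            """
            delta = np.log(pi) + np.log(B[:, obs[0]])   # log-prob of best path per state
            psi = []
            for o in obs[1:]:
                trans = delta[:, None] + np.log(A)      # score of every state->state move
                psi.append(trans.argmax(axis=0))        # best predecessor per state
                delta = trans.max(axis=0) + np.log(B[:, o])
            path = [int(delta.argmax())]
            for back in reversed(psi):                  # backtrack through predecessors
                path.append(int(back[path[-1]]))
            return path[::-1]

        # Two toy acoustic states emitting three toy symbols.
        pi = np.array([0.6, 0.4])
        A = np.array([[0.7, 0.3], [0.4, 0.6]])
        B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
        print(viterbi([0, 1, 2, 2], pi, A, B))  # e.g. [0, 0, 1, 1]

    In a full ASR system the same dynamic program runs over phoneme- or word-level models, with the lexical and syntactic constraints mentioned above pruning the search space.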

  17. Speech-Language Dissociations, Distractibility, and Childhood Stuttering

    Science.gov (United States)

    Conture, Edward G.; Walden, Tedra A.; Lambert, Warren E.

    2015-01-01

    Purpose: This study investigated the relation among speech-language dissociations, attentional distractibility, and childhood stuttering. Method: Participants were 82 preschool-age children who stutter (CWS) and 120 who do not stutter (CWNS). Correlation-based statistics (Bates, Appelbaum, Salcedo, Saygin, & Pizzamiglio, 2003) identified dissociations across 5 norm-based speech-language subtests. The Behavioral Style Questionnaire Distractibility subscale measured attentional distractibility. Analyses addressed (a) between-groups differences in the number of children exhibiting speech-language dissociations; (b) between-groups distractibility differences; (c) the relation between distractibility and speech-language dissociations; and (d) whether interactions between distractibility and dissociations predicted the frequency of total, stuttered, and nonstuttered disfluencies. Results: More preschool-age CWS exhibited speech-language dissociations compared with CWNS, and more boys exhibited dissociations compared with girls. In addition, male CWS were less distractible than female CWS and female CWNS. For CWS, but not CWNS, less distractibility (i.e., greater attention) was associated with more speech-language dissociations. Last, interactions between distractibility and dissociations did not predict speech disfluencies in CWS or CWNS. Conclusions: The present findings suggest that for preschool-age CWS, attentional processes are associated with speech-language dissociations. Future investigations are warranted to better understand the directionality of effect of this association (e.g., inefficient attentional processes → speech-language dissociations vs. inefficient attentional processes ← speech-language dissociations).

  18. Adapting to foreign-accented speech: The role of delay in testing

    NARCIS (Netherlands)

    Witteman, M.J.; Bardhan, N.P.; Weber, A.C.; McQueen, J.M.

    2011-01-01

    Understanding speech usually seems easy, but it can become noticeably harder when the speaker has a foreign accent. This is because foreign accents add considerable variation to speech. Research on foreign-accented speech shows that participants are able to adapt quickly to this type of variation.

  19. Speech Silicon: An FPGA Architecture for Real-Time Hidden Markov-Model-Based Speech Recognition

    Directory of Open Access Journals (Sweden)

    Schuster Jeffrey

    2006-01-01

    This paper examines the design of an FPGA-based system-on-a-chip capable of performing continuous speech recognition on medium sized vocabularies in real time. Through the creation of three dedicated pipelines, one for each of the major operations in the system, we were able to maximize the throughput of the system while simultaneously minimizing the number of pipeline stalls in the system. Further, by implementing a token-passing scheme between the later stages of the system, the complexity of the control was greatly reduced and the amount of active data present in the system at any time was minimized. Additionally, through in-depth analysis of the SPHINX 3 large vocabulary continuous speech recognition engine, we were able to design models that could be efficiently benchmarked against a known software platform. These results, combined with the ability to reprogram the system for different recognition tasks, serve to create a system capable of performing real-time speech recognition in a vast array of environments.

  20. Speech Silicon: An FPGA Architecture for Real-Time Hidden Markov-Model-Based Speech Recognition

    Directory of Open Access Journals (Sweden)

    Alex K. Jones

    2006-11-01

    This paper examines the design of an FPGA-based system-on-a-chip capable of performing continuous speech recognition on medium sized vocabularies in real time. Through the creation of three dedicated pipelines, one for each of the major operations in the system, we were able to maximize the throughput of the system while simultaneously minimizing the number of pipeline stalls in the system. Further, by implementing a token-passing scheme between the later stages of the system, the complexity of the control was greatly reduced and the amount of active data present in the system at any time was minimized. Additionally, through in-depth analysis of the SPHINX 3 large vocabulary continuous speech recognition engine, we were able to design models that could be efficiently benchmarked against a known software platform. These results, combined with the ability to reprogram the system for different recognition tasks, serve to create a system capable of performing real-time speech recognition in a vast array of environments.

  1. HIV health information access using spoken dialogue systems: touchtone vs speech

    CSIR Research Space (South Africa)

    Sharma Grover, A

    2009-04-01

    This paper presents work on the design of a spoken dialogue system (SDS) for the provision of health information to caregivers of HIV-positive children. The authors specifically address the frequently debated question of input modality in speech systems; touchtone...

  2. Understanding Time and Problem Solving Experience: A Case Study of the Invisible Police

    Directory of Open Access Journals (Sweden)

    Amir Khorasani

    In this paper we explore the relation between actors' understanding of time and their problem-solving strategies in complicated situations. Drawing on ethnography and conversation analysis, we focus on the institutional interaction order governing the scenes these recordings exhibit. Using phenomenology and Ernest Pople's indices, we analyze the understanding of time constructed in these conversations, considering the moments in which violators rationalize the reasons behind their violations. The results show that while the time relied on by the law, the police, and even road technologies is homogeneous and linear, the drivers employ expressions connoting an iterative understanding of time. The paper concludes by showing how law-breaking drivers base their conversations on a nonlinear time to manage the difficult situations they are involved in. This suggests that, far from being a universal category, time constantly takes different shapes in different everyday encounters.

  3. Filtering the Unknown: Speech Activity Detection in Heterogeneous Video Collections

    NARCIS (Netherlands)

    Huijbregts, M.A.H.; Wooters, Chuck; Ordelman, Roeland J.F.

    2007-01-01

    In this paper we discuss the speech activity detection system that we used for detecting speech regions in the Dutch TRECVID video collection. The system is designed to filter non-speech, like music or sound effects, out of the signal without the use of predefined non-speech models. Because the system…

  4. Talker-specific learning in amnesia: Insight into mechanisms of adaptive speech perception.

    Science.gov (United States)

    Trude, Alison M; Duff, Melissa C; Brown-Schmidt, Sarah

    2014-05-01

    A hallmark of human speech perception is the ability to comprehend speech quickly and effortlessly despite enormous variability across talkers. However, current theories of speech perception do not make specific claims about the memory mechanisms involved in this process. To examine whether declarative memory is necessary for talker-specific learning, we tested the ability of amnesic patients with severe declarative memory deficits to learn and distinguish the accents of two unfamiliar talkers by monitoring their eye-gaze as they followed spoken instructions. Analyses of the time-course of eye fixations showed that amnesic patients rapidly learned to distinguish these accents and tailored perceptual processes to the voice of each talker. These results demonstrate that declarative memory is not necessary for this ability and point to the involvement of non-declarative memory mechanisms. These results are consistent with findings that other social and accommodative behaviors are preserved in amnesia and contribute to our understanding of the interactions of multiple memory systems in the use and understanding of spoken language. Copyright © 2014 Elsevier Ltd. All rights reserved.

  5. Foucault's "fearless speech" and the transformation and mentoring of medical students

    Directory of Open Access Journals (Sweden)

    Papadimos Thomas J

    2008-04-01

    Full Text Available In his six 1983 lectures published under the title Fearless Speech (2001), Michel Foucault developed the theme of free speech and its relation to frankness, truth-telling, criticism, and duty. Derived from the ancient Greek word parrhesia, Foucault's analysis of free speech is relevant to the mentoring of medical students. This is especially true given the educational and social need to transform future physicians into able citizens who practice a fearless freedom of expression on behalf of their patients, the public, the medical profession, and themselves in the public and political arena. In this paper, we argue that Foucault's understanding of free speech, or parrhesia, should be read as an ethical response to the American Medical Association's recent educational effort, Initiative to Transform Medical Education (ITME): Recommendations for change in the system of medical education (2007). In this document, the American Medical Association identifies gaps in medical education, emphasizing the need to enhance health system safety and quality, to improve education in training institutions, and to address the inadequacy of physician preparedness in new content areas. These gaps, and their relationship to the ITME goal of promoting excellence in patient care by implementing reform in the US system of medical education, call for a serious consideration and use of Foucault's parrhesia in the way that medical students are trained and mentored.

  6. Linear program differentiation for single-channel speech separation

    DEFF Research Database (Denmark)

    Pearlmutter, Barak A.; Olsson, Rasmus Kongsgaard

    2006-01-01

    Many apparently difficult problems can be solved by reduction to linear programming. Such problems are often subproblems within larger systems. When gradient optimisation of the entire larger system is desired, it is necessary to propagate gradients through the internally-invoked LP solver. For instance, when an intermediate quantity z is the solution to a linear program involving constraint matrix A, a vector of sensitivities dE/dz will induce sensitivities dE/dA. Here we show how these can be efficiently calculated, when they exist. This allows algorithmic differentiation to be applied to algorithms that invoke linear programming solvers as subroutines, as is common when using sparse representations in signal processing. Here we apply it to gradient optimisation of overcomplete dictionaries for maximally sparse representations of a speech corpus. The dictionaries are employed in a single...
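
    To make the dE/dz to dE/dA step concrete, here is a minimal derivation sketch, assuming a nondegenerate optimum whose optimal basis matrix B (the square submatrix of A selecting the basic variables) is locally stable:

    ```latex
    % At a nondegenerate LP optimum with stable basis B, the basic solution
    % is z_B = B^{-1} b. Differentiating this identity gives, for upstream
    % sensitivities g_B = dE/dz_B,
    \[
      \frac{\partial E}{\partial b} = B^{-\top} g_B,
      \qquad
      \frac{\partial E}{\partial B} = -\bigl(B^{-\top} g_B\bigr)\, z_B^{\top},
    \]
    % which holds exactly when the perturbation leaves the optimal basis
    % unchanged: the "when they exist" caveat in the abstract.
    ```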

  7. Survey on Chatbot Design Techniques in Speech Conversation Systems

    OpenAIRE

    Sameera A. Abdul-Kader; Dr. John Woods

    2015-01-01

    Human-Computer Speech is gaining momentum as a technique of computer interaction. There has been a recent upsurge in speech based search engines and assistants such as Siri, Google Chrome and Cortana. Natural Language Processing (NLP) techniques such as NLTK for Python can be applied to analyse speech, and intelligent responses can be found by designing an engine to provide appropriate human like responses. This type of programme is called a Chatbot, which is the focus of this study. This pap...

  8. How Should Children with Speech Sound Disorders be Classified? A Review and Critical Evaluation of Current Classification Systems

    Science.gov (United States)

    Waring, R.; Knight, R.

    2013-01-01

    Background: Children with speech sound disorders (SSD) form a heterogeneous group who differ in terms of the severity of their condition, underlying cause, speech errors, involvement of other aspects of the linguistic system and treatment response. To date there is no universal and agreed-upon classification system. Instead, a number of…

  9. ANALYSIS OF MULTIMODAL FUSION TECHNIQUES FOR AUDIO-VISUAL SPEECH RECOGNITION

    Directory of Open Access Journals (Sweden)

    D.V. Ivanko

    2016-05-01

    Full Text Available The paper presents an analytical review covering the latest achievements in the field of audio-visual (AV) fusion, i.e., the integration of multimodal information. We discuss the main challenges and report on approaches to address them. One of the most important tasks of AV integration is to understand how the modalities interact and influence each other. The paper addresses this problem in the context of AV speech processing and speech recognition. In the first part of the review we set out the basic principles of AV speech recognition and give a classification of audio and visual speech features. Special attention is paid to the systematization of existing techniques and AV data fusion methods. In the second part we provide a consolidated list of tasks and applications that use AV fusion, based on our analysis of the research area. We also indicate the methods, techniques, and audio and video features used. We propose a classification of AV integration approaches and discuss the advantages and disadvantages of the different approaches. We draw conclusions and offer our assessment of the future of the field of AV fusion. In further research we plan to implement a system for audio-visual Russian continuous speech recognition using advanced methods of multimodal fusion.

  10. Contrast in concept-to-speech generation

    NARCIS (Netherlands)

    Theune, Mariet; Walker, M.; Rambow, O.

    2002-01-01

    In concept-to-speech systems, spoken output is generated on the basis of a text that has been produced by the system itself. In such systems, linguistic information from the text generation component may be exploited to achieve a higher prosodic quality of the speech output than can be obtained in a...

  11. The Timing and Effort of Lexical Access in Natural and Degraded Speech

    NARCIS (Netherlands)

    Wagner, Anita E; Toffanin, Paolo; Başkent, Deniz

    2016-01-01

    Understanding speech is effortless in ideal situations, and although adverse conditions, such as those caused by hearing impairment, often render it an effortful task, they do not necessarily suspend speech comprehension. A prime example of this is speech perception by cochlear implant users, whose...

  12. Complex Problem Solving in Radiologic Technology: Understanding the Roles of Experience, Reflective Judgment, and Workplace Culture

    Science.gov (United States)

    Yates, Jennifer L.

    2011-01-01

    The purpose of this research study was to explore the process of learning and development of problem solving skills in radiologic technologists. The researcher sought to understand the nature of difficult problems encountered in clinical practice, to identify specific learning practices leading to the development of professional expertise, and to…

  13. The effect of problem posing and problem solving with realistic mathematics education approach to the conceptual understanding and adaptive reasoning

    Science.gov (United States)

    Mahendra, Rengga; Slamet, Isnandar; Budiyono

    2017-12-01

    One of the difficulties students face in learning mathematics is the subject of geometry, which requires them to understand abstract things. The aim of this research is to determine the effect of the Problem Posing and Problem Solving learning models with a Realistic Mathematics Education approach on conceptual understanding and students' adaptive reasoning in learning mathematics. This research uses a quasi-experimental design. The population of this research is all seventh grade students of Junior High School 1 Jaten, Indonesia. The sample was taken using a stratified cluster random sampling technique. The research hypotheses were tested using t-tests. The results of this study indicate that the Problem Posing model with a Realistic Mathematics Education approach can significantly improve students' conceptual understanding in mathematics learning. In addition, the results also showed that the Problem Solving model with a Realistic Mathematics Education approach can significantly improve students' adaptive reasoning in learning mathematics. Therefore, the Problem Posing and Problem Solving learning models with a Realistic Mathematics Education approach are appropriately applied in mathematics learning, especially on the subject of geometry, so as to improve conceptual understanding and students' adaptive reasoning. Furthermore, the impact can improve student achievement.

  14. Inner Speech's Relationship With Overt Speech in Poststroke Aphasia.

    Science.gov (United States)

    Stark, Brielle C; Geva, Sharon; Warburton, Elizabeth A

    2017-09-18

    Relatively preserved inner speech alongside poor overt speech has been documented in some persons with aphasia (PWA), but the relationship of overt speech with inner speech is still largely unclear, as few studies have directly investigated these factors. The present study investigates the relationship of relatively preserved inner speech in aphasia with selected measures of language and cognition. Thirty-eight persons with chronic aphasia (27 men, 11 women; average age 64.53 ± 13.29 years, time since stroke 8-111 months) were classified as having relatively preserved inner and overt speech (n = 21), relatively preserved inner speech with poor overt speech (n = 8), or not classified due to insufficient measurements of inner and/or overt speech (n = 9). Inner speech scores (by group) were correlated with selected measures of language and cognition from the Comprehensive Aphasia Test (Swinburn, Porter, & Howard, 2004). The group with poor overt speech showed a significant relationship of inner speech with overt naming (r = .95), whereas correlations between inner speech and language and cognition factors were not significant for the group with relatively good overt speech. As in previous research, we show that relatively preserved inner speech is found alongside otherwise severe production deficits in PWA. PWA with poor overt speech may rely more on preserved inner speech for overt picture naming (perhaps due to shared resources with verbal working memory) and for written picture description (perhaps due to reliance on inner speech given perceived task difficulty). Assessments of inner speech may be useful as a standard component of aphasia screening, and therapy focused on improving and using inner speech may prove clinically worthwhile. https://doi.org/10.23641/asha.5303542.

  15. Nobel peace speech

    Directory of Open Access Journals (Sweden)

    Joshua FRYE

    2017-07-01

    Full Text Available The Nobel Peace Prize has long been considered the premier peace prize in the world. According to Geir Lundestad, Secretary of the Nobel Committee, of the 300-some peace prizes awarded worldwide, “none is in any way as well known and as highly respected as the Nobel Peace Prize” (Lundestad, 2001). Nobel peace speech is a unique and significant international site of public discourse committed to articulating the universal grammar of peace. Spanning over 100 years of sociopolitical history on the world stage, Nobel Peace Laureates richly represent an important cross-section of domestic and international issues increasingly germane to many publics. Communication scholars’ interest in this rhetorical genre has increased in the past decade. Yet the norm has been to analyze a single speech artifact from a prestigious or controversial winner rather than examine the collection of speeches for generic commonalities of import. In this essay, we analyze the discourse of Nobel peace speech inductively and argue that the organizing principle of the Nobel peace speech genre is the repetitive form of normative liberal principles and values that function as rhetorical topoi. These topoi include freedom and justice, and appeal to the inviolable, inborn right of human beings to exercise certain political and civil liberties and the expectation of equal protection from totalitarian and tyrannical abuses. The significance of this essay to contemporary communication theory is to expand our theoretical understanding of rhetoric’s role in the maintenance and development of an international and cross-cultural vocabulary for the grammar of peace.

  16. From Gesture to Speech

    Directory of Open Access Journals (Sweden)

    Maurizio Gentilucci

    2012-11-01

    Full Text Available One of the major problems concerning the evolution of human language is to understand how sounds became associated with meaningful gestures. It has been proposed that the circuit controlling gestures and speech evolved from a circuit involved in the control of arm and mouth movements related to ingestion. This circuit contributed to the evolution of spoken language, moving from a system of communication based on arm gestures. The discovery of mirror neurons has provided strong support for the gestural theory of speech origin because they offer a natural substrate for the embodiment of language and create a direct link between sender and receiver of a message. Behavioural studies indicate that manual gestures are linked to mouth movements used for syllable emission. Grasping with the hand selectively affected movements of inner or outer parts of the mouth according to syllable pronunciation, and hand postures, in addition to hand actions, influenced the control of mouth grasp and vocalization. Gestures and words are also related to each other. It was found that when producing communicative gestures (emblems), the intention to interact directly with a conspecific was transferred from gestures to words, inducing modifications in voice parameters. Transfer effects of the meaning of representational gestures were found on both vocalizations and meaningful words. We conclude that the results of our studies suggest the existence of a system relating gesture to vocalization which was a precursor of a more general system reciprocally relating gesture to word.

  17. Cultural-historical and cognitive approaches to understanding the origins of development of written speech

    Directory of Open Access Journals (Sweden)

    L.F. Obukhova

    2014-08-01

    Full Text Available We present an analysis of the emergence and development of written speech, its relationship to oral speech, and its connections to the symbolic and modeling activities of preschool children – playing and drawing. While a child's drawing is traditionally interpreted in psychology either as a measure of intellectual development, as a projective technique, or as a criterion for the creative giftedness of the child, in this article artistic activity is analyzed as a prerequisite for the development of written speech. The article substantiates the hypothesis that mastery of “picture writing” – the ability to display verbal content in a schematic pictorial plan – is connected to success with written speech at school age. Along with the classical works of L.S. Vygotsky, D.B. Elkonin, and A.R. Luria dedicated to finding the origins of writing, the article presents current Russian and foreign frameworks for forming the preconditions of writing, based on the concepts of cultural-historical theory (“higher mental functions”, “zone of proximal development”, etc.). In Western psychology, a number of pilot studies have used the developmental function of drawing for teaching writing skills to children of 5-7 years old. However, in cognitive psychology, the relationship between drawing and writing is most often reduced mainly to the analysis of general motor circuits. Despite the revival of research on writing and its origins in the last decade, in both Russian and foreign psychology written speech remains an insufficiently studied problem.

  18. A Noninvasive Imaging Approach to Understanding Speech Changes following Deep Brain Stimulation in Parkinson's Disease

    Science.gov (United States)

    Narayana, Shalini; Jacks, Adam; Robin, Donald A.; Poizner, Howard; Zhang, Wei; Franklin, Crystal; Liotti, Mario; Vogel, Deanie; Fox, Peter T.

    2009-01-01

    Purpose: To explore the use of noninvasive functional imaging and "virtual" lesion techniques to study the neural mechanisms underlying motor speech disorders in Parkinson's disease. Here, we report the use of positron emission tomography (PET) and transcranial magnetic stimulation (TMS) to explain exacerbated speech impairment following…

  19. Are mirror neurons the basis of speech perception? Evidence from five cases with damage to the purported human mirror system

    Science.gov (United States)

    Rogalsky, Corianne; Love, Tracy; Driscoll, David; Anderson, Steven W.; Hickok, Gregory

    2013-01-01

    The discovery of mirror neurons in macaque has led to a resurrection of motor theories of speech perception. Although the majority of lesion and functional imaging studies have associated perception with the temporal lobes, it has also been proposed that the ‘human mirror system’, which prominently includes Broca’s area, is the neurophysiological substrate of speech perception. Although numerous studies have demonstrated a tight link between sensory and motor speech processes, few have directly assessed the critical prediction of mirror neuron theories of speech perception, namely that damage to the human mirror system should cause severe deficits in speech perception. The present study measured speech perception abilities of patients with lesions involving motor regions in the left posterior frontal lobe and/or inferior parietal lobule (i.e., the proposed human ‘mirror system’). Performance was at or near ceiling in patients with fronto-parietal lesions. It is only when the lesion encroaches on auditory regions in the temporal lobe that perceptual deficits are evident. This suggests that ‘mirror system’ damage does not disrupt speech perception, but rather that auditory systems are the primary substrate for speech perception. PMID:21207313

  20. The Galker test of speech reception in noise

    DEFF Research Database (Denmark)

    Lauritsen, Maj-Britt Glenn; Söderström, Margareta; Kreiner, Svend

    2016-01-01

    PURPOSE: We tested "the Galker test", a speech reception in noise test developed for primary care for Danish preschool children, to explore if the children's ability to hear and understand speech was associated with gender, age, middle ear status, and the level of background noise. METHODS: The Galker test is a 35-item audio-visual, computerized word discrimination test in background noise. Included were 370 normally developed children attending day care centers. The children were examined with the Galker test, tympanometry, audiometry, and the Reynell test of verbal comprehension. Parents and daycare teachers completed questionnaires on the children's ability to hear and understand speech. As most of the variables were not assessed using interval scales, non-parametric statistics (Goodman-Kruskal's gamma) were used for analyzing associations with the Galker test score. For comparisons...

  1. The effects of students' reasoning abilities on conceptual understandings and problem-solving skills in introductory mechanics

    International Nuclear Information System (INIS)

    Ates, S; Cataloglu, E

    2007-01-01

    The purpose of this study was to determine if there are relationships among freshmen/first year students' reasoning abilities, conceptual understandings and problem-solving skills in introductory mechanics. The sample consisted of 165 freshmen science education prospective teachers (female = 86, male = 79; age range 17-21) who were enrolled in an introductory physics course. Data collection was done during the fall semesters in two successive years. At the beginning of each semester, the force concept inventory (FCI) and the classroom test of scientific reasoning (CTSR) were administered to assess students' initial understanding of basic concepts in mechanics and their reasoning levels. After completing the course, the FCI and the mechanics baseline test (MBT) were administered. The results indicated that there was a significant difference in problem-solving skill test mean scores, as measured by the MBT, among concrete, formal and postformal reasoners. There were no significant differences in conceptual understanding levels of pre- and post-test mean scores, as measured by the FCI, among the groups. The Bonferroni post hoc comparison test revealed which pairs of reasoning levels differed significantly on the MBT scores. No statistical difference between formal and postformal reasoners' mean scores was observed, while the mean scores between concrete and formal reasoners and between concrete and postformal reasoners were statistically significantly different.

  2. Musician advantage for speech-on-speech perception

    NARCIS (Netherlands)

    Başkent, Deniz; Gaudrain, Etienne

    Evidence for transfer of musical training to better perception of speech in noise has been mixed. Unlike speech-in-noise, speech-on-speech perception utilizes many of the skills that musical training improves, such as better pitch perception and stream segregation, as well as use of higher-level...

  3. Multiparameter extrapolation and deflation methods for solving equation systems

    Directory of Open Access Journals (Sweden)

    A. J. Hughes Hallett

    1984-01-01

    Full Text Available Most models in economics and the applied sciences are solved by first order iterative techniques, usually those based on the Gauss-Seidel algorithm. This paper examines the convergence of multiparameter extrapolations (accelerations) of first order iterations, as an improved approximation to the Newton method for solving arbitrary nonlinear equation systems. It generalises my earlier results on single parameter extrapolations. Richardson's generalised method and the deflation method for detecting successive solutions in nonlinear equation systems are also presented as multiparameter extrapolations of first order iterations. New convergence results are obtained for those methods.
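
    A sketch of the underlying idea, in generic notation rather than the paper's own: a first order iteration x^(k+1) = G x^(k) + g is accelerated by a matrix of free parameters applied to each step.

    ```latex
    % Generic multiparameter extrapolation of a first order iteration
    % x^{(k+1)} = G x^{(k)} + g (a sketch, not the paper's formulation).
    \[
      x^{(k+1)} = x^{(k)} + \Gamma \bigl( G x^{(k)} + g - x^{(k)} \bigr)
    \]
    % Single-parameter extrapolation is the special case Gamma = alpha I;
    % choosing the entries of Gamma to cancel leading error terms is what
    % approximates a Newton step without forming the Jacobian explicitly.
    ```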

  4. Child speech, language and communication need re-examined in a public health context: a new direction for the speech and language therapy profession.

    Science.gov (United States)

    Law, James; Reilly, Sheena; Snow, Pamela C

    2013-01-01

    Historically speech and language therapy services for children have been framed within a rehabilitative framework, with explicit assumptions made about providing therapy to individuals. While this is clearly important in many cases, we argue that this model needs revisiting for a number of reasons. First, our understanding of the nature of disability, and therefore communication disabilities, has changed over the past century. Second, there is an increasing understanding of the impact that the social gradient has on early communication difficulties. Finally, how these factors interact with one another and play out across the life course remains poorly understood. To describe the public health paradigm and explore its implications for speech and language therapy with children. We test the application of public health methodologies to speech and language therapy services by looking at four dimensions of service delivery: (1) the uptake of services and whether those children who need services receive them; (2) the development of universal prevention services in relation to social disadvantage; (3) the risk of over-interpreting co-morbidity from clinical samples; and (4) the overlap between communicative competence and mental health. It is concluded that there is a strong case for speech and language therapy services to be reconceptualized to respond to the needs of the whole population and according to socially determined needs, focusing on primary prevention. This is not to disregard individual need, but to highlight the needs of the population as a whole. Although the socio-political context differs between countries, we maintain that this is relevant wherever speech and language therapists have a responsibility for covering whole populations. Finally, we recommend that speech and language therapy services be conceptualized within the framework laid down in The Ottawa Charter for Health Promotion. © 2013 Royal College of Speech and Language Therapists.

  5. Improving Language Models in Speech-Based Human-Machine Interaction

    Directory of Open Access Journals (Sweden)

    Raquel Justo

    2013-02-01

    Full Text Available This work focuses on speech-based human-machine interaction. Specifically, a Spoken Dialogue System (SDS) that could be integrated into a robot is considered. Since Automatic Speech Recognition is one of the most sensitive tasks that must be confronted in such systems, the goal of this work is to improve the results obtained by this specific module. In order to do so, a hierarchical Language Model (LM) is considered. Different series of experiments were carried out using the proposed models over different corpora and tasks. The results obtained show that these models provide greater accuracy in the recognition task. Additionally, the influence of the Acoustic Modelling (AM) on the improvement achieved by the Language Models has also been explored. Finally, hierarchical Language Models have been successfully employed in a language understanding task, as shown in an additional series of experiments.
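
    One common way to make a language model hierarchical is to factor word probabilities through word classes. The sketch below shows that generic idea only; it is not the authors' model, and all names are illustrative.

    ```python
    # Generic class-based bigram LM: P(w | w_prev) is factored through word
    # classes. Purely illustrative; not the model proposed in the paper.
    from collections import defaultdict

    class ClassBigramLM:
        def __init__(self, word2class):
            self.word2class = word2class
            self.class_bigrams = defaultdict(lambda: defaultdict(int))
            self.class_words = defaultdict(lambda: defaultdict(int))

        def train(self, sentences):
            for words in sentences:
                classes = [self.word2class[w] for w in words]
                for (c1, c2), w in zip(zip(classes, classes[1:]), words[1:]):
                    self.class_bigrams[c1][c2] += 1  # class transition count
                    self.class_words[c2][w] += 1     # class emission count

        def prob(self, prev_word, word):
            # P(w | w_prev) = P(class(w) | class(w_prev)) * P(w | class(w))
            c1, c2 = self.word2class[prev_word], self.word2class[word]
            trans = self.class_bigrams[c1]
            emit = self.class_words[c2]
            p_class = trans[c2] / max(1, sum(trans.values()))
            p_word = emit[word] / max(1, sum(emit.values()))
            return p_class * p_word
    ```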

  6. Air Traffic Controllers’ Long-Term Speech-in-Noise Training Effects: A Control Group Study

    Science.gov (United States)

    Zaballos, María T.P.; Plasencia, Daniel P.; González, María L.Z.; de Miguel, Angel R.; Macías, Ángel R.

    2016-01-01

    Introduction: Speech perception in noise relies on the capacity of the auditory system to process complex sounds using sensory and cognitive skills. The possibility that these can be trained during adulthood is of special interest in auditory disorders, where speech in noise perception becomes compromised. Air traffic controllers (ATC) are constantly exposed to radio communication, a situation that seems to produce auditory learning. The objective of this study has been to quantify this effect. Subjects and Methods: 19 ATC and 19 normal hearing individuals underwent a speech in noise test with three signal to noise ratios: 5, 0 and −5 dB. Noise and speech were presented through two different loudspeakers in azimuth position. Speech tokens were presented at 65 dB SPL, while white noise files were at 60, 65 and 70 dB, respectively. Results: Air traffic controllers outperform the control group in all conditions [P<0.05 in ANOVA and Mann-Whitney U tests]. Group differences were largest in the most difficult condition, SNR=−5 dB. However, no correlation between experience and performance was found for any of the conditions tested. The reason might be that ceiling performance is achieved much faster than the minimum experience time recorded, 5 years, although intrinsic cognitive abilities cannot be disregarded. Discussion: ATC demonstrated enhanced ability to hear speech in challenging listening environments. This study provides evidence that long-term auditory training is indeed useful in achieving better speech-in-noise understanding even in adverse conditions, although good cognitive qualities are likely to be a basic requirement for this training to be effective. Conclusion: Our results show that ATC outperform the control group in all conditions. Thus, this study provides evidence that long-term auditory training is indeed useful in achieving better speech-in-noise understanding even in adverse conditions. PMID:27991470

  7. Air traffic controllers' long-term speech-in-noise training effects: A control group study.

    Science.gov (United States)

    Zaballos, Maria T P; Plasencia, Daniel P; González, María L Z; de Miguel, Angel R; Macías, Ángel R

    2016-01-01

    Speech perception in noise relies on the capacity of the auditory system to process complex sounds using sensory and cognitive skills. The possibility that these can be trained during adulthood is of special interest in auditory disorders, where speech in noise perception becomes compromised. Air traffic controllers (ATC) are constantly exposed to radio communication, a situation that seems to produce auditory learning. The objective of this study has been to quantify this effect. 19 ATC and 19 normal hearing individuals underwent a speech in noise test with three signal to noise ratios: 5, 0 and -5 dB. Noise and speech were presented through two different loudspeakers in azimuth position. Speech tokens were presented at 65 dB SPL, while white noise files were at 60, 65 and 70 dB, respectively. Air traffic controllers outperform the control group in all conditions [P<0.05 in ANOVA and Mann-Whitney U tests]. Group differences were largest in the most difficult condition, SNR=-5 dB. However, no correlation between experience and performance was found for any of the conditions tested. The reason might be that ceiling performance is achieved much faster than the minimum experience time recorded, 5 years, although intrinsic cognitive abilities cannot be disregarded. ATC demonstrated enhanced ability to hear speech in challenging listening environments. This study provides evidence that long-term auditory training is indeed useful in achieving better speech-in-noise understanding even in adverse conditions, although good cognitive qualities are likely to be a basic requirement for this training to be effective. Our results show that ATC outperform the control group in all conditions. Thus, this study provides evidence that long-term auditory training is indeed useful in achieving better speech-in-noise understanding even in adverse conditions.

  8. Specific acoustic models for spontaneous and dictated style in indonesian speech recognition

    Science.gov (United States)

    Vista, C. B.; Satriawan, C. H.; Lestari, D. P.; Widyantoro, D. H.

    2018-03-01

    The performance of an automatic speech recognition system is affected by differences in speech style between the data the model is originally trained upon and incoming speech to be recognized. In this paper, the usage of GMM-HMM acoustic models for specific speech styles is investigated. We develop two systems for the experiments; the first employs a speech style classifier to predict the speech style of incoming speech, either spontaneous or dictated, then decodes this speech using an acoustic model specifically trained for that speech style. The second system uses both acoustic models to recognise incoming speech and decides upon a final result by calculating a confidence score of decoding. Results show that training specific acoustic models for spontaneous and dictated speech styles confers a slight recognition advantage as compared to a baseline model trained on a mixture of spontaneous and dictated training data. In addition, the speech style classifier approach of the first system produced slightly more accurate results than the confidence scoring employed in the second system.

  9. Motor functions and adaptive behaviour in children with childhood apraxia of speech.

    Science.gov (United States)

    Tükel, Şermin; Björelius, Helena; Henningsson, Gunilla; McAllister, Anita; Eliasson, Ann Christin

    2015-01-01

    Undiagnosed motor and behavioural problems have been reported for children with childhood apraxia of speech (CAS). This study aims to understand the extent of these problems by determining the profile of, and relationships between, speech/non-speech oral, manual and overall body motor functions and adaptive behaviours in CAS. Eighteen children (five girls and 13 boys) with CAS, 4 years 4 months to 10 years 6 months old, participated in this study. The assessments used were the Verbal Motor Production Assessment for Children (VMPAC), Bruininks-Oseretsky Test of Motor Proficiency (BOT-2) and Adaptive Behaviour Assessment System (ABAS-II). The median result for speech/non-speech oral motor function was between -1 and -2 SD of the mean of the VMPAC norms. For the BOT-2 and ABAS-II, the median result was between the mean and -1 SD of the test norms. However, on an individual level, many children had co-occurring difficulties (below -1 SD of the mean) in overall and manual motor functions and in adaptive behaviour, despite few correlations between sub-tests. In addition to the impaired speech motor output, children displayed heterogeneous motor problems, suggesting the presence of a global motor deficit. The complex relationship between motor functions and behaviour may partly explain the undiagnosed developmental difficulties in CAS.

  10. Simulation of modified hybrid noise reduction algorithm to enhance the speech quality

    International Nuclear Information System (INIS)

    Waqas, A.; Muhammad, T.; Jamal, H.

    2013-01-01

    Speech is the most essential means of human communication. Mobile telephony, hearing aids, and hands-free systems are specific applications in this respect. The performance of these communication devices can be degraded by the distortions that accompany the speech. There are two essential types of distortion, namely convolutive and additive noise. These distortions contaminate the clean speech and make it unsatisfactory to human listeners, i.e., the perceptual quality and intelligibility of the speech signal diminish. The objective of speech enhancement systems is to improve the quality and intelligibility of speech to make it more acceptable to listeners. This paper proposes a modified hybrid approach for single-channel devices to process noisy signals, considering only the effect of background noise. It is a combination of a pre-processing relative spectral amplitude (RASTA) filter, approximated by a straightforward 4th-order band-pass filter, and the conventional minimum mean square error short-time spectral amplitude (MMSE-STSA85) estimator. To analyze the performance of the algorithm, an objective measure called Perceptual Evaluation of Speech Quality (PESQ) is computed. The results show that the modified algorithm performs well in removing background noise. A SIMULINK implementation is also performed and its profile report generated to observe the execution time. (author)
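
    A rough single-channel sketch of the approach described, assuming the RASTA stage is realized as a Butterworth band-pass over log-spectral modulation trajectories; the cutoffs and FFT sizes are illustrative, and a crude spectral gain stands in for the MMSE-STSA85 estimator.

    ```python
    # Sketch of the hybrid front end: RASTA-style band-pass filtering of
    # log-spectral trajectories plus a crude gain stage. All parameters are
    # illustrative; the real second stage would be an MMSE-STSA estimator.
    import numpy as np
    from scipy.signal import butter, lfilter, stft, istft

    def rasta_bandpass(log_spec, frame_rate):
        # 4th-order Butterworth band-pass over modulation frequencies
        # (roughly 1-12 Hz), where speech varies and background noise is
        # comparatively stationary.
        nyq = frame_rate / 2.0
        b, a = butter(4, [1.0 / nyq, 12.0 / nyq], "bandpass")
        return lfilter(b, a, log_spec, axis=1)  # filter along time frames

    def enhance(x, fs=16000, nperseg=512):
        f, t, X = stft(x, fs, nperseg=nperseg)
        log_mag = np.log(np.abs(X) + 1e-10)
        smoothed = rasta_bandpass(log_mag, frame_rate=fs / (nperseg // 2))
        gain = np.clip(np.exp(smoothed - log_mag), 0.0, 1.0)  # crude floor
        _, y = istft(X * gain, fs, nperseg=nperseg)
        return y
    ```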

  11. Discovering Steiner Triple Systems through Problem Solving

    Science.gov (United States)

    Sriraman, Bharath

    2004-01-01

    An attempt to implement problem solving as a teacher of ninth grade algebra is described. The problems selected were not general ones; they involved combinations, represented various situations, and were more complex, which led to the discovery of Steiner triple systems.

  12. Neural pathways for visual speech perception

    Directory of Open Access Journals (Sweden)

    Lynne E Bernstein

    2014-12-01

    Full Text Available This paper examines the questions, what levels of speech can be perceived visually, and how is visual speech represented by the brain? Review of the literature leads to the conclusions that every level of psycholinguistic speech structure (i.e., phonetic features, phonemes, syllables, words, and prosody) can be perceived visually, although individuals differ in their abilities to do so; and that there are visual modality-specific representations of speech qua speech in higher-level vision brain areas. That is, the visual system represents the modal patterns of visual speech. The suggestion that the auditory speech pathway receives and represents visual speech is examined in light of neuroimaging evidence on the auditory speech pathways. We outline the generally agreed-upon organization of the visual ventral and dorsal pathways and examine several types of visual processing that might be related to speech through those pathways, specifically, face and body, orthography, and sign language processing. In this context, we examine the visual speech processing literature, which reveals widespread, diverse patterns of activity in posterior temporal cortices in response to visual speech stimuli. We outline a model of the visual and auditory speech pathways and make several suggestions: (1) The visual perception of speech relies on visual pathway representations of speech qua speech. (2) A proposed site of these representations, the temporal visual speech area (TVSA), has been demonstrated in posterior temporal cortex, ventral and posterior to the multisensory posterior superior temporal sulcus (pSTS). (3) Given that visual speech has dynamic and configural features, its representations in feedforward visual pathways are expected to integrate these features, possibly in TVSA.

  13. Solving Disparities Through Payment And Delivery System Reform: A Program To Achieve Health Equity.

    Science.gov (United States)

    DeMeester, Rachel H; Xu, Lucy J; Nocon, Robert S; Cook, Scott C; Ducas, Andrea M; Chin, Marshall H

    2017-06-01

    Payment systems generally do not directly encourage or support the reduction of health disparities. In 2013 the Finding Answers: Solving Disparities through Payment and Delivery System Reform program of the Robert Wood Johnson Foundation sought to understand how alternative payment models might intentionally incorporate a disparities-reduction component to promote health equity. A qualitative analysis of forty proposals to the program revealed that applicants generally did not link payment reform tightly to disparities reduction. Most proposed general pay-for-performance, global payment, or shared savings plans, combined with multicomponent system interventions. None of the applicants proposed making any financial payments contingent on having successfully reduced disparities. Most applicants did not address how they would optimize providers' intrinsic and extrinsic motivation to reduce disparities. A better understanding of how payment and care delivery models might be designed and implemented to reduce health disparities is essential. Project HOPE—The People-to-People Health Foundation, Inc.

  14. Speech networks at rest and in action: interactions between functional brain networks controlling speech production

    Science.gov (United States)

    Fuertinger, Stefan

    2015-01-01

    Speech production is one of the most complex human behaviors. Although brain activation during speaking has been well investigated, our understanding of interactions between the brain regions and neural networks remains scarce. We combined seed-based interregional correlation analysis with graph theoretical analysis of functional MRI data during the resting state and sentence production in healthy subjects to investigate the interface and topology of functional networks originating from the key brain regions controlling speech, i.e., the laryngeal/orofacial motor cortex, inferior frontal and superior temporal gyri, supplementary motor area, cingulate cortex, putamen, and thalamus. During both resting and speaking, the interactions between these networks were bilaterally distributed and centered on the sensorimotor brain regions. However, speech production preferentially recruited the inferior parietal lobule (IPL) and cerebellum into the large-scale network, suggesting the importance of these regions in facilitation of the transition from the resting state to speaking. Furthermore, the cerebellum (lobule VI) was the most prominent region showing functional influences on speech-network integration and segregation. Although networks were bilaterally distributed, interregional connectivity during speaking was stronger in the left vs. right hemisphere, which may have underlined a more homogeneous overlap between the examined networks in the left hemisphere. Among these, the laryngeal motor cortex (LMC) established a core network that fully overlapped with all other speech-related networks, determining the extent of network interactions. Our data demonstrate complex interactions of large-scale brain networks controlling speech production and point to the critical role of the LMC, IPL, and cerebellum in the formation of speech production network. PMID:25673742

  15. New approach to solve symmetric fully fuzzy linear systems

    Indian Academy of Sciences (India)

    ...concepts of fuzzy set theory and then define a fully fuzzy linear system of equations. ... To represent the above problem as a fully fuzzy linear system, we represent x ... Fully fuzzy linear systems can be solved by a linear programming approach, ...

  16. Workflow Agents vs. Expert Systems: Problem Solving Methods in Work Systems Design

    Science.gov (United States)

    Clancey, William J.; Sierhuis, Maarten; Seah, Chin

    2009-01-01

    During the 1980s, a community of artificial intelligence researchers became interested in formalizing problem solving methods as part of an effort called "second generation expert systems" (2nd GES). How do the motivations and results of this research relate to building tools for the workplace today? We provide an historical review of how the theory of expertise has developed, a progress report on a tool for designing and implementing model-based automation (Brahms), and a concrete example of how we apply 2nd GES concepts today in an agent-based system for space flight operations (OCAMS). Brahms incorporates an ontology for modeling work practices, what people are doing in the course of a day, characterized as "activities." OCAMS was developed using a simulation-to-implementation methodology, in which a prototype tool was embedded in a simulation of future work practices. OCAMS uses model-based methods to interactively plan its actions and keep track of the work to be done. The problem solving methods of practice are interactive, employing reasoning for and through action in the real world. Analogously, it is as if a medical expert system were charged not just with interpreting culture results, but actually interacting with a patient. Our perspective shifts from building a "problem solving" (expert) system to building an actor in the world. The reusable components in work system designs include entire "problem solvers" (e.g., a planning subsystem), interoperability frameworks, and workflow agents that use and revise models dynamically in a network of people and tools. Consequently, the research focus shifts so "problem solving methods" include ways of knowing that models do not fit the world, and ways of interacting with other agents and people to gain or verify information and (ultimately) adapt rules and procedures to resolve problematic situations.

  17. Application of Homotopy Analysis Method to Solve Relativistic Toda Lattice System

    International Nuclear Information System (INIS)

    Wang Qi

    2010-01-01

    In this letter, the homotopy analysis method is successfully applied to solve the relativistic Toda lattice system. Comparisons are made between the results of the proposed method and exact solutions. The results show that the homotopy analysis method is a powerful and easy-to-use analytic tool for solving systems of differential-difference equations. (general)
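
    For orientation, the method rests on a zeroth-order deformation equation of the following generic form (standard HAM notation, not necessarily the letter's exact symbols):

    ```latex
    % Zeroth-order deformation equation of the homotopy analysis method:
    % N is the nonlinear operator of the lattice equations, L an auxiliary
    % linear operator, u_0 an initial guess, hbar the convergence-control
    % parameter, and q in [0,1] the embedding parameter.
    \[
      (1 - q)\,\mathcal{L}\bigl[\phi(t;q) - u_0(t)\bigr]
        = q\,\hbar\,\mathcal{N}\bigl[\phi(t;q)\bigr]
    \]
    % phi deforms from u_0 at q = 0 to the solution u at q = 1; expanding
    % phi as a power series in q yields the successive approximations.
    ```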

  18. Effects of Synthetic Speech Output on Requesting and Natural Speech Production in Children with Autism: A Preliminary Study

    Science.gov (United States)

    Schlosser, Ralf W.; Sigafoos, Jeff; Luiselli, James K.; Angermeier, Katie; Harasymowyz, Ulana; Schooley, Katherine; Belfiore, Phil J.

    2007-01-01

    Requesting is often taught as an initial target during augmentative and alternative communication intervention in children with autism. Speech-generating devices are purported to have advantages over non-electronic systems due to their synthetic speech output. On the other hand, it has been argued that speech output, being in the auditory…

  19. The Beginnings of Danish Speech Perception

    DEFF Research Database (Denmark)

    Østerbye, Torkil

    Little is known about the perception of speech sounds by native Danish listeners. However, the Danish sound system differs in several interesting ways from the sound systems of other languages. For instance, Danish is characterized, among other features, by a rich vowel inventory and by different reductions of speech sounds evident in the pronunciation of the language. This book (originally a PhD thesis) consists of three studies based on the results of two experiments. The experiments were designed to provide knowledge of the perception of Danish speech sounds by Danish adults and infants, in the light of the rich and complex Danish sound system. The first two studies report on native adults’ perception of Danish speech sounds in quiet and noise. The third study examined the development of language-specific perception in native Danish infants at 6, 9 and 12 months of age. The book points...

  20. Multiengine Speech Processing Using SNR Estimator in Variable Noisy Environments

    Directory of Open Access Journals (Sweden)

    Ahmad R. Abu-El-Quran

    2012-01-01

    Full Text Available We introduce a multiengine speech processing system that can detect the location and the type of audio signal in variable noisy environments. The system detects the location of the audio source using a microphone array; it examines the audio first, determines if it is speech or non-speech, then estimates the signal-to-noise ratio (SNR) using a Discrete-Valued SNR Estimator. Using this SNR value, instead of trying to adapt the speech signal to the speech processing system, we adapt the speech processing system to the surrounding environment of the captured speech signal. In this paper, we introduce the Discrete-Valued SNR Estimator and a multiengine classifier, using Multiengine Selection or Multiengine Weighted Fusion, and we use speaker identification (SI) as the example speech processing task. The Discrete-Valued SNR Estimator achieves an accuracy of 98.4% in characterizing the environment's SNR. Compared to a conventional single-engine SI system, the improvement in accuracy was as high as 9.0% and 10.0% for Multiengine Selection and Multiengine Weighted Fusion, respectively.
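
    A schematic sketch of the Multiengine Selection idea, assuming a bank of recognizers each trained for one discrete SNR condition; the class boundaries, names, and noise-floor input are assumptions, not the paper's design.

    ```python
    # Sketch of SNR-based engine selection. Class boundaries, names, and the
    # separate noise-floor input are assumptions, not the paper's design.
    import numpy as np

    SNR_CLASSES = [0, 5, 10, 20]  # dB; discrete environment classes

    def estimate_snr_class(signal, noise_floor):
        # Crude SNR estimate from signal and noise power, snapped to the
        # nearest discrete class, as a Discrete-Valued SNR Estimator would.
        snr_db = 10 * np.log10(np.mean(signal ** 2) / np.mean(noise_floor ** 2))
        return min(SNR_CLASSES, key=lambda c: abs(c - snr_db))

    def multiengine_select(signal, noise_floor, engines):
        """engines: dict mapping each SNR class to a recognizer callable."""
        return engines[estimate_snr_class(signal, noise_floor)](signal)
    ```

    Multiengine Weighted Fusion would instead run several engines and combine their scores with SNR-dependent weights rather than committing to a single engine.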

  1. Perceived Speech Quality Estimation Using DTW Algorithm

    Directory of Open Access Journals (Sweden)

    S. Arsenovski

    2009-06-01

    Full Text Available In this paper a method for speech quality estimation is evaluated by simulating the transfer of speech over packet switched and mobile networks. The proposed system uses the Dynamic Time Warping algorithm to compare test and received speech. Several tests have been made on a test speech sample of a single speaker with simulated packet (frame) loss effects on the perceived speech. The achieved results have been compared with measured PESQ values on the transmission channel used, and their correlation has been observed.
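
    The core of such a comparison is the classic DTW recurrence; a minimal sketch over per-frame feature vectors follows (feature extraction, e.g. MFCCs, is assumed to happen upstream).

    ```python
    # Minimal dynamic time warping between two feature sequences, the kind
    # of test/received comparison the paper describes. ref and deg are
    # arrays of shape (n_frames, n_features), e.g. per-frame MFCCs.
    import numpy as np

    def dtw_distance(ref, deg):
        n, m = len(ref), len(deg)
        cost = np.full((n + 1, m + 1), np.inf)
        cost[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                d = np.linalg.norm(ref[i - 1] - deg[j - 1])  # local distance
                # Allow match, insertion, and deletion steps.
                cost[i, j] = d + min(cost[i - 1, j - 1],
                                     cost[i - 1, j],
                                     cost[i, j - 1])
        return cost[n, m] / (n + m)  # length-normalized alignment cost
    ```

    Normalizing by the combined path length keeps scores comparable across utterances of different durations, which matters when correlating them against a reference measure such as PESQ.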

  2. The benefit obtained from visually displayed text from an automatic speech recognizer during listening to speech presented in noise

    NARCIS (Netherlands)

    Zekveld, A.A.; Kramer, S.E.; Kessens, J.M.; Vlaming, M.S.M.G.; Houtgast, T.

    2008-01-01

    OBJECTIVES: The aim of this study was to evaluate the benefit that listeners obtain from visually presented output from an automatic speech recognition (ASR) system during listening to speech in noise. DESIGN: Auditory-alone and audiovisual speech reception thresholds (SRTs) were measured. The SRT...

  3. Spectral integration in speech and non-speech sounds

    Science.gov (United States)

    Jacewicz, Ewa

    2005-04-01

    Spectral integration (or formant averaging) was proposed in vowel perception research to account for the observation that a reduction of the intensity of one of two closely spaced formants (as in /u/) produced a predictable shift in vowel quality [Delattre et al., Word 8, 195-210 (1952)]. A related observation was reported in psychoacoustics, indicating that when the components of a two-tone periodic complex differ in amplitude and frequency, its perceived pitch is shifted toward that of the more intense tone [Helmholtz, App. XIV (1875/1948)]. Subsequent research in both fields focused on the frequency interval that separates these two spectral components, in an attempt to determine the size of the bandwidth for spectral integration to occur. This talk will review the accumulated evidence for and against spectral integration within the hypothesized limit of 3.5 Bark for static and dynamic signals in speech perception and psychoacoustics. Based on similarities in the processing of speech and non-speech sounds, it is suggested that spectral integration may reflect a general property of the auditory system. A larger frequency bandwidth, possibly close to 3.5 Bark, may be utilized in integrating acoustic information, including speech, complex signals, or sound quality of a violin.
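
    For reference, the Bark scale invoked by the 3.5 Bark limit is commonly computed with the Zwicker and Terhardt (1980) analytic approximation:

    ```latex
    % Zwicker & Terhardt (1980) analytic approximation of the Bark scale,
    % with f in Hz; the integration limit discussed above is 3.5 on this
    % scale.
    \[
      z(f) = 13\,\arctan(0.00076\,f)
           + 3.5\,\arctan\!\left[\left(\tfrac{f}{7500}\right)^{2}\right]
      \quad [\text{Bark}]
    \]
    ```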

  4. Speech endpoint detection with non-language speech sounds for generic speech processing applications

    Science.gov (United States)

    McClain, Matthew; Romanowski, Brian

    2009-05-01

    Non-language speech sounds (NLSS) are sounds produced by humans that do not carry linguistic information. Examples of these sounds are coughs, clicks, breaths, and filled pauses such as "uh" and "um" in English. NLSS are prominent in conversational speech, but can be a significant source of errors in speech processing applications. Traditionally, these sounds are ignored by speech endpoint detection algorithms, where speech regions are identified in the audio signal prior to processing. The ability to filter NLSS as a pre-processing step can significantly enhance the performance of many speech processing applications, such as speaker identification, language identification, and automatic speech recognition. In order to be used in all such applications, NLSS detection must be performed without the use of language models that provide knowledge of the phonology and lexical structure of speech. This is especially relevant to situations where the languages used in the audio are not known a priori. We present the results of preliminary experiments using data from American and British English speakers, in which segments of audio are classified as language speech sounds (LSS) or NLSS using a set of acoustic features designed for language-agnostic NLSS detection and a hidden Markov model (HMM) to model speech generation. The results of these experiments indicate that the features and model used are capable of detecting certain types of NLSS, such as breaths and clicks, while detection of other types of NLSS, such as filled pauses, will require future research.
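
    As a rough illustration of this modeling setup (not the authors' feature set or topology), one small HMM can be trained per class and segment log-likelihoods compared; the hmmlearn library and all model sizes here are assumptions.

    ```python
    # Schematic LSS/NLSS segment classifier: one small Gaussian-mixture HMM
    # per class, compared by log-likelihood. hmmlearn, the feature choice,
    # and the model sizes are assumptions, not the authors' system.
    import numpy as np
    from hmmlearn import hmm

    def train_models(lss_feats, nlss_feats):
        # Each argument is a list of (n_frames, n_features) arrays.
        m_lss = hmm.GMMHMM(n_components=3, n_mix=4)
        m_lss.fit(np.vstack(lss_feats), [len(f) for f in lss_feats])
        m_nlss = hmm.GMMHMM(n_components=3, n_mix=4)
        m_nlss.fit(np.vstack(nlss_feats), [len(f) for f in nlss_feats])
        return m_lss, m_nlss

    def classify_segment(feats, m_lss, m_nlss):
        # Higher log-likelihood wins; a threshold could bias the decision.
        return "LSS" if m_lss.score(feats) >= m_nlss.score(feats) else "NLSS"
    ```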

  5. Behavioural, computational, and neuroimaging studies of acquired apraxia of speech

    Directory of Open Access Journals (Sweden)

    Kirrie J Ballard

    2014-11-01

    Full Text Available A critical examination of speech motor control depends on an in-depth understanding of network connectivity associated with Brodmann areas 44 and 45 and surrounding cortices. Damage to these areas has been associated with two conditions - the speech motor programming disorder apraxia of speech (AOS) and the linguistic/grammatical disorder of Broca's aphasia. Here we focus on AOS, which is most commonly associated with damage to posterior Broca's area and adjacent cortex. We provide an overview of our own studies into the nature of AOS, including behavioural and neuroimaging methods, to explore components of the speech motor network that are associated with normal and disordered speech motor programming in AOS. Behavioural, neuroimaging, and computational modelling studies indicate that AOS is associated with impairment in learning feedforward models and/or implementing feedback mechanisms, and with the functional contribution of BA6. While functional connectivity methods are not yet routinely applied to the study of AOS, we highlight the need to focus on the functional impact of localised lesions throughout the speech network, as well as larger scale comparative studies to distinguish the unique behavioural and neurological signature of AOS. By coupling these methods with neural network models, we have a powerful set of tools to improve our understanding of the neural mechanisms that underlie AOS, and speech production generally.

  6. Speech and Swallowing in Parkinson’s Disease

    OpenAIRE

    Tjaden, Kris

    2008-01-01

    Dysarthria and dysphagia occur frequently in Parkinson’s disease (PD). Reduced speech intelligibility is a significant functional limitation of dysarthria, and in the case of PD is likely related to articulatory and phonatory impairment. Prosodically-based treatments show the most promise for addressing these deficits as well as for maximizing speech intelligibility. Communication-oriented strategies also may help to enhance mutual understanding between a speaker and listener. Dysphagia in PD ca...

  7. Speech-in-speech perception and executive function involvement.

    Directory of Open Access Journals (Sweden)

    Marcela Perrone-Bertolotti

    Full Text Available The present study investigated the link between speech-in-speech perception capacities and four executive function components: response suppression, inhibitory control, switching, and working memory. We constructed a cross-modal semantic priming paradigm using a written target word and a spoken prime word, implemented in one of two concurrent auditory sentences (cocktail party situation). The prime and target were semantically related or unrelated. Participants had to perform a lexical decision task on the visual target words while simultaneously listening to only one of the two pronounced sentences. The attention of the participant was manipulated: the prime was either in the sentence attended to by the participant or in the ignored one. In addition, we evaluated the executive function abilities of participants (switching cost, inhibitory-control cost and response-suppression cost) and their working memory span. Correlation analyses were performed between the executive and priming measurements. Our results showed a significant interaction effect between attention and semantic priming. We observed a significant priming effect in the attended but not in the ignored condition. Only priming effects obtained in the ignored condition were significantly correlated with some of the executive measurements. However, no correlation between priming effects and working memory capacity was found. Overall, these results confirm, first, the role of attention in the semantic priming effect and, second, the implication of executive functions in speech-in-noise understanding capacities.

  8. Infants' brain responses to speech suggest analysis by synthesis.

    Science.gov (United States)

    Kuhl, Patricia K; Ramírez, Rey R; Bosseler, Alexis; Lin, Jo-Fu Lotus; Imada, Toshiaki

    2014-08-05

    Historic theories of speech perception (Motor Theory and Analysis by Synthesis) invoked listeners' knowledge of speech production to explain speech perception. Neuroimaging data show that adult listeners activate motor brain areas during speech perception. In two experiments using magnetoencephalography (MEG), we investigated motor brain activation, as well as auditory brain activation, during discrimination of native and nonnative syllables in infants at two ages that straddle the developmental transition from language-universal to language-specific speech perception. Adults are also tested in Exp. 1. MEG data revealed that 7-mo-old infants activate auditory (superior temporal) as well as motor brain areas (Broca's area, cerebellum) in response to speech, and equivalently for native and nonnative syllables. However, in 11- and 12-mo-old infants, native speech activates auditory brain areas to a greater degree than nonnative, whereas nonnative speech activates motor brain areas to a greater degree than native speech. This double dissociation in 11- to 12-mo-old infants matches the pattern of results obtained in adult listeners. Our infant data are consistent with Analysis by Synthesis: auditory analysis of speech is coupled with synthesis of the motor plans necessary to produce the speech signal. The findings have implications for: (i) perception-action theories of speech perception, (ii) the impact of "motherese" on early language learning, and (iii) the "social-gating" hypothesis and humans' development of social understanding.

  9. Effects of Social Cognitive Impairment on Speech Disorder in Schizophrenia

    OpenAIRE

    Docherty, Nancy M.; McCleery, Amanda; Divilbiss, Marielle; Schumann, Emily B.; Moe, Aubrey; Shakeel, Mohammed K.

    2012-01-01

    Disordered speech in schizophrenia impairs social functioning because it impedes communication with others. Treatment approaches targeting this symptom have been limited by an incomplete understanding of its causes. This study examined the process underpinnings of speech disorder, assessed in terms of communication failure. Contributions of impairments in 2 social cognitive abilities, emotion perception and theory of mind (ToM), to speech disorder were assessed in 63 patients with schizophren...

  10. Electrophysiological evidence for differences between fusion and combination illusions in audiovisual speech perception

    DEFF Research Database (Denmark)

    Baart, Martijn; Lindborg, Alma Cornelia; Andersen, Tobias S

    2017-01-01

    Incongruent audiovisual speech stimuli can lead to perceptual illusions such as fusions or combinations. Here, we investigated the underlying audiovisual integration process by measuring ERPs. We observed that visual speech-induced suppression of P2 amplitude (which is generally taken as a measure of audiovisual integration) for fusions was comparable to suppression obtained with fully congruent stimuli, whereas P2 suppression for combinations was larger. We argue that these effects arise because the phonetic incongruency is solved differently for both types of stimuli. This article is protected...

  11. APPRECIATING SPEECH THROUGH GAMING

    Directory of Open Access Journals (Sweden)

    Mario T Carreon

    2014-06-01

    Full Text Available This paper discusses the Speech and Phoneme Recognition as an Educational Aid for the Deaf and Hearing Impaired (SPREAD) application and the ongoing research on its deployment as a tool for motivating deaf and hearing impaired students to learn and appreciate speech. This application uses the Sphinx-4 voice recognition system to analyze the vocalization of the student and provide prompt feedback on their pronunciation. The packaging of the application as an interactive game aims to provide additional, visual motivation for deaf and hearing impaired students to learn and appreciate speech.

  12. Examining speech perception in noise and cognitive functions in the elderly.

    Science.gov (United States)

    Meister, Hartmut; Schreitmüller, Stefan; Grugel, Linda; Beutner, Dirk; Walger, Martin; Meister, Ingo

    2013-12-01

    The purpose of this study was to investigate the relationship of cognitive functions (i.e., working memory [WM]) and speech recognition against different background maskers in older individuals. Speech reception thresholds (SRTs) were determined using a matrix-sentence test. Unmodulated noise, modulated noise (International Collegium for Rehabilitative Audiology [ICRA] noise 5-250), and speech fragments (International Speech Test Signal [ISTS]) were used as background maskers. Verbal WM was assessed using the Verbal Learning and Memory Test (VLMT; Helmstaedter & Durwen, 1990). Measurements were conducted with 14 normal-hearing older individuals and a control group of 12 normal-hearing young listeners. Despite their normal hearing ability, the young listeners outperformed the older individuals with all background maskers. These differences were largest for the modulated maskers. SRTs were significantly correlated with the scores of the VLMT. A linear regression model also included WM as the only significant predictor variable. The results support the assumption that WM plays an important role in speech understanding and that it might have an impact on results obtained using speech audiometry. Thus, an individual's WM capacity should be considered in aural diagnosis and rehabilitation. The VLMT proved to be a clinically applicable test for WM. Further cognitive functions important for speech understanding are currently being investigated within the SAKoLA (Sprachaudiometrie und kognitive Leistungen im Alter [Speech Audiometry and Cognitive Functions in the Elderly]) project.

  13. Practising verbal maritime communication with computer dialogue systems using automatic speech recognition (My Practice session)

    OpenAIRE

    John, Peter; Wellmann, J.; Appell, J.E.

    2016-01-01

    This My Practice session presents a novel online tool for practising verbal communication in a maritime setting. It is based on low-fi ChatBot simulation exercises which employ computer-based dialogue systems. The ChatBot exercises are equipped with an automatic speech recognition engine specifically designed for maritime communication. The speech input and output functionality enables learners to communicate with the computer freely and spontaneously. The exercises replicate real communicati...

  14. Digitized Ethnic Hate Speech: Understanding Effects of Digital Media Hate Speech on Citizen Journalism in Kenya

    OpenAIRE

    Stephen Gichuhi Kimotho; Rahab Njeri Nyaga

    2016-01-01

    Ethnicity in Kenya permeates all spheres of life. However, it is in politics that ethnicity is most visible. Election time in Kenya often leads to ethnic competition and hatred, often expressed through various media. Ethnic hate speech characterized the 2007 general elections in party rallies and through text messages, emails, posters and leaflets. This resulted in widespread skirmishes that left over 1200 people dead, and many displaced (KNHRC, 2008). In 2013, however, the new battle zone wa...

  15. Implementation of a Tour Guide Robot System Using RFID Technology and Viterbi Algorithm-Based HMM for Speech Recognition

    Directory of Open Access Journals (Sweden)

    Neng-Sheng Pai

    2014-01-01

    Full Text Available This paper applied speech recognition and RFID technologies to develop an omni-directional mobile robot into a robot with voice control and guide introduction functions. For speech recognition, the speech signals were captured by short-time processing. The speaker first recorded the isolated words for the robot to create a speech database of specific speakers. After pre-processing of this speech database, the feature parameters of cepstrum and delta-cepstrum were obtained using linear predictive coefficients (LPC). The Hidden Markov Model (HMM) was then used for model training on the speech database, and the Viterbi algorithm was used to find an optimal state sequence as the reference sample for speech recognition. The trained reference models were put into the industrial computer on the robot platform, and the user spoke the isolated words to be tested. After the test utterance was processed in the same way and compared with the reference models, the path with the maximum total probability across the models, found using the Viterbi algorithm, gave the recognition result. Finally, the speech recognition and RFID systems were tested in an actual environment to prove their feasibility and stability, and were implemented in the omni-directional mobile robot.
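
    As an aside, the Viterbi decision rule described above can be sketched compactly. The following is a minimal illustration assuming per-frame emission log-likelihoods have already been computed from the cepstral features; all names (viterbi_log_score, recognize, word_models) are illustrative rather than taken from the paper.

      import numpy as np

      def viterbi_log_score(log_pi, log_A, log_B):
          # log_pi: (N,)   log initial-state probabilities
          # log_A:  (N, N) log transition probabilities
          # log_B:  (T, N) per-frame log emission likelihoods, e.g. of
          #         LPC cepstrum / delta-cepstrum feature vectors
          # Returns the log probability of the single best state path.
          delta = log_pi + log_B[0]
          for frame in log_B[1:]:
              # Extend the best predecessor path into every state j.
              delta = np.max(delta[:, None] + log_A, axis=0) + frame
          return delta.max()

      def recognize(word_models, emission_logliks):
          # Pick the vocabulary word whose HMM yields the maximum total
          # path probability, mirroring the paper's decision rule.
          return max(word_models, key=lambda w: viterbi_log_score(
              word_models[w]["log_pi"], word_models[w]["log_A"],
              emission_logliks[w]))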

  16. Syntactic error modeling and scoring normalization in speech recognition: Error modeling and scoring normalization in the speech recognition task for adult literacy training

    Science.gov (United States)

    Olorenshaw, Lex; Trawick, David

    1991-01-01

    The purpose was to develop a speech recognition system able to detect speech which is pronounced incorrectly, given that the text of the spoken speech is known to the recognizer. Better mechanisms are provided for using speech recognition in a literacy tutor application. Using a combination of scoring normalization techniques and cheater-mode decoding, a reasonable acceptance/rejection threshold was provided. In continuous speech, the system was able to provide above 80 pct. correct acceptance of words, while correctly rejecting over 80 pct. of incorrectly pronounced words.

  17. Enhancement of a radiation safety system through the use of a microprocessor-controlled speech synthesizer

    International Nuclear Information System (INIS)

    Keefe, D.J.; McDowell, W.P.

    1980-01-01

    A speech synthesizer is being used to differentiate eight separate safety alarms on a high energy accelerator at Argonne National Laboratory. A single board microcomputer monitors eight signals from an existing radiation safety logic circuit. The microcomputer is programmed to output the proper code at the proper time and sequence to a speech synthesizer which supplies the audio input to a local public address system. This eliminates the requirement for eight different alarm tones and the personnel training required to differentiate among them. A twenty-word vocabulary was found adequate to supply the necessary safety announcements. The article describes the techniques used to interface the speech synthesizer into the existing safety logic circuit

  18. Unvoiced Speech Recognition Using Tissue-Conductive Acoustic Sensor

    Directory of Open Access Journals (Sweden)

    Heracleous Panikos

    2007-01-01

    Full Text Available We present the use of stethoscope and silicon NAM (nonaudible murmur) microphones in automatic speech recognition. NAM microphones are special acoustic sensors, which are attached behind the talker's ear and can capture not only normal (audible) speech, but also very quietly uttered speech (nonaudible murmur). As a result, NAM microphones can be applied in automatic speech recognition systems when privacy is desired in human-machine communication. Moreover, NAM microphones show robustness against noise and they might be used in special systems (speech recognition, speech transform, etc.) for sound-impaired people. Using adaptation techniques and a small amount of training data, we achieved for a 20 k dictation task a 93.9% word accuracy for nonaudible murmur recognition in a clean environment. In this paper, we also investigate nonaudible murmur recognition in noisy environments and the effect of the Lombard reflex on nonaudible murmur recognition. We also propose three methods to integrate audible speech and nonaudible murmur recognition using a stethoscope NAM microphone with very promising results.

  19. Who Decides What Is Acceptable Speech on Campus? Why Restricting Free Speech Is Not the Answer.

    Science.gov (United States)

    Ceci, Stephen J; Williams, Wendy M

    2018-05-01

    Recent protests on dozens of campuses have led to the cancellation of controversial talks, and violence has accompanied several of these protests. Psychological science provides an important lens through which to view, understand, and potentially reduce these conflicts. In this article, we frame opposing sides' arguments within a long-standing corpus of psychological research on selective perception, confirmation bias, myside bias, illusion of understanding, blind-spot bias, groupthink/in-group bias, motivated skepticism, and naive realism. These concepts inform dueling claims: (a) the protestors' violence was justified by a higher moral responsibility to prevent marginalized groups from being victimized by hate speech, versus (b) the students' right to hear speakers was infringed upon. Psychological science cannot, however, be the sole arbiter of these campus debates; legal and philosophical considerations are also relevant. Thus, we augment psychological science with insights from these literatures to shed light on complexities associated with positions supporting free speech and those protesting hate speech. We conclude with a set of principles, most supported by empirical research, to inform university policies and help ensure vigorous freedom of expression within the context of an inclusive, diverse community.

  20. Speech and swallowing outcomes in buccal mucosa carcinoma

    Directory of Open Access Journals (Sweden)

    Sunila John

    2011-01-01

    Full Text Available Buccal carcinoma is one of the most common malignant neoplasms among all oral cancers in India. Understanding of the role of speech language pathologists (SLPs) in the evaluation and management of this condition is limited, especially in the Indian context. This is a case report of a young adult with recurrent squamous cell carcinoma of the buccal mucosa, with none of the deleterious habits usually associated with buccal mucosa carcinoma. Following composite resection and pectoralis major myocutaneous flap reconstruction, he developed severe oral dysphagia and demonstrated unintelligible speech. This case report focuses on the issues of swallowing and speech deficits in buccal mucosa carcinoma that need to be addressed by SLPs, and on the outcomes of speech and swallowing rehabilitation and prognostic issues.

  1. Understanding Adults' Strong Problem-Solving Skills Based on PIAAC

    Science.gov (United States)

    Hämäläinen, Raija; De Wever, Bram; Nissinen, Kari; Cincinnato, Sebastiano

    2017-01-01

    Purpose: Research has shown that the problem-solving skills of adults with a vocational education and training (VET) background in technology-rich environments (TREs) are often inadequate. However, some adults with a VET background do have sound problem-solving skills. The present study aims to provide insight into the socio-demographic,…

  2. Effects of irrelevant speech and traffic noise on speech perception and cognitive performance in elementary school children.

    Science.gov (United States)

    Klatte, Maria; Meis, Markus; Sukowski, Helga; Schick, August

    2007-01-01

    The effects of background noise of moderate intensity on short-term storage and processing of verbal information were analyzed in 6- to 8-year-old children. In line with adult studies on the "irrelevant sound effect" (ISE), serial recall of visually presented digits was severely disrupted by background speech that the children did not understand. Train noises of equal intensity, however, had no effect. Similar results were demonstrated with tasks requiring storage and processing of heard information. Memory for nonwords, execution of oral instructions and categorizing speech sounds were significantly disrupted by irrelevant speech. The affected functions play a fundamental role in the acquisition of spoken and written language. Implications concerning current models of the ISE and the acoustic conditions in schools and kindergartens are discussed.

  3. Speech networks at rest and in action: interactions between functional brain networks controlling speech production.

    Science.gov (United States)

    Simonyan, Kristina; Fuertinger, Stefan

    2015-04-01

    Speech production is one of the most complex human behaviors. Although brain activation during speaking has been well investigated, our understanding of interactions between the brain regions and neural networks remains limited. We combined seed-based interregional correlation analysis with graph theoretical analysis of functional MRI data during the resting state and sentence production in healthy subjects to investigate the interface and topology of functional networks originating from the key brain regions controlling speech, i.e., the laryngeal/orofacial motor cortex, inferior frontal and superior temporal gyri, supplementary motor area, cingulate cortex, putamen, and thalamus. During both resting and speaking, the interactions between these networks were bilaterally distributed and centered on the sensorimotor brain regions. However, speech production preferentially recruited the inferior parietal lobule (IPL) and cerebellum into the large-scale network, suggesting the importance of these regions in facilitation of the transition from the resting state to speaking. Furthermore, the cerebellum (lobule VI) was the most prominent region showing functional influences on speech-network integration and segregation. Although networks were bilaterally distributed, interregional connectivity during speaking was stronger in the left vs. right hemisphere, which may underlie a more homogeneous overlap between the examined networks in the left hemisphere. Among these, the laryngeal motor cortex (LMC) established a core network that fully overlapped with all other speech-related networks, determining the extent of network interactions. Our data demonstrate complex interactions of large-scale brain networks controlling speech production and point to the critical role of the LMC, IPL, and cerebellum in the formation of the speech production network. Copyright © 2015 the American Physiological Society.
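
    The graph-theoretical step in studies of this kind typically thresholds a region-by-region correlation matrix into a graph and ranks nodes by centrality to identify hubs. A generic sketch of that step follows (not the authors' exact pipeline; the threshold and labels are placeholders):

      import numpy as np
      import networkx as nx

      def functional_network_hubs(corr, labels, threshold=0.3, top_k=5):
          # Build an undirected graph: connect region pairs whose
          # interregional correlation exceeds the chosen threshold.
          G = nx.Graph()
          G.add_nodes_from(labels)
          n = len(labels)
          for i in range(n):
              for j in range(i + 1, n):
                  if corr[i, j] > threshold:
                      G.add_edge(labels[i], labels[j], weight=corr[i, j])
          # Rank regions by degree centrality as candidate hubs.
          centrality = nx.degree_centrality(G)
          return sorted(centrality.items(), key=lambda kv: -kv[1])[:top_k]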

  4. Autonomic nervous system responses during perception of masked speech may reflect constructs other than subjective listening effort

    Directory of Open Access Journals (Sweden)

    Alexander L. Francis

    2016-03-01

    Full Text Available Typically, understanding speech seems effortless and automatic. However, a variety of factors may, independently or interactively, make listening more effortful. Physiological measures may help to distinguish between the application of different cognitive mechanisms whose operation is perceived as effortful. In the present study, physiological and behavioral measures associated with task demand were collected along with behavioral measures of performance while participants listened to and repeated sentences. The goal was to measure psychophysiological reactivity associated with three degraded listening conditions, each of which differed in terms of the source of the difficulty (distortion, energetic masking, and informational masking), and therefore were expected to engage different cognitive mechanisms. These conditions were chosen to be matched for overall performance (keywords correct), and were compared to listening to unmasked speech produced by a natural voice. The three degraded conditions were: (1) unmasked speech produced by a computer speech synthesizer, (2) speech produced by a natural voice and masked by speech-shaped noise, and (3) speech produced by a natural voice and masked by two-talker babble. Masked conditions were both presented at a -8 dB signal-to-noise ratio (SNR), a level shown in previous research to result in comparable levels of performance for these stimuli and maskers. Performance was measured in terms of proportion of key words identified correctly, and task demand or effort was quantified subjectively by self-report. Measures of psychophysiological reactivity included electrodermal (skin conductance) response frequency and amplitude, blood pulse amplitude and pulse rate. Results suggest that the two masked conditions evoked stronger psychophysiological reactivity than did the two unmasked conditions, even when behavioral measures of listening performance and listeners' subjective perception of task demand were comparable.

  5. Speech perception as an active cognitive process

    Directory of Open Access Journals (Sweden)

    Shannon Heald

    2014-03-01

    Full Text Available One view of speech perception is that acoustic signals are transformed into representations for pattern matching to determine linguistic structure. This process can be taken as a statistical pattern-matching problem, assuming relatively stable linguistic categories are characterized by neural representations related to auditory properties of speech that can be compared to speech input. This kind of pattern matching can be termed a passive process, which implies rigidity of processing with few demands on cognitive processing. An alternative view is that speech recognition, even in early stages, is an active process in which speech analysis is attentionally guided. Note that this does not mean consciously guided but that information-contingent changes in early auditory encoding can occur as a function of context and experience. Active processing assumes that attention, plasticity, and listening goals are important in considering how listeners cope with adverse circumstances that impair hearing, such as masking noise in the environment or hearing loss. Although theories of speech perception have begun to incorporate some active processing, they seldom treat early speech encoding as plastic and attentionally guided. Recent research has suggested that speech perception is the product of both feedforward and feedback interactions between a number of brain regions that include descending projections perhaps as far downstream as the cochlea. It is important to understand how the ambiguity of the speech signal and constraints of context dynamically determine cognitive resources recruited during perception, including focused attention, learning, and working memory. Theories of speech perception need to go beyond the current corticocentric approach in order to account for the intrinsic dynamics of the auditory encoding of speech. In doing so, this may provide new insights into ways in which hearing disorders and loss may be treated either through augmentation or

  6. New approach to solve fully fuzzy system of linear equations using ...

    Indian Academy of Sciences (India)

    Known example problems are solved to illustrate the efficacy and ... The concept of fuzzy set and fuzzy number were first introduced by Zadeh .... (iii) Fully fuzzy linear systems can be solved by linear programming approach, Gauss elim-.

  7. Model-based inverse estimation for active contraction stresses of tongue muscles using 3D surface shape in speech production.

    Science.gov (United States)

    Koike, Narihiko; Ii, Satoshi; Yoshinaga, Tsukasa; Nozaki, Kazunori; Wada, Shigeo

    2017-11-07

    This paper presents a novel inverse estimation approach for the active contraction stresses of tongue muscles during speech. The proposed method is based on variational data assimilation using a mechanical tongue model and 3D tongue surface shapes for speech production. The mechanical tongue model considers nonlinear hyperelasticity, finite deformation, actual geometry from computed tomography (CT) images, and anisotropic active contraction by muscle fibers, the orientations of which are ideally determined using anatomical drawings. The tongue deformation is obtained by solving a stationary force-equilibrium equation using a finite element method. An inverse problem is established to find the combination of muscle contraction stresses that minimizes the Euclidean distance of the tongue surfaces between the mechanical analysis and CT results of speech production, where a signed-distance function represents the tongue surface. Our approach is validated through an ideal numerical example and extended to the real-world case of two Japanese vowels, /ʉ/ and /ɯ/. The results capture the target shape completely and provide an excellent estimation of the active contraction stresses in the ideal case, and exhibit similar tendencies as in previous observations and simulations for the actual vowel cases. The present approach can reveal the relative relationship among the muscle contraction stresses in similar utterances with different tongue shapes, and enables the investigation of the coordination of tongue muscles during speech using only the deformed tongue shape obtained from medical images. This will enhance our understanding of speech motor control. Copyright © 2017 Elsevier Ltd. All rights reserved.
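
    The inverse problem described here amounts to searching over muscle contraction stresses until the simulated tongue surface matches the imaged one. A schematic sketch under strong simplifications: forward_model stands in for the finite-element solve of the force-equilibrium equation, surfaces are point clouds rather than signed-distance functions, and all names are placeholders.

      import numpy as np
      from scipy.optimize import minimize

      def surface_distance(sim_pts, target_pts):
          # Mean distance from each simulated surface point to its nearest
          # target point (a crude stand-in for the signed-distance metric).
          d = np.linalg.norm(sim_pts[:, None, :] - target_pts[None, :, :],
                             axis=2)
          return d.min(axis=1).mean()

      def estimate_stresses(forward_model, target_pts, n_muscles):
          # forward_model: (n_muscles,) stresses -> (M, 3) surface points.
          cost = lambda s: surface_distance(forward_model(s), target_pts)
          res = minimize(cost, np.zeros(n_muscles), method="Nelder-Mead")
          return res.x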

  8. A study on nonlinear characteristics of speech sound with reference to some languages of North East region

    Science.gov (United States)

    Dutta, Rashmi

    parametric model, which has been most useful in practical applications. Developing a system that can understand natural language has been a continuing goal of speech researchers. Fully automatic high-quality machine translation systems are extremely difficult to build. The difficulty arises from the following reasons: in any natural language text, only part of the information to be conveyed is explicitly expressed. It is the human mind which fills in and supplements the details using contextual information.

  9. Crosslinguistic Application of English-Centric Rhythm Descriptors in Motor Speech Disorders

    Science.gov (United States)

    Liss, Julie M.; Utianski, Rene; Lansford, Kaitlin

    2014-01-01

    Background: Rhythmic disturbances are a hallmark of motor speech disorders, in which the motor control deficits interfere with the outward flow of speech and, by extension, with speech understanding. As the functions of rhythm are language-specific, breakdowns in rhythm should have language-specific consequences for communication. Objective: The goals of this paper are to (i) provide a review of the cognitive-linguistic role of rhythm in speech perception, in a general sense and crosslinguistically; (ii) present new results on lexical segmentation challenges posed by different types of dysarthria in American English; and (iii) offer a framework for crosslinguistic considerations for speech rhythm disturbances in the diagnosis and treatment of communication disorders associated with motor speech disorders. Summary: This review presents theoretical and empirical reasons for considering speech rhythm as a critical component of communication deficits in motor speech disorders, and addresses the need for crosslinguistic research to explore language-universal versus language-specific aspects of motor speech disorders. PMID:24157596

  10. Apraxia of Speech

    Science.gov (United States)

    Health information page on apraxia of speech (AOS), also known as acquired apraxia of speech.

  11. [Effect of speech estimation on social anxiety].

    Science.gov (United States)

    Shirotsuki, Kentaro; Sasagawa, Satoko; Nomura, Shinobu

    2009-02-01

    This study investigates the effect of speech estimation on social anxiety to further the understanding of this characteristic of Social Anxiety Disorder (SAD). In the first study, we developed the Speech Estimation Scale (SES) to assess negative estimation before giving a speech, which has been reported to be the most feared social situation in SAD. Undergraduate students (n = 306) completed a set of questionnaires, which consisted of the Short Fear of Negative Evaluation Scale (SFNE), the Social Interaction Anxiety Scale (SIAS), the Social Phobia Scale (SPS), and the SES. Exploratory factor analysis showed an adequate one-factor structure with eight items. Further analysis indicated that the SES had good reliability and validity. In the second study, undergraduate students (n = 315) completed the SFNE, SIAS, SPS, SES, and the Self-reported Depression Scale (SDS). The results of path analysis showed that fear of negative evaluation from others (FNE) predicted social anxiety, and that speech estimation mediated the relationship between FNE and social anxiety. These results suggest that speech estimation might maintain SAD symptoms and could be used as a specific target for cognitive intervention in SAD.

  12. CONFLICT RESOLUTION STRATEGIES IN TURKISH AND AMERICAN SPEECH COMMUNITIES: A SCHOOL SETTING

    Directory of Open Access Journals (Sweden)

    Nuray Alagozlu

    2015-07-01

    Full Text Available Conflicts in communication are very common in every culture. However, resolving them varies from one culture to another. Conflict management strategies in communication revolve around five solutions: collaboration, compromise, avoidance, competition, and accommodation, as stated by Kilmann (1977). This study attempts to explore ways of terminating verbal conflicts in academic settings. In the study, we first aim to evaluate the ways of solving conflicts in two settings: a Turkish and an American university. Secondly, taking a pragmatic perspective, we target a classification of the speech acts used to end conflicts, according to both Kilmann's strategies and a facework analysis. Specifically, the study investigates: (i) generally how Turkish and American speakers end conflicts in discourse and which strategies they use in order to resolve conflicts; (ii) how “face” is reflected in those speech acts, as categorized by Ting-Toomey (1988, 1992); (iii) any differences between Turkish and American speakers' styles; and (iv) any changes in conflict resolution due to power status in both cultures. The results are valuable in that they add to the knowledge about intercultural pragmatic language use and cultural cognitions. Moreover, as the research aims to reveal basic verbal and behavioural differences between the two communities, it is likely to contribute to intercultural understanding.

  13. The effect of instantaneous input dynamic range setting on the speech perception of children with the nucleus 24 implant.

    Science.gov (United States)

    Davidson, Lisa S; Skinner, Margaret W; Holstad, Beth A; Fears, Beverly T; Richter, Marie K; Matusofsky, Margaret; Brenner, Christine; Holden, Timothy; Birath, Amy; Kettel, Jerrica L; Scollie, Susan

    2009-06-01

    The purpose of this study was to examine the effects of a wider instantaneous input dynamic range (IIDR) setting on speech perception and comfort in quiet and noise for children wearing the Nucleus 24 implant system and the Freedom speech processor. In addition, children's ability to understand soft and conversational level speech in relation to aided sound-field thresholds was examined. Thirty children (age, 7 to 17 years) with the Nucleus 24 cochlear implant system and the Freedom speech processor with two different IIDR settings (30 versus 40 dB) were tested on the Consonant Nucleus Consonant (CNC) word test at 50 and 60 dB SPL, the Bamford-Kowal-Bench Speech in Noise Test, and a loudness rating task for four-talker speech noise. Aided thresholds for frequency-modulated tones, narrowband noise, and recorded Ling sounds were obtained with the two IIDRs and examined in relation to CNC scores at 50 dB SPL. Speech Intelligibility Indices were calculated using the long-term average speech spectrum of the CNC words at 50 dB SPL measured at each test site and aided thresholds. Group mean CNC scores at 50 dB SPL were significantly higher with the 40 IIDR, whereas scores on the Bamford-Kowal-Bench Speech in Noise Test were not significantly different for the two IIDRs. Significantly improved aided thresholds at 250 to 6000 Hz as well as higher Speech Intelligibility Indices afforded improved audibility for speech presented at soft levels (50 dB SPL). These results indicate that an increased IIDR provides improved word recognition for soft levels of speech without compromising comfort of higher levels of speech sounds or sentence recognition in noise.

  14. Better together: Simultaneous presentation of speech and gesture in math instruction supports generalization and retention.

    Science.gov (United States)

    Congdon, Eliza L; Novack, Miriam A; Brooks, Neon; Hemani-Lopez, Naureen; O'Keefe, Lucy; Goldin-Meadow, Susan

    2017-08-01

    When teachers gesture during instruction, children retain and generalize what they are taught (Goldin-Meadow, 2014). But why does gesture have such a powerful effect on learning? Previous research shows that children learn most from a math lesson when teachers present one problem-solving strategy in speech while simultaneously presenting a different, but complementary, strategy in gesture (Singer & Goldin-Meadow, 2005). One possibility is that gesture is powerful in this context because it presents information simultaneously with speech. Alternatively, gesture may be effective simply because it involves the body, in which case the timing of information presented in speech and gesture may be less important for learning. Here we find evidence for the importance of simultaneity: 3rd-grade children retain and generalize what they learn from a math lesson better when given instruction containing simultaneous speech and gesture than when given instruction containing sequential speech and gesture. Interpreting these results in the context of theories of multimodal learning, we find that gesture capitalizes on its synchrony with speech to promote learning that lasts and can be generalized.

  15. Private speech of learning disabled and normally achieving children in classroom academic and laboratory contexts.

    Science.gov (United States)

    Berk, L E; Landau, S

    1993-04-01

    Learning disabled (LD) children are often targets for cognitive-behavioral interventions designed to train them in effective use of self-directed speech. The purpose of this study was to determine if, indeed, these children display immature private speech in the naturalistic classroom setting. Comparisons were made of the private speech, motor accompaniment to task, and attention of LD and normally achieving classmates during academic seatwork. Setting effects were examined by comparing classroom data with observations during academic seatwork and puzzle solving in the laboratory. Finally, a subgroup of LD children symptomatic of attention-deficit hyperactivity disorder (ADHD) was compared with pure LD and normally achieving controls to determine if the presumed immature private speech is a function of a learning disability or of externalizing behavior problems. Results indicated that LD children used more task-relevant private speech than controls, an effect that was especially pronounced for the LD/ADHD subgroup. Use of private speech was setting- and task-specific. Implications for intervention and future research methodology are discussed.

  16. Improvements in Speech Understanding With Wireless Binaural Broadband Digital Hearing Instruments in Adults With Sensorineural Hearing Loss

    OpenAIRE

    Kreisman, Brian M.; Mazevski, Annette G.; Schum, Donald J.; Sockalingam, Ravichandran

    2010-01-01

    This investigation examined whether speech intelligibility in noise can be improved using a new, binaural broadband hearing instrument system. Participants were 36 adults with symmetrical, sensorineural hearing loss (18 experienced hearing instrument users and 18 without prior experience). Participants were fit binaurally in a planned comparison, randomized crossover design study with binaural broadband hearing instruments and advanced digital hearing instruments. Following an adjustment peri...

  17. Integration of speech and gesture in aphasia.

    Science.gov (United States)

    Cocks, Naomi; Byrne, Suzanne; Pritchard, Madeleine; Morgan, Gary; Dipper, Lucy

    2018-02-07

    Information from speech and gesture is often integrated to comprehend a message. This integration process requires the appropriate allocation of cognitive resources to both the gesture and speech modalities. People with aphasia are likely to find integration of gesture and speech difficult. This is due to a reduction in cognitive resources, a difficulty with resource allocation or a combination of the two. Despite it being likely that people who have aphasia will have difficulty with integration, empirical evidence describing this difficulty is limited. Such a difficulty was found in a single case study by Cocks et al. in 2009, and is replicated here with a greater number of participants. To determine whether individuals with aphasia have difficulties understanding messages in which they have to integrate speech and gesture. Thirty-one participants with aphasia (PWA) and 30 control participants watched videos of an actor communicating a message in three different conditions: verbal only, gesture only, and verbal and gesture message combined. The message related to an action in which the name of the action (e.g., 'eat') was provided verbally and the manner of the action (e.g., hands in a position as though eating a burger) was provided gesturally. Participants then selected a picture that 'best matched' the message conveyed from a choice of four pictures which represented a gesture match only (G match), a verbal match only (V match), an integrated verbal-gesture match (Target) and an unrelated foil (UR). To determine the gain that participants obtained from integrating gesture and speech, a measure of multimodal gain (MMG) was calculated. The PWA were less able to integrate gesture and speech than the control participants and had significantly lower MMG scores. When the PWA had difficulty integrating, they more frequently selected the verbal match. The findings suggest that people with aphasia can have difficulty integrating speech and gesture in order to obtain

  18. Comparative Efficacy of the Picture Exchange Communication System (PECS) versus a Speech-Generating Device: Effects on Requesting Skills

    Science.gov (United States)

    Boesch, Miriam C.; Wendt, Oliver; Subramanian, Anu; Hsu, Ning

    2013-01-01

    An experimental, single-subject research study investigated the comparative efficacy of the Picture Exchange Communication System (PECS) versus a speech-generating device (SGD) in developing requesting skills for three elementary-age children with severe autism and little to no functional speech. Results demonstrated increases in requesting…

  19. Modelling speech intelligibility in adverse conditions

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2013-01-01

    Jørgensen and Dau (J Acoust Soc Am 130:1475-1487, 2011) proposed the speech-based envelope power spectrum model (sEPSM) in an attempt to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII) in conditions with nonlinearly processed speech... subjected to phase jitter, a condition in which the spectral structure of the speech signal is strongly affected, while the broadband temporal envelope is kept largely intact. In contrast, the effects of this distortion can be predicted successfully by the spectro-temporal modulation... suggest that the SNRenv might reflect a powerful decision metric, while some explicit across-frequency analysis seems crucial in some conditions. How such across-frequency analysis is "realized" in the auditory system remains unresolved.

  20. On the Use of Evolutionary Algorithms to Improve the Robustness of Continuous Speech Recognition Systems in Adverse Conditions

    Directory of Open Access Journals (Sweden)

    Sid-Ahmed Selouani

    2003-07-01

    Full Text Available Limiting the decrease in performance due to acoustic environment changes remains a major challenge for continuous speech recognition (CSR) systems. We propose a novel approach which combines the Karhunen-Loève transform (KLT) in the mel-frequency domain with a genetic algorithm (GA) to enhance the data representing corrupted speech. The idea consists of projecting noisy speech parameters onto the space generated by the genetically optimized principal axes issued from the KLT. The enhanced parameters increase the recognition rate for highly interfering noise environments. The proposed hybrid technique, when included in the front-end of an HTK-based CSR system, outperforms the conventional recognition process in severe interfering car noise environments for a wide range of signal-to-noise ratios (SNRs) varying from 16 dB to −4 dB. We also show the effectiveness of the KLT-GA method in recognizing speech subject to telephone channel degradations.
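
    The KLT projection at the heart of this method can be sketched as follows; here the principal axes come straight from an eigendecomposition of the feature covariance, whereas in the paper they are further refined by the genetic algorithm, which is omitted. All names are illustrative.

      import numpy as np

      def klt_basis(features):
          # Karhunen-Loeve transform: eigenvectors of the covariance of
          # the mel-domain features, sorted by decreasing eigenvalue.
          cov = np.cov(features, rowvar=False)
          eigvals, eigvecs = np.linalg.eigh(cov)
          order = np.argsort(eigvals)[::-1]
          return eigvecs[:, order]

      def enhance(noisy_feats, basis, keep):
          # Project noisy features onto the leading principal axes and
          # reconstruct, discarding low-variance (noise-dominated) axes.
          B = basis[:, :keep]
          return (noisy_feats @ B) @ B.T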

  1. An automatic speech recognition system with speaker-independent identification support

    Science.gov (United States)

    Caranica, Alexandru; Burileanu, Corneliu

    2015-02-01

    The novelty of this work lies in the application of an open source research software toolkit (CMU Sphinx) to train, build and evaluate a speech recognition system, with speaker-independent support, for voice-controlled hardware applications. Moreover, we propose to use the trained acoustic model to successfully decode offline voice commands on embedded hardware, such as an ARMv6 low-cost SoC, the Raspberry Pi. This type of single-board computer, mainly used for educational and research activities, can serve as a proof-of-concept software and hardware stack for low-cost voice automation systems.
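
    As an illustration, offline keyword spotting with the CMU Sphinx family on such a board might look like the following, assuming the pocketsphinx Python bindings (the API differs between versions, and the keyphrase and threshold are placeholders rather than values from this work):

      from pocketsphinx import LiveSpeech

      # Keyword-spotting mode: no language model, listen for one phrase.
      speech = LiveSpeech(
          lm=False,
          keyphrase='lights on',   # placeholder voice command
          kws_threshold=1e-20,     # lower values accept looser matches
      )
      for phrase in speech:        # blocks on the default microphone
          print('Command detected:', phrase)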

  2. Speech Alarms Pilot Study

    Science.gov (United States)

    Sandor, A.; Moses, H. R.

    2016-01-01

    Currently on the International Space Station (ISS) and other space vehicles, Caution & Warning (C&W) alerts are represented with various auditory tones that correspond to the type of event. This system relies on the crew's ability to remember what each tone represents in a high stress, high workload environment when responding to the alert. Furthermore, crews receive training a year or more in advance of the mission, which makes remembering the semantic meaning of the alerts more difficult. The current system works for missions conducted close to Earth, where ground operators can assist as needed. On long duration missions, however, crews will need to handle off-nominal events autonomously. There is evidence that speech alarms may be easier and faster to recognize, especially during an off-nominal event. The Information Presentation Directed Research Project (FY07-FY09) funded by the Human Research Program included several studies investigating C&W alerts. The studies evaluated tone alerts currently in use with NASA flight deck displays along with candidate speech alerts. A follow-on study used four types of speech alerts to investigate how quickly various types of auditory alerts with and without a speech component - either at the beginning or at the end of the tone - can be identified. Even though crew were familiar with the tone alert from training or direct mission experience, alerts starting with a speech component were identified faster than alerts starting with a tone. The current study replicated the results from the previous study in a more rigorous experimental design to determine if the candidate speech alarms are ready for transition to operations or if more research is needed. Four types of alarms (caution, warning, fire, and depressurization) were presented to participants in both tone and speech formats in laboratory settings and later in the Human Exploration Research Analog (HERA). In the laboratory study, the alerts were presented by software and participants were

  3. The Functional Connectome of Speech Control.

    Directory of Open Access Journals (Sweden)

    Stefan Fuertinger

    2015-07-01

    Full Text Available In the past few years, several studies have been directed to understanding the complexity of functional interactions between different brain regions during various human behaviors. Among these, neuroimaging research established the notion that speech and language require an orchestration of brain regions for comprehension, planning, and integration of a heard sound with a spoken word. However, these studies have been largely limited to mapping the neural correlates of separate speech elements and examining distinct cortical or subcortical circuits involved in different aspects of speech control. As a result, the complexity of the brain network machinery controlling speech and language remained largely unknown. Using graph theoretical analysis of functional MRI (fMRI) data in healthy subjects, we quantified the large-scale speech network topology by constructing functional brain networks of increasing hierarchy, from the resting state, to motor output of meaningless syllables, to complex production of real-life speech, as well as the non-speech-related sequential finger tapping and pure tone discrimination networks for comparison. We identified a segregated network of highly connected local neural communities (hubs) in the primary sensorimotor and parietal regions, which formed a commonly shared core hub network across the examined conditions, with the left area 4p playing an important role in speech network organization. These sensorimotor core hubs exhibited features of flexible hubs based on their participation in several functional domains across different networks and their ability to adaptively switch long-range functional connectivity depending on task content, resulting in a distinct community structure for each examined network. Specifically, compared to other tasks, speech production was characterized by the formation of six distinct neural communities with specialized recruitment of the prefrontal cortex, insula, putamen, and thalamus, which collectively

  4. AUTOMATIC SPEECH RECOGNITION SYSTEM CONCERNING THE MOROCCAN DIALECTE (Darija and Tamazight)

    OpenAIRE

    A. EL GHAZI; C. DAOUI; N. IDRISSI

    2012-01-01

    In this work we present an automatic speech recognition system for the Moroccan dialects, mainly Darija (an Arabic dialect) and Tamazight. Many approaches have been used to model the Arabic and Tamazight phonetic units. In this paper, we propose to use the hidden Markov model (HMM) for modeling these phonetic units. Experimental results show that the proposed approach further improves recognition.

  5. Common neural substrates support speech and non-speech vocal tract gestures.

    Science.gov (United States)

    Chang, Soo-Eun; Kenney, Mary Kay; Loucks, Torrey M J; Poletto, Christopher J; Ludlow, Christy L

    2009-08-01

    The issue of whether speech is supported by the same neural substrates as non-speech vocal tract gestures has been contentious. In this fMRI study we tested whether producing non-speech vocal tract gestures in humans shares the same functional neuroanatomy as nonsense speech syllables. Production of non-speech vocal tract gestures, devoid of phonological content but similar to speech in that they had familiar acoustic and somatosensory targets, was compared to the production of speech syllables without meaning. Brain activation related to overt production was captured with BOLD fMRI using a sparse sampling design for both conditions. Speech and non-speech were compared using voxel-wise whole brain analyses, and ROI analyses focused on frontal and temporoparietal structures previously reported to support speech production. Results showed substantial activation overlap between speech and non-speech function in the regions examined. Although non-speech gesture production showed greater extent and amplitude of activation in the regions examined, both speech and non-speech showed comparable left laterality in activation for both target perception and production. These findings posit a more general role of the previously proposed "auditory dorsal stream" in the left hemisphere--to support the production of vocal tract gestures that are not limited to speech processing.

  6. Unvoiced Speech Recognition Using Tissue-Conductive Acoustic Sensor

    Directory of Open Access Journals (Sweden)

    Hiroshi Saruwatari

    2007-01-01

    Full Text Available We present the use of stethoscope and silicon NAM (nonaudible murmur) microphones in automatic speech recognition. NAM microphones are special acoustic sensors, which are attached behind the talker's ear and can capture not only normal (audible) speech, but also very quietly uttered speech (nonaudible murmur). As a result, NAM microphones can be applied in automatic speech recognition systems when privacy is desired in human-machine communication. Moreover, NAM microphones show robustness against noise and they might be used in special systems (speech recognition, speech transform, etc.) for sound-impaired people. Using adaptation techniques and a small amount of training data, we achieved for a 20 k dictation task a 93.9% word accuracy for nonaudible murmur recognition in a clean environment. In this paper, we also investigate nonaudible murmur recognition in noisy environments and the effect of the Lombard reflex on nonaudible murmur recognition. We also propose three methods to integrate audible speech and nonaudible murmur recognition using a stethoscope NAM microphone with very promising results.

  7. Speech and language therapists' views about AAC system acceptance by people with acquired communication disorders.

    Science.gov (United States)

    Pampoulou, Eliada

    2018-04-18

    Some adults with acquired communication disorders are faced with an inability to communicate coherently through verbal speech with their communication partners. Despite the fact that a variety of augmentative and alternative communication (AAC) aided systems is available to assist them in communicating, not all adults accept them. In Cyprus, there is scant research focusing on the factors that are linked to AAC system acceptance and abandonment. To address this gap, this research explores the experiences of six speech and language therapists supporting adults with acquired communication disorders who could benefit from the use of AAC systems. The main research question is: what are the factors that influence AAC system acceptance or abandonment? The method used for data collection was semi-structured interviews, and the transcripts were analyzed thematically. The findings show that a number of factors influence the acceptance of AAC systems. These include the time since onset and acceptance of disability, the person's attitude towards communication facilitators, and perceptions about AAC systems. These findings indicate that the process of accepting an AAC system is multi-layered and that these layers are interrelated. More research is warranted focusing directly on the experiences of people with acquired communication disorders and their communication partners. Implications for Rehabilitation The different myths about AAC systems need to be challenged such that awareness about their usefulness is raised. AAC specialists need to find ways to spread the message that AAC systems can actually support language, speech and communication through different dissemination avenues, such as articles in newspapers and talks through the media.

  8. The role of periodicity in perceiving speech in quiet and in background noise.

    Science.gov (United States)

    Steinmetzger, Kurt; Rosen, Stuart

    2015-12-01

    The ability of normal-hearing listeners to perceive sentences in quiet and in background noise was investigated in a variety of conditions mixing the presence and absence of periodicity (i.e., voicing) in both target and masker. Experiment 1 showed that in quiet, aperiodic noise-vocoded speech and speech with a natural amount of periodicity were equally intelligible, while fully periodic speech was much harder to understand. In Experiments 2 and 3, speech reception thresholds for these targets were measured in the presence of four different maskers: speech-shaped noise, harmonic complexes with a dynamically varying F0 contour, and 10 Hz amplitude-modulated versions of both. For experiment 2, results of experiment 1 were used to identify conditions with equal intelligibility in quiet, while in experiment 3 target intelligibility in quiet was near ceiling. In the presence of a masker, periodicity in the target speech mattered little, but listeners strongly benefited from periodicity in the masker. Substantial fluctuating-masker benefits required the target speech to be almost perfectly intelligible in quiet. In summary, results suggest that the ability to exploit periodicity cues may be an even more important factor when attempting to understand speech embedded in noise than the ability to benefit from masker fluctuations.
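
    Aperiodic noise-vocoded speech of the kind used as a target here is commonly generated by splitting the signal into bands and replacing each band's carrier with noise while preserving its temporal envelope. A generic sketch follows (band edges are illustrative, not the study's exact parameters):

      import numpy as np
      from scipy.signal import butter, sosfilt, hilbert

      def noise_vocode(x, fs, edges=(100, 300, 700, 1500, 3000, 6000)):
          out = np.zeros(len(x))
          noise = np.random.randn(len(x))
          for lo, hi in zip(edges[:-1], edges[1:]):
              sos = butter(4, [lo, hi], btype='band', fs=fs, output='sos')
              band = sosfilt(sos, x)
              env = np.abs(hilbert(band))       # band temporal envelope
              carrier = sosfilt(sos, noise)     # band-limited noise carrier
              carrier /= np.sqrt(np.mean(carrier**2)) + 1e-12
              out += env * carrier              # envelope-modulated noise
          return out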

  9. Usability Assessment of Text-to-Speech Synthesis for Additional Detail in an Automated Telephone Banking System

    OpenAIRE

    Morton, Hazel; Gunson, Nancie; Marshall, Diarmid; McInnes, Fergus; Ayres, Andrea; Jack, Mervyn

    2010-01-01

    This paper describes a comprehensive usability evaluation of an automated telephone banking system which employs text-to-speech (TTS) synthesis in offering additional detail on customers' account transactions. The paper describes a series of four experiments in which TTS was employed to offer an extra level of detail on recent transactions listings within an established banking service which otherwise uses recorded speech from a professional recording artist. Results from ...

  10. Reliance on auditory feedback in children with childhood apraxia of speech.

    Science.gov (United States)

    Iuzzini-Seigel, Jenya; Hogan, Tiffany P; Guarino, Anthony J; Green, Jordan R

    2015-01-01

    Children with childhood apraxia of speech (CAS) have been hypothesized to continuously monitor their speech through auditory feedback to minimize speech errors. We used an auditory masking paradigm to determine the effect of attenuating auditory feedback on speech in 30 children: 9 with CAS, 10 with speech delay, and 11 with typical development. The masking only affected the speech of children with CAS as measured by voice onset time and vowel space area. These findings provide preliminary support for greater reliance on auditory feedback among children with CAS. Readers of this article should be able to (i) describe the motivation for investigating the role of auditory feedback in children with CAS; (ii) report the effects of feedback attenuation on speech production in children with CAS, speech delay, and typical development, and (iii) understand how the current findings may support a feedforward program deficit in children with CAS. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
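
    Vowel space area, one of the outcome measures used here, is conventionally computed as the area of the polygon spanned by the mean (F1, F2) values of the corner vowels; the shoelace-formula sketch below shows that standard computation (not necessarily the authors' exact procedure).

      def vowel_space_area(formants):
          # formants: list of (F1, F2) vertices in polygon order, in Hz.
          area = 0.0
          n = len(formants)
          for i in range(n):
              f1a, f2a = formants[i]
              f1b, f2b = formants[(i + 1) % n]
              area += f1a * f2b - f1b * f2a     # shoelace cross terms
          return abs(area) / 2.0                # area in Hz^2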

  11. The integration of marketing problem-solving modes and marketing management support systems

    NARCIS (Netherlands)

    B. Wierenga (Berend); G.H. van Bruggen (Gerrit)

    1997-01-01

    Focuses on the issue of problem solving in marketing and develops a classification of marketing problem-solving modes (MPSMs). Topics: typology of MPSMs; relationships among MPSMs; marketing management support systems.

  12. A Comparison of Two Scoring Methods for an Automated Speech Scoring System

    Science.gov (United States)

    Xi, Xiaoming; Higgins, Derrick; Zechner, Klaus; Williamson, David

    2012-01-01

    This paper compares two alternative scoring methods--multiple regression and classification trees--for an automated speech scoring system used in a practice environment. The two methods were evaluated on two criteria: construct representation and empirical performance in predicting human scores. The empirical performance of the two scoring models…
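
    The two model families can be contrasted on synthetic data as below; the paper fits them to speech features against human scores, while this sketch uses made-up features and a regression tree purely for illustration.

      import numpy as np
      from sklearn.linear_model import LinearRegression
      from sklearn.tree import DecisionTreeRegressor
      from sklearn.model_selection import cross_val_score

      # X: stand-in speech features (e.g., fluency, pronunciation measures);
      # y: stand-in human holistic scores.
      rng = np.random.default_rng(0)
      X = rng.normal(size=(200, 6))
      y = X @ rng.normal(size=6) + rng.normal(scale=0.5, size=200)

      for name, model in [("multiple regression", LinearRegression()),
                          ("decision tree", DecisionTreeRegressor(max_depth=4))]:
          r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
          print(f"{name}: mean cross-validated R^2 = {r2:.2f}")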

  13. System to solve three designs of the fuel management

    International Nuclear Information System (INIS)

    Castillo M, J. A.; Ortiz S, J. J.; Montes T, J. L.; Perusquia del C, R.; Marinez R, R.

    2015-09-01

    In this paper, preliminary results are presented from the development of a computer system that solves three stages of nuclear fuel management: the axial and radial fuel designs, as well as the design of nuclear fuel reloads. The novelty of the system is that the solution is obtained by solving the three stages in coupled form. For this, heuristic techniques are used for each stage; each stage has an objective function applied to its particular problem, but in all cases the partial results obtained are used as input data for the next stage. The heuristic techniques used to solve the coupled problem are: tabu search, neural networks, and a hybrid of scatter search and path relinking. The system applies an iterative process from the design of a fuel cell to the reload design; since these are preliminary results, the reload is designed using a Haling-type operation strategy. In each stage, nuclear parameters inherent to the design are monitored. The results so far show the advantage of solving the problem in a coupled manner, even though a large amount of computing resources is used. (Author)
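
    Of the three heuristics named, tabu search is the simplest to sketch. A generic form follows (not the authors' implementation) that could be run per stage, with each stage's best solution feeding the next as described above; solutions are assumed hashable (e.g., tuples):

      def tabu_search(initial, neighbors, cost, iters=200, tabu_len=15):
          # Greedily move to the best non-tabu neighbor, keeping a short
          # memory of recent solutions to escape local minima.
          current = best = initial
          tabu = [initial]
          for _ in range(iters):
              candidates = [n for n in neighbors(current) if n not in tabu]
              if not candidates:
                  break
              current = min(candidates, key=cost)
              tabu.append(current)
              if len(tabu) > tabu_len:
                  tabu.pop(0)
              if cost(current) < cost(best):
                  best = current
          return best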

  14. A user-operated test of suprathreshold acuity in noise for adult hearing screening: The SUN (Speech Understanding in Noise) test.

    Science.gov (United States)

    Paglialonga, Alessia; Tognola, Gabriella; Grandori, Ferdinando

    2014-09-01

    A novel, user-operated test of suprathreshold acuity in noise for use in adult hearing screening (AHS) was developed. The Speech Understanding in Noise test (SUN) is a speech-in-noise test that makes use of a list of vowel-consonant-vowel (VCV) stimuli in background noise, presented in a three-alternative forced choice (3AFC) paradigm by means of a touch-sensitive screen. The test is automated, easy to use, and provides self-explanatory results (i.e., 'no hearing difficulties', or 'a hearing check would be advisable', or 'a hearing check is recommended'). The test was developed from its building blocks (VCVs and speech-shaped noise) through two main steps: (i) development of the test list through equalization of the intelligibility of test stimuli across the set, and (ii) optimization of the test results through maximization of the test sensitivity and specificity. The test had 82.9% sensitivity and 85.9% specificity compared to conventional pure-tone screening, and 83.8% sensitivity and 83.9% specificity to identify individuals with disabling hearing impairment. Results obtained so far showed that the test could be easily performed by adults and older adults in less than one minute per ear and that its results were not influenced by ambient noise (up to 65 dBA), suggesting that the test might be a viable method for AHS in clinical as well as non-clinical settings. Copyright © 2014 Elsevier Ltd. All rights reserved.
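
    The reported sensitivity and specificity follow the standard screening definitions; a minimal sketch of that computation against a gold standard such as pure-tone screening (names are illustrative):

      def screening_metrics(test_positive, condition_present):
          pairs = list(zip(test_positive, condition_present))
          tp = sum(t and c for t, c in pairs)
          fn = sum(not t and c for t, c in pairs)
          tn = sum(not t and not c for t, c in pairs)
          fp = sum(t and not c for t, c in pairs)
          sensitivity = tp / (tp + fn)   # true positives among impaired
          specificity = tn / (tn + fp)   # true negatives among unimpaired
          return sensitivity, specificity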

  15. Computational speech segregation based on an auditory-inspired modulation analysis

    DEFF Research Database (Denmark)

    May, Tobias; Dau, Torsten

    2014-01-01

    A monaural speech segregation system is presented that estimates the ideal binary mask from noisy speech based on the supervised learning of amplitude modulation spectrogram (AMS) features. Instead of using linearly scaled modulation filters with constant absolute bandwidth, an auditory-inspired … about speech activity present in neighboring time-frequency units. In order to evaluate the generalization performance of the system to unseen acoustic conditions, the speech segregation system is trained with a limited set of low signal-to-noise ratio (SNR) conditions, but tested over a wide range …
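
    As a minimal sketch of the ideal binary mask (IBM) that such a system learns to estimate: given separate access to the premixed speech and noise (which is how the oracle mask is defined), a time-frequency unit is retained when its local SNR exceeds a local criterion. This is only the target definition, not the authors' AMS-based estimator.

```python
import numpy as np
from scipy.signal import stft

def ideal_binary_mask(speech, noise, fs, lc_db=0.0, nperseg=512):
    """Compute the ideal binary mask from separately known speech and
    noise signals of equal length: a time-frequency unit is kept (1.0)
    when its local SNR exceeds the local criterion lc_db."""
    _, _, S = stft(speech, fs=fs, nperseg=nperseg)
    _, _, N = stft(noise, fs=fs, nperseg=nperseg)
    snr_db = 10 * np.log10((np.abs(S) ** 2 + 1e-12) / (np.abs(N) ** 2 + 1e-12))
    return (snr_db > lc_db).astype(float)
```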

  16. Impact of speech-generating devices on the language development of a child with childhood apraxia of speech: a case study.

    Science.gov (United States)

    Lüke, Carina

    2016-01-01

    The purpose of the study was to evaluate the effectiveness of speech-generating devices (SGDs) on the communication and language development of a 2-year-old boy with severe childhood apraxia of speech (CAS). An A-B design was used over a treatment period of 1 year, followed by three additional follow-up measurements, in order to evaluate the implementation of SGDs in the speech therapy of a 2;7-year-old boy with severe CAS. In total, 53 therapy sessions were videotaped and analyzed to better understand his communicative (operationalized as means of communication) and linguistic (operationalized as intelligibility and consistency of speech productions, lexical and grammatical development) development. The trend lines of baseline phase A and intervention phase B were compared and the percentage of non-overlapping data points was calculated to verify the value of the intervention. The use of SGDs led to an immediate increase in the communicative development of the child. An increase in all linguistic variables was observed, with a latency effect of eight to nine treatment sessions. The implementation of SGDs in speech therapy has the potential to be highly effective with regard to both communicative and linguistic competencies in young children with severe CAS. Implications for Rehabilitation: Childhood apraxia of speech (CAS) is a neurological speech sound disorder which results in significant deficits in speech production and leads to a higher risk for language, reading, and spelling difficulties. Speech-generating devices (SGDs), as one method of augmentative and alternative communication (AAC), can effectively enhance the communicative and linguistic development of children with severe CAS.

  17. Auditory Brainstem Response to Complex Sounds Predicts Self-Reported Speech-in-Noise Performance

    Science.gov (United States)

    Anderson, Samira; Parbery-Clark, Alexandra; White-Schwoch, Travis; Kraus, Nina

    2013-01-01

    Purpose: To compare the ability of the auditory brainstem response to complex sounds (cABR) to predict subjective ratings of speech understanding in noise on the Speech, Spatial, and Qualities of Hearing Scale (SSQ; Gatehouse & Noble, 2004) relative to the predictive ability of the Quick Speech-in-Noise test (QuickSIN; Killion, Niquette,…

  18. Modern Tools in Patient-Centred Speech Therapy for Romanian Language

    Directory of Open Access Journals (Sweden)

    Mirela Danubianu

    2016-03-01

    The most common way to communicate with those around us is speech. Suffering from a speech disorder can have negative social effects: from leaving individuals with low confidence and morale to problems with social interaction and the ability to live independently as adults. The speech therapy intervention is a complex process with particular objectives, such as the discovery and identification of the speech disorder and the direction of therapy toward correction, recovery, compensation, adaptation, and social integration of patients. Computer-based speech therapy systems are a real help for therapists by creating a special learning environment. Romanian is a phonetic language with special linguistic particularities. This paper aims to present a few computer-based speech therapy systems developed for the treatment of various speech disorders specific to the Romanian language.

  19. Ultra low bit-rate speech coding

    CERN Document Server

    Ramasubramanian, V

    2015-01-01

    "Ultra Low Bit-Rate Speech Coding" focuses on the specialized topic of speech coding at very low bit-rates of 1 Kbits/sec and less, particularly at the lower ends of this range, down to 100 bps. The authors set forth the fundamental results and trends that form the basis for such ultra low bit-rates to be viable and provide a comprehensive overview of various techniques and systems in literature to date, with particular attention to their work in the paradigm of unit-selection based segment quantization. The book is for research students, academic faculty and researchers, and industry practitioners in the areas of speech processing and speech coding.

  20. A Flexible Question-and-Answer Task for Measuring Speech Understanding

    Directory of Open Access Journals (Sweden)

    Virginia Best

    2016-11-01

    This report introduces a new speech task based on simple questions and answers. The task differs from a traditional sentence recall task in that it involves an element of comprehension and can be implemented in an ongoing fashion. It also contains two target items (the question and the answer) that may be associated with different voices and locations to create dynamic listening scenarios. A set of 227 questions was created, covering six broad categories (days of the week, months of the year, numbers, colors, opposites, and sizes). All questions and their one-word answers were spoken by 11 female and 11 male talkers. In this study, listeners were presented with question-answer pairs and asked to indicate whether the answer was true or false. Responses were given as simple button or key presses, which are quick to make and easy to score. Two preliminary experiments are presented that illustrate different ways of implementing the basic task. In the first experiment, question-answer pairs were presented in speech-shaped noise, and performance was compared across subjects, question categories, and time, to examine the different sources of variability. In the second experiment, sequences of question-answer pairs were presented amidst competing conversations in an ongoing, spatially dynamic listening scenario. Overall, the question-and-answer task appears to be feasible and could be implemented flexibly in a number of different ways.

  1. The pitch hunt: The role of voice pitch in top-down repair of interrupted speech

    NARCIS (Netherlands)

    Clarke, Jeanne Nora

    2017-01-01

    For a normal-hearing person, understanding speech from a single talker is effortless in quiet surroundings. It becomes more challenging when the talker is surrounded by a loud crowd or other kinds of noise. Individuals with hearing impairment might already experience problems in understanding speech

  2. Mock Trial: A Window to Free Speech Rights and Abilities

    Science.gov (United States)

    Schwartz, Sherry

    2010-01-01

    This article provides some strategies to alleviate the current tensions between personal responsibility and freedom of speech rights in the public school classroom. The article advocates the necessity of making sure students understand the points and implications of the First Amendment by providing a mock trial unit concerning free speech rights.…

  3. Bi-Modal Face and Speech Authentication: a BioLogin Demonstration System

    OpenAIRE

    Marcel, Sébastien; Mariéthoz, Johnny; Rodriguez, Yann; Cardinaux, Fabien

    2006-01-01

    This paper presents a bi-modal (face and speech) authentication demonstration system that simulates the login of a user using their face and voice. This demonstration is called BioLogin. It runs on both Linux and Windows, and the Windows version is freely available for download. BioLogin is implemented using an open source machine learning library and its machine vision package.

  4. Speech Production and Speech Discrimination by Hearing-Impaired Children.

    Science.gov (United States)

    Novelli-Olmstead, Tina; Ling, Daniel

    1984-01-01

    Seven hearing-impaired children (five to seven years old) assigned to the Speakers group made highly significant gains in speech production and auditory discrimination of speech, while Listeners made only slight speech production gains and no gains in auditory discrimination. Combined speech and auditory training was more effective than auditory…

  5. The Speech Act Theory between Linguistics and Language Philosophy

    Directory of Open Access Journals (Sweden)

    Liviu-Mihail MARINESCU

    2006-10-01

    Of all the issues in the general theory of language usage, speech act theory has probably aroused the widest interest. Psychologists, for example, have suggested that the acquisition of the concepts underlying speech acts may be a prerequisite for the acquisition of language in general; literary critics have looked to speech act theory for an illumination of textual subtleties or for an understanding of the nature of literary genres; anthropologists have hoped to find in the theory some account of the nature of magical incantations; philosophers have seen potential applications to, amongst other things, the status of ethical statements; while linguists have seen the notions of speech act theory as variously applicable to problems in syntax, semantics, second language learning, and elsewhere.

  6. Stuttering Frequency, Speech Rate, Speech Naturalness, and Speech Effort During the Production of Voluntary Stuttering.

    Science.gov (United States)

    Davidow, Jason H; Grossman, Heather L; Edge, Robin L

    2018-05-01

    Voluntary stuttering techniques involve persons who stutter purposefully interjecting disfluencies into their speech. Little research has been conducted on the impact of these techniques on the speech pattern of persons who stutter. The present study examined whether changes in the frequency of voluntary stuttering accompanied changes in stuttering frequency, articulation rate, speech naturalness, and speech effort. In total, 12 persons who stutter aged 16-34 years participated. Participants read four 300-syllable passages during a control condition, and three voluntary stuttering conditions that involved attempting to produce purposeful, tension-free repetitions of initial sounds or syllables of a word for two or more repetitions (i.e., bouncing). The three voluntary stuttering conditions included bouncing on 5%, 10%, and 15% of syllables read. Friedman tests and follow-up Wilcoxon signed ranks tests were conducted for the statistical analyses. Stuttering frequency, articulation rate, and speech naturalness were significantly different between the voluntary stuttering conditions. Speech effort did not differ between the voluntary stuttering conditions. Stuttering frequency was significantly lower during the three voluntary stuttering conditions compared to the control condition, and speech effort was significantly lower during two of the three voluntary stuttering conditions compared to the control condition. Due to changes in articulation rate across the voluntary stuttering conditions, it is difficult to conclude, as has been suggested previously, that voluntary stuttering is the reason for stuttering reductions found when using voluntary stuttering techniques. Additionally, future investigations should examine different types of voluntary stuttering over an extended period of time to determine their impact on stuttering frequency, speech rate, speech naturalness, and speech effort.

  7. New approach to solve symmetric fully fuzzy linear systems

    Indian Academy of Sciences (India)

    In this paper, we present a method to solve fully fuzzy linear systems with a symmetric coefficient matrix. The symmetric coefficient matrix is decomposed into two systems of equations by using the Cholesky method, and then a solution can be obtained. Numerical examples are given to illustrate our method.
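
    For the crisp core of such a computation, the idea reduces to a standard Cholesky solve; a minimal numpy illustration follows (the paper's handling of the fuzzy spreads is omitted, and the matrix shown is an arbitrary small example):

```python
import numpy as np

# Solve A x = b for a symmetric positive definite A via Cholesky
# (A = L L^T), the same factorization the paper applies to the
# coefficient matrix of a fully fuzzy linear system.
A = np.array([[4.0, 2.0], [2.0, 3.0]])
b = np.array([10.0, 8.0])

L = np.linalg.cholesky(A)       # lower triangular factor
y = np.linalg.solve(L, b)       # forward substitution: L y = b
x = np.linalg.solve(L.T, y)     # back substitution:    L^T x = y
print(x)                        # matches np.linalg.solve(A, b)
```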

  8. Improved Methods for Pitch Synchronous Linear Prediction Analysis of Speech

    OpenAIRE

    劉, 麗清

    2015-01-01

    Linear prediction (LP) analysis has been applied to speech systems over the last few decades. The LP technique is well suited to speech analysis due to its ability to model the speech production process approximately. Hence LP analysis has been widely used for speech enhancement, low-bit-rate speech coding in cellular telephony, speech recognition, and characteristic parameter extraction (vocal tract resonance frequencies, fundamental frequency or pitch), and so on. However, the performance of the co...
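
    As background to the thesis topic, conventional (non-pitch-synchronous) LP coefficients can be obtained with the autocorrelation method; a sketch solving the Yule-Walker equations with scipy's Toeplitz solver, with frame windowing and model order chosen arbitrarily:

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def lp_coefficients(frame, order=12):
    """Autocorrelation-method LP analysis of one speech frame:
    window, compute autocorrelations, and solve the Yule-Walker
    normal equations R a = r for the predictor coefficients."""
    frame = frame * np.hamming(len(frame))
    r = np.correlate(frame, frame, mode='full')[len(frame) - 1:]
    a = solve_toeplitz(r[:order], r[1:order + 1])
    return np.concatenate(([1.0], -a))   # A(z) = 1 - sum_k a_k z^-k
```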

  9. Common neural substrates support speech and non-speech vocal tract gestures

    OpenAIRE

    Chang, Soo-Eun; Kenney, Mary Kay; Loucks, Torrey M.J.; Poletto, Christopher J.; Ludlow, Christy L.

    2009-01-01

    The issue of whether speech is supported by the same neural substrates as non-speech vocal-tract gestures has been contentious. In this fMRI study we tested whether producing non-speech vocal tract gestures in humans shares the same functional neuroanatomy as nonsense speech syllables. Production of non-speech vocal tract gestures, devoid of phonological content but similar to speech in that they had familiar acoustic and somatosensory targets, was compared to the production of speech sylla...

  10. CAR2 - Czech Database of Car Speech

    Directory of Open Access Journals (Sweden)

    P. Sovka

    1999-12-01

    This paper presents a new Czech-language two-channel (stereo) speech database recorded in a car environment. The database was designed for experiments with speech enhancement for communication purposes and for the study and design of robust speech recognition systems. Tools for automated phoneme labelling based on Baum-Welch re-estimation were realised. A noise analysis of the car background environment was done.

  12. Electrophysiological evidence for differences between fusion and combination illusions in audiovisual speech perception.

    Science.gov (United States)

    Baart, Martijn; Lindborg, Alma; Andersen, Tobias S

    2017-11-01

    Incongruent audiovisual speech stimuli can lead to perceptual illusions such as fusions or combinations. Here, we investigated the underlying audiovisual integration process by measuring ERPs. We observed that visual speech-induced suppression of P2 amplitude (which is generally taken as a measure of audiovisual integration) for fusions was similar to the suppression obtained with fully congruent stimuli, whereas P2 suppression for combinations was larger. We argue that these effects arise because the phonetic incongruency is solved differently for the two types of stimuli.

  13. Quadcopter Control Using Speech Recognition

    Science.gov (United States)

    Malik, H.; Darma, S.; Soekirno, S.

    2018-04-01

    This research reports a comparison of the success rates of speech recognition systems using two types of databases, an existing database and a newly created one, implemented on a quadcopter for motion control. The speech recognition system uses the Mel-frequency cepstral coefficient (MFCC) method for feature extraction, trained using the recursive neural network (RNN) method. MFCC is one of the most widely used feature extraction methods for speech recognition, with reported success rates of 80%-95%. The existing database was used to measure the success rate of the RNN method. The new database was created in the Indonesian language, and its success rate was then compared with the results from the existing database. Sound input from the microphone was processed on a DSP module with the MFCC method to obtain feature values. These feature values were then classified by the trained RNN into a command, which became a control input to the single-board computer (SBC) driving the movement of the quadcopter. The SBC ran the Robot Operating System (ROS).
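
    A minimal sketch of the described pipeline, MFCC features fed to a small recurrent classifier, is shown below, assuming librosa and Keras and treating the RNN as a recurrent network. The file path, the five-command label set, and the network size are placeholders; the paper's DSP-module implementation is not reproduced.

```python
import librosa
import numpy as np
import tensorflow as tf

def mfcc_features(path, n_mfcc=13, max_frames=100):
    """Load an utterance and return a fixed-size MFCC sequence."""
    y, sr = librosa.load(path, sr=16000)
    m = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T  # (frames, 13)
    m = m[:max_frames]
    return np.pad(m, ((0, max_frames - len(m)), (0, 0)))

# Tiny recurrent classifier over MFCC sequences; 5 commands assumed.
model = tf.keras.Sequential([
    tf.keras.layers.SimpleRNN(64, input_shape=(100, 13)),
    tf.keras.layers.Dense(5, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
# model.fit(X_train, y_train, ...) with recorded command utterances;
# argmax of model.predict(...) then maps to a quadcopter motion command.
```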

  14. Evaluation of speech reception threshold in noise in young Cochlear™ Nucleus® system 6 implant recipients using two different digital remote microphone technologies and a speech enhancement sound processing algorithm.

    Science.gov (United States)

    Razza, Sergio; Zaccone, Monica; Meli, Aannalisa; Cristofari, Eliana

    2017-12-01

    Children affected by hearing loss can experience difficulties in challenging and noisy environments even when deafness is corrected by cochlear implant (CI) devices. These patients have a selective attention deficit in multiple listening conditions. At present, the most effective ways to improve the performance of speech recognition in noise consist of providing CI processors with noise reduction algorithms and of providing patients with bilateral CIs. The aim of this study was to compare speech performance in noise, across increasing noise levels, in CI recipients using two kinds of wireless remote-microphone radio systems that use digital radio frequency transmission: the Roger Inspiro accessory and the Cochlear Wireless Mini Microphone accessory. Eleven young users of the Cochlear Nucleus CP910 CI were studied. The signal/noise ratio at a speech reception threshold (SRT) value of 50% was measured in different conditions for each patient: with the CI only, with the Roger, or with the Mini Mic accessory. The effect of applying the SNR noise reduction algorithm in each of these conditions was also assessed. The tests were performed with the subject positioned in front of the main speaker at a distance of 2.5 m; another two speakers were positioned at 3.5 m. The main speaker presented disyllabic words at 65 dB. A babble noise signal was delivered through the other speakers with variable intensity. The use of both wireless remote microphones improved the SRT results, and both systems improved speech performance. The gain was higher with the Mini Mic system (SRT = -4.76) than with the Roger system (SRT = -3.01). The addition of the NR algorithm did not further improve the results to a statistically significant degree. There is significant improvement in speech recognition results with both wireless digital remote microphone accessories, in particular with the Mini Mic system when used with the CP910 processor. The use of a remote microphone accessory surpasses the benefit of

  15. ACOUSTIC SPEECH RECOGNITION FOR MARATHI LANGUAGE USING SPHINX

    Directory of Open Access Journals (Sweden)

    Aman Ankit

    2016-09-01

    Speech recognition, or speech-to-text processing, is the process of recognizing human speech by computer and converting it into text. In speech recognition, transcripts are created by taking recordings of speech as audio together with their text transcriptions. Speech-based applications that include Natural Language Processing (NLP) techniques are popular and an active area of research; input to such applications is in natural language, and output is obtained in natural language. Speech recognition mostly revolves around three approaches: the acoustic-phonetic approach, the pattern recognition approach, and the artificial intelligence approach. Creation of an acoustic model requires a large database of speech and training algorithms. The output of an ASR system is the recognition and translation of spoken language into text by computers and computerized devices. ASR today finds enormous application in tasks that require human-machine interfaces, such as voice dialing. Our key contribution in this paper is to create corpora for the Marathi language and explore the use of the Sphinx engine for automatic speech recognition.
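
    Corpus preparation for Sphinx acoustic-model training revolves around two plain-text files: a .fileids list and a .transcription file whose lines wrap each utterance in <s>...</s> followed by the file id in parentheses. A sketch generating both from (id, text) pairs; the Marathi utterance ids and texts shown are placeholders, not the paper's corpus:

```python
# Generate the two corpus files CMU Sphinx training expects:
# a .fileids list, and a .transcription file in which each line is
# "<s> TRANSCRIPT </s> (file_id)". Utterances here are placeholders.
utterances = [
    ("mar_0001", "नमस्कार"),   # hypothetical Marathi utterance
    ("mar_0002", "धन्यवाद"),   # hypothetical Marathi utterance
]

with open("marathi.fileids", "w", encoding="utf-8") as f:
    for file_id, _ in utterances:
        f.write(file_id + "\n")

with open("marathi.transcription", "w", encoding="utf-8") as f:
    for file_id, text in utterances:
        f.write(f"<s> {text} </s> ({file_id})\n")
```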

  16. Introductory speeches

    International Nuclear Information System (INIS)

    2001-01-01

    This CD is a multimedia presentation of the safety upgrading programme of the Bohunice V1 NPP. This chapter consists of an introductory commentary and four introductory speeches (video records): (1) Introductory speech of Vincent Pillar, Board chairman and director general of Slovak Electric, Plc. (SE); (2) Introductory speech of Stefan Schmidt, director of SE - Bohunice Nuclear Power Plants; (3) Introductory speech of Jan Korec, Board chairman and director general of VUJE Trnava, Inc. - Engineering, Design and Research Organisation, Trnava; (4) Introductory speech of Dietrich Kuschel, Senior vice-president of FRAMATOME ANP Project and Engineering

  17. Nonlinear evolution equations and solving algebraic systems: the importance of computer algebra

    International Nuclear Information System (INIS)

    Gerdt, V.P.; Kostov, N.A.

    1989-01-01

    In the present paper we study the application of computer algebra to solving the nonlinear polynomial systems which arise in the investigation of nonlinear evolution equations. We consider several systems obtained in the classification of integrable nonlinear evolution equations with uniform rank. Other polynomial systems are related to finding algebraic curves for finite-gap elliptic potentials of Lame type and generalizations. All systems under consideration are solved using a method based on the construction of the Groebner basis for the corresponding polynomial ideals. The computations have been carried out using computer algebra systems. 20 refs
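
    The computational pattern is illustrated below with sympy on a toy system (not one of the paper's classification systems): a lexicographic Groebner basis triangularizes the ideal so the unknowns can be eliminated one at a time.

```python
from sympy import groebner, symbols

x, y = symbols('x y')
# Toy polynomial system; the lex-order Groebner basis contains a
# polynomial in y alone, so y can be solved first and x back-substituted.
G = groebner([x**2 + y**2 - 1, x - y], x, y, order='lex')
print(G)
# GroebnerBasis([x - y, 2*y**2 - 1], ...) -> y = ±1/sqrt(2), x = y
```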

  18. Subjective Quality Measurement of Speech: Its Evaluation, Estimation and Applications

    CERN Document Server

    Kondo, Kazuhiro

    2012-01-01

    It is becoming crucial to accurately estimate and monitor speech quality in various ambient environments to guarantee high quality speech communication. This practical hands-on book shows speech intelligibility measurement methods so that the readers can start measuring or estimating speech intelligibility of their own system. The book also introduces subjective and objective speech quality measures, and describes in detail speech intelligibility measurement methods. It introduces a diagnostic rhyme test which uses rhyming word-pairs, and includes: An investigation into the effect of word familiarity on speech intelligibility. Speech intelligibility measurement of localized speech in virtual 3-D acoustic space using the rhyme test. Estimation of speech intelligibility using objective measures, including the ITU standard PESQ measures, and automatic speech recognizers.

  19. Phoneme Compression: processing of the speech signal and effects on speech intelligibility in hearing-impaired listeners

    NARCIS (Netherlands)

    A. Goedegebure (Andre)

    2005-01-01

    Hearing-aid users often continue to have problems with poor speech understanding in difficult acoustical conditions. Another commonly reported problem is that certain sounds become too loud whereas other sounds are still not audible. Dynamic range compression is a signal processing

  20. Predicting speech intelligibility in conditions with nonlinearly processed noisy speech

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2013-01-01

    The speech-based envelope power spectrum model (sEPSM; [1]) was proposed in order to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII). The sEPSM applies the signal-to-noise ratio in the envelope domain (SNRenv), which was demonstrated to successfully predict speech intelligibility in conditions with nonlinearly processed noisy speech, such as processing with spectral subtraction. Moreover, a multi-resolution version (mr-sEPSM) was demonstrated to account for speech intelligibility in various conditions with stationary and fluctuating …
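
    The key metric, written here in the usual sEPSM notation (an assumption, since the abstract does not spell it out), is the envelope power of the noisy speech in excess of the noise envelope power, relative to the latter, combined across the model's channels:

```latex
% SNRenv per channel, from the envelope powers of the noisy speech
% (S+N) and of the noise alone (N), and the usual combination across
% channels i (notation assumed, not quoted from the abstract):
\[
\mathrm{SNR}_{\mathrm{env}}
  = \frac{P_{\mathrm{env},\,S+N} - P_{\mathrm{env},\,N}}{P_{\mathrm{env},\,N}},
\qquad
\mathrm{SNR}_{\mathrm{env,total}}
  = \Bigl(\sum_{i} \mathrm{SNR}_{\mathrm{env},\,i}^{2}\Bigr)^{1/2}
\]
```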

  1. Cognitive Spare Capacity and Speech Communication: A Narrative Overview

    Directory of Open Access Journals (Sweden)

    Mary Rudner

    2014-01-01

    Background noise can make speech communication tiring and cognitively taxing, especially for individuals with hearing impairment. It is now well established that better working memory capacity is associated with better ability to understand speech under adverse conditions as well as better ability to benefit from the advanced signal processing in modern hearing aids. Recent work has shown that although such processing cannot overcome hearing handicap, it can increase cognitive spare capacity, that is, the ability to engage in higher level processing of speech. This paper surveys recent work on cognitive spare capacity and suggests new avenues of investigation.

  2. Intra- and intersubject comparison of cochlear implant systems using the Esprit and the Tempo+ behind-the-ear speech processor.

    Science.gov (United States)

    Kompis, Martin; Jenk, Martin; Vischer, Mattheus W; Seifert, Eberhard; Häusler, Rudolf

    2002-12-01

    A patient with bilateral profound deafness was implanted with a Nucleus CI24M cochlear implant (CI) and used an Esprit behind-the-ear (BTE) speech processor. Thirteen months later, the implant had to be removed because of a cholesteatoma. As the same electrode could not be reinserted, a Medel combi40s CI was implanted in the same ear, and the patient used a Tempo+ BTE processor. After 1 year of use of the Combi40s/Tempo+ system, speech recognition was better and was rated better subjectively than with the CI24M/Esprit system. Speech recognition and subjective ratings were also assessed for two matched groups of nine CI users each, using either an Esprit or a Tempo+ processor. On average, speech recognition scores were higher for the group of Tempo+ users, but the difference was not statistically significant. Users of the Esprit processors rated their device higher in terms of cosmetic appearance and comfort of wearing.

  3. Exploring Australian speech-language pathologists' use and perceptions of non-speech oral motor exercises.

    Science.gov (United States)

    Rumbach, Anna F; Rose, Tanya A; Cheah, Mynn

    2018-01-29

    To explore Australian speech-language pathologists' use of non-speech oral motor exercises, and rationales for using/not using non-speech oral motor exercises in clinical practice. A total of 124 speech-language pathologists practising in Australia, working with paediatric and/or adult clients with speech sound difficulties, completed an online survey. The majority of speech-language pathologists reported that they did not use non-speech oral motor exercises when working with paediatric or adult clients with speech sound difficulties. However, more than half of the speech-language pathologists working with adult clients who have dysarthria reported using non-speech oral motor exercises with this population. The most frequently reported rationale for using non-speech oral motor exercises in speech sound difficulty management was to improve awareness/placement of articulators. The majority of speech-language pathologists agreed there is no clear clinical or research evidence base to support non-speech oral motor exercise use with clients who have speech sound difficulties. This study provides an overview of Australian speech-language pathologists' reported use and perceptions of non-speech oral motor exercises' applicability and efficacy in treating paediatric and adult clients who have speech sound difficulties. The research findings provide speech-language pathologists with insight into how and why non-speech oral motor exercises are currently used, and adds to the knowledge base regarding Australian speech-language pathology practice of non-speech oral motor exercises in the treatment of speech sound difficulties. Implications for Rehabilitation Non-speech oral motor exercises refer to oral motor activities which do not involve speech, but involve the manipulation or stimulation of oral structures including the lips, tongue, jaw, and soft palate. Non-speech oral motor exercises are intended to improve the function (e.g., movement, strength) of oral structures. The

  4. A Novel Real-Time Speech Summarizer System for the Learning of Sustainability

    Directory of Open Access Journals (Sweden)

    Hsiu-Wen Wang

    2015-04-01

    As the number of speech and video documents on the Internet increases and portable devices proliferate, speech summarization becomes increasingly essential. Relevant research in this domain has typically focused on broadcasts and news; however, the automatic summarization methods used in the past may not apply to other speech domains (e.g., speech in lectures). Therefore, this study explores the lecture speech domain. The features used in previous research were analyzed and suitable features were selected following experimentation; subsequently, a three-phase real-time speech summarizer for the learning of sustainability (RTSSLS) was proposed. Phase One involved selecting independent features (e.g., centrality, resemblance to the title, sentence length, term frequency, and thematic words) and calculating the independent feature scores; Phase Two involved calculating dependent features, such as position, in comparison with the independent feature scores; and Phase Three involved comparing these feature scores to obtain weighted averages of the function scores, determining the highest-scoring sentences, and providing a summary. In practical tests, the macro-average and micro-average accuracies of the RTSSLS were 70% and 73%, respectively. Therefore, an RTSSLS can enable users to acquire key speech information for the learning of sustainability.
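
    The three-phase scoring reduces to: compute per-sentence feature scores, combine them as a weighted average, and keep the top-scoring sentences. A greatly simplified sketch with two stand-in features (the RTSSLS feature set and weights are richer than this):

```python
import numpy as np

def summarize(sentences, feature_fns, weights, k=3):
    """Score each sentence as a weighted average of feature scores and
    return the k highest-scoring sentences in document order."""
    scores = np.array([[fn(s, sentences) for fn in feature_fns]
                       for s in sentences])
    total = scores @ np.asarray(weights)
    top = sorted(np.argsort(total)[-k:])     # restore document order
    return [sentences[i] for i in top]

# Stand-in features: relative sentence length and term frequency.
def length_score(s, doc):
    return len(s.split()) / max(len(x.split()) for x in doc)

def tf_score(s, doc):
    text = " ".join(doc).lower().split()
    return sum(text.count(w.lower()) for w in s.split()) / len(text)

# Usage: summarize(list_of_sentences, [length_score, tf_score], [0.4, 0.6])
```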

  5. Ordinal models of audiovisual speech perception

    DEFF Research Database (Denmark)

    Andersen, Tobias

    2011-01-01

    Audiovisual information is integrated in speech perception. One manifestation of this is the McGurk illusion in which watching the articulating face alters the auditory phonetic percept. Understanding this phenomenon fully requires a computational model with predictive power. Here, we describe...

  6. Understanding the determinants of problem-solving behavior in a complex environment

    Science.gov (United States)

    Casner, Stephen A.

    1994-01-01

    It is often argued that problem-solving behavior in a complex environment is determined as much by the features of the environment as by the goals of the problem solver. This article explores a technique to determine the extent to which measured features of a complex environment influence problem-solving behavior observed within that environment. In this study, the technique is used to determine how complex flight deck and air traffic control environment influences the strategies used by airline pilots when controlling the flight path of a modern jetliner. Data collected aboard 16 commercial flights are used to measure selected features of the task environment. A record of the pilots' problem-solving behavior is analyzed to determine to what extent behavior is adapted to the environmental features that were measured. The results suggest that the measured features of the environment account for as much as half of the variability in the pilots' problem-solving behavior and provide estimates on the probable effects of each environmental feature.

  7. Effects of social cognitive impairment on speech disorder in schizophrenia.

    Science.gov (United States)

    Docherty, Nancy M; McCleery, Amanda; Divilbiss, Marielle; Schumann, Emily B; Moe, Aubrey; Shakeel, Mohammed K

    2013-05-01

    Disordered speech in schizophrenia impairs social functioning because it impedes communication with others. Treatment approaches targeting this symptom have been limited by an incomplete understanding of its causes. This study examined the process underpinnings of speech disorder, assessed in terms of communication failure. Contributions of impairments in 2 social cognitive abilities, emotion perception and theory of mind (ToM), to speech disorder were assessed in 63 patients with schizophrenia or schizoaffective disorder and 21 nonpsychiatric participants, after controlling for the effects of verbal intelligence and impairments in basic language-related neurocognitive abilities. After removal of the effects of the neurocognitive variables, impairments in emotion perception and ToM each explained additional variance in speech disorder in the patients but not the controls. The neurocognitive and social cognitive variables, taken together, explained 51% of the variance in speech disorder in the patients. Schizophrenic disordered speech may be less a concomitant of "positive" psychotic process than of illness-related limitations in neurocognitive and social cognitive functioning.

  8. General-Purpose Monitoring during Speech Production

    Science.gov (United States)

    Ries, Stephanie; Janssen, Niels; Dufau, Stephane; Alario, F.-Xavier; Burle, Boris

    2011-01-01

    The concept of "monitoring" refers to our ability to control our actions on-line. Monitoring involved in speech production is often described in psycholinguistic models as an inherent part of the language system. We probed the specificity of speech monitoring in two psycholinguistic experiments where electroencephalographic activities were…

  9. Solving or resolving inadequate and noisy tomographic systems

    NARCIS (Netherlands)

    Nolet, G.

    1985-01-01

    Tomography in seismology often leads to underdetermined and inconsistent systems of linear equations. When solving, care must be taken to keep the propagation of data errors under control. In this paper I test the applicability of three types of damped least-squares algorithms to the kind of
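
    One widely available algorithm of this family is LSQR with damping, where the damp parameter trades data fit against solution norm to keep the propagation of data errors under control; a sketch on a random underdetermined system (not the paper's seismic data):

```python
import numpy as np
from scipy.sparse.linalg import lsqr

# Underdetermined, noisy tomographic-style system G m = d.
rng = np.random.default_rng(0)
G = rng.normal(size=(50, 80))                     # more unknowns than data
d = G @ rng.normal(size=80) + 0.01 * rng.normal(size=50)

# Damped least squares: minimize ||G m - d||^2 + damp^2 * ||m||^2.
m = lsqr(G, d, damp=0.1)[0]
print(np.linalg.norm(m))    # larger damp -> smaller, more stable model
```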

  10. Emotion, affect and personality in speech the bias of language and paralanguage

    CERN Document Server

    Johar, Swati

    2016-01-01

    This book explores the various categories of speech variation and works to draw a line between linguistic and paralinguistic phenomena of speech. Paralinguistic contrast is crucial to human speech but has proven to be one of the most difficult tasks in speech systems. In the quest for solutions in speech technology and the speech sciences, this book narrows the gap between speech technologists and phoneticians and emphasizes the efforts required to accomplish the goal of paralinguistic control in speech technology applications and the acute need for a multidisciplinary categorization system. This interdisciplinary work on paralanguage will serve not only as a source of information but also as a theoretical model for linguists, sociologists, psychologists, phoneticians and speech researchers.

  11. Spatial problem-solving strategies of middle school students: Wayfinding with geographic information systems

    Science.gov (United States)

    Wigglesworth, John C.

    2000-06-01

    Geographic Information Systems (GIS) is a powerful computer software package that emphasizes the use of maps and the management of spatially referenced environmental data archived in a system's database. Professional applications of GIS have been in place since the 1980s, but only recently has GIS gained significant attention in the K-12 classroom. Students using GIS are able to manipulate and query data in order to solve all manner of spatial problems. Very few studies have examined how this technological innovation can support classroom learning. In particular, there has been little research on how experience in using the software correlates with a child's spatial cognition and his/her ability to understand spatial relationships. This study investigates the strategies used by middle school students to solve a wayfinding (route-finding) problem using the ArcView GIS software. The research design combined an individual background questionnaire, results from the Group Assessment of Logical Thinking (GALT) test, and analysis of reflective think-aloud sessions to define the characteristics of the strategies students used to solve this particular class of spatial problem. Three uniquely different spatial problem-solving strategies were identified. Visual/Concrete Wayfinders used a highly visual strategy; Logical/Abstract Wayfinders used GIS software tools to apply a more analytical and systematic approach; Transitional Wayfinders used an approach that showed evidence of shifting from a visual strategy to one that is more analytical. The triangulation of data sources indicates that this progression of wayfinding strategy can be correlated both to Piagetian stages of logical thought and to experience with the use of maps. These findings suggest that GIS teachers must be aware that their students' performance will lie on a continuum based on cognitive development, spatial ability, and prior experience with maps. To be most effective, GIS teaching

  12. Chosen interval methods for solving linear interval systems with special type of matrix

    Science.gov (United States)

    Szyszka, Barbara

    2013-10-01

    The paper is devoted to chosen direct interval methods for solving linear interval systems with a special type of matrix: a band matrix with a parameter, obtained from a finite difference problem. Such linear systems occur while solving the one-dimensional wave equation (a partial differential equation of hyperbolic type) by the second-order central difference interval method. Interval methods are constructed so that the errors of the method are enclosed in the obtained results; the presented linear interval systems therefore contain elements that determine the errors of the difference method. The chosen direct algorithms have been applied to solving the linear systems because they have no errors of method. All calculations were performed in floating-point interval arithmetic.
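
    To make the flavour of direct interval methods concrete, a hand-rolled sketch of interval Gaussian elimination on a 2x2 system follows (the paper's band systems are larger, and production code would use a library with directed rounding rather than these plain floats):

```python
from itertools import product

def imul(x, y):
    """Interval multiplication: all endpoint products bracket the result."""
    ps = [a * b for a, b in product(x, y)]
    return (min(ps), max(ps))

def isub(x, y):
    return (x[0] - y[1], x[1] - y[0])

def idiv(x, y):
    assert y[0] > 0 or y[1] < 0, "divisor interval must not contain 0"
    return imul(x, (1.0 / y[1], 1.0 / y[0]))

def solve2x2(A, b):
    """Interval Gaussian elimination on a 2x2 interval system; the
    enclosures widen at each step (the dependency effect)."""
    m = idiv(A[1][0], A[0][0])
    a22 = isub(A[1][1], imul(m, A[0][1]))
    b2 = isub(b[1], imul(m, b[0]))
    x2 = idiv(b2, a22)
    x1 = idiv(isub(b[0], imul(A[0][1], x2)), A[0][0])
    return x1, x2

# Intervals given as (lo, hi); a tiny stand-in for a band system.
A = [[(2.0, 2.1), (0.9, 1.0)], [(0.9, 1.0), (2.0, 2.1)]]
b = [(3.0, 3.1), (3.0, 3.1)]
print(solve2x2(A, b))   # each component encloses the exact solution
```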

  13. HMM adaptation for child speech synthesis using ASR data

    CSIR Research Space (South Africa)

    Govender, N

    2015-11-01

    This paper reports on a feasibility study that was conducted to determine whether it is possible to synthesize good quality child voices using child speech data that was recorded for automatic speech recognition (ASR) purposes. A text-to-speech system...

  14. Multiple Transcoding Impact on Speech Quality in Ideal Network Conditions

    Directory of Open Access Journals (Sweden)

    Martin Mikulec

    2015-01-01

    This paper deals with the impact of transcoding on speech quality. We have focused mainly on transcoding between codecs without the negative influence of network parameters such as packet loss and delay, which has ensured objective and repeatable results from our measurements. The measurement was performed on a Transcoding Measuring System developed especially for this purpose. The system is based on open source projects and is useful as a design tool for VoIP system administrators. The paper compares the most used codecs from the transcoding perspective. Multiple transcodings between the G711, GSM, and G729 codecs were performed and the speech quality of these calls was evaluated. Speech quality was measured by the Perceptual Evaluation of Speech Quality (PESQ) method, which reports results as a Mean Opinion Score (MOS) describing speech quality on a scale from 1 to 5. The obtained results indicate periodic speech quality degradation with every transcoding between two codecs.
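
    Assuming the third-party Python `pesq` package (an implementation of ITU-T P.862, not the authors' measuring system), a single reference/degraded comparison of the kind the system automates looks like this; the file names are hypothetical:

```python
import numpy as np
from scipy.io import wavfile
from pesq import pesq  # third-party package implementing ITU-T P.862

# Compare a reference recording against its transcoded version.
# PESQ returns a MOS-like score, roughly on a 1 to 4.5 scale.
fs, ref = wavfile.read("reference_8k.wav")            # hypothetical file
_, deg = wavfile.read("after_g711_gsm_g729.wav")      # hypothetical file
print(pesq(fs, ref.astype(np.float64), deg.astype(np.float64), 'nb'))
```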

  15. Objective support for subjective reports of successful inner speech in two people with aphasia.

    Science.gov (United States)

    Hayward, William; Snider, Sarah F; Luta, George; Friedman, Rhonda B; Turkeltaub, Peter E

    2016-01-01

    People with aphasia frequently report being able to say a word correctly in their heads, even if they are unable to say that word aloud. It is difficult to know what is meant by these reports of "successful inner speech". We probe the experience of successful inner speech in two people with aphasia. We show that these reports are associated with correct overt speech and phonologically related nonword errors, that they relate to word characteristics associated with ease of lexical access but not ease of production, and that they predict whether or not individual words are relearned during anomia treatment. These findings suggest that reports of successful inner speech are meaningful and may be useful to study self-monitoring in aphasia, to better understand anomia, and to predict treatment outcomes. Ultimately, the study of inner speech in people with aphasia could provide critical insights that inform our understanding of normal language.

  16. The analysis of speech acts patterns in two Egyptian inaugural speeches

    Directory of Open Access Journals (Sweden)

    Imad Hayif Sameer

    2017-09-01

    The theory of speech acts, which clarifies what people do when they speak, is not about individual words or sentences that form the basic elements of human communication, but rather about particular speech acts that are performed when uttering words. A speech act is the attempt at doing something purely by speaking; many things can be done by speaking. Speech acts are studied under what is called speech act theory, and belong to the domain of pragmatics. In this paper, two Egyptian inaugural speeches from El-Sadat and El-Sisi, belonging to different periods, were analyzed to find out whether there were differences within this genre in the same culture or not. The study showed that there was a very small difference between these two speeches, which were analyzed according to Searle's theory of speech acts. In El-Sadat's speech, commissives occupied the first place; in El-Sisi's speech, assertives occupied the first place. Within the speeches of one culture, we can find that the differences depended on the circumstances that surrounded the elections of the presidents at the time. Speech acts were tools they used to convey what they wanted and to obtain support from their audiences.

  17. Speech Problems

    Science.gov (United States)

    KidsHealth article for teens on speech problems: conditions that affect a person's ability to speak clearly, with an overview of common speech and language disorders such as stuttering.

  18. A matrix formalism to solve interface condition equations in a reactor system

    Energy Technology Data Exchange (ETDEWEB)

    Matausek, M V [Boris Kidric Institute of Nuclear Sciences Vinca, Beograd (Yugoslavia)

    1970-05-15

    When a nuclear reactor or a reactor lattice cell is treated by an approximate procedure to solve the neutron transport equation, the last computational step is often a problem of solving systems of algebraic equations stating the interface and boundary conditions for the neutron flux moments. These systems usually have coefficient matrices of block-bidiagonal type, and thus contain a large number of zero elements. In the present report it is shown how such a system can be solved efficiently by accounting for all the zero elements, both in the coefficient matrix and in the free-term vector. The procedure is presented here for the case of a multigroup P₃ calculation of the neutron flux distribution in a cylindrical reactor lattice cell. Compared with the standard Gaussian elimination method, this procedure is more advantageous both with respect to the number of operations needed to solve a given problem and with respect to computer memory storage requirements. A similar formalism can also be applied to other approximate methods, for instance the multigroup diffusion treatment of a multizone reactor. (author)
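
    Exploiting the zero structure is exactly what banded solvers do; as a loose analogue of the report's hand-tailored elimination (not its exact formalism), scipy's solve_banded handles a small tridiagonal stand-in at a cost that scales with the bandwidth rather than with n³:

```python
import numpy as np
from scipy.linalg import solve_banded

# A small banded system standing in for the block-bidiagonal
# interface-condition matrices: one sub- and one super-diagonal.
# `ab` stores the diagonals row-wise, as solve_banded expects.
ab = np.array([
    [0.0, 1.0, 1.0, 1.0],   # super-diagonal (first entry unused)
    [4.0, 4.0, 4.0, 4.0],   # main diagonal
    [1.0, 1.0, 1.0, 0.0],   # sub-diagonal (last entry unused)
])
b = np.array([5.0, 6.0, 6.0, 5.0])
x = solve_banded((1, 1), ab, b)
print(x)  # [1. 1. 1. 1.]; zero elements are never stored or touched
```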

  19. Curriculum providing cognitive knowledge and problem-solving skills for anesthesia systems-based practice.

    Science.gov (United States)

    Wachtel, Ruth E; Dexter, Franklin

    2010-12-01

    Residency programs accredited by the ACGME are required to teach core competencies, including systems-based practice (SBP). Projects are important for satisfying this competency, but the level of knowledge and problem-solving skills required presupposes a basic understanding of the field. The responsibilities of anesthesiologists include the coordination of patient flow in the surgical suite, and familiarity with this topic is crucial for many improvement projects. A course in operations research for surgical services was originally developed for hospital administration students. It satisfies two of the Institute of Medicine's core competencies for health professionals: evidence-based practice and work in interdisciplinary teams. The course lasts 3.5 days (e.g., 2 weekends) and consists of 45 cognitive objectives taught using 7 published articles, 10 lectures, and 156 computer-assisted problem-solving exercises based on 17 case studies. We tested the hypothesis that the cognitive objectives of the curriculum provide the knowledge and problem-solving skills necessary to perform projects that satisfy the SBP competency. Standardized terminology was used to define each component of the SBP competency at the minimum level of knowledge needed. The 8 components of the competency were examined independently. Most cognitive objectives contributed to at least 4 of the 8 core components of the SBP competency. Each component of SBP is addressed at the minimum requirement level of "exemplify" by at least 6 objectives, and there is at least 1 cognitive objective at the level of "summarize" for each SBP component. A curriculum in operating room management can provide the knowledge and problem-solving skills anesthesiologists need to participate in projects that satisfy the SBP competency.

  20. Application of Business Process Management to drive the deployment of a speech recognition system in a healthcare organization.

    Science.gov (United States)

    González Sánchez, María José; Framiñán Torres, José Manuel; Parra Calderón, Carlos Luis; Del Río Ortega, Juan Antonio; Vigil Martín, Eduardo; Nieto Cervera, Jaime

    2008-01-01

    We present a methodology based on Business Process Management to guide the development of a speech recognition system in a hospital in Spain. The methodology eases the deployment of the system by 1) involving the clinical staff in the process, 2) providing the IT professionals with a description of the process and its requirements, 3) assessing the advantages and disadvantages of the speech recognition system, as well as its impact on the organisation, and 4) helping to reorganise the healthcare process before implementing the new technology, in order to identify how it can best contribute to the overall objective of the organisation.

  1. Derivative free Davidon-Fletcher-Powell (DFP) for solving symmetric systems of nonlinear equations

    Science.gov (United States)

    Mamat, M.; Dauda, M. K.; Mohamed, M. A. bin; Waziri, M. Y.; Mohamad, F. S.; Abdullah, H.

    2018-03-01

    Research in engineering, economics, modelling, industry, computing, and science mostly leads to equations that are nonlinear in nature, and numerical solutions to such systems are widely applied in those areas of mathematics. Over the years there has been significant theoretical study of methods for solving such systems; despite these efforts, the methods developed still have deficiencies. As a contribution to solving systems of the form $F(x) = 0$, $x \in \mathbb{R}^n$, a derivative-free method via the classical Davidon-Fletcher-Powell (DFP) update is presented. This is achieved by simply approximating the inverse Hessian matrix $Q_{k+1}^{-1}$ by $\theta_k I$. The modified method satisfies the descent condition and possesses local superlinear convergence properties. Interestingly, without computing any derivative, the proposed method never failed to converge throughout the numerical experiments. The output is based on the number of iterations and CPU time for 40 benchmark test problems solved from different initial starting points. With the aid of the squared-norm merit function and a derivative-free line search technique, the approach yields a method for solving symmetric systems of nonlinear equations that is capable of significantly reducing the CPU time and number of iterations compared to its counterparts. A comparison between the proposed method and the classical DFP update shows that the proposed method is the top performer, outperforming the existing method in almost all cases. In terms of number of iterations, out of the 40 problems, the proposed method solved 38 successfully (95%) while the classical DFP solved 2 (5%). In terms of CPU time, the proposed method solved 29 of the 40 problems successfully (72.5%), whereas the classical DFP solved 11 (27.5%). The method is valid in terms of derivation, reliable in terms of number of iterations, and accurate in terms of CPU time; it is thus suitable and achieves the objective.
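
    A sketch of the core iteration as described: the inverse Hessian is replaced by $\theta_k I$, so the search direction is $d_k = -\theta_k F(x_k)$, globalized with a derivative-free backtracking line search on the squared-norm merit function. The specific $\theta$ update and line-search constants below are assumptions for illustration, not the paper's exact choices:

```python
import numpy as np

def df_solve(F, x0, tol=1e-8, max_iter=500):
    """Derivative-free iteration in the spirit of the paper: direction
    d = -theta * F(x) (inverse Hessian ~ theta * I), with backtracking
    on the merit f(x) = ||F(x)||^2 to enforce sufficient decrease."""
    x = np.asarray(x0, dtype=float)
    theta = 1.0
    Fx = F(x)
    for _ in range(max_iter):
        if np.linalg.norm(Fx) < tol:
            break
        d = -theta * Fx
        alpha = 1.0
        while (np.linalg.norm(F(x + alpha * d)) ** 2
               > (1 - 1e-4 * alpha) * np.linalg.norm(Fx) ** 2
               and alpha > 1e-12):
            alpha *= 0.5                       # derivative-free backtracking
        s = alpha * d
        Fnew = F(x + s)
        y = Fnew - Fx
        sy = s @ y
        theta = abs(s @ s / sy) if sy != 0 else 1.0   # spectral-style update
        x, Fx = x + s, Fnew
    return x

# Toy system with symmetric Jacobian A + diag(cos x); root at x = 0.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
F = lambda x: A @ x + np.sin(x)
print(df_solve(F, np.array([1.0, 1.0])))   # approximately [0, 0]
```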

  2. On the Use of the Humanoid Bioloid System for Robot-Assisted Transcription of Mexican Spanish Speech

    Directory of Open Access Journals (Sweden)

    Santiago-Omar Caballero-Morales

    2015-12-01

    Within the context of service robotics (SR), the development of assistive technologies has become an important research field. However, the accomplishment of assistive tasks requires precise and fine control of the mechanical systems that make up the robotic entity. Among the most challenging tasks in robot control, the handwriting (transcription) task is of particular interest due to the fine control required to draw single and multiple alphabet characters to express words and sentences. For language-learning activities, robot-assisted speech transcription can motivate the student to practice pronunciation and writing tasks in a dynamic environment. Hence, this paper aims to provide the techniques and models to accomplish accurate robot-assisted transcription of Spanish speech. The transcriptor integrates a multi-user speech recognizer for continuous speech and the kinematic models for the Mexican Spanish alphabet characters. The Bioloid system with the standard humanoid configuration and no special modifications or tools was considered for implementation. In particular, the proposed transcriptor could perform the handwriting task with the Bioloid's two 2-DOF (degrees-of-freedom) arms, enabling the writing of one-line short and long sentences with small alphabet characters (width < 1.0 cm). It is expected that the techniques and models that make up the transcriptor can support the development of robot-assisted language-learning activities for children and young adults.
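
    The kinematic core of such a transcriptor is closed-form inverse kinematics for each planar 2-DOF arm; a sketch with hypothetical link lengths (the paper's character models add stroke sequencing and the Bioloid servo interface on top):

```python
import numpy as np

def two_link_ik(x, y, l1=0.07, l2=0.07):
    """Closed-form inverse kinematics for a planar 2-DOF arm
    (elbow-down solution): joint angles placing the pen tip at (x, y).
    Link lengths are hypothetical, in metres."""
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if abs(c2) > 1:
        raise ValueError("target outside workspace")
    q2 = np.arccos(c2)
    q1 = np.arctan2(y, x) - np.arctan2(l2 * np.sin(q2), l1 + l2 * np.cos(q2))
    return q1, q2

# Trace a 1 cm vertical character stroke as a joint-angle trajectory.
for t in np.linspace(0.0, 0.01, 5):
    print(two_link_ik(0.10, 0.02 + t))
```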

  3. What speech-language pathologists and students of speech-pathology understand as fluency and disfluency

    Directory of Open Access Journals (Sweden)

    Ana Maria do Carmo Carvalho de Oliveira

    2007-03-01

    PURPOSE: to assess the understanding of third- and fourth-year students and of speech-language pathology professionals regarding the concepts of fluency and disfluency, the components and influencing factors of fluency, and the types of disfluencies. METHODS: 107 questionnaires were administered to a sample of 57 professionals and 50 students. A qualitative-quantitative analysis was carried out for the open questions and a quantitative analysis for the closed questions. RESULTS: the descriptive analysis identified more than 20 factors for the open questions (concepts of fluency and disfluency, and components of fluency), but no single factor was cited by a majority of the subjects. The most frequently listed component of fluency relates to speech rate. Psychological factors such as anxiety and introversion-extroversion are among the factors most often cited as influencing the degree of fluency. The disfluency types most often categorized as stuttered were blocks, initial prolongations, and defensive behaviors. There was no statistically significant difference between professionals and students in their response profiles; more years of practice modified some answers. CONCLUSION: the participants (1) presented an idealized concept of fluency ("speech free of breaks"), (2) considered disfluency a sign of disorder rather than an intrinsic phenomenon of speech, (3) considered speech rate, and not disfluencies, the component that most affects the degree of fluency, (4) considered affective factors, mainly anxiety, as those that most influence the degree of fluency, attributing a secondary influence to linguistic, cognitive, and genetic factors, and (5) classified the types of disfluencies in line with the literature.

  4. A Danish open-set speech corpus for competing-speech studies

    DEFF Research Database (Denmark)

    Nielsen, Jens Bo; Dau, Torsten; Neher, Tobias

    2014-01-01

    Studies investigating speech-on-speech masking effects commonly use closed-set speech materials such as the coordinate response measure [Bolia et al. (2000). J. Acoust. Soc. Am. 107, 1065-1066]. However, these studies typically result in very low (i.e., negative) speech recognition thresholds (SRTs) when the competing speech signals are spatially separated. To achieve higher SRTs that correspond more closely to natural communication situations, an open-set, low-context, multi-talker speech corpus was developed. Three sets of 268 unique Danish sentences were created, and each set was recorded with one of three professional female talkers. The intelligibility of each sentence in the presence of speech-shaped noise was measured. For each talker, 200 approximately equally intelligible sentences were then selected and systematically distributed into 10 test lists. Test list homogeneity was assessed …

  5. Speech entrainment enables patients with Broca’s aphasia to produce fluent speech

    Science.gov (United States)

    Hubbard, H. Isabel; Hudspeth, Sarah Grace; Holland, Audrey L.; Bonilha, Leonardo; Fromm, Davida; Rorden, Chris

    2012-01-01

    A distinguishing feature of Broca’s aphasia is non-fluent halting speech typically involving one to three words per utterance. Yet, despite such profound impairments, some patients can mimic audio-visual speech stimuli enabling them to produce fluent speech in real time. We call this effect ‘speech entrainment’ and reveal its neural mechanism as well as explore its usefulness as a treatment for speech production in Broca’s aphasia. In Experiment 1, 13 patients with Broca’s aphasia were tested in three conditions: (i) speech entrainment with audio-visual feedback where they attempted to mimic a speaker whose mouth was seen on an iPod screen; (ii) speech entrainment with audio-only feedback where patients mimicked heard speech; and (iii) spontaneous speech where patients spoke freely about assigned topics. The patients produced a greater variety of words using audio-visual feedback compared with audio-only feedback and spontaneous speech. No difference was found between audio-only feedback and spontaneous speech. In Experiment 2, 10 of the 13 patients included in Experiment 1 and 20 control subjects underwent functional magnetic resonance imaging to determine the neural mechanism that supports speech entrainment. Group results with patients and controls revealed greater bilateral cortical activation for speech produced during speech entrainment compared with spontaneous speech at the junction of the anterior insula and Brodmann area 47, in Brodmann area 37, and unilaterally in the left middle temporal gyrus and the dorsal portion of Broca’s area. Probabilistic white matter tracts constructed for these regions in the normal subjects revealed a structural network connected via the corpus callosum and ventral fibres through the extreme capsule. Unilateral areas were connected via the arcuate fasciculus. In Experiment 3, all patients included in Experiment 1 participated in a 6-week treatment phase using speech entrainment to improve speech production

  6. Solving differential–algebraic equation systems by means of index reduction methodology

    DEFF Research Database (Denmark)

    Sørensen, Kim; Houbak, Niels; Condra, Thomas

    2006-01-01

    of a number of differential equations and algebraic equations — a so-called DAE system. Two of the DAE systems are of index 1 and they can be solved by means of standard DAE solvers. For the actual application, the equation systems are integrated by means of MATLAB's solver ode23t, which solves moderately stiff ODEs and index-1 DAEs by means of the trapezoidal rule. The last sub-model, which models the boiler's steam drum, consists of two differential and three algebraic equations. The index of this model is greater than 1, which means that ode23t cannot integrate this equation system. In this paper, it is shown how the equation system, by means of an index reduction methodology, can be reduced to a system of ordinary differential equations — ODEs.
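
    For readers unfamiliar with what a trapezoidal-rule DAE solver such as ode23t does per step, here is a minimal Python sketch for a hypothetical semi-explicit index-1 DAE x' = f(x, y), 0 = g(x, y); the functions f and g below are made-up stand-ins, not the boiler model.

    import numpy as np
    from scipy.optimize import fsolve

    def f(x, y):           # hypothetical differential part
        return -x + y

    def g(x, y):           # hypothetical algebraic constraint (index 1)
        return y - x**2

    def trapezoidal_step(x_n, y_n, h):
        def residual(z):
            x_new, y_new = z
            # trapezoidal rule on the differential equation ...
            r1 = x_new - x_n - 0.5 * h * (f(x_n, y_n) + f(x_new, y_new))
            # ... and the algebraic constraint enforced at the new time
            r2 = g(x_new, y_new)
            return [r1, r2]
        return fsolve(residual, [x_n, y_n])

    x, y = 1.0, 1.0
    for _ in range(10):                  # integrate 10 steps of h = 0.1
        x, y = trapezoidal_step(x, y, 0.1)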

  7. Acquirement and enhancement of remote speech signals

    Science.gov (United States)

    Lü, Tao; Guo, Jin; Zhang, He-yong; Yan, Chun-hui; Wang, Can-jin

    2017-07-01

    To address the challenges of non-cooperative and remote acoustic detection, an all-fiber laser Doppler vibrometer (LDV) is established. The all-fiber LDV system offers the advantages of smaller size, lightweight design and robust structure, hence it is a better fit for remote speech detection. In order to improve the performance and the efficiency of the LDV for long-range hearing, speech enhancement technology based on the optimally modified log-spectral amplitude (OM-LSA) algorithm is used. The experimental results show that comprehensible speech signals within a range of 150 m can be obtained by the proposed LDV. The signal-to-noise ratio (SNR) and mean opinion score (MOS) of the LDV speech signal can be increased by 100% and 27%, respectively, by using the speech enhancement technology. This all-fiber LDV, combined with the speech enhancement technology, can meet practical demands in engineering.
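
    The full OM-LSA estimator involves a speech-presence probability model and is too long to reproduce here, but the STFT-gain structure it shares with simpler suppressors can be sketched in a few lines of Python; the noise-estimation window and all settings below are hypothetical.

    import numpy as np
    from scipy.signal import stft, istft

    def enhance(x, fs, noise_seconds=0.5):
        f, t, X = stft(x, fs, nperseg=512)
        # estimate the noise power from an assumed speech-free lead-in
        n_frames = max(1, int(noise_seconds * fs / 256))  # hop = nperseg // 2
        noise_psd = np.mean(np.abs(X[:, :n_frames])**2, axis=1, keepdims=True)
        snr = np.maximum(np.abs(X)**2 / noise_psd - 1.0, 1e-3)  # a-posteriori SNR
        gain = snr / (1.0 + snr)                                # Wiener-style gain
        _, x_hat = istft(gain * X, fs, nperseg=512)
        return x_hat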

  8. Non-linear Dynamics of Speech in Schizophrenia

    DEFF Research Database (Denmark)

    Fusaroli, Riccardo; Simonsen, Arndis; Weed, Ethan

    (regularity and complexity) of speech. Our aims are (1) to achieve a more fine-grained understanding of the speech patterns in schizophrenia than has previously been achieved using traditional, linear measures of prosody and fluency, and (2) to employ the results in a supervised machine-learning process... -effects inference. SANS and SAPS scores were predicted using a 10-fold cross-validated multiple linear regression. Both analyses were iterated 1000 times to test for stability of the results. Results: Voice dynamics allowed discrimination of patients with schizophrenia from healthy controls with a balanced accuracy of 85...

  9. Speech-Language Therapists' Process of Including Significant Others in Aphasia Rehabilitation

    Science.gov (United States)

    Hallé, Marie-Christine; Le Dorze, Guylaine; Mingant, Anne

    2014-01-01

    Background: Although aphasia rehabilitation should include significant others, it is currently unknown how this recommendation is adopted in speech-language therapy practice. Speech-language therapists' (SLTs) experience of including significant others in aphasia rehabilitation is also understudied, yet a better understanding of clinical…

  10. Innovative Speech Reconstructive Surgery

    OpenAIRE

    Hashem Shemshadi

    2003-01-01

    Proper speech functioning in human beings depends on precise coordination and timing balances in a series of complex neuromuscular movements and actions: starting from the prime energy source of expelled air from the respiratory system; delivery of such air to trigger the vocal cords; swift shaping of this phonatory episode into a comprehensible sound through RESONANCE; and final coordination of all head and neck structures to elicit final speech in ...

  11. A toolbox to solve coupled systems of differential and difference equations

    International Nuclear Information System (INIS)

    Ablinger, Jakob; Schneider, Carsten; Bluemlein, Johannes; Freitas, Abilio de

    2016-01-01

    We present algorithms to solve coupled systems of linear differential equations, arising in the calculation of massive Feynman diagrams with local operator insertions at 3-loop order, which do not require special choices of bases. Here we assume that the desired solution has a power series representation and we seek the coefficients in closed form. In particular, if the coefficients depend on a small parameter ε (the dimensional parameter), we assume that the coefficients themselves can be expanded in formal Laurent series w.r.t. ε and we try to compute the first terms in closed form. More precisely, we have a decision algorithm which solves the following problem: if the terms can be represented by an indefinite nested hypergeometric sum expression (covering as special cases the harmonic sums, cyclotomic sums, generalized harmonic sums or nested binomial sums), then we can calculate them. If the algorithm fails, we obtain a proof that the terms cannot be represented by the class of indefinite nested hypergeometric sum expressions. Internally, this problem is reduced by holonomic closure properties to solving a coupled system of linear difference equations. The underlying method in this setting relies on decoupling algorithms, difference ring algorithms and recurrence solving. We demonstrate by a concrete example how this algorithm can be applied with the new Mathematica package SolveCoupledSystem, which is based on the packages Sigma, HarmonicSums and OreSys. In all applications the representation in x-space is obtained as an iterated integral representation over general alphabets, generalizing Poincaré iterated integrals.

  13. Speech Processing.

    Science.gov (United States)

    1983-05-01

    The VDE system developed had the capability of recognizing up to 248 separate words in syntactic structures. The two systems described are isolated... The remainder of the record is table-of-contents residue from the scanned report; the recoverable entries are: SPEAKER RECOGNITION, by M.J. Hunt; ASSESSMENT OF SPEECH SYSTEMS, by R.K. Moore; A SURVEY OF CURRENT EQUIPMENT AND RESEARCH, by J.S. Bridle; SPEECH TECHNOLOGY IN NAVY TRAINING SYSTEMS, by R. Breaux, M. Blind and R. Lynchard; and GENERAL REVIEW OF MILITARY APPLICATIONS OF VOICE PROCESSING, by Dr. Bruno...

  14. Dopamine Regulation of Human Speech and Bird Song: A Critical Review

    Science.gov (United States)

    Simonyan, Kristina; Horwitz, Barry; Jarvis, Erich D.

    2012-01-01

    To understand the neural basis of human speech control, extensive research has been done using a variety of methodologies in a range of experimental models. Nevertheless, several critical questions about learned vocal motor control still remain open. One of them is the mechanism(s) by which neurotransmitters, such as dopamine, modulate speech and…

  16. Bandwidth Extension of Telephone Speech Aided by Data Embedding

    Directory of Open Access Journals (Sweden)

    David Malah

    2007-01-01

    Full Text Available A system for bandwidth extension of telephone speech, aided by data embedding, is presented. The proposed system uses the transmitted analog narrowband speech signal as a carrier of the side information needed to carry out the bandwidth extension. The upper band of the wideband speech is reconstructed at the receiving end from two components: a synthetic wideband excitation signal, generated from the narrowband telephone speech, and a wideband spectral envelope, parametrically represented and transmitted as embedded data in the telephone speech. We propose a novel data embedding scheme, in which the scalar Costa scheme is combined with an auditory masking model, allowing high-rate transparent embedding while maintaining a low bit error rate. The signal is transformed to the frequency domain via the discrete Hartley transform (DHT) and is partitioned into subbands. Data is embedded in an adaptively chosen subset of subbands by modifying the DHT coefficients. In our simulations, high quality wideband speech was obtained from speech transmitted over a telephone line (characterized by spectral magnitude distortion, dispersion, and noise), in which side information data is transparently embedded at the rate of 600 information bits/second and with a bit error rate of approximately 3·10⁻⁴. In a listening test, the reconstructed wideband speech was preferred (at different degrees) over conventional telephone speech in 92.5% of the test utterances.
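
    As a small aside on the transform used above: the DHT can be computed directly from the FFT, since H(k) = Re{X(k)} - Im{X(k)}, and up to a 1/N scaling it is its own inverse. A minimal Python check:

    import numpy as np

    def dht(x):
        X = np.fft.fft(x)
        return np.real(X) - np.imag(X)

    def idht(h):
        # the DHT is (up to scaling) its own inverse
        return dht(h) / len(h)

    x = np.random.randn(512)
    assert np.allclose(x, idht(dht(x)))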

  17. Phonological Awareness and Early Reading Development in Childhood Apraxia of Speech (CAS)

    Science.gov (United States)

    McNeill, B. C.; Gillon, G. T.; Dodd, B.

    2009-01-01

    Background: Childhood apraxia of speech (CAS) is associated with phonological awareness, reading, and spelling deficits. Comparing literacy skills in CAS with other developmental speech disorders is critical for understanding the complexity of the disorder. Aims: This study compared the phonological awareness and reading development of children…

  18. Modern architectures for intelligent systems: reusable ontologies and problem-solving methods.

    Science.gov (United States)

    Musen, M A

    1998-01-01

    When interest in intelligent systems for clinical medicine soared in the 1970s, workers in medical informatics became particularly attracted to rule-based systems. Although many successful rule-based applications were constructed, development and maintenance of large rule bases remained quite problematic. In the 1980s, an entire industry dedicated to the marketing of tools for creating rule-based systems rose and fell, as workers in medical informatics began to appreciate deeply why knowledge acquisition and maintenance for such systems are difficult problems. During this time period, investigators began to explore alternative programming abstractions that could be used to develop intelligent systems. The notions of "generic tasks" and of reusable problem-solving methods became extremely influential. By the 1990s, academic centers were experimenting with architectures for intelligent systems based on two classes of reusable components: (1) domain-independent problem-solving methods-standard algorithms for automating stereotypical tasks--and (2) domain ontologies that captured the essential concepts (and relationships among those concepts) in particular application areas. This paper will highlight how intelligent systems for diverse tasks can be efficiently automated using these kinds of building blocks. The creation of domain ontologies and problem-solving methods is the fundamental end product of basic research in medical informatics. Consequently, these concepts need more attention by our scientific community.

  19. Speech Analysis and Synthesis and Man-Machine Speech Communications for Air Operations. (Synthese et Analyse de la Parole et Liaisons Vocales Homme- Machine dans les Operations Aeriennes)

    Science.gov (United States)

    1990-05-01

    speech processing area are faced. He presents speech communication as an interactive process, in which the listener actively reconstructs the message... speech produced by these systems... perhaps the greatest recent impetus in advancing digital... Finally, in the area of speech and speaker recognition...

  20. Use of digital speech recognition in diagnostics radiology

    International Nuclear Information System (INIS)

    Arndt, H.; Stockheim, D.; Mutze, S.; Petersein, J.; Gregor, P.; Hamm, B.

    1999-01-01

    Purpose: Applicability and benefits of digital speech recognition in diagnostic radiology were tested using the speech recognition system SP 6000. Methods: The speech recognition system SP 6000 was integrated into the network of the institute and connected to the existing Radiological Information System (RIS). Three subjects used this system for writing 2305 findings from dictation. After the recognition process the date, length of dictation, time required for checking/correction, kind of examination and error rate were recorded for every dictation. With the same subjects, a comparison was performed with 625 conventionally written findings. Results: After a 1-hour initial training the average error rates were 8.4 to 13.3%. The first adaptation of the speech recognition system (after nine days) decreased the average error rates to 2.4 to 10.7% owing to the ability of the program to learn. The 2nd and 3rd adaptations resulted in only small changes of the error rate. An individual comparison of the error rate development for the same kind of examination showed that the error rate is relatively independent of the individual user. Conclusion: The results show that the speech recognition system SP 6000 can be regarded as an advantageous alternative for quickly recording radiological findings. A comparison between manually writing and dictating the findings verifies the individual differences in writing speed and shows the advantage of voice recognition over normal keyboard performance. (orig.) [de

  1. An introduction to silent speech interfaces

    CERN Document Server

    Freitas, João; Dias, Miguel Sales; Silva, Samuel

    2017-01-01

    This book provides a broad and comprehensive overview of the existing technical approaches in the area of silent speech interfaces (SSI), both in theory and in application. Each technique is described in the context of the human speech production process, allowing the reader to clearly understand the principles behind SSI in general and across different methods. Additionally, the book explores the combined use of different data sources, collected from various sensors, in order to tackle the limitations of simpler SSI approaches, addressing current challenges of this field. The book also provides information about existing SSI applications, resources and a simple tutorial on how to build an SSI.

  2. Using ILD or ITD Cues for Sound Source Localization and Speech Understanding in a Complex Listening Environment by Listeners with Bilateral and with Hearing-Preservation Cochlear Implants

    Science.gov (United States)

    Loiselle, Louise H.; Dorman, Michael F.; Yost, William A.; Cook, Sarah J.; Gifford, Rene H.

    2016-01-01

    Purpose: To assess the role of interaural time differences and interaural level differences in (a) sound-source localization, and (b) speech understanding in a cocktail party listening environment for listeners with bilateral cochlear implants (CIs) and for listeners with hearing-preservation CIs. Methods: Eleven bilateral listeners with MED-EL…

  3. Voice and Speech Quality Perception Assessment and Evaluation

    CERN Document Server

    Jekosch, Ute

    2005-01-01

    Foundations of Voice and Speech Quality Perception starts out with the fundamental question of: "How do listeners perceive voice and speech quality and how can these processes be modeled?" Any quantitative answers require measurements. This is natural for physical quantities but harder to imagine for perceptual measurands. This book approaches the problem by actually identifying major perceptual dimensions of voice and speech quality perception, defining units wherever possible and offering paradigms to position these dimensions into a structural skeleton of perceptual speech and voice quality. The emphasis is placed on voice and speech quality assessment of systems in artificial scenarios. Many scientific fields are involved. This book bridges the gap between two quite diverse fields, engineering and humanities, and establishes the new research area of Voice and Speech Quality Perception.

  4. Speech recognition using articulatory and excitation source features

    CERN Document Server

    Rao, K Sreenivasa

    2017-01-01

    This book discusses the contribution of articulatory and excitation source information in discriminating sound units. The authors focus on excitation source component of speech -- and the dynamics of various articulators during speech production -- for enhancement of speech recognition (SR) performance. Speech recognition is analyzed for read, extempore, and conversation modes of speech. Five groups of articulatory features (AFs) are explored for speech recognition, in addition to conventional spectral features. Each chapter provides the motivation for exploring the specific feature for SR task, discusses the methods to extract those features, and finally suggests appropriate models to capture the sound unit specific knowledge from the proposed features. The authors close by discussing various combinations of spectral, articulatory and source features, and the desired models to enhance the performance of SR systems.

  5. Improvement of intelligibility of ideal binary-masked noisy speech by adding background noise.

    Science.gov (United States)

    Cao, Shuyang; Li, Liang; Wu, Xihong

    2011-04-01

    When a target-speech/masker mixture is processed with the signal-separation technique, ideal binary mask (IBM), intelligibility of target speech is remarkably improved in both normal-hearing listeners and hearing-impaired listeners. Intelligibility of speech can also be improved by filling in speech gaps with un-modulated broadband noise. This study investigated whether intelligibility of target speech in the IBM-treated target-speech/masker mixture can be further improved by adding a broadband-noise background. The results of this study show that following the IBM manipulation, which remarkably released target speech from speech-spectrum noise, foreign-speech, or native-speech masking (experiment 1), adding a broadband-noise background with the signal-to-noise ratio no less than 4 dB significantly improved intelligibility of target speech when the masker was either noise (experiment 2) or speech (experiment 3). The results suggest that since adding the noise background shallows the areas of silence in the time-frequency domain of the IBM-treated target-speech/masker mixture, the abruption of transient changes in the mixture is smoothed and the perceived continuity of target-speech components becomes enhanced, leading to improved target-speech intelligibility. The findings are useful for advancing computational auditory scene analysis, hearing-aid/cochlear-implant designs, and understanding of speech perception under "cocktail-party" conditions.
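
    The core of the IBM manipulation is compact enough to sketch. In the Python fragment below, the window length and the 0 dB local criterion are typical choices rather than necessarily those of this study; the mask keeps only the time-frequency cells of the mixture where the premixed target exceeds the premixed masker.

    import numpy as np
    from scipy.signal import stft, istft

    def apply_ibm(target, masker, fs, lc_db=0.0):
        _, _, S = stft(target, fs, nperseg=512)   # oracle access to the
        _, _, N = stft(masker, fs, nperseg=512)   # premixed signals
        local_snr = 20 * np.log10(np.abs(S) / (np.abs(N) + 1e-12))
        mask = (local_snr > lc_db).astype(float)  # the ideal binary mask
        _, _, X = stft(target + masker, fs, nperseg=512)
        _, y = istft(mask * X, fs, nperseg=512)   # masked mixture
        return y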

  6. FCJ-170 Challenging Hate Speech With Facebook Flarf: The Role of User Practices in Regulating Hate Speech on Facebook

    Directory of Open Access Journals (Sweden)

    Benjamin Abraham

    2014-12-01

    Full Text Available This article makes a case study of 'flarfing' (a creative Facebook user practice with roots in found-text poetry) in order to contribute to an understanding of the potentials and limitations facing users of online social networking sites who wish to address the issue of online hate speech. The practice of 'flarfing' involves users posting 'blue text' hyperlinked Facebook page names into status updates and comment threads. Facebook flarf sends a visible, though often non-literal, message to offenders and onlookers about what kinds of speech the responding activist(s) find (un)acceptable in online discussion, belonging to a category of agonistic online activism that repurposes the tools of internet trolling for activist ends. I argue this practice represents users attempting to 'take responsibility' for the culture of online spaces they inhabit, promoting intolerance to hate speech online. Careful consideration of the limits of flarf's efficacy within Facebook's specific regulatory environment shows the extent to which this practice and similar responses to online hate speech are constrained by the platforms on which they exist.

  7. Audio-Visual Speech Recognition Using MPEG-4 Compliant Visual Features

    Directory of Open Access Journals (Sweden)

    Petar S. Aleksic

    2002-11-01

    Full Text Available We describe an audio-visual automatic continuous speech recognition system, which significantly improves speech recognition performance over a wide range of acoustic noise levels, as well as under clean audio conditions. The system utilizes facial animation parameters (FAPs), supported by the MPEG-4 standard, for the visual representation of speech. We also describe a robust and automatic algorithm we have developed to extract FAPs from visual data, which does not require hand labeling or extensive training procedures. Principal component analysis (PCA) was performed on the FAPs in order to decrease the dimensionality of the visual feature vectors, and the derived projection weights were used as visual features in the audio-visual automatic speech recognition (ASR) experiments. Both single-stream and multistream hidden Markov models (HMMs) were used to model the ASR system, integrate audio and visual information, and perform relatively large vocabulary (approximately 1000 words) speech recognition experiments. The experiments performed use clean audio data and audio data corrupted by stationary white Gaussian noise at various SNRs. The proposed system reduces the word error rate (WER) by 20% to 23% relative to audio-only speech recognition WERs, at various SNRs (0-30 dB) with additive white Gaussian noise, and by 19% relative to the audio-only speech recognition WER under clean audio conditions.

  8. Magnified Neural Envelope Coding Predicts Deficits in Speech Perception in Noise.

    Science.gov (United States)

    Millman, Rebecca E; Mattys, Sven L; Gouws, André D; Prendergast, Garreth

    2017-08-09

    Verbal communication in noisy backgrounds is challenging. Understanding speech in background noise that fluctuates in intensity over time is particularly difficult for hearing-impaired listeners with a sensorineural hearing loss (SNHL). The reduction in fast-acting cochlear compression associated with SNHL exaggerates the perceived fluctuations in intensity in amplitude-modulated sounds. SNHL-induced changes in the coding of amplitude-modulated sounds may have a detrimental effect on the ability of SNHL listeners to understand speech in the presence of modulated background noise. To date, direct evidence for a link between magnified envelope coding and deficits in speech identification in modulated noise has been absent. Here, magnetoencephalography was used to quantify the effects of SNHL on phase locking to the temporal envelope of modulated noise (envelope coding) in human auditory cortex. Our results show that SNHL enhances the amplitude of envelope coding in posteromedial auditory cortex, whereas it enhances the fidelity of envelope coding in posteromedial and posterolateral auditory cortex. This dissociation was more evident in the right hemisphere, demonstrating functional lateralization in enhanced envelope coding in SNHL listeners. However, enhanced envelope coding was not perceptually beneficial. Our results also show that both hearing thresholds and, to a lesser extent, magnified cortical envelope coding in left posteromedial auditory cortex predict speech identification in modulated background noise. We propose a framework in which magnified envelope coding in posteromedial auditory cortex disrupts the segregation of speech from background noise, leading to deficits in speech perception in modulated background noise. SIGNIFICANCE STATEMENT People with hearing loss struggle to follow conversations in noisy environments. Background noise that fluctuates in intensity over time poses a particular challenge. Using magnetoencephalography, we demonstrate

  10. Speech perception under adverse conditions: Insights from behavioral, computational and neuroscience research

    Directory of Open Access Journals (Sweden)

    Sara eGuediche

    2014-01-01

    Full Text Available Adult speech perception reflects the long-term regularities of the native language, but it is also flexible such that it accommodates and adapts to adverse listening conditions and short-term deviations from native-language norms. The purpose of this review article is to examine how the broader neuroscience literature can inform and advance research efforts in understanding the neural basis of flexibility and adaptive plasticity in speech perception. In particular, we consider several domains of neuroscience research that offer insight into how perception can be adaptively tuned to short-term deviations while leaving intact the long-term learned regularities for mapping sensory input. We review several literatures to highlight the potential role of learning algorithms that rely on prediction error signals and discuss specific neural structures that are likely to contribute to such learning. Already, a few studies have alluded to a potential role of these mechanisms in adaptive plasticity in speech perception. A better understanding of the application and limitations of these algorithms for the challenges of flexible speech perception under adverse conditions promises to inform theoretical models of speech.

  11. Multimicrophone Speech Dereverberation: Experimental Validation

    Directory of Open Access Journals (Sweden)

    Marc Moonen

    2007-05-01

    Full Text Available Dereverberation is required in various speech processing applications such as handsfree telephony and voice-controlled systems, especially when the signals to be processed were recorded in a moderately or highly reverberant environment. In this paper, we compare a number of classical and more recently developed multimicrophone dereverberation algorithms, and validate the different algorithmic settings by means of two performance indices and a speech recognition system. It is found that some of the classical solutions obtain a moderate signal enhancement. More advanced subspace-based dereverberation techniques, on the other hand, fail to enhance the signals despite their high computational load.

  12. Building a Prototype Text to Speech for Sanskrit

    Science.gov (United States)

    Mahananda, Baiju; Raju, C. M. S.; Patil, Ramalinga Reddy; Jha, Narayana; Varakhedi, Shrinivasa; Kishore, Prahallad

    This paper describes the work done in building a prototype text-to-speech system for Sanskrit. A basic prototype text-to-speech system is built using a simplified Sanskrit phone set and employing a unit selection technique, where prerecorded sub-word units are concatenated to synthesize a sentence. We also discuss the issues involved in building a full-fledged text-to-speech system for Sanskrit.
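
    The concatenation step can be illustrated with a toy Python sketch; the unit inventory, unit names and crossfade length below are all hypothetical, standing in for a real database of prerecorded sub-word units.

    import numpy as np

    # stand-in inventory: each "unit" is 100 ms of audio at 16 kHz
    units = {ph: np.random.randn(1600) for ph in ["sa", "ms", "kr", "ta"]}

    def synthesize(phone_seq, crossfade=160):
        out = units[phone_seq[0]].copy()
        ramp = np.linspace(0.0, 1.0, crossfade)
        for ph in phone_seq[1:]:
            nxt = units[ph]
            # overlap-add a short crossfade to hide the unit boundary
            out[-crossfade:] = out[-crossfade:] * (1 - ramp) + nxt[:crossfade] * ramp
            out = np.concatenate([out, nxt[crossfade:]])
        return out

    audio = synthesize(["sa", "ms", "kr", "ta"])  # crude unit sequence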

  13. Speech processing: from peripheral to hemispheric asymmetry of the auditory system.

    Science.gov (United States)

    Lazard, Diane S; Collette, Jean-Louis; Perrot, Xavier

    2012-01-01

    Language processing from the cochlea to auditory association cortices shows side-dependent specificities with an apparent left hemispheric dominance. The aim of this article was to propose to nonspeech specialists a didactic review of two complementary theories about hemispheric asymmetry in speech processing. Starting from anatomico-physiological and clinical observations of auditory asymmetry and interhemispheric connections, this review then exposes behavioral (dichotic listening paradigm) as well as functional (functional magnetic resonance imaging and positron emission tomography) experiments that assessed hemispheric specialization for speech processing. Even though speech at an early phonological level is regarded as being processed bilaterally, a left-hemispheric dominance exists for higher-level processing. This asymmetry may arise from a segregation of the speech signal, broken apart within nonprimary auditory areas in two distinct temporal integration windows--a fast one on the left and a slower one on the right--modeled through the asymmetric sampling in time theory or a spectro-temporal trade-off, with a higher temporal resolution in the left hemisphere and a higher spectral resolution in the right hemisphere, modeled through the spectral/temporal resolution trade-off theory. Both theories deal with the concept that lower-order tuning principles for acoustic signal might drive higher-order organization for speech processing. However, the precise nature, mechanisms, and origin of speech processing asymmetry are still being debated. Finally, an example of hemispheric asymmetry alteration, which has direct clinical implications, is given through the case of auditory aging that mixes peripheral disorder and modifications of central processing. Copyright © 2011 The American Laryngological, Rhinological, and Otological Society, Inc.

  14. Exp-function method for solving Maccari's system

    International Nuclear Information System (INIS)

    Zhang Sheng

    2007-01-01

    In this Letter, the Exp-function method is used to seek exact solutions of Maccari's system. As a result, single and combined generalized solitonary solutions are obtained, from which some known solutions obtained by extended sine-Gordon equation method and improved hyperbolic function method are recovered as special cases. It is shown that the Exp-function method provides a very effective and powerful mathematical tool for solving nonlinear evolution equations in mathematical physics
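
    For context, the ansatz underlying the Exp-function method can be stated compactly (a generic sketch, not the Letter's specific derivation for Maccari's system): the method seeks travelling-wave solutions of the form

        u(\eta) = \frac{\sum_{n=-c}^{d} a_n \, e^{n\eta}}{\sum_{m=-p}^{q} b_m \, e^{m\eta}}, \qquad \eta = kx + \omega t,

    where the bounds c, d, p, q are fixed by balancing the highest-order linear term against the highest-order nonlinear term of the equation, and the coefficients a_n, b_m are then determined by collecting powers of e^{\eta} and setting each coefficient to zero.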

  15. Automated Discovery of Speech Act Categories in Educational Games

    Science.gov (United States)

    Rus, Vasile; Moldovan, Cristian; Niraula, Nobal; Graesser, Arthur C.

    2012-01-01

    In this paper we address the important task of automated discovery of speech act categories in dialogue-based, multi-party educational games. Speech acts are important in dialogue-based educational systems because they help infer the student speaker's intentions (the task of speech act classification) which in turn is crucial to providing adequate…

  16. Annotation of Heterogeneous Multimedia Content Using Automatic Speech Recognition

    NARCIS (Netherlands)

    Huijbregts, M.A.H.; Ordelman, Roeland J.F.; de Jong, Franciska M.G.

    2007-01-01

    This paper reports on the setup and evaluation of robust speech recognition system parts, geared towards transcript generation for heterogeneous, real-life media collections. The system is deployed for generating speech transcripts for the NIST/TRECVID-2007 test collection, part of a Dutch real-life

  17. Application of artifical intelligence principles to the analysis of "crazy" speech.

    Science.gov (United States)

    Garfield, D A; Rapp, C

    1994-04-01

    Artificial intelligence computer simulation methods can be used to investigate psychotic or "crazy" speech. Here, symbolic reasoning algorithms establish semantic networks that schematize speech. These semantic networks consist of two main structures: case frames and object taxonomies. Node-based reasoning rules apply to object taxonomies and pathway-based reasoning rules apply to case frames. Normal listeners may recognize speech as "crazy talk" based on violations of node- and pathway-based reasoning rules. In this article, three separate segments of schizophrenic speech illustrate violations of these rules. This artificial intelligence approach is compared and contrasted with other neurolinguistic approaches and is discussed as a conceptual link between neurobiological and psychodynamic understandings of psychopathology.
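
    The flavor of such node-based selectional checks is easy to convey with a toy Python fragment; the taxonomy, case frame and rule below are invented for illustration and are far simpler than the article's semantic networks.

    # tiny object taxonomy: child -> parent
    is_a = {"dog": "animal", "animal": "physical_object",
            "idea": "abstraction", "justice": "abstraction"}

    def ancestors(node):
        while node in is_a:
            node = is_a[node]
            yield node

    def violates(verb_frame, filler):
        # e.g. a rule that the AGENT of "eat" must be an animal
        required = verb_frame["agent"]
        return required != filler and required not in set(ancestors(filler))

    eat = {"agent": "animal"}
    print(violates(eat, "dog"))      # False: well-formed
    print(violates(eat, "justice"))  # True: flagged as anomalous speech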

  18. On Solving the Lorenz System by Differential Transformation Method

    International Nuclear Information System (INIS)

    Al-Sawalha, M. Mossa; Noorani, M. S. M.

    2008-01-01

    The differential transformation method (DTM) is employed to solve a nonlinear differential equation, namely the Lorenz system. Numerical results are compared to those obtained by the Runge–Kutta method to illustrate the preciseness and effectiveness of the proposed method. In particular, we examine the accuracy of the (DTM) as the Lorenz system changes from a non-chaotic system to a chaotic one. It is shown that the (DTM) is robust, accurate and easy to apply
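
    The DTM recurrences for the Lorenz system are short enough to sketch directly. The Python fragment below uses the textbook parameters σ = 10, ρ = 28, β = 8/3 and arbitrarily chosen step settings; each step evaluates a truncated Taylor series built from (k+1)X(k+1) = σ(Y(k) - X(k)) and the analogous recurrences for Y and Z, with convolution sums for the products xz and xy.

    import numpy as np

    def dtm_step(x0, y0, z0, h, K=10, sigma=10.0, rho=28.0, beta=8.0/3.0):
        X, Y, Z = np.zeros(K + 1), np.zeros(K + 1), np.zeros(K + 1)
        X[0], Y[0], Z[0] = x0, y0, z0
        for k in range(K):
            conv_xz = sum(X[l] * Z[k - l] for l in range(k + 1))
            conv_xy = sum(X[l] * Y[k - l] for l in range(k + 1))
            X[k + 1] = sigma * (Y[k] - X[k]) / (k + 1)
            Y[k + 1] = (rho * X[k] - Y[k] - conv_xz) / (k + 1)
            Z[k + 1] = (conv_xy - beta * Z[k]) / (k + 1)
        t = h ** np.arange(K + 1)        # powers of the step size
        return X @ t, Y @ t, Z @ t       # evaluate the Taylor series at h

    state = (1.0, 1.0, 1.0)
    for _ in range(100):                 # multi-step DTM over t in [0, 1]
        state = dtm_step(*state, h=0.01)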

  19. Knowledge acquisition from natural language for expert systems based on classification problem-solving methods

    Science.gov (United States)

    Gomez, Fernando

    1989-01-01

    It is shown how certain kinds of domain independent expert systems based on classification problem-solving methods can be constructed directly from natural language descriptions by a human expert. The expert knowledge is not translated into production rules. Rather, it is mapped into conceptual structures which are integrated into long-term memory (LTM). The resulting system is one in which problem-solving, retrieval and memory organization are integrated processes. In other words, the same algorithm and knowledge representation structures are shared by these processes. As a result of this, the system can answer questions, solve problems or reorganize LTM.

  20. Speech and language pathology & pediatric HIV.

    Science.gov (United States)

    Retzlaff, C

    1999-12-01

    Children with HIV have critical speech and language issues because the virus manifests itself primarily in the developing central nervous system, sometimes causing speech, motor control, and language disabilities. Language impediments that develop during the second year of life seem to be especially severe. HIV-infected children are also susceptible to recurrent ear infections, which can damage hearing. Developmental issues must be addressed for these children to reach their full potential. A decline in language skills may coincide with or precede other losses in cognitive ability. A speech pathologist can play an important role on a pediatric HIV team. References are included.

  1. Filled pause refinement based on the pronunciation probability for lecture speech.

    Directory of Open Access Journals (Sweden)

    Yan-Hua Long

    Full Text Available Nowadays, although automatic speech recognition has become quite proficient in recognizing or transcribing well-prepared fluent speech, the transcription of speech that contains many disfluencies remains problematic, such as spontaneous conversational and lecture speech. Filled pauses (FPs are the most frequently occurring disfluencies in this type of speech. Most recent studies have shown that FPs are widely believed to increase the error rates for state-of-the-art speech transcription, primarily because most FPs are not well annotated or provided in training data transcriptions and because of the similarities in acoustic characteristics between FPs and some common non-content words. To enhance the speech transcription system, we propose a new automatic refinement approach to detect FPs in British English lecture speech transcription. This approach combines the pronunciation probabilities for each word in the dictionary and acoustic language model scores for FP refinement through a modified speech recognition forced-alignment framework. We evaluate the proposed approach on the Reith Lectures speech transcription task, in which only imperfect training transcriptions are available. Successful results are achieved for both the development and evaluation datasets. Acoustic models trained on different styles of speech genres have been investigated with respect to FP refinement. To further validate the effectiveness of the proposed approach, speech transcription performance has also been examined using systems built on training data transcriptions with and without FP refinement.

  2. Exploring Effects of High School Students' Mathematical Processing Skills and Conceptual Understanding of Chemical Concepts on Algorithmic Problem Solving

    Science.gov (United States)

    Gultepe, Nejla; Yalcin Celik, Ayse; Kilic, Ziya

    2013-01-01

    The purpose of the study was to examine the effects of students' conceptual understanding of chemical concepts and mathematical processing skills on algorithmic problem-solving skills. The sample (N = 554) included grades 9, 10, and 11 students in Turkey. Data were collected using the instrument "MPC Test" and with interviews. The MPC…

  3. Errors and Understanding: The Effects of Error-Management Training on Creative Problem-Solving

    Science.gov (United States)

    Robledo, Issac C.; Hester, Kimberly S.; Peterson, David R.; Barrett, Jamie D.; Day, Eric A.; Hougen, Dean P.; Mumford, Michael D.

    2012-01-01

    People make errors in their creative problem-solving efforts. The intent of this article was to assess whether error-management training would improve performance on creative problem-solving tasks. Undergraduates were asked to solve an educational leadership problem known to call for creative thought where problem solutions were scored for…

  4. Intelligibility for Binaural Speech with Discarded Low-SNR Speech Components.

    Science.gov (United States)

    Schoenmaker, Esther; van de Par, Steven

    2016-01-01

    Speech intelligibility in multitalker settings improves when the target speaker is spatially separated from the interfering speakers. A factor that may contribute to this improvement is the improved detectability of target-speech components due to binaural interaction in analogy to the Binaural Masking Level Difference (BMLD). This would allow listeners to hear target speech components within specific time-frequency intervals that have a negative SNR, similar to the improvement in the detectability of a tone in noise when these contain disparate interaural difference cues. To investigate whether these negative-SNR target-speech components indeed contribute to speech intelligibility, a stimulus manipulation was performed where all target components were removed when local SNRs were smaller than a certain criterion value. It can be expected that for sufficiently high criterion values target speech components will be removed that do contribute to speech intelligibility. For spatially separated speakers, assuming that a BMLD-like detection advantage contributes to intelligibility, degradation in intelligibility is expected already at criterion values below 0 dB SNR. However, for collocated speakers it is expected that higher criterion values can be applied without impairing speech intelligibility. Results show that degradation of intelligibility for separated speakers is only seen for criterion values of 0 dB and above, indicating a negligible contribution of a BMLD-like detection advantage in multitalker settings. These results show that the spatial benefit is related to a spatial separation of speech components at positive local SNRs rather than to a BMLD-like detection improvement for speech components at negative local SNRs.

  5. Psychoacoustic cues to emotion in speech prosody and music.

    Science.gov (United States)

    Coutinho, Eduardo; Dibben, Nicola

    2013-01-01

    There is strong evidence of shared acoustic profiles common to the expression of emotions in music and speech, yet relatively limited understanding of the specific psychoacoustic features involved. This study combined a controlled experiment and computational modelling to investigate the perceptual codes associated with the expression of emotion in the acoustic domain. The empirical stage of the study provided continuous human ratings of emotions perceived in excerpts of film music and natural speech samples. The computational stage created a computer model that retrieves the relevant information from the acoustic stimuli and makes predictions about the emotional expressiveness of speech and music close to the responses of human subjects. We show that a significant part of the listeners' second-by-second reported emotions to music and speech prosody can be predicted from a set of seven psychoacoustic features: loudness, tempo/speech rate, melody/prosody contour, spectral centroid, spectral flux, sharpness, and roughness. The implications of these results are discussed in the context of cross-modal similarities in the communication of emotion in the acoustic domain.

  6. Application of expressive speech in the TTS system with cepstral description

    Czech Academy of Sciences Publication Activity Database

    Přibil, Jiří; Přibilová, Anna

    -, č. 5042 (2008), s. 200-212 ISSN 0302-9743 R&D Projects: GA AV ČR 1QS108040569 Grant - others:MŠk(SK) 1/3107/06 Institutional research plan: CEZ:AV0Z20670512 Keywords : speech synthesis * speech processing Subject RIV: JA - Electronics ; Optoelectronics, Electrical Engineering

  7. An online model for assessing stepwise solving

    African Journals Online (AJOL)

    User

    ...either formative or summative assessment. Hence, in this ... technology, educational system, students' understanding, calculus questions ... of the ... comprehension on the instructional c... [8] Moses, O. A. "Design of a Problem-solving approach."

  8. Prediction and Optimization of Speech Intelligibility in Adverse Conditions

    NARCIS (Netherlands)

    Taal, C.H.

    2013-01-01

    In digital speech-communication systems like mobile phones, public address systems and hearing aids, conveying the message is one of the most important goals. This can be challenging since the intelligibility of the speech may be harmed at various stages before, during and after the transmission

  9. Speech Understanding in Air Intercept Controller Training System Design.

    Science.gov (United States)

    1979-01-01


  10. Speech Function and Speech Role in Carl Fredricksen's Dialogue on Up Movie

    OpenAIRE

    Rehana, Ridha; Silitonga, Sortha

    2013-01-01

    One aim of this article is to show, through a concrete example, how speech function and speech role are used in a movie. The illustrative example is taken from the dialogue of the movie Up. Central to the analysis is the form of dialogue in Up that contains speech functions and speech roles, i.e. statement, offer, question, command, giving, and demanding. 269 dialogues were interpreted by actor, and the use of speech function and speech role was identified.

  11. Problem Solving, Scaffolding and Learning

    Science.gov (United States)

    Lin, Shih-Yin

    2012-01-01

    Helping students to construct robust understanding of physics concepts and develop good problem-solving skills is a central goal in many physics classrooms. This thesis examines students' problem solving abilities from different perspectives and explores strategies to scaffold students' learning. In studies involving analogical problem solving…

  12. Multiobjective CVaR Optimization Model and Solving Method for Hydrothermal System Considering Uncertain Load Demand

    Directory of Open Access Journals (Sweden)

    Zhongfu Tan

    2015-01-01

    Full Text Available In order to address the influence of load uncertainty on hydrothermal power system operation and to achieve the optimal objectives of system power generation consumption, pollutant emissions, and first-stage hydropower station storage capacity, this paper introduces the CVaR method and builds a multiobjective optimization model together with its solution method. In the optimization model, the actual values and deviation values of load demand are regarded as random variables, the scheduling objective is redefined to meet a confidence level requirement, and system operation constraints and loss function constraints are taken into consideration. To solve the proposed model, this paper linearizes the nonlinear constraints and applies fuzzy satisfaction, fuzzy entropy, and weighted multiobjective function theories to build a fuzzy entropy multiobjective CVaR model. The model is a mixed integer linear programming problem. Six thermal power plants and three cascade hydropower stations are then taken as the hydrothermal system for numerical simulation. The results verify that the multiobjective CVaR method is applicable to hydrothermal scheduling problems and better reflects the risk level of the scheduling result. The fuzzy entropy satisfaction degree solving algorithm simplifies the solution and obtains the optimal operation scheduling scheme.
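
    The risk measure at the core of the model is easy to state and compute. A minimal Python sketch of sample-based CVaR follows the Rockafellar-Uryasev formulation CVaR_a(L) = min_t { t + E[(L - t)+]/(1 - a) }, whose minimizer t is the a-quantile (VaR); the loss function below is made up, standing in for the scheduling model's losses.

    import numpy as np

    def cvar(losses, alpha=0.95):
        losses = np.asarray(losses)
        var = np.quantile(losses, alpha)                 # value-at-risk
        return var + np.mean(np.maximum(losses - var, 0)) / (1 - alpha)

    demand_error = np.random.normal(0, 50, 10000)        # hypothetical load error, MW
    cost = 1000 + 2.5 * np.abs(demand_error)             # hypothetical loss function
    print(cvar(cost, alpha=0.95))                        # mean loss beyond VaR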

  13. Experimental comparison between speech transmission index, rapid speech transmission index, and speech intelligibility index.

    Science.gov (United States)

    Larm, Petra; Hongisto, Valtteri

    2006-02-01

    During the acoustical design of, e.g., auditoria or open-plan offices, it is important to know how speech can be perceived in various parts of the room. Different objective methods have been developed to measure and predict speech intelligibility, and these have been extensively used in various spaces. In this study, two such methods were compared, the speech transmission index (STI) and the speech intelligibility index (SII). Also the simplification of the STI, the room acoustics speech transmission index (RASTI), was considered. These quantities are all based on determining an apparent speech-to-noise ratio on selected frequency bands and summing them using a specific weighting. For comparison, some data were needed on the possible differences of these methods resulting from the calculation scheme and also measuring equipment. Their prediction accuracy was also of interest. Measurements were made in a laboratory having adjustable noise level and absorption, and in a real auditorium. It was found that the measurement equipment, especially the selection of the loudspeaker, can greatly affect the accuracy of the results. The prediction accuracy of the RASTI was found acceptable, if the input values for the prediction are accurately known, even though the studied space was not ideally diffuse.
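
    All three indices share the same skeleton: clip a per-band apparent speech-to-noise ratio to ±15 dB, rescale it to [0, 1], and combine across bands with importance weights. A schematic Python rendering (the band weights here are invented for illustration, not the standardized STI or SII weights):

    import numpy as np

    def band_index(snr_db, weights):
        snr_db = np.clip(np.asarray(snr_db, float), -15.0, 15.0)
        ti = (snr_db + 15.0) / 30.0          # per-band transmission index
        w = np.asarray(weights, float)
        return float(np.sum(w * ti) / np.sum(w))

    snr = [12, 9, 6, 3, 0, -3, -6]           # hypothetical octave-band SNRs
    print(band_index(snr, weights=[1, 1, 1, 2, 2, 1, 1]))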

  14. Freedom of Speech: A Clear and Present Need to Teach. ERIC Report.

    Science.gov (United States)

    Boileau, Don M.

    1983-01-01

    Presents annotations of 21 documents in the ERIC system on the following subjects: (1) theory of freedom of speech; (2) theorists; (3) research on freedom of speech; (4) broadcasting and freedom of speech; and (5) international questions of freedom of speech. (PD)

  15. Hybrid methodological approach to context-dependent speech recognition

    Directory of Open Access Journals (Sweden)

    Dragiša Mišković

    2017-01-01

    Full Text Available Although the importance of contextual information in speech recognition has been acknowledged for a long time now, it has remained clearly underutilized even in state-of-the-art speech recognition systems. This article introduces a novel, methodologically hybrid approach to the research question of context-dependent speech recognition in human–machine interaction. To the extent that it is hybrid, the approach integrates aspects of both statistical and representational paradigms. We extend the standard statistical pattern-matching approach with a cognitively inspired and analytically tractable model with explanatory power. This methodological extension allows for accounting for contextual information which is otherwise unavailable in speech recognition systems, and using it to improve post-processing of recognition hypotheses. The article introduces an algorithm for evaluation of recognition hypotheses, illustrates it for concrete interaction domains, and discusses its implementation within two prototype conversational agents.

  16. Speech Intelligibility Advantages using an Acoustic Beamformer Display

    Science.gov (United States)

    Begault, Durand R.; Sunder, Kaushik; Godfroy, Martine; Otto, Peter

    2015-01-01

    A speech intelligibility test conforming to the Modified Rhyme Test of ANSI S3.2 "Method for Measuring the Intelligibility of Speech Over Communication Systems" was conducted using a prototype 12-channel acoustic beamformer system. The target speech material (signal) was identified against speech babble (noise), with calculated signal-noise ratios of 0, 5 and 10 dB. The signal was delivered at a fixed beam orientation of 135 deg (re 90 deg as the frontal direction of the array) and the noise at 135 deg (co-located) and 0 deg (separated). A significant improvement in intelligibility from 57% to 73% was found for spatial separation for the same signal-noise ratio (0 dB). Significant effects for improved intelligibility due to spatial separation were also found for higher signal-noise ratios (5 and 10 dB).
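
    For readers unfamiliar with the underlying array processing, a minimal frequency-domain delay-and-sum beamformer can be sketched as follows; the microphone geometry, spacing and sample rate are assumptions chosen only to echo the prototype's 12-channel count, not its actual design.

    import numpy as np

    def delay_and_sum(frames, mic_x, theta_deg, fs, c=343.0):
        """frames: (n_mics, n_samples) array of simultaneously sampled signals."""
        delays = mic_x * np.sin(np.radians(theta_deg)) / c   # seconds per mic
        n = frames.shape[1]
        freqs = np.fft.rfftfreq(n, 1.0 / fs)
        spectra = np.fft.rfft(frames, axis=1)
        # advance each channel by its steering delay, then average coherently
        steered = spectra * np.exp(2j * np.pi * freqs * delays[:, None])
        return np.fft.irfft(np.mean(steered, axis=0), n)

    mics = np.arange(12) * 0.04                 # 12 mics, 4 cm spacing (assumed)
    signals = np.random.randn(12, 1024)         # stand-in for captured audio
    out = delay_and_sum(signals, mics, theta_deg=45, fs=16000)  # 135 deg re 90 deg frontal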

  17. Robust Speech/Non-Speech Classification in Heterogeneous Multimedia Content

    NARCIS (Netherlands)

    Huijbregts, M.A.H.; de Jong, Franciska M.G.

    In this paper we present a speech/non-speech classification method that allows high quality classification without the need to know in advance what kinds of audible non-speech events are present in an audio recording and that does not require a single parameter to be tuned on in-domain data. Because

  18. Synergetic Organization in Speech Rhythm

    Science.gov (United States)

    Cummins, Fred

    The Speech Cycling Task is a novel experimental paradigm developed together with Robert Port and Keiichi Tajima at Indiana University. In a task of this sort, subjects repeat a phrase containing multiple prominent, or stressed, syllables in time with an auditory metronome, which can be simple or complex. A phase-based collective variable is defined in the acoustic speech signal. This paper reports on two experiments using speech cycling which together reveal many of the hallmarks of hierarchically coupled oscillatory processes. The first experiment requires subjects to place the final stressed syllable of a small phrase at specified phases within the overall Phrase Repetition Cycle (PRC). It is clearly demonstrated that only three patterns, characterized by phases around 1/3, 1/2 or 2/3, are reliably produced, and these points are attractors for other target phases. The system is thus multistable, and the attractors correspond to stable couplings between the metrical foot and the PRC. A second experiment examines the behavior of these attractors at increased rates. Faster rates lead to mode jumps between attractors. Previous experiments have also illustrated hysteresis as the system moves from one mode to the next. The dynamical organization is particularly interesting from a modeling point of view, as there is no single part of the speech production system which cycles at the level of either the metrical foot or the phrase repetition cycle. That is, there is no continuous kinematic observable in the system. Nonetheless, there is strong evidence that the macroscopic behavior of the entire production system is correctly described as hierarchically coupled oscillators. There are many parallels between this organization and the forms of inter-limb coupling observed in locomotion and rhythmic manual tasks.

  19. A perspective on single-channel frequency-domain speech enhancement

    CERN Document Server

    Benesty, Jacob

    2010-01-01

    This book focuses on a class of single-channel noise reduction methods that are performed in the frequency domain via the short-time Fourier transform (STFT). The simplicity and relative effectiveness of this class of approaches make them the dominant choice in practical systems. Even though many popular algorithms have been proposed through more than four decades of continuous research, there are a number of critical areas where our understanding and capabilities still remain quite rudimentary, especially with respect to the relationship between noise reduction and speech distortion. All exis

  20. Student Teachers' Ways of Thinking and Ways of Understanding Digestion and the Digestive System in Biology

    Science.gov (United States)

    Çimer, Sabiha Odabasi; Ursavas, Nazihan

    2012-01-01

    The purpose of this study was to identify the ways in which student teachers understand digestion and the digestive system and, subsequently, their ways of thinking, as reflected in their problem solving approaches and the justification schemes that they used to validate their claims. For this purpose, clinical interviews were conducted with 10…

  1. Analysis of Feature Extraction Methods for Speaker Dependent Speech Recognition

    Directory of Open Access Journals (Sweden)

    Gurpreet Kaur

    2017-02-01

    Full Text Available Speech recognition is about what is being said, irrespective of who is saying it. Speech recognition is a growing field, and major progress is taking place in the technology of automatic speech recognition (ASR). Still, there are many barriers in this field in terms of recognition rate, background noise, speaker variability, speaking rate, accent, etc. The speech recognition rate mainly depends on the selection of features and feature extraction methods. This paper outlines feature extraction techniques for speaker-dependent speech recognition of isolated words. A brief survey of different feature extraction techniques, such as Mel-Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding Coefficients (LPCC), Perceptual Linear Prediction (PLP), and Relative Spectral Perceptual Linear Prediction (RASTA-PLP) analysis, is presented and evaluated. Speech recognition has various applications, from daily use to commercial use. We have built a speaker-dependent system, which can be useful in many areas, such as controlling a patient's vehicle using simple commands.
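
    A minimal sketch of one surveyed front-end, MFCC extraction, using librosa; the filename and frame settings (25 ms windows, 10 ms hop) are illustrative assumptions rather than the paper's exact configuration.

    ```python
    import numpy as np
    import librosa

    def mfcc_features(path, n_mfcc=13):
        """Compute MFCCs plus deltas, a common front-end for isolated-word ASR."""
        y, sr = librosa.load(path, sr=16000)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc,
                                    n_fft=400, hop_length=160)  # 25 ms / 10 ms
        delta = librosa.feature.delta(mfcc)                     # first-order deltas
        return np.vstack([mfcc, delta])                         # (2*n_mfcc, n_frames)

    # feats = mfcc_features("word.wav")  # hypothetical file; one column per frame
    ```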

  2. Children with Autism Understand Indirect Speech Acts: Evidence from a Semi-Structured Act-Out Task.

    Directory of Open Access Journals (Sweden)

    Mikhail Kissine

    Full Text Available Children with Autism Spectrum Disorder are often said to present a global pragmatic impairment. However, there is some observational evidence that context-based comprehension of indirect requests may be preserved in autism. In order to provide experimental confirmation of this hypothesis, indirect speech act comprehension was tested in a group of 15 children with autism between 7 and 12 years and a group of 20 typically developing (TD) children between 2:7 and 3:6 years. The aim of the study was to determine whether children with autism can display genuinely contextual understanding of indirect requests. The experiment consisted of a three-pronged semi-structured task involving Mr Potato Head. In the first phase, a declarative sentence was uttered by one adult as an instruction to put a garment on a Mr Potato Head toy; in the second, the same sentence was uttered as a comment on a picture by another speaker; in the third, the same sentence was uttered as a comment on a picture by the first speaker. Children with autism complied with the indirect request in the first phase and demonstrated the capacity to inhibit the directive interpretation in phases 2 and 3. TD children had some difficulty in understanding the indirect instruction in phase 1. These results call for a more nuanced view of pragmatic dysfunction in autism.

  3. Human speech articulator measurements using low power, 2GHz Homodyne sensors

    International Nuclear Information System (INIS)

    Barnes, T; Burnett, G C; Holzrichter, J F

    1999-01-01

    Very low power, short-range microwave "radar-like" sensors can measure the motions and vibrations of internal human speech articulators as speech is produced. In these animate systems (and also in inanimate acoustic systems), microwave sensors can measure vibration information associated with excitation sources and other interfaces. These data, together with the corresponding acoustic data, enable the calculation of system transfer functions. This information appears to be useful for a surprisingly wide range of applications, such as speech coding and recognition, speaker or object identification, speech and musical instrument synthesis, noise cancellation, and other applications.
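
    The transfer-function calculation mentioned above can be sketched with a standard cross-spectral estimate, H(f) ≈ Pxy(f)/Pxx(f), here on synthetic signals (the sensor/acoustic pairing and the Welch parameters are assumptions for illustration):

    ```python
    import numpy as np
    from scipy.signal import csd, welch, lfilter

    fs = 16000
    rng = np.random.default_rng(1)
    x = rng.standard_normal(fs * 2)           # stand-in for the EM-sensor signal
    y = lfilter([1.0, 0.5, 0.2], [1.0], x)    # acoustic output of an assumed system

    f, Pxy = csd(x, y, fs=fs, nperseg=1024)   # cross-spectral density
    _, Pxx = welch(x, fs=fs, nperseg=1024)    # input auto-spectral density
    H = Pxy / Pxx                             # H1 transfer-function estimate
    ```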

  4. Effects of human fatigue on speech signals

    Science.gov (United States)

    Stamoulis, Catherine

    2004-05-01

    Cognitive performance may be significantly affected by fatigue. In the case of critical personnel, such as pilots, monitoring human fatigue is essential to ensure safety and success of a given operation. One of the modalities that may be used for this purpose is speech, which is sensitive to respiratory changes and increased muscle tension of vocal cords, induced by fatigue. Age, gender, vocal tract length, physical and emotional state may significantly alter speech intensity, duration, rhythm, and spectral characteristics. In addition to changes in speech rhythm, fatigue may also affect the quality of speech, such as articulation. In a noisy environment, detecting fatigue-related changes in speech signals, particularly subtle changes at the onset of fatigue, may be difficult. Therefore, in a performance-monitoring system, speech parameters which are significantly affected by fatigue need to be identified and extracted from input signals. For this purpose, a series of experiments was performed under slowly varying cognitive load conditions and at different times of the day. The results of the data analysis are presented here.

  5. Distributed Speech Enhancement in Wireless Acoustic Sensor Networks

    NARCIS (Netherlands)

    Zeng, Y.

    2015-01-01

    In digital speech communication applications like hands-free mobile telephony, hearing aids and human-to-computer communication systems, the recorded speech signals are typically corrupted by background noise. As a result, their quality and intelligibility can get severely degraded.

  6. Delayed speech development in children: Introduction to terminology

    Directory of Open Access Journals (Sweden)

    M. Yu. Bobylova

    2017-01-01

    Full Text Available There has recently been an increase in the number of children diagnosed with delayed speech development. The delay is often compensated for with age, but a mild deficiency frequently remains for life. Delayed speech development is more common in boys than in girls. Its etiology is unknown in most cases, so a child should be followed up to make an accurate diagnosis. Genetic predisposition or environmental factors frequently influence speech development. The course of speech delay varies. In a number of disorders (childhood disintegrative disorder, Landau–Kleffner syndrome), there is evidence that speech develops normally up to a certain point and then stops or even regresses. By way of comparison, in autism there are generally changes in speech development even during the preverbal stage (the revival complex fails to form; babbling is poor, emotionally flat, and gibberish-like; at the same time, the baby recites whole phrases without using them to communicate). These speech disorders are considered not only a delay but also a developmental abnormality. Speech disorders in children should be diagnosed as early as possible in order to initiate corrective measures in time. In this case, a physician makes the diagnosis and a special education teacher does the corrective work. The successful collaboration and mutual understanding of the specialists in these areas will determine the child's future quality of life. This paper focuses on the terminology and classification of speech delays, which are necessary for physicians and teachers to speak the same language.

  7. Development and validation of a system of assimilation indices: A mixed method approach to understand change in psychotherapy.

    Science.gov (United States)

    Neto, David D; Baptista, Telmo M; Dent-Brown, Kim

    2015-06-01

    Assimilation is an important process in understanding change in psychotherapy. Similar to other psychological processes, assimilation may be traceable in the speech of clients by attending to its signs or indices. In the present research, we aimed to build a system of indices of assimilation. This research follows a mixed method design. The indices were derived through qualitative analysis, using grounded theory. Subsequently, the indices were adjusted quantitatively and applied to 30 single psychotherapy sessions of adult clients with depression and 11 therapists. Forty-two indices were found and grouped into the following five process categories of assimilation: external distress, pain, noticing, decentring and action. The indices showed good inter-rater reliability and internal consistency. Except for noticing, all process categories correlated significantly with each other according to conceptual proximity. The system of indices also showed convergent validity with an existing coding system of assimilation for two process categories. The results suggest that the system of indices is a useful approach for understanding assimilation. The consideration of assimilation in a continuous fashion through sub-processes may help to extend our knowledge of this process and provide a tool for clinical practice. Assimilation is an important process in understanding change in psychotherapy in the sense that it takes into account insight and action-related processes. Clients convey in their speech signs or indices of the assimilation process which can be observed both in the style and content of their utterances. Using these indices, therapists can continuously assess assimilation and use this information in choosing interventions. Limitations: This study follows a cross-sectional design and does not allow consideration of the predictive value of the indices. The outcome of the therapy was not taken into account, which restricts validity considerations to the comparison with

  8. Financial and workflow analysis of radiology reporting processes in the planning phase of implementation of a speech recognition system

    Science.gov (United States)

    Whang, Tom; Ratib, Osman M.; Umamoto, Kathleen; Grant, Edward G.; McCoy, Michael J.

    2002-05-01

    The goal of this study is to determine the financial value and workflow improvements achievable by replacing traditional transcription services with a speech recognition system in a large, university hospital setting. Workflow metrics were measured at two hospitals, one of which exclusively uses a transcription service (UCLA Medical Center), and the other which exclusively uses speech recognition (West Los Angeles VA Hospital). Workflow metrics include time spent per report (the sum of time spent interpreting, dictating, reviewing, and editing), transcription turnaround, and total report turnaround. Compared to traditional transcription, speech recognition resulted in radiologists spending 13-32% more time per report, but it also resulted in reduction of report turnaround time by 22-62% and reduction of marginal cost per report by 94%. The model developed here helps justify the introduction of a speech recognition system by showing that the benefits of reduced operating costs and decreased turnaround time outweigh the cost of increased time spent per report. Whether the ultimate goal is to achieve a financial objective or to improve operational efficiency, it is important to conduct a thorough analysis of workflow before implementation.
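
    A toy version of such a cost justification is sketched below. Only the percentage ranges come from the study; the volumes, rates, and per-report figures are hypothetical, and with these particular numbers the balance flips within the 13-32% range, which is precisely why a site-specific workflow analysis matters.

    ```python
    # Toy break-even sketch. Only the percentages (94% marginal-cost reduction,
    # 13-32% more radiologist time per report) come from the study; every other
    # figure below is a hypothetical assumption.
    reports_per_year = 100_000        # assumed annual report volume
    transcription_cost = 5.00         # assumed $ per report with transcription
    minutes_per_report = 10.0         # assumed radiologist time per report
    radiologist_rate = 200.0          # assumed $ per hour

    saved = reports_per_year * transcription_cost * 0.94    # 94% cost reduction
    for extra_frac in (0.13, 0.32):                         # study's reported range
        extra = reports_per_year * minutes_per_report*extra_frac/60 * radiologist_rate
        print(f"+{extra_frac:.0%} reading time: save ${saved:,.0f}/yr, "
              f"extra radiologist-time cost ${extra:,.0f}/yr")
    ```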

  9. Engaging Systems Understanding through Games (Invited)

    Science.gov (United States)

    Pfirman, S. L.; Lee, J. J.; Eklund, K.; Turrin, M.; O'Garra, T.; Orlove, B. S.

    2013-12-01

    The Polar Learning And Responding (PoLAR) Climate Change Education Partnership (CCEP), supported by the National Science Foundation's CCEP Phase II program, uses novel educational approaches to engage adult learners and to inform public understanding about climate change. Both previous studies and our experience show that games and game-like activities lead people to explore systems and motivate problem-solving. This presentation focuses on three games developed by the PoLAR team: a multiplayer card game, a strategy board game, and a serious game, and discusses them within the larger framework of research and evaluation of learning outcomes. In the multiplayer card game EcoChains: Arctic Crisis, players learn how to build marine food chains, then strategize ways to make them resilient to a variety of natural and anthropogenic events. In the strategy board game Arctic SMARTIC (Strategic MAnagement of Resources in TImes of Change), participants take on roles, set developmental priorities, and then negotiate to resolve conflicts and deal with climate change scenarios. In the serious game FUTURE COAST, players explore "what if" scenarios in a collaborative narrative environment. Grounded on the award-winning WORLD WITHOUT OIL, which employed a similar story frame to impart energy concepts and realities, FUTURE COAST uses voicemails from the future to impel players through complexities of disrupted systems and realities of human interactions when facing change. Launching February 2014, FUTURE COAST is played online and in field events; players create media designed to be spreadable through their social networks. As players envision possible futures, they create diverse communities of practice that synthesize across human-environment interactions. Playtests highlight how the game evokes systems thinking, and engages and problem-solves via narrative: * 'While I was initially unsure how I'd contribute to a group I'd never met, the project itself proved so engaging that I

  10. Speech disorders - children

    Science.gov (United States)

    Alternative names include voice disorders, vocal disorders, disfluency, communication disorder - speech disorder, and speech disorder - stuttering. Evaluation tools that can help identify and diagnose speech disorders include the Denver Articulation Screening Examination and the Goldman-Fristoe Test of Articulation.

  11. The role of private speech in cognitive regulation of learners: The case of English as a foreign language education

    Directory of Open Access Journals (Sweden)

    Mohamad Reza Anani Sarab

    2015-12-01

    Full Text Available Investigations into the use of private speech by adult English as a foreign language (EFL) learners in regulating their mental activities have been an interesting area of research within a sociocultural framework. Following this line of research, 30 advanced adult EFL learners were selected via the Oxford Quick Placement Test and took a test of solving challenging English riddles while their voices were recorded. Instances of the produced private speech were then analyzed in terms of form, content, and function. It was demonstrated that private speech, with its different forms, contents, and functions, plays a crucial role in the cognitive regulation of EFL learners, which has important implications for the language learning classroom. In addition, participants seemed to produce qualitatively different kinds of L2 private speech, which led us to conclude that it is necessary to consider quality, and not just quantity, when studying psycholinguistic concepts such as cognitive regulation and private speech.

  12. Solving the Coupled System Improves Computational Efficiency of the Bidomain Equations

    KAUST Repository

    Southern, J.A.; Plank, G.; Vigmond, E.J.; Whiteley, J.P.

    2009-10-01

    The bidomain equations are frequently used to model the propagation of cardiac action potentials across cardiac tissue. At the whole organ level, the size of the computational mesh required makes their solution a significant computational challenge. As the accuracy of the numerical solution cannot be compromised, efficiency of the solution technique is important to ensure that the results of the simulation can be obtained in a reasonable time while still encapsulating the complexities of the system. In an attempt to increase efficiency of the solver, the bidomain equations are often decoupled into one parabolic equation that is computationally very cheap to solve and an elliptic equation that is much more expensive to solve. In this study, the performance of this uncoupled solution method is compared with an alternative strategy in which the bidomain equations are solved as a coupled system. This seems counterintuitive as the alternative method requires the solution of a much larger linear system at each time step. However, in tests on two 3-D rabbit ventricle benchmarks, it is shown that the coupled method is up to 80% faster than the conventional uncoupled method, and that parallel performance is better for the larger coupled problem.
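
    A minimal 1-D sketch of the two strategies being compared (toy discretization, parameter values and Dirichlet boundary conditions chosen for simplicity; not the authors' code). One implicit step is taken either by solving a single coupled block system in (Vm, phi_e), or by splitting into a parabolic step for Vm with phi_e lagged, followed by an elliptic solve for phi_e:

    ```python
    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    n, h, dt = 2000, 1e-3, 0.01
    si, se, chi_cm = 1.0, 2.0, 1.0                       # toy conductivities
    L = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n)) / h**2
    I = sp.identity(n)
    Li, Lie = si * L, (si + se) * L

    # Coupled: one larger factorization; per time step solve
    #   [chi_cm/dt*I - Li   -Li ] [Vm ]   [chi_cm/dt*Vm_old - I_ion]
    #   [       Li          Lie ] [phi] = [            0           ]
    coupled = spla.splu(sp.bmat([[chi_cm/dt*I - Li, -Li], [Li, Lie]]).tocsc())

    # Uncoupled: two smaller factorizations, parabolic then elliptic
    parab = spla.splu((chi_cm/dt*I - Li).tocsc())
    ellip = spla.splu(Lie.tocsc())

    vm, phi, iion = np.zeros(n), np.zeros(n), np.zeros(n)
    vm[n//2] = 1.0                                       # toy stimulus

    # One coupled step
    z = coupled.solve(np.concatenate([chi_cm/dt*vm - iion, np.zeros(n)]))
    vm_c, phi_c = z[:n], z[n:]

    # One uncoupled step (phi lagged in the parabolic solve)
    vm_u = parab.solve(chi_cm/dt*vm + Li @ phi - iion)
    phi_u = ellip.solve(-(Li @ vm_u))
    ```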

  14. An expert system application for diagnosing acute respiratory infection (ISPA) based on speech recognition using the naive Bayes classifier method

    Directory of Open Access Journals (Sweden)

    Mariam Marlina

    2017-05-01

    Full Text Available ISPA (Infeksi Saluran Pernafasan Akut; acute respiratory tract infection, ARI) is a respiratory disorder that can produce a wide spectrum of illness, ranging from asymptomatic disease and mild infection to severe and deadly disease caused by environmental factors. The public's limited knowledge of ARI symptoms and their management is one factor behind the high ARI mortality rate. An expert system delivered as an application is therefore needed to help a person diagnose ARI easily and quickly. By adopting human knowledge into a computer, an expert system can solve problems the way an expert does. The expert system application for diagnosing ARI based on speech recognition using the naive Bayes classifier method can thus be used to diagnose ARI in a person from the converted result of the user's detected speech. With this application, the user in effect consults a doctor or expert who treats ARI. The application is built on Android using the Java programming language and a MySQL database. Keywords: expert system, speech recognition, ISPA, naive Bayes classifier method, Android.
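
    A minimal sketch of the classifier at the core of such a system: a Bernoulli naive Bayes over binary symptom indicators. The symptom list, training data, and the mapping from recognized speech to symptoms are entirely hypothetical, not the study's.

    ```python
    import numpy as np

    SYMPTOMS = ["cough", "fever", "sore_throat", "shortness_of_breath"]  # hypothetical

    # Hypothetical training data: rows = patients, columns = symptom present (1/0)
    X = np.array([[1,1,1,0],[1,0,1,0],[1,1,0,1],[0,0,1,0],[0,1,0,0],[0,0,0,1]])
    y = np.array([1, 1, 1, 0, 0, 0])   # 1 = ARI, 0 = not ARI

    def fit(X, y, alpha=1.0):
        """Estimate class priors and per-symptom Bernoulli likelihoods (Laplace)."""
        priors, likes = {}, {}
        for c in (0, 1):
            Xc = X[y == c]
            priors[c] = len(Xc) / len(X)
            likes[c] = (Xc.sum(axis=0) + alpha) / (len(Xc) + 2*alpha)
        return priors, likes

    def predict(x, priors, likes):
        """Pick the class with the highest log-posterior for symptom vector x."""
        post = {}
        for c in (0, 1):
            p = likes[c]
            post[c] = np.log(priors[c]) + np.sum(x*np.log(p) + (1-x)*np.log(1-p))
        return max(post, key=post.get)

    priors, likes = fit(X, y)
    # Symptoms extracted from the recognized speech, e.g. "cough and fever":
    print(predict(np.array([1, 1, 0, 0]), priors, likes))   # -> 1 (ARI)
    ```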

  15. Speech and Hearing Science in Ancient India--A Review of Sanskrit Literature.

    Science.gov (United States)

    Savithri, S. R.

    1988-01-01

    The study reviewed Sanskrit books written between 1500 BC and 1904 AD concerning diseases, speech pathology, and audiology. Details are provided of the ancient Indian system of disease classification, the classification of speech sounds, causes of speech disorders, and treatment of speech and language disorders. (DB)

  16. Lost in Translation: Understanding Students' Use of Social Networking and Online Resources to Support Early Clinical Practices. A National Survey of Graduate Speech-Language Pathology Students

    Science.gov (United States)

    Boster, Jamie B.; McCarthy, John W.

    2018-01-01

    The Internet is a source of many resources for graduate speech-language pathology (SLP) students. It is important to understand the resources students are aware of, which they use, and why they are being chosen as sources of information for therapy activities. A national online survey of graduate SLP students was conducted to assess their…

  17. Neurophysiology of speech differences in childhood apraxia of speech.

    Science.gov (United States)

    Preston, Jonathan L; Molfese, Peter J; Gumkowski, Nina; Sorcinelli, Andrea; Harwood, Vanessa; Irwin, Julia R; Landi, Nicole

    2014-01-01

    Event-related potentials (ERPs) were recorded during a picture naming task of simple and complex words in children with typical speech and with childhood apraxia of speech (CAS). Results reveal reduced amplitude prior to speaking complex (multisyllabic) words relative to simple (monosyllabic) words for the CAS group over the right hemisphere during a time window thought to reflect phonological encoding of word forms. Group differences were also observed prior to production of spoken tokens regardless of word complexity during a time window just prior to speech onset (thought to reflect motor planning/programming). Results suggest differences in pre-speech neurolinguistic processes.

  18. Comments on new iterative methods for solving linear systems

    Directory of Open Access Journals (Sweden)

    Wang Ke

    2017-06-01

    Full Text Available Some new iterative methods for solving linear systems were presented by Du, Zheng and Wang in [3], where it is shown that the new methods, compared to the classical Jacobi or Gauss-Seidel method, can be applied to more systems and have faster convergence. Through further analysis and numerical examples, this note shows that their methods are suitable for a wider class of matrices than the positive matrices the authors suggested.
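
    For reference, minimal implementations of the two classical baselines named above, Jacobi and Gauss-Seidel, on a small diagonally dominant test system (tolerance and iteration cap are arbitrary):

    ```python
    import numpy as np

    def jacobi(A, b, tol=1e-10, maxit=10_000):
        """Jacobi iteration: x_new = D^{-1} (b - (A - D) x)."""
        D = np.diag(A)
        R = A - np.diagflat(D)
        x = np.zeros_like(b)
        for _ in range(maxit):
            x_new = (b - R @ x) / D
            if np.linalg.norm(x_new - x) < tol:
                return x_new
            x = x_new
        return x

    def gauss_seidel(A, b, tol=1e-10, maxit=10_000):
        """Gauss-Seidel: sweep through rows, using updated entries immediately."""
        n = len(b)
        x = np.zeros_like(b)
        for _ in range(maxit):
            x_old = x.copy()
            for i in range(n):
                s = A[i, :i] @ x[:i] + A[i, i+1:] @ x_old[i+1:]
                x[i] = (b[i] - s) / A[i, i]
            if np.linalg.norm(x - x_old) < tol:
                return x
        return x

    A = np.array([[4.0, -1, 0], [-1, 4, -1], [0, -1, 4]])   # diagonally dominant
    b = np.array([1.0, 2, 3])
    print(jacobi(A, b), gauss_seidel(A, b))
    ```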

  19. Studies of Visual Attention in Physics Problem Solving

    Science.gov (United States)

    Madsen, Adrian M.

    2013-01-01

    The work described here represents an effort to understand and influence visual attention while solving physics problems containing a diagram. Our visual system is guided by two types of processes--top-down and bottom-up. The top-down processes are internal and determined by one's prior knowledge and goals. The bottom-up processes are external and…

  20. The normalities and abnormalities associated with speech in psychometrically-defined schizotypy.

    Science.gov (United States)

    Cohen, Alex S; Auster, Tracey L; McGovern, Jessica E; MacAulay, Rebecca K

    2014-12-01

    Speech deficits are thought to be an important feature of schizotypy, defined as the personality organization reflecting a putative liability for schizophrenia. There is reason to suspect that these deficits manifest as a function of limited cognitive resources. To evaluate this idea, we examined speech from individuals with psychometrically-defined schizotypy during a task with low cognitive demands versus one with relatively high cognitive demands. A range of objective, computer-based speech measures was employed, tapping speech production (silence, number and length of pauses, number and length of utterances), speech variability (global and local intonation and emphasis) and speech content (word fillers, idea density). Data for control (n=37) and schizotypy (n=39) groups were examined. Results did not confirm our hypotheses. While the cognitive-load task reduced speech expressivity on most variables for subjects as a group, the schizotypy group did not show more pathological speech characteristics than the control group. Interestingly, some aspects of speech in schizotypal versus control subjects were healthier under high cognitive load. Moreover, schizotypal subjects performed better, at a trend level, than controls on the cognitively demanding task. These findings hold important implications for our understanding of the neurocognitive architecture associated with the schizophrenia spectrum. Of particular note is the apparent mismatch between self-reported schizotypal traits and objective performance, and the resiliency of speech under cognitive stress in persons with high levels of schizotypy. Copyright © 2014 Elsevier B.V. All rights reserved.
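
    A minimal sketch of the kind of objective pause/utterance measures described, via frame-energy thresholding; the threshold and frame sizes are illustrative, not the study's software:

    ```python
    import numpy as np

    def pause_metrics(x, fs, frame_ms=25, hop_ms=10, rel_thresh=0.05):
        """Count pauses/utterances by thresholding short-time frame energy."""
        frame, hop = int(fs*frame_ms/1000), int(fs*hop_ms/1000)
        n = 1 + max(0, (len(x) - frame)//hop)
        energy = np.array([np.mean(x[i*hop:i*hop+frame]**2) for i in range(n)])
        voiced = energy > rel_thresh * energy.max()
        # Run-length encode the voiced/silent frame sequence
        edges = np.flatnonzero(np.diff(voiced.astype(int))) + 1
        runs = np.split(voiced, edges)
        pauses = [len(r)*hop/fs for r in runs if not r[0]]
        utts = [len(r)*hop/fs for r in runs if r[0]]
        return {"n_pauses": len(pauses),
                "mean_pause_s": float(np.mean(pauses)) if pauses else 0.0,
                "n_utterances": len(utts),
                "mean_utt_s": float(np.mean(utts)) if utts else 0.0}

    # pause_metrics(signal, 16000) -> counts and mean durations in seconds
    ```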