WorldWideScience

Sample records for perceptual speech assessment

  1. Perceptual Speech Assessment After Anterior Maxillary Distraction in Patients With Cleft Maxillary Hypoplasia.

    Science.gov (United States)

    Richardson, Sunil; Seelan, Nikkie S; Selvaraj, Dhivakar; Khandeparker, Rakshit V; Gnanamony, Sangeetha

    2016-06-01

    To assess speech outcomes after anterior maxillary distraction (AMD) in patients with cleft-related maxillary hypoplasia. Fifty-eight patients at least 10 years old with cleft-related maxillary hypoplasia were included in this study irrespective of gender, type of cleft lip and palate, and amount of required advancement. AMD was carried out in all patients using a tooth-borne palatal distractor by a single oral and maxillofacial surgeon. Perceptual speech assessment was performed by 2 speech-language pathologists preoperatively, before placement of the distractor device, and 6 months postoperatively using the scoring system of Perkins et al (Plast Reconstr Surg 116:72, 2005); the system evaluates velopharyngeal insufficiency (VPI), resonance, nasal air emission, articulation errors, and intelligibility. The data obtained were tabulated and subjected to statistical analysis using the Wilcoxon signed rank test. A P value less than .05 was considered significant. Eight patients were lost to follow-up. At 6-month follow-up, improvements of 62% (n = 31), 64% (n = 32), 50% (n = 25), 68% (n = 34), and 70% (n = 35) in VPI, resonance, nasal air emission, articulation, and intelligibility, respectively, were observed, with worsening of all parameters in 1 patient (2%). The results for all tested parameters were highly significant (P ≤ .001). AMD offers a substantial improvement in speech for all 5 parameters of perceptual speech assessment. Copyright © 2016 The American Association of Oral and Maxillofacial Surgeons. Published by Elsevier Inc. All rights reserved.
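
    The statistical step described above reduces to a paired non-parametric comparison of pre- and postoperative ratings; a minimal Python sketch with hypothetical severity scores (not the study's data):

        # Paired pre/post perceptual ratings (hypothetical 0-4 severity scores).
        from scipy.stats import wilcoxon

        pre  = [3, 4, 2, 3, 4, 3, 2, 4, 3, 3]   # preoperative ratings
        post = [1, 2, 1, 2, 2, 1, 2, 2, 1, 2]   # 6-month postoperative ratings

        stat, p = wilcoxon(pre, post)           # Wilcoxon signed rank test
        print(f"W = {stat}, p = {p:.4f}")       # p < .05 -> significant change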

  2. Audiovisual speech perception development at varying levels of perceptual processing.

    Science.gov (United States)

    Lalonde, Kaylah; Holt, Rachael Frush

    2016-04-01

    This study used the auditory evaluation framework [Erber (1982). Auditory Training (Alexander Graham Bell Association, Washington, DC)] to characterize the influence of visual speech on audiovisual (AV) speech perception in adults and children at multiple levels of perceptual processing. Six- to eight-year-old children and adults completed auditory and AV speech perception tasks at three levels of perceptual processing (detection, discrimination, and recognition). The tasks differed in the level of perceptual processing required to complete them. Adults and children demonstrated visual speech influence at all levels of perceptual processing. Whereas children demonstrated the same visual speech influence at each level of perceptual processing, adults demonstrated greater visual speech influence on tasks requiring higher levels of perceptual processing. These results support previous research demonstrating multiple mechanisms of AV speech processing (general perceptual and speech-specific mechanisms) with independent maturational time courses. The results suggest that adults rely on both general perceptual mechanisms that apply to all levels of perceptual processing and speech-specific mechanisms that apply when making phonetic decisions and/or accessing the lexicon. Six- to eight-year-old children seem to rely only on general perceptual mechanisms across levels. As expected, developmental differences in AV benefit on this and other recognition tasks likely reflect immature speech-specific mechanisms and phonetic processing in children.

  3. Quality of synthetic speech: perceptual dimensions, influencing factors, and instrumental assessment

    CERN Document Server

    Hinterleitner, Florian

    2017-01-01

    This book reviews research towards perceptual quality dimensions of synthetic speech, compares these findings with the state of the art, and derives a set of five universal perceptual quality dimensions for TTS signals. They are: (i) naturalness of voice, (ii) prosodic quality, (iii) fluency and intelligibility, (iv) absence of disturbances, and (v) calmness. Moreover, a test protocol for the efficient identification of those dimensions in a listening test is introduced. Furthermore, several factors influencing these dimensions are examined. In addition, different techniques for the instrumental quality assessment of TTS signals are introduced, reviewed and tested. Finally, the requirements for the integration of an instrumental quality measure into a concatenative TTS system are examined.

  4. Cerebellar tDCS dissociates the timing of perceptual decisions from perceptual change in speech

    NARCIS (Netherlands)

    Lametti, D.R.; Oostwoud Wijdenes, L.; Bonaiuto, J.; Bestmann, S.; Rothwell, J.C.

    2016-01-01

    Neuroimaging studies suggest that the cerebellum might play a role in both speech perception and speech perceptual learning. However, it remains unclear what this role is: does the cerebellum directly contribute to the perceptual decision? Or does it contribute to the timing of perceptual decisions?

  5. Visual speech information: a help or hindrance in perceptual processing of dysarthric speech.

    Science.gov (United States)

    Borrie, Stephanie A

    2015-03-01

    This study investigated the influence of visual speech information on perceptual processing of neurologically degraded speech. Fifty listeners identified spastic dysarthric speech under both audio (A) and audiovisual (AV) conditions. Condition comparisons revealed that the addition of visual speech information enhanced processing of the neurologically degraded input in terms of (a) acuity (percent phonemes correct) of vowels and consonants and (b) recognition (percent words correct) of predictive and nonpredictive phrases. Listeners exploited stress-based segmentation strategies more readily in AV conditions, suggesting that the perceptual benefit associated with adding visual speech information to the auditory signal (the AV advantage) has both segmental and suprasegmental origins. Results also revealed that the magnitude of the AV advantage can be predicted, to some degree, by the extent to which an individual utilizes syllabic stress cues to inform word recognition in AV conditions. Findings inform the development of a listener-specific model of speech perception that applies to processing of dysarthric speech in everyday communication contexts.

  6. Audiovisual Cues and Perceptual Learning of Spectrally Distorted Speech

    Science.gov (United States)

    Pilling, Michael; Thomas, Sharon

    2011-01-01

    Two experiments investigate the effectiveness of audiovisual (AV) speech cues (cues derived from both seeing and hearing a talker speak) in facilitating perceptual learning of spectrally distorted speech. Speech was distorted through an eight channel noise-vocoder which shifted the spectral envelope of the speech signal to simulate the properties…

  7. Voice and Speech Quality Perception: Assessment and Evaluation

    CERN Document Server

    Jekosch, Ute

    2005-01-01

    Foundations of Voice and Speech Quality Perception starts out with the fundamental question: "How do listeners perceive voice and speech quality and how can these processes be modeled?" Any quantitative answers require measurements. This is natural for physical quantities but harder to imagine for perceptual measurands. This book approaches the problem by identifying major perceptual dimensions of voice and speech quality perception, defining units wherever possible and offering paradigms to position these dimensions into a structural skeleton of perceptual speech and voice quality. The emphasis is placed on voice and speech quality assessment of systems in artificial scenarios. Many scientific fields are involved. This book bridges the gap between two quite diverse fields, engineering and humanities, and establishes the new research area of Voice and Speech Quality Perception.

  8. Audiovisual cues benefit recognition of accented speech in noise but not perceptual adaptation.

    Science.gov (United States)

    Banks, Briony; Gowen, Emma; Munro, Kevin J; Adank, Patti

    2015-01-01

    Perceptual adaptation allows humans to recognize different varieties of accented speech. We investigated whether perceptual adaptation to accented speech is facilitated if listeners can see a speaker's facial and mouth movements. In Study 1, participants listened to sentences in a novel accent and underwent a period of training with audiovisual or audio-only speech cues, presented in quiet or in background noise. A control group also underwent training with visual-only (speech-reading) cues. We observed no significant difference in perceptual adaptation between any of the groups. To address a number of remaining questions, we carried out a second study using a different accent, speaker and experimental design, in which participants listened to sentences in a non-native (Japanese) accent with audiovisual or audio-only cues, without separate training. Participants' eye gaze was recorded to verify that they looked at the speaker's face during audiovisual trials. Recognition accuracy was significantly better for audiovisual than for audio-only stimuli; however, no statistical difference in perceptual adaptation was observed between the two modalities. Furthermore, Bayesian analysis suggested that the data supported the null hypothesis. Our results suggest that although the availability of visual speech cues may be immediately beneficial for recognition of unfamiliar accented speech in noise, it does not improve perceptual adaptation.

  9. Divided attention disrupts perceptual encoding during speech recognition.

    Science.gov (United States)

    Mattys, Sven L; Palmer, Shekeila D

    2015-03-01

    Performing a secondary task while listening to speech has a detrimental effect on speech processing, but the locus of the disruption within the speech system is poorly understood. Recent research has shown that cognitive load imposed by a concurrent visual task increases dependency on lexical knowledge during speech processing, but it does not affect lexical activation per se. This suggests that "lexical drift" under cognitive load occurs either as a post-lexical bias at the decisional level or as a secondary consequence of reduced perceptual sensitivity. This study aimed to adjudicate between these alternatives using a forced-choice task that required listeners to identify noise-degraded spoken words with or without the addition of a concurrent visual task. Adding cognitive load increased the likelihood that listeners would select a word acoustically similar to the target even though its frequency was lower than that of the target. Thus, there was no evidence that cognitive load led to a high-frequency response bias. Rather, cognitive load seems to disrupt sublexical encoding, possibly by impairing perceptual acuity at the auditory periphery.

  10. Role of working memory and lexical knowledge in perceptual restoration of interrupted speech.

    Science.gov (United States)

    Nagaraj, Naveen K; Magimairaj, Beula M

    2017-12-01

    The role of working memory (WM) capacity and lexical knowledge in perceptual restoration (PR) of missing speech was investigated using the interrupted speech perception paradigm. Speech identification ability, which indexed PR, was measured using low-context sentences periodically interrupted at 1.5 Hz. PR was measured for silent gated, low-frequency speech noise filled, and low-frequency fine-structure and envelope filled interrupted conditions. WM capacity was measured using verbal and visuospatial span tasks. Lexical knowledge was assessed using both receptive vocabulary and meaning from context tests. Results showed that PR was better for the speech noise filled condition than for the other conditions tested. Both receptive vocabulary and verbal WM capacity explained unique variance in PR for the speech noise filled condition, but were unrelated to performance in the silent gated condition. Only receptive vocabulary uniquely predicted PR for the fine-structure and envelope filled conditions. These findings suggest that the contribution of lexical knowledge and verbal WM during PR depends crucially on the information content that replaced the silent intervals. When perceptual continuity was partially restored by filler speech noise, both lexical knowledge and verbal WM capacity facilitated PR. Importantly, for fine-structure and envelope filled interrupted conditions, lexical knowledge was crucial for PR.

  11. Automated speech quality monitoring tool based on perceptual evaluation

    OpenAIRE

    Vozňák, Miroslav; Rozhon, Jan

    2010-01-01

    The paper deals with a speech quality monitoring tool which we have developed in accordance with PESQ (Perceptual Evaluation of Speech Quality); it runs automatically and calculates the MOS (Mean Opinion Score). Results are stored in a database and used in a research project investigating how meteorological conditions influence speech quality in a GSM network. The meteorological station, which is located on our university campus, provides information about temperature,...
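
    A monitoring loop of this kind can be approximated with the open-source pesq package. The sketch below is an illustration under stated assumptions: file names are placeholders, recordings are time-aligned, and the sampling rate is 8 or 16 kHz as PESQ requires:

        # Sketch: PESQ-based MOS estimate for one recorded call sample.
        # Assumes 'pip install pesq soundfile'; file names are placeholders.
        import soundfile as sf
        from pesq import pesq

        ref, fs = sf.read("reference.wav")   # clean reference signal
        deg, _  = sf.read("degraded.wav")    # same utterance after the GSM channel

        # 'nb' = narrowband mode for 8 kHz audio; use 'wb' for 16 kHz wideband.
        mos = pesq(fs, ref, deg, 'nb')
        print(f"PESQ MOS-LQO: {mos:.2f}")    # higher = better perceived quality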

  12. Perceptual speech assessment after maxillary advancement osteotomy in patients with a repaired cleft lip and palate.

    Science.gov (United States)

    Kim, Seok-Kwun; Kim, Ju-Chan; Moon, Ju-Bong; Lee, Keun-Cheol

    2012-05-01

    Maxillary hypoplasia refers to a deficiency in the growth of the maxilla commonly seen in patients with a repaired cleft palate. Those who develop maxillary hypoplasia can be offered a repositioning of the maxilla to a functional and esthetic position. Velopharyngeal dysfunction is one of the important problems affecting speech after maxillary advancement surgery. The aim of this study was to investigate the impact of maxillary advancement on repaired cleft palate patients without preoperative deterioration in speech compared with non-cleft palate patients. Eighteen patients underwent Le Fort I osteotomy between 2005 and 2011. One patient was excluded due to preoperative deterioration in speech. Eight repaired cleft palate patients belonged to group A, and 9 non-cleft palate patients belonged to group B. Speech assessments were performed preoperatively and postoperatively by using a speech screening protocol that consisted of a list of single words designed by Ok-Ran Jung. The Wilcoxon signed rank test was used to determine whether there were significant differences between the preoperative and postoperative outcomes within each group, and the Mann-Whitney U test was used to determine whether there were significant differences in the change of score between groups A and B. No patients had any noticeable change in speech production on perceptual assessment after maxillary advancement in our study. Furthermore, there were no significant differences between groups A and B. Repaired cleft palate patients without preoperative velopharyngeal dysfunction would not have greater risk of deterioration of velopharyngeal function after maxillary advancement compared to non-cleft palate patients.

  13. Methodology for perceptual assessment of speech in patients with cleft palate: a critical review of the literature.

    Science.gov (United States)

    Lohmander, Anette; Olsson, Maria

    2004-01-01

    This review of 88 articles in three international journals was undertaken for the purpose of investigating the methodology for perceptual speech assessment in patients with cleft palate. The articles were published between 1980 and 2000 in the Cleft Palate-Craniofacial Journal, the International Journal of Language and Communication Disorders, and Folia Phoniatrica et Logopaedica. The majority of articles (76) were published in the Cleft Palate-Craniofacial Journal, with an increase in articles during the 1990s and 2000. Information about measures or variables was clearly given in all articles. However, the review raises several major concerns regarding the methods for collecting and documenting data and the methods for measurement. The most distressing findings were the use of a cross-sectional design in studies of few patients with large age ranges and different types of clefts, the use of highly variable speech samples, and the lack of information about listeners and on reliability. It is hoped that ongoing national and international collaborative efforts to standardize procedures for collection and analysis of perceptual data will help to eliminate such concerns and thus make comparison of published results possible in the future.

  14. Improved Kalman Filter-Based Speech Enhancement with Perceptual Post-Filtering

    Institute of Scientific and Technical Information of China (English)

    WEI Jianqiang; DU Limin; YAN Zhaoli; ZENG Hui

    2004-01-01

    In this paper, a Kalman filter-based speech enhancement algorithm with some improvements over previous work is presented. A new technique based on spectral subtraction is used for separating speech and noise characteristics from noisy speech and for computing speech and noise autoregressive (AR) parameters. In order to obtain a Kalman filter output with high audible quality, a perceptual post-filter is placed at the output of the Kalman filter to smooth the enhanced speech spectra. Extensive experiments indicate that this newly proposed method works well.
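
    The spectral-subtraction front end can be sketched generically as follows; this is a textbook magnitude-subtraction formulation (noise estimated from leading, speech-free frames), not the authors' exact technique:

        # Sketch: magnitude spectral subtraction as a noise-estimation front end.
        # Generic formulation; parameters and framing are illustrative only.
        import numpy as np
        from scipy.signal import stft, istft

        def spectral_subtract(noisy, fs, noise_frames=10):
            f, t, X = stft(noisy, fs, nperseg=512)
            mag, phase = np.abs(X), np.angle(X)
            # Noise magnitude estimated from the first frames, assumed speech-free.
            noise_mag = mag[:, :noise_frames].mean(axis=1, keepdims=True)
            clean_mag = np.maximum(mag - noise_mag, 0.05 * mag)  # spectral floor
            _, enhanced = istft(clean_mag * np.exp(1j * phase), fs, nperseg=512)
            return enhanced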

  15. Relationship between perceptual learning in speech and statistical learning in younger and older adults

    Directory of Open Access Journals (Sweden)

    Thordis Marisa Neger

    2014-09-01

    Within a few sentences, listeners learn to understand severely degraded speech such as noise-vocoded speech. However, individuals vary in the amount of such perceptual learning and it is unclear what underlies these differences. The present study investigates whether perceptual learning in speech relates to statistical learning, as sensitivity to probabilistic information may aid identification of relevant cues in novel speech input. If statistical learning and perceptual learning (partly) draw on the same general mechanisms, then statistical learning in a non-auditory modality using non-linguistic sequences should predict adaptation to degraded speech. In the present study, 73 older adults (aged over 60 years) and 60 younger adults (aged between 18 and 30 years) performed a visual artificial grammar learning task and were presented with sixty meaningful noise-vocoded sentences in an auditory recall task. Within age groups, sentence recognition performance over exposure was analyzed as a function of statistical learning performance, and other variables that may predict learning (i.e., hearing, vocabulary, attention switching control, working memory and processing speed). Younger and older adults showed similar amounts of perceptual learning, but only younger adults showed significant statistical learning. In older adults, improvement in understanding noise-vocoded speech was constrained by age. In younger adults, amount of adaptation was associated with lexical knowledge and with statistical learning ability. Thus, individual differences in general cognitive abilities explain listeners' variability in adapting to noise-vocoded speech. Results suggest that perceptual and statistical learning share mechanisms of implicit regularity detection, but that the ability to detect statistical regularities is impaired in older adults if visual sequences are presented quickly.

  16. Perceptual and Acoustic Reliability Estimates for the Speech Disorders Classification System (SDCS)

    Science.gov (United States)

    Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.

    2010-01-01

    A companion paper describes three extensions to a classification system for paediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). The SDCS uses perceptual and acoustic data reduction methods to obtain information on a speaker's speech, prosody, and voice. The present paper provides reliability estimates for…

  17. Constraints on the Transfer of Perceptual Learning in Accented Speech

    Science.gov (United States)

    Eisner, Frank; Melinger, Alissa; Weber, Andrea

    2013-01-01

    The perception of speech sounds can be re-tuned through a mechanism of lexically driven perceptual learning after exposure to instances of atypical speech production. This study asked whether this re-tuning is sensitive to the position of the atypical sound within the word. We investigated perceptual learning using English voiced stop consonants, which are commonly devoiced in word-final position by Dutch learners of English. After exposure to a Dutch learner's productions of devoiced stops in word-final position (but not in any other positions), British English (BE) listeners showed evidence of perceptual learning in a subsequent cross-modal priming task, where auditory primes with devoiced final stops (e.g., "seed", pronounced [si:tʰ]), facilitated recognition of visual targets with voiced final stops (e.g., SEED). In Experiment 1, this learning effect generalized to test pairs where the critical contrast was in word-initial position, e.g., auditory primes such as "town" facilitated recognition of visual targets like DOWN. Control listeners, who had not heard any stops by the speaker during exposure, showed no learning effects. The generalization to word-initial position did not occur when participants had also heard correctly voiced, word-initial stops during exposure (Experiment 2), and when the speaker was a native BE speaker who mimicked the word-final devoicing (Experiment 3). The readiness of the perceptual system to generalize a previously learned adjustment to other positions within the word thus appears to be modulated by distributional properties of the speech input, as well as by the perceived sociophonetic characteristics of the speaker. The results suggest that the transfer of pre-lexical perceptual adjustments that occur through lexically driven learning can be affected by a combination of acoustic, phonological, and sociophonetic factors. PMID:23554598

  18. The role of training structure in perceptual learning of accented speech.

    Science.gov (United States)

    Tzeng, Christina Y; Alexander, Jessica E D; Sidaras, Sabrina K; Nygaard, Lynne C

    2016-11-01

    Foreign-accented speech contains multiple sources of variation that listeners learn to accommodate. Extending previous findings showing that exposure to high-variation training facilitates perceptual learning of accented speech, the current study examines to what extent the structure of training materials affects learning. During training, native adult speakers of American English transcribed sentences spoken in English by native Spanish-speaking adults. In Experiment 1, training stimuli were blocked by speaker, sentence, or randomized with respect to speaker and sentence (Variable training). At test, listeners transcribed novel English sentences produced by unfamiliar Spanish-accented speakers. Listeners' transcription accuracy was highest in the Variable condition, suggesting that varying both speaker identity and sentence across training trials enabled listeners to generalize their learning to novel speakers and linguistic content. Experiment 2 assessed the extent to which ordering of training tokens by a single factor, speaker intelligibility, would facilitate speaker-independent accent learning, finding that listeners' test performance did not reliably differ from that in the no-training control condition. Overall, these results suggest that the structure of training exposure, specifically trial-to-trial variation on both speaker's voice and linguistic content, facilitates learning of the systematic properties of accented speech. The current findings suggest a crucial role of training structure in optimizing perceptual learning. Beyond characterizing the types of variation listeners encode in their representations of spoken utterances, theories of spoken language processing should incorporate the role of training structure in learning lawful variation in speech. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  19. More than a boundary shift: Perceptual adaptation to foreign-accented speech reshapes the internal structure of phonetic categories.

    Science.gov (United States)

    Xie, Xin; Theodore, Rachel M; Myers, Emily B

    2017-01-01

    The literature on perceptual learning for speech shows that listeners use lexical information to disambiguate phonetically ambiguous speech sounds and that they maintain this new mapping for later recognition of ambiguous sounds for a given talker. Evidence for this kind of perceptual reorganization has focused on phonetic category boundary shifts. Here, we asked whether listeners adjust both category boundaries and internal category structure in rapid adaptation to foreign accents. We investigated the perceptual learning of Mandarin-accented productions of word-final voiced stops in English. After exposure to a Mandarin speaker's productions, native-English listeners' adaptation to the talker was tested in 3 ways: a cross-modal priming task to assess spoken word recognition (Experiment 1), a category identification task to assess shifts in the phonetic boundary (Experiment 2), and a goodness rating task to assess internal category structure (Experiment 3). Following exposure, both category boundary and internal category structure were adjusted; moreover, these prelexical changes facilitated subsequent word recognition. Together, the results demonstrate that listeners' sensitivity to acoustic-phonetic detail in the accented input promoted a dynamic, comprehensive reorganization of their perceptual response as a consequence of exposure to the accented input. We suggest that an examination of internal category structure is important for a complete account of the mechanisms of perceptual learning. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  20. Audiomotor Perceptual Training Enhances Speech Intelligibility in Background Noise.

    Science.gov (United States)

    Whitton, Jonathon P; Hancock, Kenneth E; Shannon, Jeffrey M; Polley, Daniel B

    2017-11-06

    Sensory and motor skills can be improved with training, but learning is often restricted to practice stimuli. As an exception, training on closed-loop (CL) sensorimotor interfaces, such as action video games and musical instruments, can impart a broad spectrum of perceptual benefits. Here we ask whether computerized CL auditory training can enhance speech understanding in levels of background noise that approximate a crowded restaurant. Elderly hearing-impaired subjects trained for 8 weeks on a CL game that, like a musical instrument, challenged them to monitor subtle deviations between predicted and actual auditory feedback as they moved their fingertip through a virtual soundscape. We performed our study as a randomized, double-blind, placebo-controlled trial by training other subjects in an auditory working-memory (WM) task. Subjects in both groups improved at their respective auditory tasks and reported comparable expectations for improved speech processing, thereby controlling for placebo effects. Whereas speech intelligibility was unchanged after WM training, subjects in the CL training group could correctly identify 25% more words in spoken sentences or digit sequences presented in high levels of background noise. Numerically, CL audiomotor training provided more than three times the benefit of our subjects' hearing aids for speech processing in noisy listening conditions. Gains in speech intelligibility could be predicted from gameplay accuracy and baseline inhibitory control. However, benefits did not persist in the absence of continuing practice. These studies employ stringent clinical standards to demonstrate that perceptual learning on a computerized audio game can transfer to "real-world" communication challenges. Copyright © 2017 Elsevier Ltd. All rights reserved.

  21. Perceptual Assessment of Velopharyngeal Dysfunction by Otolaryngology Residents.

    Science.gov (United States)

    Butts, Sydney C; Truong, Alan; Forde, Christina; Stefanov, Dimitre G; Marrinan, Eileen

    2016-12-01

    To assess the ability of otolaryngology residents to rate the hypernasal resonance of patients with velopharyngeal dysfunction. We hypothesize that experience (postgraduate year [PGY] level) and training will result in improved ratings of speech samples. Prospective cohort study. Otolaryngology training programs at 2 academic medical centers. Thirty otolaryngology residents (PGY 1-5) were enrolled in the study. All residents rated 30 speech samples at 2 separate times. Half the residents completed a training module between the rating exercises, with the other half serving as a control group. Percentage agreement with the expert rating of each speech sample and intrarater reliability were calculated for each resident. Analysis of covariance was used to model accuracy at session 2. The median percentage agreement at session 1 was 53.3% for all residents. At the second session, the median scores were 53.3% for the control group and 60% for the training group, but this difference was not statistically significant. Intrarater reliability was moderate for both groups. Residents were more accurate in their ratings of normal and severely hypernasal speech. There was no correlation between rating accuracy and PGY level. Score at session 1 positively correlated with score at session 2. Perceptual training of otolaryngology residents has the potential to improve their ratings of hypernasal speech. Length of time in residency may not be the best predictor of perceptual skill. Training modalities incorporating practice with hypernasal speech samples could improve rater skills and should be studied more extensively. © American Academy of Otolaryngology—Head and Neck Surgery Foundation 2016.

  22. Auditory Perceptual Learning for Speech Perception Can be Enhanced by Audiovisual Training.

    Science.gov (United States)

    Bernstein, Lynne E; Auer, Edward T; Eberhardt, Silvio P; Jiang, Jintao

    2013-01-01

    Speech perception under audiovisual (AV) conditions is well known to confer benefits to perception such as increased speed and accuracy. Here, we investigated how AV training might benefit or impede auditory perceptual learning of speech degraded by vocoding. In Experiments 1 and 3, participants learned paired associations between vocoded spoken nonsense words and nonsense pictures. In Experiment 1, paired-associates (PA) AV training of one group of participants was compared with audio-only (AO) training of another group. When tested under AO conditions, the AV-trained group was significantly more accurate than the AO-trained group. In addition, pre- and post-training AO forced-choice consonant identification with untrained nonsense words showed that AV-trained participants had learned significantly more than AO participants. The pattern of results pointed to their having learned at the level of the auditory phonetic features of the vocoded stimuli. Experiment 2, a no-training control with testing and re-testing on the AO consonant identification, showed that the controls were as accurate as the AO-trained participants in Experiment 1 but less accurate than the AV-trained participants. In Experiment 3, PA training alternated AV and AO conditions on a list-by-list basis within participants, and training was to criterion (92% correct). PA training with AO stimuli was reliably more effective than training with AV stimuli. We explain these discrepant results in terms of the so-called "reverse hierarchy theory" of perceptual learning and in terms of the diverse multisensory and unisensory processing resources available to speech perception. We propose that early AV speech integration can potentially impede auditory perceptual learning; but visual top-down access to relevant auditory features can promote auditory perceptual learning.

  23. Automated Intelligibility Assessment of Pathological Speech Using Phonological Features

    Directory of Open Access Journals (Sweden)

    Catherine Middag

    2009-01-01

    It is commonly acknowledged that word or phoneme intelligibility is an important criterion in the assessment of the communication efficiency of a pathological speaker. People have therefore put a lot of effort into the design of perceptual intelligibility rating tests. These tests usually have the drawback that they employ unnatural speech material (e.g., nonsense words) and that they cannot fully exclude errors due to listener bias. Therefore, there is a growing interest in the application of objective automatic speech recognition technology to automate the intelligibility assessment. Current research is headed towards the design of automated methods which can be shown to produce ratings that correspond well with those emerging from a well-designed and well-performed perceptual test. In this paper, a novel methodology that is built on previous work (Middag et al., 2008) is presented. It utilizes phonological features, automatic speech alignment based on acoustic models that were trained on normal speech, context-dependent speaker feature extraction, and intelligibility prediction based on a small model that can be trained on pathological speech samples. The experimental evaluation of the new system reveals that the root mean squared error of the discrepancies between perceived and computed intelligibilities can be as low as 8 on a scale of 0 to 100.

  24. Perceptual sensitivity to spectral properties of earlier sounds during speech categorization.

    Science.gov (United States)

    Stilp, Christian E; Assgari, Ashley A

    2018-02-28

    Speech perception is heavily influenced by surrounding sounds. When spectral properties differ between earlier (context) and later (target) sounds, this can produce spectral contrast effects (SCEs) that bias perception of later sounds. For example, when context sounds have more energy in low-F1 frequency regions, listeners report more high-F1 responses to a target vowel, and vice versa. SCEs have been reported using various approaches for a wide range of stimuli, but most often, large spectral peaks were added to the context to bias speech categorization. This obscures the lower limit of perceptual sensitivity to spectral properties of earlier sounds, i.e., when SCEs begin to bias speech categorization. Listeners categorized vowels (/ɪ/-/ɛ/, Experiment 1) or consonants (/d/-/g/, Experiment 2) following a context sentence with little spectral amplification (+1 to +4 dB) in frequency regions known to produce SCEs. In both experiments, +3 and +4 dB amplification in key frequency regions of the context produced SCEs, but lesser amplification was insufficient to bias performance. This establishes a lower limit of perceptual sensitivity where spectral differences across sounds can bias subsequent speech categorization. These results are consistent with proposed adaptation-based mechanisms that potentially underlie SCEs in auditory perception. Recent sounds can change what speech sounds we hear later. This can occur when the average frequency composition of earlier sounds differs from that of later sounds, biasing how they are perceived. These "spectral contrast effects" are widely observed when sounds' frequency compositions differ substantially. We reveal the lower limit of these effects, as +3 dB amplification of key frequency regions in earlier sounds was enough to bias categorization of the following vowel or consonant sound. Speech categorization being biased by very small spectral differences across sounds suggests that spectral contrast effects occur
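
    The context manipulation amounts to a small gain applied to one frequency band. A minimal filtering sketch of a +3 dB band boost (the band edges are placeholders, not the study's exact regions):

        # Sketch: amplify one frequency band of a context signal by ~3 dB.
        # Band edges and filter order are illustrative placeholders.
        import numpy as np
        from scipy.signal import butter, sosfiltfilt

        def boost_band(x, fs, lo_hz, hi_hz, gain_db=3.0):
            sos = butter(4, [lo_hz, hi_hz], btype="bandpass", fs=fs, output="sos")
            band = sosfiltfilt(sos, x)         # zero-phase band-passed component
            g = 10 ** (gain_db / 20.0) - 1.0   # extra gain needed for the band
            return x + g * band                # in-band energy raised by ~gain_db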

  25. Speech Adaptation to Kinematic Recording Sensors: Perceptual and Acoustic Findings

    Science.gov (United States)

    Dromey, Christopher; Hunter, Elise; Nissen, Shawn L.

    2018-01-01

    Purpose: This study used perceptual and acoustic measures to examine the time course of speech adaptation after the attachment of electromagnetic sensor coils to the tongue, lips, and jaw. Method: Twenty native English speakers read aloud stimulus sentences before the attachment of the sensors, immediately after attachment, and again 5, 10, 15,…

  26. Multisensory training can promote or impede visual perceptual learning of speech stimuli: visual-tactile vs. visual-auditory training.

    Science.gov (United States)

    Eberhardt, Silvio P; Auer, Edward T; Bernstein, Lynne E

    2014-01-01

    In a series of studies we have been investigating how multisensory training affects unisensory perceptual learning with speech stimuli. Previously, we reported that audiovisual (AV) training with speech stimuli can promote auditory-only (AO) perceptual learning in normal-hearing adults but can impede learning in congenitally deaf adults with late-acquired cochlear implants. Here, impeder and promoter effects were sought in normal-hearing adults who participated in lipreading training. In Experiment 1, visual-only (VO) training on paired associations between CVCVC nonsense word videos and nonsense pictures demonstrated that VO words could be learned to a high level of accuracy even by poor lipreaders. In Experiment 2, visual-auditory (VA) training in the same paradigm but with the addition of synchronous vocoded acoustic speech impeded VO learning of the stimuli in the paired-associates paradigm. In Experiment 3, the vocoded AO stimuli were shown to be less informative than the VO speech. Experiment 4 combined vibrotactile speech stimuli with the visual stimuli during training. Vibrotactile stimuli were shown to promote visual perceptual learning. In Experiment 5, no-training controls were used to show that training with visual speech carried over to consonant identification of untrained CVCVC stimuli but not to lipreading words in sentences. Across this and previous studies, multisensory training effects depended on the functional relationship between pathways engaged during training. Two principles are proposed to account for stimulus effects: (1) Stimuli presented to the trainee's primary perceptual pathway will impede learning by a lower-rank pathway. (2) Stimuli presented to the trainee's lower rank perceptual pathway will promote learning by a higher-rank pathway. The mechanisms supporting these principles are discussed in light of multisensory reverse hierarchy theory (RHT).

  27. Perceptual pitch deficits coexist with pitch production difficulties in music but not Mandarin speech

    Science.gov (United States)

    Yang, Wu-xia; Feng, Jie; Huang, Wan-ting; Zhang, Cheng-xiang; Nan, Yun

    2014-01-01

    Congenital amusia is a musical disorder that mainly affects pitch perception. Among Mandarin speakers, some amusics also have difficulties in processing lexical tones (tone agnosics). To examine to what extent these perceptual deficits may be related to pitch production impairments in music and Mandarin speech, eight amusics, eight tone agnosics, and 12 age- and IQ-matched normal native Mandarin speakers were asked to imitate music note sequences and Mandarin words of comparable lengths. The results indicated that both the amusics and tone agnosics underperformed the controls on musical pitch production. However, tone agnosics performed no worse than the amusics, suggesting that lexical tone perception deficits may not aggravate musical pitch production difficulties. Moreover, these three groups were all able to imitate lexical tones with perfect intelligibility. Taken together, the current study shows that perceptual musical pitch and lexical tone deficits might coexist with musical pitch production difficulties. But at the same time these perceptual pitch deficits might not affect lexical tone production or the intelligibility of the speech words that were produced. The perception-production relationship for pitch among individuals with perceptual pitch deficits may be, therefore, domain-dependent. PMID:24474944

  28. Auditory-Perceptual and Acoustic Methods in Measuring Dysphonia Severity of Korean Speech.

    Science.gov (United States)

    Maryn, Youri; Kim, Hyung-Tae; Kim, Jaeock

    2016-09-01

    The purpose of this study was to explore the criterion-related concurrent validity of two standardized auditory-perceptual rating protocols and the Acoustic Voice Quality Index (AVQI) for measuring dysphonia severity in Korean speech. Sixty native Korean subjects with various voice disorders were asked to sustain the vowel [a:] and to read aloud the Korean text "Walk." A 3-second midvowel portion of the sustained vowel and two sentences (with 25 syllables) were edited, concatenated, and analyzed according to methods described elsewhere. From 56 participants, both continuous speech and sustained vowel recordings had sufficiently high signal-to-noise ratios (35.5 dB and 37 dB on average, respectively) and were therefore subjected to further dysphonia severity analysis with (1) "G" or Grade from the GRBAS protocol, (2) "OS" or Overall Severity from the Consensus Auditory-Perceptual Evaluation of Voice protocol, and (3) AVQI. First, high correlations were found between G and OS (rS = 0.955 for sustained vowels; rS = 0.965 for continuous speech). Second, the AVQI showed a strong correlation with G (rS = 0.911) as well as OS (rP = 0.924). These findings are in agreement with similar studies dealing with continuous speech in other languages. The present study highlights the criterion-related concurrent validity of these methods in Korean speech. Furthermore, it supports the cross-linguistic robustness of the AVQI as a valid and objective marker of overall dysphonia severity. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
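
    Concurrent validity here rests on rank correlations between instrumental and perceptual scores. A minimal sketch (the paired scores below are invented for illustration):

        # Sketch: rank correlation between an acoustic index and perceptual ratings.
        # The paired scores are invented placeholders, not study data.
        from scipy.stats import spearmanr

        avqi  = [2.1, 4.8, 3.3, 6.0, 1.5, 5.2]   # instrumental index per speaker
        grade = [0, 2, 1, 3, 0, 2]               # auditory-perceptual G rating

        r_s, p = spearmanr(avqi, grade)
        print(f"rS = {r_s:.3f}, p = {p:.4f}")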

  29. Plasticity in the Human Speech Motor System Drives Changes in Speech Perception

    Science.gov (United States)

    Lametti, Daniel R.; Rochet-Capellan, Amélie; Neufeld, Emily; Shiller, Douglas M.

    2014-01-01

    Recent studies of human speech motor learning suggest that learning is accompanied by changes in auditory perception. But what drives the perceptual change? Is it a consequence of changes in the motor system? Or is it a result of sensory inflow during learning? Here, subjects participated in a speech motor-learning task involving adaptation to altered auditory feedback and they were subsequently tested for perceptual change. In two separate experiments, involving two different auditory perceptual continua, we show that changes in the speech motor system that accompany learning drive changes in auditory speech perception. Specifically, we obtained changes in speech perception when adaptation to altered auditory feedback led to speech production that fell into the phonetic range of the speech perceptual tests. However, a similar change in perception was not observed when the auditory feedback that subjects received during learning fell into the phonetic range of the perceptual tests. This indicates that the central motor outflow associated with vocal sensorimotor adaptation drives changes to the perceptual classification of speech sounds. PMID:25080594

  30. Subjective and Objective Quality Assessment of Single-Channel Speech Separation Algorithms

    DEFF Research Database (Denmark)

    Mowlaee, Pejman; Saeidi, Rahim; Christensen, Mads Græsbøll

    2012-01-01

    Previous studies on performance evaluation of single-channel speech separation (SCSS) algorithms mostly focused on automatic speech recognition (ASR) accuracy as their performance measure. Assessing the separated signals by different metrics other than this has the benefit that the results are expected to carry on to other applications beyond ASR. In this paper, in addition to conventional speech quality metrics (PESQ and SNRloss), we also evaluate the separation systems' output using different source separation metrics: blind source separation evaluation (BSS EVAL) and perceptual evaluation … that PESQ and PEASS quality metrics predict well the subjective quality of separated signals obtained by the separation systems. From the results it is observed that the short-time objective intelligibility (STOI) measure predicts the speech intelligibility results.
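
    The STOI measure named at the end is available as the open-source pystoi package; a minimal sketch (file names are placeholders, and both signals must share one sampling rate):

        # Sketch: STOI intelligibility score for one separated source.
        # Assumes 'pip install pystoi soundfile'; file names are placeholders.
        import soundfile as sf
        from pystoi import stoi

        clean, fs = sf.read("clean_source.wav")      # reference source signal
        sep, _    = sf.read("separated_source.wav")  # output of the SCSS system

        score = stoi(clean, sep, fs, extended=False) # ~0..1, higher = more intelligible
        print(f"STOI: {score:.3f}")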

  31. A Measure of the Auditory-perceptual Quality of Strain from Electroglottographic Analysis of Continuous Dysphonic Speech: Application to Adductor Spasmodic Dysphonia.

    Science.gov (United States)

    Somanath, Keerthan; Mau, Ted

    2016-11-01

    (1) To develop an automated algorithm to analyze the electroglottographic (EGG) signal in continuous dysphonic speech, and (2) to identify EGG waveform parameters that correlate with the auditory-perceptual quality of strain in the speech of patients with adductor spasmodic dysphonia (ADSD). Software development with application in a prospective controlled study. EGG was recorded from 12 normal speakers and 12 subjects with ADSD reading excerpts from the Rainbow Passage. Data were processed by a new algorithm developed with the specific goal of analyzing continuous dysphonic speech. The contact quotient, pulse width, a new parameter peak skew, and various contact closing slope quotient and contact opening slope quotient measures were extracted. EGG parameters were compared between normal and ADSD speech. Within the ADSD group, intra-subject comparison was also made between perceptually strained syllables and unstrained syllables. The opening slope quotient SO7525 distinguished strained syllables from unstrained syllables in continuous speech within individual subjects with ADSD. The standard deviations, but not the means, of contact quotient, EGGW50, peak skew, and SO7525 were different between normal and ADSD speakers. The strain-stress pattern in continuous speech can be visualized as color gradients based on the variation of EGG parameter values. EGG parameters may provide a within-subject measure of vocal strain and serve as a marker for treatment response. The addition of EGG to multidimensional assessment may lead to improved characterization of the voice disturbance in ADSD. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
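
    One of the extracted parameters, the contact quotient, can be approximated per glottal cycle with a criterion-level method. The sketch below assumes a common 25%-of-amplitude contact criterion; it is not the paper's own algorithm:

        # Sketch: criterion-level contact quotient (CQ) for one EGG cycle.
        # The 25% criterion is a common convention, assumed here for illustration.
        import numpy as np

        def contact_quotient(cycle, criterion=0.25):
            """cycle: one period of the EGG waveform (higher value = more contact)."""
            lo, hi = cycle.min(), cycle.max()
            level = lo + criterion * (hi - lo)    # contact threshold level
            return float(np.mean(cycle > level))  # fraction of period in contact phase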

  32. Auditory-perceptual speech analysis in children with cerebellar tumours: a long-term follow-up study.

    Science.gov (United States)

    De Smet, Hyo Jung; Catsman-Berrevoets, Coriene; Aarsen, Femke; Verhoeven, Jo; Mariën, Peter; Paquier, Philippe F

    2012-09-01

    Mutism and Subsequent Dysarthria (MSD) and the Posterior Fossa Syndrome (PFS) have become well-recognized clinical entities which may develop after resection of cerebellar tumours. However, speech characteristics following a period of mutism have not been documented in much detail. This study carried out a perceptual speech analysis in 24 children and adolescents (of whom 12 became mute in the immediate postoperative phase) 1-12.2 years after cerebellar tumour resection. The most prominent speech deficits in this study were distorted vowels, slow rate, voice tremor, and monopitch. Factors influencing long-term speech disturbances are presence or absence of postoperative PFS, the localisation of the surgical lesion and the type of adjuvant treatment. Long-term speech deficits may be present up to 12 years post-surgery. The speech deficits found in children and adolescents with cerebellar lesions following cerebellar tumour surgery do not necessarily resemble adult speech characteristics of ataxic dysarthria. Copyright © 2012 European Paediatric Neurology Society. Published by Elsevier Ltd. All rights reserved.

  33. Susceptibility to a multisensory speech illusion in older persons is driven by perceptual processes

    Directory of Open Access Journals (Sweden)

    Annalisa Setti

    2013-09-01

    Recent studies suggest that multisensory integration is enhanced in older adults but it is not known whether this enhancement is solely driven by perceptual processes or affected by cognitive processes. Using the 'McGurk illusion', in Experiment 1 we found that audio-visual integration of incongruent audio-visual words was higher in older adults than in younger adults, although the recognition of either audio- or visual-only presented words was the same across groups. In Experiment 2 we tested recall of sentences within which an incongruent audio-visual speech word was embedded. The overall semantic meaning of the sentence was compatible with either one of the unisensory components of the target word and/or with the illusory percept. Older participants recalled more illusory audio-visual words in sentences than younger adults; however, there was no differential effect of word compatibility on recall for the two groups. Our findings suggest that the relatively high susceptibility to the audio-visual speech illusion in older participants is due more to perceptual than cognitive processing.

  34. Perceptual chunking and its effect on memory in speech processing: ERP and behavioral evidence.

    Science.gov (United States)

    Gilbert, Annie C; Boucher, Victor J; Jemel, Boutheina

    2014-01-01

    We examined how perceptual chunks of varying size in utterances can influence immediate memory of heard items (monosyllabic words). Using behavioral measures and event-related potentials (N400) we evaluated the quality of the memory trace for targets taken from perceived temporal groups (TGs) of three and four items. Variations in the amplitude of the N400 showed a better memory trace for items presented in TGs of three compared to those in groups of four. Analyses of behavioral responses along with P300 components also revealed effects of chunk position in the utterance. This is the first study to measure the online effects of perceptual chunks on the memory trace of spoken items. Taken together, the N400 and P300 responses demonstrate that the perceptual chunking of speech facilitates information buffering and processing on a chunk-by-chunk basis.

  35. Use of Crowdsourcing to Assess the Ecological Validity of Perceptual-Training Paradigms in Dysarthria.

    Science.gov (United States)

    Lansford, Kaitlin L; Borrie, Stephanie A; Bystricky, Lukas

    2016-05-01

    It has been documented in laboratory settings that familiarizing listeners with dysarthric speech improves intelligibility of that speech. If these findings can be replicated in real-world settings, the ability to improve communicative function by focusing on communication partners has major implications for extending clinical practice in dysarthria rehabilitation. An important step toward development of a listener-targeted treatment approach requires establishment of its ecological validity. To this end, the present study leveraged the mechanism of crowdsourcing to determine whether perceptual-training benefits achieved by listeners in the laboratory could be elicited in an at-home computer-based scenario. Perceptual-training data (i.e., intelligibility scores from a posttraining transcription task) were collected from listeners in 2 settings: the laboratory and the crowdsourcing website Amazon Mechanical Turk. Consistent with previous findings, results revealed a main effect of training condition (training vs. control) on intelligibility scores. There was, however, no effect of training setting (Mechanical Turk vs. laboratory). Thus, the perceptual benefit achieved via Mechanical Turk was comparable to that achieved in the laboratory. This study provides evidence regarding the ecological validity of perceptual-training paradigms designed to improve intelligibility of dysarthric speech, thereby supporting their continued advancement as a listener-targeted treatment option.

  36. Perceptual effects of noise reduction by time-frequency masking of noisy speech.

    Science.gov (United States)

    Brons, Inge; Houben, Rolph; Dreschler, Wouter A

    2012-10-01

    Time-frequency masking is a method for noise reduction that is based on the time-frequency representation of a speech-in-noise signal. Depending on the estimated signal-to-noise ratio (SNR), each time-frequency unit is either attenuated or not. A special type of time-frequency mask is the ideal binary mask (IBM), which has access to the real SNR (ideal). The IBM either retains or removes each time-frequency unit (binary mask). The IBM provides large improvements in speech intelligibility and is a valuable tool for investigating how different factors influence intelligibility. This study extends the standard outcome measure (speech intelligibility) with additional perceptual measures relevant for noise reduction: listening effort, noise annoyance, speech naturalness, and overall preference. Four types of time-frequency masking were evaluated: the original IBM, a tempered version of the IBM (called ITM) which applies limited and non-binary attenuation, and non-ideal masking (also tempered) with two different types of noise-estimation algorithms. The results from ideal masking imply that there is a trade-off between intelligibility and sound quality, which depends on the attenuation strength. Additionally, the results for non-ideal masking suggest that subjective measures can show effects of noise reduction even if noise reduction does not lead to differences in intelligibility.
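
    When the clean speech and the noise are available separately, the ideal binary mask is straightforward to construct. A minimal sketch assuming a 0 dB local criterion (the threshold value is an assumption):

        # Sketch: ideal binary mask (IBM) from separately available speech and noise.
        # A 0 dB local criterion is assumed; the paper also tempers the mask (ITM).
        import numpy as np
        from scipy.signal import stft, istft

        def ibm_enhance(speech, noise, fs, lc_db=0.0, nperseg=512):
            _, _, S = stft(speech, fs, nperseg=nperseg)
            _, _, N = stft(noise, fs, nperseg=nperseg)
            snr_db = 20 * np.log10((np.abs(S) + 1e-12) / (np.abs(N) + 1e-12))
            mask = (snr_db > lc_db).astype(float)  # retain unit only if SNR > criterion
            _, out = istft(mask * (S + N), fs, nperseg=nperseg)  # S + N = mixture STFT
            return out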

  20. Utility and accuracy of perceptual voice and speech distinctions in the diagnosis of Parkinson's disease, PSP and MSA-P.

    Science.gov (United States)

    Miller, Nick; Nath, Uma; Noble, Emma; Burn, David

    2017-06-01

    To determine whether perceptual speech measures distinguish people with Parkinson's disease (PD), multiple system atrophy with predominant parkinsonism (MSA-P) and progressive supranuclear palsy (PSP). Speech-language therapists blind to patient characteristics employed clinical rating scales to evaluate speech/voice in 24 people with clinically diagnosed PD, 17 with PSP and 9 with MSA-P, matched for disease duration (mean 4.9 years, standard deviation 2.2). No consistent intergroup differences appeared on specific speech/voice variables. People with PD were significantly less impaired on overall speech/voice severity. Analyses by severity suggested that further investigation of laryngeal, resonance and fluency changes may help characterize individual groups. MSA-P and PSP were distinguished from PD by the severity of speech/voice deterioration, but individual speech/voice parameters failed to consistently differentiate the groups.

  1. Acoustic and Perceptual Analyses of Adductor Spasmodic Dysphonia in Mandarin-speaking Chinese.

    Science.gov (United States)

    Chen, Zhipeng; Li, Jingyuan; Ren, Qingyi; Ge, Pingjiang

    2018-02-12

    The objective of this study was to examine the perceptual structure and acoustic characteristics of speech of patients with adductor spasmodic dysphonia (ADSD) in Mandarin. This was a case-control study. For the estimation of dysphonia level, perceptual and acoustic analyses were used for patients with ADSD (N = 20) and a control group (N = 20), all Mandarin-Chinese speakers. For both groups, a sustained vowel and connected speech samples were obtained, and the differences in perceptual and acoustic parameters between the two groups were assessed and analyzed. On acoustic assessment, the percentage of phonatory breaks (PBs) in connected reading and the percentages of aperiodic segments and frequency shifts (FS) in the vowel and reading samples were significantly worse in patients with ADSD than in controls, as were the mean harmonics-to-noise ratio and the fundamental frequency standard deviation of the vowel. On perceptual evaluation, the ratings of speech and vowel productions in patients with ADSD were significantly higher than in controls. The percentage of aberrant acoustic events (PB, frequency shift, and aperiodic segment), the fundamental frequency standard deviation, and the mean harmonics-to-noise ratio were significantly correlated with the perceptual ratings of the vowel and reading productions. The perceptual and acoustic parameters of the vowel and connected reading in patients with ADSD are worse than those in normal controls, and they can validly and reliably estimate the dysphonia of ADSD in Mandarin-speaking Chinese. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  2. Objective voice and speech analysis of persons with chronic hoarseness by prosodic analysis of speech samples.

    Science.gov (United States)

    Haderlein, Tino; Döllinger, Michael; Matoušek, Václav; Nöth, Elmar

    2016-10-01

    Automatic voice assessment is often performed using sustained vowels. In contrast, speech analysis of read-out texts can be applied to voice and speech assessment. Automatic speech recognition and prosodic analysis were used to find regression formulae between automatic and perceptual assessment of four voice and four speech criteria. The regression was trained with 21 men and 62 women (average age 49.2 years) and tested with another set of 24 men and 49 women (48.3 years), all suffering from chronic hoarseness. They read the text 'Der Nordwind und die Sonne' ('The North Wind and the Sun'). Five voice and speech therapists evaluated the data on 5-point Likert scales. Ten prosodic and recognition accuracy measures (features) were identified which describe all the examined criteria. Inter-rater correlation within the expert group was between r = 0.63 for the criterion 'match of breath and sense units' and r = 0.87 for the overall voice quality. Human-machine correlation was between r = 0.40 for the match of breath and sense units and r = 0.82 for intelligibility. The perceptual ratings of different criteria were highly correlated with each other. Likewise, the feature sets modeling the criteria were very similar. The automatic method is suitable for assessing chronic hoarseness in general and for subgroups of functional and organic dysphonia. In its current version, it is almost as reliable as a randomly picked rater from a group of voice and speech therapists.

  3. Cognitive-perceptual examination of remediation approaches to hypokinetic dysarthria.

    Science.gov (United States)

    McAuliffe, Megan J; Kerr, Sarah E; Gibson, Elizabeth M R; Anderson, Tim; LaShell, Patrick J

    2014-08-01

    To determine how increased vocal loudness and reduced speech rate affect listeners' cognitive-perceptual processing of hypokinetic dysarthric speech associated with Parkinson's disease. Fifty-one healthy listener participants completed a speech perception experiment. Listeners repeated phrases produced by 5 individuals with dysarthria across habitual, loud, and slow speaking modes. Listeners were allocated to habitual (n = 17), loud (n = 17), or slow (n = 17) experimental conditions. Transcripts derived from the phrase repetition task were coded for overall accuracy (i.e., intelligibility), and perceptual error analyses examined how these conditions affected listeners' phonemic mapping (i.e., syllable resemblance) and lexical segmentation (i.e., lexical boundary error analysis). Both speech conditions provided obvious perceptual benefits to listeners. Overall, transcript accuracy was highest in the slow condition. In the loud condition, however, improvement was evidenced across the experiment. An error analysis suggested that listeners in the loud condition prioritized acoustic-phonetic cues in their attempts to resolve the degraded signal, whereas those in the slow condition appeared to preferentially weight lexical stress cues. Increased loudness and reduced rate exhibited differential effects on listeners' perceptual processing of dysarthric speech. The current study highlights the insights that may be gained from a cognitive-perceptual approach.

  4. Reliability in perceptual analysis of voice quality.

    Science.gov (United States)

    Bele, Irene Velsvik

    2005-12-01

    This study focuses on speaking voice quality in male teachers (n = 35) and male actors (n = 36), who represent untrained and trained voice users, because we wanted to investigate normal and supranormal voices. Both substantive and methodologic aspects were considered: the study includes a method for perceptual voice evaluation, and a basic issue was rater reliability. A listening group of 10 listeners (7 experienced speech-language therapists and 3 speech-language therapy students) evaluated the voices on 15 vocal characteristics using visual analogue (VA) scales. Two sets of voice signals were investigated: text reading (2 loudness levels) and sustained vowel (3 levels). The results indicated high interrater reliability for most perceptual characteristics. Both types of voice signals were evaluated reliably, although reliability was somewhat higher for connected speech than for vowels, especially at the normal loudness level. Experienced listeners tended to be more consistent in their ratings than the student raters. Some vocal characteristics achieved acceptable reliability even with a smaller panel of listeners. The perceptual characteristics grouped into 4 factors that reflected perceptual dimensions.
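
    As a rough illustration of the rater-agreement summaries reported in studies like this one, the sketch below averages pairwise Pearson correlations across a panel of raters. The array shape and values are hypothetical, and the study's actual reliability statistic may differ.

    ```python
    import numpy as np
    from itertools import combinations

    def mean_pairwise_r(ratings):
        """Average pairwise Pearson correlation across raters.

        `ratings` has shape (n_raters, n_voices): one row of scale
        scores per rater. This is one simple inter-rater summary;
        intraclass correlations are another common choice.
        """
        ratings = np.asarray(ratings, dtype=float)
        pairs = combinations(range(len(ratings)), 2)
        rs = [np.corrcoef(ratings[i], ratings[j])[0, 1] for i, j in pairs]
        return float(np.mean(rs))

    # Three hypothetical raters scoring five voices on a 0-100 VA scale.
    panel = [[12, 55, 70, 33, 90],
             [18, 50, 75, 30, 85],
             [10, 60, 65, 40, 95]]
    print(round(mean_pairwise_r(panel), 2))
    ```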

  5. Turkish- and English-speaking children display sensitivity to perceptual context in the referring expressions they produce in speech and gesture

    Science.gov (United States)

    Demir, Özlem Ece; So, Wing-Chee; Özyürek, Asli; Goldin-Meadow, Susan

    2012-01-01

    Speakers choose a particular expression based on many factors, including availability of the referent in the perceptual context. We examined whether, when expressing referents, monolingual English- and Turkish-speaking children: (1) are sensitive to perceptual context, (2) express this sensitivity in language-specific ways, and (3) use co-speech gestures to specify referents that are underspecified. We also explored the mechanisms underlying children’s sensitivity to perceptual context. Children described short vignettes to an experimenter under two conditions: The characters in the vignettes were present in the perceptual context (perceptual context); the characters were absent (no perceptual context). Children routinely used nouns in the no perceptual context condition, but shifted to pronouns (English-speaking children) or omitted arguments (Turkish-speaking children) in the perceptual context condition. Turkish-speaking children used underspecified referents more frequently than English-speaking children in the perceptual context condition; however, they compensated for the difference by using gesture to specify the forms. Gesture thus gives children learning structurally different languages a way to achieve comparable levels of specification while at the same time adhering to the referential expressions dictated by their language. PMID:22904588

  6. Electrophysiological assessment of audiovisual integration in speech perception

    DEFF Research Database (Denmark)

    Eskelund, Kasper; Dau, Torsten

    Speech perception integrates signal from ear and eye. This is witnessed by a wide range of audiovisual integration effects, such as ventriloquism and the McGurk illusion. Some behavioral evidence suggests that audiovisual integration of specific aspects is special for speech perception. However, our...... knowledge of such bimodal integration would be strengthened if the phenomena could be investigated by objective, neurally based methods. One key question of the present work is whether perceptual processing of audiovisual speech can be gauged with a specific signature of neurophysiological activity...... on the auditory speech percept? In two experiments, which both combine behavioral and neurophysiological measures, an uncovering of the relation between perception of faces and of audiovisual integration is attempted. Behavioral findings suggest a strong effect of face perception, whereas the MMN results are less...

  7. Comparative analysis of perceptual evaluation, acoustic analysis and indirect laryngoscopy for vocal assessment of a population with vocal complaint.

    Science.gov (United States)

    Nemr, Kátia; Amar, Ali; Abrahão, Marcio; Leite, Grazielle Capatto de Almeida; Köhle, Juliana; Santos, Alexandra de O; Correa, Luiz Artur Costa

    2005-01-01

    As a result of the evolution and development of technology, methods of voice evaluation have changed in both medical and speech-language pathology practice. To relate the results of perceptual evaluation, acoustic analysis and medical evaluation in the diagnosis of vocal and/or laryngeal affections in a population with vocal complaint. Clinical prospective study. 29 people who attended a vocal health protection campaign were evaluated. They were submitted to perceptual evaluation (AFPA), acoustic analysis (AA), indirect laryngoscopy (LI) and telelaryngoscopy (TL). Correlations between medical and speech-language pathology evaluation methods were established, verifying possible statistical significance with Fisher's exact test. There were statistically significant results in the correlations between AFPA and LI, AFPA and TL, and LI and TL. This study, conducted in a vocal health protection campaign, showed correlations between perceptual speech-language pathology evaluation and clinical evaluation in the diagnosis of vocal and/or laryngeal affections.

  8. Cross-language and second language speech perception

    DEFF Research Database (Denmark)

    Bohn, Ocke-Schwen

    2017-01-01

    in cross-language and second language speech perception research: The mapping issue (the perceptual relationship of sounds of the native and the nonnative language in the mind of the native listener and the L2 learner), the perceptual and learning difficulty/ease issue (how this relationship may or may...... not cause perceptual and learning difficulty), and the plasticity issue (whether and how experience with the nonnative language affects the perceptual organization of speech sounds in the mind of L2 learners). One important general conclusion from this research is that perceptual learning is possible at all...

  9. Speech Intelligibility Evaluation for Mobile Phones

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Cubick, Jens; Dau, Torsten

    2015-01-01

    In the development process of modern telecommunication systems, such as mobile phones, it is common practice to use computer models to objectively evaluate the transmission quality of the system, instead of time-consuming perceptual listening tests. Such models have typically focused on the quality...... of the transmitted speech, while little or no attention has been provided to speech intelligibility. The present study investigated to what extent three state-of-the art speech intelligibility models could predict the intelligibility of noisy speech transmitted through mobile phones. Sentences from the Danish...... Dantale II speech material were mixed with three different kinds of background noise, transmitted through three different mobile phones, and recorded at the receiver via a local network simulator. The speech intelligibility of the transmitted sentences was assessed by six normal-hearing listeners...

  10. The Effects of Meaning-Based Auditory Training on Behavioral Measures of Perceptual Effort in Individuals with Impaired Hearing.

    Science.gov (United States)

    Sommers, Mitchell S; Tye-Murray, Nancy; Barcroft, Joe; Spehar, Brent P

    2015-11-01

    There has been considerable interest in measuring the perceptual effort required to understand speech, as well as to identify factors that might reduce such effort. In the current study, we investigated whether, in addition to improving speech intelligibility, auditory training also could reduce perceptual or listening effort. Perceptual effort was assessed using a modified version of the n-back memory task in which participants heard lists of words presented without background noise and were asked to continually update their memory of the three most recently presented words. Perceptual effort was indexed by memory for items in the three-back position immediately before, immediately after, and 3 months after participants completed the Computerized Learning Exercises for Aural Rehabilitation (clEAR), a 12-session computerized auditory training program. Immediate posttraining measures of perceptual effort indicated that participants could remember approximately one additional word compared to pretraining. Moreover, some training gains were retained at the 3-month follow-up, as indicated by significantly greater recall for the three-back item at the 3-month measurement than at pretest. There was a small but significant correlation between gains in intelligibility and gains in perceptual effort. The findings are discussed within the framework of a limited-capacity speech perception system.
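
    A toy scoring scheme for this modified n-back measure is sketched below: after each new word, the listener reports the three most recent words, and memory is scored at the 3-back position. The function name and the credit rule (accepting the 3-back word anywhere in the reported triple) are assumptions for illustration, not the study's procedure.

    ```python
    from typing import Sequence

    def score_three_back(presented: Sequence[str],
                         reports: Sequence[Sequence[str]]) -> float:
        """Proportion of probes on which the 3-back word was recalled.

        `presented` is the full word list; `reports[k]` is the triple the
        listener reported after hearing word k+2 (the first point at
        which three words have been heard).
        """
        hits, probes = 0, 0
        for i, triple in enumerate(reports, start=2):
            target = presented[i - 2]      # the 3-back item at this probe
            probes += 1
            hits += int(target in triple)  # credit recall in any position
        return hits / probes if probes else 0.0

    words = ["cat", "door", "lamp", "fish", "rope"]
    reports = [["cat", "door", "lamp"],   # after "lamp": 3-back = "cat"
               ["door", "lamp", "fish"],  # after "fish": 3-back = "door"
               ["lamp", "rope", "fish"]]  # after "rope": 3-back = "lamp"
    print(score_three_back(words, reports))  # 1.0
    ```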

  11. The speech perception skills of children with and without speech sound disorder.

    Science.gov (United States)

    Hearnshaw, Stephanie; Baker, Elise; Munro, Natalie

    To investigate whether Australian-English speaking children with and without speech sound disorder (SSD) differ in their overall speech perception accuracy. Additionally, to investigate differences in the perception of specific phonemes and the association between speech perception and speech production skills. Twenty-five Australian-English speaking children aged 48-60 months participated in this study. The SSD group included 12 children and the typically developing (TD) group included 13 children. Children completed routine speech and language assessments in addition to an experimental Australian-English lexical and phonetic judgement task based on Rvachew's Speech Assessment and Interactive Learning System (SAILS) program (Rvachew, 2009). This task included eight words across four word-initial phonemes: /k, ɹ, ʃ, s/. Children with SSD showed significantly poorer perceptual accuracy on the lexical and phonetic judgement task compared with TD peers. The phonemes /ɹ/ and /s/ were most frequently perceived in error across both groups. Additionally, the phoneme /ɹ/ was most commonly produced in error. There was also a positive correlation between overall speech perception and speech production scores. Children with SSD perceived speech less accurately than their typically developing peers. The findings suggest that an Australian-English variation of a lexical and phonetic judgement task similar to the SAILS program is promising and worthy of a larger scale study. Copyright © 2017 Elsevier Inc. All rights reserved.

  12. High-frequency energy in singing and speech

    Science.gov (United States)

    Monson, Brian Bruce

    While human speech and the human voice generate acoustical energy up to (and beyond) 20 kHz, the energy above approximately 5 kHz has been largely neglected. Evidence is accruing that this high-frequency energy contains perceptual information relevant to speech and voice, including percepts of quality, localization, and intelligibility. The present research was an initial step in the long-range goal of characterizing high-frequency energy in singing voice and speech, with particular regard for its perceptual role and its potential for modification during voice and speech production. In this study, a database of high-fidelity recordings of talkers was created and used for a broad acoustical analysis and general characterization of high-frequency energy, as well as specific characterization of phoneme category, voice and speech intensity level, and mode of production (speech versus singing) by high-frequency energy content. Directionality of radiation of high-frequency energy from the mouth was also examined. The recordings were used for perceptual experiments wherein listeners were asked to discriminate between speech and voice samples that differed only in high-frequency energy content. Listeners were also subjected to gender discrimination tasks, mode-of-production discrimination tasks, and transcription tasks with samples of speech and singing that contained only high-frequency content. The combination of these experiments has revealed that (1) human listeners are able to detect very subtle level changes in high-frequency energy, and (2) human listeners are able to extract significant perceptual information from high-frequency energy.

  13. Dysarthric Bengali speech: A neurolinguistic study

    Directory of Open Access Journals (Sweden)

    Chakraborty N

    2008-01-01

    Background and Aims: Dysarthria affects linguistic domains such as respiration, phonation, articulation, resonance and prosody due to upper motor neuron, lower motor neuron, cerebellar or extrapyramidal tract lesions. Although Bengali is one of the major languages globally, dysarthric Bengali speech has not been subjected to neurolinguistic analysis. We attempted such an analysis with the goal of identifying the speech defects in native Bengali speakers in various types of dysarthria encountered in neurological disorders. Settings and Design: A cross-sectional observational study was conducted with 66 dysarthric subjects, predominantly middle-aged males, attending the Neuromedicine OPD of a tertiary care teaching hospital in Kolkata. Materials and Methods: After neurological examination, an instrument comprising commonly used Bengali words and a text block covering all Bengali vowels and consonants was used to carry out perceptual analysis of dysarthric speech. From recorded speech, 24 parameters pertaining to five linguistic domains were assessed. The Kruskal-Wallis analysis of variance, chi-square test and Fisher's exact test were used for analysis. Results: The dysarthria types were spastic (15 subjects), flaccid (10), mixed (12), hypokinetic (12), hyperkinetic (9) and ataxic (8). Of the 24 parameters assessed, 15 were found to occur in one or more types with a prevalence of at least 25%. Imprecise consonants were the most frequently occurring defect in most dysarthria types. The spectrum of defects in each type was identified. Some parameters were capable of distinguishing between types. Conclusions: This perceptual analysis has defined the linguistic defects likely to be encountered in dysarthric Bengali speech in neurological disorders. The speech distortion can be described and distinguished by a limited number of parameters. This may be of importance to the speech therapist and neurologist in planning rehabilitation and further management.
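
    For readers unfamiliar with the named tests, the group comparison reported here (severity of a speech parameter across dysarthria types) is the kind of analysis the Kruskal-Wallis test handles. The sketch below uses invented ratings purely for illustration.

    ```python
    from scipy import stats

    # Hypothetical 0-4 severity ratings for one speech parameter,
    # grouped by dysarthria type (values are illustrative only).
    spastic = [3, 4, 2, 4, 3]
    flaccid = [1, 2, 2, 1]
    ataxic = [2, 3, 1, 2]

    h_stat, p_value = stats.kruskal(spastic, flaccid, ataxic)
    print(f"H = {h_stat:.2f}, p = {p_value:.3f}")
    ```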

  14. Cognitive-Perceptual Examination of Remediation Approaches to Hypokinetic Dysarthria

    Science.gov (United States)

    McAuliffe, Megan J.; Kerr, Sarah E.; Gibson, Elizabeth M. R.; Anderson, Tim; LaShell, Patrick J.

    2014-01-01

    Purpose: To determine how increased vocal loudness and reduced speech rate affect listeners' cognitive-perceptual processing of hypokinetic dysarthric speech associated with Parkinson's disease. Method: Fifty-one healthy listener participants completed a speech perception experiment. Listeners repeated phrases produced by 5 individuals…

  15. Perceptual statistical learning over one week in child speech production.

    Science.gov (United States)

    Richtsmeier, Peter T; Goffman, Lisa

    2017-07-01

    What cognitive mechanisms account for the trajectory of speech sound development, in particular, gradually increasing accuracy during childhood? An intriguing potential contributor is statistical learning, a type of learning that has been studied frequently in infant perception but less often in child speech production. To assess the relevance of statistical learning to developing speech accuracy, we carried out a statistical learning experiment with four- and five-year-olds in which statistical learning was examined over one week. Children were familiarized with and tested on word-medial consonant sequences in novel words. There was only modest evidence for statistical learning, primarily in the first few productions of the first session. This initial learning effect nevertheless aligns with previous statistical learning research. Furthermore, the overall learning effect was similar to an estimate of weekly accuracy growth based on normative studies. The results implicate other important factors in speech sound development, particularly learning via production. Copyright © 2017 Elsevier Inc. All rights reserved.

  16. Effect of classic uvulopalatopharyngoplasty and laser-assisted uvulopalatopharyngoplasty on voice acoustics and speech nasalance

    International Nuclear Information System (INIS)

    Mahmoud Y Abu El-ella

    2010-01-01

    Uvulopalatopharyngoplasty (UPPP) is a commonly used surgical technique for oropharyngeal reconstruction in patients with obstructive sleep apnea (OSA). This procedure can be done either through the classic or the laser-assisted uvulopalatopharyngoplasty (LAUP) technique. The purpose of this study was to evaluate the effect of classic UPPP and LAUP on voice acoustics and speech nasalance, and to compare the effect of each operation on these two domains. Patients and Methods: The study included 27 patients with a mean age of 46 years. All patients were diagnosed with OSA based on polysomnographic examination. Patients were divided into two groups according to the type of surgical procedure: fifteen patients underwent classic UPPP, whereas 12 patients were subjected to LAUP. A full assessment was done for all patients preoperatively and postoperatively, including auditory perceptual assessment (APA) of voice and speech and objective assessment using acoustic voice analysis and nasometry. Auditory perceptual assessment of speech and voice, acoustic analysis of voice and nasometric analysis of speech did not show statistically significant differences between the preoperative and postoperative evaluations in either group (P > .05). The results of this study demonstrated that in patients with OSA, the surgical technique, whether classic UPPP or LAUP, does not have significant effects on the patients' voice quality or their speech outcomes.
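
    The nasometric measure referenced here reduces to the nasalance score: the share of acoustic energy captured by a nasal-channel microphone relative to the nasal-plus-oral total. A minimal sketch follows; real nasometers band-pass filter both channels and average frame by frame, which is omitted here.

    ```python
    import numpy as np

    def nasalance_percent(nasal: np.ndarray, oral: np.ndarray) -> float:
        """Nasalance (%) = nasal energy / (nasal + oral energy) * 100."""
        nasal_rms = np.sqrt(np.mean(np.square(nasal)))
        oral_rms = np.sqrt(np.mean(np.square(oral)))
        return 100.0 * nasal_rms / (nasal_rms + oral_rms)

    # Toy two-channel recording: equal energy gives ~50% nasalance.
    rng = np.random.default_rng(1)
    nasal = rng.normal(size=8000)
    oral = rng.normal(size=8000)
    print(round(nasalance_percent(nasal, oral), 1))
    ```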

  17. What Information Is Necessary for Speech Categorization? Harnessing Variability in the Speech Signal by Integrating Cues Computed Relative to Expectations

    Science.gov (United States)

    McMurray, Bob; Jongman, Allard

    2011-01-01

    Most theories of categorization emphasize how continuous perceptual information is mapped to categories. However, equally important are the informational assumptions of a model, the type of information subserving this mapping. This is crucial in speech perception where the signal is variable and context dependent. This study assessed the…

  18. Effects of semantic context and feedback on perceptual learning of speech processed through an acoustic simulation of a cochlear implant.

    Science.gov (United States)

    Loebach, Jeremy L; Pisoni, David B; Svirsky, Mario A

    2010-02-01

    The effect of feedback and materials on perceptual learning was examined in listeners with normal hearing who were exposed to cochlear implant simulations. Generalization was most robust when feedback paired the spectrally degraded sentences with their written transcriptions, promoting mapping between the degraded signal and its acoustic-phonetic representation. Transfer-appropriate processing theory suggests that such feedback was most successful because the original learning conditions were reinstated at testing: Performance was facilitated when both training and testing contained degraded stimuli. In addition, the effect of semantic context on generalization was assessed by training listeners on meaningful or anomalous sentences. Training with anomalous sentences was as effective as that with meaningful sentences, suggesting that listeners were encouraged to use acoustic-phonetic information to identify speech rather than to make predictions from semantic context.

  19. Perceptual Mapping: A Methodology in the Assessment of Environmental Perceptions.

    Science.gov (United States)

    Sergent, Marie T.; Sedlacek, William E.

    1989-01-01

    Describes perceptual mapping, a newly developed method for assessing perceptions of campus environments. Describes evaluation of a student union by students using this method. Discusses the advantages and disadvantages of this perceptual mapping method for assessing college environments.

  20. Comparison of Perceptual Signs of Voice before and after Vocal Hygiene Program in Adults with Dysphonia

    Directory of Open Access Journals (Sweden)

    Seyyedeh Maryam Khoddami

    2011-12-01

    Background and Aim: Vocal abuse and misuse are the most frequent causes of voice disorders. Consequently, some therapy is needed to stop or modify such behaviors. This research was performed to study the effectiveness of a vocal hygiene program on perceptual signs of voice in people with dysphonia. Methods: A vocal hygiene program was delivered to 8 adults with dysphonia over 6 weeks. First, the Consensus Auditory-Perceptual Evaluation of Voice was used to assess perceptual signs. Then the program was delivered, and individuals were followed up at visits in the second and fourth weeks. In the last session, the perceptual assessment was repeated and the individuals' opinions were collected. Perceptual findings were compared before and after the therapy. Results: After the program, the mean score of the perceptual assessment decreased. The mean score of every perceptual sign showed a significant difference between before and after therapy (p ≤ 0.0001). «Loudness» had the maximum score, and coordination between speech and respiration the minimum. All participants confirmed the efficiency of the therapy. Conclusion: The vocal hygiene program improves all perceptual signs of voice, although not equally. This conclusion is supported by both clinician-based and patient-based assessments. As a result, a vocal hygiene program is a necessary part of comprehensive voice therapy but is not by itself sufficient to resolve all voice problems.

  1. Audiovisual Temporal Recalibration for Speech in Synchrony Perception and Speech Identification

    Science.gov (United States)

    Asakawa, Kaori; Tanaka, Akihiro; Imai, Hisato

    We investigated whether audiovisual synchrony perception for speech could change after observation of an audiovisual temporal mismatch. Previous studies have revealed that audiovisual synchrony perception is recalibrated after exposure to a constant timing difference between auditory and visual signals in non-speech. In the present study, we examined whether this audiovisual temporal recalibration occurs at the perceptual level even for speech (monosyllables). In Experiment 1, participants performed an audiovisual simultaneity judgment task (i.e., a direct measurement of audiovisual synchrony perception) on the speech signal after observation of speech stimuli which had a constant audiovisual lag. The results showed that the "simultaneous" responses (i.e., the proportion of responses for which participants judged the auditory and visual stimuli to be synchronous) at least partly depended on the exposure lag. In Experiment 2, we adopted the McGurk identification task (i.e., an indirect measurement of audiovisual synchrony perception), using stimuli identical to those of Experiment 1, to exclude the possibility that this modulation of synchrony perception was solely attributable to response strategy. The characteristics of the McGurk effect reported by participants depended on exposure lag. Thus, it was shown that audiovisual synchrony perception for speech could be modulated following exposure to constant lag in both direct and indirect measurements. Our results suggest that temporal recalibration occurs not only for non-speech signals but also for monosyllabic speech at the perceptual level.

  2. The relationship between the Nasality Severity Index 2.0 and perceptual judgments of hypernasality.

    NARCIS (Netherlands)

    K. Bettens; Anke Luyten; F. Wuyts; M. de Bodt; K. van Lierde; Y. Maryn

    2016-01-01

    PURPOSE: The Nasality Severity Index 2.0 (NSI 2.0) forms a new, multiparametric approach in the identification of hypernasality. The present study aimed to investigate the correlation between the NSI 2.0 scores and the perceptual assessment of hypernasality. METHOD: Speech samples of 35 patients,

  3. Musical training during early childhood enhances the neural encoding of speech in noise.

    Science.gov (United States)

    Strait, Dana L; Parbery-Clark, Alexandra; Hittner, Emily; Kraus, Nina

    2012-12-01

    For children, learning often occurs in the presence of background noise. As such, there is growing desire to improve a child's access to a target signal in noise. Given adult musicians' perceptual and neural speech-in-noise enhancements, we asked whether similar effects are present in musically-trained children. We assessed the perception and subcortical processing of speech in noise and related cognitive abilities in musician and nonmusician children that were matched for a variety of overarching factors. Outcomes reveal that musicians' advantages for processing speech in noise are present during pivotal developmental years. Supported by correlations between auditory working memory and attention and auditory brainstem response properties, we propose that musicians' perceptual and neural enhancements are driven in a top-down manner by strengthened cognitive abilities with training. Our results may be considered by professionals involved in the remediation of language-based learning deficits, which are often characterized by poor speech perception in noise. Copyright © 2012 Elsevier Inc. All rights reserved.

  4. Real-time continuous visual biofeedback in the treatment of speech breathing disorders following childhood traumatic brain injury: report of one case.

    Science.gov (United States)

    Murdoch, B E; Pitt, G; Theodoros, D G; Ward, E C

    1999-01-01

    The efficacy of traditional and physiological biofeedback methods for modifying abnormal speech breathing patterns was investigated in a child with persistent dysarthria following severe traumatic brain injury (TBI). An A-B-A-B single-subject experimental research design was utilized to provide the subject with two exclusive periods of therapy for speech breathing, based on traditional therapy techniques and physiological biofeedback methods, respectively. Traditional therapy techniques included establishing optimal posture for speech breathing, explanation of the movement of the respiratory muscles, and a hierarchy of non-speech and speech tasks focusing on establishing an appropriate level of sub-glottal air pressure, and improving the subject's control of inhalation and exhalation. The biofeedback phase of therapy utilized variable inductance plethysmography (or Respitrace) to provide real-time, continuous visual biofeedback of ribcage circumference during breathing. As in traditional therapy, a hierarchy of non-speech and speech tasks was devised to improve the subject's control of his respiratory pattern. Throughout the project, the subject's respiratory support for speech was assessed both instrumentally and perceptually. Instrumental assessment included kinematic and spirometric measures, and perceptual assessment included the Frenchay Dysarthria Assessment, Assessment of Intelligibility of Dysarthric Speech, and analysis of a speech sample. The results of the study demonstrated that real-time continuous visual biofeedback techniques for modifying speech breathing patterns were not only effective, but superior to the traditional therapy techniques for modifying abnormal speech breathing patterns in a child with persistent dysarthria following severe TBI. These results show that physiological biofeedback techniques are potentially useful clinical tools for the remediation of speech breathing impairment in the paediatric dysarthric population.

  5. Automatic Speech Recognition Systems for the Evaluation of Voice and Speech Disorders in Head and Neck Cancer

    Directory of Open Access Journals (Sweden)

    Andreas Maier

    2010-01-01

    In patients suffering from head and neck cancer, speech intelligibility is often restricted. For assessment and outcome measurement, automatic speech recognition systems have previously been shown to be appropriate for objective and quick evaluation of intelligibility. In this study we investigate the applicability of the method to speech disorders caused by head and neck cancer. Intelligibility was quantified by speech recognition on recordings of a standard text read by 41 German laryngectomized patients with cancer of the larynx or hypopharynx and 49 German patients who had suffered from oral cancer. The speech recognizer provides the percentage of correctly recognized words of a sequence, that is, the word recognition rate. Automatic evaluation was compared to perceptual ratings by a panel of experts and to an age-matched control group. Both patient groups showed significantly lower word recognition rates than the control group. Automatic speech recognition yielded word recognition rates that agreed with the experts' evaluation of intelligibility at a significant level. Automatic speech recognition thus serves as a good, low-effort means to objectify and quantify the most important aspect of pathologic speech: intelligibility. The system was successfully applied to voice and speech disorders.
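
    The word recognition rate used here can be computed from a word-level alignment between the read text and the recognizer output. A sketch, assuming a simple longest-matching-block alignment rather than the recognizer's internal scoring:

    ```python
    import difflib

    def word_recognition_rate(reference: str, hypothesis: str) -> float:
        """Percentage of reference words correctly recognized, based on
        a word-level alignment of reference and ASR output."""
        ref_words = reference.lower().split()
        hyp_words = hypothesis.lower().split()
        matcher = difflib.SequenceMatcher(a=ref_words, b=hyp_words)
        correct = sum(block.size for block in matcher.get_matching_blocks())
        return 100.0 * correct / len(ref_words)

    # Illustrative only (the study used a standard German text).
    print(word_recognition_rate("der nordwind und die sonne",
                                "der wind und die sonne"))  # 80.0
    ```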

  6. Schizophrenia alters intra-network functional connectivity in the caudate for detecting speech under informational speech masking conditions.

    Science.gov (United States)

    Zheng, Yingjun; Wu, Chao; Li, Juanhua; Li, Ruikeng; Peng, Hongjun; She, Shenglin; Ning, Yuping; Li, Liang

    2018-04-04

    Speech recognition under noisy "cocktail-party" environments involves multiple perceptual/cognitive processes, including target detection, selective attention, irrelevant signal inhibition, sensory/working memory, and speech production. Compared to healthy listeners, people with schizophrenia are more vulnerable to masking stimuli and perform worse in speech recognition under speech-on-speech masking conditions. Although the schizophrenia-related speech-recognition impairment under "cocktail-party" conditions is associated with deficits of various perceptual/cognitive processes, it is crucial to know whether the brain substrates critically underlying speech detection against informational speech masking are impaired in people with schizophrenia. Using functional magnetic resonance imaging (fMRI), this study investigated differences between people with schizophrenia (n = 19, mean age = 33 ± 10 years) and their matched healthy controls (n = 15, mean age = 30 ± 9 years) in intra-network functional connectivity (FC) specifically associated with target-speech detection under speech-on-speech-masking conditions. The target-speech detection performance under the speech-on-speech-masking condition in participants with schizophrenia was significantly worse than that in matched healthy controls. Moreover, in healthy controls, but not in participants with schizophrenia, the strength of intra-network FC within the bilateral caudate was positively correlated with the speech-detection performance under the speech-masking conditions. Compared to controls, patients showed an altered spatial activity pattern and decreased intra-network FC in the caudate. In people with schizophrenia, the reduced speech-detection performance under speech-on-speech masking conditions is associated with reduced intra-caudate functional connectivity, which normally contributes to detecting target speech against speech masking via its function of suppressing masking-speech signals.

  7. The brain dynamics of rapid perceptual adaptation to adverse listening conditions.

    Science.gov (United States)

    Erb, Julia; Henry, Molly J; Eisner, Frank; Obleser, Jonas

    2013-06-26

    Listeners show a remarkable ability to quickly adjust to degraded speech input. Here, we aimed to identify the neural mechanisms of such short-term perceptual adaptation. In a sparse-sampling, cardiac-gated functional magnetic resonance imaging (fMRI) acquisition, human listeners heard and repeated back 4-band-vocoded sentences (in which the temporal envelope of the acoustic signal is preserved, while spectral information is highly degraded). Clear-speech trials were included as baseline. An additional fMRI experiment on amplitude modulation rate discrimination quantified the convergence of neural mechanisms that subserve coping with challenging listening conditions for speech and non-speech. First, the degraded speech task revealed an "executive" network (comprising the anterior insula and anterior cingulate cortex), parts of which were also activated in the non-speech discrimination task. Second, trial-by-trial fluctuations in successful comprehension of degraded speech drove hemodynamic signal change in classic "language" areas (bilateral temporal cortices). Third, as listeners perceptually adapted to degraded speech, downregulation in a cortico-striato-thalamo-cortical circuit was observable. The present data highlight differential upregulation and downregulation in auditory-language and executive networks, respectively, with important subcortical contributions when successfully adapting to a challenging listening situation.

  8. Linguistic Processing of Accented Speech Across the Lifespan

    Directory of Open Access Journals (Sweden)

    Alejandrina Cristia

    2012-11-01

    In most of the world, people have regular exposure to multiple accents. Therefore, learning to quickly process accented speech is a prerequisite to successful communication. In this paper, we examine work on the perception of accented speech across the lifespan, from early infancy to late adulthood. Unfamiliar accents initially impair linguistic processing by infants, children, younger adults, and older adults, but listeners of all ages come to adapt to accented speech. Emergent research also goes beyond these perceptual abilities, by assessing links with production and the relative contributions of linguistic knowledge and general cognitive skills. We conclude by underlining points of convergence across ages, and the gaps that remain for future work.

  9. Speech Prosody in Cerebellar Ataxia

    Science.gov (United States)

    Casper, Maureen A.; Raphael, Lawrence J.; Harris, Katherine S.; Geibel, Jennifer M.

    2007-01-01

    Persons with cerebellar ataxia exhibit changes in physical coordination and speech and voice production. Previously, these alterations of speech and voice production were described primarily via perceptual coordinates. In this study, the spatial-temporal properties of syllable production were examined in 12 speakers, six of whom were healthy…

  10. Training to Improve Hearing Speech in Noise: Biological Mechanisms

    Science.gov (United States)

    Song, Judy H.; Skoe, Erika; Banai, Karen

    2012-01-01

    We investigated training-related improvements in listening in noise and the biological mechanisms mediating these improvements. Training-related malleability was examined using a program that incorporates cognitively based listening exercises to improve speech-in-noise perception. Before and after training, auditory brainstem responses to a speech syllable were recorded in quiet and multitalker noise from adults who ranged in their speech-in-noise perceptual ability. Controls did not undergo training but were tested at intervals equivalent to the trained subjects. Trained subjects exhibited significant improvements in speech-in-noise perception that were retained 6 months later. Subcortical responses in noise demonstrated training-related enhancements in the encoding of pitch-related cues (the fundamental frequency and the second harmonic), particularly for the time-varying portion of the syllable that is most vulnerable to perceptual disruption (the formant transition region). Subjects with the largest strength of pitch encoding at pretest showed the greatest perceptual improvement. Controls exhibited neither neurophysiological nor perceptual changes. We provide the first demonstration that short-term training can improve the neural representation of cues important for speech-in-noise perception. These results implicate and delineate biological mechanisms contributing to learning success, and they provide a conceptual advance to our understanding of the kind of training experiences that can influence sensory processing in adulthood. PMID:21799207

  11. Emotionally conditioning the target-speech voice enhances recognition of the target speech under "cocktail-party" listening conditions.

    Science.gov (United States)

    Lu, Lingxi; Bao, Xiaohan; Chen, Jing; Qu, Tianshu; Wu, Xihong; Li, Liang

    2018-05-01

    Under a noisy "cocktail-party" listening condition with multiple people talking, listeners can use various perceptual/cognitive unmasking cues to improve recognition of the target speech against informational speech-on-speech masking. One potential unmasking cue is the emotion expressed in a speech voice, by means of certain acoustical features. However, it was unclear whether emotionally conditioning a target-speech voice that has none of the typical acoustical features of emotions (i.e., an emotionally neutral voice) can be used by listeners for enhancing target-speech recognition under speech-on-speech masking conditions. In this study we examined the recognition of target speech against a two-talker speech masker both before and after the emotionally neutral target voice was paired with a loud female screaming sound that has a marked negative emotional valence. The results showed that recognition of the target speech (especially the first keyword in a target sentence) was significantly improved by emotionally conditioning the target speaker's voice. Moreover, the emotional unmasking effect was independent of the unmasking effect of the perceived spatial separation between the target speech and the masker. Also, electrodermal (skin conductance) responses became stronger after emotional learning when the target speech and masker were perceptually co-located, suggesting an increase in listening effort when the target speech was informationally masked. These results indicate that emotionally conditioning the target speaker's voice does not change the acoustical parameters of the target-speech stimuli, but the emotionally conditioned vocal features can be used as cues for unmasking target speech.

  12. Speech perception in older hearing impaired listeners: benefits of perceptual training.

    Directory of Open Access Journals (Sweden)

    David L Woods

    Hearing aids (HAs) only partially restore the ability of older hearing impaired (OHI) listeners to understand speech in noise, due in large part to persistent deficits in consonant identification. Here, we investigated whether adaptive perceptual training would improve consonant identification in noise in sixteen aided OHI listeners who underwent 40 hours of computer-based training in their homes. Listeners identified 20 onset and 20 coda consonants in 9,600 consonant-vowel-consonant (CVC) syllables containing different vowels (/ɑ/, /i/, or /u/) and spoken by four different talkers. Consonants were presented at three consonant-specific signal-to-noise ratios (SNRs) spanning a 12 dB range. Noise levels were adjusted over training sessions based on d' measures. Listeners were tested before and after training to measure (1) changes in consonant-identification thresholds using syllables spoken by familiar and unfamiliar talkers, and (2) sentence reception thresholds (SeRTs) using two different sentence tests. Consonant-identification thresholds improved gradually during training. Laboratory tests of d' thresholds showed an average improvement of 9.1 dB, with 94% of listeners showing statistically significant training benefit. Training normalized consonant confusions and improved the thresholds of some consonants into the normal range. Benefits were equivalent for onset and coda consonants, syllables containing different vowels, and syllables presented at different SNRs. Greater training benefits were found for hard-to-identify consonants and for consonants spoken by familiar than unfamiliar talkers. SeRTs, tested with simple sentences, showed less elevation than consonant-identification thresholds prior to training and failed to show significant training benefit, although SeRT improvements did correlate with improvements in consonant thresholds. We argue that the lack of SeRT improvement reflects the dominant role of top-down semantic processing in
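
    The d' measures used to adapt the noise levels come from signal detection theory: sensitivity is the difference between z-transformed hit and false-alarm rates. A minimal yes/no sketch follows; the study's consonant-identification task is multi-alternative, so its exact computation will differ, and the clipping constant is an assumption.

    ```python
    from statistics import NormalDist

    def d_prime(hit_rate: float, fa_rate: float) -> float:
        """Sensitivity index: d' = z(hit rate) - z(false-alarm rate)."""
        z = NormalDist().inv_cdf
        eps = 1e-3  # keep rates off 0 and 1 so the z-transform is finite
        hit = min(max(hit_rate, eps), 1.0 - eps)
        fa = min(max(fa_rate, eps), 1.0 - eps)
        return z(hit) - z(fa)

    print(round(d_prime(0.85, 0.20), 2))  # ~1.88 for these example rates
    ```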

  13. Alternative Speech Communication System for Persons with Severe Speech Disorders

    Science.gov (United States)

    Selouani, Sid-Ahmed; Sidi Yakoub, Mohammed; O'Shaughnessy, Douglas

    2009-12-01

    Assistive speech-enabled systems are proposed to help both French- and English-speaking persons with various speech disorders. The proposed assistive systems use automatic speech recognition (ASR) and speech synthesis in order to enhance the quality of communication. These systems aim at improving the intelligibility of pathologic speech, making it as natural as possible and close to the original voice of the speaker. The resynthesized utterances use new basic units, a new concatenating algorithm and a grafting technique to correct the poorly pronounced phonemes. The ASR responses are uttered by the new speech synthesis system in order to convey an intelligible message to listeners. Experiments involving four American speakers with severe dysarthria and two Acadian French speakers with sound substitution disorders (SSDs) are carried out to demonstrate the efficiency of the proposed methods. An improvement of the Perceptual Evaluation of the Speech Quality (PESQ) value of 5% and more than 20% is achieved by the speech synthesis systems that deal with SSD and dysarthria, respectively.

  14. Individual differences in degraded speech perception

    Science.gov (United States)

    Carbonell, Kathy M.

    One of the lasting concerns in audiology is the unexplained individual differences in speech perception performance even for individuals with similar audiograms. One proposal is that there are cognitive/perceptual individual differences underlying this vulnerability and that these differences are present in normal hearing (NH) individuals but do not reveal themselves in studies that use clear speech produced in quiet (because of a ceiling effect). However, previous studies have failed to uncover cognitive/perceptual variables that explain much of the variance in NH performance on more challenging degraded speech tasks. This lack of strong correlations may be due either to examining the wrong measures (e.g., working memory capacity) or to there being no reliable differences in degraded speech performance in NH listeners (i.e., variability in performance is due to measurement noise). The proposed project has 3 aims. The first is to establish whether there are reliable individual differences in degraded speech performance for NH listeners that are sustained both across degradation types (speech in noise, compressed speech, noise-vocoded speech) and across multiple testing sessions. The second is to establish whether there are reliable differences in NH listeners' ability to adapt their phonetic categories based on short-term statistics, both across tasks and across sessions. The third is to determine whether performance on degraded speech perception tasks is correlated with performance on phonetic adaptability tasks, thus establishing a possible explanatory variable for individual differences in speech perception for NH and hearing impaired listeners.

  15. Bandwidth extension of speech using perceptual criteria

    CERN Document Server

    Berisha, Visar; Liss, Julie

    2013-01-01

    Bandwidth extension of speech is used in the International Telecommunication Union G.729.1 standard in which the narrowband bitstream is combined with quantized high-band parameters. Although this system produces high-quality wideband speech, the additional bits used to represent the high band can be further reduced. In addition to the algorithm used in the G.729.1 standard, bandwidth extension methods based on spectrum prediction have also been proposed. Although these algorithms do not require additional bits, they perform poorly when the correlation between the low and the high band is weak. In this book, two wideband speech coding algorithms that rely on bandwidth extension are developed. The algorithms operate as wrappers around existing narrowband compression schemes. More specifically, in these algorithms, the low band is encoded using an existing toll-quality narrowband system, whereas the high band is generated using the proposed extension techniques. The first method relies only on transmitted high-...
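
    The book's prediction-based algorithms are not reproduced here, but the basic flavor of bandwidth extension can be seen in the classic spectral-folding baseline sketched below, which regenerates a crude high band from the spectral images created by zero-stuffing. This is a named stand-in technique for illustration, not the authors' method.

    ```python
    import numpy as np

    def spectral_folding_bwe(narrowband: np.ndarray) -> np.ndarray:
        """Crude bandwidth extension by spectral folding.

        Zero-stuffing by a factor of 2 doubles the sampling rate and
        mirrors the original 0-4 kHz spectrum into 4-8 kHz; a real
        system would then shape this image with an envelope model.
        """
        wideband = np.zeros(2 * len(narrowband))
        wideband[::2] = narrowband  # upsample WITHOUT low-pass filtering
        return wideband

    nb = np.sin(2 * np.pi * 1000 * np.arange(8000) / 8000)  # 1 kHz, fs = 8 kHz
    wb = spectral_folding_bwe(nb)  # fs = 16 kHz, with an image at 7 kHz
    ```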

  16. Segmental intelligibility of synthetic speech produced by rule.

    Science.gov (United States)

    Logan, J S; Greene, B G; Pisoni, D B

    1989-08-01

    This paper reports the results of an investigation that employed the modified rhyme test (MRT) to measure the segmental intelligibility of synthetic speech generated automatically by rule. Synthetic speech produced by ten text-to-speech systems was studied and compared to natural speech. A variation of the standard MRT was also used to study the effects of response set size on perceptual confusions. Results indicated that the segmental intelligibility scores formed a continuum. Several systems displayed very high levels of performance that were close to or equal to scores obtained with natural speech; other systems displayed substantially worse performance compared to natural speech. The overall performance of the best system, DECtalk--Paul, was equivalent to the data obtained with natural speech for consonants in syllable-initial position. The findings from this study are discussed in terms of the use of a set of standardized procedures for measuring intelligibility of synthetic speech under controlled laboratory conditions. Recent work investigating the perception of synthetic speech under more severe conditions in which greater demands are made on the listener's processing resources is also considered. The wide range of intelligibility scores obtained in the present study demonstrates important differences in perception and suggests that not all synthetic speech is perceptually equivalent to the listener.

  18. Perceptual organization of speech signals by children with and without dyslexia

    Science.gov (United States)

    Nittrouer, Susan; Lowenstein, Joanna H.

    2013-01-01

    Developmental dyslexia is a condition in which children encounter difficulty learning to read in spite of adequate instruction. Although considerable effort has been expended trying to identify the source of the problem, no single solution has been agreed upon. The current study explored a new hypothesis, that developmental dyslexia may be due to faulty perceptual organization of linguistically relevant sensory input. To test that idea, sentence-length speech signals were processed to create either sine-wave or noise-vocoded analogs. Seventy children between 8 and 11 years of age, with and without dyslexia participated. Children with dyslexia were selected to have phonological awareness deficits, although those without such deficits were retained in the study. The processed sentences were presented for recognition, and measures of reading, phonological awareness, and expressive vocabulary were collected. Results showed that children with dyslexia, regardless of phonological subtype, had poorer recognition scores than children without dyslexia for both kinds of degraded sentences. Older children with dyslexia recognized the sine-wave sentences better than younger children with dyslexia, but no such effect of age was found for the vocoded materials. Recognition scores were used as predictor variables in regression analyses with reading, phonological awareness, and vocabulary measures used as dependent variables. Scores for both sorts of sentence materials were strong predictors of performance on all three dependent measures when all children were included, but only performance for the sine-wave materials explained significant proportions of variance when only children with dyslexia were included. Finally, matching young, typical readers with older children with dyslexia on reading abilities did not mitigate the group difference in recognition of vocoded sentences. Conclusions were that children with dyslexia have difficulty organizing linguistically relevant sensory

  19. On the Perception of Speech Sounds as Biologically Significant Signals

    Science.gov (United States)

    Pisoni, David B.

    2012-01-01

    This paper reviews some of the major evidence and arguments currently available to support the view that human speech perception may require the use of specialized neural mechanisms for perceptual analysis. Experiments using synthetically produced speech signals with adults are briefly summarized and extensions of these results to infants and other organisms are reviewed with an emphasis towards detailing those aspects of speech perception that may require some need for specialized species-specific processors. Finally, some comments on the role of early experience in perceptual development are provided as an attempt to identify promising areas of new research in speech perception. PMID:399200

  20. The relationship between the Nasality Severity Index 2.0 and perceptual judgments of hypernasality.

    Science.gov (United States)

    Bettens, Kim; De Bodt, Marc; Maryn, Youri; Luyten, Anke; Wuyts, Floris L; Van Lierde, Kristiane M

    2016-01-01

    The Nasality Severity Index 2.0 (NSI 2.0) forms a new, multiparametric approach in the identification of hypernasality. The present study aimed to investigate the correlation between the NSI 2.0 scores and the perceptual assessment of hypernasality. Speech samples of 35 patients, representing a range of nasality from normal to severely hypernasal, were rated by four expert speech-language pathologists using visual analogue scaling (VAS) judging the degree of hypernasality, audible nasal airflow (ANA) and speech intelligibility. Inter- and intra-listener reliability was verified using intraclass correlation coefficients. Correlations between NSI 2.0 scores and its parameters (i.e. nasalance score of an oral text and vowel /u/, voice low tone to high tone ratio of the vowel /i/) and the degree of hypernasality were determined using Pearson correlation coefficients. Multiple linear regression analysis was used to investigate the possible influence of ANA and speech intelligibility on the NSI 2.0 scores. Overall good to excellent inter- and intra-listener reliability was found for the perceptual ratings. A moderate, but significant negative correlation between NSI 2.0 scores and perceived hypernasality (r=-0.64) was found, in which a more negative NSI 2.0 score indicates the presence of more severe hypernasality. No significant influence of ANA or intelligibility on the NSI 2.0 was observed based on the regression analysis. Because the NSI 2.0 correlates significantly with perceived hypernasality, it provides an easy-to-interpret severity score of hypernasality which will facilitate the evaluation of therapy outcomes, communication to the patient and other clinicians, and decisions for treatment planning, based on a multiparametric approach. However, research is still necessary to further explore the instrumental correlates of perceived hypernasality. The reader will be able to (1) describe and discuss current issues and influencing variables regarding perceptual

  1. Fine-structure processing, frequency selectivity and speech perception in hearing-impaired listeners

    DEFF Research Database (Denmark)

    Strelcyk, Olaf; Dau, Torsten

    2008-01-01

    Hearing-impaired people often experience great difficulty with speech communication when background noise is present, even if reduced audibility has been compensated for. Other impairment factors must be involved. In order to minimize confounding effects, the subjects participating in this study...... consisted of groups with homogeneous, symmetric audiograms. The perceptual listening experiments assessed the intelligibility of full-spectrum as well as low-pass filtered speech in the presence of stationary and fluctuating interferers, the individual's frequency selectivity and the integrity of temporal...... modulation were obtained. In addition, these binaural and monaural thresholds were measured in a stationary background noise in order to assess the persistence of the fine-structure processing to interfering noise. Apart from elevated speech reception thresholds, the hearing impaired listeners showed poorer...

  2. Assessment of Spectral and Temporal Resolution in Cochlear Implant Users Using Psychoacoustic Discrimination and Speech Cue Categorization.

    Science.gov (United States)

    Winn, Matthew B; Won, Jong Ho; Moon, Il Joon

    This study was conducted to measure auditory perception by cochlear implant users in the spectral and temporal domains, using tests of either categorization (using speech-based cues) or discrimination (using conventional psychoacoustic tests). The authors hypothesized that traditional nonlinguistic tests assessing spectral and temporal auditory resolution would correspond to speech-based measures assessing specific aspects of phonetic categorization assumed to depend on spectral and temporal auditory resolution. The authors further hypothesized that speech-based categorization performance would ultimately be a superior predictor of speech recognition performance, because of the fundamental nature of speech recognition as categorization. Nineteen cochlear implant listeners and 10 listeners with normal hearing participated in a suite of tasks that included spectral ripple discrimination, temporal modulation detection, and syllable categorization, which was split into a spectral cue-based task (targeting the /ba/-/da/ contrast) and a timing cue-based task (targeting the /b/-/p/ and /d/-/t/ contrasts). Speech sounds were manipulated to contain specific spectral or temporal modulations (formant transitions or voice onset time, respectively) that could be categorized. Categorization responses were quantified using logistic regression to assess perceptual sensitivity to acoustic phonetic cues. Word recognition testing was also conducted for cochlear implant listeners. Cochlear implant users were generally less successful at utilizing both spectral and temporal cues for categorization compared with listeners with normal hearing. For the cochlear implant listener group, spectral ripple discrimination was significantly correlated with the categorization of formant transitions; both were correlated with better word recognition. Temporal modulation detection using 100- and 10-Hz-modulated noise was not correlated either with the cochlear implant subjects' categorization of
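
    The logistic-regression quantification of categorization responses mentioned above can be sketched as a psychometric-function fit: the fitted slope indexes sensitivity to the acoustic cue, and the 50% point marks the category boundary. The continuum steps and responses below are simulated, and scikit-learn stands in for whatever fitting procedure the authors actually used.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        # Hypothetical /ba/-/da/ continuum: stimulus step (formant-transition cue)
        # and whether the listener answered /da/ on each trial.
        steps = np.repeat(np.arange(1, 8), 10).reshape(-1, 1)
        rng = np.random.default_rng(4)
        p_da = 1 / (1 + np.exp(-(steps.ravel() - 4) * 1.2))   # a steep (sensitive) listener
        resp = (rng.random(len(p_da)) < p_da).astype(int)

        # Fitted slope quantifies cue sensitivity: a sharper category boundary
        # yields a larger coefficient; shallower slopes would suggest poorer
        # access to the cue, as reported for the cochlear implant group.
        model = LogisticRegression().fit(steps, resp)
        print(model.coef_[0][0], -model.intercept_[0] / model.coef_[0][0])  # slope, boundary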

  3. Trans-oral endoscopic partial adenoidectomy does not worsen the speech after cleft palate repair.

    Science.gov (United States)

    Abdel-Aziz, Mosaad; Khalifa, Badawy; Shawky, Ahmed; Rashed, Mohammed; Naguib, Nader; Abdel-Hameed, Asmaa

    2016-01-01

    Adenoid hypertrophy may play a role in velopharyngeal closure, especially in patients with palatal abnormality; adenoidectomy may lead to velopharyngeal insufficiency and hypernasal speech. Patients with cleft palate, even after repair, should not undergo adenoidectomy unless absolutely needed, and in such situations, conservative or partial adenoidectomy is performed to avoid the occurrence of velopharyngeal insufficiency. Trans-oral endoscopic adenoidectomy enables the surgeon to inspect the velopharyngeal valve during the procedure. The aim of this study was to assess the effect of transoral endoscopic partial adenoidectomy on the speech of children with repaired cleft palate. Twenty children with repaired cleft palate underwent transoral endoscopic partial adenoidectomy to relieve their airway obstruction. The procedure was completely visualized with the use of a 70° 4-mm nasal endoscope; the upper part of the adenoid was removed using an adenoid curette and St. Claire Thompson forceps, while the lower part was retained to maintain velopharyngeal competence. Preoperative and postoperative evaluation of speech was performed, subjectively by auditory perceptual assessment and objectively by nasometric assessment. Speech was not adversely affected after surgery. The difference between preoperative and postoperative auditory perceptual assessment and nasalance scores for nasal and oral sentences was insignificant (p=0.231, 0.442, and 0.118, respectively). Transoral endoscopic partial adenoidectomy is a safe method; it does not worsen the speech of repaired cleft palate patients. It enables the surgeon to strictly inspect the velopharyngeal valve during the procedure, with better determination of the adenoidal part that may contribute to velopharyngeal closure. Copyright © 2015 Associação Brasileira de Otorrinolaringologia e Cirurgia Cérvico-Facial. Published by Elsevier Editora Ltda. All rights reserved.
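
    For readers unfamiliar with the nasometric assessment mentioned above: nasalance is conventionally computed as nasal acoustic energy divided by the sum of nasal and oral energy, times 100. The Python sketch below applies that standard definition to two toy microphone channels using RMS energy; it is a minimal approximation, not the Nasometer's exact implementation.

        import numpy as np

        def nasalance(nasal, oral):
            """Nasalance (%) = nasal acoustic energy / (nasal + oral) x 100,
            approximated here with RMS energy over the whole utterance."""
            e_n = float(np.sqrt(np.mean(np.square(nasal))))
            e_o = float(np.sqrt(np.mean(np.square(oral))))
            return 100.0 * e_n / (e_n + e_o)

        rng = np.random.default_rng(5)
        nasal_ch = 0.2 * rng.standard_normal(16000)   # toy nasal-microphone channel
        oral_ch = 0.8 * rng.standard_normal(16000)    # toy oral-microphone channel
        print(round(nasalance(nasal_ch, oral_ch), 1))  # ~20.0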

  4. Preschoolers Benefit from Visually Salient Speech Cues

    Science.gov (United States)

    Lalonde, Kaylah; Holt, Rachael Frush

    2015-01-01

    Purpose: This study explored visual speech influence in preschoolers using 3 developmentally appropriate tasks that vary in perceptual difficulty and task demands. They also examined developmental differences in the ability to use visually salient speech cues and visual phonological knowledge. Method: Twelve adults and 27 typically developing 3-…

  5. Analysis of Feature Extraction Methods for Speaker Dependent Speech Recognition

    Directory of Open Access Journals (Sweden)

    Gurpreet Kaur

    2017-02-01

    Full Text Available Speech recognition is about what is being said, irrespective of who is saying it. Speech recognition is a growing field, and major progress is taking place in the technology of automatic speech recognition (ASR). Still, there are many barriers in this field in terms of recognition rate, background noise, speaker variability, speaking rate, accent, etc. Speech recognition rate mainly depends on the selection of features and feature extraction methods. This paper outlines the feature extraction techniques for speaker-dependent speech recognition of isolated words. A brief survey of different feature extraction techniques, such as Mel-Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding Coefficients (LPCC), Perceptual Linear Prediction (PLP), and Relative Spectral Perceptual Linear Prediction (RASTA-PLP) analysis, is presented and evaluated. Speech recognition has various applications, from daily use to commercial use. We have built a speaker-dependent system, and this system can be useful in many areas, such as controlling a patient's vehicle using simple commands.
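
    As a concrete example of the first technique surveyed above, the snippet below computes MFCCs for a toy signal with typical front-end settings (13 coefficients, 25 ms frames, 10 ms hop). The use of librosa and the specific parameter values are illustrative assumptions, not choices made in the paper; any MFCC implementation would do.

        import numpy as np
        import librosa  # assumed available

        # A synthetic one-second tone stands in for a recorded isolated word.
        sr = 16000
        y = 0.1 * np.sin(2 * np.pi * 440 * np.arange(sr) / sr).astype(np.float32)

        # 13 MFCCs per 25 ms frame with a 10 ms hop: common front-end settings.
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13,
                                    n_fft=int(0.025 * sr), hop_length=int(0.010 * sr))
        print(mfcc.shape)  # (13, n_frames)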

  6. Text as a Supplement to Speech in Young and Older Adults.

    Science.gov (United States)

    Krull, Vidya; Humes, Larry E

    2016-01-01

    The purpose of this experiment was to quantify the contribution of visual text to auditory speech recognition in background noise. Specifically, the authors tested the hypothesis that partially accurate visual text from an automatic speech recognizer could be used successfully to supplement speech understanding in difficult listening conditions in older adults, with normal or impaired hearing. The working hypotheses were based on what is known regarding audiovisual speech perception in the elderly from speechreading literature. We hypothesized that (1) combining auditory and visual text information will result in improved recognition accuracy compared with auditory or visual text information alone, (2) benefit from supplementing speech with visual text (auditory and visual enhancement) in young adults will be greater than that in older adults, and (3) individual differences in performance on perceptual measures would be associated with cognitive abilities. Fifteen young adults with normal hearing, 15 older adults with normal hearing, and 15 older adults with hearing loss participated in this study. All participants completed sentence recognition tasks in auditory-only, text-only, and combined auditory-text conditions. The auditory sentence stimuli were spectrally shaped to restore audibility for the older participants with impaired hearing. All participants also completed various cognitive measures, including measures of working memory, processing speed, verbal comprehension, perceptual and cognitive speed, processing efficiency, inhibition, and the ability to form wholes from parts. Group effects were examined for each of the perceptual and cognitive measures. Audiovisual benefit was calculated relative to performance on auditory- and visual-text only conditions. Finally, the relationship between perceptual measures and other independent measures were examined using principal-component factor analyses, followed by regression analyses. Both young and older adults

  7. Assessment of breathing patterns and respiratory muscle recruitment during singing and speech in quadriplegia.

    Science.gov (United States)

    Tamplin, Jeanette; Brazzale, Danny J; Pretto, Jeffrey J; Ruehland, Warren R; Buttifant, Mary; Brown, Douglas J; Berlowitz, David J

    2011-02-01

    To explore how respiratory impairment after cervical spinal cord injury affects vocal function, and to explore muscle recruitment strategies used during vocal tasks after quadriplegia. It was hypothesized that to achieve the increased respiratory support required for singing and loud speech, people with quadriplegia use different patterns of muscle recruitment and control strategies compared with control subjects without spinal cord injury. Matched, parallel-group design. Large university-affiliated public hospital. Consenting participants with motor-complete C5-7 quadriplegia (n=6) and able-bodied age-matched controls (n=6) were assessed on physiologic and voice measures during vocal tasks. Not applicable. Standard respiratory function testing, surface electromyographic activity from accessory respiratory muscles, sound pressure levels during vocal tasks, the Voice Handicap Index, and the Perceptual Voice Profile. The group with quadriplegia had a reduced lung capacity (vital capacity, 71% vs 102% of predicted; P=.028), more perceived voice problems (Voice Handicap Index score, 22.5 vs 6.5; P=.046), and greater recruitment of accessory respiratory muscles during both loud and soft volumes (P=.028) than the able-bodied controls. The group with quadriplegia also demonstrated higher accessory muscle activation in changing from soft to loud speech (P=.028). People with quadriplegia have impaired vocal ability and use different muscle recruitment strategies during speech than the able-bodied. These findings will enable us to target specific measurements of respiratory physiology for assessing functional improvements in response to formal therapeutic singing training. Copyright © 2011 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  8. Tools for the assessment of childhood apraxia of speech.

    Science.gov (United States)

    Gubiani, Marileda Barichello; Pagliarin, Karina Carlesso; Keske-Soares, Marcia

    2015-01-01

    This study systematically reviews the literature on the main tools used to evaluate childhood apraxia of speech (CAS). The search strategy includes Scopus, PubMed, and Embase databases. Empirical studies that used tools for assessing CAS were selected. Articles were selected by two independent researchers. The search retrieved 695 articles, out of which 12 were included in the study. Five tools were identified: Verbal Motor Production Assessment for Children, Dynamic Evaluation of Motor Speech Skill, The Orofacial Praxis Test, Kaufman Speech Praxis Test for Children, and Madison Speech Assessment Protocol. There are few instruments available for CAS assessment and most of them are intended to assess praxis and/or orofacial movements, sequences of orofacial movements, articulation of syllables and phonemes, spontaneous speech, and prosody. There are some tests for assessment and diagnosis of CAS. However, few studies on this topic have been conducted at the national level, as well as protocols to assess and assist in an accurate diagnosis.

  9. Auditory-Perceptual Evaluation of Dysphonia: A Comparison Between Narrow and Broad Terminology Systems

    DEFF Research Database (Denmark)

    Iwarsson, Jenny

    2017-01-01

    of the terminology used in the multiparameter Danish Dysphonia Assessment (DDA) approach into the five-parameter GRBAS system. Methods. Voice samples illustrating type and grade of the voice qualities included in DDA were rated by five speech language pathologists using the GRBAS system with the aim of estimating...... terms and antagonists, reflecting muscular hypo- and hyperfunction. Key Words: Auditory-perceptual voice analysis–Dysphonia–GRBAS–Listening test–Voice ratings....

  10. Typical versus delayed speech onset influences verbal reporting of autistic interests.

    Science.gov (United States)

    Chiodo, Liliane; Majerus, Steve; Mottron, Laurent

    2017-01-01

    The distinction between autism and Asperger syndrome has been abandoned in the DSM-5. However, this clinical categorization largely overlaps with the presence or absence of a speech onset delay, which is associated with clinical, cognitive, and neural differences. It is unknown whether these different speech development pathways and associated cognitive differences are involved in the heterogeneity of the restricted interests that characterize autistic adults. This study tested the hypothesis that speech onset delay, or conversely, early mastery of speech, orients the nature and verbal reporting of adult autistic interests. The occurrence of a priori defined descriptors for perceptual and thematic dimensions, as well as the perceived function and benefits, was determined in the responses of autistic people to a semi-structured interview on their intense interests. The number of words, grammatical categories, and proportion of perceptual/thematic descriptors were computed and compared between groups by variance analyses. The participants comprised 40 autistic adults grouped according to the presence (N = 20) or absence (N = 20) of speech onset delay, as well as 20 non-autistic adults, also with intense interests, matched for non-verbal intelligence using Raven's Progressive Matrices. The overall nature, function, and benefit of intense interests were similar across autistic subgroups, and between autistic and non-autistic groups. However, autistic participants with a history of speech onset delay used more perceptual than thematic descriptors when talking about their interests, whereas the opposite was true for autistic individuals without speech onset delay. This finding remained significant after controlling for linguistic differences observed between the two groups. Verbal reporting, but not the nature or positive function, of intense interests differed between adult autistic individuals depending on their speech acquisition history: oral reporting of

  11. Timing in audiovisual speech perception: A mini review and new psychophysical data.

    Science.gov (United States)

    Venezia, Jonathan H; Thurman, Steven M; Matchin, William; George, Sahara E; Hickok, Gregory

    2016-02-01

    Recent influential models of audiovisual speech perception suggest that visual speech aids perception by generating predictions about the identity of upcoming speech sounds. These models place stock in the assumption that visual speech leads auditory speech in time. However, it is unclear whether and to what extent temporally-leading visual speech information contributes to perception. Previous studies exploring audiovisual-speech timing have relied upon psychophysical procedures that require artificial manipulation of cross-modal alignment or stimulus duration. We introduce a classification procedure that tracks perceptually relevant visual speech information in time without requiring such manipulations. Participants were shown videos of a McGurk syllable (auditory /apa/ + visual /aka/ = perceptual /ata/) and asked to perform phoneme identification (/apa/ yes-no). The mouth region of the visual stimulus was overlaid with a dynamic transparency mask that obscured visual speech in some frames but not others randomly across trials. Variability in participants' responses (~35 % identification of /apa/ compared to ~5 % in the absence of the masker) served as the basis for classification analysis. The outcome was a high resolution spatiotemporal map of perceptually relevant visual features. We produced these maps for McGurk stimuli at different audiovisual temporal offsets (natural timing, 50-ms visual lead, and 100-ms visual lead). Briefly, temporally-leading (~130 ms) visual information did influence auditory perception. Moreover, several visual features influenced perception of a single speech sound, with the relative influence of each feature depending on both its temporal relation to the auditory signal and its informational content.
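
    The classification analysis described above can be sketched as a classification-image computation: correlate, frame by frame, the visibility of the mouth region with the listener's response across trials. Everything below is simulated in Python (the mask matrix, the response model, and the informative frame); it illustrates the logic only, not the authors' exact estimator.

        import numpy as np

        rng = np.random.default_rng(1)
        n_trials, n_frames = 500, 30

        # Hypothetical data: masks[t, f] = 1 if the mouth region was visible in
        # frame f on trial t; resp[t] = 1 if the listener reported hearing /apa/.
        masks = rng.integers(0, 2, size=(n_trials, n_frames)).astype(float)
        logits = 0.5 - 2.0 * masks[:, 10]   # pretend frame 10 carries the visual cue
        resp = (rng.random(n_trials) < 1 / (1 + np.exp(-logits))).astype(float)

        # Classification image: correlate frame visibility with the response
        # across trials; frames whose visibility drives perception get large |r|.
        mz = (masks - masks.mean(0)) / masks.std(0)
        rz = (resp - resp.mean()) / resp.std()
        cimage = mz.T @ rz / n_trials
        print(int(np.argmax(np.abs(cimage))))  # ~10, the informative frame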

  12. Timing in Audiovisual Speech Perception: A Mini Review and New Psychophysical Data

    Science.gov (United States)

    Venezia, Jonathan H.; Thurman, Steven M.; Matchin, William; George, Sahara E.; Hickok, Gregory

    2015-01-01

    Recent influential models of audiovisual speech perception suggest that visual speech aids perception by generating predictions about the identity of upcoming speech sounds. These models place stock in the assumption that visual speech leads auditory speech in time. However, it is unclear whether and to what extent temporally-leading visual speech information contributes to perception. Previous studies exploring audiovisual-speech timing have relied upon psychophysical procedures that require artificial manipulation of cross-modal alignment or stimulus duration. We introduce a classification procedure that tracks perceptually-relevant visual speech information in time without requiring such manipulations. Participants were shown videos of a McGurk syllable (auditory /apa/ + visual /aka/ = perceptual /ata/) and asked to perform phoneme identification (/apa/ yes-no). The mouth region of the visual stimulus was overlaid with a dynamic transparency mask that obscured visual speech in some frames but not others randomly across trials. Variability in participants' responses (∼35% identification of /apa/ compared to ∼5% in the absence of the masker) served as the basis for classification analysis. The outcome was a high resolution spatiotemporal map of perceptually-relevant visual features. We produced these maps for McGurk stimuli at different audiovisual temporal offsets (natural timing, 50-ms visual lead, and 100-ms visual lead). Briefly, temporally-leading (∼130 ms) visual information did influence auditory perception. Moreover, several visual features influenced perception of a single speech sound, with the relative influence of each feature depending on both its temporal relation to the auditory signal and its informational content. PMID:26669309

  13. Is sequence awareness mandatory for perceptual sequence learning: An assessment using a pure perceptual sequence learning design.

    Science.gov (United States)

    Deroost, Natacha; Coomans, Daphné

    2018-02-01

    We examined the role of sequence awareness in a pure perceptual sequence learning design. Participants had to react to the target's colour, which changed according to a perceptual sequence. By varying the mapping of the target's colour onto the response keys, motor responses changed randomly. The effect of sequence awareness on perceptual sequence learning was determined by manipulating the learning instructions (explicit versus implicit) and assessing the amount of sequence awareness after the experiment. In the explicit instruction condition (n = 15), participants were instructed to intentionally search for the colour sequence, whereas in the implicit instruction condition (n = 15), they were left uninformed about the sequenced nature of the task. Sequence awareness after the sequence learning task was tested by means of a questionnaire and the process-dissociation procedure. The results showed that the instruction manipulation had no effect on the amount of perceptual sequence learning. Based on their reports of having actively applied their sequence knowledge during the experiment, participants were subsequently regrouped into a sequence strategy group (n = 14, of whom 4 came from the implicit instruction condition and 10 from the explicit instruction condition) and a no-sequence strategy group (n = 16, of whom 11 came from the implicit instruction condition and 5 from the explicit instruction condition). Only participants of the sequence strategy group showed reliable perceptual sequence learning and sequence awareness. These results indicate that perceptual sequence learning depends upon the continuous employment of strategic cognitive control processes on sequence knowledge. Sequence awareness is suggested to be a necessary but not sufficient condition for perceptual learning to take place. Copyright © 2018 Elsevier B.V. All rights reserved.
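
    For reference, the process-dissociation procedure mentioned above separates controlled from automatic influences of sequence knowledge by contrasting performance under inclusion and exclusion instructions. A generic Python sketch of the standard Jacoby (1991) estimates, with made-up proportions; the authors' exact scoring may differ.

        def process_dissociation(p_inclusion, p_exclusion):
            """Controlled influence C = I - E; automatic influence A = E / (1 - C)."""
            controlled = p_inclusion - p_exclusion
            automatic = p_exclusion / (1.0 - controlled) if controlled < 1 else float("nan")
            return controlled, automatic

        # Example: 80% sequence-consistent responding under inclusion
        # instructions, 30% under exclusion instructions.
        print(process_dissociation(0.80, 0.30))  # (0.5, 0.6)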

  14. Automatic Speech Signal Analysis for Clinical Diagnosis and Assessment of Speech Disorders

    CERN Document Server

    Baghai-Ravary, Ladan

    2013-01-01

    Automatic Speech Signal Analysis for Clinical Diagnosis and Assessment of Speech Disorders provides a survey of methods designed to aid clinicians in the diagnosis and monitoring of speech disorders such as dysarthria and dyspraxia, with an emphasis on the signal processing techniques, statistical validity of the results presented in the literature, and the appropriateness of methods that do not require specialized equipment, rigorously controlled recording procedures or highly skilled personnel to interpret results. Such techniques offer the promise of a simple and cost-effective, yet objective, assessment of a range of medical conditions, which would be of great value to clinicians. The ideal scenario would begin with the collection of examples of the clients’ speech, either over the phone or using portable recording devices operated by non-specialist nursing staff. The recordings could then be analyzed initially to aid diagnosis of conditions, and subsequently to monitor the clients’ progress and res...

  15. Personality in speech assessment and automatic classification

    CERN Document Server

    Polzehl, Tim

    2015-01-01

    This work combines interdisciplinary knowledge and experience from research fields of psychology, linguistics, audio-processing, machine learning, and computer science. The work systematically explores a novel research topic devoted to automated modeling of personality expression from speech. For this aim, it introduces a novel personality assessment questionnaire and presents the results of extensive labeling sessions to annotate the speech data with personality assessments. It provides estimates of the Big 5 personality traits, i.e. openness, conscientiousness, extroversion, agreeableness, and neuroticism. Based on a database built on the questionnaire, the book presents models to tell apart different personality types or classes from speech automatically.

  16. Speech perception at the interface of neurobiology and linguistics.

    Science.gov (United States)

    Poeppel, David; Idsardi, William J; van Wassenhove, Virginie

    2008-03-12

    Speech perception consists of a set of computations that take continuously varying acoustic waveforms as input and generate discrete representations that make contact with the lexical representations stored in long-term memory as output. Because the perceptual objects that are recognized by the speech perception system enter into subsequent linguistic computation, the format that is used for lexical representation and processing fundamentally constrains the speech perceptual processes. Consequently, theories of speech perception must, at some level, be tightly linked to theories of lexical representation. Minimally, speech perception must yield representations that smoothly and rapidly interface with stored lexical items. Adopting the perspective of Marr, we argue and provide neurobiological and psychophysical evidence for the following research programme. First, at the implementational level, speech perception is a multi-time resolution process, with perceptual analyses occurring concurrently on at least two time scales (approx. 20-80 ms, approx. 150-300 ms), commensurate with (sub)segmental and syllabic analyses, respectively. Second, at the algorithmic level, we suggest that perception proceeds on the basis of internal forward models, or uses an 'analysis-by-synthesis' approach. Third, at the computational level (in the sense of Marr), the theory of lexical representation that we adopt is principally informed by phonological research and assumes that words are represented in the mental lexicon in terms of sequences of discrete segments composed of distinctive features. One important goal of the research programme is to develop linking hypotheses between putative neurobiological primitives (e.g. temporal primitives) and those primitives derived from linguistic inquiry, to arrive ultimately at a biologically sensible and theoretically satisfying model of representation and computation in speech.

  17. Requirements for the evaluation of computational speech segregation systems

    DEFF Research Database (Denmark)

    May, Tobias; Dau, Torsten

    2014-01-01

    Recent studies on computational speech segregation reported improved speech intelligibility in noise when estimating and applying an ideal binary mask with supervised learning algorithms. However, an important requirement for such systems in technical applications is their robustness to acoustic...... associated with perceptual attributes in speech segregation. The results could help establish a framework for a systematic evaluation of future segregation systems....
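
    The ideal binary mask that such segregation systems are trained to estimate is straightforward to state: keep a time-frequency unit when its local SNR exceeds a criterion. A minimal Python sketch under the assumption of known (oracle) speech and noise power spectrograms; the paper's own systems estimate this mask from the mixture instead.

        import numpy as np

        def ideal_binary_mask(speech_pow, noise_pow, lc_db=0.0):
            """Keep a time-frequency unit when local SNR exceeds the local
            criterion (LC); the oracle target of supervised segregation."""
            snr_db = 10 * np.log10(speech_pow / noise_pow)
            return (snr_db > lc_db).astype(np.uint8)

        rng = np.random.default_rng(6)
        S = rng.random((64, 100)) + 1e-6   # toy speech power spectrogram
        N = rng.random((64, 100)) + 1e-6   # toy noise power spectrogram
        print(ideal_binary_mask(S, N).mean())  # fraction of units retained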

  18. Differential Diagnosis of Children with Suspected Childhood Apraxia of Speech

    Science.gov (United States)

    Murray, Elizabeth; McCabe, Patricia; Heard, Robert; Ballard, Kirrie J.

    2015-01-01

    Purpose: The gold standard for diagnosing childhood apraxia of speech (CAS) is expert judgment of perceptual features. The aim of this study was to identify a set of objective measures that differentiate CAS from other speech disorders. Method: Seventy-two children (4-12 years of age) diagnosed with suspected CAS by community speech-language…

  19. The Speech multi features fusion perceptual hash algorithm based on tensor decomposition

    Science.gov (United States)

    Huang, Y. B.; Fan, M. H.; Zhang, Q. Y.

    2018-03-01

    With constant progress in modern speech communication technologies, speech data is prone to corruption by noise or to malicious tampering. To give the speech perceptual hash algorithm strong robustness and high efficiency, this paper puts forward a speech perceptual hash algorithm based on tensor decomposition and multiple features. The algorithm obtains each speech component through wavelet packet decomposition and extracts the LPCC, LSP, and ISP features of each component to constitute a speech feature tensor. Speech authentication is performed by generating hash values through quantification of the feature matrix against its mid-value (median). Experimental results show that the proposed algorithm is more robust to content-preserving operations than similar algorithms and is able to resist common background noise. The algorithm is also computationally efficient, meeting the real-time requirements of speech communication and completing speech authentication quickly.
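
    The final hashing and authentication steps described above can be sketched generically in Python: binarize a feature matrix against its per-feature median (the "mid-value") and compare hashes by bit error rate. Upstream feature extraction (wavelet packets, LPCC/LSP/ISP, tensor decomposition) is assumed to have already produced the matrix; the data here are random stand-ins.

        import numpy as np

        def perceptual_hash(features):
            """Binary hash from a (frames x features) matrix: 1 where an entry
            exceeds its column median, mirroring mid-value quantification."""
            return (features > np.median(features, axis=0)).astype(np.uint8).ravel()

        def bit_error_rate(h1, h2):
            """Fraction of differing bits; small -> same content, large -> tampered."""
            return float(np.mean(h1 != h2))

        rng = np.random.default_rng(2)
        feats = rng.standard_normal((100, 16))
        noisy = feats + 0.1 * rng.standard_normal(feats.shape)   # content-preserving noise
        tampered = rng.standard_normal(feats.shape)              # different content
        print(bit_error_rate(perceptual_hash(feats), perceptual_hash(noisy)))     # small
        print(bit_error_rate(perceptual_hash(feats), perceptual_hash(tampered)))  # ~0.5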

  20. The Auditory-Visual Speech Benefit on Working Memory in Older Adults with Hearing Impairment

    Directory of Open Access Journals (Sweden)

    Jana B. Frtusova

    2016-04-01

    Full Text Available This study examined the effect of auditory-visual (AV) speech stimuli on working memory in hearing-impaired participants (HIP) in comparison to age- and education-matched normal elderly controls (NEC). Participants completed a working memory n-back task (0- to 2-back) in which sequences of digits were presented in visual-only (i.e., speech-reading), auditory-only (A-only), and AV conditions. Auditory event-related potentials (ERP) were collected to assess the relationship between perceptual and working memory processing. The behavioural results showed that both groups were faster in the AV condition in comparison to the unisensory conditions. The ERP data showed perceptual facilitation in the AV condition, in the form of reduced amplitudes and latencies of the auditory N1 and/or P1 components, in the HIP group. Furthermore, a working memory ERP component, the P3, peaked earlier for both groups in the AV condition compared to the A-only condition. In general, the HIP group showed a more robust AV benefit; however, the NECs showed a dose-response relationship between perceptual facilitation and working memory improvement, especially for facilitation of processing speed. Two measures, reaction time and P3 amplitude, suggested that the presence of visual speech cues may have helped the HIP to counteract the demanding auditory processing, to the level that no group differences were evident during the AV modality despite lower performance during the A-only condition. Overall, this study provides support for the theory of an integrated perceptual-cognitive system. The practical significance of these findings is also discussed.

  1. Scale-free amplitude modulation of neuronal oscillations tracks comprehension of accelerated speech

    NARCIS (Netherlands)

    Borges, Ana Filipa Teixeira; Giraud, Anne Lise; Mansvelder, Huibert D.; Linkenkaer-Hansen, Klaus

    2018-01-01

    Speech comprehension is preserved up to a threefold acceleration, but deteriorates rapidly at higher speeds. Current models posit that perceptual resilience to accelerated speech is limited by the brain’s ability to parse speech into syllabic units using δ/θ oscillations. Here, we investigated

  2. Cross-linguistic perspectives on speech assessment in cleft palate

    DEFF Research Database (Denmark)

    Willadsen, Elisabeth; Henningsson, Gunilla

    2012-01-01

    This chapter deals with cross-linguistic perspectives that need to be taken into account when comparing speech assessment and speech outcome obtained from cleft palate speakers of different languages. Firstly, an overview of consonants and vowels vulnerable to the cleft condition is presented. Then, consequences for assessment of cleft palate speech by native versus non-native speakers of a language are discussed, as well as the use of phonemic versus phonetic transcription in cross-linguistic studies. Specific recommendations for the construction of speech samples in cross-linguistic studies are given. Finally, the influence of different languages on some aspects of language acquisition in young children with cleft palate is presented and discussed. Until recently, not much has been written about cross-linguistic perspectives when dealing with cleft palate speech. Most literature about assessment......

  3. Musical Training during Early Childhood Enhances the Neural Encoding of Speech in Noise

    Science.gov (United States)

    Strait, Dana L.; Parbery-Clark, Alexandra; Hittner, Emily; Kraus, Nina

    2012-01-01

    For children, learning often occurs in the presence of background noise. As such, there is growing desire to improve a child's access to a target signal in noise. Given adult musicians' perceptual and neural speech-in-noise enhancements, we asked whether similar effects are present in musically-trained children. We assessed the perception and…

  4. The assessment of aesthetic and perceptual aspects within environmental impact assessment of renewable energy projects in Italy

    International Nuclear Information System (INIS)

    Tolli, Michela; Recanatesi, Fabio; Piccinno, Matteo; Leone, Antonio

    2016-01-01

    The main aim of this paper is to explore how perceptual and aesthetic impact analyses are considered in Environmental Impact Assessment (EIA), with specific reference to Italian renewable energy projects. To investigate this topic, the paper starts by establishing which factors are linked with perceptual and aesthetic impacts and why it is important to analyze these aspects, which are also related to legislative provisions and procedures in Europe and in Italy. In particular the paper refers to renewable energy projects because environmental policies are encouraging more and more investment in this kind of primary resource. The growing interest in this type of energy is leading to the realization of projects which change the governance of territories, with inevitable effects on the landscape from the aesthetic and perceptual points of view. Legislative references to EIA, including the latest directive regarding this topic show the importance of integrating the assessment of environmental and perceptual impacts, thus there is a need to improve EIA methodological approaches to this purpose. This paper proposes a profile of aesthetic and perceptual impact analysis in EIA for renewable energy projects in Italy, and concludes with recommendations as to how this kind of analysis could be improved. - Highlights: • We analyze 29 EIA Reports of Italian renewable energy projects. • We examine esthetic and perceptual aspects present in Italian EIA reports. • We identified inconsistency in use of methods for esthetic and perceptual aspects. • Local populations are rarely included as stakeholders in EIAs. • A shared understanding of perceptual and esthetic issues in EIA proceedings is required.

  5. The assessment of aesthetic and perceptual aspects within environmental impact assessment of renewable energy projects in Italy

    Energy Technology Data Exchange (ETDEWEB)

    Tolli, Michela, E-mail: michela.tolli@uniroma1.it [Department of Architecture and Design (DiAP), Sapienza University of Rome, Via Gramsci 53, 00197 Rome (Italy); Recanatesi, Fabio, E-mail: fabio.rec@unitus.it [Department of Agriculture, Forests, Nature and Energy (D.A.F.N.E.), Tuscia University, Via S. Camillo de Lellis, 01100 Viterbo (Italy); Piccinno, Matteo; Leone, Antonio [Department of Agriculture, Forests, Nature and Energy (D.A.F.N.E.), Tuscia University, Via S. Camillo de Lellis, 01100 Viterbo (Italy)

    2016-02-15

    The main aim of this paper is to explore how perceptual and aesthetic impact analyses are considered in Environmental Impact Assessment (EIA), with specific reference to Italian renewable energy projects. To investigate this topic, the paper starts by establishing which factors are linked with perceptual and aesthetic impacts and why it is important to analyze these aspects, which are also related to legislative provisions and procedures in Europe and in Italy. In particular the paper refers to renewable energy projects because environmental policies are encouraging more and more investment in this kind of primary resource. The growing interest in this type of energy is leading to the realization of projects which change the governance of territories, with inevitable effects on the landscape from the aesthetic and perceptual points of view. Legislative references to EIA, including the latest directive regarding this topic show the importance of integrating the assessment of environmental and perceptual impacts, thus there is a need to improve EIA methodological approaches to this purpose. This paper proposes a profile of aesthetic and perceptual impact analysis in EIA for renewable energy projects in Italy, and concludes with recommendations as to how this kind of analysis could be improved. - Highlights: • We analyze 29 EIA Reports of Italian renewable energy projects. • We examine esthetic and perceptual aspects present in Italian EIA reports. • We identified inconsistency in use of methods for esthetic and perceptual aspects. • Local populations are rarely included as stakeholders in EIAs. • A shared understanding of perceptual and esthetic issues in EIA proceedings is required.

  6. Comparing the information conveyed by envelope modulation for speech intelligibility, speech quality, and music quality.

    Science.gov (United States)

    Kates, James M; Arehart, Kathryn H

    2015-10-01

    This paper uses mutual information to quantify the relationship between envelope modulation fidelity and perceptual responses. Data from several previous experiments that measured speech intelligibility, speech quality, and music quality are evaluated for normal-hearing and hearing-impaired listeners. A model of the auditory periphery is used to generate envelope signals, and envelope modulation fidelity is calculated using the normalized cross-covariance of the degraded signal envelope with that of a reference signal. Two procedures are used to describe the envelope modulation: (1) modulation within each auditory frequency band and (2) spectro-temporal processing that analyzes the modulation of spectral ripple components fit to successive short-time spectra. The results indicate that low modulation rates provide the highest information for intelligibility, while high modulation rates provide the highest information for speech and music quality. The low-to-mid auditory frequencies are most important for intelligibility, while mid frequencies are most important for speech quality and high frequencies are most important for music quality. Differences between the spectral ripple components used for the spectro-temporal analysis were not significant in five of the six experimental conditions evaluated. The results indicate that different modulation-rate and auditory-frequency weights may be appropriate for indices designed to predict different types of perceptual relationships.
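
    The envelope-fidelity measure described above, the normalized cross-covariance of degraded and reference envelopes, can be sketched in Python as follows. The Hilbert-envelope extraction and 50-Hz smoothing cutoff are illustrative assumptions, and the paper computes the measure within auditory frequency bands from a periphery model rather than on the broadband waveform as done here.

        import numpy as np
        from scipy import signal

        def envelope(x, fs, cutoff=50.0):
            """Temporal envelope: magnitude of the analytic signal, low-pass smoothed."""
            env = np.abs(signal.hilbert(x))
            sos = signal.butter(2, cutoff, btype="low", fs=fs, output="sos")
            return signal.sosfiltfilt(sos, env)

        def envelope_fidelity(ref, deg, fs):
            """Normalized cross-covariance of degraded vs. reference envelopes,
            a single number in [-1, 1]."""
            e1, e2 = envelope(ref, fs), envelope(deg, fs)
            e1, e2 = e1 - e1.mean(), e2 - e2.mean()
            return float(np.dot(e1, e2) / np.sqrt(np.dot(e1, e1) * np.dot(e2, e2)))

        fs = 16000
        t = np.arange(fs) / fs
        ref = np.sin(2 * np.pi * 1000 * t) * (1 + 0.8 * np.sin(2 * np.pi * 4 * t))
        deg = ref + 0.5 * np.random.default_rng(3).standard_normal(len(ref))
        print(envelope_fidelity(ref, ref, fs), envelope_fidelity(ref, deg, fs))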

  7. Validating a perceptual distraction model using a personal two-zone sound system

    DEFF Research Database (Denmark)

    Rämö, Jussi; Christensen, Lasse; Bech, Søren

    2017-01-01

    This paper focuses on validating a perceptual distraction model, which aims to predict user's perceived distraction caused by audio-on-audio interference. Originally, the distraction model was trained with music targets and interferers using a simple loudspeaker setup, consisting of only two...... sound zones within the sound-zone system. Thus, validating the model using a different sound-zone system with both speech-on-music and music-on-speech stimuli sets. The results show that the model performance is equally good in both zones, i.e., with both speech- on-music and music-on-speech stimuli...

  8. Low- and high-frequency cortical brain oscillations reflect dissociable mechanisms of concurrent speech segregation in noise.

    Science.gov (United States)

    Yellamsetty, Anusha; Bidelman, Gavin M

    2018-04-01

    Parsing simultaneous speech requires that listeners use pitch-guided segregation, which can be affected by the signal-to-noise ratio (SNR) in the auditory scene. The interaction of these two cues may occur at multiple levels within the cortex. The aims of the current study were to assess the correspondence between oscillatory brain rhythms and behavior, and to determine how listeners exploit pitch and SNR cues to successfully segregate concurrent speech. We recorded electrical brain activity while participants heard double-vowel stimuli whose fundamental frequencies (F0s) differed by zero or four semitones (STs) presented in either clean or noise-degraded (+5 dB SNR) conditions. We found that behavioral identification was more accurate for vowel mixtures with larger pitch separations, but the F0 benefit interacted with noise. Time-frequency analysis decomposed the EEG into different spectrotemporal frequency bands. Low-frequency (θ, β) responses were elevated when speech did not contain pitch cues (0ST > 4ST) or was noisy, suggesting a correlate of increased listening effort and/or memory demands. Contrastively, γ power increments were observed for changes in both pitch (0ST > 4ST) and SNR (clean > noise), suggesting that high-frequency bands carry information related to acoustic features and the quality of speech representations. Brain-behavior associations corroborated these effects; modulations in low-frequency rhythms predicted the speed of listeners' perceptual decisions, with higher bands predicting identification accuracy. Results are consistent with the notion that neural oscillations reflect both automatic (pre-perceptual) and controlled (post-perceptual) mechanisms of speech processing that are largely divisible into high- and low-frequency bands of human brain rhythms. Copyright © 2018 Elsevier B.V. All rights reserved.
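
    A generic Python sketch of the band-wise decomposition underlying such analyses: band-pass the EEG into canonical rhythms and take mean power from the Hilbert envelope. The band edges and filter choices below are conventional assumptions, not the authors' exact time-frequency pipeline (which may use wavelets).

        import numpy as np
        from scipy import signal

        BANDS = {"theta": (4, 8), "beta": (13, 30), "gamma": (30, 50)}  # Hz, nominal

        def band_power(eeg, fs):
            """Mean power per EEG band via band-pass filtering + Hilbert envelope."""
            out = {}
            for name, (lo, hi) in BANDS.items():
                sos = signal.butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
                analytic = signal.hilbert(signal.sosfiltfilt(sos, eeg))
                out[name] = float(np.mean(np.abs(analytic) ** 2))
            return out

        fs = 500
        t = np.arange(2 * fs) / fs
        eeg = np.sin(2 * np.pi * 6 * t) + 0.2 * np.sin(2 * np.pi * 40 * t)  # toy trace
        print(band_power(eeg, fs))  # theta power dominates this synthetic signal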

  9. The effects of bilingualism on children's perception of speech sounds

    NARCIS (Netherlands)

    Brasileiro, I.

    2009-01-01

    The general topic addressed by this dissertation is that of bilingualism, and more specifically, the topic of bilingual acquisition of speech sounds. The central question in this study is the following: does bilingualism affect children’s perceptual development of speech sounds? The term bilingual

  10. Perceptual restoration of degraded speech is preserved with advancing age.

    Science.gov (United States)

    Saija, Jefta D; Akyürek, Elkan G; Andringa, Tjeerd C; Başkent, Deniz

    2014-02-01

    Cognitive skills, such as processing speed, memory functioning, and the ability to divide attention, are known to diminish with aging. The present study shows that, despite these changes, older adults can successfully compensate for degradations in speech perception. Critically, the older participants of this study were not pre-selected for high performance on cognitive tasks, but only screened for normal hearing. We measured the compensation for speech degradation using phonemic restoration, where intelligibility of degraded speech is enhanced using top-down repair mechanisms. Linguistic knowledge, Gestalt principles of perception, and expectations based on situational and linguistic context are used to effectively fill in the inaudible masked speech portions. A positive compensation effect was previously observed only with young normal hearing people, but not with older hearing-impaired populations, leaving the question whether the lack of compensation was due to aging or due to age-related hearing problems. Older participants in the present study showed poorer intelligibility of degraded speech than the younger group, as expected from previous reports of aging effects. However, in conditions that induce top-down restoration, a robust compensation was observed. Speech perception by the older group was enhanced, and the enhancement effect was similar to that observed with the younger group. This effect was even stronger with slowed-down speech, which gives more time for cognitive processing. Based on previous research, the likely explanations for these observations are that older adults can overcome age-related cognitive deterioration by relying on linguistic skills and vocabulary that they have accumulated over their lifetime. Alternatively, or simultaneously, they may use different cerebral activation patterns or exert more mental effort. This positive finding on top-down restoration skills by the older individuals suggests that new cognitive training methods

  11. Acoustic and Perceptual Effects of Dysarthria in Greek with a Focus on Lexical Stress

    Science.gov (United States)

    Papakyritsis, Ioannis

    The field of motor speech disorders in Greek is substantially underresearched. Additionally, acoustic studies on lexical stress in dysarthria are generally very rare (Kim et al. 2010). This dissertation examined the acoustic and perceptual effects of Greek dysarthria focusing on lexical stress. Additional possibly deviant speech characteristics were acoustically analyzed. Data from three dysarthric participants and matched controls was analyzed using a case study design. The analysis of lexical stress was based on data drawn from a single word repetition task that included pairs of disyllabic words differentiated by stress location. This data was acoustically analyzed in terms of the use of the acoustic cues for Greek stress. The ability of the dysarthric participants to signal stress in single words was further assessed in a stress identification task carried out by 14 naive Greek listeners. Overall, the acoustic and perceptual data indicated that, although all three dysarthric speakers presented with some difficulty in the patterning of stressed and unstressed syllables, each had different underlying problems that gave rise to quite distinct patterns of deviant speech characteristics. The atypical use of lexical stress cues in Anna's data obscured the prominence relations of stressed and unstressed syllables to the extent that the position of lexical stress was usually not perceptually transparent. Chris and Maria on the other hand, did not have marked difficulties signaling lexical stress location, although listeners were not 100% successful in the stress identification task. For the most part, Chris' atypical phonation patterns and Maria's very slow rate of speech did not interfere with lexical stress signaling. The acoustic analysis of the lexical stress cues was generally in agreement with the participants' performance in the stress identification task. Interestingly, in all three dysarthric participants, but more so in Anna, targets stressed on the 1st
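
    Greek lexical stress is cued chiefly by syllable duration, F0, and intensity, so an acoustic analysis like the one above reduces to comparing those measures across the two syllables of each word. A toy Python sketch with hypothetical per-syllable measurements; the field names and numbers are illustrative only.

        def stress_cues(stressed, unstressed):
            """Classic acoustic correlates of lexical stress: the stressed syllable
            is longer, higher-pitched, and louder. Returns duration and F0 ratios
            plus the intensity difference in dB."""
            return {
                "duration_ratio": stressed["dur"] / unstressed["dur"],
                "f0_ratio": stressed["f0"] / unstressed["f0"],
                "intensity_diff_db": stressed["db"] - unstressed["db"],
            }

        # Hypothetical measurements for one disyllabic word (seconds, Hz, dB SPL).
        s1 = {"dur": 0.21, "f0": 210.0, "db": 68.0}
        s2 = {"dur": 0.14, "f0": 180.0, "db": 63.0}
        print(stress_cues(s1, s2))  # ratios > 1 and positive dB -> s1 sounds stressed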

  12. The Hypothesis of Apraxia of Speech in Children with Autism Spectrum Disorder

    OpenAIRE

    Shriberg, Lawrence D.; Paul, Rhea; Black, Lois M.; van Santen, Jan P.

    2011-01-01

    In a sample of 46 children aged 4 to 7 years with Autism Spectrum Disorder (ASD) and intelligible speech, there was no statistical support for the hypothesis of concomitant Childhood Apraxia of Speech (CAS). Perceptual and acoustic measures of participants’ speech, prosody, and voice were compared with data from 40 typically-developing children, 13 preschool children with Speech Delay, and 15 participants aged 5 to 49 years with CAS in neurogenetic disorders. Speech Delay and Speech Errors, r...

  13. The Auditory-Visual Speech Benefit on Working Memory in Older Adults with Hearing Impairment.

    Science.gov (United States)

    Frtusova, Jana B; Phillips, Natalie A

    2016-01-01

    This study examined the effect of auditory-visual (AV) speech stimuli on working memory in older adults with poorer-hearing (PH) in comparison to age- and education-matched older adults with better hearing (BH). Participants completed a working memory n-back task (0- to 2-back) in which sequences of digits were presented in visual-only (i.e., speech-reading), auditory-only (A-only), and AV conditions. Auditory event-related potentials (ERP) were collected to assess the relationship between perceptual and working memory processing. The behavioral results showed that both groups were faster in the AV condition in comparison to the unisensory conditions. The ERP data showed perceptual facilitation in the AV condition, in the form of reduced amplitudes and latencies of the auditory N1 and/or P1 components, in the PH group. Furthermore, a working memory ERP component, the P3, peaked earlier for both groups in the AV condition compared to the A-only condition. In general, the PH group showed a more robust AV benefit; however, the BH group showed a dose-response relationship between perceptual facilitation and working memory improvement, especially for facilitation of processing speed. Two measures, reaction time and P3 amplitude, suggested that the presence of visual speech cues may have helped the PH group to counteract the demanding auditory processing, to the level that no group differences were evident during the AV modality despite lower performance during the A-only condition. Overall, this study provides support for the theory of an integrated perceptual-cognitive system. The practical significance of these findings is also discussed.

  14. Musical Experience and the Aging Auditory System: Implications for Cognitive Abilities and Hearing Speech in Noise

    Science.gov (United States)

    Parbery-Clark, Alexandra; Strait, Dana L.; Anderson, Samira; Hittner, Emily; Kraus, Nina

    2011-01-01

    Much of our daily communication occurs in the presence of background noise, compromising our ability to hear. While understanding speech in noise is a challenge for everyone, it becomes increasingly difficult as we age. Although aging is generally accompanied by hearing loss, this perceptual decline cannot fully account for the difficulties experienced by older adults for hearing in noise. Decreased cognitive skills concurrent with reduced perceptual acuity are thought to contribute to the difficulty older adults experience understanding speech in noise. Given that musical experience positively impacts speech perception in noise in young adults (ages 18–30), we asked whether musical experience benefits an older cohort of musicians (ages 45–65), potentially offsetting the age-related decline in speech-in-noise perceptual abilities and associated cognitive function (i.e., working memory). Consistent with performance in young adults, older musicians demonstrated enhanced speech-in-noise perception relative to nonmusicians along with greater auditory, but not visual, working memory capacity. By demonstrating that speech-in-noise perception and related cognitive function are enhanced in older musicians, our results imply that musical training may reduce the impact of age-related auditory decline. PMID:21589653

  15. Musical experience and the aging auditory system: implications for cognitive abilities and hearing speech in noise.

    Science.gov (United States)

    Parbery-Clark, Alexandra; Strait, Dana L; Anderson, Samira; Hittner, Emily; Kraus, Nina

    2011-05-11

    Much of our daily communication occurs in the presence of background noise, compromising our ability to hear. While understanding speech in noise is a challenge for everyone, it becomes increasingly difficult as we age. Although aging is generally accompanied by hearing loss, this perceptual decline cannot fully account for the difficulties experienced by older adults for hearing in noise. Decreased cognitive skills concurrent with reduced perceptual acuity are thought to contribute to the difficulty older adults experience understanding speech in noise. Given that musical experience positively impacts speech perception in noise in young adults (ages 18-30), we asked whether musical experience benefits an older cohort of musicians (ages 45-65), potentially offsetting the age-related decline in speech-in-noise perceptual abilities and associated cognitive function (i.e., working memory). Consistent with performance in young adults, older musicians demonstrated enhanced speech-in-noise perception relative to nonmusicians along with greater auditory, but not visual, working memory capacity. By demonstrating that speech-in-noise perception and related cognitive function are enhanced in older musicians, our results imply that musical training may reduce the impact of age-related auditory decline.

  16. Musical experience and the aging auditory system: implications for cognitive abilities and hearing speech in noise.

    Directory of Open Access Journals (Sweden)

    Alexandra Parbery-Clark

    Full Text Available Much of our daily communication occurs in the presence of background noise, compromising our ability to hear. While understanding speech in noise is a challenge for everyone, it becomes increasingly difficult as we age. Although aging is generally accompanied by hearing loss, this perceptual decline cannot fully account for the difficulties experienced by older adults for hearing in noise. Decreased cognitive skills concurrent with reduced perceptual acuity are thought to contribute to the difficulty older adults experience understanding speech in noise. Given that musical experience positively impacts speech perception in noise in young adults (ages 18-30), we asked whether musical experience benefits an older cohort of musicians (ages 45-65), potentially offsetting the age-related decline in speech-in-noise perceptual abilities and associated cognitive function (i.e., working memory). Consistent with performance in young adults, older musicians demonstrated enhanced speech-in-noise perception relative to nonmusicians along with greater auditory, but not visual, working memory capacity. By demonstrating that speech-in-noise perception and related cognitive function are enhanced in older musicians, our results imply that musical training may reduce the impact of age-related auditory decline.

  17. The Effects of Macroglossia on Speech: A Case Study

    Science.gov (United States)

    Mekonnen, Abebayehu Messele

    2012-01-01

    This article presents a case study of speech production in a 14-year-old Amharic-speaking boy. The boy had developed secondary macroglossia, related to a disturbance of growth hormones, following a history of normal speech development. Perceptual analysis combined with acoustic analysis and static palatography is used to investigate the specific…

  18. A method for Perceptual Assessment of Automotive Audio Systems and Cabin Acoustics

    DEFF Research Database (Denmark)

    Kaplanis, Neofytos; Bech, Søren; Tervo, Sakari

    2016-01-01

    This paper reports the design and implementation of a method to perceptually assess the acoustical properties of a car cabin and the subsequent sound reproduction properties of automotive audio systems. Here, we combine Spatial Decomposition Method and Rapid Sensory Analysis techniques…

  19. Temporal factors affecting somatosensory-auditory interactions in speech processing

    Directory of Open Access Journals (Sweden)

    Takayuki Ito

    2014-11-01

    Full Text Available Speech perception is known to rely on both auditory and visual information. However, sound-specific somatosensory input has been shown also to influence speech perceptual processing (Ito et al., 2009). In the present study we addressed further the relationship between somatosensory information and speech perceptual processing by testing the hypothesis that the temporal relationship between orofacial movement and sound processing contributes to somatosensory-auditory interaction in speech perception. We examined the changes in event-related potentials in response to multisensory synchronous (simultaneous) and asynchronous (90 ms lag and lead) somatosensory and auditory stimulation compared to individual unisensory auditory and somatosensory stimulation alone. We used a robotic device to apply facial skin somatosensory deformations that were similar in timing and duration to those experienced in speech production. Following synchronous multisensory stimulation the amplitude of the event-related potential was reliably different from the two unisensory potentials. More importantly, the magnitude of the event-related potential difference varied as a function of the relative timing of the somatosensory-auditory stimulation. Event-related activity change due to stimulus timing was seen between 160-220 ms following somatosensory onset, mostly around the parietal area. The results demonstrate a dynamic modulation of somatosensory-auditory convergence and suggest that the contribution of somatosensory information to speech processing depends on the specific temporal order of sensory inputs in speech production.

  20. The Computerized Perceptual Motor Skills Assessment: A new visual perceptual motor skills evaluation tool for children in early elementary grades.

    Science.gov (United States)

    Howe, Tsu-Hsin; Chen, Hao-Ling; Lee, Candy Chieh; Chen, Ying-Dar; Wang, Tien-Ni

    2017-10-01

    Visual perceptual motor skills have been proposed as underlying causes of handwriting difficulties. However, there is no evaluation tool currently available to assess these skills comprehensively and to serve as a sensitive measure. The purpose of this study was to validate the Computerized Perceptual Motor Skills Assessment (CPMSA), a newly developed evaluation tool for children in early elementary grades. Its test-retest reliability, concurrent validity, discriminant validity, and responsiveness were examined in 43 typically developing children and 26 children with handwriting difficulty. The CPMSA demonstrated excellent reliability across all subtests with intra-class correlation coefficients (ICCs) ≥ 0.80. Significant moderate correlations between the domains of the CPMSA and corresponding gold standards including the Beery VMI, the TVPS-3, and the eye-hand coordination subtest of the DTVP-2 demonstrated good concurrent validity. In addition, the CPMSA showed evidence of discriminant validity in samples of children with and without handwriting difficulty. This article provides evidence in support of the CPMSA. The CPMSA is a reliable, valid, and promising measure of visual perceptual motor skills for children in early elementary grades. Directions for future study and improvements to the assessment are discussed. Copyright © 2017. Published by Elsevier Ltd.
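
    The test-retest criterion reported above (ICCs ≥ 0.80) can be made concrete with a short computation. The sketch below is a generic illustration rather than the authors' procedure: it computes the two-way random-effects ICC(2,1) of Shrout and Fleiss from a subjects-by-sessions score matrix, using invented example data.

        import numpy as np

        def icc_2_1(scores):
            # scores: (n subjects) x (k sessions) matrix of test scores
            x = np.asarray(scores, dtype=float)
            n, k = x.shape
            grand = x.mean()
            ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()    # between-subjects SS
            ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()    # between-sessions SS
            ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols  # residual SS
            ms_r = ss_rows / (n - 1)
            ms_c = ss_cols / (k - 1)
            ms_e = ss_err / ((n - 1) * (k - 1))
            return (ms_r - ms_e) / (ms_r + (k - 1) * ms_e + k * (ms_c - ms_e) / n)

        # Invented example: five children, each scored in two sessions
        demo = [[12, 11], [18, 19], [9, 10], [15, 15], [21, 20]]
        print(round(icc_2_1(demo), 3))  # ~0.98 here; >= 0.80 would count as excellent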

  1. The Hypothesis of Apraxia of Speech in Children with Autism Spectrum Disorder

    Science.gov (United States)

    Shriberg, Lawrence D.; Paul, Rhea; Black, Lois M.; van Santen, Jan P.

    2011-01-01

    In a sample of 46 children aged 4-7 years with Autism Spectrum Disorder (ASD) and intelligible speech, there was no statistical support for the hypothesis of concomitant Childhood Apraxia of Speech (CAS). Perceptual and acoustic measures of participants' speech, prosody, and voice were compared with data from 40 typically-developing children, 13…

  2. Differences in early speech patterns between Parkinson variant of multiple system atrophy and Parkinson's disease.

    Science.gov (United States)

    Huh, Young Eun; Park, Jongkyu; Suh, Mee Kyung; Lee, Sang Eun; Kim, Jumin; Jeong, Yuri; Kim, Hee-Tae; Cho, Jin Whan

    2015-08-01

    In Parkinson variant of multiple system atrophy (MSA-P), patterns of early speech impairment and their distinguishing features from Parkinson's disease (PD) require further exploration. Here, we compared speech data among patients with early-stage MSA-P, PD, and healthy subjects using quantitative acoustic and perceptual analyses. Variables were analyzed for men and women in view of gender-specific features of speech. Acoustic analysis revealed that male patients with MSA-P exhibited more profound speech abnormalities than those with PD, regarding increased voice pitch, prolonged pause time, and reduced speech rate. This might be due to widespread pathology of MSA-P in nigrostriatal or extra-striatal structures related to speech production. Although several perceptual measures were mildly impaired in MSA-P and PD patients, none of these parameters showed a significant difference between patient groups. Detailed speech analysis using acoustic measures may help distinguish between MSA-P and PD early in the disease process. Copyright © 2015 Elsevier Inc. All rights reserved.

  3. Enhanced perceptual functioning in autism: an update, and eight principles of autistic perception.

    Science.gov (United States)

    Mottron, Laurent; Dawson, Michelle; Soulières, Isabelle; Hubert, Benedicte; Burack, Jake

    2006-01-01

    We propose an "Enhanced Perceptual Functioning" model encompassing the main differences between autistic and non-autistic social and non-social perceptual processing: locally oriented visual and auditory perception, enhanced low-level discrimination, use of a more posterior network in "complex" visual tasks, enhanced perception of first order static stimuli, diminished perception of complex movement, autonomy of low-level information processing toward higher-order operations, and differential relation between perception and general intelligence. Increased perceptual expertise may be implicated in the choice of special ability in savant autistics, and in the variability of apparent presentations within PDD (autism with and without typical speech, Asperger syndrome) in non-savant autistics. The overfunctioning of brain regions typically involved in primary perceptual functions may explain the autistic perceptual endophenotype.

  4. Assessment of Danish-speaking children’s phonological development and speech disorders

    DEFF Research Database (Denmark)

    Clausen, Marit Carolin; Fox-Boyer, Annette

    2018-01-01

    The identification of speech sound disorders is an important everyday task for speech and language therapists (SLTs) working with children. Therefore, assessment tools are needed that are able to correctly identify and diagnose a child with a suspected speech disorder and, furthermore, that provide… of the existing speech assessments in Denmark showed that none of the materials fulfilled current recommendations identified in the research literature. Therefore, the aim of this paper is to describe the evaluation of a newly constructed instrument for assessing the speech development and disorders of Danish… with suspected speech disorder (Clausen and Fox-Boyer, in prep). The results indicated that the instrument showed strong inter-examiner reliability for both populations as well as high content and diagnostic validity. Hence, the study showed that the LogoFoVa can be regarded as a reliable and valid tool…

  5. White noise speech illusion and psychosis expression: An experimental investigation of psychosis liability.

    Science.gov (United States)

    Pries, Lotta-Katrin; Guloksuz, Sinan; Menne-Lothmann, Claudia; Decoster, Jeroen; van Winkel, Ruud; Collip, Dina; Delespaul, Philippe; De Hert, Marc; Derom, Catherine; Thiery, Evert; Jacobs, Nele; Wichers, Marieke; Simons, Claudia J P; Rutten, Bart P F; van Os, Jim

    2017-01-01

    An association between white noise speech illusion and psychotic symptoms has been reported in patients and their relatives. This supports the theory that bottom-up and top-down perceptual processes are involved in the mechanisms underlying perceptual abnormalities. However, findings in nonclinical populations have been conflicting. The aim of this study was to examine the association between white noise speech illusion and subclinical expression of psychotic symptoms in a nonclinical sample. Findings were compared to previous results to investigate potential methodology-dependent differences. In a general population adolescent and young adult twin sample (n = 704), the association between white noise speech illusion and subclinical psychotic experiences, using the Structured Interview for Schizotypy-Revised (SIS-R) and the Community Assessment of Psychic Experiences (CAPE), was analyzed using multilevel logistic regression analyses. Perception of any white noise speech illusion was not associated with either positive or negative schizotypy in the general population twin sample, using the method by Galdos et al. (2011) (positive: adjusted OR: 0.82, 95% CI: 0.6-1.12, p = 0.217; negative: adjusted OR: 0.75, 95% CI: 0.56-1.02, p = 0.065) and the method by Catalan et al. (2014) (positive: adjusted OR: 1.11, 95% CI: 0.79-1.57, p = 0.557). No association was found between CAPE scores and speech illusion (adjusted OR: 1.25, 95% CI: 0.88-1.79, p = 0.220). For the Catalan et al. (2014) but not the Galdos et al. (2011) method, a negative association was apparent between positive schizotypy and speech illusion with positive or negative affective valence (adjusted OR: 0.44, 95% CI: 0.24-0.81, p = 0.008). Contrary to findings in clinical populations, white noise speech illusion may not be associated with psychosis proneness in nonclinical populations.
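
    For readers unfamiliar with the statistics reported above, each adjusted OR comes from a logistic model relating illusion perception to a schizotypy score while controlling for covariates. The fragment below is a generic illustration with invented data: it fits an ordinary logistic regression with statsmodels in place of the paper's multilevel (twin-clustered) models, and all variable names and values are assumptions.

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(0)
        n = 704
        schizotypy = rng.normal(size=n)          # standardized SIS-R-like score (invented)
        age = rng.uniform(15, 35, size=n)        # covariate used for "adjustment" (invented)

        # Invented binary outcome: whether any white noise speech illusion was reported
        lin = -0.5 + 0.0 * schizotypy - 0.01 * (age - 25)
        illusion = rng.binomial(1, 1.0 / (1.0 + np.exp(-lin)))

        X = sm.add_constant(np.column_stack([schizotypy, age]))
        fit = sm.Logit(illusion, X).fit(disp=0)
        or_adj = np.exp(fit.params[1])                     # adjusted OR per unit of schizotypy
        ci = np.exp(np.asarray(fit.conf_int())[1])         # 95% CI transformed to the OR scale
        print(round(float(or_adj), 2), np.round(ci, 2))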

  6. White noise speech illusion and psychosis expression : An experimental investigation of psychosis liability

    NARCIS (Netherlands)

    Pries, Lotta-Katrin; Guloksuz, Sinan; Menne-Lothmann, Claudia; Decoster, Jeroen; van Winkel, Ruud; Collip, Dina; Delespaul, Philippe; De Hert, Marc; Derom, Catherine; Thiery, Evert; Jacobs, Nele; Wichers, Marieke; Simons, Claudia J. P.; Rutten, Bart P. F.; van Os, Jim

    2017-01-01

    Background: An association between white noise speech illusion and psychotic symptoms has been reported in patients and their relatives. This supports the theory that bottom-up and top-down perceptual processes are involved in the mechanisms underlying perceptual abnormalities. However, findings in…

  7. Real-Time Control of an Articulatory-Based Speech Synthesizer for Brain Computer Interfaces.

    Directory of Open Access Journals (Sweden)

    Florent Bocquelet

    2016-11-01

    Full Text Available Restoring natural speech in paralyzed and aphasic people could be achieved using a Brain-Computer Interface (BCI) controlling a speech synthesizer in real-time. To reach this goal, a prerequisite is to develop a speech synthesizer producing intelligible speech in real-time with a reasonable number of control parameters. We present here an articulatory-based speech synthesizer that can be controlled in real-time for future BCI applications. This synthesizer converts movements of the main speech articulators (tongue, jaw, velum, and lips) into intelligible speech. The articulatory-to-acoustic mapping is performed using a deep neural network (DNN) trained on electromagnetic articulography (EMA) data recorded on a reference speaker synchronously with the produced speech signal. This DNN is then used in both offline and online modes to map the position of sensors glued on different speech articulators into acoustic parameters that are further converted into an audio signal using a vocoder. In offline mode, highly intelligible speech could be obtained as assessed by perceptual evaluation performed by 12 listeners. Then, to anticipate future BCI applications, we further assessed the real-time control of the synthesizer by both the reference speaker and new speakers, in a closed-loop paradigm using EMA data recorded in real time. A short calibration period was used to compensate for differences in sensor positions and articulatory differences between new speakers and the reference speaker. We found that real-time synthesis of vowels and consonants was possible with good intelligibility. In conclusion, these results open the way to future speech BCI applications using such an articulatory-based speech synthesizer.
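
    The mapping stage described above lends itself to a compact sketch. The fragment below is a structural illustration only, not the authors' network: a small feedforward net with random, untrained weights maps one frame of EMA sensor coordinates to a vector of acoustic parameters. The layer sizes, the 12-dimensional input (6 sensors x 2 coordinates), the 25 output parameters, and the omitted vocoder stage are all assumptions.

        import numpy as np

        rng = np.random.default_rng(0)

        def layer(n_in, n_out):
            # He-style random initialization; a real system would learn these from EMA data
            return rng.normal(0.0, np.sqrt(2.0 / n_in), (n_in, n_out)), np.zeros(n_out)

        W1, b1 = layer(12, 128)    # 12 inputs: assumed 6 EMA sensors x (x, y)
        W2, b2 = layer(128, 128)
        W3, b3 = layer(128, 25)    # 25 outputs: assumed vocoder spectral parameters

        def articulatory_to_acoustic(ema_frame):
            # Map one frame of articulator positions to acoustic parameters
            h = np.maximum(0.0, ema_frame @ W1 + b1)   # ReLU hidden layer
            h = np.maximum(0.0, h @ W2 + b2)           # second hidden layer
            return h @ W3 + b3                         # linear output layer

        frame = rng.normal(size=12)                    # stand-in for tongue/jaw/velum/lip sensors
        print(articulatory_to_acoustic(frame).shape)   # (25,), fed frame-by-frame to a vocoder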

  8. Newborn infants' sensitivity to perceptual cues to lexical and grammatical words.

    Science.gov (United States)

    Shi, R; Werker, J F; Morgan, J L

    1999-09-30

    In our study newborn infants were presented with lists of lexical and grammatical words prepared from natural maternal speech. The results show that newborns are able to categorically discriminate these sets of words based on a constellation of perceptual cues that distinguish them. This general ability to detect and categorically discriminate sets of words on the basis of multiple acoustic and phonological cues may provide a perceptual base that can help older infants bootstrap into the acquisition of grammatical categories and syntactic structure.

  9. Auditory perceptual learning in adults with and without age-related hearing loss

    Directory of Open Access Journals (Sweden)

    Hanin Karawani

    2016-02-01

    Full Text Available Introduction: Speech recognition in adverse listening conditions becomes more difficult as we age, particularly for individuals with age-related hearing loss (ARHL). Whether these difficulties can be eased with training remains debated, because it is not clear whether the outcomes are sufficiently general to be of use outside of the training context. The aim of the current study was to compare training-induced learning and generalization between normal-hearing older adults and those with ARHL. Methods: 56 listeners (60-72 y/o; 35 participants with ARHL and 21 normal-hearing adults) participated in the study. The study design was a crossover design with three groups (immediate-training, delayed-training, and no-training). Trained participants received 13 sessions of home-based auditory training over the course of 4 weeks. Three adverse listening conditions were targeted: (1) speech-in-noise, (2) time-compressed speech, and (3) competing speakers, and the outcomes of training were compared between normal and ARHL groups. Pre- and post-test sessions were completed by all participants. Outcome measures included tests on all of the trained conditions as well as on a series of untrained conditions designed to assess the transfer of learning to other speech and non-speech conditions. Results: Significant improvements on all trained conditions were observed in both ARHL and normal-hearing groups over the course of training. Normal-hearing participants learned more than participants with ARHL in the speech-in-noise condition, but showed similar patterns of learning in the other conditions. Greater pre- to post-test changes were observed in trained than in untrained listeners on all trained conditions. In addition, the ability of trained listeners from the ARHL group to discriminate minimally different pseudowords in noise also improved with training. Conclusions: ARHL did not preclude auditory perceptual learning but there was little generalization to…

  10. Imitation of contrastive lexical stress in children with speech delay

    Science.gov (United States)

    Vick, Jennell C.; Moore, Christopher A.

    2005-09-01

    This study examined the relationship between acoustic correlates of stress in trochaic (strong-weak), spondaic (strong-strong), and iambic (weak-strong) nonword bisyllables produced by children (30-50 months) with normal speech acquisition and children with speech delay. Ratios comparing the acoustic measures (vowel duration, rms, and f0) of the first syllable to the second syllable were calculated to evaluate the extent to which each phonetic parameter was used to mark stress. In addition, a calculation of the variability of jaw movement in each bisyllable was made. Finally, perceptual judgments of accuracy of stress production were made. Analysis of perceptual judgments indicated a robust difference between groups: While both groups of children produced errors in imitating the contrastive lexical stress models (~40%), the children with normal speech acquisition tended to produce trochaic forms in substitution for other stress types, whereas children with speech delay showed no preference for trochees. The relationship between segmental acoustic parameters, kinematic variability, and the ratings of stress by trained listeners will be presented.
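
    The ratio measures described above are simple to compute once the two syllables of a bisyllable have been segmented. The sketch below is a rough illustration under that assumption, not the study's procedure; in particular, the f0 tracker is a bare autocorrelation peak-pick and the example signals are synthetic.

        import numpy as np

        def rms(x):
            return np.sqrt(np.mean(x ** 2))

        def f0_autocorr(x, fs, fmin=75.0, fmax=400.0):
            # Crude f0 estimate: strongest autocorrelation peak in a plausible pitch-lag range
            x = x - x.mean()
            ac = np.correlate(x, x, mode='full')[len(x) - 1:]
            lo, hi = int(fs / fmax), int(fs / fmin)
            return fs / (lo + np.argmax(ac[lo:hi]))

        def stress_ratios(syl1, syl2, fs):
            # Syllable 1 : syllable 2 ratios for the three acoustic stress correlates
            return {
                'duration': len(syl1) / len(syl2),
                'rms': rms(syl1) / rms(syl2),
                'f0': f0_autocorr(syl1, fs) / f0_autocorr(syl2, fs),
            }

        # Invented example: two vowel-like tones, 200 ms vs 120 ms, at 16 kHz
        fs = 16000
        t1 = np.arange(int(0.20 * fs)) / fs
        t2 = np.arange(int(0.12 * fs)) / fs
        syl1 = 0.8 * np.sin(2 * np.pi * 220 * t1)
        syl2 = 0.5 * np.sin(2 * np.pi * 180 * t2)
        print(stress_ratios(syl1, syl2, fs))  # ratios > 1 pattern with strong-weak (trochaic) stress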

  11. A Motor Speech Assessment for Children with Severe Speech Disorders: Reliability and Validity Evidence

    Science.gov (United States)

    Strand, Edythe A.; McCauley, Rebecca J.; Weigand, Stephen D.; Stoeckel, Ruth E.; Baas, Becky S.

    2013-01-01

    Purpose: In this article, the authors report reliability and validity evidence for the Dynamic Evaluation of Motor Speech Skill (DEMSS), a new test that uses dynamic assessment to aid in the differential diagnosis of childhood apraxia of speech (CAS). Method: Participants were 81 children between 36 and 79 months of age who were referred to the…

  12. Perceptual assessment of fricative--stop coarticulation.

    Science.gov (United States)

    Repp, B H; Mann, V A

    1981-04-01

    The perceptual dependence of stop consonants on preceding fricatives [Mann and Repp, J. Acoust. Soc. Am. 69, 548-558 (1981)] was further investigated in two experiments employing both natural and synthetic speech. These experiments consistently replicated our original finding that listeners report more velar stops following [s] than following [ʃ]. In addition, our data confirmed earlier reports that natural fricative noises (excerpted from utterances of [stɑ], [skɑ], [ʃkɑ]) contain cues to the following stop consonants; this was revealed in subjects' identifications of stops from isolated fricative noises and from stimuli consisting of these noises followed by synthetic CV portions drawn from a [tɑ]-[kɑ] continuum. However, these cues in the noise portion could not account for the contextual effect of fricative identity ([ʃ] versus [s]) on stop perception (more "k" responses following [s]). Rather, this effect seems to be related to a coarticulatory influence of a preceding fricative on stop production: subjects' responses to excised natural CV portions (with bursts and aspiration removed) were biased towards a relatively more forward place of stop articulation when the CVs had originally been preceded by [s], and the identification of a preceding ambiguous fricative was biased in the direction of the original fricative context in which a given CV portion had been produced. These findings support an articulatory explanation for the effect of preceding fricatives on stop consonant perception.

  13. Status report on speech research. A report on the status and progress of studies on the nature of speech, instrumentation for its investigation, and practical applications

    Science.gov (United States)

    Liberman, A. M.

    1985-10-01

    This interim status report on speech research discusses the following topics: On Vagueness and Fictions as Cornerstones of a Theory of Perceiving and Acting: A Comment on Walter (1983); The Informational Support for Upright Stance; Determining the Extent of Coarticulation: Effects of Experimental Design; The Roles of Phoneme Frequency, Similarity, and Availability in the Experimental Elicitation of Speech Errors; On Learning to Speak; The Motor Theory of Speech Perception Revised; Linguistic and Acoustic Correlates of the Perceptual Structure Found in an Individual Differences Scaling Study of Vowels; Perceptual Coherence of Speech: Stability of Silence-cued Stop Consonants; Development of the Speech Perceptuomotor System; Dependence of Reading on Orthography: Investigations in Serbo-Croatian; The Relationship between Knowledge of Derivational Morphology and Spelling Ability in Fourth, Sixth, and Eighth Graders; Relations among Regular and Irregular, Morphologically-Related Words in the Lexicon as Revealed by Repetition Priming; Grammatical Priming of Inflected Nouns by the Gender of Possessive Adjectives; Grammatical Priming of Inflected Nouns by Inflected Adjectives; Deaf Signers and Serial Recall in the Visual Modality: Memory for Signs, Fingerspelling, and Print; Did Orthographies Evolve?; The Development of Children's Sensitivity to Factors Influencing Vowel Reading.

  14. The organization and reorganization of audiovisual speech perception in the first year of life.

    Science.gov (United States)

    Danielson, D Kyle; Bruderer, Alison G; Kandhadai, Padmapriya; Vatikiotis-Bateson, Eric; Werker, Janet F

    2017-04-01

    The period between six and 12 months is a sensitive period for language learning during which infants undergo auditory perceptual attunement, and recent results indicate that this sensitive period may exist across sensory modalities. We tested infants at three stages of perceptual attunement (six, nine, and 11 months) to determine 1) whether they were sensitive to the congruence between heard and seen speech stimuli in an unfamiliar language, and 2) whether familiarization with congruent audiovisual speech could boost subsequent non-native auditory discrimination. Infants at six and nine months, but not 11 months, detected audiovisual congruence of non-native syllables. Familiarization to incongruent, but not congruent, audiovisual speech changed auditory discrimination at test for six-month-olds but not nine- or 11-month-olds. These results advance the proposal that speech perception is audiovisual from early in ontogeny, and that the sensitive period for audiovisual speech perception may last somewhat longer than that for auditory perception alone.

  15. Evaluation of speech and language assessment approaches with bilingual children.

    Science.gov (United States)

    De Lamo White, Caroline; Jin, Lixian

    2011-01-01

    British society is multicultural and multilingual; thus, for many children English is not their main or only language. Speech and language therapists are required to assess accurately the speech and language skills of bilingual children if they are suspected of having a disorder. Cultural and linguistic diversity means that a more complex assessment procedure is needed, and research suggests that bilingual children are at risk of misdiagnosis. Clinicians have identified a lack of suitable assessment instruments for use with this client group. This paper highlights the challenges of assessing bilingual children and reviews available speech and language assessment procedures and approaches for use with this client group. It evaluates different approaches for assessing bilingual children to identify approaches that may be more appropriate for carrying out assessments effectively. This review discusses and evaluates the efficacy of norm-referenced standardized measures, criterion-referenced measures, language-processing measures, dynamic assessment and a sociocultural approach. When all named procedures and approaches are compared, the sociocultural approach appears to hold the most promise for accurate assessment of bilingual children. Research suggests that language-processing measures are not effective indicators for identifying speech and language disorders in bilingual children, but further research is warranted. The sociocultural approach encompasses some of the other approaches discussed, including norm-referenced measures, criterion-referenced measures and dynamic assessment. The sociocultural approach enables the clinician to interpret results in the light of the child's linguistic and cultural background. In addition, combining approaches mitigates the weaknesses inherent in each approach. © 2011 Royal College of Speech and Language Therapists.

  16. Differences in Speech Recognition Between Children with Attention Deficits and Typically Developed Children Disappear When Exposed to 65 dB of Auditory Noise.

    Science.gov (United States)

    Söderlund, Göran B W; Jobs, Elisabeth Nilsson

    2016-01-01

    The most common neuropsychiatric condition in children is attention deficit hyperactivity disorder (ADHD), affecting ∼6-9% of the population. ADHD is distinguished by inattention and hyperactive, impulsive behaviors as well as poor performance in various cognitive tasks, often leading to failures at school. Sensory and perceptual dysfunctions have also been noticed. Prior research has mainly focused on limitations in executive functioning, where differences are often explained by deficits in pre-frontal cortex activation. Less notice has been given to sensory perception and subcortical functioning in ADHD. Recent research has shown that children with an ADHD diagnosis have a deviant auditory brain stem response compared to healthy controls. The aim of the present study was to investigate whether the speech recognition threshold differs between attentive children and children with ADHD symptoms in two environmental sound conditions, with and without external noise. Previous research has shown that children with attention deficits can benefit from white noise exposure during cognitive tasks, and here we investigate whether this noise benefit is present during an auditory perceptual task. For this purpose we used a modified Hagerman's speech recognition test in which children with and without attention deficits performed a binaural speech recognition task to assess the speech recognition threshold in no-noise and noise conditions (65 dB). Results showed that the inattentive group displayed a higher speech recognition threshold than typically developed children and that the difference in speech recognition threshold disappeared when exposed to noise at supra-threshold level. From this we conclude that inattention can partly be explained by sensory perceptual limitations that can possibly be ameliorated through noise exposure.

  17. Differences in Speech Recognition Between Children with Attention Deficits and Typically Developed Children Disappear when Exposed to 65 dB of Auditory Noise

    Directory of Open Access Journals (Sweden)

    Göran B W Söderlund

    2016-01-01

    Full Text Available The most common neuropsychiatric condition in children is attention deficit hyperactivity disorder (ADHD), affecting approximately 6-9% of the population. ADHD is distinguished by inattention and hyperactive, impulsive behaviors as well as poor performance in various cognitive tasks, often leading to failures at school. Sensory and perceptual dysfunctions have also been noticed. Prior research has mainly focused on limitations in executive functioning, where differences are often explained by deficits in pre-frontal cortex activation. Less notice has been given to sensory perception and subcortical functioning in ADHD. Recent research has shown that children with an ADHD diagnosis have a deviant auditory brain stem response compared to healthy controls. The aim of the present study was to investigate whether the speech recognition threshold differs between attentive children and children with ADHD symptoms in two environmental sound conditions, with and without external noise. Previous research has shown that children with attention deficits can benefit from white noise exposure during cognitive tasks, and here we investigate whether this noise benefit is present during an auditory perceptual task. For this purpose we used a modified Hagerman's speech recognition test in which children with and without attention deficits performed a binaural speech recognition task to assess the speech recognition threshold in no-noise and noise conditions (65 dB). Results showed that the inattentive group displayed a higher speech recognition threshold than typically developed children (TDC) and that the difference in speech recognition threshold disappeared when exposed to noise at supra-threshold level. From this we conclude that inattention can partly be explained by sensory perceptual limitations that can possibly be ameliorated through noise exposure.

  18. Statistical Learning, Syllable Processing, and Speech Production in Healthy Hearing and Hearing-Impaired Preschool Children: A Mismatch Negativity Study.

    Science.gov (United States)

    Studer-Eichenberger, Esther; Studer-Eichenberger, Felix; Koenig, Thomas

    2016-01-01

    The objectives of the present study were to investigate temporal/spectral sound-feature processing in preschool children (4 to 7 years old) with peripheral hearing loss compared with age-matched controls. The results verified the presence of statistical learning, which was diminished in children with hearing impairments (HIs), and elucidated possible perceptual mediators of speech production. Perception and production of the syllables /ba/, /da/, /ta/, and /na/ were recorded in 13 children with normal hearing and 13 children with HI. Perception was assessed physiologically through event-related potentials (ERPs) recorded by EEG in a multifeature mismatch negativity paradigm and behaviorally through a discrimination task. Temporal and spectral features of the ERPs during speech perception were analyzed, and speech production was quantitatively evaluated using speech motor maximum performance tasks. Proximal to stimulus onset, children with HI displayed a difference in map topography, indicating diminished statistical learning. In later ERP components, children with HI exhibited reduced amplitudes in the N2 and early parts of the late discriminative negativity components specifically, which are associated with temporal and spectral control mechanisms. Abnormalities of speech perception were only subtly reflected in speech production, as the lone difference found in speech production studies was a mild delay in regulating speech intensity. In addition to previously reported deficits of sound-feature discriminations, the present study results reflect diminished statistical learning in children with HI, which plays an early and important, but so far neglected, role in phonological processing. Furthermore, the lack of corresponding behavioral abnormalities in speech production implies that impaired perceptual capacities do not necessarily translate into productive deficits.

  19. Temporal modulations in speech and music.

    Science.gov (United States)

    Ding, Nai; Patel, Aniruddh D; Chen, Lin; Butler, Henry; Luo, Cheng; Poeppel, David

    2017-10-01

    Speech and music have structured rhythms. Here we discuss a major acoustic correlate of spoken and musical rhythms, the slow (0.25-32 Hz) temporal modulations in sound intensity, and compare the modulation properties of speech and music. We analyze these modulations using over 25 h of speech and over 39 h of recordings of Western music. We show that the speech modulation spectrum is highly consistent across 9 languages (including languages with typologically different rhythmic characteristics). A different, but similarly consistent, modulation spectrum is observed for music, including classical music played by single instruments of different types, symphonic, jazz, and rock. The temporal modulations of speech and music show broad but well-separated peaks around 5 and 2 Hz, respectively. These acoustically dominant time scales may be intrinsic features of speech and music, a possibility which should be investigated using more culturally diverse samples in each domain. Distinct modulation timescales for speech and music could facilitate their perceptual analysis and its neural processing. Copyright © 2017 Elsevier Ltd. All rights reserved.
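
    The modulation spectrum discussed above can be approximated directly from a recording: extract the signal's amplitude envelope, then examine the envelope's power spectrum in the 0.25-32 Hz range. The sketch below is a single-band simplification assuming scipy is available; the published analysis averages over many recordings and uses a cochlear-style filterbank rather than one broadband envelope.

        import numpy as np
        from scipy.signal import hilbert, welch

        def modulation_spectrum(x, fs, env_fs=100):
            # Power spectrum of the amplitude envelope (slow intensity modulations)
            env = np.abs(hilbert(x))            # envelope via the analytic signal
            env = env[::int(fs / env_fs)]       # crude downsampling to env_fs Hz
            f, pxx = welch(env - env.mean(), fs=env_fs, nperseg=min(len(env), 1024))
            keep = (f >= 0.25) & (f <= 32.0)    # rhythm range of interest
            return f[keep], pxx[keep]

        # Invented example: noise carrier modulated at 5 Hz, a speech-like rate
        fs = 16000
        t = np.arange(10 * fs) / fs
        carrier = np.random.default_rng(0).normal(size=t.size)
        x = (1.0 + 0.8 * np.sin(2 * np.pi * 5.0 * t)) * carrier
        f, pxx = modulation_spectrum(x, fs)
        print(f[np.argmax(pxx)])                # expected to peak near 5 Hz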

  20. Transfer Effect of Speech-sound Learning on Auditory-motor Processing of Perceived Vocal Pitch Errors.

    Science.gov (United States)

    Chen, Zhaocong; Wong, Francis C K; Jones, Jeffery A; Li, Weifeng; Liu, Peng; Chen, Xi; Liu, Hanjun

    2015-08-17

    Speech perception and production are intimately linked. There is evidence that speech motor learning results in changes to auditory processing of speech. Whether speech motor control benefits from perceptual learning in speech, however, remains unclear. This event-related potential study investigated whether speech-sound learning can modulate the processing of feedback errors during vocal pitch regulation. Mandarin speakers were trained to perceive five Thai lexical tones while learning to associate pictures with spoken words over 5 days. Before and after training, participants produced sustained vowel sounds while they heard their vocal pitch feedback unexpectedly perturbed. As compared to the pre-training session, the magnitude of vocal compensation significantly decreased for the control group, but remained consistent for the trained group at the post-training session. However, the trained group had smaller and faster N1 responses to pitch perturbations and exhibited enhanced P2 responses that correlated significantly with their learning performance. These findings indicate that the cortical processing of vocal pitch regulation can be shaped by learning new speech-sound associations, suggesting that perceptual learning in speech can transfer to and facilitate the neural mechanisms underlying the online monitoring of auditory feedback regarding vocal production.

  1. The Brain Dynamics of Rapid Perceptual Adaptation to Adverse Listening Conditions

    NARCIS (Netherlands)

    Erb, J.; Henry, M.J.; Eisner, F.; Obleser, J.

    2013-01-01

    Listeners show a remarkable ability to quickly adjust to degraded speech input. Here, we aimed to identify the neural mechanisms of such short-term perceptual adaptation. In a sparse-sampling, cardiac-gated functional magnetic resonance imaging (fMRI) acquisition, human listeners heard and repeated…

  2. Performance Assessment of Dynaspeak Speech Recognition System on Inflight Databases

    National Research Council Canada - National Science Library

    Barry, Timothy

    2004-01-01

    … To aid in the assessment of various commercially available speech recognition systems, several aircraft speech databases have been developed at the Air Force Research Laboratory's Human Effectiveness Directorate…

  3. Do long-term tongue piercings affect speech quality?

    Science.gov (United States)

    Heinen, Esther; Birkholz, Peter; Willmes, Klaus; Neuschaefer-Rube, Christiane

    2017-10-01

    To explore possible effects of tongue piercing on perceived speech quality. Using a quasi-experimental design, we analyzed the effect of tongue piercing on speech in a perception experiment. Samples of spontaneous speech and read speech were recorded from 20 long-term pierced and 20 non-pierced individuals (10 males, 10 females each). The individuals having a tongue piercing were recorded with attached and removed piercing. The audio samples were blindly rated by 26 female and 20 male laypersons and by 5 female speech-language pathologists with regard to perceived speech quality along 5 dimensions: speech clarity, speech rate, prosody, rhythm and fluency. We found no statistically significant differences for any of the speech quality dimensions between the pierced and non-pierced individuals, neither for the read nor for the spontaneous speech. In addition, neither length nor position of piercing had a significant effect on speech quality. The removal of tongue piercings had no effects on speech performance either. Rating differences between laypersons and speech-language pathologists were not dependent on the presence of a tongue piercing. People are able to perfectly adapt their articulation to long-term tongue piercings such that their speech quality is not perceptually affected.

  4. Neural networks supporting audiovisual integration for speech: A large-scale lesion study.

    Science.gov (United States)

    Hickok, Gregory; Rogalsky, Corianne; Matchin, William; Basilakos, Alexandra; Cai, Julia; Pillay, Sara; Ferrill, Michelle; Mickelsen, Soren; Anderson, Steven W; Love, Tracy; Binder, Jeffrey; Fridriksson, Julius

    2018-06-01

    Auditory and visual speech information are often strongly integrated, resulting in perceptual enhancements for audiovisual (AV) speech over audio alone and sometimes yielding compelling illusory fusion percepts when AV cues are mismatched, the McGurk-MacDonald effect. Previous research has identified three candidate regions thought to be critical for AV speech integration: the posterior superior temporal sulcus (STS), early auditory cortex, and the posterior inferior frontal gyrus. We assess the causal involvement of these regions (and others) in the first large-scale (N = 100) lesion-based study of AV speech integration. Two primary findings emerged. First, behavioral performance and lesion maps for AV enhancement and illusory fusion measures indicate that classic metrics of AV speech integration are not necessarily measuring the same process. Second, lesions involving superior temporal auditory, lateral occipital visual, and multisensory zones in the STS are the most disruptive to AV speech integration. Further, when AV speech integration fails, the nature of the failure (auditory vs visual capture) can be predicted from the location of the lesions. These findings show that AV speech processing is supported by unimodal auditory and visual cortices as well as multimodal regions such as the STS at their boundary. Motor-related frontal regions do not appear to play a role in AV speech integration. Copyright © 2018 Elsevier Ltd. All rights reserved.

  5. Influences on infant speech processing: toward a new synthesis.

    Science.gov (United States)

    Werker, J F; Tees, R C

    1999-01-01

    To comprehend and produce language, we must be able to recognize the sound patterns of our language and the rules for how these sounds "map on" to meaning. Human infants are born with a remarkable array of perceptual sensitivities that allow them to detect the basic properties that are common to the world's languages. During the first year of life, these sensitivities undergo modification reflecting an exquisite tuning to just that phonological information that is needed to map sound to meaning in the native language. We review this transition from language-general to language-specific perceptual sensitivity that occurs during the first year of life and consider whether the changes propel the child into word learning. To account for the broad-based initial sensitivities and subsequent reorganizations, we offer an integrated transactional framework based on the notion of a specialized perceptual-motor system that has evolved to serve human speech, but which functions in concert with other developing abilities. In so doing, we highlight the links between infant speech perception, babbling, and word learning.

  6. Dissociating Cortical Activity during Processing of Native and Non-Native Audiovisual Speech from Early to Late Infancy

    Directory of Open Access Journals (Sweden)

    Eswen Fava

    2014-08-01

    Full Text Available Initially, infants are capable of discriminating phonetic contrasts across the world's languages. Starting between seven and ten months of age, they gradually lose this ability through a process of perceptual narrowing. Although traditionally investigated with isolated speech sounds, such narrowing occurs in a variety of perceptual domains (e.g., faces, visual speech). Thus far, tracking the developmental trajectory of this tuning process has been focused primarily on auditory speech alone, and generally using isolated sounds. But infants learn from speech produced by people talking to them, meaning they learn from a complex audiovisual signal. Here, we use near-infrared spectroscopy to measure blood concentration changes in the bilateral temporal cortices of infants in three different age groups: 3-to-6 months, 7-to-10 months, and 11-to-14 months. Critically, all three groups of infants were tested with continuous audiovisual speech in both their native and another, unfamiliar language. We found that at each age range, infants showed different patterns of cortical activity in response to the native and non-native stimuli. Infants in the youngest group showed bilateral cortical activity that was greater overall in response to non-native relative to native speech; the oldest group showed left-lateralized activity in response to native relative to non-native speech. These results highlight perceptual tuning as a dynamic process that happens across modalities and at different levels of stimulus complexity.

  7. Simulation of modified hybrid noise reduction algorithm to enhance the speech quality

    International Nuclear Information System (INIS)

    Waqas, A.; Muhammad, T.; Jamal, H.

    2013-01-01

    Speech is the most essential means of human communication. Mobile telephony, hearing aids, and hands-free systems are typical applications in this respect. The performance of these communication devices can be degraded by the distortions that corrupt their input. Two essential types of distortion may be recognized, namely convolutive and additive noises. These distortions contaminate the clean speech and make it unsatisfactory to human listeners, i.e. the perceptual quality and intelligibility of the speech signal diminish. The objective of speech enhancement systems is to improve the quality and intelligibility of speech to make it more acceptable to listeners. This paper proposes a modified hybrid approach for single-channel devices to process noisy signals, considering only the effect of background noises. It is a combination of a pre-processing relative spectral amplitude (RASTA) filter, approximated by a straightforward 4th-order band-pass filter, and the conventional minimum mean square error short-time spectral amplitude (MMSE-STSA85) estimator. To analyze the performance of the algorithm, an objective measure called Perceptual Evaluation of Speech Quality (PESQ) is computed. The results show that the modified algorithm performs well in removing background noises. A SIMULINK implementation is also performed and its profile report has been generated to observe the execution time. (author)
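
    A rough single-channel sketch of the two-stage idea is given below, assuming scipy is available. It applies a 4th-order Butterworth band-pass along the log-spectral time trajectories as the RASTA-style stage and substitutes a simple decision-directed Wiener gain for the full MMSE-STSA85 estimator; all band edges and constants are illustrative assumptions, and PESQ scoring would require a separate tool.

        import numpy as np
        from scipy.signal import stft, istft, butter, filtfilt

        def enhance(noisy, fs, noise_frames=10):
            f, t, Z = stft(noisy, fs=fs, nperseg=512)      # hop = 256 by default
            mag, phase = np.abs(Z), np.angle(Z)

            # RASTA-style stage: band-pass the log-magnitude trajectories over time
            frame_rate = fs / 256.0
            b, a = butter(4, [1.0, 12.0], btype='bandpass', fs=frame_rate)
            log_mag = np.log(mag + 1e-10)
            rasta = filtfilt(b, a, log_mag, axis=1)
            rasta += log_mag.mean(axis=1, keepdims=True)   # restore the level removed by band-passing
            mag = np.exp(rasta)

            # Noise power from the first frames, assumed to contain no speech
            noise_pow = (mag[:, :noise_frames] ** 2).mean(axis=1, keepdims=True)

            # Decision-directed a priori SNR with a Wiener gain (stand-in for MMSE-STSA85)
            alpha, gain = 0.98, np.ones_like(mag)
            prev_clean = mag[:, :1] ** 2
            for i in range(mag.shape[1]):
                post = np.maximum(mag[:, i:i + 1] ** 2 / noise_pow - 1.0, 0.0)
                prio = alpha * prev_clean / noise_pow + (1.0 - alpha) * post
                g = prio / (1.0 + prio)
                gain[:, i:i + 1] = g
                prev_clean = (g * mag[:, i:i + 1]) ** 2

            _, clean = istft(gain * mag * np.exp(1j * phase), fs=fs, nperseg=512)
            return clean

        # Invented usage: half a second of noise-only lead-in, then a 440 Hz tone in noise
        fs = 16000
        rng = np.random.default_rng(1)
        tone = np.sin(2 * np.pi * 440 * np.arange(3 * fs) / fs)
        sig = np.concatenate([np.zeros(fs // 2), tone])
        cleaned = enhance(sig + 0.5 * rng.normal(size=sig.size), fs)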

  8. Visuomotor Tracking Abilities of Speakers with Apraxia of Speech or Conduction Aphasia

    Science.gov (United States)

    Robin, Donald A.; Jacks, Adam; Hageman, Carlin; Clark, Heather M.; Woodworth, George

    2008-01-01

    This investigation examined the visuomotor tracking abilities of persons with apraxia of speech (AOS) or conduction aphasia (CA). In addition, tracking performance was correlated with perceptual judgments of speech accuracy. Five individuals with AOS and four with CA served as participants, as well as an equal number of healthy controls matched by…

  9. Inconsistency of speech in children with childhood apraxia of speech, phonological disorders, and typical speech

    Science.gov (United States)

    Iuzzini, Jenya

    There is a lack of agreement on the features used to differentiate Childhood Apraxia of Speech (CAS) from Phonological Disorders (PD). One criterion which has gained consensus is lexical inconsistency of speech (ASHA, 2007); however, no accepted measure of this feature has been defined. Although lexical assessment provides information about consistency of an item across repeated trials, it may not capture the magnitude of inconsistency within an item. In contrast, segmental analysis provides more extensive information about consistency of phoneme usage across multiple contexts and word-positions. The current research compared segmental and lexical inconsistency metrics in preschool-aged children with PD, CAS, and typical development (TD) to determine how inconsistency varies with age in typical and disordered speakers, and whether CAS and PD were differentiated equally well by both assessment levels. Whereas lexical and segmental analyses may be influenced by listener characteristics or speaker intelligibility, the acoustic signal is less vulnerable to these factors. In addition, the acoustic signal may reveal information which is not evident in the perceptual signal. A second focus of the current research was motivated by Blumstein et al.'s (1980) classic study on voice onset time (VOT) in adults with acquired apraxia of speech (AOS) which demonstrated a motor impairment underlying AOS. In the current study, VOT analyses were conducted to determine the relationship between age and group and the voicing distribution for bilabial and alveolar plosives. Findings revealed that 3-year-olds evidenced significantly higher inconsistency than 5-year-olds; segmental inconsistency approached 0% in 5-year-olds with TD, whereas it persisted in children with PD and CAS, suggesting that for children in this age range, inconsistency is a feature of speech disorder rather than typical development (Holm et al., 2007). Likewise, whereas segmental and lexical inconsistency were…
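
    Voice onset time itself has a simple operational definition: the interval between the stop burst and the onset of voicing. The sketch below is a deliberately crude illustration, not the study's measurement procedure; it marks the burst as the first sharp rise in frame energy and voicing as the first subsequent frame with strong autocorrelation periodicity, and both thresholds are invented.

        import numpy as np

        def measure_vot(x, fs, win_ms=20, hop_ms=5):
            # Crude VOT estimate in ms: burst onset to first strongly periodic frame
            win, hop = int(fs * win_ms / 1000), int(fs * hop_ms / 1000)
            frames = [x[s:s + win] for s in range(0, len(x) - win, hop)]
            energy = np.array([float(np.sum(f ** 2)) for f in frames])

            # Burst: first frame whose energy jumps well above the initial floor
            floor = energy[:3].mean() + 1e-12
            burst = int(np.argmax(energy > 10.0 * floor))

            def periodicity(frame):
                # Normalized autocorrelation peak in a plausible pitch-lag range (75-400 Hz)
                frame = frame - frame.mean()
                ac = np.correlate(frame, frame, mode='full')[len(frame) - 1:]
                if ac[0] <= 0:
                    return 0.0
                lo, hi = int(fs / 400), min(int(fs / 75), len(ac) - 1)
                return float(ac[lo:hi].max() / ac[0])

            for i in range(burst, len(frames)):
                if periodicity(frames[i]) > 0.5:   # invented "voiced" threshold
                    return (i - burst) * hop_ms
            return None                            # no voicing detected after the burst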

  10. Self-reflection and the inner voice: activation of the left inferior frontal gyrus during perceptual and conceptual self-referential thinking.

    Science.gov (United States)

    Morin, Alain; Hamper, Breanne

    2012-01-01

    Inner speech involvement in self-reflection was examined by reviewing 130 studies assessing brain activation during self-referential processing in key self-domains: agency, self-recognition, emotions, personality traits, autobiographical memory, and miscellaneous (e.g., prospection, judgments). The left inferior frontal gyrus (LIFG) has been shown to be reliably recruited during inner speech production. The percentage of studies reporting LIFG activity for each self-dimension was calculated. Fifty-five percent of all studies reviewed indicated LIFG (and presumably inner speech) activity during self-reflection tasks; on average, LIFG activation is observed 16% of the time during completion of non-self tasks (e.g., attention, perception). The highest LIFG activation rate was observed during retrieval of autobiographical information. The LIFG was significantly more recruited during conceptual tasks (e.g., prospection, traits) than during perceptual tasks (agency and self-recognition). This constitutes additional evidence supporting the idea of a participation of inner speech in self-related thinking.

  11. Connected digit speech recognition system for Malayalam language

    Indian Academy of Sciences (India)

    Connected digit speech recognition is important in many applications such as automated banking systems, catalogue dialing, automatic data entry, etc. This paper presents an optimum speaker-independent connected digit recognizer for Malayalam language. The system employs Perceptual ...

  12. Perceptual Plasticity for Auditory Object Recognition

    Science.gov (United States)

    Heald, Shannon L. M.; Van Hedger, Stephen C.; Nusbaum, Howard C.

    2017-01-01

    In our auditory environment, we rarely experience the exact acoustic waveform twice. This is especially true for communicative signals that have meaning for listeners. In speech and music, the acoustic signal changes as a function of the talker (or instrument), speaking (or playing) rate, and room acoustics, to name a few factors. Yet, despite this acoustic variability, we are able to recognize a sentence or melody as the same across various kinds of acoustic inputs and determine meaning based on listening goals, expectations, context, and experience. The recognition process relates acoustic signals to prior experience despite variability in signal-relevant and signal-irrelevant acoustic properties, some of which could be considered as “noise” in service of a recognition goal. However, some acoustic variability, if systematic, is lawful and can be exploited by listeners to aid in recognition. Perceivable changes in systematic variability can herald a need for listeners to reorganize perception and reorient their attention to more immediately signal-relevant cues. This view is not incorporated currently in many extant theories of auditory perception, which traditionally reduce psychological or neural representations of perceptual objects and the processes that act on them to static entities. While this reduction is likely done for the sake of empirical tractability, such a reduction may seriously distort the perceptual process to be modeled. We argue that perceptual representations, as well as the processes underlying perception, are dynamically determined by an interaction between the uncertainty of the auditory signal and constraints of context. This suggests that the process of auditory recognition is highly context-dependent in that the identity of a given auditory object may be intrinsically tied to its preceding context. To argue for the flexible neural and psychological updating of sound-to-meaning mappings across speech and music, we draw upon examples…

  13. Motor speech signature of behavioral variant frontotemporal dementia: Refining the phenotype.

    Science.gov (United States)

    Vogel, Adam P; Poole, Matthew L; Pemberton, Hugh; Caverlé, Marja W J; Boonstra, Frederique M C; Low, Essie; Darby, David; Brodtmann, Amy

    2017-08-22

    To provide a comprehensive description of motor speech function in behavioral variant frontotemporal dementia (bvFTD). Forty-eight individuals (24 bvFTD and 24 age- and sex-matched healthy controls) provided speech samples. These varied in complexity and thus cognitive demand. Their language was assessed using the Progressive Aphasia Language Scale and verbal fluency tasks. Speech was analyzed perceptually to describe the nature of deficits and acoustically to quantify differences between patients with bvFTD and healthy controls. Cortical thickness and subcortical volume derived from MRI scans were correlated with speech outcomes in patients with bvFTD. Speech of affected individuals was significantly different from that of healthy controls. The speech signature of patients with bvFTD is characterized by a reduced rate (75%) and accuracy (65%) on alternating syllable production tasks, and prosodic deficits including reduced speech rate (45%), prolonged intervals (54%), and use of short phrases (41%). Groups differed on acoustic measures derived from the reading, unprepared monologue, and diadochokinetic tasks but not the days of the week or sustained vowel tasks. Variability of silence length was associated with cortical thickness of the inferior frontal gyrus and insula and speech rate with the precentral gyrus. One in 8 patients presented with moderate speech timing deficits with a further two-thirds rated as mild or subclinical. Subtle but measurable deficits in prosody are common in bvFTD and should be considered during disease management. Language function correlated with speech timing measures derived from the unprepared monologue only. © 2017 American Academy of Neurology.

  14. Training the Brain to Weight Speech Cues Differently: A Study of Finnish Second-language Users of English

    Science.gov (United States)

    Ylinen, Sari; Uther, Maria; Latvala, Antti; Vepsalainen, Sara; Iverson, Paul; Akahane-Yamada, Reiko; Naatanen, Risto

    2010-01-01

    Foreign-language learning is a prime example of a task that entails perceptual learning. The correct comprehension of foreign-language speech requires the correct recognition of speech sounds. The most difficult speech-sound contrasts for foreign-language learners often are the ones that have multiple phonetic cues, especially if the cues are…

  15. Emergence of category-level sensitivities in non-native speech sound learning

    Directory of Open Access Journals (Sweden)

    Emily Myers

    2014-08-01

    Full Text Available Over the course of development, speech sounds that are contrastive in one’s native language tend to become perceived categorically: that is, listeners are unaware of variation within phonetic categories while showing excellent sensitivity to speech sounds that span linguistically meaningful phonetic category boundaries. The end stage of this developmental process is that the perceptual systems that handle acoustic-phonetic information show special tuning to native language contrasts, and as such, category-level information appears to be present at even fairly low levels of the neural processing stream. Research on adults acquiring non-native speech categories offers an avenue for investigating the interplay of category-level information and perceptual sensitivities to these sounds as speech categories emerge. In particular, one can observe the neural changes that unfold as listeners learn not only to perceive acoustic distinctions that mark non-native speech sound contrasts, but also to map these distinctions onto category-level representations. An emergent literature on the neural basis of novel and non-native speech sound learning offers new insight into this question. In this review, I will examine this literature in order to answer two key questions. First, where in the neural pathway does sensitivity to category-level phonetic information first emerge over the trajectory of speech sound learning? Second, how do frontal and temporal brain areas work in concert over the course of non-native speech sound learning? Finally, in the context of this literature I will describe a model of speech sound learning in which rapidly-adapting access to categorical information in the frontal lobes modulates the sensitivity of stable, slowly-adapting responses in the temporal lobes.

  16. Perceptual evaluation of corpus-based speech synthesis techniques in under-resourced environments

    CSIR Research Space (South Africa)

    Van Niekerk, DR

    2009-11-01

    Full Text Available With the increasing prominence and maturity of corpus-based techniques for speech synthesis, the process of system development has in some ways been simplified considerably. However, the dependence on sufficient amounts of relevant speech data...

  17. Effects of muscle tension dysphonia on tone phonation: acoustic and perceptual studies in Vietnamese female teachers.

    Science.gov (United States)

    Nguyen, Duong Duy; Kenny, Dianna T

    2009-07-01

    Muscle tension dysphonia (MTD) is a hyperfunctional voice disorder commonly seen in professional voice users. To date, published acoustic studies of this disorder have mainly focused on nontonal language speakers, and no publication has documented its impact on lexical tone characteristics. In this study, we examined whether and how this voice disorder affected, acoustically and perceptually, the characteristics of tones in Vietnamese teachers. Voice data were obtained from 42 Vietnamese female primary school teachers diagnosed with MTD and 30 vocally healthy teachers. Tonal data were analyzed using Computerized Speech Lab (CSL-4300B) and Speech Analyzer. Parameters analyzed included the two most important acoustic cues in Vietnamese tones, that is, tonal fundamental frequency (F0) and laryngealization. Tonal F0 was assessed using a factorial analysis of variance with group and career duration as independent variables. Tonal samples were also perceptually assessed by a panel of native speakers of the same dialect. The results showed that MTD lowered tonal F0 in high tones and tones with extensive fundamental frequency variation. There was also a significant main effect for career duration; in the MTD group, tonal F0 was lower in teachers with longer career duration. The teachers with MTD showed different patterns of laryngealization compared with the control group. Tone perception was poorer for tones with extensive fundamental frequency variation and without a typical phonation type. The results in this group of teachers supported our hypothesis that MTD impairs lexical tone phonation.

  18. Atypical lateralization of ERP response to native and non-native speech in infants at risk for autism spectrum disorder.

    Science.gov (United States)

    Seery, Anne M; Vogel-Farley, Vanessa; Tager-Flusberg, Helen; Nelson, Charles A

    2013-07-01

    Language impairment is common in autism spectrum disorders (ASD) and is often accompanied by atypical neural lateralization. However, it is unclear when in development language impairment or atypical lateralization first emerges. To address these questions, we recorded event-related-potentials (ERPs) to native and non-native speech contrasts longitudinally in infants at risk for ASD (HRA) over the first year of life to determine whether atypical lateralization is present as an endophenotype early in development and whether these infants show delay in a very basic precursor of language acquisition: phonemic perceptual narrowing. ERP response for the HRA group to a non-native speech contrast revealed a trajectory of perceptual narrowing similar to a group of low-risk controls (LRC), suggesting that phonemic perceptual narrowing does not appear to be delayed in these high-risk infants. In contrast there were significant group differences in the development of lateralized ERP response to speech: between 6 and 12 months the LRC group displayed a lateralized response to the speech sounds, while the HRA group failed to display this pattern. We suggest the possibility that atypical lateralization to speech may be an ASD endophenotype over the first year of life. Copyright © 2012 Elsevier Ltd. All rights reserved.

  19. Developmental changes in brain activation involved in the production of novel speech sounds in children.

    Science.gov (United States)

    Hashizume, Hiroshi; Taki, Yasuyuki; Sassa, Yuko; Thyreau, Benjamin; Asano, Michiko; Asano, Kohei; Takeuchi, Hikaru; Nouchi, Rui; Kotozaki, Yuka; Jeong, Hyeonjeong; Sugiura, Motoaki; Kawashima, Ryuta

    2014-08-01

    Older children are more successful at producing unfamiliar, non-native speech sounds than younger children during the initial stages of learning. To reveal the neuronal underpinning of the age-related increase in the accuracy of non-native speech production, we examined the developmental changes in activation involved in the production of novel speech sounds using functional magnetic resonance imaging. Healthy right-handed children (aged 6-18 years) were scanned while performing an overt repetition task and a perceptual task involving aurally presented non-native and native syllables. Productions of non-native speech sounds were recorded and evaluated by native speakers. The mouth regions in the bilateral primary sensorimotor areas were activated more significantly during the repetition task relative to the perceptual task. The hemodynamic response in the left inferior frontal gyrus pars opercularis (IFG pOp) specific to non-native speech sound production (defined by prior hypothesis) increased with age. Additionally, the accuracy of non-native speech sound production increased with age. These results provide the first evidence of developmental changes in the neural processes underlying the production of novel speech sounds. Our data further suggest that the recruitment of the left IFG pOp during the production of novel speech sounds was possibly enhanced due to the maturation of the neuronal circuits needed for speech motor planning. This, in turn, would lead to improvement in the ability to immediately imitate non-native speech. Copyright © 2014 Wiley Periodicals, Inc.

  20. An evaluation of the effectiveness of PROMPT therapy in improving speech production accuracy in six children with cerebral palsy.

    Science.gov (United States)

    Ward, Roslyn; Leitão, Suze; Strauss, Geoff

    2014-08-01

This study evaluates perceptual changes in speech production accuracy in six children (3-11 years) with moderate-to-severe speech impairment associated with cerebral palsy before, during, and after participation in a motor-speech intervention program (Prompts for Restructuring Oral Muscular Phonetic Targets). An A1BCA2 single-subject research design was implemented. Subsequent to the baseline phase (phase A1), phase B targeted each participant's first intervention priority on the PROMPT motor-speech hierarchy. Phase C then targeted one level higher. Weekly speech probes were administered, containing trained and untrained words at the two levels of intervention, plus an additional level that served as a control goal. The speech probes were analysed for motor-speech movement parameters and perceptual accuracy. Analysis of the speech probe data showed that all participants recorded a statistically significant change. Between phases A1-B and B-C, 6/6 and 4/6 participants, respectively, recorded a statistically significant increase in performance level on the motor-speech movement patterns targeted during that phase of intervention. The preliminary data presented in this study contribute evidence supporting the use of a treatment approach aligned with dynamic systems theory to improve motor-speech movement patterns and speech production accuracy in children with cerebral palsy.

  1. Can mergers-in-progress be unmerged in speech accommodation?

    Science.gov (United States)

    Babel, Molly; McAuliffe, Michael; Haber, Graham

    2013-01-01

This study examines spontaneous phonetic accommodation of a dialect with distinct categories by speakers who are in the process of merging those categories. We focus on the merger of the NEAR and SQUARE lexical sets in New Zealand English, presenting New Zealand participants with an unmerged speaker of Australian English. Mergers-in-progress are a uniquely interesting sound change as they showcase the asymmetry between speech perception and production. Yet, we examine mergers using spontaneous phonetic imitation, a phenomenon that is necessarily a behavior in which perceptual input influences speech production. Phonetic imitation is quantified by a perceptual measure and an acoustic calculation of mergedness using a Pillai-Bartlett trace. The results from both analyses indicate that spontaneous phonetic imitation is moderated by extra-linguistic factors such as the valence of assigned conditions and social bias. We also find evidence for a decrease in the degree of mergedness in post-exposure productions. Taken together, our results suggest that under the appropriate conditions New Zealanders phonetically accommodate to Australian English and that in the process of speech imitation, mergers-in-progress can, but do not consistently, become less merged.
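
    The Pillai-Bartlett trace used above as an acoustic mergedness score can be computed from per-token formant measurements with a one-way MANOVA. The following is a minimal sketch, assuming hypothetical F1/F2 values and the statsmodels MANOVA interface (an assumed tool choice; the authors' actual toolchain is not specified): a score near 0 indicates heavily overlapping (merged) vowel categories, while a score near 1 indicates well-separated ones.

        # Sketch: Pillai-Bartlett trace as a mergedness score for one speaker.
        # F1/F2 values and column names are hypothetical.
        import pandas as pd
        from statsmodels.multivariate.manova import MANOVA

        tokens = pd.DataFrame({
            "F1": [420, 450, 610, 640, 430, 620],        # Hz, per vowel token
            "F2": [2100, 2050, 1900, 1850, 2080, 1880],  # Hz, per vowel token
            "lexical_set": ["NEAR", "NEAR", "SQUARE", "SQUARE", "NEAR", "SQUARE"],
        })

        fit = MANOVA.from_formula("F1 + F2 ~ lexical_set", data=tokens)
        stats = fit.mv_test().results["lexical_set"]["stat"]
        print(stats.loc["Pillai's trace", "Value"])  # near 0: merged; near 1: distinct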

  2. Can mergers-in-progress be unmerged in speech accommodation?

    Directory of Open Access Journals (Sweden)

    Molly eBabel

    2013-09-01

Full Text Available This study examines spontaneous phonetic accommodation of a dialect with distinct categories by speakers who are in the process of merging those categories. We focus on the merger of the NEAR and SQUARE lexical sets in New Zealand English, presenting New Zealand participants with an unmerged speaker of Australian English. Mergers-in-progress are a uniquely interesting sound change as they showcase the asymmetry between speech perception and production. Yet, we examine mergers using spontaneous phonetic imitation, a phenomenon that is necessarily a behavior in which perceptual input influences speech production. Phonetic imitation is quantified by a perceptual measure and an acoustic calculation of mergedness using a Pillai-Bartlett trace. The results from both analyses indicate that spontaneous phonetic imitation is moderated by extra-linguistic factors such as the valence of assigned conditions and social bias. We also find evidence for a decrease in the degree of mergedness in post-exposure productions. Taken together, our results suggest that under the appropriate conditions New Zealanders phonetically accommodate to Australian English and that in the process of speech imitation, mergers-in-progress can, but do not consistently, become less merged.

  3. Tutorial: Speech Assessment for Multilingual Children Who Do Not Speak the Same Language(s) as the Speech-Language Pathologist.

    Science.gov (United States)

    McLeod, Sharynne; Verdon, Sarah

    2017-08-15

    The aim of this tutorial is to support speech-language pathologists (SLPs) undertaking assessments of multilingual children with suspected speech sound disorders, particularly children who speak languages that are not shared with their SLP. The tutorial was written by the International Expert Panel on Multilingual Children's Speech, which comprises 46 researchers (SLPs, linguists, phoneticians, and speech scientists) who have worked in 43 countries and used 27 languages in professional practice. Seventeen panel members met for a 1-day workshop to identify key points for inclusion in the tutorial, 26 panel members contributed to writing this tutorial, and 34 members contributed to revising this tutorial online (some members contributed to more than 1 task). This tutorial draws on international research evidence and professional expertise to provide a comprehensive overview of working with multilingual children with suspected speech sound disorders. This overview addresses referral, case history, assessment, analysis, diagnosis, and goal setting and the SLP's cultural competence and preparation for working with interpreters and multicultural support workers and dealing with organizational and government barriers to and facilitators of culturally competent practice. The issues raised in this tutorial are applied in a hypothetical case study of an English-speaking SLP's assessment of a multilingual Cantonese- and English-speaking 4-year-old boy. Resources are listed throughout the tutorial.

  4. Evaluation of speech errors in Putonghua speakers with cleft palate: a critical review of methodology issues.

    Science.gov (United States)

    Jiang, Chenghui; Whitehill, Tara L

    2014-04-01

Speech errors associated with cleft palate are well established for English and several other Indo-European languages. Few articles describing the speech of Putonghua (standard Mandarin Chinese) speakers with cleft palate have been published in English language journals. Although methodological guidelines have been published for the perceptual speech evaluation of individuals with cleft palate, there has been no critical review of methodological issues in studies of Putonghua speakers with cleft palate. A literature search was conducted to identify relevant studies published over the past 30 years in Chinese language journals. Only studies incorporating perceptual analysis of speech were included. Thirty-seven articles which met inclusion criteria were analyzed and coded on a number of methodological variables. Reliability was established by having all variables recoded for all studies. This critical review identified many methodological issues. These design flaws make it difficult to draw reliable conclusions about characteristic speech errors in this group of speakers. Specific recommendations are made to improve the reliability and validity of future studies, as well as to facilitate cross-center comparisons.

  5. Microscopic prediction of speech recognition for listeners with normal hearing in noise using an auditory model.

    Science.gov (United States)

    Jürgens, Tim; Brand, Thomas

    2009-11-01

This study compares the phoneme recognition performance in speech-shaped noise of a microscopic model for speech recognition with the performance of normal-hearing listeners. "Microscopic" is defined in terms of this model twofold. First, the speech recognition rate is predicted on a phoneme-by-phoneme basis. Second, microscopic modeling means that the signal waveforms to be recognized are processed by mimicking elementary parts of human auditory processing. The model is based on an approach by Holube and Kollmeier [J. Acoust. Soc. Am. 100, 1703-1716 (1996)] and consists of a psychoacoustically and physiologically motivated preprocessing and a simple dynamic-time-warp speech recognizer. The model is evaluated while presenting nonsense speech in a closed-set paradigm. Averaged phoneme recognition rates, specific phoneme recognition rates, and phoneme confusions are analyzed. The influence of different perceptual distance measures and of the model's a priori knowledge is investigated. The results show that human performance can be predicted by this model using an optimal detector, i.e., identical speech waveforms for both training of the recognizer and testing. The best model performance is yielded by distance measures which focus mainly on small perceptual distances and neglect outliers.
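
    The model's back end is a dynamic-time-warp (DTW) recognizer, which aligns an incoming sequence of feature frames against stored templates and accumulates frame-by-frame distances along the best warping path. The sketch below is a generic numpy illustration of that alignment step, not the published model: the Euclidean frame distance is a stand-in for the perceptual distance measures the study actually compares.

        # Generic DTW template matching; `template` and `test` are
        # (frames x features) arrays of auditory-model output frames.
        import numpy as np

        def dtw_distance(template: np.ndarray, test: np.ndarray) -> float:
            """Accumulated cost of the best warping path between two utterances."""
            n, m = len(template), len(test)
            # Local frame-to-frame distances (Euclidean stand-in for a
            # perceptual distance measure).
            local = np.linalg.norm(template[:, None, :] - test[None, :, :], axis=2)
            acc = np.full((n + 1, m + 1), np.inf)
            acc[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    acc[i, j] = local[i - 1, j - 1] + min(
                        acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            return acc[n, m]

        # Recognition picks the phoneme template with the lowest cost, e.g.:
        # best = min(templates, key=lambda t: dtw_distance(t, observed))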

  6. A keyword spotting model using perceptually significant energy features

    Science.gov (United States)

    Umakanthan, Padmalochini

The task of a keyword recognition system is to detect the presence of certain words in a conversation based on the linguistic information present in human speech. Such keyword spotting systems have applications in homeland security, telephone surveillance and human-computer interfacing. The general procedure of a keyword spotting system involves feature generation and matching. In this work, a new set of features based on the psychoacoustic masking nature of human speech is proposed. After developing these features, a time-aligned pattern-matching process was implemented to locate the keywords in a set of unknown words. A word boundary detection technique based on frame classification using the nonlinear characteristics of speech is also addressed in this work. Validation of this keyword spotting model was done using the widely used cepstral features. The experimental results indicate the viability of using these perceptually significant features as an augmented feature set in keyword spotting.
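
    The cepstral baseline used for validation corresponds to standard MFCC extraction; the work's own perceptually significant energy features are not reproduced here. A minimal sketch of the conventional feature-generation step, assuming librosa as the tool and an illustrative file name and frame rate:

        # Conventional cepstral (MFCC) feature generation for keyword spotting.
        import librosa

        y, sr = librosa.load("utterance.wav", sr=16000)  # hypothetical file
        # 13 cepstral coefficients per 10 ms frame (hop of 160 samples at 16 kHz).
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, hop_length=160)
        print(mfcc.shape)  # (13, n_frames), fed to the pattern-matching stage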

  7. A Diagnostic Marker to Discriminate Childhood Apraxia of Speech from Speech Delay: II. Validity Studies of the Pause Marker

    Science.gov (United States)

    Shriberg, Lawrence D.; Strand, Edythe A.; Fourakis, Marios; Jakielski, Kathy J.; Hall, Sheryl D.; Karlsson, Heather B.; Mabie, Heather L.; McSweeny, Jane L.; Tilkens, Christie M.; Wilson, David L.

    2017-01-01

    Purpose: The purpose of this 2nd article in this supplement is to report validity support findings for the Pause Marker (PM), a proposed single-sign diagnostic marker of childhood apraxia of speech (CAS). Method: PM scores and additional perceptual and acoustic measures were obtained from 296 participants in cohorts with idiopathic and…

  8. Multidimensional assessment of strongly irregular voices such as in substitution voicing and spasmodic dysphonia: a compilation of own research.

    Science.gov (United States)

    Moerman, Mieke; Martens, Jean-Pierre; Dejonckere, Philippe

    2015-04-01

    This article is a compilation of own research performed during the European COoperation in Science and Technology (COST) action 2103: 'Advance Voice Function Assessment', an initiative of voice and speech processing teams consisting of physicists, engineers, and clinicians. This manuscript concerns analyzing largely irregular voicing types, namely substitution voicing (SV) and adductor spasmodic dysphonia (AdSD). A specific perceptual rating scale (IINFVo) was developed, and the Auditory Model Based Pitch Extractor (AMPEX), a piece of software that automatically analyses running speech and generates pitch values in background noise, was applied. The IINFVo perceptual rating scale has been shown to be useful in evaluating SV. The analysis of strongly irregular voices stimulated a modification of the European Laryngological Society's assessment protocol which was originally designed for the common types of (less severe) dysphonia. Acoustic analysis with AMPEX demonstrates that the most informative features are, for SV, the voicing-related acoustic features and, for AdSD, the perturbation measures. Poor correlations between self-assessment and acoustic and perceptual dimensions in the assessment of highly irregular voices argue for a multidimensional approach.

  9. [Design of standard voice sample text for subjective auditory perceptual evaluation of voice disorders].

    Science.gov (United States)

    Li, Jin-rang; Sun, Yan-yan; Xu, Wen

    2010-09-01

To design a speech voice sample text with all phonemes in Mandarin for subjective auditory perceptual evaluation of voice disorders. The principles for the design of a speech voice sample text are: the short text should include the 21 initials and 39 finals, so as to cover all the phonemes in Mandarin; also, the short text should be meaningful. A short text was produced. It had 155 Chinese words and included 21 initials and 38 finals (the final ê was not included because it is rarely used in Mandarin). The text also covered 17 light tones and one "Erhua". The constituent ratios of the initials and finals presented in this short text were statistically similar to those in Mandarin according to the method of similarity of the sample and population (r = 0.742, P < 0.05), whereas the constituent ratios of the tones in this short text were statistically not similar to those in Mandarin (r = 0.731, P > 0.05). A speech voice sample text with all phonemes in Mandarin was produced. The constituent ratios of the initials and finals presented in this short text are similar to those in Mandarin. Its value for the subjective auditory perceptual evaluation of voice disorders needs further study.
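
    The sample-versus-population similarity reported above reduces to correlating the constituent ratios of phonemes in the short text against their ratios in Mandarin at large. A minimal sketch with hypothetical frequencies, using scipy's Pearson correlation (whether the original authors used exactly this statistic is an assumption):

        # Correlate phoneme constituent ratios: short text vs. Mandarin overall.
        from scipy.stats import pearsonr

        text_ratios     = [0.082, 0.041, 0.037, 0.066, 0.025]  # hypothetical
        mandarin_ratios = [0.079, 0.045, 0.033, 0.061, 0.028]  # hypothetical

        r, p = pearsonr(text_ratios, mandarin_ratios)
        print(f"r = {r:.3f}, p = {p:.3f}")  # high r with p < 0.05 -> similar ratios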

  10. Assessment of speech intelligibility in background noise and reverberation

    DEFF Research Database (Denmark)

    Nielsen, Jens Bo

    Reliable methods for assessing speech intelligibility are essential within hearing research, audiology, and related areas. Such methods can be used for obtaining a better understanding of how speech intelligibility is affected by, e.g., various environmental factors or different types of hearing...... impairment. In this thesis, two sentence-based tests for speech intelligibility in Danish were developed. The first test is the Conversational Language Understanding Evaluation (CLUE), which is based on the principles of the original American-English Hearing in Noise Test (HINT). The second test...... is a modified version of CLUE where the speech material and the scoring rules have been reconsidered. An extensive validation of the modified test was conducted with both normal-hearing and hearing-impaired listeners. The validation showed that the test produces reliable results for both groups of listeners...

  11. Kinematic Analysis of Speech Sound Sequencing Errors Induced by Delayed Auditory Feedback.

    Science.gov (United States)

    Cler, Gabriel J; Lee, Jackson C; Mittelman, Talia; Stepp, Cara E; Bohland, Jason W

    2017-06-22

    Delayed auditory feedback (DAF) causes speakers to become disfluent and make phonological errors. Methods for assessing the kinematics of speech errors are lacking, with most DAF studies relying on auditory perceptual analyses, which may be problematic, as errors judged to be categorical may actually represent blends of sounds or articulatory errors. Eight typical speakers produced nonsense syllable sequences under normal and DAF (200 ms). Lip and tongue kinematics were captured with electromagnetic articulography. Time-locked acoustic recordings were transcribed, and the kinematics of utterances with and without perceived errors were analyzed with existing and novel quantitative methods. New multivariate measures showed that for 5 participants, kinematic variability for productions perceived to be error free was significantly increased under delay; these results were validated by using the spatiotemporal index measure. Analysis of error trials revealed both typical productions of a nontarget syllable and productions with articulatory kinematics that incorporated aspects of both the target and the perceived utterance. This study is among the first to characterize articulatory changes under DAF and provides evidence for different classes of speech errors, which may not be perceptually salient. New methods were developed that may aid visualization and analysis of large kinematic data sets. https://doi.org/10.23641/asha.5103067.

  12. Contextual modulation of reading rate for direct versus indirect speech quotations.

    Science.gov (United States)

    Yao, Bo; Scheepers, Christoph

    2011-12-01

    In human communication, direct speech (e.g., Mary said: "I'm hungry") is perceived to be more vivid than indirect speech (e.g., Mary said [that] she was hungry). However, the processing consequences of this distinction are largely unclear. In two experiments, participants were asked to either orally (Experiment 1) or silently (Experiment 2, eye-tracking) read written stories that contained either a direct speech or an indirect speech quotation. The context preceding those quotations described a situation that implied either a fast-speaking or a slow-speaking quoted protagonist. It was found that this context manipulation affected reading rates (in both oral and silent reading) for direct speech quotations, but not for indirect speech quotations. This suggests that readers are more likely to engage in perceptual simulations of the reported speech act when reading direct speech as opposed to meaning-equivalent indirect speech quotations, as part of a more vivid representation of the former. Copyright © 2011 Elsevier B.V. All rights reserved.

  13. Audiovisual spoken word training can promote or impede auditory-only perceptual learning: prelingually deafened adults with late-acquired cochlear implants versus normal hearing adults.

    Science.gov (United States)

    Bernstein, Lynne E; Eberhardt, Silvio P; Auer, Edward T

    2014-01-01

    Training with audiovisual (AV) speech has been shown to promote auditory perceptual learning of vocoded acoustic speech by adults with normal hearing. In Experiment 1, we investigated whether AV speech promotes auditory-only (AO) perceptual learning in prelingually deafened adults with late-acquired cochlear implants. Participants were assigned to learn associations between spoken disyllabic C(=consonant)V(=vowel)CVC non-sense words and non-sense pictures (fribbles), under AV and then AO (AV-AO; or counter-balanced AO then AV, AO-AV, during Periods 1 then 2) training conditions. After training on each list of paired-associates (PA), testing was carried out AO. Across all training, AO PA test scores improved (7.2 percentage points) as did identification of consonants in new untrained CVCVC stimuli (3.5 percentage points). However, there was evidence that AV training impeded immediate AO perceptual learning: During Period-1, training scores across AV and AO conditions were not different, but AO test scores were dramatically lower in the AV-trained participants. During Period-2 AO training, the AV-AO participants obtained significantly higher AO test scores, demonstrating their ability to learn the auditory speech. Across both orders of training, whenever training was AV, AO test scores were significantly lower than training scores. Experiment 2 repeated the procedures with vocoded speech and 43 normal-hearing adults. Following AV training, their AO test scores were as high as or higher than following AO training. Also, their CVCVC identification scores patterned differently than those of the cochlear implant users. In Experiment 1, initial consonants were most accurate, and in Experiment 2, medial consonants were most accurate. We suggest that our results are consistent with a multisensory reverse hierarchy theory, which predicts that, whenever possible, perceivers carry out perceptual tasks immediately based on the experience and biases they bring to the task. We

  14. Characterization of the voice of children with mouth breathing caused by four different etiologies using perceptual and acoustic analyses

    Directory of Open Access Journals (Sweden)

    Rosana Tiepo Arévalo

    2005-09-01

Full Text Available Objective: To describe vocal characteristics in children aged five to twelve years with mouth breathing caused by four etiologies: chronic rhinitis, hypertrophy, hypertrophy + chronic rhinitis and functional condition, using perceptual evaluation and acoustic analysis. Methods: Voice recordings of 120 mouth breathers judged by four speech pathologists using the software Multi-Speech. Results: The perceptual evaluation of the voice revealed high incidence of breathy and hoarse voices, especially in the rhinitis group. Most cases were moderate, with low pitch and normal loudness. Hyponasality was found in over 50% of the sample, as expected, but we also found high occurrence of laryngeal resonance, especially in the rhinitis group. Mean fundamental frequency was 24.81 Hz, SD = 15.02; jitter = 2.17; shimmer = 0.44, and HNR = 2.11. Values did not show statistically significant difference among the groups. Conclusion: Perceptual evaluation of the voice revealed that most mouth breathers presented hoarse and breathy voice, low pitch, normal loudness and hyponasal and laryngeal resonance. However, the acoustic analysis did not result in any significant condition.

  15. Methodology for Speech Assessment in the Scandcleft Project-An International Randomized Clinical Trial on Palatal Surgery

    DEFF Research Database (Denmark)

    Willadsen, Elisabeth

    2009-01-01

    Objective: To present the methodology for speech assessment in the Scandcleft project and discuss issues from a pilot study. Design: Description of methodology and blinded test for speech assessment. Speech samples and instructions for data collection and analysis for comparisons of speech outcomes...... across five included languages were developed and tested. Participants and Materials: Randomly selected video recordings of 10 5-year-old children from each language (n = 50) were included in the project. Speech material consisted of test consonants in single words, connected speech, and syllable chains......-sum and the overall rating of VPC was 78%. Conclusions: Pooling data of speakers of different languages in the same trial and comparing speech outcome across trials seems possible if the assessment of speech concerns consonants and is confined to speech units that are phonetically similar across languages. Agreed...

  16. Speech Perception Engages a General Timer: Evidence from a Divided Attention Word Identification Task

    Science.gov (United States)

    Casini, Laurence; Burle, Boris; Nguyen, Noel

    2009-01-01

    Time is essential to speech. The duration of speech segments plays a critical role in the perceptual identification of these segments, and therefore in that of spoken words. Here, using a French word identification task, we show that vowels are perceived as shorter when attention is divided between two tasks, as compared to a single task control…

  17. When cognition kicks in: Working memory and speech understanding in noise

    Directory of Open Access Journals (Sweden)

    Jerker Ronnberg

    2010-01-01

Full Text Available Perceptual load and cognitive load can be separately manipulated and dissociated in their effects on speech understanding in noise. The Ease of Language Understanding model assumes a theoretical position where perceptual task characteristics interact with the individual's implicit capacities to extract the phonological elements of speech. Phonological precision and speed of lexical access are important determinants for listening in adverse conditions. If there are mismatches between the phonological elements perceived and phonological representations in long-term memory, explicit working memory (WM)-related capacities will be continually invoked to reconstruct and infer the contents of the ongoing discourse. Whether this induces a high cognitive load or not will in turn depend on the individual's storage and processing capacities in WM. Data suggest that modulated noise maskers may serve as triggers for speech maskers and therefore induce a WM, explicit mode of processing. Individuals with high WM capacity benefit more than low WM-capacity individuals from fast amplitude compression at low or negative input speech-to-noise ratios. The general conclusion is that there is an overarching interaction between the focal purpose of processing in the primary listening task and the extent to which a secondary, distracting task taps into these processes.

  18. Learning foreign sounds in an alien world: videogame training improves non-native speech categorization.

    Science.gov (United States)

    Lim, Sung-joo; Holt, Lori L

    2011-01-01

    Although speech categories are defined by multiple acoustic dimensions, some are perceptually weighted more than others and there are residual effects of native-language weightings in non-native speech perception. Recent research on nonlinguistic sound category learning suggests that the distribution characteristics of experienced sounds influence perceptual cue weights: Increasing variability across a dimension leads listeners to rely upon it less in subsequent category learning (Holt & Lotto, 2006). The present experiment investigated the implications of this among native Japanese learning English /r/-/l/ categories. Training was accomplished using a videogame paradigm that emphasizes associations among sound categories, visual information, and players' responses to videogame characters rather than overt categorization or explicit feedback. Subjects who played the game for 2.5h across 5 days exhibited improvements in /r/-/l/ perception on par with 2-4 weeks of explicit categorization training in previous research and exhibited a shift toward more native-like perceptual cue weights. Copyright © 2011 Cognitive Science Society, Inc.

  19. Clear speech and lexical competition in younger and older adult listeners.

    Science.gov (United States)

    Van Engen, Kristin J

    2017-08-01

    This study investigated whether clear speech reduces the cognitive demands of lexical competition by crossing speaking style with lexical difficulty. Younger and older adults identified more words in clear versus conversational speech and more easy words than hard words. An initial analysis suggested that the effect of lexical difficulty was reduced in clear speech, but more detailed analyses within each age group showed this interaction was significant only for older adults. The results also showed that both groups improved over the course of the task and that clear speech was particularly helpful for individuals with poorer hearing: for younger adults, clear speech eliminated hearing-related differences that affected performance on conversational speech. For older adults, clear speech was generally more helpful to listeners with poorer hearing. These results suggest that clear speech affords perceptual benefits to all listeners and, for older adults, mitigates the cognitive challenge associated with identifying words with many phonological neighbors.

  20. Vocal Function Exercises for Muscle Tension Dysphonia: Auditory-Perceptual Evaluation and Self-Assessment Rating.

    Science.gov (United States)

    Jafari, Narges; Salehi, Abolfazl; Izadi, Farzad; Talebian Moghadam, Saeed; Ebadi, Abbas; Dabirmoghadam, Payman; Faham, Maryam; Shahbazi, Mehdi

    2017-07-01

Muscle tension dysphonia (MTD) is a functional dysphonia, which appears with an excessive tension in the intrinsic and extrinsic laryngeal musculature. MTD can affect voice quality and quality of life. The purpose of the present study was to assess the effectiveness of vocal function exercises (VFEs) on perceptual and self-assessment ratings in a group of 15 subjects with MTD. The study comprised 15 subjects with MTD (8 men and 7 women, mean age 39.8 years, standard deviation 10.6, age range 24-62 years). All participants were native Persian speakers who underwent a 6-week course of VFEs. The Voice Handicap Index (VHI) (the self-assessment scale) and the Grade, Roughness, Breathiness, Asthenia, Strain (GRBAS) scale (perceptual rating of voice quality) were used to compare pre- and post-VFEs. GRBAS data of patients before and after VFEs were compared using the Wilcoxon signed-rank test, and VHI data of patients pre- and post-VFEs were compared using Student's paired t test. The perceptual parameters showed a statistically significant improvement in subjects with MTD after voice therapy (significant at P < 0.05), as did the self-assessment ratings (measured with the VHI). As a result, the data provide evidence regarding the efficacy of VFEs in the treatment of patients with MTD. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
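
    The pre/post comparison described above pairs each subject's ratings before and after the exercise program. A minimal sketch of the GRBAS comparison using scipy's Wilcoxon signed-rank test; all rating values below are hypothetical, not the study's data:

        # Paired pre/post comparison of GRBAS grade (G) ratings on the 0-3 scale.
        from scipy.stats import wilcoxon

        pre  = [2, 3, 2, 2, 3, 1, 2, 3, 2, 2, 1, 3, 2, 2, 3]  # hypothetical
        post = [1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 0, 2, 1, 1, 2]  # hypothetical

        stat, p = wilcoxon(pre, post)
        print(f"W = {stat}, p = {p:.4f}")  # p < 0.05 -> significant improvement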

  1. Influence of musical training on perception of L2 speech

    NARCIS (Netherlands)

    Sadakata, M.; Zanden, L.D.T. van der; Sekiyama, K.

    2010-01-01

    The current study reports specific cases in which a positive transfer of perceptual ability from the music domain to the language domain occurs. We tested whether musical training enhances discrimination and identification performance of L2 speech sounds (timing features, nasal consonants and

  2. Validating a perceptual distraction model in a personal two-zone sound system

    DEFF Research Database (Denmark)

    Rämö, Jussi; Christensen, Lasse; Bech, Søren

    2017-01-01

    This paper focuses on validating a perceptual distraction model, which aims to predict user’s perceived distraction caused by audio-on-audio interference, e.g., two competing audio sources within the same listening space. Originally, the distraction model was trained with music-on-music stimuli...... that the model performance is equally good in both zones, i.e., with both speech-on-music and music-on-speech stimuli, and comparable to the previous validation round (RMSE approximately 10%). The results further confirm that the distraction model can be used as a valuable tool in evaluating and optimizing...

  3. Are mirror neurons the basis of speech perception? Evidence from five cases with damage to the purported human mirror system

    Science.gov (United States)

    Rogalsky, Corianne; Love, Tracy; Driscoll, David; Anderson, Steven W.; Hickok, Gregory

    2013-01-01

    The discovery of mirror neurons in macaque has led to a resurrection of motor theories of speech perception. Although the majority of lesion and functional imaging studies have associated perception with the temporal lobes, it has also been proposed that the ‘human mirror system’, which prominently includes Broca’s area, is the neurophysiological substrate of speech perception. Although numerous studies have demonstrated a tight link between sensory and motor speech processes, few have directly assessed the critical prediction of mirror neuron theories of speech perception, namely that damage to the human mirror system should cause severe deficits in speech perception. The present study measured speech perception abilities of patients with lesions involving motor regions in the left posterior frontal lobe and/or inferior parietal lobule (i.e., the proposed human ‘mirror system’). Performance was at or near ceiling in patients with fronto-parietal lesions. It is only when the lesion encroaches on auditory regions in the temporal lobe that perceptual deficits are evident. This suggests that ‘mirror system’ damage does not disrupt speech perception, but rather that auditory systems are the primary substrate for speech perception. PMID:21207313

  4. Sensorimotor influences on speech perception in infancy.

    Science.gov (United States)

    Bruderer, Alison G; Danielson, D Kyle; Kandhadai, Padmapriya; Werker, Janet F

    2015-11-03

    The influence of speech production on speech perception is well established in adults. However, because adults have a long history of both perceiving and producing speech, the extent to which the perception-production linkage is due to experience is unknown. We addressed this issue by asking whether articulatory configurations can influence infants' speech perception performance. To eliminate influences from specific linguistic experience, we studied preverbal, 6-mo-old infants and tested the discrimination of a nonnative, and hence never-before-experienced, speech sound distinction. In three experimental studies, we used teething toys to control the position and movement of the tongue tip while the infants listened to the speech sounds. Using ultrasound imaging technology, we verified that the teething toys consistently and effectively constrained the movement and positioning of infants' tongues. With a looking-time procedure, we found that temporarily restraining infants' articulators impeded their discrimination of a nonnative consonant contrast but only when the relevant articulator was selectively restrained to prevent the movements associated with producing those sounds. Our results provide striking evidence that even before infants speak their first words and without specific listening experience, sensorimotor information from the articulators influences speech perception. These results transform theories of speech perception by suggesting that even at the initial stages of development, oral-motor movements influence speech sound discrimination. Moreover, an experimentally induced "impairment" in articulator movement can compromise speech perception performance, raising the question of whether long-term oral-motor impairments may impact perceptual development.

  5. Speech Language Assessments in Te Reo in a Primary School Maori Immersion Unit

    Science.gov (United States)

    Naidoo, Kershni

    2012-01-01

    This research originated from the need for a speech and language therapy assessment in te reo Maori for a particular child who attended a Maori immersion unit. A Speech and Language Therapy te reo assessment had already been developed but it needed to be revised and normative data collected. Discussions and assessments were carried out in a…

  6. Multiple Transcoding Impact on Speech Quality in Ideal Network Conditions

    Directory of Open Access Journals (Sweden)

    Martin Mikulec

    2015-01-01

Full Text Available This paper deals with the impact of transcoding on speech quality. We have focused mainly on transcoding between codecs without the negative influence of network parameters such as packet loss and delay, which ensured objective and repeatable results from our measurement. The measurement was performed on the Transcoding Measuring System developed especially for this purpose. The system is based on open source projects and is useful as a design tool for VoIP system administrators. The paper compares the most used codecs from the transcoding perspective. Multiple transcodings between the G711, GSM and G729 codecs were performed and the speech quality of these calls was evaluated. The speech quality was measured by the Perceptual Evaluation of Speech Quality method, which provides results as a Mean Opinion Score used to describe speech quality on a scale from 1 to 5. The obtained results indicate speech quality degradation with every transcoding between two codecs.
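
    An offline PESQ measurement of the kind described can be scripted with the open-source pesq package, which is an assumption here, not the paper's own custom measuring system. Narrowband mode suits the 8 kHz telephony codecs compared above; file names are hypothetical.

        # PESQ (ITU-T P.862) score for a transcoded call, on the MOS-like scale.
        import soundfile as sf
        from pesq import pesq

        ref, fs = sf.read("reference_8k.wav")   # original 8 kHz mono speech
        deg, _  = sf.read("transcoded_8k.wav")  # same utterance after transcoding

        print(pesq(fs, ref, deg, "nb"))  # narrowband mode; higher is better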

  7. The Processing of Biologically Plausible and Implausible forms in American Sign Language: Evidence for Perceptual Tuning.

    Science.gov (United States)

    Almeida, Diogo; Poeppel, David; Corina, David

The human auditory system distinguishes speech-like information from general auditory signals in a remarkably fast and efficient way. Combining psychophysics and neurophysiology (MEG), we demonstrate a similar result for the processing of visual information used for language communication in users of sign languages. We demonstrate that the earliest visual cortical responses in deaf signers viewing American Sign Language (ASL) signs show specific modulations to violations of anatomic constraints that would make the sign either possible or impossible to articulate. These neural data are accompanied by a significantly increased perceptual sensitivity to the anatomical incongruity. The differential effects in the early visual evoked potentials arguably reflect an expectation-driven assessment of somatic representational integrity, suggesting that language experience and/or auditory deprivation may shape the neuronal mechanisms underlying the analysis of complex human form. The data demonstrate that the perceptual tuning that underlies the discrimination of language and non-language information is not limited to spoken languages but extends to languages expressed in the visual modality.

  8. Audiovisual perceptual learning with multiple speakers.

    Science.gov (United States)

    Mitchel, Aaron D; Gerfen, Chip; Weiss, Daniel J

    2016-05-01

One challenge for speech perception is between-speaker variability in the acoustic parameters of speech. For example, the same phoneme (e.g. the vowel in "cat") may have substantially different acoustic properties when produced by two different speakers and yet the listener must be able to interpret these disparate stimuli as equivalent. Perceptual tuning, the use of contextual information to adjust phonemic representations, may be one mechanism that helps listeners overcome obstacles they face due to this variability during speech perception. Here we test whether visual contextual cues to speaker identity may facilitate the formation and maintenance of distributional representations for individual speakers, allowing listeners to adjust phoneme boundaries in a speaker-specific manner. We familiarized participants to an audiovisual continuum between /aba/ and /ada/. During familiarization, the "B-face" mouthed /aba/ when an ambiguous token was played, while the "D-face" mouthed /ada/. At test, the same ambiguous token was more likely to be identified as /aba/ when paired with a stilled image of the "B-face" than with an image of the "D-face." This was not the case in the control condition when the two faces were paired equally with the ambiguous token. Together, these results suggest that listeners may form speaker-specific phonemic representations using facial identity cues.

  9. Developmental apraxia of speech in children. Quantitive assessment of speech characteristics

    NARCIS (Netherlands)

    Thoonen, G.H.J.

    1998-01-01

    Developmental apraxia of speech (DAS) in children is a speech disorder, supposed to have a neurological origin, which is commonly considered to result from particular deficits in speech processing (i.e., phonological planning, motor programming). However, the label DAS has often been used as

  10. Perceptual restoration of degraded speech is preserved with advancing age

    NARCIS (Netherlands)

    Saija, Jefta D; Akyürek, Elkan G; Andringa, Tjeerd C; Başkent, Deniz

    Cognitive skills, such as processing speed, memory functioning, and the ability to divide attention, are known to diminish with aging. The present study shows that, despite these changes, older adults can successfully compensate for degradations in speech perception. Critically, the older

  11. Brain Plasticity in Speech Training in Native English Speakers Learning Mandarin Tones

    Science.gov (United States)

    Heinzen, Christina Carolyn

    The current study employed behavioral and event-related potential (ERP) measures to investigate brain plasticity associated with second-language (L2) phonetic learning based on an adaptive computer training program. The program utilized the acoustic characteristics of Infant-Directed Speech (IDS) to train monolingual American English-speaking listeners to perceive Mandarin lexical tones. Behavioral identification and discrimination tasks were conducted using naturally recorded speech, carefully controlled synthetic speech, and non-speech control stimuli. The ERP experiments were conducted with selected synthetic speech stimuli in a passive listening oddball paradigm. Identical pre- and post- tests were administered on nine adult listeners, who completed two-to-three hours of perceptual training. The perceptual training sessions used pair-wise lexical tone identification, and progressed through seven levels of difficulty for each tone pair. The levels of difficulty included progression in speaker variability from one to four speakers and progression through four levels of acoustic exaggeration of duration, pitch range, and pitch contour. Behavioral results for the natural speech stimuli revealed significant training-induced improvement in identification of Tones 1, 3, and 4. Improvements in identification of Tone 4 generalized to novel stimuli as well. Additionally, comparison between discrimination of across-category and within-category stimulus pairs taken from a synthetic continuum revealed a training-induced shift toward more native-like categorical perception of the Mandarin lexical tones. Analysis of the Mismatch Negativity (MMN) responses in the ERP data revealed increased amplitude and decreased latency for pre-attentive processing of across-category discrimination as a result of training. There were also laterality changes in the MMN responses to the non-speech control stimuli, which could reflect reallocation of brain resources in processing pitch patterns

  12. Impact of cognitive function and dysarthria on spoken language and perceived speech severity in multiple sclerosis

    Science.gov (United States)

    Feenaughty, Lynda

Purpose: The current study sought to investigate the separate effects of dysarthria and cognitive status on global speech timing, speech hesitation, and linguistic complexity characteristics, and how these speech behaviors bear on listener impressions, across three connected speech tasks presumed to differ in cognitive-linguistic demand, for four carefully defined speaker groups: 1) MS with cognitive deficits (MSCI), 2) MS with clinically diagnosed dysarthria and intact cognition (MSDYS), 3) MS without dysarthria or cognitive deficits (MS), and 4) healthy talkers (CON). The relationship between neuropsychological test scores and speech-language production and perceptual variables for speakers with cognitive deficits was also explored. Methods: 48 speakers, including 36 individuals reporting a neurological diagnosis of MS and 12 healthy talkers, participated. The three MS groups and control group each contained 12 speakers (8 women and 4 men). Cognitive function was quantified using standard clinical tests of memory, information processing speed, and executive function. A standard z-score of ≤ -1.50 indicated deficits in a given cognitive domain. Three certified speech-language pathologists determined the clinical diagnosis of dysarthria for speakers with MS. Experimental speech tasks of interest included audio-recordings of an oral reading of the Grandfather passage and two spontaneous speech samples in the form of Familiar and Unfamiliar descriptive discourse. Various measures of spoken language were of interest. Suprasegmental acoustic measures included speech and articulatory rate. Linguistic speech hesitation measures included pause frequency (i.e., silent and filled pauses), mean silent pause duration, grammatical appropriateness of pauses, and interjection frequency. For the two discourse samples, three standard measures of language complexity were obtained including subordination index, inter-sentence cohesion adequacy, and lexical diversity. Ten listeners

  13. Noise and pitch interact during the cortical segregation of concurrent speech.

    Science.gov (United States)

    Bidelman, Gavin M; Yellamsetty, Anusha

    2017-08-01

Behavioral studies reveal listeners exploit intrinsic differences in voice fundamental frequency (F0) to segregate concurrent speech sounds-the so-called "F0-benefit." More favorable signal-to-noise ratio (SNR) in the environment, an extrinsic acoustic factor, similarly benefits the parsing of simultaneous speech. Here, we examined the neurobiological substrates of these two cues in the perceptual segregation of concurrent speech mixtures. We recorded event-related brain potentials (ERPs) while listeners performed a speeded double-vowel identification task. Listeners heard two concurrent vowels whose F0 differed by zero or four semitones presented in either clean (no noise) or noise-degraded (+5 dB SNR) conditions. Behaviorally, listeners were more accurate in correctly identifying both vowels for larger F0 separations but F0-benefit was more pronounced at more favorable SNRs (i.e., pitch × SNR interaction). Analysis of the ERPs revealed that only the P2 wave (∼200 ms) showed a similar F0 × SNR interaction as behavior and was correlated with listeners' perceptual F0-benefit. Neural classifiers applied to the ERPs further suggested that speech sounds are segregated neurally within 200 ms based on SNR whereas segregation based on pitch occurs later in time (400-700 ms). The earlier timing of extrinsic SNR compared to intrinsic F0-based segregation implies that the cortical extraction of speech from noise is more efficient than differentiating speech based on pitch cues alone, which may recruit additional cortical processes. Findings indicate that noise and pitch differences interact relatively early in cerebral cortex and that the brain arrives at the identities of concurrent speech mixtures as early as ∼200 ms. Copyright © 2017 Elsevier B.V. All rights reserved.
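
    The zero- versus four-semitone F0 separations follow the standard equal-tempered relation f2 = f1 * 2^(n/12). A worked example with an illustrative 100 Hz base (not a stimulus value taken from the study):

        # F0 of a second vowel 4 semitones above a 100 Hz base.
        f1 = 100.0
        f2 = f1 * 2 ** (4 / 12)
        print(f"{f2:.1f} Hz")  # ~126.0 Hz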

  14. The Role of the Listener's State in Speech Perception

    Science.gov (United States)

    Viswanathan, Navin

    2009-01-01

    Accounts of speech perception disagree on whether listeners perceive the acoustic signal (Diehl, Lotto, & Holt, 2004) or the vocal tract gestures that produce the signal (e.g., Fowler, 1986). In this dissertation, I outline a research program using a phenomenon called "perceptual compensation for coarticulation" (Mann, 1980) to examine this…

  15. Song and speech: examining the link between singing talent and speech imitation ability.

    Science.gov (United States)

    Christiner, Markus; Reiterer, Susanne M

    2013-01-01

    In previous research on speech imitation, musicality, and an ability to sing were isolated as the strongest indicators of good pronunciation skills in foreign languages. We, therefore, wanted to take a closer look at the nature of the ability to sing, which shares a common ground with the ability to imitate speech. This study focuses on whether good singing performance predicts good speech imitation. Forty-one singers of different levels of proficiency were selected for the study and their ability to sing, to imitate speech, their musical talent and working memory were tested. Results indicated that singing performance is a better indicator of the ability to imitate speech than the playing of a musical instrument. A multiple regression revealed that 64% of the speech imitation score variance could be explained by working memory together with educational background and singing performance. A second multiple regression showed that 66% of the speech imitation variance of completely unintelligible and unfamiliar language stimuli (Hindi) could be explained by working memory together with a singer's sense of rhythm and quality of voice. This supports the idea that both vocal behaviors have a common grounding in terms of vocal and motor flexibility, ontogenetic and phylogenetic development, neural orchestration and auditory memory with singing fitting better into the category of "speech" on the productive level and "music" on the acoustic level. As a result, good singers benefit from vocal and motor flexibility, productively and cognitively, in three ways. (1) Motor flexibility and the ability to sing improve language and musical function. (2) Good singers retain a certain plasticity and are open to new and unusual sound combinations during adulthood both perceptually and productively. (3) The ability to sing improves the memory span of the auditory working memory.

  16. Song and speech: examining the link between singing talent and speech imitation ability

    Directory of Open Access Journals (Sweden)

    Markus eChristiner

    2013-11-01

Full Text Available In previous research on speech imitation, musicality and an ability to sing were isolated as the strongest indicators of good pronunciation skills in foreign languages. We, therefore, wanted to take a closer look at the nature of the ability to sing, which shares a common ground with the ability to imitate speech. This study focuses on whether good singing performance predicts good speech imitation. Forty-one singers of different levels of proficiency were selected for the study and their ability to sing, to imitate speech, their musical talent and working memory were tested. Results indicated that singing performance is a better indicator of the ability to imitate speech than the playing of a musical instrument. A multiple regression revealed that 64% of the speech imitation score variance could be explained by working memory together with educational background and singing performance. A second multiple regression showed that 66% of the speech imitation variance of completely unintelligible and unfamiliar language stimuli (Hindi) could be explained by working memory together with a singer's sense of rhythm and quality of voice. This supports the idea that both vocal behaviors have a common grounding in terms of vocal and motor flexibility, ontogenetic and phylogenetic development, neural orchestration and sound memory with singing fitting better into the category of "speech" on the productive level and "music" on the acoustic level. As a result, good singers benefit from vocal and motor flexibility, productively and cognitively, in three ways. 1. Motor flexibility and the ability to sing improve language and musical function. 2. Good singers retain a certain plasticity and are open to new and unusual sound combinations during adulthood both perceptually and productively. 3. The ability to sing improves the memory span of the auditory short-term memory.

  17. Refining Stimulus Parameters in Assessing Infant Speech Perception Using Visual Reinforcement Infant Speech Discrimination: Sensation Level.

    Science.gov (United States)

    Uhler, Kristin M; Baca, Rosalinda; Dudas, Emily; Fredrickson, Tammy

    2015-01-01

    Speech perception measures have long been considered an integral piece of the audiological assessment battery. Currently, a prelinguistic, standardized measure of speech perception is missing in the clinical assessment battery for infants and young toddlers. Such a measure would allow systematic assessment of speech perception abilities of infants as well as the potential to investigate the impact early identification of hearing loss and early fitting of amplification have on the auditory pathways. To investigate the impact of sensation level (SL) on the ability of infants with normal hearing (NH) to discriminate /a-i/ and /ba-da/ and to determine if performance on the two contrasts are significantly different in predicting the discrimination criterion. The design was based on a survival analysis model for event occurrence and a repeated measures logistic model for binary outcomes. The outcome for survival analysis was the minimum SL for criterion and the outcome for the logistic regression model was the presence/absence of achieving the criterion. Criterion achievement was designated when an infant's proportion correct score was >0.75 on the discrimination performance task. Twenty-two infants with NH sensitivity participated in this study. There were 9 males and 13 females, aged 6-14 mo. Testing took place over two to three sessions. The first session consisted of a hearing test, threshold assessment of the two speech sounds (/a/ and /i/), and if time and attention allowed, visual reinforcement infant speech discrimination (VRISD). The second session consisted of VRISD assessment for the two test contrasts (/a-i/ and /ba-da/). The presentation level started at 50 dBA. If the infant was unable to successfully achieve criterion (>0.75) at 50 dBA, the presentation level was increased to 70 dBA followed by 60 dBA. Data examination included an event analysis, which provided the probability of criterion distribution across SL. The second stage of the analysis was a

  18. Treating speech subsystems in childhood apraxia of speech with tactual input: the PROMPT approach.

    Science.gov (United States)

    Dale, Philip S; Hayden, Deborah A

    2013-11-01

Prompts for Restructuring Oral Muscular Phonetic Targets (PROMPT; Hayden, 2004; Hayden, Eigen, Walker, & Olsen, 2010), a treatment approach for the improvement of speech sound disorders in children, uses tactile-kinesthetic-proprioceptive (TKP) cues to support and shape movements of the oral articulators. No research to date has systematically examined the efficacy of PROMPT for children with childhood apraxia of speech (CAS). Four children (ages 3;6 [years;months] to 4;8), all meeting the American Speech-Language-Hearing Association (2007) criteria for CAS, were treated using PROMPT. All children received 8 weeks of 2 × per week treatment, including at least 4 weeks of full PROMPT treatment that included TKP cues. During the first 4 weeks, 2 of the 4 children received treatment that included all PROMPT components except TKP cues. This design permitted both between-subjects and within-subjects comparisons to evaluate the effect of TKP cues. Gains in treatment were measured by standardized tests and by criterion-referenced measures based on the production of untreated probe words, reflecting change in speech movements and auditory perceptual accuracy. All 4 children made significant gains during treatment, but measures of motor speech control and untreated word probes provided evidence for more gain when TKP cues were included. PROMPT as a whole appears to be effective for treating children with CAS, and the inclusion of TKP cues appears to facilitate greater effect.

  19. Multiple Speech Source Separation Using Inter-Channel Correlation and Relaxed Sparsity

    Directory of Open Access Journals (Sweden)

    Maoshen Jia

    2018-01-01

    In this work, a multiple speech source separation method using inter-channel correlation and relaxed sparsity is proposed. A B-format microphone with four spatially located channels is adopted because its compact array size preserves the spatial parameter integrity of the original signal. Specifically, we first measure the proportion of overlapped components among multiple sources and find that the number of overlapped time-frequency (TF) components grows as the source number increases. Then, considering the relaxed sparsity of speech sources, we propose a dynamic threshold-based separation approach for sparse components, where the threshold is determined by the inter-channel correlation among the recorded signals. After conducting a statistical analysis of the number of active sources at each TF instant, a form of relaxed sparsity called the half-K assumption is proposed, under which the number of active sources in a given TF bin does not exceed half the total number of simultaneously occurring sources. By applying the half-K assumption, the non-sparse components are recovered by using the extracted sparse components as a guide, combined with vector decomposition and matrix factorization. Eventually, the final TF coefficients of each source are recovered by synthesizing the sparse and non-sparse components. The proposed method has been evaluated using up to six simultaneous speech sources under both anechoic and reverberant conditions. Both objective and subjective evaluations confirmed that the proposed approach outperforms existing blind source separation (BSS) approaches in the perceptual quality of the separated speech. It is also robust across different speech material, yielding separated sources of consistently similar perceptual quality.
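
    The abstract does not spell out the thresholding rule, but the sparse-component selection step can be sketched under one plausible reading: a TF bin is treated as sparse (dominated by a single source) when the channels of the array are strongly correlated there. Everything below (function name, window sizes, the 0.9 threshold) is illustrative.

        import numpy as np
        from scipy.signal import stft

        def sparse_tf_mask(x, fs, nperseg=1024, frames=5, thresh=0.9):
            """x: (channels, samples) recording, e.g. a 4-channel B-format signal."""
            _, _, X = stft(x, fs=fs, nperseg=nperseg)   # X: (ch, freq, time)
            C, F, T = X.shape
            mask = np.zeros((F, T), dtype=bool)
            for t in range(T):
                lo, hi = max(0, t - frames // 2), min(T, t + frames // 2 + 1)
                seg = X[:, :, lo:hi]                    # local TF context
                ref = seg[0]                            # channel 0 as reference
                num = np.abs((seg * ref.conj()).sum(axis=-1))
                den = np.sqrt((np.abs(seg) ** 2).sum(-1)) * np.sqrt((np.abs(ref) ** 2).sum(-1))
                corr = num / np.maximum(den, 1e-12)     # per-channel correlation
                # Sparse bin: every channel strongly correlated with the reference.
                mask[:, t] = corr.min(axis=0) > thresh
            return mask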

  20. Dynamic Assessment of Phonological Awareness for Children with Speech Sound Disorders

    Science.gov (United States)

    Gillam, Sandra Laing; Ford, Mikenzi Bentley

    2012-01-01

    The current study was designed to examine the relationships between performance on a nonverbal phoneme deletion task administered in a dynamic assessment format with performance on measures of phoneme deletion, word-level reading, and speech sound production that required verbal responses for school-age children with speech sound disorders (SSDs).…

  1. Speech Quality Measurement of GSM Infrastructure Built on USRP N210 and OpenBTS Project

    Directory of Open Access Journals (Sweden)

    Marcel Fajkus

    2014-01-01

    The paper deals with a methodology for measuring speech quality in GSM networks using Perceptual Evaluation of Speech Quality (PESQ). The paper presents results of practical measurements on a custom GSM network built on Universal Software Radio Peripheral (USRP) N210 hardware and OpenBTS software. This OpenBTS station was installed in open terrain, and speech quality was measured at different distances from the transmitter. The limiting parameters of the OpenBTS station with the USRP N210 were obtained.
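
    As an illustration of how such measurements are scored, the sketch below rates a degraded recording against its clean reference with PESQ (ITU-T P.862), using the open-source pesq Python package; the file names are placeholders and the package choice is an assumption, not the authors' toolchain.

        import soundfile as sf
        from pesq import pesq

        ref, fs = sf.read("reference_8k.wav")   # clean send-side signal, 8 kHz
        deg, _ = sf.read("received_8k.wav")     # same utterance after the GSM link
        # Narrowband mode matches GSM telephony bandwidth (fs must be 8000 Hz).
        print("PESQ MOS-LQO:", pesq(fs, ref, deg, "nb"))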

  2. Encoding, Memory, and Transcoding Deficits in Childhood Apraxia of Speech

    Science.gov (United States)

    Shriberg, Lawrence D.; Lohmeier, Heather L.; Strand, Edythe A.; Jakielski, Kathy J.

    2012-01-01

    A central question in Childhood Apraxia of Speech (CAS) is whether the core phenotype is limited to transcoding (planning/programming) deficits or if speakers with CAS also have deficits in auditory-perceptual "encoding" (representational) and/or "memory" (storage and retrieval of representations) processes. We addressed this and other questions…

  3. The Influence of Direct and Indirect Speech on Mental Representations

    NARCIS (Netherlands)

    A. Eerland (Anita); J.A.A. Engelen (Jan A.A.); R.A. Zwaan (Rolf)

    2013-01-01

    Language can be viewed as a set of cues that modulate the comprehender's thought processes. It is a very subtle instrument. For example, the literature suggests that people perceive direct speech (e.g., Joanne said: 'I went out for dinner last night') as more vivid and perceptually

  4. Combined effects of form- and meaning-based predictability on perceived clarity of speech.

    Science.gov (United States)

    Signoret, Carine; Johnsrude, Ingrid; Classon, Elisabet; Rudner, Mary

    2018-02-01

    The perceptual clarity of speech is influenced by more than just the acoustic quality of the sound; it also depends on contextual support. For example, a degraded sentence is perceived to be clearer when the content of the speech signal is provided as matching text (i.e., form-based predictability) before the degraded sentence is heard. Here, we investigate whether sentence-level semantic coherence (i.e., meaning-based predictability) enhances the perceptual clarity of degraded sentences and, if so, whether the mechanism is the same as that underlying enhancement by matching text. We also ask whether form- and meaning-based predictability are related to individual differences in cognitive abilities. Twenty participants listened to spoken sentences that were either clear or degraded by noise vocoding and rated the clarity of each item. The sentences had either high or low semantic coherence. Each spoken word was preceded by the homologous printed word (matching text) or by a meaningless letter string (nonmatching text). Cognitive abilities were measured with a working memory test. Results showed that perceptual clarity was significantly enhanced both by matching text and by semantic coherence. Importantly, high coherence enhanced the perceptual clarity of the degraded sentences even when they were preceded by matching text, suggesting that the effects of form- and meaning-based predictions on perceptual clarity are independent and additive. However, when working memory capacity indexed by the Size-Comparison Span Test was controlled for, only form-based predictions enhanced perceptual clarity, and then only at some sound quality levels, suggesting that prediction effects are to a certain extent dependent on cognitive abilities. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
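
    Noise vocoding, the degradation used here, replaces the fine structure in each frequency band with envelope-modulated noise. A generic sketch follows; the band count, filter orders, and cutoff values are illustrative rather than the study's settings.

        import numpy as np
        from scipy.signal import butter, sosfilt, sosfiltfilt

        def noise_vocode(x, fs, n_bands=6, f_lo=100.0, f_hi=7000.0, env_cut=30.0):
            """Replace each band's fine structure with envelope-modulated noise."""
            edges = np.geomspace(f_lo, f_hi, n_bands + 1)
            rng = np.random.default_rng(0)
            out = np.zeros(len(x))
            env_sos = butter(2, env_cut, "low", fs=fs, output="sos")
            for lo, hi in zip(edges[:-1], edges[1:]):
                band_sos = butter(4, [lo, hi], "bandpass", fs=fs, output="sos")
                band = sosfilt(band_sos, x)                  # analysis band
                env = sosfiltfilt(env_sos, np.abs(band))     # amplitude envelope
                noise = sosfilt(band_sos, rng.standard_normal(len(x)))
                out += np.clip(env, 0.0, None) * noise       # modulated carrier
            return out * (np.max(np.abs(x)) / np.max(np.abs(out)))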

  5. Audio-visual speech perception in infants and toddlers with Down syndrome, fragile X syndrome, and Williams syndrome.

    Science.gov (United States)

    D'Souza, Dean; D'Souza, Hana; Johnson, Mark H; Karmiloff-Smith, Annette

    2016-08-01

    Typically-developing (TD) infants can construct unified cross-modal percepts, such as a speaking face, by integrating auditory-visual (AV) information. This skill is a key building block upon which higher-level skills, such as word learning, are built. Because word learning is seriously delayed in most children with neurodevelopmental disorders, we assessed the hypothesis that this delay partly results from a deficit in integrating AV speech cues. AV speech integration has rarely been investigated in neurodevelopmental disorders, and never previously in infants. We probed for the McGurk effect, which occurs when the auditory component of one sound (/ba/) is paired with the visual component of another sound (/ga/), leading to the perception of an illusory third sound (/da/ or /tha/). We measured AV integration in 95 infants/toddlers with Down, fragile X, or Williams syndrome, whom we matched on Chronological and Mental Age to 25 TD infants. We also assessed a more basic AV perceptual ability: sensitivity to matching vs. mismatching AV speech stimuli. Infants with Williams syndrome failed to demonstrate a McGurk effect, indicating poor AV speech integration. Moreover, while the TD children discriminated between matching and mismatching AV stimuli, none of the other groups did, hinting at a basic deficit or delay in AV speech processing, which is likely to constrain subsequent language development. Copyright © 2016 Elsevier Inc. All rights reserved.

  6. Speech Evoked Auditory Brainstem Response in Stuttering

    Directory of Open Access Journals (Sweden)

    Ali Akbar Tahaei

    2014-01-01

    Auditory processing deficits have been hypothesized as an underlying mechanism for stuttering. Previous studies have demonstrated abnormal responses in subjects with persistent developmental stuttering (PDS) at higher levels of the central auditory system using speech stimuli. Recently, the potential usefulness of speech-evoked auditory brainstem responses in central auditory processing disorders has been emphasized. The current study used the speech-evoked ABR to investigate the hypothesis that subjects with PDS have specific auditory perceptual dysfunction. Objectives. To determine whether brainstem responses to speech stimuli differ between PDS subjects and normal fluent speakers. Methods. Twenty-five subjects with PDS participated in this study. The speech-ABRs were elicited by the 5-formant synthesized syllable /da/, with a duration of 40 ms. Results. There were significant group differences for the onset and offset transient peaks. Subjects with PDS had longer latencies for the onset and offset peaks relative to the control group. Conclusions. Subjects with PDS showed deficient neural timing in the early stages of the auditory pathway, consistent with temporal processing deficits, and this abnormal timing may underlie their disfluency.

  7. The Influence of Direct and Indirect Speech on Source Memory

    Directory of Open Access Journals (Sweden)

    Anita Eerland

    2018-02-01

    People perceive the same situation described in direct speech (e.g., John said, “I like the food at this restaurant”) as more vivid and perceptually engaging than one described in indirect speech (e.g., John said that he likes the food at the restaurant). So, if direct speech enhances the perception of vividness relative to indirect speech, what are the effects of using indirect speech? In four experiments, we examined whether the use of direct and indirect speech influences the comprehender’s memory for the identity of the speaker. Participants read a direct or an indirect speech version of a story and then addressed statements to one of the four protagonists of the story in a memory task. We found better source memory at the level of protagonist gender after indirect than direct speech (Exp. 1–3). When the story was rewritten to make the protagonists more distinctive, we also found an effect of speech type on source memory at the level of the individual, with better memory after indirect than direct speech (Exp. 3–4). Memory for the content of the story, however, was not influenced by speech type (Exp. 4). While previous research showed that direct speech may enhance memory for how something was said, we conclude that indirect speech enhances memory for who said what.

  8. Validating a perceptual distraction model in a personal two-zone sound system

    DEFF Research Database (Denmark)

    Rämö, Jussi; Christensen, Lasse; Bech, Søren

    2017-01-01

    This paper focuses on validating a perceptual distraction model, which aims to predict a user’s perceived distraction caused by audio-on-audio interference, e.g., two competing audio sources within the same listening space. Originally, the distraction model was trained with music-on-music stimuli... using a simple loudspeaker setup, consisting of only two loudspeakers, one for the target sound source and the other for the interfering sound source. Recently, the model was successfully validated in a complex personal sound-zone system with speech-on-music stimuli. A second round of validations was... conducted by physically altering the sound-zone system and running a set of new listening experiments utilizing two sound zones within the sound-zone system, thus validating the model using a different sound-zone system with both speech-on-music and music-on-speech stimulus sets. Preliminary results show...

  9. Mapping Speech Spectra from Throat Microphone to Close-Speaking Microphone: A Neural Network Approach

    Directory of Open Access Journals (Sweden)

    B. Yegnanarayana

    2007-01-01

    Speech recorded from a throat microphone is robust to surrounding noise but sounds unnatural, unlike speech recorded from a close-speaking microphone. This paper addresses the issue of improving the perceptual quality of throat microphone speech by mapping the speech spectra from the throat microphone to the close-speaking microphone. A neural network model is used to capture the speaker-dependent functional relationship between the feature vectors (cepstral coefficients) of the two speech signals. A method is proposed to ensure the stability of the all-pole synthesis filter. Objective evaluations indicate the effectiveness of the proposed mapping scheme. The advantage of this method is that the model gives a smooth estimate of the spectra of the close-speaking microphone speech. No distortions are perceived in the reconstructed speech. This mapping technique is also used for bandwidth extension of telephone speech.
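
    The shape of such a speaker-dependent mapping is easy to sketch. The stand-in below trains a small multilayer perceptron to map frame-wise throat-microphone cepstra to close-microphone cepstra; the network size, feature dimensionality, and random stand-in data are illustrative, not the paper's configuration.

        import numpy as np
        from sklearn.neural_network import MLPRegressor

        # Stand-ins for paired, time-aligned cepstral vectors extracted from
        # simultaneous throat- and close-microphone recordings of one speaker.
        rng = np.random.default_rng(0)
        throat_ceps = rng.standard_normal((5000, 13))
        close_ceps = rng.standard_normal((5000, 13))

        # One hidden layer approximating the spectral mapping between the mics.
        net = MLPRegressor(hidden_layer_sizes=(64,), max_iter=500)
        net.fit(throat_ceps, close_ceps)
        mapped = net.predict(throat_ceps[:10])  # estimated close-mic cepstra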

  10. Perceptual Speech and Paralinguistic Skills of Adolescents with Williams Syndrome

    Science.gov (United States)

    Hargrove, Patricia M.; Pittelko, Stephen; Fillingane, Evan; Rustman, Emily; Lund, Bonnie

    2013-01-01

    The purpose of this research was to compare selected speech and paralinguistic skills of speakers with Williams syndrome (WS) and typically developing peers and to demonstrate the feasibility of providing preexisting databases to students to facilitate graduate research. In a series of three studies, conversational samples of 12 adolescents with…

  11. Perceptual structure of adductor spasmodic dysphonia and its acoustic correlates.

    Science.gov (United States)

    Cannito, Michael P; Doiuchi, Maki; Murry, Thomas; Woodson, Gayle E

    2012-11-01

    To examine the perceptual structure of voice attributes in adductor spasmodic dysphonia (ADSD) before and after botulinum toxin treatment and identify acoustic correlates of underlying perceptual factors. Reliability of perceptual judgments is considered in detail. Pre- and posttreatment trial with comparison to healthy controls, using single-blind randomized listener judgments of voice qualities, as well as retrospective comparison with acoustic measurements. Oral readings were recorded from 42 ADSD speakers before and after treatment as well as from their age- and sex-matched controls. Experienced judges listened to speech samples and rated attributes of overall voice quality, breathiness, roughness, and brokenness, using computer-implemented visual analog scaling. Data were adjusted for regression to the mean and submitted to principal components factor analysis. Acoustic waveforms, extracted from the reading samples, were analyzed and measurements correlated with perceptual factor scores. Four reliable perceptual variables of ADSD voice were effectively reduced to two underlying factors that corresponded to hyperadduction, most strongly associated with roughness, and hypoadduction, most strongly associated with breathiness. After treatment, the hyperadduction factor improved, whereas the hypoadduction factor worsened. Statistically significant (P<0.01) correlations were observed between perceived roughness and four acoustic measures, whereas breathiness correlated with aperiodicity and cepstral peak prominence (CPPs). This study supported a two-factor model of ADSD, suggesting perceptual characterization by both hyperadduction and hypoadduction before and after treatment. Responses of the factors to treatment were consistent with previous research. Correlations among perceptual and acoustic variables suggested that multiple acoustic features contributed to the overall impression of roughness. Although CPPs appears to be a partial correlate of perceived
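
    The reduction of four rated attributes to two underlying factors can be illustrated with a principal components sketch; the ratings matrix below is random stand-in data, and scikit-learn's PCA is one convenient way to obtain factor scores of the kind correlated with the acoustic measures.

        import numpy as np
        from sklearn.decomposition import PCA

        # Stand-in listener ratings: rows = speakers, columns = the four rated
        # attributes (overall quality, breathiness, roughness, brokenness).
        rng = np.random.default_rng(0)
        ratings = rng.random((42, 4))

        pca = PCA(n_components=2)
        scores = pca.fit_transform(ratings)     # per-speaker factor scores
        print(pca.explained_variance_ratio_)    # variance carried by each factor
        # scores[:, 0] and scores[:, 1] can then be correlated with acoustic
        # measures (e.g., scipy.stats.pearsonr) as in the study.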

  12. Speech pathologists' current practice with cognitive-communication assessment during post-traumatic amnesia: a survey.

    Science.gov (United States)

    Steel, Joanne; Ferguson, Alison; Spencer, Elizabeth; Togher, Leanne

    2013-01-01

    To investigate speech pathologists' current practice with adults who are in post-traumatic amnesia (PTA). Speech pathologists with experience of adults in PTA were invited to take part in an online survey through Australian professional email/internet-based interest groups. Forty-five speech pathologists responded to the online survey. The majority of respondents (78%) reported using informal, observational assessment methods commencing at initial contact with people in PTA or when patients' level of alertness allowed and initiating formal assessment on emergence from PTA. Seven respondents (19%) reported undertaking no assessment during PTA. Clinicians described using a range of techniques to monitor cognitive-communication during PTA, including static, dynamic, functional and impairment-based methods. The study confirmed that speech pathologists have a key role in the multidisciplinary team caring for the person in PTA, especially with family education and facilitating interactions with the rehabilitation team and family. Decision-making around timing and means of assessment of cognitive-communication during PTA appeared primarily reliant on speech pathologists' professional experience and the culture of their workplace. The findings support the need for further research into the nature of cognitive-communication disorder and resolution over this period.

  13. Prediction and constraint in audiovisual speech perception

    Science.gov (United States)

    Peelle, Jonathan E.; Sommers, Mitchell S.

    2015-01-01

    During face-to-face conversational speech listeners must efficiently process a rapid and complex stream of multisensory information. Visual speech can serve as a critical complement to auditory information because it provides cues to both the timing of the incoming acoustic signal (the amplitude envelope, influencing attention and perceptual sensitivity) and its content (place and manner of articulation, constraining lexical selection). Here we review behavioral and neurophysiological evidence regarding listeners' use of visual speech information. Multisensory integration of audiovisual speech cues improves recognition accuracy, particularly for speech in noise. Even when speech is intelligible based solely on auditory information, adding visual information may reduce the cognitive demands placed on listeners through increasing precision of prediction. Electrophysiological studies demonstrate oscillatory cortical entrainment to speech in auditory cortex is enhanced when visual speech is present, increasing sensitivity to important acoustic cues. Neuroimaging studies also suggest increased activity in auditory cortex when congruent visual information is available, but additionally emphasize the involvement of heteromodal regions of posterior superior temporal sulcus as playing a role in integrative processing. We interpret these findings in a framework of temporally-focused lexical competition in which visual speech information affects auditory processing to increase sensitivity to auditory information through an early integration mechanism, and a late integration stage that incorporates specific information about a speaker's articulators to constrain the number of possible candidates in a spoken utterance. Ultimately it is words compatible with both auditory and visual information that most strongly determine successful speech perception during everyday listening. Thus, audiovisual speech perception is accomplished through multiple stages of integration, supported

  14. Prediction and constraint in audiovisual speech perception.

    Science.gov (United States)

    Peelle, Jonathan E; Sommers, Mitchell S

    2015-07-01

    During face-to-face conversational speech listeners must efficiently process a rapid and complex stream of multisensory information. Visual speech can serve as a critical complement to auditory information because it provides cues to both the timing of the incoming acoustic signal (the amplitude envelope, influencing attention and perceptual sensitivity) and its content (place and manner of articulation, constraining lexical selection). Here we review behavioral and neurophysiological evidence regarding listeners' use of visual speech information. Multisensory integration of audiovisual speech cues improves recognition accuracy, particularly for speech in noise. Even when speech is intelligible based solely on auditory information, adding visual information may reduce the cognitive demands placed on listeners through increasing the precision of prediction. Electrophysiological studies demonstrate that oscillatory cortical entrainment to speech in auditory cortex is enhanced when visual speech is present, increasing sensitivity to important acoustic cues. Neuroimaging studies also suggest increased activity in auditory cortex when congruent visual information is available, but additionally emphasize the involvement of heteromodal regions of posterior superior temporal sulcus as playing a role in integrative processing. We interpret these findings in a framework of temporally-focused lexical competition in which visual speech information affects auditory processing to increase sensitivity to acoustic information through an early integration mechanism, and a late integration stage that incorporates specific information about a speaker's articulators to constrain the number of possible candidates in a spoken utterance. Ultimately it is words compatible with both auditory and visual information that most strongly determine successful speech perception during everyday listening. Thus, audiovisual speech perception is accomplished through multiple stages of integration

  15. Song and speech: examining the link between singing talent and speech imitation ability

    Science.gov (United States)

    Christiner, Markus; Reiterer, Susanne M.

    2013-01-01

    In previous research on speech imitation, musicality and the ability to sing were isolated as the strongest indicators of good pronunciation skills in foreign languages. We therefore wanted to take a closer look at the nature of the ability to sing, which shares a common ground with the ability to imitate speech. This study focuses on whether good singing performance predicts good speech imitation. Forty-one singers of different levels of proficiency were selected for the study, and their ability to sing, their ability to imitate speech, their musical talent, and their working memory were tested. Results indicated that singing performance is a better indicator of the ability to imitate speech than the playing of a musical instrument. A multiple regression revealed that 64% of the speech imitation score variance could be explained by working memory together with educational background and singing performance. A second multiple regression showed that 66% of the speech imitation variance of completely unintelligible and unfamiliar language stimuli (Hindi) could be explained by working memory together with a singer's sense of rhythm and quality of voice. This supports the idea that both vocal behaviors have a common grounding in terms of vocal and motor flexibility, ontogenetic and phylogenetic development, neural orchestration, and auditory memory, with singing fitting better into the category of “speech” on the productive level and “music” on the acoustic level. As a result, good singers benefit from vocal and motor flexibility, productively and cognitively, in three ways. (1) Motor flexibility and the ability to sing improve language and musical function. (2) Good singers retain a certain plasticity and are open to new and unusual sound combinations during adulthood, both perceptually and productively. (3) The ability to sing improves the memory span of the auditory working memory. PMID:24319438

  16. Speech-language pathologists' practices regarding assessment, analysis, target selection, intervention, and service delivery for children with speech sound disorders.

    Science.gov (United States)

    McLeod, Sharynne; Baker, Elise

    2014-01-01

    A survey of 231 Australian speech-language pathologists (SLPs) was undertaken to describe practices regarding assessment, analysis, target selection, intervention, and service delivery for children with speech sound disorders (SSD). The participants typically worked in private practice, education, or community health settings and 67.6% had a waiting list for services. For each child, most of the SLPs spent 10-40 min in pre-assessment activities, 30-60 min undertaking face-to-face assessments, and 30-60 min completing paperwork after assessments. During an assessment SLPs typically conducted a parent interview, single-word speech sampling, collected a connected speech sample, and used informal tests. They also determined children's stimulability and estimated intelligibility. With multilingual children, informal assessment procedures and English-only tests were commonly used and SLPs relied on family members or interpreters to assist. Common analysis techniques included determination of phonological processes, substitutions-omissions-distortions-additions (SODA), and phonetic inventory. Participants placed high priority on selecting target sounds that were stimulable, early developing, and in error across all word positions, and 60.3% felt very confident or confident in selecting an appropriate intervention approach. Eight intervention approaches were frequently used: auditory discrimination, minimal pairs, cued articulation, phonological awareness, traditional articulation therapy, auditory bombardment, Nuffield Centre Dyspraxia Programme, and core vocabulary. Children typically received individual therapy with an SLP in a clinic setting. Parents often observed and participated in sessions and SLPs typically included siblings and grandparents in intervention sessions. Parent training and home programs were more frequently used than group therapy. Two-thirds kept up-to-date by reading journal articles monthly or every 6 months. There were many similarities with

  17. Referenceless Prediction of Perceptual Fog Density and Perceptual Image Defogging.

    Science.gov (United States)

    Choi, Lark Kwon; You, Jaehee; Bovik, Alan Conrad

    2015-11-01

    We propose a referenceless perceptual fog density prediction model based on natural scene statistics (NSS) and fog aware statistical features. The proposed model, called Fog Aware Density Evaluator (FADE), predicts the visibility of a foggy scene from a single image without reference to a corresponding fog-free image, without dependence on salient objects in a scene, without side geographical camera information, without estimating a depth-dependent transmission map, and without training on human-rated judgments. FADE only makes use of measurable deviations from statistical regularities observed in natural foggy and fog-free images. Fog aware statistical features that define the perceptual fog density index derive from a space domain NSS model and the observed characteristics of foggy images. FADE not only predicts perceptual fog density for the entire image, but also provides a local fog density index for each patch. The predicted fog density using FADE correlates well with human judgments of fog density taken in a subjective study on a large foggy image database. As applications, FADE not only accurately assesses the performance of defogging algorithms designed to enhance the visibility of foggy images, but also is well suited for image defogging. A new FADE-based referenceless perceptual image defogging approach, dubbed DEnsity of Fog Assessment-based DEfogger (DEFADE), achieves better results for darker, denser foggy images as well as on standard foggy images than state-of-the-art defogging methods. A software release of FADE and DEFADE is available online for public use: http://live.ece.utexas.edu/research/fog/index.html.
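
    One family of fog-aware NSS features of the kind FADE draws on can be sketched compactly: mean-subtracted contrast-normalized (MSCN) coefficients, whose statistics deviate from those of fog-free natural images as local contrast drops. This is a generic illustration, not FADE's full feature set.

        import numpy as np
        from scipy.ndimage import gaussian_filter

        def mscn(img, sigma=7.0 / 6.0):
            """MSCN coefficients of a grayscale image (float array)."""
            img = img.astype(float)
            mu = gaussian_filter(img, sigma)                   # local mean
            var = gaussian_filter(img * img, sigma) - mu * mu  # local variance
            return (img - mu) / (np.sqrt(np.clip(var, 0.0, None)) + 1.0)

        # Foggy patches show markedly lower MSCN variance than fog-free ones,
        # so statistics of mscn(img) feed naturally into a fog density index.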

  18. Simultaneous Assessment of Speech Identification and Spatial Discrimination

    Directory of Open Access Journals (Sweden)

    Jennifer K. Bizley

    2015-12-01

    With increasing numbers of children and adults receiving bilateral cochlear implants, there is an urgent need for assessment tools that enable testing of binaural hearing abilities. Current test batteries are either limited in scope or are of an impractical duration for routine testing. Here, we report a behavioral test that enables combined testing of speech identification and spatial discrimination in noise. In this task, multitalker babble was presented from all speakers, and pairs of speech tokens were sequentially presented from two adjacent speakers. Listeners were required to identify both words from a closed set of four possibilities and to determine whether the second token was presented to the left or right of the first. In Experiment 1, normal-hearing adult listeners were tested at 15° intervals throughout the frontal hemifield. Listeners showed highest spatial discrimination performance in and around the frontal midline, with a decline at more eccentric locations. In contrast, speech identification abilities were least accurate near the midline and showed an improvement in performance at more lateral locations. In Experiment 2, normal-hearing listeners were assessed using a restricted range of speaker locations designed to match those found in clinical testing environments. Here, speakers were separated by 15° around the midline and 30° at more lateral locations. This resulted in a similar pattern of behavioral results as in Experiment 1. We conclude that this test offers the potential to assess both spatial discrimination and the ability to use spatial information for unmasking in clinical populations.

  19. Signal Processing Methods for Removing the Effects of Whole Body Vibration upon Speech

    Science.gov (United States)

    Bitner, Rachel M.; Begault, Durand R.

    2014-01-01

    Humans may be exposed to whole-body vibration in environments where clear speech communications are crucial, particularly during the launch phases of space flight and in high-performance aircraft. Prior research has shown that high levels of vibration cause a decrease in speech intelligibility. However, the effects of whole-body vibration upon speech are not well understood, and no attempt has been made to restore speech distorted by whole-body vibration. In this paper, a model for speech under whole-body vibration is proposed and a method to remove its effect is described. The method described reduces the perceptual effects of vibration, yields higher automatic speech recognition (ASR) accuracy scores, and may significantly improve intelligibility. Possible applications include incorporation within communication systems to improve radio communications in environments such as spaceflight, aviation, or off-road vehicle operations.
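
    The abstract does not reproduce the proposed model, but one standard approach to this class of problem is adaptive cancellation with a vibration reference channel (e.g., an accelerometer on the seat or helmet). The LMS sketch below is that alternative technique, offered purely as an illustration; all parameters are arbitrary and the inputs are assumed to be NumPy arrays.

        import numpy as np

        def lms_cancel(speech, vib_ref, taps=64, mu=1e-3):
            """Subtract vibration-correlated components using an LMS filter."""
            w = np.zeros(taps)
            out = np.zeros(len(speech))
            for n in range(taps, len(speech)):
                x = vib_ref[n - taps:n][::-1]   # recent reference samples
                e = speech[n] - w @ x           # error = vibration-reduced speech
                w += mu * e * x                 # LMS weight update
                out[n] = e
            return out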

  20. Relative Contributions of the Dorsal vs. Ventral Speech Streams to Speech Perception are Context Dependent: a lesion study

    Directory of Open Access Journals (Sweden)

    Corianne Rogalsky

    2014-04-01

    The neural basis of speech perception has been debated for over a century. While it is generally agreed that the superior temporal lobes are critical for the perceptual analysis of speech, a major current topic is whether the motor system contributes to speech perception, with several conflicting findings attested. In a dorsal-ventral speech stream framework (Hickok & Poeppel, 2007), this debate is essentially about the roles of the dorsal versus ventral speech processing streams. A major roadblock in characterizing the neuroanatomy of speech perception is task-specific effects. For example, much of the evidence for dorsal stream involvement comes from syllable discrimination type tasks, which have been found to behaviorally doubly dissociate from auditory comprehension tasks (Baker et al., 1981). Discrimination task deficits could be a result of difficulty perceiving the sounds themselves, which is the typical assumption, or they could be a result of failures in temporary maintenance of the sensory traces, or in the comparison and/or decision process. Similar complications arise in perceiving sentences: the extent of inferior frontal (i.e., dorsal stream) activation during listening to sentences increases as a function of increased task demands (Love et al., 2006). Another complication is the stimulus: much evidence for dorsal stream involvement uses speech samples lacking semantic context (CVs, non-words). The present study addresses these issues in a large-scale lesion-symptom mapping study. 158 patients with focal cerebral lesions from the Multi-site Aphasia Research Consortium underwent a structural MRI or CT scan, as well as an extensive psycholinguistic battery. Voxel-based lesion-symptom mapping was used to compare the neuroanatomy involved in the following speech perception tasks with varying phonological, semantic, and task loads: (i) two discrimination tasks of syllables (non-words and words, respectively), and (ii) two auditory comprehension tasks
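
    The logic of voxel-based lesion-symptom mapping is simple to sketch: at each voxel, compare the behavioral scores of patients whose lesions include that voxel against those whose lesions spare it. The data shapes and the plain t-test below are illustrative; real analyses add permutation-based correction and minimum-lesion-count rules.

        import numpy as np
        from scipy.stats import ttest_ind

        rng = np.random.default_rng(0)
        lesions = rng.random((158, 1000)) > 0.9   # patients x voxels (lesioned?)
        scores = rng.standard_normal(158)         # one task score per patient

        tvals = np.array([
            ttest_ind(scores[lesions[:, v]], scores[~lesions[:, v]]).statistic
            if 5 <= lesions[:, v].sum() <= 153 else np.nan   # skip sparse voxels
            for v in range(lesions.shape[1])
        ])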

  1. The performance-perceptual test and its relationship to unaided reported handicap.

    Science.gov (United States)

    Saunders, Gabrielle H; Forsline, Anna; Fausti, Stephen A

    2004-04-01

    Measurement of hearing aid outcomes is necessary for demonstration of treatment efficacy, third-party payment, and cost-benefit analysis. Outcomes are usually measured with hearing-related questionnaires and/or tests of speech recognition. However, results from these two types of test often conflict. In this paper, we provide data from a new test measure, known as the Performance-Perceptual Test (PPT), in which subjective and performance aspects of hearing in noise are measured using the same test materials and procedures. A Performance Speech Reception Threshold (SRTN) and a Perceptual SRTN are measured using the Hearing In Noise Test materials and adaptive procedure. A third variable, the discrepancy between these two SRTNs, is also computed. It measures the accuracy with which subjects assess their own hearing ability and is referred to as the Performance-Perceptual Discrepancy (PPDIS). One hundred seven subjects between 24 and 83 yr of age took part. Thirty-three subjects had normal hearing, while the remaining seventy-four had symmetrical sensorineural hearing loss. Of the subjects with impaired hearing, 24 wore hearing aids and 50 did not. All subjects underwent routine audiological examination and completed the PPT and the Hearing Handicap Inventory for the Elderly/Adults on two occasions, between 1 and 2 wk apart. The PPT was conducted for unaided listening with the masker level set to 50, 65, and 80 dB SPL. PPT data show that the subjects with normal hearing have significantly better Performance and Perceptual SRTNs at each test level than the subjects with impaired hearing but that PPDIS values do not differ between the groups. Test-retest reliability for the PPT is excellent (r-values > 0.93 for all conditions). Stepwise multiple regression analysis showed that the Performance SRTN, the PPDIS, and age explain 40% of the variance in reported handicap (Hearing Handicap Inventory for the Elderly/Adults scores). More specifically, poorer performance
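
    The PPDIS is simply the signed gap between the two thresholds measured with the same materials. A one-line sketch follows, with the sign convention chosen here as an assumption (positive = the listener reports better hearing in noise than they demonstrate):

        def ppdis(performance_srtn_db, perceptual_srtn_db):
            """Performance-Perceptual Discrepancy in dB (sign convention assumed)."""
            return performance_srtn_db - perceptual_srtn_db

        print(ppdis(-4.5, -6.0))  # 1.5 dB: self-assessed ability exceeds measured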

  2. Recovering With Acquired Apraxia of Speech: The First 2 Years.

    Science.gov (United States)

    Haley, Katarina L; Shafer, Jennifer N; Harmon, Tyson G; Jacks, Adam

    2016-12-01

    This study was intended to document speech recovery for 1 person with acquired apraxia of speech quantitatively and on the basis of her lived experience. The second author sustained a traumatic brain injury that resulted in acquired apraxia of speech. Over a 2-year period, she documented her recovery through 22 video-recorded monologues. We analyzed these monologues using a combination of auditory perceptual, acoustic, and qualitative methods. Recovery was evident for all quantitative variables examined. For speech sound production, the recovery was most prominent during the first 3 months, but slower improvement was evident for many months. Measures of speaking rate, fluency, and prosody changed more gradually throughout the entire period. A qualitative analysis of topics addressed in the monologues was consistent with the quantitative speech recovery and indicated a subjective dynamic relationship between accuracy and rate, an observation that several factors made speech sound production variable, and a persisting need for cognitive effort while speaking. Speech features improved over an extended time, but the recovery trajectories differed, indicating dynamic reorganization of the underlying speech production system. The relationship among speech dimensions should be examined in other cases and in population samples. The combination of quantitative and qualitative analysis methods offers advantages for understanding clinically relevant aspects of recovery.

  3. Functional heterogeneity within the default network during semantic processing and speech production.

    Directory of Open Access Journals (Sweden)

    Mohamed L Seghier

    2012-08-01

    This fMRI study investigated the functional heterogeneity of the core nodes of the default mode network (DMN) during language processing. The core nodes of the DMN were defined as task-induced deactivations over multiple tasks in 94 healthy subjects. We used a factorial design that manipulated different tasks (semantic matching or speech production) and stimuli (familiar words and objects or unfamiliar stimuli), alternating with periods of fixation/rest. Our findings revealed several consistent effects in the DMN, namely less deactivation in the left inferior parietal lobule during semantic than perceptual matching, in parallel with greater deactivation during semantic matching in anterior subdivisions of the posterior cingulate cortex and the ventromedial prefrontal cortex. This suggests that, when the brain is engaged in effortful semantic tasks, a part of the DMN in the left angular gyrus was less deactivated while five other nodes of the DMN were more deactivated. These five DMN areas, where deactivation was greater for semantic than perceptual matching, were further differentiated because deactivation was greater in (i) the posterior ventromedial prefrontal cortex for speech production relative to semantic matching, (ii) the posterior precuneus and posterior cingulate cortex for perceptual processing relative to speech production, and (iii) the right inferior parietal cortex for pictures of objects relative to written words during both naming and semantic decisions. Our results thus highlight that task difficulty alone cannot fully explain the functional variability in task-induced deactivations. Together these results emphasize that core nodes within the DMN are functionally heterogeneous and differentially sensitive to the type of language processing.

  4. Finding the music of speech: Musical knowledge influences pitch processing in speech.

    Science.gov (United States)

    Vanden Bosch der Nederlanden, Christina M; Hannon, Erin E; Snyder, Joel S

    2015-10-01

    Few studies comparing music and language processing have adequately controlled for low-level acoustical differences, making it unclear whether differences in music and language processing arise from domain-specific knowledge, acoustic characteristics, or both. We controlled acoustic characteristics by using the speech-to-song illusion, which often results in a perceptual transformation to song after several repetitions of an utterance. Participants performed a same-different pitch discrimination task for the initial repetition (heard as speech) and the final repetition (heard as song). Better detection was observed for pitch changes that violated rather than conformed to Western musical scale structure, but only when utterances transformed to song, indicating that music-specific pitch representations were activated and influenced perception. This shows that music-specific processes can be activated when an utterance is heard as song, suggesting that the high-level status of a stimulus as either language or music can be behaviorally dissociated from low-level acoustic factors. Copyright © 2015 Elsevier B.V. All rights reserved.

  5. Interdependent processing and encoding of speech and concurrent background noise.

    Science.gov (United States)

    Cooper, Angela; Brouwer, Susanne; Bradlow, Ann R

    2015-05-01

    Speech processing can often take place in adverse listening conditions that involve the mixing of speech and background noise. In this study, we investigated processing dependencies between background noise and indexical speech features, using a speeded classification paradigm (Garner, 1974; Exp. 1), and whether background noise is encoded and represented in memory for spoken words in a continuous recognition memory paradigm (Exp. 2). Whether or not the noise spectrally overlapped with the speech signal was also manipulated. The results of Experiment 1 indicated that background noise and indexical features of speech (gender, talker identity) cannot be completely segregated during processing, even when the two auditory streams are spectrally nonoverlapping. Perceptual interference was asymmetric, whereby irrelevant indexical feature variation in the speech signal slowed noise classification to a greater extent than irrelevant noise variation slowed speech classification. This asymmetry may stem from the fact that speech features have greater functional relevance to listeners, and are thus more difficult to selectively ignore than background noise. Experiment 2 revealed that a recognition cost for words embedded in different types of background noise on the first and second occurrences only emerged when the noise and the speech signal were spectrally overlapping. Together, these data suggest integral processing of speech and background noise, modulated by the level of processing and the spectral separation of the speech and noise.

  6. L’unité intonative dans les textes oralisés // Intonation unit in read speech

    Directory of Open Access Journals (Sweden)

    Lea Tylečková

    2015-12-01

    Prosodic phrasing, i.e., the division of speech into intonation units, is a phenomenon central to language comprehension. Incorrect prosodic boundary markings may lead to serious misunderstandings and ambiguous interpretations of utterances. The present paper investigates the prosodic competencies of Czech students of French in the domain of prosodic phrasing in French read speech. Two texts of different lengths are examined through a perceptual method to observe how Czech speakers of French (B1–B2 level of the CEFR) divide read speech into prosodic units compared to French native speakers.

  7. Psychoacoustic cues to emotion in speech prosody and music.

    Science.gov (United States)

    Coutinho, Eduardo; Dibben, Nicola

    2013-01-01

    There is strong evidence of shared acoustic profiles common to the expression of emotions in music and speech, yet relatively limited understanding of the specific psychoacoustic features involved. This study combined a controlled experiment and computational modelling to investigate the perceptual codes associated with the expression of emotion in the acoustic domain. The empirical stage of the study provided continuous human ratings of emotions perceived in excerpts of film music and natural speech samples. The computational stage created a computer model that retrieves the relevant information from the acoustic stimuli and makes predictions about the emotional expressiveness of speech and music close to the responses of human subjects. We show that a significant part of the listeners' second-by-second reported emotions to music and speech prosody can be predicted from a set of seven psychoacoustic features: loudness, tempo/speech rate, melody/prosody contour, spectral centroid, spectral flux, sharpness, and roughness. The implications of these results are discussed in the context of cross-modal similarities in the communication of emotion in the acoustic domain.
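
    A few of the seven features named above are straightforward to extract with standard audio tooling; the sketch below uses librosa for three of them (the file path is a placeholder, and loudness, sharpness, and roughness proper require dedicated psychoacoustic models not shown here).

        import numpy as np
        import librosa

        y, sr = librosa.load("excerpt.wav", sr=None)   # music or speech sample
        S = np.abs(librosa.stft(y))                    # magnitude spectrogram

        centroid = librosa.feature.spectral_centroid(S=S, sr=sr)  # brightness
        flux = np.sqrt((np.diff(S, axis=1).clip(min=0) ** 2).sum(axis=0))
        rms = librosa.feature.rms(S=S)                 # crude loudness proxy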

  8. Speech-like orofacial oscillations in stump-tailed macaque (Macaca arctoides) facial and vocal signals.

    Science.gov (United States)

    Toyoda, Aru; Maruhashi, Tamaki; Malaivijitnond, Suchinda; Koda, Hiroki

    2017-10-01

    Speech is unique to humans and characterized by facial actions of ∼5 Hz oscillations of lip, mouth, or jaw movements. Lip-smacking, a facial display of primates characterized by oscillatory actions involving the vertical opening and closing of the jaw and lips, exhibits stable 5-Hz oscillation patterns, matching that of speech, suggesting that lip-smacking is a precursor of speech. We tested whether facial or vocal actions exhibiting this same oscillation rate are found across the wide range of facial and vocal displays produced in various social contexts, which show diversity among species. We observed facial and vocal actions of wild stump-tailed macaques (Macaca arctoides) and selected video clips including facial displays (teeth chattering; TC), panting calls, and feeding. Ten open-to-open mouth durations during TC and feeding and five amplitude peak-to-peak durations in panting were analyzed. The facial display (TC) and vocalization (panting) oscillated at 5.74 ± 1.19 and 6.71 ± 2.91 Hz, respectively, similar to the reported lip-smacking of long-tailed macaques and the speech of humans. These results indicate a common mechanism for the central pattern generator underlying orofacial movements, which would later evolve into speech. Similar oscillations in panting, which evolved from different muscular control than the orofacial actions, suggest sensory foundations for a perceptual saliency particular to 5-Hz rhythms in macaques. This supports the pre-adaptation hypothesis of speech evolution, which states that a central pattern generator for 5-Hz facial oscillation and a perceptual background tuned to 5-Hz actions existed in the common ancestors of macaques and humans before the emergence of speech. © 2017 Wiley Periodicals, Inc.
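
    The rates reported above follow from simple arithmetic on the measured cycle durations: each open-to-open (or peak-to-peak) interval is one oscillation period, and its reciprocal is the instantaneous rate. A toy example with made-up durations:

        import numpy as np

        durations_s = np.array([0.18, 0.17, 0.16, 0.19, 0.17,
                                0.18, 0.16, 0.17, 0.18, 0.17])  # 10 TC cycles
        rates_hz = 1.0 / durations_s                # per-cycle oscillation rates
        print(rates_hz.mean(), rates_hz.std())      # about 5.8 Hz, cf. 5.74 Hz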

  9. Visual speech alters the discrimination and identification of non-intact auditory speech in children with hearing loss.

    Science.gov (United States)

    Jerger, Susan; Damian, Markus F; McAlpine, Rachel P; Abdi, Hervé

    2017-03-01

    Understanding spoken language is an audiovisual event that depends critically on the ability to discriminate and identify phonemes, yet we have little evidence about the role of early auditory experience and visual speech on the development of these fundamental perceptual skills. Objectives of this research were to determine 1) how visual speech influences phoneme discrimination and identification; 2) whether visual speech influences these two processes in a like manner, such that discrimination predicts identification; and 3) how the degree of hearing loss affects this relationship. Such evidence is crucial for developing effective intervention strategies to mitigate the effects of hearing loss on language development. Participants were 58 children with early-onset sensorineural hearing loss (CHL, 53% girls, M = 9;4 yrs) and 58 children with normal hearing (CNH, 53% girls, M = 9;4 yrs). Test items were consonant-vowel (CV) syllables and nonwords with intact visual speech coupled to non-intact auditory speech (excised onsets) as, for example, an intact consonant/rhyme in the visual track (Baa or Baz) coupled to a non-intact onset/rhyme in the auditory track (/-B/aa or /-B/az). The items started with an easy-to-speechread /B/ or difficult-to-speechread /G/ onset and were presented in the auditory (static face) vs. audiovisual (dynamic face) modes. We assessed discrimination for intact vs. non-intact different pairs (e.g., Baa:/-B/aa). We predicted that visual speech would cause the non-intact onset to be perceived as intact and would therefore generate more "same" as opposed to "different" responses in the audiovisual than auditory mode. We assessed identification by repetition of nonwords with non-intact onsets (e.g., /-B/az). We predicted that visual speech would cause the non-intact onset to be perceived as intact and would therefore generate more "Baz" as opposed to "az" responses in the audiovisual than auditory mode. Performance in the audiovisual mode showed more same

  10. Visual Speech Alters the Discrimination and Identification of Non-Intact Auditory Speech in Children with Hearing Loss

    Science.gov (United States)

    Jerger, Susan; Damian, Markus F.; McAlpine, Rachel P.; Abdi, Hervé

    2017-01-01

    Objectives Understanding spoken language is an audiovisual event that depends critically on the ability to discriminate and identify phonemes yet we have little evidence about the role of early auditory experience and visual speech on the development of these fundamental perceptual skills. Objectives of this research were to determine 1) how visual speech influences phoneme discrimination and identification; 2) whether visual speech influences these two processes in a like manner, such that discrimination predicts identification; and 3) how the degree of hearing loss affects this relationship. Such evidence is crucial for developing effective intervention strategies to mitigate the effects of hearing loss on language development. Methods Participants were 58 children with early-onset sensorineural hearing loss (CHL, 53% girls, M = 9;4 yrs) and 58 children with normal hearing (CNH, 53% girls, M = 9;4 yrs). Test items were consonant-vowel (CV) syllables and nonwords with intact visual speech coupled to non-intact auditory speech (excised onsets) as, for example, an intact consonant/rhyme in the visual track (Baa or Baz) coupled to non-intact onset/rhyme in the auditory track (/–B/aa or /–B/az). The items started with an easy-to-speechread /B/ or difficult-to-speechread /G/ onset and were presented in the auditory (static face) vs. audiovisual (dynamic face) modes. We assessed discrimination for intact vs. non-intact different pairs (e.g., Baa:/–B/aa). We predicted that visual speech would cause the non-intact onset to be perceived as intact and would therefore generate more same—as opposed to different—responses in the audiovisual than auditory mode. We assessed identification by repetition of nonwords with non-intact onsets (e.g., /–B/az). We predicted that visual speech would cause the non-intact onset to be perceived as intact and would therefore generate more Baz—as opposed to az— responses in the audiovisual than auditory mode. Results

  11. [Improving speech comprehension using a new cochlear implant speech processor].

    Science.gov (United States)

    Müller-Deile, J; Kortmann, T; Hoppe, U; Hessel, H; Morsnowski, A

    2009-06-01

    The aim of this multicenter clinical field study was to assess the benefits of the new Freedom 24 sound processor for cochlear implant (CI) users implanted with the Nucleus 24 cochlear implant system. The study included 48 postlingually profoundly deaf experienced CI users who scored at least 80% correct on the Oldenburg sentence test (OLSA) in quiet with their current speech processor and who were able to perform adaptive speech threshold testing using the OLSA in noisy conditions. Following baseline measures of speech comprehension performance with their current speech processor, subjects were upgraded to the Freedom 24 speech processor. After a take-home trial period of at least 2 weeks, subject performance was evaluated by measuring the speech reception threshold with the Freiburg multisyllabic word test and speech intelligibility with the Freiburg monosyllabic word test at 50 dB and 70 dB in the sound field. The results demonstrated highly significant benefits for speech comprehension with the new speech processor. Significant benefits for speech comprehension were also demonstrated with the new speech processor when tested in competing background noise. In contrast, use of the Abbreviated Profile of Hearing Aid Benefit (APHAB) did not prove to be a suitably sensitive assessment tool for comparative subjective self-assessment of hearing benefits with each processor. Use of the preprocessing algorithm known as adaptive dynamic range optimization (ADRO) in the Freedom 24 led to additional improvements over the standard upgrade map for speech comprehension in quiet and showed equivalent performance in noise. Through use of the preprocessing beam-forming algorithm BEAM, subjects demonstrated a highly significant improvement in signal-to-noise ratio for speech comprehension thresholds (i.e., the signal-to-noise ratio for 50% speech comprehension scores) when tested with an adaptive procedure using the Oldenburg
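
    Adaptive speech-in-noise threshold testing of the kind described here homes in on the SNR for 50% comprehension by making the task harder after correct responses and easier after errors. The sketch below is a simplified 1-up/1-down staircase, not the OLSA's actual word-scoring rule; step sizes and trial counts are illustrative.

        def adaptive_srt(present_trial, n_trials=20, start_snr_db=0.0, step_db=2.0):
            """present_trial(snr_db) -> True if the sentence was repeated correctly."""
            snr, last, reversals = start_snr_db, None, []
            for _ in range(n_trials):
                correct = present_trial(snr)
                if last is not None and correct != last:
                    reversals.append(snr)           # direction change
                last = correct
                snr += -step_db if correct else step_db
            tail = reversals[-6:] or [snr]
            return sum(tail) / len(tail)            # SRT estimate (dB SNR)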

  12. Evaluative pressure overcomes perceptual load effects.

    Science.gov (United States)

    Normand, Alice; Autin, Frédérique; Croizet, Jean-Claude

    2015-06-01

    Perceptual load has been found to be a powerful bottom-up determinant of distractibility, with high perceptual load preventing distraction by any irrelevant information. However, when under evaluative pressure, individuals exert top-down attentional control by giving greater weight to task-relevant features, making them more distractible by task-relevant distractors. One study tested whether the top-down modulation of attention under evaluative pressure overcomes the beneficial bottom-up effect of high perceptual load on distraction. Using a response-competition task, we replicated previous findings that high levels of perceptual load suppress task-relevant distractor response interference, but only for participants in a control condition. Participants under evaluative pressure (i.e., who believed their intelligence was being assessed) showed interference from task-relevant distractors at all levels of perceptual load. This research challenges the assumptions of perceptual load theory and sheds light on a neglected determinant of distractibility: the self-relevance of the performance situation in which attentional control is solicited.

  13. Voice-associated static face image releases speech from informational masking.

    Science.gov (United States)

    Gao, Yayue; Cao, Shuyang; Qu, Tianshu; Wu, Xihong; Li, Haifeng; Zhang, Jinsheng; Li, Liang

    2014-06-01

    In noisy environments with multiple people talking, such as a cocktail party, listeners can use various perceptual and/or cognitive cues to improve recognition of target speech against masking, particularly informational masking. Previous studies have shown that temporally pre-presented voice cues (voice primes) improve recognition of target speech against speech masking but not noise masking. This study investigated whether static face image primes that have become target-voice associated (i.e., facial images linked through associative learning with voices reciting the target speech) can be used by listeners to unmask speech. The results showed that in 32 normal-hearing younger adults, temporally pre-presenting a voice-priming sentence with the same voice reciting the target sentence significantly improved the recognition of target speech that was masked by irrelevant two-talker speech. When a person's face photograph became associated with the voice reciting the target speech through learning, temporally pre-presenting the target-voice-associated face image significantly improved recognition of target speech against speech masking, particularly for the last two keywords in the target sentence. Moreover, speech-recognition performance under the voice-priming condition was significantly correlated with that under the face-priming condition. The results suggest that learned facial information on talker identity plays an important role in identifying the target talker's voice and in facilitating selective attention to the target-speech stream against the masking-speech stream. © 2014 The Institute of Psychology, Chinese Academy of Sciences and Wiley Publishing Asia Pty Ltd.

  14. "The Caterpillar": A Novel Reading Passage for Assessment of Motor Speech Disorders

    Science.gov (United States)

    Patel, Rupal; Connaghan, Kathryn; Franco, Diana; Edsall, Erika; Forgit, Dory; Olsen, Laura; Ramage, Lianna; Tyler, Emily; Russell, Scott

    2013-01-01

    Purpose: A review of the salient characteristics of motor speech disorders and common assessment protocols revealed the need for a novel reading passage tailored specifically to differentiate between and among the dysarthrias (DYSs) and apraxia of speech (AOS). Method: "The Caterpillar" passage was designed to provide a contemporary, easily read,…

  15. Speech-language pathologists' assessment and intervention practices with multilingual children.

    Science.gov (United States)

    Williams, Corinne J; McLeod, Sharynne

    2012-06-01

    Within predominantly English-speaking countries such as the US, UK, Canada, New Zealand, and Australia, a significant number of people speak languages other than English. This study aimed to examine Australian speech-language pathologists' (SLPs) perspectives and experiences of multilingualism, including their assessment and intervention practices and service delivery methods when working with children who speak languages other than English. A questionnaire was completed by 128 SLPs who attended an SLP seminar about cultural and linguistic diversity. Approximately half of the SLPs (48.4%) reported that they had at least minimal competence in a language other than English, but only 12 (9.4%) reported that they were proficient in another language. The SLPs spoke a total of 28 languages other than English, the most common being French, Italian, German, Spanish, Mandarin, and Auslan (Australian sign language). Participants reported that they had, in the past 12 months, worked with a mean of 59.2 (range 1-100) children from multilingual backgrounds. These children were reported to speak between two and five languages each, the most common being Vietnamese, Arabic, Cantonese, Mandarin, Australian Indigenous languages, Tagalog, Greek, and other Chinese languages. There was limited overlap between the languages spoken by the SLPs and the children on the SLPs' caseloads. Many of the SLPs assessed children's speech (50.5%) and/or language (34.2%) without assistance from others (including interpreters). English was the primary language used during assessment and intervention. The majority of SLPs always used informal speech (76.7%) and language (78.2%) assessments, and, if standardized tests were used, they were typically in English. The SLPs sought additional information about the children's languages and cultural backgrounds, but indicated that they had limited resources for distinguishing between a speech and language difference and a disorder.

  16. Perceptual learning.

    Science.gov (United States)

    Seitz, Aaron R

    2017-07-10

    Perceptual learning refers to how experience can change the way we perceive sights, sounds, smells, tastes, and touch. Examples abound: music training improves our ability to discern tones; experience with food and wines can refine our palate (and, unfortunately, more quickly empty our wallet); and with years of training, radiologists learn to save lives by discerning subtle details of images that escape the notice of untrained viewers. We often take perceptual learning for granted, but it has a profound impact on how we perceive the world. In this Primer, I will explain how perceptual learning is transformative in guiding our perceptual processes, how research into perceptual learning provides insight into fundamental mechanisms of learning and brain processes, and how knowledge of perceptual learning can be used to develop more effective training approaches for those requiring expert perceptual skills or those in need of perceptual rehabilitation (such as individuals with poor vision). I will make the case that perceptual learning is ubiquitous, scientifically interesting, and of substantial practical utility to us all. Copyright © 2017. Published by Elsevier Ltd.

  17. A comparison of several computational auditory scene analysis (CASA) techniques for monaural speech segregation.

    Science.gov (United States)

    Zeremdini, Jihen; Ben Messaoud, Mohamed Anouar; Bouzid, Aicha

    2015-09-01

    Human listeners can readily separate mixed speech and form perceptual representations of the constituent sources in an acoustic mixture, thanks to their ears. Researchers have long attempted to build computer models of these high-level functions of the auditory system, yet the segregation of mixed speech remains a very challenging problem. Here we are interested in approaches that address monaural speech segregation. For this purpose, we study computational auditory scene analysis (CASA), which reproduces the source organization achieved by listeners and is based on two main stages: segmentation and grouping. In this work, we present and compare several studies that have used CASA for speech separation and recognition.
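
    As a concrete reference point for the segmentation-and-grouping idea, many CASA evaluations use the ideal binary mask (IBM), which keeps each time-frequency unit in which the target locally dominates the interference. The sketch below is a minimal Python illustration of that benchmark, assuming oracle access to the unmixed sources; function and parameter names are ours, not the authors'.

```python
import numpy as np
from scipy.signal import stft, istft

def ideal_binary_mask_separation(target, interferer, fs, lc_db=0.0, nperseg=512):
    """Oracle IBM separation: keep time-frequency units where the local
    target-to-interferer ratio exceeds a local criterion (lc_db)."""
    # target and interferer are assumed to be equal-length 1-D arrays.
    _, _, T = stft(target, fs, nperseg=nperseg)
    _, _, I = stft(interferer, fs, nperseg=nperseg)
    local_snr = 20 * (np.log10(np.abs(T) + 1e-12) - np.log10(np.abs(I) + 1e-12))
    mask = (local_snr > lc_db).astype(float)
    _, _, M = stft(target + interferer, fs, nperseg=nperseg)
    _, estimate = istft(mask * M, fs, nperseg=nperseg)
    return estimate
```

    Because the IBM needs the clean sources, it is not itself a blind separator; it serves as an upper bound against which blind CASA systems are commonly scored.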

  18. Prosodic influences on speech production in children with specific language impairment and speech deficits: kinematic, acoustic, and transcription evidence.

    Science.gov (United States)

    Goffman, L

    1999-12-01

    It is often hypothesized that young children's difficulties with producing weak-strong (iambic) prosodic forms arise from perceptual or linguistically based production factors. A third possible contributor to errors in the iambic form may be biological constraints, or biases, of the motor system. In the present study, 7 children with specific language impairment (SLI) and speech deficits were matched to same-age peers. Multiple levels of analysis, including kinematic (modulation and stability of movement), acoustic, and transcription, were applied to children's productions of iambic (weak-strong) and trochaic (strong-weak) prosodic forms. Findings suggest that a motor bias toward producing unmodulated rhythmic articulatory movements, similar to that observed in canonical babbling, contributes to children's acquisition of metrical forms. Children with SLI and speech deficits show less mature segmental and speech motor systems, as well as decreased modulation of movement in later-developing iambic forms. Further, components of prosodic and segmental acquisition develop independently and at different rates.

  19. Perceptual video quality assessment in H.264 video coding standard using objective modeling.

    Science.gov (United States)

    Karthikeyan, Ramasamy; Sainarayanan, Gopalakrishnan; Deepa, Subramaniam Nachimuthu

    2014-01-01

    Since digital video usage is widespread nowadays, quality considerations have become essential, and industry demand for video quality measurement is rising. This proposal provides a method of perceptual quality assessment for the H.264 standard encoder using objective modeling. For this purpose, quality impairments are calculated and a model is developed to compute a perceptual video quality metric using a no-reference method. Because of the subtle differences between the original video and the encoded video, the quality of the encoded picture is degraded; this quality difference is introduced by encoding processes such as intra- and inter-prediction. The proposed model takes into account the artifacts introduced by these spatial and temporal activities in hybrid block-based coding methods, and an objective mapping of these artifacts onto subjective quality estimates is proposed. The proposed model calculates the objective quality metric using subjective impairments (blockiness, blur, and jerkiness), in contrast to the bitrate-only calculation defined in the ITU-T G.1070 model. The accuracy of the proposed perceptual video quality metric is compared against popular full-reference objective methods as defined by VQEG.
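
    The paper's exact mapping from impairments to a quality score is not reproduced here, so the following is only a hedged sketch of how no-reference blockiness, blur, and jerkiness cues are commonly computed on luminance frames; all names and constants are illustrative.

```python
import numpy as np

def blockiness(frame: np.ndarray, block: int = 8) -> float:
    """Mean absolute luminance step across vertical 8-pixel block boundaries,
    a crude cue for blocking artifacts in hybrid block-based coders."""
    cols = np.arange(block, frame.shape[1], block)
    return float(np.mean(np.abs(frame[:, cols].astype(float) - frame[:, cols - 1])))

def blur(frame: np.ndarray) -> float:
    """Inverse variance of a Laplacian response: larger values mean blurrier."""
    f = frame.astype(float)
    lap = -4 * f[1:-1, 1:-1] + f[:-2, 1:-1] + f[2:, 1:-1] + f[1:-1, :-2] + f[1:-1, 2:]
    return float(1.0 / (np.var(lap) + 1e-12))

def jerkiness(frames: np.ndarray) -> float:
    """Mean absolute frame-to-frame luminance change over a clip (T, H, W),
    a rough temporal-discontinuity cue."""
    return float(np.mean(np.abs(np.diff(frames.astype(float), axis=0))))
```

    A no-reference metric then pools such cues, for example as a fitted combination q = a0 - a1*blockiness - a2*blur - a3*jerkiness, rather than relying on bitrate alone as in ITU-T G.1070.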

  20. Do gender differences in audio-visual benefit and visual influence in audio-visual speech perception emerge with age?

    Directory of Open Access Journals (Sweden)

    Magnus eAlm

    2015-07-01

    Full Text Available Gender and age have been found to affect adults' audio-visual (AV) speech perception. However, research on adult aging focuses on adults over 60 years, who have an increasing likelihood of cognitive and sensory decline, which may confound positive effects of age-related AV experience and its interaction with gender. Observed age and gender differences in AV speech perception may also depend on measurement sensitivity and AV task difficulty. Consequently, both AV benefit and visual influence were used to measure visual contribution for gender-balanced groups of young (20-30 years) and middle-aged (50-60 years) adults, with task difficulty varied using AV syllables from different talkers in alternative auditory backgrounds. Females had better speech-reading performance than males. Whereas no gender differences in AV benefit or visual influence were observed for young adults, visually influenced responses were significantly greater for middle-aged females than middle-aged males. That speech-reading performance did not influence AV benefit may be explained by visual speech extraction and AV integration constituting independent abilities. In contrast, the gender difference in visually influenced responses in middle adulthood may reflect an experience-related shift in females' general AV perceptual strategy. Although young females' speech-reading proficiency may not readily contribute to greater visual influence, between young and middle adulthood recurrent confirmation of the contribution of visual cues, induced by speech-reading proficiency, may gradually shift females' AV perceptual strategy towards more visually dominated responses.

  1. SISL (ScreeningsInstrument Schisis Leuven): assessment of cleft palate speech, resonance and myofunction.

    Science.gov (United States)

    Breuls, M; Sell, D; Manders, E; Boulet, E; Vander Poorten, V

    2006-01-01

    This paper presents an assessment protocol for the evaluation and description of speech, resonance and myofunctional characteristics commonly associated with cleft palate and/or velopharyngeal dysfunction. The protocol is partly based on the GOS.SP.ASS'98 and adapted to Flemish. It focuses on the relevant aspects of cleft type speech necessary to facilitate assessment, adequate diagnosis and management planning in a multi-disciplinary setting of cleft team care.

  2. Association of Velopharyngeal Insufficiency With Quality of Life and Patient-Reported Outcomes After Speech Surgery.

    Science.gov (United States)

    Bhuskute, Aditi; Skirko, Jonathan R; Roth, Christina; Bayoumi, Ahmed; Durbin-Johnson, Blythe; Tollefson, Travis T

    2017-09-01

    Patients with cleft palate and other causes of velopharyngeal insufficiency (VPI) suffer adverse effects on social interactions and communication. Measurement of these patient-reported outcomes is needed to help guide surgical and nonsurgical care. To further validate the VPI Effects on Life Outcomes (VELO) instrument, measure the change in quality of life (QOL) after speech surgery, and test the association of change in speech with change in QOL. Prospective descriptive cohort including children and young adults undergoing speech surgery for VPI in a tertiary academic center. Participants completed the validated VELO instrument before and after surgical treatment. The main outcome measures were preoperative and postoperative VELO scores and the perceptual speech assessment of speech intelligibility. The VELO scores are divided into subscale domains. Changes in VELO after surgery were analyzed using linear regression models. VELO scores were analyzed as a function of speech intelligibility, adjusting for age and cleft type. The correlation between speech intelligibility rating and VELO scores was estimated using the polyserial correlation. Twenty-nine patients (13 males and 16 females) were included. Mean (SD) age was 7.9 (4.1) years (range, 4-20 years). Pharyngeal flap was used in 14 (48%) cases, Furlow palatoplasty in 12 (41%), and sphincter pharyngoplasty in 1 (3%). The mean (SD) preoperative speech intelligibility rating was 1.71 (1.08), which decreased postoperatively to 0.79 (0.93) in the 24 patients who completed the protocol (P …). Speech intelligibility was correlated with preoperative and postoperative total VELO score (P …). Speech surgery improves VPI-specific quality of life. We confirmed validation in a population of untreated patients with VPI and included pharyngeal flap surgery, which had not previously been included in validation studies. The VELO instrument provides patient-specific outcomes, which allows a broader understanding of the

  3. A predictive model for diagnosing stroke-related apraxia of speech.

    Science.gov (United States)

    Ballard, Kirrie J; Azizi, Lamiae; Duffy, Joseph R; McNeil, Malcolm R; Halaki, Mark; O'Dwyer, Nicholas; Layfield, Claire; Scholl, Dominique I; Vogel, Adam P; Robin, Donald A

    2016-01-29

    Diagnosis of the speech motor planning/programming disorder, apraxia of speech (AOS), has proven challenging, largely due to its common co-occurrence with the language-based impairment of aphasia. Currently, diagnosis is based on perceptually identifying and rating the severity of several speech features. It is not known whether all, or only a subset, of the features are required for a positive diagnosis. The purpose of this study was to assess predictor variables for the presence of AOS after left-hemisphere stroke, with the goal of increasing diagnostic objectivity and efficiency. This population-based case-control study involved a sample of 72 cases, using the outcome measure of expert judgment on the presence of AOS and including a large number of independently collected candidate predictors representing behavioral measures of linguistic, cognitive, nonspeech oral motor, and speech motor ability. We constructed a predictive model using multiple imputation to deal with missing data; the Least Absolute Shrinkage and Selection Operator (Lasso) technique for variable selection, to define the most relevant predictors; and bootstrapping, to check the model's stability and quantify the optimism of the developed model. Two measures were sufficient to distinguish between participants with AOS plus aphasia and those with aphasia alone: (1) a measure of speech errors with words of increasing length and (2) a measure of relative vowel duration in three-syllable words with a weak-strong stress pattern (e.g., banana, potato). The model has high discriminative ability to distinguish between cases with and without AOS (c-index = 0.93) and good agreement between observed and predicted probabilities (calibration slope = 0.94). Some caution is warranted, given the relatively small sample specific to left-hemisphere stroke and the limitations of imputing missing data. These two speech measures are straightforward to collect and analyse, facilitating use in research and clinical settings. Copyright
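
    As a rough illustration of the variable-selection recipe described above (Lasso plus bootstrap stability checking), the sketch below uses L1-penalized logistic regression from scikit-learn; the multiple-imputation step is omitted, and X, y, and all tuning values are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils import resample

def lasso_aos_predictors(X, y, C=0.5, n_boot=200, seed=0):
    """L1-penalized logistic regression (the Lasso step) plus a bootstrap
    count of how often each candidate predictor survives the penalty.
    X: (cases x candidate predictors); y: expert judgment (1 = AOS present).
    C and n_boot are arbitrary illustrative values."""
    model = LogisticRegression(penalty="l1", solver="liblinear", C=C).fit(X, y)
    survival = np.zeros(X.shape[1])
    for b in range(n_boot):
        Xb, yb = resample(X, y, random_state=seed + b)
        mb = LogisticRegression(penalty="l1", solver="liblinear", C=C).fit(Xb, yb)
        survival += mb.coef_.ravel() != 0  # predictor kept in this resample
    return model, survival / n_boot
```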

  4. Analysis and removing noise from speech using wavelet transform

    Science.gov (United States)

    Tomala, Karel; Voznak, Miroslav; Partila, Pavol; Rezac, Filip; Safarik, Jakub

    2013-05-01

    The paper discusses the use of the Discrete Wavelet Transform (DWT) and the Stationary Wavelet Transform (SWT) in removing noise from voice samples and evaluates the impact on speech quality. One significant part of Quality of Service (QoS) in communication technology is speech quality assessment. However, this part is seriously overlooked, as telecommunication providers often focus on increasing network capacity, expanding the services offered, and enforcing them in the market. Among the fundamental factors affecting the transmission properties of the communication chain is noise, whether at the transmitter or the receiver side. The wavelet transform (WT) is a modern tool for signal processing, and one of its most significant application areas is the suppression of noise in signals. To remove noise from the voice samples in our experiment, we used a reference voice segment distorted by Gaussian white noise. The evaluation of the impact on speech quality was carried out with the intrusive objective algorithm Perceptual Evaluation of Speech Quality (PESQ). DWT and SWT transformations were applied to voice samples degraded by Gaussian white noise, and the effectiveness of DWT and SWT was then determined by means of PESQ. The decisive criterion for determining the quality of a voice sample once the noise had been removed was the Mean Opinion Score (MOS) obtained from PESQ. The contribution of this work lies in the evaluation of the efficiency of wavelet transformations in suppressing noise in voice samples.
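
    A minimal DWT denoising pass of the kind evaluated here can be written with the PyWavelets package: decompose, soft-threshold the detail coefficients, reconstruct. The wavelet, level, and threshold rule below are common defaults, not the paper's settings, and scoring the result against the clean reference with PESQ would be a separate step (e.g., via a PESQ implementation such as the pesq package, an assumption on tooling).

```python
import numpy as np
import pywt

def wavelet_denoise(x: np.ndarray, wavelet: str = "db8", level: int = 4) -> np.ndarray:
    """DWT denoising: decompose, soft-threshold the detail coefficients with
    the universal threshold sigma * sqrt(2 ln N), and reconstruct. Sigma is
    estimated from the finest detail band via the median absolute deviation."""
    coeffs = pywt.wavedec(x, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thr = sigma * np.sqrt(2 * np.log(len(x)))
    coeffs[1:] = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[:len(x)]
```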

  5. Infant-Directed Speech Supports Phonetic Category Learning in English and Japanese

    Science.gov (United States)

    Werker, Janet F.; Pons, Ferran; Dietrich, Christiane; Kajikawa, Sachiyo; Fais, Laurel; Amano, Shigeaki

    2007-01-01

    Across the first year of life, infants show decreased sensitivity to phonetic differences not used in the native language [Werker, J. F., & Tees, R. C. (1984). Cross-language speech perception: evidence for perceptual reorganization during the first year of life. "Infant Behaviour and Development," 7, 49-63]. In an artificial language learning…

  6. Perceptual fluency and judgments of vocal aesthetics and stereotypicality.

    Science.gov (United States)

    Babel, Molly; McGuire, Grant

    2015-05-01

    Research has shown that processing dynamics on the perceiver's end determine aesthetic pleasure. Specifically, typical objects, which are processed more fluently, are perceived as more attractive. We extend this notion of perceptual fluency to judgments of vocal aesthetics. Vocal attractiveness has traditionally been examined with respect to sexual dimorphism and the apparent size of a talker, as reconstructed from the acoustic signal, despite evidence that gender-specific speech patterns are learned social behaviors. In this study, we report on a series of three experiments using 60 voices (30 females) to compare the relationship between judgments of vocal attractiveness, stereotypicality, and gender categorization fluency. Our results indicate that attractiveness and stereotypicality are highly correlated for female and male voices. Stereotypicality and categorization fluency were also correlated for male voices, but not female voices. Crucially, stereotypicality and categorization fluency interacted to predict attractiveness, suggesting the role of perceptual fluency is present, but nuanced, in judgments of human voices. © 2014 Cognitive Science Society, Inc.

  7. Modulation of Speech Motor Learning with Transcranial Direct Current Stimulation of the Inferior Parietal Lobe

    Directory of Open Access Journals (Sweden)

    Mickael L. D. Deroche

    2017-12-01

    Full Text Available The inferior parietal lobe (IPL) is a region of the cortex believed to participate in speech motor learning. In this study, we investigated whether transcranial direct current stimulation (tDCS) of the IPL could influence the extent to which healthy adults (1) adapted to a sensory alteration of their own auditory feedback, and (2) changed their perceptual representation. Seventy subjects completed three tasks: a baseline perceptual task that located the phonetic boundary between the vowels /e/ and /a/; a sensorimotor adaptation task in which subjects produced the word “head” under conditions of altered or unaltered feedback; and a post-adaptation perceptual task identical to the first. Subjects were allocated to four groups that differed in current polarity and feedback manipulation. Subjects who received anodal tDCS to their IPL (i.e., presumably increasing cortical excitability) lowered their first formant frequency (F1) by 10% in opposition to the upward shift in F1 in their auditory feedback. Subjects who received the same stimulation with unaltered feedback did not change their production. Subjects who received cathodal tDCS to their IPL (i.e., presumably decreasing cortical excitability) showed a 5% adaptation to the F1 alteration, similar to subjects who received sham tDCS. A subset of subjects returned a few days later to repeat the same protocol but without tDCS, enabling assessment of any facilitatory effects of the previous tDCS. All subjects exhibited a 5% adaptation effect. In addition, across all subjects and the two recording sessions, the phonetic boundary was shifted toward the repeated vowel /e/, consistent with the selective adaptation effect, but a correlation between perception and production suggested that anodal tDCS had enhanced this perceptual shift. In conclusion, we successfully demonstrated that anodal tDCS could (1) enhance the motor adaptation to a sensory alteration, and (2) potentially affect the

  8. Speech disorders did not correlate with age at onset of Parkinson’s disease

    Directory of Open Access Journals (Sweden)

    Alice Estevo Dias

    2016-02-01

    Full Text Available ABSTRACT Speech disorders are common manifestations of Parkinson's disease. Objective To compare speech articulation in patients according to age at onset of the disease. Methods Fifty patients were divided into two groups: Group I consisted of 30 patients with age at onset between 40 and 55 years; Group II consisted of 20 patients with age at onset after 65 years. All patients were evaluated based on the Unified Parkinson's Disease Rating Scale scores, the Hoehn and Yahr scale, and speech evaluation by perceptual and acoustic analysis. Results There was no statistically significant difference between the two groups regarding neurological involvement and speech characteristics. Correlation analysis indicated differences in speech articulation in relation to staging and axial scores of rigidity and bradykinesia for middle and late onset. Conclusions Impairment of speech articulation did not correlate with age at onset of the disease, but was positively related to disease duration and higher scores in both groups.

  9. Right-Ear Advantage for Speech-in-Noise Recognition in Patients with Nonlateralized Tinnitus and Normal Hearing Sensitivity.

    Science.gov (United States)

    Tai, Yihsin; Husain, Fatima T

    2018-04-01

    Despite having normal hearing sensitivity, patients with chronic tinnitus may experience more difficulty recognizing speech in adverse listening conditions than controls. However, the association between the characteristics of tinnitus (severity and loudness) and speech recognition remains unclear. In this study, the Quick Speech-in-Noise test (QuickSIN) was conducted monaurally on 14 patients with bilateral tinnitus and 14 age- and hearing-matched adults to determine the relation between tinnitus characteristics and speech understanding. Further, the Tinnitus Handicap Inventory (THI), tinnitus loudness magnitude estimation, and loudness matching were obtained to better characterize the perceptual and psychological aspects of tinnitus. The patients reported low THI scores, with most participants in the slight-handicap category. Significant between-group differences in speech-in-noise performance were found only at the 5-dB signal-to-noise ratio (SNR) condition. The tinnitus group performed significantly worse in the left ear than in the right ear, even though a bilateral tinnitus percept and symmetrical thresholds were reported in all patients. This between-ear difference is likely influenced by a right-ear advantage for speech sounds, as factors related to testing order and fatigue were ruled out. Additionally, significant correlations between SNR loss in the left ear and tinnitus loudness matching suggest that perceptual factors related to tinnitus affected speech-in-noise performance, pointing to a possible interaction between peripheral and cognitive factors in chronic tinnitus. Further studies that take into account both the hearing and cognitive abilities of patients are needed to better parse out the effect of tinnitus in the absence of hearing impairment.

  10. Visual Input Enhances Selective Speech Envelope Tracking in Auditory Cortex at a ‘Cocktail Party’

    Science.gov (United States)

    Golumbic, Elana Zion; Cogan, Gregory B.; Schroeder, Charles E.; Poeppel, David

    2013-01-01

    Our ability to selectively attend to one auditory signal amidst competing input streams, epitomized by the ‘Cocktail Party’ problem, continues to stimulate research from various approaches. How this demanding perceptual feat is achieved from a neural systems perspective remains unclear and controversial. It is well established that neural responses to attended stimuli are enhanced compared to responses to ignored ones, but responses to ignored stimuli are nonetheless highly significant, leading to interference in performance. We investigated whether congruent visual input of an attended speaker enhances cortical selectivity in auditory cortex, leading to diminished representation of ignored stimuli. We recorded magnetoencephalographic (MEG) signals from human participants as they attended to segments of natural continuous speech. Using two complementary methods of quantifying the neural response to speech, we found that viewing a speaker’s face enhances the capacity of auditory cortex to track the temporal speech envelope of that speaker. This mechanism was most effective in a ‘Cocktail Party’ setting, promoting preferential tracking of the attended speaker, whereas without visual input no significant attentional modulation was observed. These neurophysiological results underscore the importance of visual input in resolving perceptual ambiguity in a noisy environment. Since visual cues in speech precede the associated auditory signals, they likely serve a predictive role in facilitating auditory processing of speech, perhaps by directing attentional resources to appropriate points in time when to-be-attended acoustic input is expected to arrive. PMID:23345218

  11. Electrophysiological evidence for differences between fusion and combination illusions in audiovisual speech perception

    DEFF Research Database (Denmark)

    Baart, Martijn; Lindborg, Alma Cornelia; Andersen, Tobias S

    2017-01-01

    Incongruent audiovisual speech stimuli can lead to perceptual illusions such as fusions or combinations. Here, we investigated the underlying audiovisual integration process by measuring ERPs. We observed that visual speech-induced suppression of P2 amplitude (which is generally taken as a measure of audiovisual integration) for fusions was comparable to suppression obtained with fully congruent stimuli, whereas P2 suppression for combinations was larger. We argue that these effects arise because the phonetic incongruency is solved differently for both types of stimuli. This article is protected...

  12. Reproducibility of somatosensory spatial perceptual maps.

    Science.gov (United States)

    Steenbergen, Peter; Buitenweg, Jan R; Trojan, Jörg; Veltink, Peter H

    2013-02-01

    Various studies have shown subjects to mislocalize cutaneous stimuli in an idiosyncratic manner. Spatial properties of individual localization behavior can be represented in the form of perceptual maps. Individual differences in these maps may reflect properties of internal body representations, and perceptual maps may therefore be a useful method for studying these representations. For this to be the case, individual perceptual maps need to be reproducible, which has not yet been demonstrated. We assessed the reproducibility of localizations measured twice on subsequent days. Ten subjects participated in the experiments. Non-painful electrocutaneous stimuli were applied at seven sites on the lower arm. Subjects localized the stimuli on a photograph of their own arm, which was presented on a tablet screen overlaying the real arm. Reproducibility was assessed by calculating intraclass correlation coefficients (ICC) for the mean localizations of each electrode site and the slope and offset of regression models of the localizations, which represent scaling and displacement of perceptual maps relative to the stimulated sites. The ICCs of the mean localizations ranged from 0.68 to 0.93; the ICCs of the regression parameters were 0.88 for the intercept and 0.92 for the slope. These results indicate a high degree of reproducibility. We conclude that localization patterns of non-painful electrocutaneous stimuli on the arm are reproducible on subsequent days. Reproducibility is a necessary property of perceptual maps for these to reflect properties of a subject's internal body representations. Perceptual maps are therefore a promising method for studying body representations.
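
    For reference, the reproducibility statistic used here, the intraclass correlation, can be computed directly from a targets-by-sessions matrix; below is a minimal sketch of the two-way random, absolute-agreement ICC(2,1) of Shrout and Fleiss. Whether the authors used this exact ICC variant is not stated, so treat that choice as an assumption.

```python
import numpy as np

def icc_2_1(Y: np.ndarray) -> float:
    """Shrout-Fleiss ICC(2,1) (two-way random effects, absolute agreement)
    for an n_targets x k_sessions matrix Y, e.g. one mean localization
    per electrode site (rows) per measurement day (columns)."""
    n, k = Y.shape
    grand = Y.mean()
    ms_rows = k * np.sum((Y.mean(axis=1) - grand) ** 2) / (n - 1)  # targets
    ms_cols = n * np.sum((Y.mean(axis=0) - grand) ** 2) / (k - 1)  # sessions
    resid = Y - Y.mean(axis=1, keepdims=True) - Y.mean(axis=0, keepdims=True) + grand
    ms_err = np.sum(resid ** 2) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err
                                 + k * (ms_cols - ms_err) / n)
```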

  13. Effect of Perceptual Load on Semantic Access by Speech in Children

    Science.gov (United States)

    Jerger, Susan; Damian, Markus F.; Mills, Candice; Bartlett, James; Tye-Murray, Nancy; Abdi, Herve

    2013-01-01

    Purpose: To examine whether semantic access by speech requires attention in children. Method: Children ("N" = 200) named pictures and ignored distractors on a cross-modal (distractors: auditory-no face) or multimodal (distractors: auditory-static face and audiovisual-dynamic face) picture word task. The cross-modal task had a low load,…

  14. Comparing Binaural Pre-processing Strategies II: Speech Intelligibility of Bilateral Cochlear Implant Users.

    Science.gov (United States)

    Baumgärtel, Regina M; Hu, Hongmei; Krawczyk-Becker, Martin; Marquardt, Daniel; Herzke, Tobias; Coleman, Graham; Adiloğlu, Kamil; Bomke, Katrin; Plotz, Karsten; Gerkmann, Timo; Doclo, Simon; Kollmeier, Birger; Hohmann, Volker; Dietz, Mathias

    2015-12-30

    Several binaural audio signal enhancement algorithms were evaluated with respect to their potential to improve speech intelligibility in noise for users of bilateral cochlear implants (CIs). 50% speech reception thresholds (SRT50) were assessed using an adaptive procedure in three distinct, realistic noise scenarios. All scenarios were highly nonstationary, complex, and included a significant amount of reverberation. Other aspects, such as the perfectly frontal target position, were idealized laboratory settings, allowing the algorithms to perform better than in corresponding real-world conditions. Eight bilaterally implanted CI users, wearing devices from three manufacturers, participated in the study. In all noise conditions, a substantial improvement in SRT50 compared to the unprocessed signal was observed for most of the algorithms tested, with the largest improvements generally provided by binaural minimum variance distortionless response (MVDR) beamforming algorithms. The largest overall improvement in speech intelligibility was achieved by an adaptive binaural MVDR in a spatially separated, single competing talker noise scenario. A no-pre-processing condition and adaptive differential microphones without a binaural link served as the two baseline conditions. SRT50 improvements provided by the binaural MVDR beamformers surpassed the performance of the adaptive differential microphones in most cases. Speech intelligibility improvements predicted by instrumental measures were shown to account for some but not all aspects of the perceptually obtained SRT50 improvements measured in bilaterally implanted CI users. © The Author(s) 2015.
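
    The MVDR beamformers that performed best here share a classical per-frequency solution, w = R_n^{-1} d / (d^H R_n^{-1} d): minimize output noise power subject to a distortionless response toward the target. A minimal numpy sketch, assuming a known noise covariance and steering vector for one frequency bin:

```python
import numpy as np

def mvdr_weights(R_noise: np.ndarray, d: np.ndarray) -> np.ndarray:
    """MVDR beamformer weights w = R_n^{-1} d / (d^H R_n^{-1} d).
    R_noise: M x M noise covariance for one frequency bin;
    d: length-M steering vector toward the (here, frontal) target."""
    r_inv_d = np.linalg.solve(R_noise, d)
    return r_inv_d / (d.conj() @ r_inv_d)

# Per-bin application to a multichannel STFT frame x of shape (M,):
#   y = mvdr_weights(R_noise, d).conj() @ x
```

    Practical binaural variants estimate the noise covariance adaptively and may add constraints to preserve binaural cues across the two ears.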

  15. Cortical Representations of Speech in a Multitalker Auditory Scene.

    Science.gov (United States)

    Puvvada, Krishna C; Simon, Jonathan Z

    2017-09-20

    The ability to parse a complex auditory scene into perceptual objects is facilitated by a hierarchical auditory system. Successive stages in the hierarchy transform an auditory scene of multiple overlapping sources, from peripheral tonotopically based representations in the auditory nerve, into perceptually distinct auditory-object-based representations in the auditory cortex. Here, using magnetoencephalography recordings from men and women, we investigate how a complex acoustic scene consisting of multiple speech sources is represented in distinct hierarchical stages of the auditory cortex. Using systems-theoretic methods of stimulus reconstruction, we show that the primary-like areas in the auditory cortex contain dominantly spectrotemporal-based representations of the entire auditory scene. Here, both attended and ignored speech streams are represented with almost equal fidelity, and a global representation of the full auditory scene with all its streams is a better candidate neural representation than that of individual streams being represented separately. We also show that higher-order auditory cortical areas, by contrast, represent the attended stream separately and with significantly higher fidelity than unattended streams. Furthermore, the unattended background streams are more faithfully represented as a single unsegregated background object rather than as separated objects. Together, these findings demonstrate the progression of the representations and processing of a complex acoustic scene up through the hierarchy of the human auditory cortex. SIGNIFICANCE STATEMENT Using magnetoencephalography recordings from human listeners in a simulated cocktail party environment, we investigate how a complex acoustic scene consisting of multiple speech sources is represented in separate hierarchical stages of the auditory cortex. We show that the primary-like areas in the auditory cortex use a dominantly spectrotemporal-based representation of the entire auditory
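
    Stimulus reconstruction of the kind used in this study is commonly implemented as a regularized linear backward model that maps time-lagged neural channels onto the speech envelope; reconstruction fidelity is then the correlation between decoded and actual envelopes. The sketch below shows that ridge-regression form; array names, the lag range, and the regularizer are illustrative, not the authors' exact pipeline.

```python
import numpy as np

def backward_model(neural: np.ndarray, envelope: np.ndarray,
                   lags: range, lam: float = 1e3) -> np.ndarray:
    """Ridge-regularized backward (stimulus-reconstruction) model: regress
    the speech envelope on time-lagged neural channels.
    neural: (T, C) sensor time series; envelope: (T,) target envelope;
    lags: sample lags to include (np.roll wraps edges - fine for a sketch)."""
    X = np.hstack([np.roll(neural, lag, axis=0) for lag in lags])
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ envelope)
    return w

# Fidelity on held-out data (X_test built with the same lags):
#   r = np.corrcoef(X_test @ w, envelope_test)[0, 1]
```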

  16. Livrable D6.1 of the PERSEE project : Perceptual Assessment : Definition of the scenarios

    OpenAIRE

    Wang , Junle; Gautier , Josselin; Bosc , Emilie; Li , Jing; Ricordel , Vincent

    2011-01-01

    Deliverable D6.1 of the ANR PERSEE project. This report was produced as part of the ANR PERSEE project (no. ANR-09-BLAN-0170); specifically, it corresponds to deliverable D6.1 of the project. Its title: Perceptual Assessment: Definition of the scenarios.

  17. Acoustic assessment of speech privacy curtains in two nursing units

    Science.gov (United States)

    Pope, Diana S.; Miller-Klein, Erik T.

    2016-01-01

    Hospitals have complex soundscapes that create challenges to patient care. Extraneous noise and high reverberation rates impair speech intelligibility, which leads to raised voices. In an unintended spiral, the increasing noise may result in diminished speech privacy, as people speak loudly to be heard over the din. The products available to improve hospital soundscapes include construction materials that absorb sound (acoustic ceiling tiles, carpet, wall insulation) and reduce reverberation rates. Enhanced privacy curtains are now available and offer potential for a relatively simple way to improve speech privacy and speech intelligibility by absorbing sound at the hospital patient's bedside. Acoustic assessments were performed over 2 days on two nursing units with a similar design in the same hospital. One unit was built with the 1970s' standard hospital construction and the other was newly refurbished (2013) with sound-absorbing features. In addition, we determined the effect of an enhanced privacy curtain versus standard privacy curtains using acoustic measures of speech privacy and speech intelligibility indexes. Privacy curtains provided auditory protection for the patients. In general, that protection was increased by the use of enhanced privacy curtains. On average, the enhanced curtain improved sound absorption from 20% to 30%; however, there was considerable variability, depending on the configuration of the rooms tested. Enhanced privacy curtains provide measurable improvement to the acoustics of patient rooms but cannot overcome larger acoustic design issues. To shorten reverberation time, additional absorption and more compact, fragmented nursing unit floor plate shapes should be considered. PMID:26780959

  20. Accommodating variation: dialects, idiolects, and speech processing.

    Science.gov (United States)

    Kraljic, Tanya; Brennan, Susan E; Samuel, Arthur G

    2008-04-01

    Listeners are faced with enormous variation in pronunciation, yet they rarely have difficulty understanding speech. Although much research has been devoted to figuring out how listeners deal with variability, virtually none (outside of sociolinguistics) has focused on the source of the variation itself. The current experiments explore whether different kinds of variation lead to different cognitive and behavioral adjustments. Specifically, we compare adjustments to the same acoustic consequence when it is due to context-independent variation (resulting from articulatory properties unique to a speaker) versus context-conditioned variation (resulting from common articulatory properties of speakers who share a dialect). The contrasting results for these two cases show that the source of a particular acoustic-phonetic variation affects how that variation is handled by the perceptual system. We also show that changes in perceptual representations do not necessarily lead to changes in production.

  1. Perception of the multisensory coherence of fluent audiovisual speech in infancy: its emergence and the role of experience.

    Science.gov (United States)

    Lewkowicz, David J; Minar, Nicholas J; Tift, Amy H; Brandon, Melissa

    2015-02-01

    To investigate the developmental emergence of the perception of the multisensory coherence of native and non-native audiovisual fluent speech, we tested 4-, 8- to 10-, and 12- to 14-month-old English-learning infants. Infants first viewed two identical female faces articulating two different monologues in silence and then in the presence of an audible monologue that matched the visible articulations of one of the faces. Neither the 4-month-old nor 8- to 10-month-old infants exhibited audiovisual matching in that they did not look longer at the matching monologue. In contrast, the 12- to 14-month-old infants exhibited matching and, consistent with the emergence of perceptual expertise for the native language, perceived the multisensory coherence of native-language monologues earlier in the test trials than that of non-native language monologues. Moreover, the matching of native audible and visible speech streams observed in the 12- to 14-month-olds did not depend on audiovisual synchrony, whereas the matching of non-native audible and visible speech streams did depend on synchrony. Overall, the current findings indicate that the perception of the multisensory coherence of fluent audiovisual speech emerges late in infancy, that audiovisual synchrony cues are more important in the perception of the multisensory coherence of non-native speech than that of native audiovisual speech, and that the emergence of this skill most likely is affected by perceptual narrowing. Copyright © 2014 Elsevier Inc. All rights reserved.

  2. Temporal Context in Speech Processing and Attentional Stream Selection: A Behavioral and Neural perspective

    Science.gov (United States)

    Zion Golumbic, Elana M.; Poeppel, David; Schroeder, Charles E.

    2012-01-01

    The human capacity for processing speech is remarkable, especially given that information in speech unfolds over multiple time scales concurrently. Similarly notable is our ability to filter out extraneous sounds and focus our attention on one conversation, epitomized by the ‘Cocktail Party’ effect. Yet the neural mechanisms underlying online speech decoding and attentional stream selection are not well understood. We review findings from behavioral and neurophysiological investigations that underscore the importance of the temporal structure of speech for achieving these perceptual feats. We discuss the hypothesis that entrainment of ambient neuronal oscillations to speech’s temporal structure, across multiple time scales, serves to facilitate its decoding and underlies the selection of an attended speech stream over other competing input. In this regard, speech decoding and attentional stream selection are examples of ‘active sensing’, emphasizing an interaction between proactive and predictive top-down modulation of neuronal dynamics and bottom-up sensory input. PMID:22285024

  3. A Multivariate Analytic Approach to the Differential Diagnosis of Apraxia of Speech

    Science.gov (United States)

    Basilakos, Alexandra; Yourganov, Grigori; den Ouden, Dirk-Bart; Fogerty, Daniel; Rorden, Chris; Feenaughty, Lynda; Fridriksson, Julius

    2017-01-01

    Purpose: Apraxia of speech (AOS) is a consequence of stroke that frequently co-occurs with aphasia. Its study is limited by difficulties with its perceptual evaluation and dissociation from co-occurring impairments. This study examined the classification accuracy of several acoustic measures for the differential diagnosis of AOS in a sample of…

  4. The relationship of speech intelligibility with hearing sensitivity, cognition, and perceived hearing difficulties varies for different speech perception tests

    Science.gov (United States)

    Heinrich, Antje; Henshaw, Helen; Ferguson, Melanie A.

    2015-01-01

    Listeners vary in their ability to understand speech in noisy environments. Hearing sensitivity, as measured by pure-tone audiometry, can only partly explain these results, and cognition has emerged as another key concept. Although cognition relates to speech perception, the exact nature of the relationship remains to be fully understood. This study investigates how different aspects of cognition, particularly working memory and attention, relate to speech intelligibility for various tests. Perceptual accuracy of speech perception represents just one aspect of functioning in a listening environment. Activity and participation limits imposed by hearing loss, in addition to the demands of a listening environment, are also important and may be better captured by self-report questionnaires. Understanding how speech perception relates to self-reported aspects of listening forms the second focus of the study. Forty-four listeners aged between 50 and 74 years with mild sensorineural hearing loss were tested on speech perception tests differing in complexity from low (phoneme discrimination in quiet), to medium (digit triplet perception in speech-shaped noise) to high (sentence perception in modulated noise); cognitive tests of attention, memory, and non-verbal intelligence quotient; and self-report questionnaires of general health-related and hearing-specific quality of life. Hearing sensitivity and cognition related to intelligibility differently depending on the speech test: neither was important for phoneme discrimination, hearing sensitivity alone was important for digit triplet perception, and hearing and cognition together played a role in sentence perception. Self-reported aspects of auditory functioning were correlated with speech intelligibility to different degrees, with digit triplets in noise showing the richest pattern. The results suggest that intelligibility tests can vary in their auditory and cognitive demands and their sensitivity to the challenges that

  6. Using Telerehabilitation to Assess Apraxia of Speech in Adults

    Science.gov (United States)

    Hill, Anne Jane; Theodoros, Deborah; Russell, Trevor; Ward, Elizabeth

    2009-01-01

    Background: Telerehabilitation is the remote delivery of rehabilitation services via information technology and telecommunication systems. There have been a number of studies that have used videoconferencing to assess speech and language skills in people with acquired neurogenic communication disorders. However, few studies have focused on cases…

  7. Dissociating speech perception and comprehension at reduced levels of awareness

    Science.gov (United States)

    Davis, Matthew H.; Coleman, Martin R.; Absalom, Anthony R.; Rodd, Jennifer M.; Johnsrude, Ingrid S.; Matta, Basil F.; Owen, Adrian M.; Menon, David K.

    2007-01-01

    We used functional MRI and the anesthetic agent propofol to assess the relationship among neural responses to speech, successful comprehension, and conscious awareness. Volunteers were scanned while listening to sentences containing ambiguous words, matched sentences without ambiguous words, and signal-correlated noise (SCN). During three scanning sessions, participants were nonsedated (awake), lightly sedated (a slowed response to conversation), and deeply sedated (no conversational response, rousable by loud command). Bilateral temporal-lobe responses for sentences compared with signal-correlated noise were observed at all three levels of sedation, although prefrontal and premotor responses to speech were absent at the deepest level of sedation. Additional inferior frontal and posterior temporal responses to ambiguous sentences provide a neural correlate of semantic processes critical for comprehending sentences containing ambiguous words. However, this additional response was absent during light sedation, suggesting a marked impairment of sentence comprehension. A significant decline in postscan recognition memory for sentences also suggests that sedation impaired encoding of sentences into memory, with left inferior frontal and temporal lobe responses during light sedation predicting subsequent recognition memory. These findings suggest a graded degradation of cognitive function in response to sedation such that “higher-level” semantic and mnemonic processes can be impaired at relatively low levels of sedation, whereas perceptual processing of speech remains resilient even during deep sedation. These results have important implications for understanding the relationship between speech comprehension and awareness in the healthy brain in patients receiving sedation and in patients with disorders of consciousness. PMID:17938125

  8. New Results on Single-Channel Speech Separation Using Sinusoidal Modeling

    DEFF Research Database (Denmark)

    Mowlaee, Pejman; Christensen, Mads Græsbøll; Jensen, Søren Holdt

    2011-01-01

    We present new results on single-channel speech separation and suggest a new separation approach to improve the speech quality of separated signals from an observed mixture. The key idea is to derive a mixture estimator based on sinusoidal parameters. The proposed estimator is aimed at finding… Compared with the mixture estimator used in binary masks and the Wiener filtering approach, it is observed that the proposed method achieves an acceptable perceptual speech quality with less cross-talk at different signal-to-signal ratios. Moreover, the method is independent of pitch estimates and reduces the computational complexity of the separation by replacing the high-dimensional short-time Fourier transform (STFT) feature vectors with sinusoidal feature vectors. We report separation results for the proposed method and compare them with respect to other benchmark methods. The improvements made by applying…
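
    The dimensionality-reduction step mentioned above, replacing STFT frames with sinusoidal feature vectors, amounts to keeping only the dominant spectral peaks of each frame as (frequency, amplitude, phase) triplets. A minimal per-frame sketch, with illustrative names and defaults:

```python
import numpy as np
from scipy.signal import find_peaks

def sinusoidal_features(frame: np.ndarray, fs: float, n_peaks: int = 20):
    """Per-frame sinusoidal analysis: keep the n_peaks largest spectral peaks
    as (frequency, amplitude, phase) triplets - a low-dimensional stand-in
    for the full STFT frame."""
    windowed = frame * np.hanning(len(frame))
    spectrum = np.fft.rfft(windowed)
    magnitude = np.abs(spectrum)
    peaks, _ = find_peaks(magnitude)
    top = peaks[np.argsort(magnitude[peaks])[-n_peaks:]]
    freqs = top * fs / len(frame)          # bin index -> Hz
    return freqs, magnitude[top], np.angle(spectrum[top])
```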

  9. Speech-language pathologists' contribution to the assessment of decision-making capacity in aphasia: a survey of common practices.

    Science.gov (United States)

    Aldous, Kerryn; Tolmie, Rhiannon; Worrall, Linda; Ferguson, Alison

    2014-06-01

    Speech-language pathologists' scope of practice is currently unclear regarding their contribution to the multi-disciplinary assessment of decision-making capacity for clients with aphasia and related neurogenic communication disorders. The primary aim of the current study was to investigate the common practices of speech-language pathologists involved in assessments of decision-making capacity. The study was completed through an online survey. Fifty-one of 59 respondents indicated involvement in evaluations of decision-making. Involvement in this kind of assessment was most commonly reported by speech-language pathologists working in inpatient acute and rehabilitation settings. Respondents reported using a variety of formal and informal assessment methods in their contributions to capacity assessment. Discussion with multidisciplinary team members was reported to have the greatest influence on their recommendations. Speech-language pathologists reported that they were dissatisfied with current protocols for capacity assessments in their workplace and indicated that they would benefit from further education and training in this area. The findings of this study are discussed in light of their implications for speech-language pathology practice.

  10. Fast Algorithms for High-Order Sparse Linear Prediction with Applications to Speech Processing

    DEFF Research Database (Denmark)

    Jensen, Tobias Lindstrøm; Giacobello, Daniele; van Waterschoot, Toon

    2016-01-01

    In speech processing applications, imposing sparsity constraints on high-order linear prediction coefficients and prediction residuals has proven successful in overcoming some of the limitations of conventional linear predictive modeling. However, this modeling scheme, named sparse linear prediction,… problem with lower accuracy than in previous work. In the experimental analysis, we clearly show that a solution with lower accuracy can achieve approximately the same performance as a high-accuracy solution, both objectively, in terms of prediction gain, and with perceptually relevant measures, when evaluated in a speech reconstruction application.
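
    The underlying sparse linear prediction problem is convex: minimize the 1-norm of the prediction residual plus a 1-norm penalty on the high-order coefficient vector. The authors' contribution concerns fast approximate solvers, but the problem itself can be stated in a few lines; below is a hedged sketch using cvxpy with an arbitrary penalty weight.

```python
import numpy as np
import cvxpy as cp

def sparse_lp(x: np.ndarray, order: int = 100, gamma: float = 0.1) -> np.ndarray:
    """Sparse linear prediction: min_a ||x - X a||_1 + gamma * ||a||_1,
    where row t of X holds the previous `order` samples of x.
    gamma trades residual sparsity against coefficient sparsity."""
    N = len(x)
    X = np.column_stack([np.concatenate([np.zeros(k + 1), x[:N - k - 1]])
                         for k in range(order)])
    a = cp.Variable(order)
    cp.Problem(cp.Minimize(cp.norm1(x - X @ a) + gamma * cp.norm1(a))).solve()
    return a.value
```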

  11. Speech disorders in olivopontocerebellar atrophy correlate with positron emission tomography findings

    International Nuclear Information System (INIS)

    Kluin, K.J.; Gilman, S.; Markel, D.S.; Koeppe, R.A.; Rosenthal, G.; Junck, L.

    1988-01-01

    We compared the severity of ataxic and spastic dysarthria with local cerebral metabolic rates for glucose (lCMRGlc) in 30 patients with olivopontocerebellar atrophy (OPCA). Perceptual analysis was used to examine the speech disorders, and rating scales were devised to quantitate the degree of ataxia and spasticity in the speech of each patient. lCMRGlc was measured with 18F-2-fluoro-2-deoxy-D-glucose and positron emission tomography (PET). PET studies revealed marked hypometabolism in the cerebellar hemispheres, cerebellar vermis, and brainstem of OPCA patients compared with 30 control subjects. With data normalized to the cerebral cortex, a significant inverse correlation was found between the severity of ataxia in speech and the lCMRGlc within the cerebellar vermis, cerebellar hemispheres, and brainstem, but not within the thalamus. No significant correlation was found between the severity of spasticity in speech and lCMRGlc in any of these structures. The findings support the view that the severity of ataxia in speech in OPCA is related to the functional activity of the cerebellum and its connections in the brainstem.

  12. Perceptual Measures of Speech from Individuals with Parkinson's Disease and Multiple Sclerosis: Intelligibility and beyond

    Science.gov (United States)

    Sussman, Joan E.; Tjaden, Kris

    2012-01-01

    Purpose: The primary purpose of this study was to compare percent correct word and sentence intelligibility scores for individuals with multiple sclerosis (MS) and Parkinson's disease (PD) with scaled estimates of speech severity obtained for a reading passage. Method: Speech samples for 78 talkers were judged, including 30 speakers with MS, 16…

  13. Effect of Simultaneous Bilingualism on Speech Intelligibility across Different Masker Types, Modalities, and Signal-to-Noise Ratios in School-Age Children.

    Science.gov (United States)

    Reetzke, Rachel; Lam, Boji Pak-Wing; Xie, Zilong; Sheng, Li; Chandrasekaran, Bharath

    2016-01-01

    Recognizing speech in adverse listening conditions is a significant cognitive, perceptual, and linguistic challenge, especially for children. Prior studies have yielded mixed results on the impact of bilingualism on speech perception in noise. Methodological variations across studies make it difficult to converge on a conclusion regarding the effect of bilingualism on speech-in-noise performance. Moreover, there is a dearth of speech-in-noise evidence for bilingual children who learn two languages simultaneously. The aim of the present study was to examine the extent to which various adverse listening conditions modulate differences in speech-in-noise performance between monolingual and simultaneous bilingual children. To that end, sentence recognition was assessed in twenty-four school-age children (12 monolinguals; 12 simultaneous bilinguals, age of English acquisition ≤ 3 yrs.). We implemented a comprehensive speech-in-noise battery to examine recognition of English sentences across different modalities (audio-only, audiovisual), masker types (steady-state pink noise, two-talker babble), and a range of signal-to-noise ratios (SNRs; 0 to -16 dB). Results revealed no difference in performance between monolingual and simultaneous bilingual children across each combination of modality, masker, and SNR. Our findings suggest that when English age of acquisition and socioeconomic status are similar between groups, monolingual and bilingual children exhibit comparable speech-in-noise performance across a range of conditions analogous to everyday listening environments.
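
    Constructing such stimuli at a target SNR reduces to scaling the masker against the speech before mixing. A minimal sketch (a hypothetical helper, not the battery used in the study):

```python
import numpy as np

def mix_at_snr(speech, masker, snr_db):
    """Scale the masker so the speech-to-masker ratio equals snr_db, then mix."""
    masker = masker[:len(speech)]                  # trim to equal length
    p_speech = np.mean(speech.astype(float) ** 2)  # speech power
    p_masker = np.mean(masker.astype(float) ** 2)  # masker power
    gain = np.sqrt(p_speech / (p_masker * 10 ** (snr_db / 10)))
    return speech + gain * masker

# e.g., the range tested in the study: snr_db from 0 down to -16 dB
```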

  14. The abstract representations in speech processing.

    Science.gov (United States)

    Cutler, Anne

    2008-11-01

    Speech processing by human listeners derives meaning from acoustic input via intermediate steps involving abstract representations of what has been heard. Recent results from several lines of research are here brought together to shed light on the nature and role of these representations. In spoken-word recognition, representations of phonological form and of conceptual content are dissociable. This follows from the independence of patterns of priming for a word's form and its meaning. The nature of the phonological-form representations is determined not only by acoustic-phonetic input but also by other sources of information, including metalinguistic knowledge. This follows from evidence that listeners can store two forms as different without showing any evidence of being able to detect the difference in question when they listen to speech. The lexical representations are in turn separate from prelexical representations, which are also abstract in nature. This follows from evidence that perceptual learning about speaker-specific phoneme realization, induced on the basis of a few words, generalizes across the whole lexicon to inform the recognition of all words containing the same phoneme. The efficiency of human speech processing has its basis in the rapid execution of operations over abstract representations.

  15. Causal inference of asynchronous audiovisual speech

    Directory of Open Access Journals (Sweden)

    John F Magnotti

    2013-11-01

    Full Text Available During speech perception, humans integrate auditory information from the voice with visual information from the face. This multisensory integration increases perceptual precision, but only if the two cues come from the same talker; this requirement has been largely ignored by current models of speech perception. We describe a generative model of multisensory speech perception that includes this critical step of determining the likelihood that the voice and face information have a common cause. A key feature of the model is that it is based on a principled analysis of how an observer should solve this causal inference problem using the asynchrony between two cues and the reliability of the cues. This allows the model to make predictions about the behavior of subjects performing a synchrony judgment task, predictive power that does not exist in other approaches, such as post hoc fitting of Gaussian curves to behavioral data. We tested the model predictions against the performance of 37 subjects performing a synchrony judgment task viewing audiovisual speech under a variety of manipulations, including varying asynchronies, intelligibility, and visual cue reliability. The causal inference model outperformed the Gaussian model across two experiments, providing a better fit to the behavioral data with fewer parameters. Because the causal inference model is derived from a principled understanding of the task, model parameters are directly interpretable in terms of stimulus and subject properties.
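
    The causal inference step can be sketched as Bayesian model comparison between a common-cause and an independent-causes hypothesis, with the measured audiovisual asynchrony as evidence. The parameter values below are illustrative assumptions, not the values fitted in the paper:

```python
from scipy.stats import norm

def p_common(delta_t_ms, mu=60.0, sigma=120.0, spread=400.0, prior=0.5):
    """Posterior probability that voice and face share a common cause,
    given a measured asynchrony. The likelihood under a common cause is
    Gaussian around a small natural audio lag (mu); under independent
    causes it is broad (spread). All numbers are placeholders."""
    like_common = norm.pdf(delta_t_ms, loc=mu, scale=sigma)
    like_separate = norm.pdf(delta_t_ms, loc=0.0, scale=spread)
    return like_common * prior / (like_common * prior + like_separate * (1 - prior))
```

    A synchrony judgment can then be modeled as responding "synchronous" whenever this posterior exceeds a criterion, which is what gives the model interpretable parameters rather than a post hoc curve fit.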

  16. Logopenic and nonfluent variants of primary progressive aphasia are differentiated by acoustic measures of speech production.

    Directory of Open Access Journals (Sweden)

    Kirrie J Ballard

    Full Text Available Differentiation of logopenic (lvPPA) and nonfluent/agrammatic (nfvPPA) variants of Primary Progressive Aphasia is important yet remains challenging since it hinges on expert based evaluation of speech and language production. In this study acoustic measures of speech in conjunction with voxel-based morphometry were used to determine the success of the measures as an adjunct to diagnosis and to explore the neural basis of apraxia of speech in nfvPPA. Forty-one patients (21 lvPPA, 20 nfvPPA) were recruited from a consecutive sample with suspected frontotemporal dementia. Patients were diagnosed using the current gold-standard of expert perceptual judgment, based on presence/absence of particular speech features during speaking tasks. Seventeen healthy age-matched adults served as controls. MRI scans were available for 11 control and 37 PPA cases; 23 of the PPA cases underwent amyloid ligand PET imaging. Measures, corresponding to perceptual features of apraxia of speech, were periods of silence during reading and relative vowel duration and intensity in polysyllable word repetition. Discriminant function analyses revealed that a measure of relative vowel duration differentiated nfvPPA cases from both control and lvPPA cases (r² = 0.47) with 88% agreement with expert judgment of presence of apraxia of speech in nfvPPA cases. VBM analysis showed that relative vowel duration covaried with grey matter intensity in areas critical for speech motor planning and programming: precentral gyrus, supplementary motor area and inferior frontal gyrus bilaterally, only affected in the nfvPPA group. This bilateral involvement of frontal speech networks in nfvPPA potentially affects access to compensatory mechanisms involving right hemisphere homologues. Measures of silences during reading also discriminated the PPA and control groups, but did not increase predictive accuracy. Findings suggest that a measure of relative vowel duration from a polysyllable word
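
    Computationally, a discriminant function analysis of this kind amounts to fitting a linear discriminant on the acoustic measure. A toy sketch with fabricated relative-vowel-duration values (the numbers and labels are illustrative, not from the study):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X = np.array([[0.21], [0.19], [0.22], [0.34], [0.37], [0.40]])  # relative vowel duration
y = np.array([0, 0, 0, 1, 1, 1])                                # 0 = lvPPA, 1 = nfvPPA

lda = LinearDiscriminantAnalysis().fit(X, y)
print(lda.predict([[0.33]]))   # predicted variant for a new case
print(lda.score(X, y))         # agreement with the diagnostic labels
```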

  17. Speech characteristics in a Ugandan child with a rare paramedian craniofacial cleft: a case report.

    Science.gov (United States)

    Van Lierde, K M; Bettens, K; Luyten, A; De Ley, S; Tungotyo, M; Balumukad, D; Galiwango, G; Bauters, W; Vermeersch, H; Hodges, A

    2013-03-01

    The purpose of this study is to describe the speech characteristics in an English-speaking Ugandan boy of 4.5 years who has a rare paramedian craniofacial cleft (unilateral lip, alveolar, palatal, nasal and maxillary cleft, and associated hypertelorism). Closure of the lip together with the closure of the hard and soft palate (one-stage palatal closure) was performed at the age of 5 months. Objective as well as subjective speech assessment techniques were used. The speech samples were perceptually judged for articulation, intelligibility and nasality. The Nasometer was used for the objective measurement of the nasalance values. The most striking communication problems in this child with the rare craniofacial cleft are an incomplete phonetic inventory, a severely impaired speech intelligibility with the presence of very severe hypernasality, mild nasal emission, phonetic disorders (omission of several consonants, decreased intraoral pressure in plosives, insufficient frication of fricatives and the use of a middorsum palatal stop) and phonological disorders (deletion of initial and final consonants and consonant clusters). The increased objective nasalance values are in agreement with the presence of the audible nasality disorders. The results revealed that several phonetic and phonological articulation disorders together with a decreased speech intelligibility and resonance disorders are present in the child with a rare craniofacial cleft. To what extent a secondary surgery for velopharyngeal insufficiency, combined with speech therapy, will improve speech intelligibility, articulation and resonance characteristics is a subject for further research. The results of such analyses may ultimately serve as a starting point for specific surgical and logopedic treatment that addresses the specific needs of children with rare facial clefts. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
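
    For context, the nasalance score reported by Nasometer-style instruments is the nasal channel's acoustic energy as a percentage of the nasal-plus-oral total. A minimal sketch, assuming two simultaneously recorded channels (actual devices compute this frame by frame and average):

```python
import numpy as np

def nasalance_percent(nasal, oral):
    """Nasalance = nasal energy / (nasal + oral energy) * 100."""
    e_nasal = np.sum(np.asarray(nasal, dtype=float) ** 2)
    e_oral = np.sum(np.asarray(oral, dtype=float) ** 2)
    return 100.0 * e_nasal / (e_nasal + e_oral)
```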

  18. Speech Recognition: Acoustic-Phonetic Knowledge Acquisition and Representation.

    Science.gov (United States)

    1987-09-25

    ... Randolph, M. A., and V. W. Zue, "The Role of Syllable Structure in the Acoustic Realizations of Stops," Acoustical Society of America, Anaheim, CA, Dec. 1986. ... The input speech signal is first transformed into a representation that takes into account differences in sociolinguistic background, dialect, and vocal tract ... "Perceptual Evidence," Journal of the Acoustical Society of America, vol. 59, no. 5, pp. 1208-1221, May 1976. ... G. E. Kopec and M. A. Bush, "Network ...

  19. Perception of the Multisensory Coherence of Fluent Audiovisual Speech in Infancy: Its Emergence & the Role of Experience

    Science.gov (United States)

    Lewkowicz, David J.; Minar, Nicholas J.; Tift, Amy H.; Brandon, Melissa

    2014-01-01

    To investigate the developmental emergence of the ability to perceive the multisensory coherence of native and non-native audiovisual fluent speech, we tested 4-, 8–10, and 12–14 month-old English-learning infants. Infants first viewed two identical female faces articulating two different monologues in silence and then in the presence of an audible monologue that matched the visible articulations of one of the faces. Neither the 4-month-old nor the 8–10 month-old infants exhibited audio-visual matching in that neither group exhibited greater looking at the matching monologue. In contrast, the 12–14 month-old infants exhibited matching and, consistent with the emergence of perceptual expertise for the native language, they perceived the multisensory coherence of native-language monologues earlier in the test trials than of non-native language monologues. Moreover, the matching of native audible and visible speech streams observed in the 12–14 month olds did not depend on audio-visual synchrony whereas the matching of non-native audible and visible speech streams did depend on synchrony. Overall, the current findings indicate that the perception of the multisensory coherence of fluent audiovisual speech emerges late in infancy, that audio-visual synchrony cues are more important in the perception of the multisensory coherence of non-native than native audiovisual speech, and that the emergence of this skill most likely is affected by perceptual narrowing. PMID:25462038

  20. Assessment of Speech in Primary Cleft Palate by Two-layer Closure (Conservative Management).

    Science.gov (United States)

    Jain, Harsha; Rao, Dayashankara; Sharma, Shailender; Gupta, Saurabh

    2012-01-01

    Treatment of the cleft palate has evolved over a long period of time. Various techniques of cleft palate repair that are practiced today are the results of principles learned through many years of modifications. The challenge in the art of modern palatoplasty is no longer successful closure of the cleft palate but an optimal speech outcome without compromising maxillofacial growth. Throughout these periods of evolution in the treatment of cleft palate, the effectiveness of various treatment protocols has been challenged by controversies concerning speech and maxillofacial growth. In this article we have evaluated the results of Pinto's modification of Wardill-Kilner palatoplasty without radical dissection of the levator veli palatini muscle on speech and post-op fistula in two different age groups in 20 patients. Preoperative and 6-month postoperative speech assessment values indicated that two-layer palatoplasty (modified Wardill-Kilner V-Y pushback technique) without an intravelar veloplasty technique was good for speech.

  1. ERP evidence that auditory-visual speech facilitates working memory in younger and older adults.

    Science.gov (United States)

    Frtusova, Jana B; Winneke, Axel H; Phillips, Natalie A

    2013-06-01

    Auditory-visual (AV) speech enhances speech perception and facilitates auditory processing, as measured by event-related brain potentials (ERPs). Considering a perspective of shared resources between perceptual and cognitive processes, facilitated speech perception may render more resources available for higher-order functions. This study examined whether AV speech facilitation leads to better working memory (WM) performance in 23 younger and 20 older adults. Participants completed an n-back task (0- to 3-back) under visual-only (V-only), auditory-only (A-only), and AV conditions. The results showed faster responses across all memory loads and improved accuracy in the most demanding conditions (2- and 3-back) during AV compared with unisensory conditions. Older adults benefited from the AV presentation to the same extent as younger adults. WM performance of older adults during the AV presentation did not differ from that of younger adults in the A-only condition, suggesting that an AV presentation can help to counteract some of the age-related WM decline. The ERPs showed a decrease in the auditory N1 amplitude during the AV compared with A-only presentation in older adults, suggesting that the facilitation of perceptual processing becomes especially beneficial with aging. Additionally, the N1 occurred earlier in the AV than in the A-only condition for both age groups. These AV-induced modulations of auditory processing correlated with improvement in certain behavioral and ERP measures of WM. These results support an integrated model between perception and cognition, and suggest that processing speech under AV conditions enhances WM performance of both younger and older adults. PsycINFO Database Record (c) 2013 APA, all rights reserved.

  2. How may the basal ganglia contribute to auditory categorization and speech perception?

    Directory of Open Access Journals (Sweden)

    Sung-Joo eLim

    2014-08-01

    Full Text Available Listeners must accomplish two complementary perceptual feats in extracting a message from speech. They must discriminate linguistically-relevant acoustic variability and generalize across irrelevant variability. Said another way, they must categorize speech. Since the mapping of acoustic variability is language-specific, these categories must be learned from experience. Thus, understanding how, in general, the auditory system acquires and represents categories can inform us about the toolbox of mechanisms available to speech perception. This perspective invites consideration of findings from cognitive neuroscience literatures outside of the speech domain as a means of constraining models of speech perception. Although neurobiological models of speech perception have mainly focused on cerebral cortex, research outside the speech domain is consistent with the possibility of significant subcortical contributions in category learning. Here, we review the functional role of one such structure, the basal ganglia. We examine research from animal electrophysiology, human neuroimaging, and behavior to consider characteristics of basal ganglia processing that may be advantageous for speech category learning. We also present emerging evidence for a direct role for basal ganglia in learning auditory categories in a complex, naturalistic task intended to model the incidental manner in which speech categories are acquired. To conclude, we highlight new research questions that arise in incorporating the broader neuroscience research literature in modeling speech perception, and suggest how understanding contributions of the basal ganglia can inform attempts to optimize training protocols for learning non-native speech categories in adulthood.

  3. An ALE meta-analysis on the audiovisual integration of speech signals.

    Science.gov (United States)

    Erickson, Laura C; Heeg, Elizabeth; Rauschecker, Josef P; Turkeltaub, Peter E

    2014-11-01

    The brain improves speech processing through the integration of audiovisual (AV) signals. Situations involving AV speech integration may be crudely dichotomized into those where auditory and visual inputs contain (1) equivalent, complementary signals (validating AV speech) or (2) inconsistent, different signals (conflicting AV speech). This simple framework may allow the systematic examination of broad commonalities and differences between AV neural processes engaged by various experimental paradigms frequently used to study AV speech integration. We conducted an activation likelihood estimation meta-analysis of 22 functional imaging studies comprising 33 experiments, 311 subjects, and 347 foci examining "conflicting" versus "validating" AV speech. Experimental paradigms included content congruency, timing synchrony, and perceptual measures, such as the McGurk effect or synchrony judgments, across AV speech stimulus types (sublexical to sentence). Colocalization of conflicting AV speech experiments revealed consistency across at least two contrast types (e.g., synchrony and congruency) in a network of dorsal stream regions in the frontal, parietal, and temporal lobes. There was consistency across all contrast types (synchrony, congruency, and percept) in the bilateral posterior superior/middle temporal cortex. Although fewer studies were available, validating AV speech experiments were localized to other regions, such as ventral stream visual areas in the occipital and inferior temporal cortex. These results suggest that while equivalent, complementary AV speech signals may evoke activity in regions related to the corroboration of sensory input, conflicting AV speech signals recruit widespread dorsal stream areas likely involved in the resolution of conflicting sensory signals. Copyright © 2014 Wiley Periodicals, Inc.

  4. Inner Speech's Relationship With Overt Speech in Poststroke Aphasia.

    Science.gov (United States)

    Stark, Brielle C; Geva, Sharon; Warburton, Elizabeth A

    2017-09-18

    Relatively preserved inner speech alongside poor overt speech has been documented in some persons with aphasia (PWA), but the relationship of overt speech with inner speech is still largely unclear, as few studies have directly investigated these factors. The present study investigates the relationship of relatively preserved inner speech in aphasia with selected measures of language and cognition. Thirty-eight persons with chronic aphasia (27 men, 11 women; average age 64.53 ± 13.29 years, time since stroke 8-111 months) were classified as having relatively preserved inner and overt speech (n = 21), relatively preserved inner speech with poor overt speech (n = 8), or not classified due to insufficient measurements of inner and/or overt speech (n = 9). Inner speech scores (by group) were correlated with selected measures of language and cognition from the Comprehensive Aphasia Test (Swinburn, Porter, & Howard, 2004). The group with poor overt speech showed a significant relationship of inner speech with overt naming (r = .95, p < .05). Relationships between inner speech and language and cognition factors were not significant for the group with relatively good overt speech. As in previous research, we show that relatively preserved inner speech is found alongside otherwise severe production deficits in PWA. PWA with poor overt speech may rely more on preserved inner speech for overt picture naming (perhaps due to shared resources with verbal working memory) and for written picture description (perhaps due to reliance on inner speech due to perceived task difficulty). Assessments of inner speech may be useful as a standard component of aphasia screening, and therapy focused on improving and using inner speech may prove clinically worthwhile. https://doi.org/10.23641/asha.5303542.

  5. Vocal effectiveness of speech-language pathology students: Before and after voice use during service delivery

    Science.gov (United States)

    Couch, Stephanie; Zieba, Dominique; van der Merwe, Anita

    2015-01-01

    Background As a professional voice user, it is imperative that a speech-language pathologist's (SLP) vocal effectiveness remain consistent throughout the day. Many factors may contribute to reduced vocal effectiveness, including prolonged voice use, vocally abusive behaviours, poor vocal hygiene and environmental factors. Objectives To determine the effect of service delivery on the perceptual and acoustic features of voice. Method A quasi-experimental, pre-test–post-test research design was used. Participants included third- and final-year speech-language pathology students at the University of Pretoria (South Africa). Voice parameters were evaluated in a pre-test measurement, after which the participants provided two consecutive hours of therapy. A post-test measurement was then completed. Data analysis consisted of an instrumental analysis in which the multidimensional voice programme (MDVP) and the voice range profile (VRP) were used to measure vocal parameters and then calculate the dysphonia severity index (DSI). The GRBASI scale was used to conduct a perceptual analysis of voice quality. Data were processed using descriptive statistics to determine change in each measured parameter after service delivery. Results A change of clinical significance was observed in the acoustic and perceptual parameters of voice. Conclusion Guidelines for SLPs in order to maintain optimal vocal effectiveness were suggested. PMID:26304213
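
    The DSI mentioned here combines four clinical measurements into one score. A sketch using the published weighting (Wuyts et al., 2000), where roughly +5 corresponds to a normal voice and -5 to severe dysphonia:

```python
def dysphonia_severity_index(mpt_s, f0_high_hz, i_low_db, jitter_pct):
    """DSI = 0.13*MPT + 0.0053*F0-High - 0.26*I-Low - 1.18*Jitter(%) + 12.4,
    from maximum phonation time (s), highest attainable F0 (Hz),
    lowest attainable intensity (dB), and percent jitter."""
    return (0.13 * mpt_s + 0.0053 * f0_high_hz
            - 0.26 * i_low_db - 1.18 * jitter_pct + 12.4)
```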

  6. Status Report on Speech Research. A Report on the Status and Progress of Studies on the Nature of Speech, Instrumentation for Its Investigation, and Practical Applications.

    Science.gov (United States)

    1985-10-01

    ... Kay Arlyne Russo, Sara Basson, Noriko Kobayashi, Richard C. Schmidt, Eric Bateson, Rena A. Krakow, John Scholz, Suzanne Boyce, Deborah Kuglitsch ... greater accuracy than isolated vowels (Gottfried & Strange, 1980; Rakerd et al., 1984; Strange, Edman, & Jenkins, 1976; Strange, Verbrugge, Shankweiler ... Perceptual structure of monophthongs and diphthongs in English. Language and Speech, 26, 21-59. Gottfried, T. L., & Strange, W. (1980 ...

  7. Electrophysiological evidence for differences between fusion and combination illusions in audiovisual speech perception.

    Science.gov (United States)

    Baart, Martijn; Lindborg, Alma; Andersen, Tobias S

    2017-11-01

    Incongruent audiovisual speech stimuli can lead to perceptual illusions such as fusions or combinations. Here, we investigated the underlying audiovisual integration process by measuring ERPs. We observed that visual speech-induced suppression of P2 amplitude (which is generally taken as a measure of audiovisual integration) for fusions was similar to suppression obtained with fully congruent stimuli, whereas P2 suppression for combinations was larger. We argue that these effects arise because the phonetic incongruency is solved differently for both types of stimuli. © 2017 The Authors. European Journal of Neuroscience published by Federation of European Neuroscience Societies and John Wiley & Sons Ltd.

  8. A Danish open-set speech corpus for competing-speech studies

    DEFF Research Database (Denmark)

    Nielsen, Jens Bo; Dau, Torsten; Neher, Tobias

    2014-01-01

    Studies investigating speech-on-speech masking effects commonly use closed-set speech materials such as the coordinate response measure [Bolia et al. (2000). J. Acoust. Soc. Am. 107, 1065-1066]. However, these studies typically result in very low (i.e., negative) speech recognition thresholds (SRTs)...... when the competing speech signals are spatially separated. To achieve higher SRTs that correspond more closely to natural communication situations, an open-set, low-context, multi-talker speech corpus was developed. Three sets of 268 unique Danish sentences were created, and each set was recorded...... with one of three professional female talkers. The intelligibility of each sentence in the presence of speech-shaped noise was measured. For each talker, 200 approximately equally intelligible sentences were then selected and systematically distributed into 10 test lists. Test list homogeneity was assessed...
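
    SRTs with materials like these are usually measured adaptively: the SNR is lowered after a correct response and raised after an incorrect one, converging on the 50%-correct point. A minimal one-down/one-up sketch (the corpus paper's exact procedure may differ):

```python
def srt_staircase(run_trial, start_snr=0.0, step=2.0, n_trials=20):
    """Estimate the SRT; run_trial(snr) -> bool presents one sentence at
    that SNR and reports whether the listener repeated it correctly."""
    snr, track = start_snr, []
    for _ in range(n_trials):
        track.append(snr)
        snr += -step if run_trial(snr) else step
    later = track[4:]               # discard the initial approach
    return sum(later) / len(later)  # mean SNR as the SRT estimate
```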

  9. Hearing history influences voice gender perceptual performance in cochlear implant users.

    Science.gov (United States)

    Kovačić, Damir; Balaban, Evan

    2010-12-01

    The study was carried out to assess the role that five hearing history variables (chronological age, age at onset of deafness, age of first cochlear implant [CI] activation, duration of CI use, and duration of known deafness) play in the ability of CI users to identify speaker gender. Forty-one juvenile CI users participated in two voice gender identification tasks. In a fixed, single-interval task, subjects listened to a single speech item from one of 20 adult male or 20 adult female speakers and had to identify speaker gender. In an adaptive speech-based voice gender discrimination task with the fundamental frequency difference between the voices as the adaptive parameter, subjects listened to a pair of speech items presented in sequential order, one of which was always spoken by an adult female and the other by an adult male. Subjects had to identify the speech item spoken by the female voice. Correlation and regression analyses between perceptual scores in the two tasks and the hearing history variables were performed. Subjects fell into three performance groups: (1) those who could distinguish voice gender in both tasks, (2) those who could distinguish voice gender in the adaptive but not the fixed task, and (3) those who could not distinguish voice gender in either task. Gender identification performance for single voices in the fixed task was significantly and negatively related to the duration of deafness before cochlear implantation (shorter deafness yielded better performance), whereas performance in the adaptive task was weakly but significantly related to age at first activation of the CI device, with earlier activations yielding better scores. The existence of a group of subjects able to perform adaptive discrimination but unable to identify the gender of singly presented voices demonstrates the potential dissociability of the skills required for these two tasks, suggesting that duration of deafness and age of cochlear implantation could have

  10. Acoustic and Perceptual Measurement of Expressive Prosody in High-Functioning Autism: Increased Pitch Range and What it Means to Listeners

    Science.gov (United States)

    Nadig, Aparna; Shaw, Holly

    2012-01-01

    Are there consistent markers of atypical prosody in speakers with high functioning autism (HFA) compared to typically-developing speakers? We examined: (1) acoustic measurements of pitch range, mean pitch and speech rate in conversation, (2) perceptual ratings of conversation for these features and overall prosody, and (3) acoustic measurements of…

  11. Preserved appreciation of aesthetic elements of speech and music prosody in an amusic individual: A holistic approach.

    Science.gov (United States)

    Loutrari, Ariadne; Lorch, Marjorie Perlman

    2017-07-01

    We present a follow-up study on the case of a Greek amusic adult, B.Z., whose impaired performance on scale, contour, interval, and meter was reported by Paraskevopoulos, Tsapkini, and Peretz in 2010, employing a culturally-tailored version of the Montreal Battery of Evaluation of Amusia. In the present study, we administered a novel set of perceptual judgement tasks designed to investigate the ability to appreciate holistic prosodic aspects of 'expressiveness' and emotion in phrase length music and speech stimuli. Our results show that, although diagnosed as a congenital amusic, B.Z. scored as well as healthy controls (N=24) on judging 'expressiveness' and emotional prosody in both speech and music stimuli. These findings suggest that the ability to make perceptual judgements about such prosodic qualities may be preserved in individuals who demonstrate difficulties perceiving basic musical features such as melody or rhythm. B.Z.'s case yields new insights into amusia and the processing of speech and music prosody through a holistic approach. The employment of novel stimuli with relatively fewer non-naturalistic manipulations, as developed for this study, may be a useful tool for revealing unexplored aspects of music and speech cognition and offer the possibility to further the investigation of the perception of acoustic streams in more authentic auditory conditions. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  12. Understanding the nature of apraxia of speech: Theory, analysis, and treatment

    Directory of Open Access Journals (Sweden)

    Kirrie J. Ballard

    2010-08-01

    Full Text Available Researchers have interpreted the behaviours of individuals with acquired apraxia of speech (AOS) as impairment of linguistic phonological processing, motor control, or both. Acoustic, kinematic, and perceptual studies of speech in more recent years have led to significant advances in our understanding of the disorder and wide acceptance that it affects phonetic-motoric planning of speech. However, newly developed methods for studying nonspeech motor control are providing new insights, indicating that the motor control impairment of AOS extends beyond speech and is manifest in nonspeech movements of the oral structures. We present the most recent developments in theory and methods to examine and define the nature of AOS. Theories of the disorder are then related to existing treatment approaches and the efficacy of these approaches is examined. Directions for development of new treatments are posited. It is proposed that treatment programmes driven by a principled account of how the motor system learns to produce skilled actions will provide the most efficient and effective framework for treating motor-based speech disorders. In turn, well controlled and theoretically motivated studies of treatment efficacy promise to stimulate further development of theoretical accounts and contribute to our understanding of AOS.

  13. Multidisciplinary Assessment and Diagnosis of Conversion Disorder in a Patient with Foreign Accent Syndrome

    Directory of Open Access Journals (Sweden)

    Harrison N. Jones

    2011-01-01

    Full Text Available Multiple reports have described patients with disordered articulation and prosody, often following acute aphasia, dysarthria, or apraxia of speech, which results in the perception by listeners of a foreign-like accent. These features led to the term foreign accent syndrome (FAS), a speech disorder with perceptual features that suggest an indistinct, non-native speaking accent. Also correctly known as pseudoforeign accent, the speech does not typically match a specific foreign accent, but is rather a constellation of speech features that result in the perception of a foreign accent by listeners. The primary etiologies of FAS are cerebrovascular accidents or traumatic brain injuries which affect cortical and subcortical regions critical to expressive speech and language production. Far fewer cases of FAS associated with psychiatric conditions have been reported. We will present the clinical history, neurological examination, neuropsychological assessment, cognitive-behavioral and biofeedback assessments, and motor speech examination of a patient with FAS without a known vascular, traumatic, or infectious precipitant. Repeated multidisciplinary examinations of this patient provided convergent evidence in support of FAS secondary to conversion disorder. We discuss these findings and their implications for evaluation and treatment of rare neurological and psychiatric conditions.

  14. Do We Perceive Others Better than Ourselves? A Perceptual Benefit for Noise-Vocoded Speech Produced by an Average Speaker.

    Directory of Open Access Journals (Sweden)

    William L Schuerman

    Full Text Available In different tasks involving action perception, performance has been found to be facilitated when the presented stimuli were produced by the participants themselves rather than by another participant. These results suggest that the same mental representations are accessed during both production and perception. However, with regard to spoken word perception, evidence also suggests that listeners' representations for speech reflect the input from their surrounding linguistic community rather than their own idiosyncratic productions. Furthermore, speech perception is heavily influenced by indexical cues that may lead listeners to frame their interpretations of incoming speech signals with regard to speaker identity. In order to determine whether word recognition shows self-advantages similar to those found in action perception, it was necessary to eliminate indexical cues from the speech signal. We therefore asked participants to identify noise-vocoded versions of Dutch words that were based on either their own recordings or those of a statistically average speaker. The majority of participants were more accurate for the average speaker than for themselves, even after taking into account differences in intelligibility. These results suggest that the speech representations accessed during perception of noise-vocoded speech are more reflective of the input of the speech community, and hence that speech perception is not necessarily based on representations of one's own speech.
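
    Noise vocoding, the manipulation used here to strip indexical cues, replaces each spectral band's fine structure with envelope-modulated noise. A minimal sketch (the band count and edges are illustrative and may differ from the study's settings):

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def noise_vocode(x, fs, n_bands=6, lo=100.0, hi=7000.0):
    """Split speech into log-spaced bands, extract each band's envelope,
    modulate band-limited noise with it, and sum the bands."""
    edges = np.geomspace(lo, hi, n_bands + 1)
    noise = np.random.randn(len(x))
    out = np.zeros(len(x))
    for f_lo, f_hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [f_lo, f_hi], btype="bandpass", fs=fs, output="sos")
        band = sosfilt(sos, x)
        env = np.abs(hilbert(band))    # band envelope
        carrier = sosfilt(sos, noise)  # band-limited noise carrier
        out += env * carrier
    return out / np.max(np.abs(out))   # normalize peak level
```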

  15. Perceptual evaluation and acoustic analysis of pneumatic artificial larynx.

    Science.gov (United States)

    Xu, Jie Jie; Chen, Xi; Lu, Mei Ping; Qiao, Ming Zhe

    2009-12-01

    To investigate the perceptual and acoustic characteristics of the pneumatic artificial larynx (PAL) and evaluate its speech ability and clinical value. Prospective study. The study was conducted in the Voice Lab, Department of Otorhinolaryngology, The First Affiliated Hospital of Nanjing Medical University. Forty-six laryngectomy patients using the PAL were rated for intelligibility and fluency of speech. The voice signals of sustained vowel /a/ for 40 healthy controls and 42 successful patients using the PAL were measured by a computer system. The acoustic parameters and sound spectrographs were analyzed and compared between the two groups. Forty-two of 46 patients using the PAL (91.3%) acquired successful speech capability. The intelligibility scores of 42 successful PAL speakers ranged from 71 to 95 percent, and the intelligibility range of four unsuccessful speakers was 30 to 50 percent. The fluency was judged as good or excellent in 42 successful patients, and poor or fair in four unsuccessful patients. There was no significant difference in average fundamental frequency, maximum intensity, jitter, shimmer, and normalized noise energy (NNE) between 42 successful PAL speakers and 40 healthy controls, while the maximum phonation time (MPT) of PAL speakers was slightly lower than that of the controls. The sound spectrographs of the patients using the PAL approximated those of the healthy controls. The PAL has the advantage of a high percentage of successful vocal rehabilitation. PAL speech is fluent and intelligible. The acoustic characteristics of the PAL are similar to those of a normal voice.
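
    The perturbation measures compared here have simple definitions once cycle boundaries are marked. A sketch of local jitter and shimmer:

```python
import numpy as np

def jitter_percent(periods_s):
    """Local jitter: mean absolute difference between consecutive glottal
    periods, as a percentage of the mean period."""
    p = np.asarray(periods_s, dtype=float)
    return 100.0 * np.mean(np.abs(np.diff(p))) / np.mean(p)

def shimmer_percent(peak_amplitudes):
    """Local shimmer: the same measure applied to cycle peak amplitudes."""
    a = np.asarray(peak_amplitudes, dtype=float)
    return 100.0 * np.mean(np.abs(np.diff(a))) / np.mean(a)
```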

  16. Assessment of perceptual diffuseness in the time domain

    DEFF Research Database (Denmark)

    Garcia, Julian Martinez-Villalba; Jeong, Cheol-Ho; Brunskog, Jonas

    2017-01-01

    This study proposes a numerical and experimental framework for evaluating the perceptual aspect of the diffuse field condition with intended final use in music auditoria. Multiple Impulse Responses are simulated based on the time domain Poisson process with increasing reflection density. Different...

  17. Audio-visual speech timing sensitivity is enhanced in cluttered conditions.

    Directory of Open Access Journals (Sweden)

    Warrick Roseboom

    2011-04-01

    Full Text Available Events encoded in separate sensory modalities, such as audition and vision, can seem to be synchronous across a relatively broad range of physical timing differences. This may suggest that the precision of audio-visual timing judgments is inherently poor. Here we show that this is not necessarily true. We contrast timing sensitivity for isolated streams of audio and visual speech, and for streams of audio and visual speech accompanied by additional, temporally offset, visual speech streams. We find that the precision with which synchronous streams of audio and visual speech are identified is enhanced by the presence of additional streams of asynchronous visual speech. Our data suggest that timing perception is shaped by selective grouping processes, which can result in enhanced precision in temporally cluttered environments. The imprecision suggested by previous studies might therefore be a consequence of examining isolated pairs of audio and visual events. We argue that when an isolated pair of cross-modal events is presented, they tend to group perceptually and to seem synchronous as a consequence. We have revealed greater precision by providing multiple visual signals, possibly allowing a single auditory speech stream to group selectively with the most synchronous visual candidate. The grouping processes we have identified might be important in daily life, such as when we attempt to follow a conversation in a crowded room.

  18. General contrast effects in speech perception: effect of preceding liquid on stop consonant identification.

    Science.gov (United States)

    Lotto, A J; Kluender, K R

    1998-05-01

    When members of a series of synthesized stop consonants varying acoustically in F3 characteristics and varying perceptually from /da/ to /ga/ are preceded by /al/, subjects report hearing more /ga/ syllables relative to when each member is preceded by /ar/ (Mann, 1980). It has been suggested that this result demonstrates the existence of a mechanism that compensates for coarticulation via tacit knowledge of articulatory dynamics and constraints, or through perceptual recovery of vocal-tract dynamics. The present study was designed to assess the degree to which these perceptual effects are specific to qualities of human articulatory sources. In three experiments, series of consonant-vowel (CV) stimuli varying in F3-onset frequency (/da/-/ga/) were preceded by speech versions or nonspeech analogues of /al/ and /ar/. The effect of liquid identity on stop consonant labeling remained when the preceding VC was produced by a female speaker and the CV syllable was modeled after a male speaker's productions. Labeling boundaries also shifted when the CV was preceded by a sine wave glide modeled after F3 characteristics of /al/ and /ar/. Identifications shifted even when the preceding sine wave was of constant frequency equal to the offset frequency of F3 from a natural production. These results suggest an explanation in terms of general auditory processes as opposed to recovery of or knowledge of specific articulatory dynamics.
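
    The nonspeech analogues described, sine-wave glides tracking F3, are straightforward to synthesize by frequency modulation. A sketch with placeholder frequencies (not the study's exact values):

```python
import numpy as np

def sine_glide(f_start, f_end, dur_s=0.05, fs=16000):
    """Frequency-modulated sine glide, e.g., modeled on the F3 offsets
    of /al/ versus /ar/; passing f_start == f_end gives the
    constant-frequency condition."""
    n = int(dur_s * fs)
    freq = np.linspace(f_start, f_end, n)      # instantaneous frequency
    phase = 2 * np.pi * np.cumsum(freq) / fs   # integrate to get phase
    return np.sin(phase)
```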

  19. Issues in developing valid assessments of speech pathology students' performance in the workplace.

    Science.gov (United States)

    McAllister, Sue; Lincoln, Michelle; Ferguson, Alison; McAllister, Lindy

    2010-01-01

    Workplace-based learning is a critical component of professional preparation in speech pathology. A validated assessment of this learning is seen to be 'the gold standard', but it is difficult to develop because of design and validation issues. These issues include the role and nature of judgement in assessment, challenges in measuring quality, and the relationship between assessment and learning. Valid assessment of workplace-based performance needs to capture the development of competence over time and account for both occupation specific and generic competencies. This paper reviews important conceptual issues in the design of valid and reliable workplace-based assessments of competence including assessment content, process, impact on learning, measurement issues, and validation strategies. It then goes on to share what has been learned about quality assessment and validation of a workplace-based performance assessment using competency-based ratings. The outcomes of a four-year national development and validation of an assessment tool are described. A literature review of issues in conceptualizing, designing, and validating workplace-based assessments was conducted. Key factors to consider in the design of a new tool were identified and built into the cycle of design, trialling, and data analysis in the validation stages of the development process. This paper provides an accessible overview of factors to consider in the design and validation of workplace-based assessment tools. It presents strategies used in the development and national validation of a tool, COMPASS, used in every speech pathology programme in Australia, New Zealand, and Singapore. The paper also describes Rasch analysis, a model-based statistical approach which is useful for establishing validity and reliability of assessment tools. Through careful attention to conceptual and design issues in the development and trialling of workplace-based assessments, it has been possible to develop the
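
    Rasch analysis, mentioned above, models the probability of a positive rating as a logistic function of person ability minus item difficulty. A bare-bones joint maximum-likelihood sketch (illustrative only; operational validation work uses dedicated Rasch software):

```python
import numpy as np

def fit_rasch(resp, n_iter=2000, lr=0.01):
    """Fit theta (person ability) and beta (item difficulty) for a
    persons-x-items 0/1 response matrix by gradient ascent on the
    Rasch log-likelihood: P(correct) = sigmoid(theta - beta)."""
    n_persons, n_items = resp.shape
    theta = np.zeros(n_persons)
    beta = np.zeros(n_items)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(theta[:, None] - beta[None, :])))
        resid = resp - p                 # observed minus expected
        theta += lr * resid.sum(axis=1)  # ability gradient step
        beta -= lr * resid.sum(axis=0)   # difficulty gradient step
        beta -= beta.mean()              # anchor the scale origin
    return theta, beta
```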

  20. The Performance-Perceptual Test (PPT) and its relationship to aided reported handicap and hearing aid satisfaction.

    Science.gov (United States)

    Saunders, Gabrielle H; Forsline, Anna

    2006-06-01

    Results of objective clinical tests (e.g., measures of speech understanding in noise) often conflict with subjective reports of hearing aid benefit and satisfaction. The Performance-Perceptual Test (PPT) is an outcome measure in which objective and subjective evaluations are made by using the same test materials, testing format, and unit of measurement (signal-to-noise ratio, S/N), permitting a direct comparison between measured and perceived ability to hear. Two variables are measured: a Performance Speech Reception Threshold in Noise (SRTN) for 50% correct performance and a Perceptual SRTN, which is the S/N at which listeners perceive that they can understand the speech material. A third variable is computed: the Performance-Perceptual Discrepancy (PPDIS); it is the difference between the Performance and Perceptual SRTNs and measures the extent to which listeners "misjudge" their hearing ability. Saunders et al. in 2004 examined the relation between PPT scores and unaided hearing handicap. In this publication, the relations between the PPT, residual aided handicap, and hearing aid satisfaction are described. Ninety-four individuals between the ages of 47 and 86 yr participated. All had symmetrical sensorineural hearing loss and had worn binaural hearing aids for at least 6 wk before participating. All subjects underwent routine audiological examination and completed the PPT, the Hearing Handicap Inventory for the Elderly/Adults (HHIE/A), and the Satisfaction for Amplification in Daily Life questionnaire. Sixty-five subjects attended one research visit for participation in this study, and 29 attended a second visit to complete the PPT a second time. Performance and Perceptual SRTN and PPDIS scores were normally distributed and showed excellent test-retest reliability. Aided SRTNs were significantly better than unaided SRTNs; aided and unaided PPDIS values did not differ. Stepwise multiple linear regression showed that the PPDIS, the Performance SRTN, and age were
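
    The PPDIS itself is a one-line computation; the sign convention below follows the order given in the abstract (Performance minus Perceptual) and should be read as an assumption:

```python
def ppdis(performance_srtn_db, perceptual_srtn_db):
    """Performance-Perceptual Discrepancy in dB S/N. With this convention,
    a positive value means the listener reports needing a better S/N than
    their measured SRTN, i.e., underestimates their hearing ability, and a
    negative value means they overestimate it."""
    return performance_srtn_db - perceptual_srtn_db
```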

  1. Perceptual adaptation of voice gender discrimination with spectrally shifted vowels.

    Science.gov (United States)

    Li, Tianhao; Fu, Qian-Jie

    2011-08-01

    To determine whether perceptual adaptation improves voice gender discrimination of spectrally shifted vowels and, if so, which acoustic cues contribute to the improvement. Voice gender discrimination was measured for 10 normal-hearing subjects, during 5 days of adaptation to spectrally shifted vowels, produced by processing the speech of 5 male and 5 female talkers with 16-channel sine-wave vocoders. The subjects were randomly divided into 2 groups; one subjected to 50-Hz, and the other to 200-Hz, temporal envelope cutoff frequencies. No preview or feedback was provided. There was significant adaptation in voice gender discrimination with the 200-Hz cutoff frequency, but significant improvement was observed only for 3 female talkers with F0 > 180 Hz and 3 male talkers with F0 … gender discrimination under spectral shift conditions with perceptual adaptation, but spectral shift may limit the exclusive use of spectral information and/or the use of formant structure on voice gender discrimination. The results have implications for cochlear implant users and for understanding voice gender discrimination.

  2. Silent reading of direct versus indirect speech activates voice-selective areas in the auditory cortex.

    Science.gov (United States)

    Yao, Bo; Belin, Pascal; Scheepers, Christoph

    2011-10-01

    In human communication, direct speech (e.g., Mary said: "I'm hungry") is perceived to be more vivid than indirect speech (e.g., Mary said [that] she was hungry). However, for silent reading, the representational consequences of this distinction are still unclear. Although many of us share the intuition of an "inner voice," particularly during silent reading of direct speech statements in text, there has been little direct empirical confirmation of this experience so far. Combining fMRI with eye tracking in human volunteers, we show that silent reading of direct versus indirect speech engenders differential brain activation in voice-selective areas of the auditory cortex. This suggests that readers are indeed more likely to engage in perceptual simulations (or spontaneous imagery) of the reported speaker's voice when reading direct speech as opposed to meaning-equivalent indirect speech statements as part of a more vivid representation of the former. Our results may be interpreted in line with embodied cognition and form a starting point for more sophisticated interdisciplinary research on the nature of auditory mental simulation during reading.

  3. Perceptual load in sport and the heuristic value of the perceptual load paradigm in examining expertise-related perceptual-cognitive adaptations.

    Science.gov (United States)

    Furley, Philip; Memmert, Daniel; Schmid, Simone

    2013-03-01

    In two experiments, we transferred perceptual load theory to the dynamic field of team sports and tested the predictions derived from the theory using a novel task and stimuli. We tested a group of college students (N = 33) and a group of expert team sport players (N = 32) on a general perceptual load task and a complex, soccer-specific perceptual load task in order to extend the understanding of the applicability of perceptual load theory and further investigate whether distractor interference may differ between the groups, as the sport-specific processing task may not exhaust the processing capacity of the expert participants. In both, the general and the specific task, the pattern of results supported perceptual load theory and demonstrates that the predictions of the theory also transfer to more complex, unstructured situations. Further, perceptual load was the only determinant of distractor processing, as we neither found expertise effects in the general perceptual load task nor the sport-specific task. We discuss the heuristic utility of using response-competition paradigms for studying both general and domain-specific perceptual-cognitive adaptations.

  4. Multilevel Analysis in Analyzing Speech Data

    Science.gov (United States)

    Guddattu, Vasudeva; Krishna, Y.

    2011-01-01

    The speech produced by the human vocal tract is a complex acoustic signal, with diverse applications in phonetics, speech synthesis, automatic speech recognition, speaker identification, communication aids, speech pathology, speech perception, machine translation, hearing research, rehabilitation and assessment of communication disorders and many…

  5. Perceptual Load Affects Eyewitness Accuracy & Susceptibility to Leading Questions

    Directory of Open Access Journals (Sweden)

    Gillian Murphy

    2016-08-01

    Full Text Available Load Theory (Lavie, 1995, 2005) states that the level of perceptual load in a task (i.e., the amount of information involved in processing task-relevant stimuli) determines the efficiency of selective attention. There is evidence that perceptual load affects distractor processing, with increased inattentional blindness under high load. Given that high load can result in individuals failing to report seeing obvious objects, it is conceivable that load may also impair memory for the scene. The current study is the first to assess the effect of perceptual load on eyewitness memory. Across three experiments (two video-based and one in a driving simulator), the effect of perceptual load on eyewitness memory was assessed. The results showed that eyewitnesses were less accurate under high load, in particular for peripheral details. For example, memory for the central character in the video was not affected by load but memory for a witness who passed by the window at the edge of the scene was significantly worse under high load. High load memories were also more open to suggestion, showing increased susceptibility to leading questions. High visual perceptual load also affected recall for auditory information, illustrating a possible cross-modal perceptual load effect on memory accuracy. These results have implications for eyewitness memory researchers and forensic professionals.

  6. Perceptual processing during trauma, priming and the development of intrusive memories

    Science.gov (United States)

    Sündermann, Oliver; Hauschildt, Marit; Ehlers, Anke

    2013-01-01

    Background Intrusive reexperiencing in posttraumatic stress disorder (PTSD) is commonly triggered by stimuli with perceptual similarity to those present during the trauma. Information processing theories suggest that perceptual processing during the trauma and enhanced perceptual priming contribute to the easy triggering of intrusive memories by these cues. Methods Healthy volunteers (N = 51) watched neutral and trauma picture stories on a computer screen. Neutral objects that were unrelated to the content of the stories briefly appeared in the interval between the pictures. Dissociation and data-driven processing (as indicators of perceptual processing) and state anxiety during the stories were assessed with self-report questionnaires. After filler tasks, participants completed a blurred object identification task to assess priming and a recognition memory task. Intrusive memories were assessed with telephone interviews 2 weeks and 3 months later. Results Neutral objects were more strongly primed if they occurred in the context of trauma stories than if they occurred during neutral stories, although the effect size was only moderate (ηp² = .08) and only significant when trauma stories were presented first. Regardless of story order, enhanced perceptual priming predicted intrusive memories at 2-week follow-up (N = 51), but not at 3 months (n = 40). Data-driven processing, dissociation and anxiety increases during the trauma stories also predicted intrusive memories. Enhanced perceptual priming and data-driven processing were associated with lower verbal intelligence. Limitations It is unclear to what extent these findings generalize to real-life traumatic events and whether they are specific to negative emotional events. Conclusions The results provide some support for the role of perceptual processing and perceptual priming in reexperiencing symptoms. PMID:23207970

  7. Perceptual Bias and Loudness Change: An Investigation of Memory, Masking, and Psychophysiology

    Science.gov (United States)

    Olsen, Kirk N.

    Loudness is a fundamental aspect of human auditory perception that is closely associated with a sound's physical acoustic intensity. The dynamic quality of intensity change is an inherent acoustic feature in real-world listening domains such as speech and music. However, perception of loudness change in response to continuous intensity increases (up-ramps) and decreases (down-ramps) has received relatively little empirical investigation. Overestimation of loudness change in response to up-ramps is said to be linked to an adaptive survival response associated with looming (or approaching) motion in the environment. The hypothesised 'perceptual bias' to looming auditory motion suggests why perceptual overestimation of up-ramps may occur; however it does not offer a causal explanation. It is concluded that post-stimulus judgements of perceived loudness change are significantly affected by a cognitive recency response bias that, until now, has been an artefact of experimental procedure. Perceptual end-level differences caused by duration specific sensory adaptation at peripheral and/or central stages of auditory processing may explain differences in post-stimulus judgements of loudness change. Experiments that investigate human responses to acoustic intensity dynamics, encompassing topics from basic auditory psychophysics (e.g., sensory adaptation) to cognitive-emotional appraisal of increasingly complex stimulus events such as music and auditory warnings, are proposed for future research.

  8. Can you hear me now? Musical training shapes functional brain networks for selective auditory attention and hearing speech in noise

    Directory of Open Access Journals (Sweden)

    Dana L Strait

    2011-06-01

    Full Text Available Even in the quietest of rooms, our senses are perpetually inundated by a barrage of sounds, requiring the auditory system to adapt to a variety of listening conditions in order to extract signals of interest (e.g., one speaker’s voice amidst others). Brain networks that promote selective attention are thought to sharpen the neural encoding of a target signal, suppressing competing sounds and enhancing perceptual performance. Here, we ask: does musical training benefit cortical mechanisms that underlie selective attention to speech? To answer this question, we assessed the impact of selective auditory attention on cortical auditory-evoked response variability in musicians and nonmusicians. Outcomes indicate strengthened brain networks for selective auditory attention in musicians in that musicians but not nonmusicians demonstrate decreased prefrontal response variability with auditory attention. Results are interpreted in the context of previous work from our laboratory documenting perceptual and subcortical advantages in musicians for the hearing and neural encoding of speech in background noise. Musicians’ neural proficiency for selectively engaging and sustaining auditory attention to language indicates a potential benefit of music for auditory training. Given the importance of auditory attention for the development of language-related skills, musical training may aid in the prevention, habilitation and remediation of children with a wide range of attention-based language and learning impairments.

  9. Classroom listening assessment: strategies for speech-language pathologists.

    Science.gov (United States)

    Johnson, Cheryl DeConde

    2012-11-01

    Emphasis on classroom listening has gained importance for all children and especially for those with hearing loss and special listening needs. The rationale can be supported from trends in educational placements, the Response to Intervention initiative, student performance and accountability, the role of audition in reading, and improvement in hearing technologies. Speech-language pathologists have an instrumental role advocating for the accommodations that are necessary for effective listening for these children in school. To identify individual listening needs and make relevant recommendations for accommodations, a classroom listening assessment is suggested. Components of the classroom listening assessment include observation, behavioral assessment, self-assessment, and classroom acoustics measurements. Together, with a strong rationale, the results can be used to implement a plan that results in effective classroom listening for these children.

  10. Modulations of 'late' event-related brain potentials in humans by dynamic audiovisual speech stimuli.

    Science.gov (United States)

    Lebib, Riadh; Papo, David; Douiri, Abdel; de Bode, Stella; Gillon Dowens, Margaret; Baudonnière, Pierre-Marie

    2004-11-30

    Lipreading reliably improves speech perception during face-to-face conversation. Within the range of good dubbing, however, adults tolerate some audiovisual (AV) discrepancies, and lipreading can then give rise to confusion. We used event-related brain potentials (ERPs) to study the perceptual strategies governing the intermodal processing of dynamic and bimodal speech stimuli, either congruently dubbed or not. Electrophysiological analyses revealed that non-coherent audiovisual dubbings modulated in amplitude an endogenous ERP component, the N300, which we compared to a 'N400-like effect' reflecting the difficulty of integrating these conflicting pieces of information. This result adds further support for the existence of a cerebral system underlying 'integrative processes' lato sensu. Further studies should take advantage of this 'N400-like effect' with AV speech stimuli to open new perspectives in the domain of psycholinguistics.

  11. Linking Ecological and Perceptual Assessments for Environmental Management: a Coral Reef Case Study

    Directory of Open Access Journals (Sweden)

    Elizabeth A. Dinsdale

    2009-12-01

    Full Text Available Integrating information from a range of community members in environmental management provides a more complete assessment of the problem and a diversification of management options, but is difficult to achieve. To investigate the relationship between different environmental interpretations, I compared three distinct measures of anchor damage on coral reefs: ecological measures, perceptual meanings, and subjective health judgments. The ecological measures identified an increase in the number of overturned corals and a reduction in coral cover, the perceptual meanings identified a loss of visual quality, and the health judgments identified a reduction in the health of the coral reef sites associated with high levels of anchoring. Combining the perceptual meanings and health judgments identified that the judgment of environmental health was a key feature that both scientific and lay participants used to describe the environment. Some participants in the survey were familiar with the coral reef environment, and others were not. However, they provided consistent judgment of a healthy coral reef, suggesting that these judgments were not linked to present-day experiences. By combining subjective judgments and ecological measures, the point at which the environment is deemed to lose visual quality was identified; for these coral reefs, if the level of damage rose above 10.3% and the cover of branching corals dropped below 17.1%, the reefs were described as unhealthy. Therefore, by combining the information, a management agency can involve the community in identifying when remedial action is required or when management policies are effectively maintaining a healthy ecosystem.

  12. Perceptual Load Affects Eyewitness Accuracy and Susceptibility to Leading Questions.

    Science.gov (United States)

    Murphy, Gillian; Greene, Ciara M

    2016-01-01

    Load Theory (Lavie, 1995, 2005) states that the level of perceptual load in a task (i.e., the amount of information involved in processing task-relevant stimuli) determines the efficiency of selective attention. There is evidence that perceptual load affects distractor processing, with increased inattentional blindness under high load. Given that high load can result in individuals failing to report seeing obvious objects, it is conceivable that load may also impair memory for the scene. The current study is the first to assess the effect of perceptual load on eyewitness memory. Across three experiments (two video-based and one in a driving simulator), the effect of perceptual load on eyewitness memory was assessed. The results showed that eyewitnesses were less accurate under high load, in particular for peripheral details. For example, memory for the central character in the video was not affected by load but memory for a witness who passed by the window at the edge of the scene was significantly worse under high load. High load memories were also more open to suggestion, showing increased susceptibility to leading questions. High visual perceptual load also affected recall for auditory information, illustrating a possible cross-modal perceptual load effect on memory accuracy. These results have implications for eyewitness memory researchers and forensic professionals.

  13. Speech and Speech-Related Quality of Life After Late Palate Repair: A Patient's Perspective.

    Science.gov (United States)

    Schönmeyr, Björn; Wendby, Lisa; Sharma, Mitali; Jacobson, Lia; Restrepo, Carolina; Campbell, Alex

    2015-07-01

    Many patients with cleft palate deformities worldwide receive treatment at a later age than is recommended for normal speech to develop. The outcomes after late palate repairs in terms of speech and quality of life (QOL) still remain largely unstudied. In the current study, questionnaires were used to assess the patients' perception of speech and QOL before and after primary palate repair. All of the patients were operated on at a cleft center in northeast India and had a cleft palate with a normal lip or with a cleft lip that had been previously repaired. A total of 134 patients (7-35 years) were interviewed preoperatively and 46 patients (7-32 years) were assessed in the postoperative survey. The survey showed that scores based on the speech handicap index, concerning speech and speech-related QOL, did not improve postoperatively. In fact, the questionnaires indicated that the speech became more unpredictable, although patients reported that their self-confidence had improved after the operation. Thus, the majority of interviewed patients who underwent late primary palate repair were satisfied with the surgery. At the same time, speech and speech-related QOL did not improve according to the speech handicap index-based survey. Speech predictability may even become worse and nasal regurgitation may increase after late palate repair, according to these results.

  14. Speech feature discrimination in deaf children following cochlear implantation

    Science.gov (United States)

    Bergeson, Tonya R.; Pisoni, David B.; Kirk, Karen Iler

    2002-05-01

    Speech feature discrimination is a fundamental perceptual skill that is often assumed to underlie word recognition and sentence comprehension performance. To investigate the development of speech feature discrimination in deaf children with cochlear implants, we conducted a retrospective analysis of results from the Minimal Pairs Test (Robbins et al., 1988) selected from patients enrolled in a longitudinal study of speech perception and language development. The MP test uses a 2AFC procedure in which children hear a word and select one of two pictures (bat-pat). All 43 children were prelingually deafened, received a cochlear implant before 6 years of age or between ages 6 and 9, and used either oral or total communication. Children were tested once every 6 months to 1 year for 7 years; not all children were tested at each interval. By 2 years postimplant, the majority of these children achieved near-ceiling levels of discrimination performance for vowel height, vowel place, and consonant manner. Most of the children also achieved plateaus but did not reach ceiling performance for consonant place and voicing. The relationship between speech feature discrimination, spoken word recognition, and sentence comprehension will be discussed. [Work supported by NIH/NIDCD Research Grant No. R01DC00064 and NIH/NIDCD Training Grant No. T32DC00012.]

  15. Methodical bases of perceptual mapping of printing industry companies

    Directory of Open Access Journals (Sweden)

    Kalinin Pavel

    2017-01-01

    Full Text Available This paper studies the methodological foundations of perceptual mapping for printing industry enterprises. The research has a practical focus, which shaped the choice of its methodological framework. The authors use such research methods as cause-effect analysis, synthesis, problem analysis, expert evaluation, and image visualization. In this paper, the authors present their assessment of the competitive environment of major printing industry companies in Kirov oblast; the assessment employs perceptual mapping performed in Minitab 14. This technique can be used by experts in the field of marketing and branding to assess the competitive environment in any market. The object of research is the printing industry in Kirov oblast. The most important conclusion of this study is that in perceptual mapping, all the parameters are integrated in a single system and provide a more objective view of the company’s market situation.

  16. Objective assessment of stream segregation abilities of CI users as a function of electrode separation

    DEFF Research Database (Denmark)

    Paredes Gallardo, Andreu; Madsen, Sara Miay Kim; Dau, Torsten

    Auditory streaming is a perceptual process by which the human auditory system organizes sounds from different sources into perceptually meaningful elements. Segregation of sound sources is important, among other things, for understanding speech in noisy environments, which is especially challenging...

  17. The neural processing of foreign-accented speech and its relationship to listener bias

    Directory of Open Access Journals (Sweden)

    Han-Gyol eYi

    2014-10-01

    Full Text Available Foreign-accented speech often presents a challenging listening condition. In addition to deviations from the target speech norms related to the inexperience of the nonnative speaker, listener characteristics may play a role in determining intelligibility levels. We have previously shown that an implicit visual bias for associating East Asian faces and foreignness predicts the listeners’ perceptual ability to process Korean-accented English audiovisual speech (Yi et al., 2013). Here, we examine the neural mechanism underlying the influence of listener bias to foreign faces on speech perception. In a functional magnetic resonance imaging (fMRI) study, native English speakers listened to native- and Korean-accented English sentences, with or without faces. The participants’ Asian-foreign association was measured using an implicit association test (IAT), conducted outside the scanner. We found that foreign-accented speech evoked greater activity in the bilateral primary auditory cortices and the inferior frontal gyri, potentially reflecting greater computational demand. Higher IAT scores, indicating greater bias, were associated with increased BOLD response to foreign-accented speech with faces in the primary auditory cortex, the early node for spectrotemporal analysis. We conclude the following: (1) foreign-accented speech perception places greater demand on the neural systems underlying speech perception; (2) the face of the talker can exaggerate the perceived foreignness of foreign-accented speech; (3) implicit Asian-foreign association is associated with decreased neural efficiency in early spectrotemporal processing.

  18. The role of metrical information in apraxia of speech. Perceptual and acoustic analyses of word stress.

    Science.gov (United States)

    Aichert, Ingrid; Späth, Mona; Ziegler, Wolfram

    2016-02-01

    Several factors are known to influence speech accuracy in patients with apraxia of speech (AOS), e.g., syllable structure or word length. However, the impact of word stress has largely been neglected so far. More generally, the role of prosodic information at the phonetic encoding stage of speech production often remains unconsidered in models of speech production. This study aimed to investigate the influence of word stress on error production in AOS. Two-syllable words with stress on the first (trochees) vs. the second syllable (iambs) were compared in 14 patients with AOS, three of them exhibiting pure AOS, and in a control group of six normal speakers. The patients produced significantly more errors on iambic than on trochaic words. The metrical effect was most prominent for segmental errors. Acoustic analyses of word durations revealed a disproportionate advantage of the trochaic meter in the patients relative to the healthy controls. The results indicate that German apraxic speakers are sensitive to metrical information. It is assumed that metrical patterns function as prosodic frames for articulation planning, and that the regular metrical pattern in German, the trochaic form, has a facilitating effect on word production in patients with AOS. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. Effects of emotional and perceptual-motor stress on a voice recognition system's accuracy: An applied investigation

    Science.gov (United States)

    Poock, G. K.; Martin, B. J.

    1984-02-01

    This was an applied investigation examining the ability of a speech recognition system to recognize speakers' inputs when the speakers were under different stress levels. Subjects were asked to speak to a voice recognition system under three conditions: (1) normal office environment, (2) emotional stress, and (3) perceptual-motor stress. Results indicate a definite relationship between voice recognition system performance and the type of low stress reference patterns used to achieve recognition.

  20. Factors influencing individual variation in perceptual directional microphone benefit.

    Science.gov (United States)

    Keidser, Gitte; Dillon, Harvey; Convery, Elizabeth; Mejia, Jorge

    2013-01-01

    Large variations in perceptual directional microphone benefit, which far exceed the variation expected from physical performance measures of directional microphones, have been reported in the literature. The cause for the individual variation has not been systematically investigated. To determine the factors that are responsible for the individual variation in reported perceptual directional benefit. A correlational study. Physical performance measures of the directional microphones obtained after they had been fitted to individuals, cognitive abilities of individuals, and measurement errors were related to perceptual directional benefit scores. Fifty-nine hearing-impaired adults with varied degrees of hearing loss participated in the study. All participants were bilaterally fitted with a Motion behind-the-ear device (500 M, 501 SX, or 501 P) from Siemens according to the National Acoustic Laboratories' non-linear prescription, version two (NAL-NL2). Using the Bamford-Kowal-Bench (BKB) sentences, the perceptual directional benefit was obtained as the difference in speech reception threshold measured in babble noise (SRTn) with the devices in directional (fixed hypercardioid) and in omnidirectional mode. The SRTn measurements were repeated three times with each microphone mode. Physical performance measures of the directional microphone included the angle of the microphone ports to loudspeaker axis, the frequency range dominated by amplified sound, the in situ signal-to-noise ratio (SNR), and the in situ three-dimensional, articulation-index weighted directivity index (3D AI-DI). The cognitive tests included auditory selective attention, speed of processing, and working memory. Intraparticipant variation on the repeated SRTn's and the interparticipant variation on the average SRTn were used to determine the effect of measurement error. A multiple regression analysis was used to determine the effect of other factors. Measurement errors explained 52% of the variation

  1. Assessing Disordered Speech and Voice in Parkinson's Disease: A Telerehabilitation Application

    Science.gov (United States)

    Constantinescu, Gabriella; Theodoros, Deborah; Russell, Trevor; Ward, Elizabeth; Wilson, Stephen; Wootton, Richard

    2010-01-01

    Background: Patients with Parkinson's disease face numerous access barriers to speech pathology services for appropriate assessment and treatment. Telerehabilitation is a possible solution to this problem, whereby rehabilitation services may be delivered to the patient at a distance, via telecommunication and information technologies. A number of…

  2. Multimodal Speech Capture System for Speech Rehabilitation and Learning.

    Science.gov (United States)

    Sebkhi, Nordine; Desai, Dhyey; Islam, Mohammad; Lu, Jun; Wilson, Kimberly; Ghovanloo, Maysam

    2017-11-01

    Speech-language pathologists (SLPs) are trained to correct articulation of people diagnosed with motor speech disorders by analyzing articulators' motion and assessing speech outcome while patients speak. To assist SLPs in this task, we present the multimodal speech capture system (MSCS), which records and displays kinematics of key speech articulators, the tongue and lips, along with voice, using unobtrusive methods. The collected speech modalities (tongue motion, lip gestures, and voice) are visualized not only in real time to provide patients with instant feedback but also offline to allow SLPs to perform post-analysis of articulators' motion, particularly the tongue, with its prominent but hardly visible role in articulation. We describe the MSCS hardware and software components, and demonstrate its basic visualization capabilities with a healthy individual repeating the words "Hello World." A proof-of-concept prototype has been successfully developed for this purpose, and will be used in future clinical studies to evaluate its potential impact on accelerating speech rehabilitation by enabling patients to speak naturally. Pattern-matching algorithms applied to the collected data can provide patients with quantitative and objective feedback on their speech performance, unlike current methods, which are mostly subjective and may vary from one SLP to another.

  3. Speech Clarity Index (Ψ): A Distance-Based Speech Quality Indicator and Recognition Rate Prediction for Dysarthric Speakers with Cerebral Palsy

    Science.gov (United States)

    Kayasith, Prakasith; Theeramunkong, Thanaruk

    Measuring the severity of dysarthria by manually evaluating a speaker's speech with available standard assessment methods based on human perception is a tedious and subjective task. This paper presents an automated approach to assess the speech quality of a dysarthric speaker with cerebral palsy. With the consideration of two complementary factors, speech consistency and speech distinction, a speech quality indicator called the speech clarity index (Ψ) is proposed as a measure of the speaker's ability to produce a consistent speech signal for a certain word and distinguishable speech signals for different words. As an application, it can be used to assess speech quality and forecast the speech recognition rate for an individual dysarthric speaker before the exhaustive implementation of an automatic speech recognition system for that speaker. The effectiveness of Ψ as a speech recognition rate predictor is evaluated by rank-order inconsistency, correlation coefficient, and root-mean-square of difference. The evaluations were done by comparing its predicted recognition rates with those predicted by the standard methods, the articulatory and intelligibility tests, based on two recognition systems (HMM and ANN). The results show that Ψ is a promising indicator for predicting the recognition rate of dysarthric speech. All experiments were done on a speech corpus composed of speech data from eight normal speakers and eight dysarthric speakers.
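
    The paper defines Ψ precisely; the sketch below is only one plausible reading of a distance-based consistency/distinction measure, assuming each utterance is reduced to a fixed-length feature vector (e.g., averaged MFCCs) and that Euclidean distance is the metric. The function name and the ratio used to combine the two factors are illustrative, not the authors' formula.

        # Hypothetical clarity measure: consistency = mean distance between
        # repeated productions of the same word; distinction = mean distance
        # between productions of different words. Assumes >= 2 repetitions
        # per word and equal-length feature vectors.
        import numpy as np
        from itertools import combinations

        def clarity_index(utterances):
            """utterances: dict mapping word -> list of 1-D feature vectors."""
            intra, inter = [], []
            for reps in utterances.values():
                for a, b in combinations(reps, 2):
                    intra.append(np.linalg.norm(np.subtract(a, b)))
            for w1, w2 in combinations(list(utterances), 2):
                for a in utterances[w1]:
                    for b in utterances[w2]:
                        inter.append(np.linalg.norm(np.subtract(a, b)))
            # Higher inter-word distinction and lower intra-word
            # inconsistency -> clearer speech.
            return np.mean(inter) / (np.mean(intra) + 1e-9)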

  4. Quantitative assessment of motor speech abnormalities in idiopathic rapid eye movement sleep behaviour disorder.

    Science.gov (United States)

    Rusz, Jan; Hlavnička, Jan; Tykalová, Tereza; Bušková, Jitka; Ulmanová, Olga; Růžička, Evžen; Šonka, Karel

    2016-03-01

    Patients with idiopathic rapid eye movement sleep behaviour disorder (RBD) are at substantial risk for developing Parkinson's disease (PD) or related neurodegenerative disorders. Speech is an important indicator of motor function and movement coordination, and therefore may be an extremely sensitive early marker of changes due to prodromal neurodegeneration. Speech data were acquired from 16 RBD subjects and 16 age- and sex-matched healthy control subjects. Objective acoustic assessment of 15 speech dimensions representing various phonatory, articulatory, and prosodic deviations was performed. Statistical models were applied to characterise speech disorders in RBD and to estimate sensitivity and specificity in differentiating between RBD and control subjects. Some form of speech impairment was revealed in 88% of RBD subjects. Articulatory deficits were the most prominent findings in RBD. In comparison to controls, the RBD group showed significant alterations in irregular alternating motion rates (p = 0.009) and articulatory decay (p = 0.01). The combination of four distinctive speech dimensions, including aperiodicity, irregular alternating motion rates, articulatory decay, and dysfluency, led to 96% sensitivity and 79% specificity in discriminating between RBD and control subjects. Speech impairment was significantly more pronounced in RBD subjects with the motor score of the Unified Parkinson's Disease Rating Scale greater than 4 points when compared to other RBD individuals. Simple quantitative speech motor measures may be suitable for the reliable detection of prodromal neurodegeneration in subjects with RBD, and therefore may provide important outcomes for future therapy trials. Copyright © 2015 Elsevier B.V. All rights reserved.
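
    The 96% sensitivity and 79% specificity quoted above follow directly from the classifier's confusion counts; as a reminder, a minimal sketch with hypothetical binary labels (1 = RBD, 0 = control):

        # Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
        def sens_spec(y_true, y_pred):
            tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
            fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
            tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
            fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
            return tp / (tp + fn), tn / (tn + fp)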

  5. Perceptual Robust Design

    DEFF Research Database (Denmark)

    Pedersen, Søren Nygaard

    The research presented in this PhD thesis has focused on a perceptual approach to robust design. The results of the research and the original contribution to knowledge are a preliminary framework for understanding, positioning, and applying perceptual robust design. Product quality is a topic...... been presented. Therefore, this study set out to contribute to the understanding and application of perceptual robust design. To achieve this, a state-of-the-art and current practice review was performed. From the review two main research problems were identified. Firstly, a lack of tools...... for perceptual robustness was found to overlap with the optimum for functional robustness and at most approximately 2.2% out of the 14.74% could be ascribed solely to the perceptual robustness optimisation. In conclusion, the thesis has offered a new perspective on robust design by merging robust design...

  6. Predicting the effect of spectral subtraction on the speech recognition threshold based on the signal-to-noise ratio in the envelope domain

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2011-01-01

    rarely been evaluated perceptually in terms of speech intelligibility. This study analyzed the effects of the spectral subtraction strategy proposed by Berouti et al. [ICASSP 4 (1979), 208-211] on the speech recognition threshold (SRT) obtained with sentences presented in stationary speech-shaped noise.... The SRT was measured in five normal-hearing listeners in six conditions of spectral subtraction. The results showed an increase of the SRT after processing, i.e., a decreased speech intelligibility, in contrast to what is predicted by the Speech Transmission Index (STI). Here, another approach is proposed......, denoted the speech-based envelope power spectrum model (sEPSM), which predicts the intelligibility based on the signal-to-noise ratio in the envelope domain. In contrast to the STI, the sEPSM is sensitive to the increased amount of the noise envelope power as a consequence of the spectral subtraction...
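
    For readers unfamiliar with the Berouti et al. strategy, it is power-domain over-subtraction with a spectral floor. The sketch below is a generic textbook implementation in Python/SciPy with illustrative alpha and beta values; it is not the exact processing evaluated in the study.

        # Over-subtraction with a spectral floor: subtract alpha times the
        # noise power estimate, never dipping below beta times that estimate,
        # and resynthesize with the noisy phase.
        import numpy as np
        from scipy.signal import stft, istft

        def spectral_subtract(noisy, noise, fs, alpha=4.0, beta=0.01):
            _, _, Y = stft(noisy, fs, nperseg=512)
            _, _, N = stft(noise, fs, nperseg=512)
            noise_pow = np.mean(np.abs(N) ** 2, axis=1, keepdims=True)
            clean_pow = np.abs(Y) ** 2 - alpha * noise_pow
            clean_pow = np.maximum(clean_pow, beta * noise_pow)
            X = np.sqrt(clean_pow) * np.exp(1j * np.angle(Y))
            _, x = istft(X, fs, nperseg=512)
            return x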

  7. Summation versus suppression in metacontrast masking: On the potential pitfalls of using metacontrast masking to assess perceptual-motor dissociation.

    Science.gov (United States)

    Cardoso-Leite, Pedro; Waszak, Florian

    2014-07-01

    A briefly flashed target stimulus can become "invisible" when immediately followed by a mask, a phenomenon known as backward masking, which constitutes a major tool in the cognitive sciences. One form of backward masking is termed metacontrast masking. It is generally assumed that in metacontrast masking, the mask suppresses activity on which the conscious perception of the target relies. This assumption biases conclusions when masking is used as a tool, for example, to study the independence between perceptual detection and motor reaction. This is because other models can account for reduced perceptual performance without requiring suppression mechanisms. In this study, we used signal detection theory to test the suppression model against an alternative view of metacontrast masking, referred to as the summation model. This model claims that target- and mask-related activations fuse and that the difficulty in detecting the target results from the difficulty of discriminating this fused response from the response produced by the mask alone. Our data support this alternative view. This study is not a thorough investigation of metacontrast masking. Instead, we wanted to point out that when a different model is used to account for the reduced perceptual performance in metacontrast masking, there is no need to postulate a dissociation between perceptual and motor responses to account for the data. Metacontrast masking, as implemented in the Fehrer-Raab situation, therefore is not a valid method to assess perceptual-motor dissociations.
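
    The basic signal-detection quantity behind such model comparisons is d′, computed from hit and false-alarm rates; the snippet below shows only the standard calculation and is not the study's summation-versus-suppression analysis.

        # d' = z(hit rate) - z(false-alarm rate), with z the inverse normal CDF.
        from scipy.stats import norm

        def d_prime(hit_rate, fa_rate):
            return norm.ppf(hit_rate) - norm.ppf(fa_rate)

        print(d_prime(0.85, 0.20))  # ~1.88 for these illustrative rates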

  8. Neural plasticity underlying visual perceptual learning in aging.

    Science.gov (United States)

    Mishra, Jyoti; Rolle, Camarin; Gazzaley, Adam

    2015-07-01

    Healthy aging is associated with a decline in basic perceptual abilities, as well as higher-level cognitive functions such as working memory. In a recent perceptual training study using moving sweeps of Gabor stimuli, Berry et al. (2010) observed that older adults significantly improved discrimination abilities on the most challenging perceptual tasks that presented paired sweeps at rapid rates of 5 and 10 Hz. Berry et al. further showed that this perceptual training engendered transfer-of-benefit to an untrained working memory task. Here, we investigated the neural underpinnings of the improvements in these perceptual tasks, as assessed by event-related potential (ERP) recordings. Early visual ERP components time-locked to stimulus onset were compared pre- and post-training, as well as relative to a no-contact control group. The visual N1 and N2 components were significantly enhanced after training, and the N1 change correlated with improvements in perceptual discrimination on the task. Further, the change observed for the N1 and N2 was associated with the rapidity of the perceptual challenge; the visual N1 (120-150 ms) was enhanced post-training for 10 Hz sweep pairs, while the N2 (240-280 ms) was enhanced for the 5 Hz sweep pairs. We speculate that these observed post-training neural enhancements reflect improvements by older adults in the allocation of attention that is required to accurately dissociate perceptually overlapping stimuli when presented in rapid sequence. This article is part of a Special Issue entitled SI: Memory. Copyright © 2014 Elsevier B.V. All rights reserved.

  9. Varying acoustic-phonemic ambiguity reveals that talker normalization is obligatory in speech processing.

    Science.gov (United States)

    Choi, Ja Young; Hu, Elly R; Perrachione, Tyler K

    2018-04-01

    The nondeterministic relationship between speech acoustics and abstract phonemic representations imposes a challenge for listeners to maintain perceptual constancy despite the highly variable acoustic realization of speech. Talker normalization facilitates speech processing by reducing the degrees of freedom for mapping between encountered speech and phonemic representations. While this process has been proposed to facilitate the perception of ambiguous speech sounds, it is currently unknown whether talker normalization is affected by the degree of potential ambiguity in acoustic-phonemic mapping. We explored the effects of talker normalization on speech processing in a series of speeded classification paradigms, parametrically manipulating the potential for inconsistent acoustic-phonemic relationships across talkers for both consonants and vowels. Listeners identified words with varying potential acoustic-phonemic ambiguity across talkers (e.g., beet/boat vs. boot/boat) spoken by single or mixed talkers. Auditory categorization of words was always slower when listening to mixed talkers compared to a single talker, even when there was no potential acoustic ambiguity between target sounds. Moreover, the processing cost imposed by mixed talkers was greatest when words had the most potential acoustic-phonemic overlap across talkers. Models of acoustic dissimilarity between target speech sounds did not account for the pattern of results. These results suggest (a) that talker normalization incurs the greatest processing cost when disambiguating highly confusable sounds and (b) that talker normalization appears to be an obligatory component of speech perception, taking place even when the acoustic-phonemic relationships across sounds are unambiguous.

  10. Apraxia of Speech: Perceptual Analysis of Trisyllabic Word Productions across Repeated Sampling Occasions

    Science.gov (United States)

    Mauszycki, Shannon C.; Wambaugh, Julie L.; Cameron, Rosalea M.

    2012-01-01

    Purpose: Early apraxia of speech (AOS) research has characterized errors as being variable, resulting in a number of different error types being produced on repeated productions of the same stimuli. Conversely, recent research has uncovered greater consistency in errors, but there are limited data examining sound errors over time (more than one…

  11. Inferior frontal gyrus activation predicts individual differences in perceptual learning of cochlear-implant simulations.

    Science.gov (United States)

    Eisner, Frank; McGettigan, Carolyn; Faulkner, Andrew; Rosen, Stuart; Scott, Sophie K

    2010-05-26

    This study investigated the neural plasticity associated with perceptual learning of a cochlear implant (CI) simulation. Normal-hearing listeners were trained with vocoded and spectrally shifted speech simulating a CI while cortical responses were measured with functional magnetic resonance imaging (fMRI). A condition in which the vocoded speech was spectrally inverted provided a control for learnability and adaptation. Behavioral measures showed considerable individual variability both in the ability to learn to understand the degraded speech, and in phonological working memory capacity. Neurally, left-lateralized regions in superior temporal sulcus and inferior frontal gyrus (IFG) were sensitive to the learnability of the simulations, but only the activity in prefrontal cortex correlated with interindividual variation in intelligibility scores and phonological working memory. A region in left angular gyrus (AG) showed an activation pattern that reflected learning over the course of the experiment, and covariation of activity in AG and IFG was modulated by the learnability of the stimuli. These results suggest that variation in listeners' ability to adjust to vocoded and spectrally shifted speech is partly reflected in differences in the recruitment of higher-level language processes in prefrontal cortex, and that this variability may further depend on functional links between the left inferior frontal gyrus and angular gyrus. Differences in the engagement of left inferior prefrontal cortex, and its covariation with posterior parietal areas, may thus underlie some of the variation in speech perception skills that have been observed in clinical populations of CI users.

  12. Particle swarm optimization based feature enhancement and feature selection for improved emotion recognition in speech and glottal signals.

    Science.gov (United States)

    Muthusamy, Hariharan; Polat, Kemal; Yaacob, Sazali

    2015-01-01

    In recent years, many research works have been published using speech-related features for speech emotion recognition; however, recent studies show that there is a strong correlation between emotional states and glottal features. In this work, Mel-frequency cepstral coefficients (MFCCs), linear predictive cepstral coefficients (LPCCs), perceptual linear predictive (PLP) features, gammatone filter outputs, timbral texture features, stationary wavelet transform based timbral texture features, and relative wavelet packet energy and entropy features were extracted from the emotional speech (ES) signals and their glottal waveforms (GW). Particle swarm optimization based clustering (PSOC) and wrapper based particle swarm optimization (WPSO) were proposed to enhance the discerning ability of the features and to select the discriminating features, respectively. Three different emotional speech databases were utilized to gauge the proposed method. An extreme learning machine (ELM) was employed to classify the different types of emotions. Different experiments were conducted and the results show that the proposed method significantly improves the speech emotion recognition performance compared to previous works published in the literature.
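
    As a rough illustration of the wrapper idea (WPSO), the sketch below runs a binary particle swarm over feature masks, scoring each mask by cross-validated accuracy. A k-nearest-neighbour classifier stands in for the paper's extreme learning machine, and all constants (swarm size, inertia, acceleration weights) are generic choices, not the authors' settings.

        # Toy binary PSO feature selection: particles are 0/1 masks over the
        # feature columns; fitness is 3-fold CV accuracy on the masked data.
        import numpy as np
        from sklearn.model_selection import cross_val_score
        from sklearn.neighbors import KNeighborsClassifier

        def pso_select(X, y, n_particles=20, iters=30, seed=0):
            rng = np.random.default_rng(seed)
            d = X.shape[1]
            pos = (rng.random((n_particles, d)) < 0.5).astype(float)
            vel = rng.normal(0.0, 1.0, (n_particles, d))

            def fitness(mask):
                m = mask.astype(bool)
                if not m.any():
                    return 0.0
                clf = KNeighborsClassifier(3)
                return cross_val_score(clf, X[:, m], y, cv=3).mean()

            pbest = pos.copy()
            pbest_fit = np.array([fitness(p) for p in pos])
            gbest = pbest[pbest_fit.argmax()].copy()
            for _ in range(iters):
                r1, r2 = rng.random((2, n_particles, d))
                vel = (0.7 * vel + 1.5 * r1 * (pbest - pos)
                       + 1.5 * r2 * (gbest - pos))
                prob = 1.0 / (1.0 + np.exp(-vel))   # sigmoid transfer
                pos = (rng.random((n_particles, d)) < prob).astype(float)
                fit = np.array([fitness(p) for p in pos])
                better = fit > pbest_fit
                pbest[better], pbest_fit[better] = pos[better], fit[better]
                gbest = pbest[pbest_fit.argmax()].copy()
            return gbest.astype(bool)   # mask of selected features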

  13. A novel method for assessing the development of speech motor function in toddlers with autism spectrum disorders

    Directory of Open Access Journals (Sweden)

    Katherine eSullivan

    2013-03-01

    Full Text Available There is increasing evidence to show that indicators other than socio-cognitive abilities might predict communicative function in Autism Spectrum Disorders (ASD). A potential area of research is the development of speech motor function in toddlers. Utilizing a novel measure called ‘articulatory features’, we assess the abilities of toddlers to produce sounds at different timescales as a metric of their speech motor skills. In the current study, we examined (1) whether speech motor function differed between toddlers with ASD, developmental delay, and typical development; and (2) whether differences in speech motor function are correlated with standard measures of language in toddlers with ASD. Our results revealed significant differences between a subgroup of the ASD population with poor verbal skills and the other groups for the articulatory features associated with the shortest time scale, namely place of articulation (p < 0.05). We also found significant correlations between articulatory features and language and motor ability as assessed by the Mullen and the Vineland scales for the ASD group. Our findings suggest that articulatory features may be an additional measure of speech motor function that could potentially be useful as an early risk indicator of ASD.

  14. Auditory, Visual and Audiovisual Speech Processing Streams in Superior Temporal Sulcus.

    Science.gov (United States)

    Venezia, Jonathan H; Vaden, Kenneth I; Rong, Feng; Maddox, Dale; Saberi, Kourosh; Hickok, Gregory

    2017-01-01

    The human superior temporal sulcus (STS) is responsive to visual and auditory information, including sounds and facial cues during speech recognition. We investigated the functional organization of STS with respect to modality-specific and multimodal speech representations. Twenty younger adult participants were instructed to perform an oddball detection task and were presented with auditory, visual, and audiovisual speech stimuli, as well as auditory and visual nonspeech control stimuli in a block fMRI design. Consistent with a hypothesized anterior-posterior processing gradient in STS, auditory, visual and audiovisual stimuli produced the largest BOLD effects in anterior, posterior and middle STS (mSTS), respectively, based on whole-brain, linear mixed effects and principal component analyses. Notably, the mSTS exhibited preferential responses to multisensory stimulation, as well as speech compared to nonspeech. Within the mid-posterior and mSTS regions, response preferences changed gradually from visual, to multisensory, to auditory moving posterior to anterior. Post hoc analysis of visual regions in the posterior STS revealed that a single subregion bordering the mSTS was insensitive to differences in low-level motion kinematics yet distinguished between visual speech and nonspeech based on multi-voxel activation patterns. These results suggest that auditory and visual speech representations are elaborated gradually within anterior and posterior processing streams, respectively, and may be integrated within the mSTS, which is sensitive to more abstract speech information within and across presentation modalities. The spatial organization of STS is consistent with processing streams that are hypothesized to synthesize perceptual speech representations from sensory signals that provide convergent information from visual and auditory modalities.

  15. Perceptually-Inspired Computing

    Directory of Open Access Journals (Sweden)

    Ming Lin

    2015-08-01

    Full Text Available Human sensory systems allow individuals to see, hear, touch, and interact with the surrounding physical environment. Understanding human perception and its limits enables us to better exploit the psychophysics of human perceptual systems to design more efficient, adaptive algorithms and develop perceptually-inspired computational models. In this talk, I will survey some recent efforts on perceptually-inspired computing with applications to crowd simulation and multimodal interaction. In particular, I will present data-driven personality modeling based on the results of user studies, example-guided physics-based sound synthesis using auditory perception, as well as perceptually-inspired simplification for multimodal interaction. These perceptually guided principles can be used to accelerate multi-modal interaction and visual computing, thereby creating more natural human-computer interaction and providing more immersive experiences. I will also present their use in interactive applications for entertainment, such as video games, computer animation, and shared social experience. I will conclude by discussing possible future research directions.

  16. Load theory behind the wheel; perceptual and cognitive load effects.

    Science.gov (United States)

    Murphy, Gillian; Greene, Ciara M

    2017-09-01

    Perceptual Load Theory has been proposed as a resolution to the longstanding early versus late selection debate in cognitive psychology. There is much evidence in support of Load Theory but very few applied studies, despite the potential for the model to shed light on everyday attention and distraction. Using a driving simulator, the effect of perceptual and cognitive load on drivers' visual search was assessed. The findings were largely in line with Load Theory, with reduced distractor processing under high perceptual load, but increased distractor processing under high cognitive load. The effect of load on driving behaviour was also analysed, with significant differences in driving behaviour under perceptual and cognitive load. In addition, the effect of perceptual load on drivers' levels of awareness was investigated. High perceptual load significantly increased inattentional blindness and deafness, for stimuli that were both relevant and irrelevant to driving. High perceptual load also increased RTs to hazards. The current study helps to advance Load Theory by illustrating its usefulness outside of traditional paradigms. There are also applied implications for driver safety and roadway design, as the current study suggests that perceptual and cognitive load are important factors in driver attention. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  17. Evidence accumulation detected in BOLD signal using slow perceptual decision making

    NARCIS (Netherlands)

    Krueger, Paul M.; van Vugt, Marieke K.; Simen, Patrick; Nystrom, Leigh; Holmes, Philip; Cohen, Jonathan D.

    2017-01-01

    BACKGROUND: We assessed whether evidence accumulation could be observed in the BOLD signal during perceptual decision making. This presents a challenge since the hemodynamic response is slow, while perceptual decisions are typically fast. NEW METHOD: Guided by theoretical predictions of the drift

  18. Asymmetry in infants’ selective attention to facial features during visual processing of infant-directed speech

    Directory of Open Access Journals (Sweden)

    Nicholas A Smith

    2013-09-01

    Full Text Available Two experiments used eye tracking to examine how infant and adult observers distribute their eye gaze on videos of a mother producing infant- and adult-directed speech. Both groups showed greater attention to the eyes than to the nose and mouth, as well as an asymmetrical focus on the talker’s right eye for infant-directed speech stimuli. Observers continued to look more at the talker’s apparent right eye when the video stimuli were mirror flipped, suggesting that the asymmetry reflects a perceptual processing bias rather than a stimulus artifact, which may be related to cerebral lateralization of emotion processing.

  19. When speaker identity is unavoidable: Neural processing of speaker identity cues in natural speech.

    Science.gov (United States)

    Tuninetti, Alba; Chládková, Kateřina; Peter, Varghese; Schiller, Niels O; Escudero, Paola

    2017-11-01

    Speech sound acoustic properties vary largely across speakers and accents. When perceiving speech, adult listeners normally disregard non-linguistic variation caused by speaker or accent differences, in order to comprehend the linguistic message, e.g. to correctly identify a speech sound or a word. Here we tested whether the process of normalizing speaker and accent differences, facilitating the recognition of linguistic information, is found at the level of neural processing, and whether it is modulated by the listeners' native language. In a multi-deviant oddball paradigm, native and nonnative speakers of Dutch were exposed to naturally-produced Dutch vowels varying in speaker, sex, accent, and phoneme identity. Unexpectedly, the analysis of mismatch negativity (MMN) amplitudes elicited by each type of change shows a large degree of early perceptual sensitivity to non-linguistic cues. This finding on perception of naturally-produced stimuli contrasts with previous studies examining the perception of synthetic stimuli wherein adult listeners automatically disregard acoustic cues to speaker identity. The present finding bears relevance to speech normalization theories, suggesting that at an unattended level of processing, listeners are indeed sensitive to changes in fundamental frequency in natural speech tokens. Copyright © 2017 Elsevier Inc. All rights reserved.

  20. Experience with speech sounds is not necessary for cue trading by budgerigars (Melopsittacus undulatus).

    Directory of Open Access Journals (Sweden)

    Mary Flaherty

    Full Text Available The influence of experience with human speech sounds on speech perception in budgerigars, vocal mimics whose speech exposure can be tightly controlled in a laboratory setting, was measured. Budgerigars were divided into groups that differed in auditory exposure and then tested on a cue-trading identification paradigm with synthetic speech. Phonetic cue trading is a perceptual phenomenon observed when changes on one cue dimension are offset by changes in another cue dimension while still maintaining the same phonetic percept. The current study examined whether budgerigars would trade the cues of voice onset time (VOT) and the first formant onset frequency when identifying syllable-initial stop consonants and if this would be influenced by exposure to speech sounds. There were a total of four different exposure groups: No speech exposure (completely isolated), Passive speech exposure (regular exposure to human speech), and two Speech-trained groups. After the exposure period, all budgerigars were tested for phonetic cue trading using operant conditioning procedures. Birds were trained to peck keys in response to different synthetic speech sounds that began with "d" or "t" and varied in VOT and frequency of the first formant at voicing onset. Once training performance criteria were met, budgerigars were presented with the entire intermediate series, including ambiguous sounds. Responses on these trials were used to determine which speech cues were used, if a trading relation between VOT and the onset frequency of the first formant was present, and whether speech exposure had an influence on perception. Cue trading was found in all birds and these results were largely similar to those of a group of humans. Results indicated that prior speech experience was not a requirement for cue trading by budgerigars. The results are consistent with theories that explain phonetic cue trading in terms of a rich auditory encoding of the speech signal.
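
    A common way to quantify such a trading relation is to fit identification responses with a logistic model in both cues; a nonzero weight on F1 onset means the VOT category boundary shifts as F1 onset changes. The sketch below uses synthetic responses with assumed "true" weights purely to show the computation; it is not the birds' data or the study's analysis.

        # Fit P("t" response) as a logistic function of VOT and F1 onset,
        # then express the trading relation as the VOT boundary shift (ms)
        # per 100 Hz change in F1 onset.
        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(1)
        vot = rng.uniform(0, 60, 400)          # voice onset time, ms
        f1 = rng.uniform(200, 800, 400)        # F1 at voicing onset, Hz
        logit = 0.25 * (vot - 30) - 0.01 * (f1 - 500)   # assumed weights
        resp_t = rng.random(400) < 1 / (1 + np.exp(-logit))

        model = LogisticRegression().fit(np.column_stack([vot, f1]), resp_t)
        b_vot, b_f1 = model.coef_[0]
        print(-b_f1 * 100 / b_vot)   # boundary shift in ms per 100 Hz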

  1. Visual Perceptual Learning and Models.

    Science.gov (United States)

    Dosher, Barbara; Lu, Zhong-Lin

    2017-09-15

    Visual perceptual learning through practice or training can significantly improve performance on visual tasks. Originally seen as a manifestation of plasticity in the primary visual cortex, perceptual learning is more readily understood as improvements in the function of brain networks that integrate processes, including sensory representations, decision, attention, and reward, and balance plasticity with system stability. This review considers the primary phenomena of perceptual learning, theories of perceptual learning, and perceptual learning's effect on signal and noise in visual processing and decision. Models, especially computational models, play a key role in behavioral and physiological investigations of the mechanisms of perceptual learning and for understanding, predicting, and optimizing human perceptual processes, learning, and performance. Performance improvements resulting from reweighting or readout of sensory inputs to decision provide a strong theoretical framework for interpreting perceptual learning and transfer that may prove useful in optimizing learning in real-world applications.

  2. 78 FR 49693 - Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...

    Science.gov (United States)

    2013-08-15

    ...-Speech Services for Individuals with Hearing and Speech Disabilities, Report and Order (Order), document...] Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services; Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and Speech Disabilities...

  3. Some factors underlying individual differences in speech recognition on PRESTO: a first report.

    Science.gov (United States)

    Tamati, Terrin N; Gilbert, Jaimie L; Pisoni, David B

    2013-01-01

    Previous studies investigating speech recognition in adverse listening conditions have found extensive variability among individual listeners. However, little is currently known about the core underlying factors that influence speech recognition abilities. To investigate sensory, perceptual, and neurocognitive differences between good and poor listeners on the Perceptually Robust English Sentence Test Open-set (PRESTO), a new high-variability sentence recognition test under adverse listening conditions. Participants who fell in the upper quartile (HiPRESTO listeners) or lower quartile (LoPRESTO listeners) on key word recognition on sentences from PRESTO in multitalker babble completed a battery of behavioral tasks and self-report questionnaires designed to investigate real-world hearing difficulties, indexical processing skills, and neurocognitive abilities. Young, normal-hearing adults (N = 40) from the Indiana University community participated in the current study. Participants' assessment of their own real-world hearing difficulties was measured with a self-report questionnaire on situational hearing and hearing health history. Indexical processing skills were assessed using a talker discrimination task, a gender discrimination task, and a forced-choice regional dialect categorization task. Neurocognitive abilities were measured with the Auditory Digit Span Forward (verbal short-term memory) and Digit Span Backward (verbal working memory) tests, the Stroop Color and Word Test (attention/inhibition), the WordFam word familiarity test (vocabulary size), the Behavioral Rating Inventory of Executive Function-Adult Version (BRIEF-A) self-report questionnaire on executive function, and two performance subtests of the Wechsler Abbreviated Scale of Intelligence (WASI) Performance Intelligence Quotient (IQ; nonverbal intelligence). Scores on self-report questionnaires and behavioral tasks were tallied and analyzed by listener group (HiPRESTO and LoPRESTO). The extreme

  4. Speech perception in older listeners with normal hearing: conditions of time alteration, selective word stress, and length of sentences.

    Science.gov (United States)

    Cho, Soojin; Yu, Jyaehyoung; Chun, Hyungi; Seo, Hyekyung; Han, Woojae

    2014-04-01

    Deficits of the aging auditory system negatively affect older listeners in terms of speech communication, resulting in limitations to their social lives. To improve their perceptual skills, the goal of this study was to investigate the effects of time alteration, selective word stress, and varying sentence lengths on the speech perception of older listeners. Seventeen older people with normal hearing were tested for seven conditions of different time-altered sentences (i.e., ±60%, ±40%, ±20%, 0%), two conditions of selective word stress (i.e., no-stress and stress), and three different lengths of sentences (i.e., short, medium, and long) at the most comfortable level for individuals in quiet circumstances. As time compression increased, sentence perception scores decreased statistically. Compared to a natural (or no stress) condition, the selectively stressed words significantly improved the perceptual scores of these older listeners. Long sentences yielded the worst scores under all time-altered conditions. Interestingly, there was a noticeable positive effect for the selective word stress at the 20% time compression. This pattern of results suggests that a combination of time compression and selective word stress is more effective for understanding speech in older listeners than using the time-expanded condition only.
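
    Time-altered sentences like those above are typically generated with a time-scale modification algorithm that changes duration without shifting pitch; here is a minimal sketch using librosa's phase-vocoder stretch, where the file name and the mapping of the ±60% conditions onto the rate parameter are assumptions:

        # rate > 1 compresses (shorter, faster); rate < 1 expands.
        import librosa

        y, sr = librosa.load("sentence.wav", sr=None)   # hypothetical file
        compressed = librosa.effects.time_stretch(y, rate=1.6)
        expanded = librosa.effects.time_stretch(y, rate=0.4)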

  5. Gender Identification Using High-Frequency Speech Energy: Effects of Increasing the Low-Frequency Limit.

    Science.gov (United States)

    Donai, Jeremy J; Halbritter, Rachel M

    The purpose of this study was to investigate the ability of normal-hearing listeners to use high-frequency energy for gender identification from naturally produced speech signals. Two experiments were conducted using a repeated-measures design. Experiment 1 investigated the effects of increasing high-pass filter cutoff (i.e., increasing the low-frequency spectral limit) on gender identification from naturally produced vowel segments. Experiment 2 studied the effects of increasing high-pass filter cutoff on gender identification from naturally produced sentences. Confidence ratings for the gender identification task were also obtained for both experiments. Listeners in experiment 1 were capable of extracting talker gender information at levels significantly above chance from vowel segments high-pass filtered up to 8.5 kHz. Listeners in experiment 2 also performed above chance on the gender identification task from sentences high-pass filtered up to 12 kHz. Cumulatively, the results of both experiments provide evidence that normal-hearing listeners can utilize information from the very high-frequency region (above 4 to 5 kHz) of the speech signal for talker gender identification. These findings are at variance with current assumptions about the perceptual information on talker gender carried by this frequency region. The current results also corroborate and extend previous studies of the use of high-frequency speech energy for perceptual tasks. These findings have potential implications for the study of information contained within the high-frequency region of the speech spectrum and the role this region may play in navigating the auditory scene, particularly when the low-frequency portion of the spectrum is masked by environmental noise sources or for listeners with substantial hearing loss in the low-frequency region and better hearing sensitivity in the high-frequency region (i.e., reverse slope hearing loss).
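
    Increasing the low-frequency spectral limit, as in both experiments, amounts to high-pass filtering the recordings at successively higher cutoffs; a minimal sketch with SciPy, where the filter order and file name are assumptions and a mono recording is assumed:

        # Zero-phase Butterworth high-pass at 8.5 kHz (one cutoff used above).
        from scipy.io import wavfile
        from scipy.signal import butter, sosfiltfilt

        fs, x = wavfile.read("talker.wav")    # hypothetical mono recording
        sos = butter(8, 8500, btype="highpass", fs=fs, output="sos")
        x_hp = sosfiltfilt(sos, x.astype(float))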

  6. INVENTORY FOR ASSESSMENT OF SOCIO-PERCEPTUAL ATTITUDE: PSYCHOMETRIC CHARACTERISTICS AND FEATURES OF USE

    Directory of Open Access Journals (Sweden)

    Tatiana D Dubovitskaya

    2017-12-01

    Full Text Available Socio-perceptual attitude is defined as a predisposition of the subjects of communication to perceive, assess, and act in relation to each other in a certain way. The functions of socio-perceptual attitudes are social adaptation (the utilitarian function), cognition, expression and evaluation, and psychological protection. The article presents the authors' inventory for diagnosing a person's socio-perceptual attitude toward other people. The inventory uses proverbs of the peoples of the world as test material. The results of psychometric testing indicate the reliability and validity of the technique. High scores indicate such characteristics as willingness to trust, to help, and to notice positive experiences; the ability to see positive potential; faith in people's ability to develop and achieve better results; and emotional acceptance, benevolence, and empathy. Average scores indicate the subject's desire for close and trusting relationships, cooperation, and sincerity; willingness to understand another person; and a desire to take into account the individual psychological features of other people; contradictions with others are either absent or are resolved constructively. Low scores indicate that, in relation to others, the subject is characterized by suspicion, anticipation of negative attitudes toward themselves, and a readiness to see negative manifestations in the behavior of others while ignoring their successes and achievements; emotional rejection, criticism, irony, malice; and accusations against others used to justify the subject's own negative actions (aggression against said others). The inventory can be used in psychodiagnostics and therapy to optimize communication and interpersonal relations: to achieve mutual understanding and cooperation, develop constructive mutually beneficial solutions, and overcome the desire to criticize

  7. Perceptual learning and human expertise.

    Science.gov (United States)

    Kellman, Philip J; Garrigan, Patrick

    2009-06-01

    We consider perceptual learning: experience-induced changes in the way perceivers extract information. Often neglected in scientific accounts of learning and in instruction, perceptual learning is a fundamental contributor to human expertise and is crucial in domains where humans show remarkable levels of attainment, such as language, chess, music, and mathematics. In Section 2, we give a brief history and discuss the relation of perceptual learning to other forms of learning. We consider in Section 3 several specific phenomena, illustrating the scope and characteristics of perceptual learning, including both discovery and fluency effects. We describe abstract perceptual learning, in which structural relationships are discovered and recognized in novel instances that do not share constituent elements or basic features. In Section 4, we consider primary concepts that have been used to explain and model perceptual learning, including receptive field change, selection, and relational recoding. In Section 5, we consider the scope of perceptual learning, contrasting recent research, focused on simple sensory discriminations, with earlier work that emphasized extraction of invariance from varied instances in more complex tasks. Contrary to some recent views, we argue that perceptual learning should not be confined to changes in early sensory analyzers. Phenomena at various levels, we suggest, can be unified by models that emphasize discovery and selection of relevant information. In a final section, we consider the potential role of perceptual learning in educational settings. Most instruction emphasizes facts and procedures that can be verbalized, whereas expertise depends heavily on implicit pattern recognition and selective extraction skills acquired through perceptual learning. We consider reasons why perceptual learning has not been systematically addressed in traditional instruction, and we describe recent successful efforts to create a technology of perceptual

  9. Speech impairment in Down syndrome: a review.

    Science.gov (United States)

    Kent, Ray D; Vorperian, Houri K

    2013-02-01

    This review summarizes research on disorders of speech production in Down syndrome (DS) for the purposes of informing clinical services and guiding future research. Review of the literature was based on searches using MEDLINE, Google Scholar, PsycINFO, and HighWire Press, as well as consideration of reference lists in retrieved documents (including online sources). Search terms emphasized functions related to voice, articulation, phonology, prosody, fluency, and intelligibility. The following conclusions pertain to four major areas of review: voice, speech sounds, fluency and prosody, and intelligibility. The first major area is voice. Although a number of studies have reported on vocal abnormalities in DS, major questions remain about the nature and frequency of the phonatory disorder. Results of perceptual and acoustic studies have been mixed, making it difficult to draw firm conclusions or even to identify sensitive measures for future study. The second major area is speech sounds. Articulatory and phonological studies show that speech patterns in DS are a combination of delayed development and errors not seen in typical development. Delayed (i.e., developmental) and disordered (i.e., nondevelopmental) patterns are evident by the age of about 3 years, although DS-related abnormalities possibly appear earlier, even in infant babbling. The third major area is fluency and prosody. Stuttering and/or cluttering occur in DS at rates of 10%-45%, compared with about 1% in the general population. Research also points to significant disturbances in prosody. The fourth major area is intelligibility. Studies consistently show marked limitations in this area, but only recently has the research gone beyond simple rating scales.

  10. Chromatic Perceptual Learning but No Category Effects without Linguistic Input.

    Science.gov (United States)

    Grandison, Alexandra; Sowden, Paul T; Drivonikou, Vicky G; Notman, Leslie A; Alexander, Iona; Davies, Ian R L

    2016-01-01

    Perceptual learning involves an improvement in perceptual judgment with practice, which is often specific to stimulus or task factors. Perceptual learning has been shown on a range of visual tasks but very little research has explored chromatic perceptual learning. Here, we use two low-level perceptual threshold tasks and a supra-threshold target detection task to assess chromatic perceptual learning and category effects. Experiment 1 investigates whether chromatic thresholds reduce as a result of training and at what level of analysis learning effects occur. Experiment 2 explores the effect of category training on chromatic thresholds, whether training of this nature is category specific and whether it can induce categorical responding. Experiment 3 investigates the effect of category training on a higher-level, lateralized target detection task, previously found to be sensitive to category effects. The findings indicate that performance on a perceptual threshold task improves following training but improvements do not transfer across retinal location or hue. Therefore, chromatic perceptual learning is stimulus specific and can occur at relatively early stages of visual analysis. Additionally, category training does not induce category effects on a low-level perceptual threshold task, as indicated by comparable discrimination thresholds at the newly learned hue boundary and adjacent test points. However, category training does induce emerging category effects on a supra-threshold target detection task. Whilst chromatic perceptual learning is possible, learnt category effects appear to be a product of left hemisphere processing, and may require the input of higher-level linguistic coding processes in order to manifest.

  11. When speech sounds like music.

    Science.gov (United States)

    Falk, Simone; Rathcke, Tamara; Dalla Bella, Simone

    2014-08-01

    Repetition can boost memory and perception. However, repeating the same stimulus several times in immediate succession also induces intriguing perceptual transformations and illusions. Here, we investigate the Speech to Song Transformation (S2ST), a massed repetition effect in the auditory modality, which crosses the boundaries between language and music. In the S2ST, a phrase repeated several times shifts to being heard as sung. To better understand this unique cross-domain transformation, we examined the perceptual determinants of the S2ST, in particular the role of acoustics. In 2 experiments, the effects of 2 pitch properties and 3 rhythmic properties on the probability and speed of occurrence of the transformation were examined. Results showed that both pitch and rhythmic properties are key features fostering the transformation. However, some properties proved to be more conducive to the S2ST than others. Stable tonal targets that allowed for the perception of a musical melody led to the S2ST more often and more quickly than scalar intervals. Recurring durational contrasts arising from segmental grouping favoring a metrical interpretation of the stimulus also facilitated the S2ST. This was, however, not the case for a regular beat structure within and across repetitions. In addition, individual perceptual abilities predicted the likelihood of the S2ST. Overall, the study demonstrated that repetition enables listeners to reinterpret specific prosodic features of spoken utterances in terms of musical structures. The findings underline a tight link between language and music, but they also reveal important differences in the communicative functions of prosodic structure in the 2 domains.

  12. Effect of perceptual load on conceptual processing: an extension of Vermeulen's theory.

    Science.gov (United States)

    Xie, Jiushu; Wang, Ruiming; Sun, Xun; Chang, Song

    2013-10-01

    The effect of color and shape load on conceptual processing was studied. Perceptual load effects have been found in visual and auditory conceptual processing, supporting the theory of embodied cognition. However, whether different types of visual concepts, such as color and shape, share the same perceptual load effects is unknown. In the current experiment, 32 participants performed simultaneous perceptual and conceptual tasks designed to assess the relation between perceptual load and conceptual processing. Holding a color load in mind obstructed color conceptual processing. Hence, perceptual and conceptual processing drew on the same resources, consistent with embodied cognition. Color conceptual processing was not affected by shape pictures, indicating that different types of properties within vision are processed separately.

  13. A protocol for comprehensive assessment of bulbar dysfunction in amyotrophic lateral sclerosis (ALS).

    Science.gov (United States)

    Yunusova, Yana; Green, Jordan R; Wang, Jun; Pattee, Gary; Zinman, Lorne

    2011-02-21

    Improved methods for assessing bulbar impairment are necessary for expediting diagnosis of bulbar dysfunction in ALS, for predicting disease progression across speech subsystems, and for addressing the critical need for sensitive outcome measures for ongoing experimental treatment trials. To address this need, we are obtaining longitudinal profiles of bulbar impairment in 100 individuals based on a comprehensive instrumentation-based assessment that yields objective measures. Using instrumental approaches to quantify speech-related behaviors is very important in a field that has primarily relied on subjective, auditory-perceptual forms of speech assessment (1). Our assessment protocol measures performance across all of the speech subsystems, which include respiratory, phonatory (laryngeal), resonatory (velopharyngeal), and articulatory. The articulatory subsystem is divided into the facial components (jaw and lip) and the tongue. Prior research has suggested that each speech subsystem responds differently to neurological diseases such as ALS. The current protocol is designed to test the performance of each speech subsystem as independently from other subsystems as possible. The speech subsystems are evaluated in the context of more global changes to speech performance. These speech-system-level variables include speaking rate and intelligibility of speech. The protocol requires specialized instrumentation, and commercial and custom software. The respiratory, phonatory, and resonatory subsystems are evaluated using pressure-flow (aerodynamic) and acoustic methods. The articulatory subsystem is assessed using 3D motion tracking techniques. The objective measures that are used to quantify bulbar impairment have been well established in the speech literature and show sensitivity to changes in bulbar function with disease progression. The result of the assessment is a comprehensive, across-subsystem performance profile for each participant. The profile, when compared to

  14. Avaliação fonoaudiológica na atrofia de múltiplos sistemas: estudo com cinco pacientes / Multiple system atrophy speech assessment: study of five cases

    Directory of Open Access Journals (Sweden)

    Denise Botelho Knopp

    2002-09-01

    Multiple system atrophy (MSA) is characterized by parkinsonian, cerebellar and pyramidal features along with autonomic dysfunction in different combinations. Onset of dysarthria during the first year of the manifestation of a parkinsonian syndrome suggests the diagnosis of MSA. The aim of this study was to characterize the voice and the speech of patients with MSA. We studied five MSA patients with a mean age of 51.2 years. Each patient was submitted to a neurological and a specific speech and voice assessment. The latter consisted of the following: clinical interview, myofunctional examination, and perceptual speech evaluation. Speech and voice complaints occurred at an average time of 1.1 years after the onset of motor symptoms, and all patients presented a mixed dysarthria combining hypokinetic, ataxic and spastic components, with predominance of the hypokinetic component. These findings differ from those commonly seen in Parkinson's disease, in which the hypokinetic component is the only finding. The data indicate that speech and voice assessment is important for the differential diagnosis and therapeutic planning of MSA.

  15. Influence of hearing loss on children’s identification of spondee words in a speech-shaped noise or a two-talker masker

    Science.gov (United States)

    Leibold, Lori J.; Hillock-Dunn, Andrea; Duncan, Nicole; Roush, Patricia A.; Buss, Emily

    2013-01-01

    This study compared spondee identification performance in the presence of speech-shaped noise or two competing talkers across children with hearing loss and age-matched children with normal hearing. The results showed a greater masking effect for children with hearing loss compared to children with normal hearing for both masker conditions. However, the magnitude of this group difference was significantly larger for the two-talker compared to the speech-shaped noise masker. These results support the hypothesis that hearing loss influences children’s perceptual processing abilities. PMID:23492919

  16. Speech in spinocerebellar ataxia.

    Science.gov (United States)

    Schalling, Ellika; Hartelius, Lena

    2013-12-01

    Spinocerebellar ataxias (SCAs) are a heterogeneous group of autosomal dominant cerebellar ataxias clinically characterized by progressive ataxia, dysarthria and a range of other concomitant neurological symptoms. Only a few studies include detailed characterization of speech symptoms in SCA. Speech symptoms in SCA resemble ataxic dysarthria, but symptoms related to phonation may be more prominent. One study to date has shown an association between genotype and differences in speech and voice symptoms. Further studies of speech and voice phenotypes are warranted, as they may aid clinical diagnosis. In addition, instrumental speech analysis has been demonstrated to be a reliable measure that may be used to monitor disease progression or therapy outcomes in possible future pharmacological treatments. Intervention by speech and language pathologists should go beyond assessment. Clinical guidelines for the management of speech, communication and swallowing need to be developed for individuals with progressive cerebellar ataxia. Copyright © 2013 Elsevier Inc. All rights reserved.

  17. Speech monitoring and phonologically-mediated eye gaze in language perception and production: a comparison using printed word eye-tracking

    Science.gov (United States)

    Gauvin, Hanna S.; Hartsuiker, Robert J.; Huettig, Falk

    2013-01-01

    The Perceptual Loop Theory of speech monitoring assumes that speakers routinely inspect their inner speech. In contrast, Huettig and Hartsuiker (2010) observed that listening to one's own speech during language production drives eye-movements to phonologically related printed words with a similar time-course as listening to someone else's speech does in speech perception experiments. This suggests that speakers use their speech perception system to listen to their own overt speech, but not to their inner speech. However, a direct comparison between production and perception with the same stimuli and participants has so far been lacking. The current printed-word eye-tracking experiment therefore used a within-subjects design, combining production and perception. Displays showed four words, of which one, the target, either had to be named or was presented auditorily. Accompanying words were phonologically related, semantically related, or unrelated to the target. There were small increases in looks to phonological competitors, with a similar time-course in both production and perception. Phonological effects in perception, however, lasted longer and had a much larger magnitude. We conjecture that this difference is related to a difference in the predictability of one's own and someone else's speech, which in turn has consequences for lexical competition in other-perception and possibly suppression of activation in self-perception. PMID:24339809

  18. Analysis of Salient Feature Jitter in the Cochlea for Objective Prediction of Temporally Localized Distortion in Synthesized Speech

    Directory of Open Access Journals (Sweden)

    Wenliang Lu

    2009-01-01

    Temporally localized distortions account for the highest variance in subjective evaluation of coded speech signals (Sen, 2001; Hall, 2001). The ability to discern and decompose perceptually relevant temporally localized coding noise from other types of distortions is of theoretical importance as well as a valuable tool for designing and deploying speech synthesis systems. The work described within uses a physiologically motivated cochlear model to provide a tractable analysis of salient feature trajectories as processed by the cochlea. Subsequent statistical analysis shows simple relationships between the jitter of these trajectories and temporal attributes of the Diagnostic Acceptability Measure (DAM).
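
    A generic illustration of the kind of jitter statistic involved (this is not the authors' physiologically motivated cochlear model; the moving-average window and the synthetic formant-like track below are hypothetical) is to take the RMS deviation of a feature trajectory from a smoothed copy of itself:

    import numpy as np

    def trajectory_jitter(track, win=9):
        """RMS deviation of a 1-D feature trajectory from a moving-average-
        smoothed copy of itself, as a simple jitter measure."""
        kernel = np.ones(win) / win
        smooth = np.convolve(track, kernel, mode="same")
        core = slice(win, len(track) - win)   # skip biased window edges
        return np.sqrt(np.mean((track[core] - smooth[core]) ** 2))

    # Toy example: a slowly rising formant-like track with frame-level noise.
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 200)
    clean = 500.0 + 300.0 * t                       # Hz, smooth trend only
    noisy = clean + rng.normal(0.0, 15.0, t.size)   # add 15 Hz frame jitter
    print(trajectory_jitter(clean))   # ~0: a smooth track has no jitter
    print(trajectory_jitter(noisy))   # ~14 Hz: close to the injected noise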

  19. The Virtual Man Project's CD-ROM "Voice Assessment: Speech-Language Pathology and Audiology & Medicine", Vol.1

    Directory of Open Access Journals (Sweden)

    Millena Maria Ramalho Matta Vieira

    2009-01-01

    The CD-ROM "Voice Assessment: Speech-Language Pathology and Audiology & Medicine" was developed as a teaching tool for people interested in the production of the spoken or sung human voice. Its content comprises several subjects concerning the anatomy and physiology of the spoken and sung voice. A careful assessment becomes necessary in order to ensure the effectiveness of teaching and learning materials, whether related to education or health, within the proposal of education mediated by technology. OBJECTIVE: This study aimed to evaluate the efficacy of the Virtual Man Project's CD-ROM "Voice Assessment: Speech-Language Pathology and Audiology & Medicine", as a self-learning material, in two different populations: Speech-Language Pathology and Audiology students and Lyrical Singing students. The participants were instructed to study the CD-ROM for 1 month and answer two questionnaires: one before and another after studying the CD-ROM. The quantitative results were compared statistically by Student's t-test at a significance level of 5%. RESULTS: Seventeen of the 28 students who completed the study were Speech-Language Pathology and Audiology students, while 11 were Lyrical Singing students (dropout rate of 44%). Comparison of the answers to the questionnaires before and after studying the CD-ROM showed a statistically significant increase in the scores on the questionnaire applied after studying the CD-ROM for both Speech-Language Pathology and Audiology and Lyrical Singing students, with p<0.001 and p<0.004, respectively. There was also a statistically significant difference in all topics of this questionnaire for both groups of students. CONCLUSION: The results concerning the evaluation of the Speech-Language Pathology and Audiology and Lyrical Singing students' knowledge before and after learning from the CD-ROM indicate that the participants significantly improved their knowledge of the proposed

  20. The early maximum likelihood estimation model of audiovisual integration in speech perception

    DEFF Research Database (Denmark)

    Andersen, Tobias

    2015-01-01

    Speech perception is facilitated by seeing the articulatory mouth movements of the talker. This is due to perceptual audiovisual integration, which also causes the McGurk−MacDonald illusion, and for which a comprehensive computational account is still lacking. Decades of research have largely focused on the fuzzy logical model of perception (FLMP). This study applies early maximum likelihood estimation (MLE) integration to speech perception along with three model variations. In early MLE, integration is based on a continuous internal representation before categorization, which can make the model more parsimonious by imposing constraints that reflect experimental designs. The study also shows that cross-validation can evaluate models of audiovisual integration based on typical data sets, taking both goodness-of-fit and model flexibility into account. All models were tested on a published data set previously used for testing the FLMP. Cross-validation favored the early MLE, while more conventional error measures favored the FLMP.
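
    For orientation, the cue-combination rule underlying MLE integration weights each cue by its precision (inverse variance), so the fused estimate is more reliable than either cue alone. A minimal sketch follows; the one-dimensional "phonetic axis", the example values, and the function name are illustrative assumptions, not the paper's fitted model:

    def mle_fuse(s_a, var_a, s_v, var_v):
        """Precision-weighted fusion of an auditory and a visual estimate.
        Each cue is weighted by its inverse variance; the fused estimate
        has lower variance than either cue alone."""
        w_a = (1.0 / var_a) / (1.0 / var_a + 1.0 / var_v)
        fused = w_a * s_a + (1.0 - w_a) * s_v
        fused_var = 1.0 / (1.0 / var_a + 1.0 / var_v)
        return fused, fused_var

    # A noisy auditory cue near 0.0 and a sharper visual cue near 1.0 on an
    # arbitrary one-dimensional phonetic axis (units hypothetical):
    print(mle_fuse(s_a=0.0, var_a=0.4, s_v=1.0, var_v=0.1))
    # -> (0.8, 0.08): the fused percept is pulled toward the reliable cue.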

  1. Speech-to-Speech Relay Service

    Science.gov (United States)

    Speech-to-Speech (STS) is one form of Telecommunications Relay Service (TRS). TRS is a service that allows persons with hearing and speech disabilities ...

  2. Childhood apraxia of speech: A survey of praxis and typical speech characteristics.

    Science.gov (United States)

    Malmenholt, Ann; Lohmander, Anette; McAllister, Anita

    2017-07-01

    The purpose of this study was to investigate current knowledge of the diagnosis of childhood apraxia of speech (CAS) in Sweden and to compare speech characteristics and symptoms with earlier survey findings in mainly English speakers. In a web-based questionnaire, 178 Swedish speech-language pathologists (SLPs) anonymously answered questions about their perception of typical speech characteristics for CAS. They graded their own assessment skills and estimated the clinical occurrence of CAS. The seven speech characteristics most often reported as typical for children with CAS were: inconsistent speech production (85%), sequencing difficulties (71%), oro-motor deficits (63%), vowel errors (62%), voicing errors (61%), consonant cluster deletions (54%), and prosodic disturbance (53%). Motor-programming deficits, described as a lack of automatization of speech movements, were perceived by 82%. All listed characteristics were consistent with the American Speech-Language-Hearing Association (ASHA) consensus-based features, Strand's 10-point checklist, and the diagnostic model proposed by Ozanne. The mode for estimated clinical occurrence was 5%. The number of suspected CAS cases in the clinical caseload was approximately one new patient per year per SLP. The results support and add to findings from studies of CAS in English-speaking children, with similar speech characteristics regarded as typical. These findings could contribute to a cross-linguistic consensus on CAS characteristics.

  3. A Causal Inference Model Explains Perception of the McGurk Effect and Other Incongruent Audiovisual Speech.

    Directory of Open Access Journals (Sweden)

    John F Magnotti

    2017-02-01

    Audiovisual speech integration combines information from auditory speech (the talker's voice) and visual speech (the talker's mouth movements) to improve perceptual accuracy. However, if the auditory and visual speech emanate from different talkers, integration decreases accuracy. Therefore, a key step in audiovisual speech perception is deciding whether auditory and visual speech have the same source, a process known as causal inference. A well-known illusion, the McGurk Effect, consists of incongruent audiovisual syllables, such as auditory "ba" + visual "ga" (AbaVga), that are integrated to produce a fused percept ("da"). This illusion raises two fundamental questions: first, given the incongruence between the auditory and visual syllables in the McGurk stimulus, why are they integrated; and second, why does the McGurk effect not occur for other, very similar syllables (e.g., AgaVba)? We describe a simplified model of causal inference in multisensory speech perception (CIMS) that predicts the perception of arbitrary combinations of auditory and visual speech. We applied this model to behavioral data collected from 60 subjects perceiving both McGurk and non-McGurk incongruent speech stimuli. The CIMS model successfully predicted both the audiovisual integration observed for McGurk stimuli and the lack of integration observed for non-McGurk stimuli. An identical model without causal inference failed to accurately predict perception for either form of incongruent speech. The CIMS model uses causal inference to provide a computational framework for studying how the brain performs one of its most important tasks, integrating auditory and visual speech cues to allow us to communicate with others.
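
    The core computation in models of this family can be sketched with the standard Gaussian causal-inference formulation (a generic sketch in the spirit of CIMS, not the authors' fitted model; all variances and the prior below are invented values): compare the likelihood of the auditory and visual measurements under one shared source versus two independent sources, then apply Bayes' rule.

    import numpy as np

    def p_common_cause(x_a, x_v, var_a, var_v, var_p, prior_c):
        """Posterior probability that auditory and visual measurements
        (x_a, x_v) arose from a single source, under the standard Gaussian
        causal-inference model with a zero-mean source prior of variance
        var_p and prior probability prior_c of a common cause."""
        # Likelihood of the pair given one shared source (source integrated out).
        v = var_a * var_v + var_a * var_p + var_v * var_p
        like_one = np.exp(-0.5 * ((x_a - x_v) ** 2 * var_p
                                  + x_a ** 2 * var_v
                                  + x_v ** 2 * var_a) / v) / (2 * np.pi * np.sqrt(v))
        # Likelihood given two independent sources.
        like_two = (np.exp(-0.5 * x_a ** 2 / (var_a + var_p))
                    / np.sqrt(2 * np.pi * (var_a + var_p))
                    * np.exp(-0.5 * x_v ** 2 / (var_v + var_p))
                    / np.sqrt(2 * np.pi * (var_v + var_p)))
        return like_one * prior_c / (like_one * prior_c + like_two * (1 - prior_c))

    # Nearby cues support integration; discrepant cues support segregation.
    print(p_common_cause(0.1, -0.1, 0.2, 0.2, 1.0, 0.5))   # ~0.63
    print(p_common_cause(-1.5, 1.5, 0.2, 0.2, 1.0, 0.5))   # ~0.0002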

  4. Automated analysis of connected speech reveals early biomarkers of Parkinson's disease in patients with rapid eye movement sleep behaviour disorder.

    Science.gov (United States)

    Hlavnička, Jan; Čmejla, Roman; Tykalová, Tereza; Šonka, Karel; Růžička, Evžen; Rusz, Jan

    2017-02-02

    For generations, the evaluation of speech abnormalities in neurodegenerative disorders such as Parkinson's disease (PD) has been limited to perceptual tests or user-controlled laboratory analysis based upon rather small samples of human vocalizations. Our study introduces a fully automated method that yields significant features related to respiratory deficits, dysphonia, imprecise articulation and dysrhythmia from acoustic microphone data of natural connected speech for predicting early and distinctive patterns of neurodegeneration. We compared speech recordings of 50 subjects with rapid eye movement sleep behaviour disorder (RBD), 30 newly diagnosed, untreated PD patients and 50 healthy controls, and showed that subliminal parkinsonian speech deficits can be reliably captured even in RBD patients, who are at high risk of developing PD or other synucleinopathies. Thus, automated vocal analysis should soon be able to contribute to screening and diagnostic procedures for prodromal parkinsonian neurodegeneration in natural environments.
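
    To give a flavor of automated timing-feature extraction of this kind (an illustrative sketch only, not the authors' pipeline; the frame sizes, silence threshold, and feature choices are arbitrary assumptions), a crude energy-based pause analysis already yields dysrhythmia-style measures:

    import numpy as np

    def pause_features(x, sr, frame_ms=25, hop_ms=10, thresh_db=-35.0):
        """Energy-based pause analysis of a mono speech waveform x.
        Frames quieter than thresh_db relative to the loudest frame count
        as pauses; returns the paused fraction of time and the coefficient
        of variation of pause durations (a simple dysrhythmia-style cue)."""
        frame = int(sr * frame_ms / 1000)
        hop = int(sr * hop_ms / 1000)
        n = 1 + (len(x) - frame) // hop
        rms = np.array([np.sqrt(np.mean(x[i * hop:i * hop + frame] ** 2))
                        for i in range(n)])
        db = 20 * np.log10(rms / (rms.max() + 1e-12) + 1e-12)
        paused = db < thresh_db
        # Run-length encode the pause mask to measure each pause in seconds.
        edges = np.flatnonzero(np.diff(paused.astype(int)))
        bounds = np.concatenate(([0], edges + 1, [n]))
        durs = [(paused[a], (b - a) * hop / sr)
                for a, b in zip(bounds[:-1], bounds[1:])]
        pauses = np.array([d for is_pause, d in durs if is_pause])
        cv = pauses.std() / pauses.mean() if pauses.size > 1 else 0.0
        return paused.mean(), cv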

  5. A Diagnostic Marker to Discriminate Childhood Apraxia of Speech from Speech Delay: III. Theoretical Coherence of the Pause Marker with Speech Processing Deficits in Childhood Apraxia of Speech

    Science.gov (United States)

    Shriberg, Lawrence D.; Strand, Edythe A.; Fourakis, Marios; Jakielski, Kathy J.; Hall, Sheryl D.; Karlsson, Heather B.; Mabie, Heather L.; McSweeny, Jane L.; Tilkens, Christie M.; Wilson, David L.

    2017-01-01

    Purpose: Previous articles in this supplement described rationale for and development of the pause marker (PM), a diagnostic marker of childhood apraxia of speech (CAS), and studies supporting its validity and reliability. The present article assesses the theoretical coherence of the PM with speech processing deficits in CAS. Method: PM and other…

  6. Understanding perceptual boundaries in laparoscopic surgery.

    Science.gov (United States)

    Lamata, Pablo; Gomez, Enrique J; Hernández, Félix Lamata; Oltra Pastor, Alfonso; Sanchez-Margallo, Francisco Miquel; Del Pozo Guerrero, Francisco

    2008-03-01

    Human perceptual capabilities related to the laparoscopic interaction paradigm are not well known. Their study is important for the design of virtual reality simulators, and for the specification of augmented reality applications that overcome current limitations and provide supersensing to the surgeon. As part of this work, this article addresses the study of laparoscopic pulling forces. Two definitions are proposed to focus the problem: the perceptual fidelity boundary, the limit of human perceptual capabilities, and the utile fidelity boundary, which encapsulates the perceived aspects actually used by surgeons to guide an operation. The study is then aimed at defining the perceptual fidelity boundary of laparoscopic pulling forces. This is approached with an experimental design in which surgeons assess the resistance against pulling of four different tissues, which are characterized with both in vivo interaction forces and ex vivo tissue biomechanical properties. A logarithmic law of tissue consistency perception is found by comparing subjective ratings with objective parameters. A model of this perception is developed, identifying the main parameters: the degree of fixation of the organ, the tissue stiffness, the amount of tissue bitten, and the organ mass being pulled. These results constitute a clear requirements analysis for the force-feedback algorithm of a virtual reality laparoscopic simulator. Finally, some discussion is raised about the suitability of augmented reality applications around this surgical gesture.

  7. Talker-specific learning in amnesia: Insight into mechanisms of adaptive speech perception.

    Science.gov (United States)

    Trude, Alison M; Duff, Melissa C; Brown-Schmidt, Sarah

    2014-05-01

    A hallmark of human speech perception is the ability to comprehend speech quickly and effortlessly despite enormous variability across talkers. However, current theories of speech perception do not make specific claims about the memory mechanisms involved in this process. To examine whether declarative memory is necessary for talker-specific learning, we tested the ability of amnesic patients with severe declarative memory deficits to learn and distinguish the accents of two unfamiliar talkers by monitoring their eye-gaze as they followed spoken instructions. Analyses of the time-course of eye fixations showed that amnesic patients rapidly learned to distinguish these accents and tailored perceptual processes to the voice of each talker. These results demonstrate that declarative memory is not necessary for this ability and point to the involvement of non-declarative memory mechanisms. They are consistent with findings that other social and accommodative behaviors are preserved in amnesia, and contribute to our understanding of the interactions of multiple memory systems in the use and understanding of spoken language. Copyright © 2014 Elsevier Ltd. All rights reserved.

  8. Exploring "fringe" consciousness: the subjective experience of perceptual fluency and its objective bases.

    Science.gov (United States)

    Reber, Rolf; Wurtz, Pascal; Zimmermann, Thomas D

    2004-03-01

    Perceptual fluency is the subjective experience of ease with which an incoming stimulus is processed. Although perceptual fluency is assessed by speed of processing, it remains unclear how objective speed is related to subjective experiences of fluency. We present evidence that speed at different stages of the perceptual process contributes to perceptual fluency. In an experiment, figure-ground contrast influenced detection of briefly presented words, but not their identification at longer exposure durations. Conversely, font in which the word was written influenced identification, but not detection. Both contrast and font influenced subjective fluency. These findings suggest that speed of processing at different stages condensed into a unified subjective experience of perceptual fluency.

  9. Effect of subthalamic stimulation on voice and speech in Parkinson's disease: for the better or worse?

    Directory of Open Access Journals (Sweden)

    Sabine Skodda

    2014-01-01

    Background: Deep brain stimulation of the subthalamic nucleus, although highly effective for the treatment of motor impairment in Parkinson's disease, can induce speech deterioration in a subgroup of patients. The aim of the current study was to examine (1) whether there are distinctive stimulation effects on the different parameters of voice and speech, and (2) whether there is a special pattern of preexisting speech abnormalities indicating a risk of further worsening under stimulation. Methods: N = 38 patients with Parkinson's disease performed a speech test without medication, with stimulation ON and OFF. Speech samples were analysed (1) according to a four-dimensional perceptual speech score and (2) by acoustic analysis to obtain quantifiable measures of distinctive speech parameters. Results: Quality of voice was ameliorated with stimulation ON, and there were trends toward increased loudness and better pitch variability. N = 8 patients featured a deterioration of speech with stimulation ON, caused by worsening of articulation and/or fluency. These patients had more severe overall speech impairment, with characteristic features of articulatory slurring and articulatory acceleration already under the StimOFF condition. Conclusion: The influence of subthalamic stimulation on parkinsonian speech differs considerably between individual patients; however, there is a trend toward amelioration of voice quality and prosody. Patients with stimulation-associated speech deterioration featured higher overall speech impairment and showed a distinctive pattern of articulatory abnormalities at baseline. Further investigations to confirm these preliminary findings are necessary to allow neurologists to estimate pre-surgically the individual risk of speech deterioration under stimulation.

  10. Vocal effectiveness of speech-language pathology students: Before and after voice use during service delivery

    OpenAIRE

    Couch, Stephanie; Zieba, Dominique; van der Linde, Jeannie; van der Merwe, Anita

    2015-01-01

    Background: As a professional voice user, it is imperative that a speech-language pathologist's (SLP) vocal effectiveness remain consistent throughout the day. Many factors may contribute to reduced vocal effectiveness, including prolonged voice use, vocally abusive behaviours, poor vocal hygiene and environmental factors. Objectives: To determine the effect of service delivery on the perceptual and acoustic features of voice. Method: A quasi-experimental, pre-test–post-test research design...

  11. Perceptual organization and visual attention.

    Science.gov (United States)

    Kimchi, Ruth

    2009-01-01

    Perceptual organization--the processes structuring visual information into coherent units--and visual attention--the processes by which some visual information in a scene is selected--are crucial for the perception of our visual environment and to visuomotor behavior. Recent research points to important relations between attentional and organizational processes. Several studies demonstrated that perceptual organization constrains attentional selectivity, and other studies suggest that attention can also constrain perceptual organization. In this chapter I focus on two aspects of the relationship between perceptual organization and attention. The first addresses the question of whether or not perceptual organization can take place without attention. I present findings demonstrating that some forms of grouping and figure-ground segmentation can occur without attention, whereas others require controlled attentional processing, depending on the processes involved and the conditions prevailing for each process. These findings challenge the traditional view, which assumes that perceptual organization is a unitary entity that operates preattentively. The second issue addresses the question of whether perceptual organization can affect the automatic deployment of attention. I present findings showing that the mere organization of some elements in the visual field by Gestalt factors into a coherent perceptual unit (an "object"), with no abrupt onset or any other unique transient, can capture attention automatically in a stimulus-driven manner. Taken together, the findings discussed in this chapter demonstrate the multifaceted, interactive relations between perceptual organization and visual attention.

  12. Towards Unification of Methods for Speech, Audio, Picture and Multimedia Quality Assessment

    DEFF Research Database (Denmark)

    Zielinski, S.; Rumsey, F.; Bech, Søren

    2015-01-01

    The paper addresses the need to develop unified methods for subjective and objective quality assessment across speech, audio, picture, and multimedia applications. Commonalities and differences between the currently used standards are overviewed. Examples of the already undertaken research attempting to “bridge the gap” between the quality assessment methods used in various disciplines are indicated. Prospective challenges faced by researchers in the unification process are outlined. They include development of unified scales, defining unified anchors, and integration of objective models...

  13. One Way or Another: Evidence for Perceptual Asymmetry in Pre-attentive Learning of Non-native Contrasts

    Directory of Open Access Journals (Sweden)

    Liquan Liu

    2018-03-01

    Research investigating listeners’ neural sensitivity to speech sounds has largely focused on segmental features. We examined Australian English listeners’ perception and learning of a supra-segmental feature, pitch direction in a non-native tonal contrast, using a passive oddball paradigm and electroencephalography. The stimuli were two contours generated from naturally produced high-level and high-falling tones in Mandarin Chinese, differing only in pitch direction (Liu and Kager, 2014). While both contours had similar pitch onsets, the pitch offset of the falling contour was lower than that of the level one. The contrast was presented in two orientations (standard and deviant reversed) and tested in two blocks, with the order of block presentation counterbalanced. Mismatch negativity (MMN) responses showed that listeners discriminated the non-native tonal contrast only in the second block, reflecting indications of learning through exposure during the first block. In addition, listeners showed a later MMN peak for their second block of testing relative to listeners who did the same block first, suggesting linguistic (as opposed to acoustic) processing or a misapplication of perceptual strategies from the first to the second block. The results also showed a perceptual asymmetry for change in pitch direction: listeners who encountered a falling tone deviant in the first block had larger frontal MMN amplitudes than listeners who encountered a level tone deviant in the first block. The implications of our findings for second language speech and the developmental trajectory for tone perception are discussed.

  14. Perceptual and cognitive biases in individuals with body dysmorphic disorder symptoms.

    Science.gov (United States)

    Clerkin, Elise M; Teachman, Bethany A

    2008-01-01

    Given the extreme focus on perceived physical defects in body dysmorphic disorder (BDD), we expected that perceptual and cognitive biases related to physical appearance would be associated with BDD symptomatology. To examine these hypotheses, participants (N = 70) high and low in BDD symptoms completed tasks assessing visual perception and cognition. As expected, there were significant group differences in self-, but not other-, relevant cognitive biases. Perceptual bias results were mixed, with some evidence indicating that individuals high (versus low) in BDD symptoms literally see themselves in a less positive light. Further, individuals high in BDD symptoms failed to demonstrate a normative self-enhancement bias. Overall, this research points to the importance of assessing both cognitive and perceptual biases associated with BDD symptoms, and suggests that visual perception may be influenced by non-visual factors.

  15. Auditory-Visual Speech Perception in Three- and Four-Year-Olds and Its Relationship to Perceptual Attunement and Receptive Vocabulary

    Science.gov (United States)

    Erdener, Dogu; Burnham, Denis

    2018-01-01

    Despite the body of research on auditory-visual speech perception in infants and schoolchildren, development in the early childhood period remains relatively uncharted. In this study, English-speaking children between three and four years of age were investigated for: (i) the development of visual speech perception--lip-reading and visual…

  16. Variability and Intelligibility of Clarified Speech to Different Listener Groups

    Science.gov (United States)

    Silber, Ronnie F.

    Two studies examined the modifications that adult speakers make in speech to disadvantaged listeners. Previous research focusing on speech to deaf individuals and to young children has shown that adults clarify speech when addressing these two populations. Acoustic measurements suggest that the signal undergoes similar changes for both populations. Perceptual tests corroborate these results for the deaf population, but are nonsystematic in developmental studies. The differences in the findings for these populations and the nonsystematic results in the developmental literature may be due to methodological factors. The present experiments addressed these methodological questions. Studies of speech to hearing-impaired listeners have used read nonsense sentences, for which speakers received explicit clarification instructions and feedback, while in the child literature, excerpts of real-time conversations were used. Therefore, linguistic samples were not precisely matched. In this study, experiments used various linguistic materials. Experiment 1 used a children's story; experiment 2, nonsense sentences. Four mothers read both types of material in four ways: (1) in "normal" adult speech, (2) in "babytalk," (3) under the clarification instructions used in the hearing-impaired studies (instructed clear speech), and (4) in (spontaneous) clear speech without instruction. No extra practice or feedback was given. Sentences were presented to 40 normal-hearing college students with and without simultaneous masking noise. Results were tabulated separately for content and function words, and analyzed using standard statistical tests. The major finding in the study was individual variation in speaker intelligibility: "real world" speakers vary in their baseline intelligibility. The four speakers also showed unique patterns of intelligibility as a function of each independent variable. Results were as follows. Nonsense sentences were less intelligible than story sentences.

  17. Lexical effects on speech production and intelligibility in Parkinson's disease

    Science.gov (United States)

    Chiu, Yi-Fang

    recognition process. The relationship between acoustic results and perceptual accuracy is limited in this study, suggesting that listeners incorporate acoustic and non-acoustic information to maximize speech intelligibility.

  18. A Dynamic Speech Comprehension Test for Assessing Real-World Listening Ability.

    Science.gov (United States)

    Best, Virginia; Keidser, Gitte; Freeston, Katrina; Buchholz, Jörg M

    2016-07-01

    Many listeners with hearing loss report particular difficulties with multitalker communication situations, but these difficulties are not well predicted using current clinical and laboratory assessment tools. The overall aim of this work is to create new speech tests that capture key aspects of multitalker communication situations and ultimately provide better predictions of real-world communication abilities and the effect of hearing aids. A test of ongoing speech comprehension introduced previously was extended to include naturalistic conversations between multiple talkers as targets, and a reverberant background environment containing competing conversations. In this article, we describe the development of this test and present a validation study. Thirty listeners with normal hearing participated in this study. Speech comprehension was measured for one-, two-, and three-talker passages at three different signal-to-noise ratios (SNRs), and working memory ability was measured using the reading span test. Analyses were conducted to examine passage equivalence, learning effects, and test-retest reliability, and to characterize the effects of number of talkers and SNR. Although we observed differences in difficulty across passages, it was possible to group the passages into four equivalent sets. Using this grouping, we achieved good test-retest reliability and observed no significant learning effects. Comprehension performance was sensitive to the SNR but did not decrease as the number of talkers increased. Individual performance showed associations with age and reading span score. This new dynamic speech comprehension test appears to be valid and suitable for experimental purposes. Further work will explore its utility as a tool for predicting real-world communication ability and hearing aid benefit. American Academy of Audiology.

  19. Non-native Listeners’ Recognition of High-Variability Speech Using PRESTO

    Science.gov (United States)

    Tamati, Terrin N.; Pisoni, David B.

    2015-01-01

    Background: Natural variability in speech is a significant challenge to robust successful spoken word recognition. In everyday listening environments, listeners must quickly adapt and adjust to multiple sources of variability in both the signal and listening environments. High-variability speech may be particularly difficult to understand for non-native listeners, who have less experience with the second language (L2) phonological system and less detailed knowledge of sociolinguistic variation of the L2. Purpose: The purpose of this study was to investigate the effects of high-variability sentences on non-native speech recognition and to explore the underlying sources of individual differences in speech recognition abilities of non-native listeners. Research Design: Participants completed two sentence recognition tasks involving high-variability and low-variability sentences. They also completed a battery of behavioral tasks and self-report questionnaires designed to assess their indexical processing skills, vocabulary knowledge, and several core neurocognitive abilities. Study Sample: Native speakers of Mandarin (n = 25) living in the United States recruited from the Indiana University community participated in the current study. A native comparison group consisted of scores obtained from native speakers of English (n = 21) in the Indiana University community taken from an earlier study. Data Collection and Analysis: Speech recognition in high-variability listening conditions was assessed with a sentence recognition task using sentences from PRESTO (Perceptually Robust English Sentence Test Open-Set) mixed in 6-talker multitalker babble. Speech recognition in low-variability listening conditions was assessed using sentences from HINT (Hearing In Noise Test) mixed in 6-talker multitalker babble. Indexical processing skills were measured using a talker discrimination task, a gender discrimination task, and a forced-choice regional dialect categorization task. Vocabulary

  20. Selected acoustic characteristics of emerging esophageal speech: Case study

    Directory of Open Access Journals (Sweden)

    Glenn Binder

    1978-11-01

    The development of esophageal speech was examined in a laryngectomee subject to observe the emergence of selected acoustic characteristics and their relation to listener intelligibility ratings. Over a two-and-a-half-month period, the data from five recording sessions were used for spectrographic and perceptual (listener) analysis. There was evidence of a fairly reliable correlation between emerging acoustic characteristics and increasing perceptual ratings. Acoustic factors coincident with increased intelligibility ratings appeared related to two dimensions: first, increasing pseudoglottic control over esophageal air release; second, the presence of a mechanism of pharyngeal compression. Increased pseudoglottic control manifested in a reduction of tracheo-esophageal turbulence and a more efficient burping mode of vibration with clearer formant structure. Spectrographic evidence of a fundamental frequency did not emerge. These dimensions appeared to have potential diagnostic and therapeutic value, rendering an analysis of the patient's developing vocal performance more explicit for both clinician and patient.

  1. New tests of the distal speech rate effect: Examining cross-linguistic generalization

    Directory of Open Access Journals (Sweden)

    Laura Dilley

    2013-12-01

    Recent findings [Dilley and Pitt, 2010. Psych. Science. 21, 1664-1670] have shown that manipulating context speech rate in English can cause entire syllables to disappear or appear perceptually. The current studies tested two rate-based explanations of this phenomenon while attempting to replicate and extend these findings to another language, Russian. In Experiment 1, native Russian speakers listened to Russian sentences which had been subjected to rate manipulations and performed a lexical report task. Experiment 2 investigated speech rate effects in cross-language speech perception; non-native speakers of Russian of both high and low proficiency were tested on the same Russian sentences as in Experiment 1. They decided between two lexical interpretations of a critical portion of the sentence, where one choice contained more phonological material than the other (e.g., /stərʌ'na/ 'side' vs. /strʌ'na/ 'country'). In both experiments, with native and non-native speakers of Russian, context speech rate and the relative duration of the critical sentence portion were found to influence the amount of phonological material perceived. The results support the generalized rate normalization hypothesis, according to which the content perceived in a spectrally ambiguous stretch of speech depends on the duration of that content relative to the surrounding speech, while showing that the findings of Dilley and Pitt (2010) extend to a variety of morphosyntactic contexts and a new language, Russian. Findings indicate that relative timing cues across an utterance can be critical to accurate lexical perception by both native and non-native speakers.

  2. Automatic Speech Recognition Systems for the Evaluation of Voice and Speech Disorders in Head and Neck Cancer

    OpenAIRE

    Andreas Maier; Tino Haderlein; Florian Stelzle; Elmar Nöth; Emeka Nkenke; Frank Rosanowski; Anne Schützenberger; Maria Schuster

    2010-01-01

    In patients suffering from head and neck cancer, speech intelligibility is often restricted. For assessment and outcome measurements, automatic speech recognition systems have previously been shown to be appropriate for objective and quick evaluation of intelligibility. In this study we investigate the applicability of the method to speech disorders caused by head and neck cancer. Intelligibility was quantified by speech recognition on recordings of a standard text read by 41 German laryngectomees...

  3. Not all sounds sound the same: Parkinson's disease affects differently emotion processing in music and in speech prosody.

    Science.gov (United States)

    Lima, César F; Garrett, Carolina; Castro, São Luís

    2013-01-01

    Does emotion processing in music and speech prosody recruit common neurocognitive mechanisms? To examine this question, we implemented a cross-domain comparative design in Parkinson's disease (PD). Twenty-four patients and 25 controls performed emotion recognition tasks for music and spoken sentences. In music, patients had impaired recognition of happiness and peacefulness, and intact recognition of sadness and fear; this pattern was independent of general cognitive and perceptual abilities. In speech, patients had a small global impairment, which was significantly mediated by executive dysfunction. Hence, PD affected differently musical and prosodic emotions. This dissociation indicates that the mechanisms underlying the two domains are partly independent.

  4. Psychophysical indices of perceptual functioning in dyslexia: A psychometric analysis

    OpenAIRE

    Heath, Steve M.; Bishop, Dorothy V. M.; Hogben, John H.; Roach, Neil W.

    2006-01-01

    An influential causal theory attributes dyslexia to visual and/or auditory perceptual deficits. This theory derives from group differences between individuals with dyslexia and controls on a range of psychophysical tasks, but there is substantial variation, both between individuals within a group and from task to task. We addressed two questions. First, do psychophysical measures have sufficient reliability to assess perceptual deficits in individuals? Second, do different psychophysical task...

  5. A Preliminary Report on Disordered Speech with Deep Brain Stimulation in Individuals with Parkinson's Disease

    Directory of Open Access Journals (Sweden)

    Christopher Dromey

    2011-01-01

    Deep brain stimulation (DBS) of the subthalamic nucleus (STN) has proven effective in treating the major motor symptoms of advanced Parkinson's disease (PD). The aim of this study was to learn which laryngeal and articulatory acoustic features changed in patients who were reported to have worse speech with stimulation. Six volunteers with PD who had bilateral STN electrodes were recorded with DBS turned on or off. Perceptual ratings reflected poorer speech performance with DBS on. Acoustic measures of articulation (corner vowel formants, diphthong slopes, and a spirantization index) and phonation (perturbation, long-term average spectrum), as well as verbal fluency scores, showed mixed results with DBS. Some speakers improved while others became worse on individual measures. The magnitude of DBS effects was not predictable based on the patients' demographic characteristics. Future research involving adjustments to stimulator settings or electrode placement may be beneficial in limiting the negative effects of DBS on speech.

  6. Perceptual Processing Affects Conceptual Processing

    Science.gov (United States)

    van Dantzig, Saskia; Pecher, Diane; Zeelenberg, Rene; Barsalou, Lawrence W.

    2008-01-01

    According to the Perceptual Symbols Theory of cognition (Barsalou, 1999), modality-specific simulations underlie the representation of concepts. A strong prediction of this view is that perceptual processing affects conceptual processing. In this study, participants performed a perceptual detection task and a conceptual property-verification task…

  7. Relations between perceptual measures of temporal processing, auditory-evoked brainstem responses and speech intelligibility in noise

    DEFF Research Database (Denmark)

    Papakonstantinou, Alexandra; Strelcyk, Olaf; Dau, Torsten

    2011-01-01

    This study investigates behavioural and objective measures of temporal auditory processing and their relation to the ability to understand speech in noise. The experiments were carried out on a homogeneous group of seven hearing-impaired listeners with normal sensitivity at low frequencies (up to 1 kHz) and steeply sloping hearing losses above 1 kHz. For comparison, data were also collected for five normal-hearing listeners. Temporal processing was addressed at low frequencies by means of psychoacoustical frequency discrimination, binaural masked detection and amplitude modulation (AM) detection. In addition, auditory brainstem responses (ABRs) to clicks and broadband rising chirps were recorded. Furthermore, speech reception thresholds (SRTs) were determined for Danish sentences in speech-shaped noise. The main findings were: (1) SRTs were neither correlated with hearing sensitivity...

  8. Musically cued gait-training improves both perceptual and motor timing in Parkinson's disease

    Directory of Open Access Journals (Sweden)

    Charles-Etienne Benoit

    2014-07-01

    It is well established that auditory cueing improves gait in patients with idiopathic Parkinson's disease (IPD). Disease-related reductions in speed and step length can be improved by providing rhythmical auditory cues via a metronome or music. However, effects on cognitive aspects of motor control have yet to be thoroughly investigated. If synchronization of movement to an auditory cue relies on a supramodal timing system involved in perceptual, motor and sensorimotor integration, auditory cueing can be expected to affect both motor and perceptual timing. Here we tested this hypothesis by assessing perceptual and motor timing in 15 IPD patients before and after a four-week music training program with rhythmic auditory cueing. Long-term effects were assessed one month after the end of the training. Perceptual and motor timing was evaluated with the Battery for the Assessment of Auditory Sensorimotor and Timing Abilities (BAASTA) and compared to that of age-, gender-, and education-matched healthy controls. Prior to training, IPD patients exhibited impaired perceptual and motor timing. Training improved patients' performance in tasks requiring synchronization with isochronous sequences, and enhanced their ability to adapt to durational changes in a sequence in hand tapping tasks. Benefits of cueing extended to time perception (duration discrimination and detection of misaligned beats in musical excerpts). The current results demonstrate that auditory cueing leads to benefits beyond gait and support the idea that coupling gait to rhythmic auditory cues in IPD patients relies on a neuronal network engaged in both perceptual and motor timing.

  9. Developmental profile of speech-language and communicative functions in an individual with the preserved speech variant of Rett syndrome.

    Science.gov (United States)

    Marschik, Peter B; Vollmann, Ralf; Bartl-Pokorny, Katrin D; Green, Vanessa A; van der Meer, Larah; Wolin, Thomas; Einspieler, Christa

    2014-08-01

    We assessed various aspects of speech-language and communicative functions of an individual with the preserved speech variant of Rett syndrome (RTT) to describe her developmental profile over a period of 11 years. For this study, we incorporated the following data resources and methods to assess speech-language and communicative functions during pre-, peri- and post-regressional development: retrospective video analyses, medical history data, parental checklists and diaries, standardized tests on vocabulary and grammar, spontaneous speech samples and picture stories to elicit narrative competences. Despite achieving speech-language milestones, atypical behaviours were present at all times. We observed a unique developmental speech-language trajectory (including the RTT typical regression) affecting all linguistic and socio-communicative sub-domains in the receptive as well as the expressive modality. Future research should take into account a potentially considerable discordance between formal and functional language use and interpret communicative acts more cautiously.

  10. Behavioral and electrophysiological evidence for early and automatic detection of phonological equivalence in variable speech inputs.

    Science.gov (United States)

    Kharlamov, Viktor; Campbell, Kenneth; Kazanina, Nina

    2011-11-01

    Speech sounds are not always perceived in accordance with their acoustic-phonetic content. For example, an early and automatic process of perceptual repair, which ensures conformity of speech inputs to the listener's native language phonology, applies to individual input segments that do not exist in the native inventory or to sound sequences that are illicit according to the native phonotactic restrictions on sound co-occurrences. The present study with Russian and Canadian English speakers shows that listeners may perceive phonetically distinct and licit sound sequences as equivalent when the native language system provides robust evidence for mapping multiple phonetic forms onto a single phonological representation. In Russian, due to an optional but productive t-deletion process that affects /stn/ clusters, the surface forms [sn] and [stn] may be phonologically equivalent and map to a single phonological form /stn/. In contrast, [sn] and [stn] clusters are usually phonologically distinct in (Canadian) English. Behavioral data from identification and discrimination tasks indicated that [sn] and [stn] clusters were more confusable for Russian than for English speakers. The EEG experiment employed an oddball paradigm with nonwords [asna] and [astna] used as the standard and deviant stimuli. A reliable mismatch negativity response was elicited approximately 100 msec postchange in the English group but not in the Russian group. These findings point to a perceptual repair mechanism that is engaged automatically at a prelexical level to ensure immediate encoding of speech inputs in phonological terms, which in turn enables efficient access to the meaning of a spoken utterance.
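
    The oddball analysis described above reduces to averaging epochs per stimulus type and subtracting the resulting ERPs. A minimal NumPy sketch on simulated epochs (sampling rate, trial counts, and the analysis window are illustrative assumptions, not the study's parameters):

```python
import numpy as np

# Hypothetical epoched EEG: trials x samples at 500 Hz, epochs spanning
# -100 ms to +500 ms around the point where [sn] and [stn] diverge.
fs = 500
times = np.arange(-0.1, 0.5, 1 / fs)
rng = np.random.default_rng(0)
standard_epochs = rng.normal(0.0, 2.0, size=(400, times.size))  # e.g. [asna]
deviant_epochs = rng.normal(0.0, 2.0, size=(80, times.size))    # e.g. [astna]

# The mismatch negativity is the deviant-minus-standard difference wave.
difference_wave = deviant_epochs.mean(axis=0) - standard_epochs.mean(axis=0)

# Mean amplitude in a window around 100 ms post-change, as reported above.
window = (times >= 0.08) & (times <= 0.15)
print(f"MMN mean amplitude: {difference_wave[window].mean():.2f} µV")
```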

  11. Speech and nonspeech: What are we talking about?

    Science.gov (United States)

    Maas, Edwin

    2017-08-01

    Understanding of the behavioural, cognitive and neural underpinnings of speech production is of interest theoretically, and is important for understanding disorders of speech production and how to assess and treat such disorders in the clinic. This paper addresses two claims about the neuromotor control of speech production: (1) speech is subserved by a distinct, specialised motor control system and (2) speech is holistic and cannot be decomposed into smaller primitives. Both claims have gained traction in recent literature, and are central to a task-dependent model of speech motor control. The purpose of this paper is to stimulate thinking about speech production, its disorders and the clinical implications of these claims. The paper poses several conceptual and empirical challenges for these claims - including the critical importance of defining speech. The emerging conclusion is that a task-dependent model is called into question as its two central claims are founded on ill-defined and inconsistently applied concepts. The paper concludes with discussion of methodological and clinical implications, including the potential utility of diadochokinetic (DDK) tasks in assessment of motor speech disorders and the contraindication of nonspeech oral motor exercises to improve speech function.

  12. Empathy, Ways of Knowing, and Interdependence as Mediators of Gender Differences in Attitudes toward Hate Speech and Freedom of Speech

    Science.gov (United States)

    Cowan, Gloria; Khatchadourian, Desiree

    2003-01-01

    Women are more intolerant of hate speech than men. This study examined relationality measures as mediators of gender differences in the perception of the harm of hate speech and the importance of freedom of speech. Participants were 107 male and 123 female college students. Questionnaires assessed the perceived harm of hate speech, the importance…

  13. Acetylcholine and Olfactory Perceptual Learning

    Science.gov (United States)

    Wilson, Donald A.; Fletcher, Max L.; Sullivan, Regina M.

    2004-01-01

    Olfactory perceptual learning is a relatively long-term, learned increase in perceptual acuity, and has been described in both humans and animals. Data from recent electrophysiological studies have indicated that olfactory perceptual learning may be correlated with changes in odorant receptive fields of neurons in the olfactory bulb and piriform…

  14. Use of Speech Analyses within a Mobile Application for the Assessment of Cognitive Impairment in Elderly People.

    Science.gov (United States)

    Konig, Alexandra; Satt, Aharon; Sorin, Alex; Hoory, Ran; Derreumaux, Alexandre; David, Renaud; Robert, Phillippe H

    2018-01-01

    Various types of dementia and Mild Cognitive Impairment (MCI) are manifested as irregularities in human speech and language, which have proven to be strong predictors for the disease presence and progression. Therefore, automatic speech analytics provided by a mobile application may be a useful tool in providing additional indicators for assessment and detection of early stage dementia and MCI. 165 participants (subjects with subjective cognitive impairment (SCI), MCI patients, Alzheimer's disease (AD) and mixed dementia (MD) patients) were recorded with a mobile application while performing several short vocal cognitive tasks during a regular consultation. These tasks included verbal fluency, picture description, counting down and a free speech task. The voice recordings were processed in two steps: in the first step, vocal markers were extracted using speech signal processing techniques; in the second, the vocal markers were tested to assess their 'power' to distinguish between SCI, MCI, AD and MD. The second step included training automatic classifiers for detecting MCI and AD, based on machine learning methods, and testing the detection accuracy. The fluency and free speech tasks obtained the highest accuracy rates in classifying AD vs. MD vs. MCI vs. SCI. Using the data, we demonstrated classification accuracy as follows: SCI vs. AD = 92% accuracy; SCI vs. MD = 92% accuracy; SCI vs. MCI = 86% accuracy and MCI vs. AD = 86%. Our results indicate the potential value of vocal analytics and the use of a mobile application for accurate automatic differentiation between SCI, MCI and AD. This tool can provide the clinician with meaningful information for assessment and monitoring of people with MCI and AD based on a non-invasive, simple and low-cost method. Copyright © Bentham Science Publishers; for any queries, please email epub@benthamscience.org.
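
    The second processing step (testing how well the extracted vocal markers separate the diagnostic groups) amounts to cross-validated classification. A minimal scikit-learn sketch with random placeholder features; the marker count, classifier choice, and labels are assumptions rather than the authors' pipeline:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(165, 30))    # 165 speakers x 30 hypothetical vocal markers
y = rng.integers(0, 4, size=165)  # 0 = SCI, 1 = MCI, 2 = AD, 3 = MD

# Standardize the markers, then classify with a kernel SVM.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
scores = cross_val_score(clf, X, y, cv=5)
print(f"cross-validated accuracy: {scores.mean():.2f}")
```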

  15. Perceptual dimensions differentiate emotions.

    Science.gov (United States)

    Cavanaugh, Lisa A; MacInnis, Deborah J; Weiss, Allen M

    2015-08-26

    Individuals often describe objects in their world in terms of perceptual dimensions that span a variety of modalities: the visual (e.g., brightness: dark-bright), the auditory (e.g., loudness: quiet-loud), the gustatory (e.g., taste: sour-sweet), the tactile (e.g., hardness: soft-hard) and the kinaesthetic (e.g., speed: slow-fast). We ask whether individuals use perceptual dimensions to differentiate emotions from one another. Participants in two studies (one where respondents reported on abstract emotion concepts and a second where they reported on specific emotion episodes) rated the extent to which features anchoring 29 perceptual dimensions (e.g., temperature, texture and taste) are associated with 8 emotions (anger, fear, sadness, guilt, contentment, gratitude, pride and excitement). Results revealed that in both studies perceptual dimensions differentiate positive from negative emotions and high arousal from low arousal emotions. They also differentiate among emotions that are similar in arousal and valence (e.g., high arousal negative emotions such as anger and fear). Specific features that anchor particular perceptual dimensions (e.g., hot vs. cold) are also differentially associated with emotions.

  16. Speech Signal and Facial Image Processing for Obstructive Sleep Apnea Assessment

    Directory of Open Access Journals (Sweden)

    Fernando Espinoza-Cuadros

    2015-01-01

    Obstructive sleep apnea (OSA) is a common sleep disorder characterized by recurring breathing pauses during sleep caused by a blockage of the upper airway (UA). OSA is generally diagnosed through a costly procedure requiring an overnight stay of the patient at the hospital. This has led to proposing less costly procedures based on the analysis of patients’ facial images and voice recordings to help in OSA detection and severity assessment. In this paper we investigate the use of both image and speech processing to estimate the apnea-hypopnea index (AHI), which describes the severity of the condition, over a population of 285 male Spanish subjects suspected to suffer from OSA and referred to a Sleep Disorders Unit. Photographs and voice recordings were collected in a supervised but not highly controlled way, trying to test a scenario close to an OSA assessment application running on a mobile device (i.e., smartphones or tablets). Spectral information in speech utterances is modeled by a state-of-the-art low-dimensional acoustic representation, called i-vector. A set of local craniofacial features related to OSA are extracted from images after detecting facial landmarks using Active Appearance Models (AAMs). Support vector regression (SVR) is applied on facial features and i-vectors to estimate the AHI.
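
    The final step (estimating AHI from fused speech and image descriptors) is plain support vector regression. A sketch with simulated features; the dimensionalities, hyperparameters, and target distribution are illustrative guesses, not the paper's configuration:

```python
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(2)
n = 285
ivectors = rng.normal(size=(n, 100))  # speech: low-dimensional i-vectors
facial = rng.normal(size=(n, 20))     # image: local craniofacial features
X = np.hstack([ivectors, facial])     # simple feature-level fusion
ahi = rng.gamma(shape=2.0, scale=10.0, size=n)  # apnea-hypopnea index targets

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=1.0))
pred = cross_val_predict(model, X, ahi, cv=5)
print(f"correlation with true AHI: {np.corrcoef(pred, ahi)[0, 1]:.2f}")
```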

  17. Searching for sources of variance in speech recognition: Young adults with normal hearing

    Science.gov (United States)

    Watson, Charles S.; Kidd, Gary R.

    2005-04-01

    In the present investigation, sensory-perceptual abilities of one thousand young adults with normal hearing are being evaluated with a range of auditory, visual, and cognitive measures. Four auditory measures were derived from factor-analytic analyses of previous studies with 18-20 speech and non-speech variables [G. R. Kidd et al., J. Acoust. Soc. Am. 108, 2641 (2000)]. Two measures of visual acuity are obtained to determine whether variation in sensory skills tends to exist primarily within or across sensory modalities. A working memory test, grade point average, and Scholastic Aptitude Test scores (Verbal and Quantitative) are also included. Preliminary multivariate analyses support previous studies of individual differences in auditory abilities [e.g., A. M. Surprenant and C. S. Watson, J. Acoust. Soc. Am. 110, 2085-2095 (2001)], which found that spectral and temporal resolving power obtained with pure tones and more complex unfamiliar stimuli have little or no correlation with measures of speech recognition under difficult listening conditions. The current findings show that visual acuity, working memory, and intellectual measures are also very poor predictors of speech recognition ability, supporting the independence of this processing skill. Remarkable performance by some exceptional listeners will be described. [Work supported by the Office of Naval Research, Award No. N000140310644.]

  18. Effect of the Number of Presentations on Listener Transcriptions and Reliability in the Assessment of Speech Intelligibility in Children

    Science.gov (United States)

    Lagerberg, Tove B.; Johnels, Jakob Åsberg; Hartelius, Lena; Persson, Christina

    2015-01-01

    Background: The assessment of intelligibility is an essential part of establishing the severity of a speech disorder. The intelligibility of a speaker is affected by a number of different variables relating, "inter alia," to the speech material, the listener and the listener task. Aims: To explore the impact of the number of…

  19. Assessing Speech Intelligibility in Children with Hearing Loss: Toward Revitalizing a Valuable Clinical Tool

    Science.gov (United States)

    Ertmer, David J.

    2011-01-01

    Background: Newborn hearing screening, early intervention programs, and advancements in cochlear implant and hearing aid technology have greatly increased opportunities for children with hearing loss to become intelligible talkers. Optimizing speech intelligibility requires that progress be monitored closely. Although direct assessment of…

  20. An analysis of machine translation and speech synthesis in speech-to-speech translation system

    OpenAIRE

    Hashimoto, K.; Yamagishi, J.; Byrne, W.; King, S.; Tokuda, K.

    2011-01-01

    This paper provides an analysis of the impacts of machine translation and speech synthesis on speech-to-speech translation systems. The speech-to-speech translation system consists of three components: speech recognition, machine translation and speech synthesis. Many techniques for integration of speech recognition and machine translation have been proposed. However, speech synthesis has not yet been considered. Therefore, in this paper, we focus on machine translation and speech synthesis, ...

  1. Direct speech quotations promote low relative-clause attachment in silent reading of English.

    Science.gov (United States)

    Yao, Bo; Scheepers, Christoph

    2018-07-01

    The implicit prosody hypothesis (Fodor, 1998, 2002) proposes that silent reading coincides with a default, implicit form of prosody to facilitate sentence processing. Recent research demonstrated that a more vivid form of implicit prosody is mentally simulated during silent reading of direct speech quotations (e.g., Mary said, "This dress is beautiful"), with neural and behavioural consequences (e.g., Yao, Belin, & Scheepers, 2011; Yao & Scheepers, 2011). Here, we explored the relation between 'default' and 'simulated' implicit prosody in the context of relative-clause (RC) attachment in English. Apart from confirming a general low RC-attachment preference in both production (Experiment 1) and comprehension (Experiments 2 and 3), we found that during written sentence completion (Experiment 1) or when reading silently (Experiment 2), the low RC-attachment preference was reliably enhanced when the critical sentences were embedded in direct speech quotations as compared to indirect speech or narrative sentences. However, when reading aloud (Experiment 3), direct speech did not enhance the general low RC-attachment preference. The results from Experiments 1 and 2 suggest a quantitative boost to implicit prosody (via auditory perceptual simulation) during silent production/comprehension of direct speech. By contrast, when reading aloud (Experiment 3), prosody becomes equally salient across conditions due to its explicit nature; indirect speech and narrative sentences thus become as susceptible to prosody-induced syntactic biases as direct speech. The present findings suggest a shared cognitive basis between default implicit prosody and simulated implicit prosody, providing a new platform for studying the effects of implicit prosody on sentence processing. Copyright © 2018 Elsevier B.V. All rights reserved.

  2. Neuroanatomical and cognitive mediators of age-related differences in perceptual priming and learning

    OpenAIRE

    Kennedy, Kristen M.; Rodrigue, Karen M.; Head, Denise; Gunning-Dixon, Faith; Raz, Naftali

    2009-01-01

    Our objectives were to assess age differences in perceptual repetition priming and perceptual skill learning, and to determine whether they are mediated by cognitive resources and regional cerebral volume differences. Fragmented picture identification paradigm allows the study of both priming and learning within the same task. We presented this task to 169 adults (ages 18–80), assessed working memory and fluid intelligence, and measured brain volumes of regions that were deemed relevant to th...

  3. The Relationship Between Acoustic Signal Typing and Perceptual Evaluation of Tracheoesophageal Voice Quality for Sustained Vowels.

    Science.gov (United States)

    Clapham, Renee P; van As-Brooks, Corina J; van Son, Rob J J H; Hilgers, Frans J M; van den Brekel, Michiel W M

    2015-07-01

    To investigate the relationship between acoustic signal typing and perceptual evaluation of sustained vowels produced by tracheoesophageal (TE) speakers and the use of signal typing in the clinical setting. Two evaluators independently categorized 1.75-second segments of narrow-band spectrograms according to acoustic signal typing and independently evaluated the recording of the same segments on a visual analog scale according to overall perceptual acoustic voice quality. The relationship between acoustic signal typing and overall voice quality (as a continuous scale and as a four-point ordinal scale) was investigated and the proportion of inter-rater agreement as well as the reliability between the two measures is reported. The agreement between signal type (I-IV) and ordinal voice quality (four-point scale) was low but significant, and there was a significant linear relationship between the variables. Signal type correctly predicted less than half of the voice quality data. There was a significant main effect of signal type on continuous voice quality scores with significant differences in median quality scores between signal types I-IV, I-III, and I-II. Signal typing can be used as an adjunct to perceptual and acoustic evaluation of the same stimuli for TE speech as part of a multidimensional evaluation protocol. Signal typing in its current form provides limited predictive information on voice quality, and there is significant overlap between signal types II and III and perceptual categories. Future work should consider whether the current four signal types could be refined. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
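
    Agreement between an ordinal signal type and a four-point quality rating of the kind reported here is conventionally quantified with a weighted kappa plus a rank correlation. A small sketch on invented paired ratings (the values are placeholders, not the study's data):

```python
from scipy.stats import spearmanr
from sklearn.metrics import cohen_kappa_score

# Hypothetical paired ratings for the same vowel segments.
signal_type = [1, 1, 2, 2, 3, 3, 4, 4, 2, 3]    # acoustic signal typing (I-IV)
voice_quality = [1, 2, 2, 3, 3, 2, 4, 4, 3, 3]  # four-point perceptual scale

kappa = cohen_kappa_score(signal_type, voice_quality, weights="linear")
rho, p = spearmanr(signal_type, voice_quality)
print(f"weighted kappa = {kappa:.2f}, Spearman rho = {rho:.2f} (p = {p:.3f})")
```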

  4. Is Hand Selection Modulated by Cognitive-perceptual Load?

    Science.gov (United States)

    Liang, Jiali; Wilkinson, Krista; Sainburg, Robert L

    2018-01-15

    Previous studies proposed that selecting which hand to use for a reaching task appears to be modulated by a factor described as "task difficulty". However, what features of a task might contribute to greater or lesser "difficulty" in the context of hand selection decisions has yet to be determined. There has been evidence that biomechanical and kinematic factors such as movement smoothness and work can predict patterns of selection across the workspace, suggesting a role of predictive cost analysis in hand-selection. We hypothesize that this type of prediction for hand-selection should recruit substantial cognitive resources and thus should be influenced by cognitive-perceptual loading. We test this hypothesis by assessing the role of cognitive-perceptual loading on hand selection decisions, using a visual search task that presents different levels of difficulty (cognitive-perceptual load), as established in previous studies on overall response time and efficiency of visual search. Although the data are necessarily preliminary due to small sample size, our data suggested an influence of cognitive-perceptual load on hand selection, such that the dominant hand was selected more frequently as cognitive load increased. Interestingly, cognitive-perceptual loading also increased cross-midline reaches with both hands. Because crossing midline is more costly in terms of kinematic and kinetic factors, our findings suggest that cognitive processes are normally engaged to avoid costly actions, and that the choice not-to-cross midline requires cognitive resources. Copyright © 2017 IBRO. Published by Elsevier Ltd. All rights reserved.

  5. Development and evaluation of the British English coordinate response measure speech-in-noise test as an occupational hearing assessment tool.

    Science.gov (United States)

    Semeraro, Hannah D; Rowan, Daniel; van Besouw, Rachel M; Allsopp, Adrian A

    2017-10-01

    The studies described in this article outline the design and development of a British English version of the coordinate response measure (CRM) speech-in-noise (SiN) test. Our interest in the CRM is as a SiN test with high face validity for occupational auditory fitness for duty (AFFD) assessment. Study 1 used the method of constant stimuli to measure and adjust the psychometric functions of each target word, producing a speech corpus with equal intelligibility. After ensuring all the target words had similar intelligibility, for Studies 2 and 3, the CRM was presented in an adaptive procedure in stationary speech-spectrum noise to measure speech reception thresholds and evaluate the test-retest reliability of the CRM SiN test. Studies 1 (n = 20) and 2 (n = 30) were completed by normal-hearing civilians. Study 3 (n = 22) was completed by hearing-impaired military personnel. The results display good test-retest reliability (95% confidence interval (CI)) for listeners with hearing impairment. The British English CRM using stationary speech-spectrum noise is a "ready to use" SiN test, suitable for investigation as an AFFD assessment tool for military personnel.
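
    The adaptive procedure used in Studies 2 and 3 can be sketched as a simple one-up/one-down staircase on signal-to-noise ratio, which converges near the 50%-correct point of the psychometric function. Step size, trial count, and the simulated listener below are assumptions, not the study's exact protocol:

```python
import numpy as np

def measure_srt(run_trial, start_snr=0.0, step=2.0, n_trials=30):
    """One-up/one-down staircase: SNR drops after a correct response and
    rises after an error. The SRT is the mean SNR at the reversal points."""
    snr, reversals, last_correct = start_snr, [], None
    for _ in range(n_trials):
        correct = run_trial(snr)
        if last_correct is not None and correct != last_correct:
            reversals.append(snr)
        last_correct = correct
        snr += -step if correct else step
    return np.mean(reversals[-6:])  # average over the last reversals

# Toy listener whose psychometric function is centred at -6 dB SNR.
rng = np.random.default_rng(3)
listener = lambda snr: rng.random() < 1.0 / (1.0 + np.exp(-(snr + 6.0)))
print(f"estimated SRT: {measure_srt(listener):.1f} dB SNR")
```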

  6. Religion, hate speech, and non-domination

    OpenAIRE

    Bonotti, Matteo

    2017-01-01

    In this paper I argue that one way of explaining what is wrong with hate speech is by critically assessing what kind of freedom free speech involves and, relatedly, what kind of freedom hate speech undermines. More specifically, I argue that the main arguments for freedom of speech (e.g. from truth, from autonomy, and from democracy) rely on a “positive” conception of freedom intended as autonomy and self-mastery (Berlin, 2006), and can only partially help us to understand what is wrong with ...

  7. Relationship between the stuttering severity index and speech rate

    Directory of Open Access Journals (Sweden)

    Claudia Regina Furquim de Andrade

    CONTEXT: The speech rate is one of the parameters considered when investigating speech fluency and is an important variable in the assessment of individuals with communication complaints. OBJECTIVE: To correlate the stuttering severity index with one of the indices used for assessing fluency/speech rate. DESIGN: Cross-sectional study. SETTING: Fluency and Fluency Disorders Investigation Laboratory, Faculdade de Medicina da Universidade de São Paulo. PARTICIPANTS: Seventy adults with stuttering diagnosis. MAIN MEASUREMENTS: A speech sample from each participant containing at least 200 fluent syllables was videotaped and analyzed according to a stuttering severity index test and speech rate parameters. RESULTS: The results obtained in this study indicate that the stuttering severity and the speech rate present significant variation, i.e., the more severe the stuttering is, the lower the speech rate in words and syllables per minute. DISCUSSION AND CONCLUSION: The results suggest that speech rate is an important indicator of fluency levels and should be incorporated in the assessment and treatment of stuttering. This study represents a first attempt to identify the possible subtypes of developmental stuttering. DEFINITION: Objective tests that quantify diseases are important in their diagnosis, treatment and prognosis.
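
    Once per-participant severity scores and speech rates exist, the correlation itself is a one-liner. A sketch with invented numbers (the scores and rates are placeholders, not the study's measurements):

```python
from scipy.stats import spearmanr

# Hypothetical per-participant stuttering severity scores and speech
# rates in syllables per minute from 200-syllable samples.
severity = [12, 18, 24, 31, 36, 15, 27, 22, 34, 20]
syllables_per_min = [210, 190, 160, 130, 110, 200, 150, 170, 120, 180]

rho, p = spearmanr(severity, syllables_per_min)
print(f"rho = {rho:.2f}, p = {p:.3f}")  # expect a negative correlation
```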

  8. Effects of prior information on decoding degraded speech: an fMRI study.

    Science.gov (United States)

    Clos, Mareike; Langner, Robert; Meyer, Martin; Oechslin, Mathias S; Zilles, Karl; Eickhoff, Simon B

    2014-01-01

    Expectations and prior knowledge are thought to support the perceptual analysis of incoming sensory stimuli, as proposed by the predictive-coding framework. The current fMRI study investigated the effect of prior information on brain activity during the decoding of degraded speech stimuli. When prior information enabled the comprehension of the degraded sentences, the left middle temporal gyrus and the left angular gyrus were activated, highlighting a role of these areas in meaning extraction. In contrast, the activation of the left inferior frontal gyrus (area 44/45) appeared to reflect the search for meaningful information in degraded speech material that could not be decoded because of mismatches with the prior information. Our results show that degraded sentences evoke instantaneously different percepts and activation patterns depending on the type of prior information, in line with prediction-based accounts of perception. Copyright © 2012 Wiley Periodicals, Inc.

  9. Exploring the link between cognitive abilities and speech recognition in the elderly under different listening conditions

    DEFF Research Database (Denmark)

    Nuesse, Theresa; Steenken, Rike; Neher, Tobias

    2018-01-01

    ...and it has been suggested that differences in cognitive abilities may also be important. The objective of this study was to investigate associations between performance in cognitive tasks and speech recognition under different listening conditions in older adults with either age-appropriate hearing or hearing impairment. To that end, speech recognition threshold (SRT) measurements were performed under several masking conditions that varied along the perceptual dimensions of dip listening, spatial separation, and informational masking. In addition, a neuropsychological test battery was administered, which included measures of verbal working- and short-term memory, executive functioning, selective and divided attention, and lexical and semantic abilities. Age-matched groups of older adults with either age-appropriate hearing (ENH, N = 20) or aided hearing impairment (EHI, N = 21) participated...

  10. Speech profile of patients undergoing primary palatoplasty.

    Science.gov (United States)

    Menegueti, Katia Ignacio; Mangilli, Laura Davison; Alonso, Nivaldo; Andrade, Claudia Regina Furquim de

    2017-10-26

    To characterize the profile and speech characteristics of patients undergoing primary palatoplasty in a Brazilian university hospital, considering the time of intervention (early, before two years of age; late, after two years of age). Participants were 97 patients of both genders with cleft palate and/or cleft lip and palate, assigned to the Speech-language Pathology Department, who had been submitted to primary palatoplasty and presented no prior history of speech-language therapy. Patients were divided into two groups: early intervention group (EIG) - 43 patients undergoing primary palatoplasty before 2 years of age and late intervention group (LIG) - 54 patients undergoing primary palatoplasty after 2 years of age. All patients underwent speech-language pathology assessment. The following parameters were assessed: resonance classification, presence of nasal turbulence, presence of weak intraoral air pressure, presence of audible nasal air emission, speech understandability, and compensatory articulation disorder (CAD). At a statistical significance level of 5% (p≤0.05), no significant difference was observed between the groups in the following parameters: resonance classification (p=0.067); level of hypernasality (p=0.113), presence of nasal turbulence (p=0.179); presence of weak intraoral air pressure (p=0.152); presence of nasal air emission (p=0.369), and speech understandability (p=0.113). The groups differed with respect to presence of compensatory articulation disorders (p=0.020), with the LIG presenting higher occurrence of altered phonemes. It was possible to assess the general profile and speech characteristics of the study participants. Patients submitted to early primary palatoplasty present a better speech profile.

  11. Primary progressive apraxia of speech: clinical features and acoustic and neurologic correlates.

    Science.gov (United States)

    Duffy, Joseph R; Strand, Edythe A; Clark, Heather; Machulda, Mary; Whitwell, Jennifer L; Josephs, Keith A

    2015-05-01

    This study summarizes 2 illustrative cases of a neurodegenerative speech disorder, primary progressive apraxia of speech (AOS), as a vehicle for providing an overview of the disorder and an approach to describing and quantifying its perceptual features and some of its temporal acoustic attributes. Two individuals with primary progressive AOS underwent speech-language and neurologic evaluations on 2 occasions, ranging from 2.0 to 7.5 years postonset. Performance on several tests, tasks, and rating scales, as well as several acoustic measures, were compared over time within and between cases. Acoustic measures were compared with performance of control speakers. Both patients initially presented with AOS as the only or predominant sign of disease and without aphasia or dysarthria. The presenting features and temporal progression were captured in an AOS Rating Scale, an Articulation Error Score, and temporal acoustic measures of utterance duration, syllable rates per second, rates of speechlike alternating motion and sequential motion, and a pairwise variability index measure. AOS can be the predominant manifestation of neurodegenerative disease. Clinical ratings of its attributes and acoustic measures of some of its temporal characteristics can support its diagnosis and help quantify its salient characteristics and progression over time.
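
    One of the temporal measures named above, the pairwise variability index, has a standard normalized form: nPVI = 100/(m−1) · Σ|d_k − d_(k+1)| / ((d_k + d_(k+1))/2) over m successive durations d_k. A small sketch (the duration values are illustrative only):

```python
import numpy as np

def npvi(durations):
    """Normalized pairwise variability index over successive durations,
    e.g. vowel or syllable durations in seconds."""
    d = np.asarray(durations, dtype=float)
    pairs = np.abs(np.diff(d)) / ((d[:-1] + d[1:]) / 2.0)
    return 100.0 * pairs.mean()

# Near-equal durations give a small nPVI; AOS-like syllable segregation
# tends to equalize durations and so lowers the index.
print(f"{npvi([0.18, 0.19, 0.18, 0.20]):.1f}")  # near-isochronous -> small
print(f"{npvi([0.12, 0.30, 0.10, 0.28]):.1f}")  # long-short alternation -> large
```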

  12. Exploring the role of brain oscillations in speech perception in noise: Intelligibility of isochronously retimed speech

    Directory of Open Access Journals (Sweden)

    Vincent Aubanel

    2016-08-01

    A growing body of evidence shows that brain oscillations track speech. This mechanism is thought to maximise processing efficiency by allocating resources to important speech information, effectively parsing speech into units of appropriate granularity for further decoding. However, some aspects of this mechanism remain unclear. First, while periodicity is an intrinsic property of this physiological mechanism, speech is only quasi-periodic, so it is not clear whether periodicity would present an advantage in processing. Second, it is still a matter of debate which aspect of speech triggers or maintains cortical entrainment, from bottom-up cues such as fluctuations of the amplitude envelope of speech to higher level linguistic cues such as syntactic structure. We present data from a behavioural experiment assessing the effect of isochronous retiming of speech on speech perception in noise. Two types of anchor points were defined for retiming speech, namely syllable onsets and amplitude envelope peaks. For each anchor point type, retiming was implemented at two hierarchical levels, a slow time scale around 2.5 Hz and a fast time scale around 4 Hz. Results show that while any temporal distortion resulted in reduced speech intelligibility, isochronous speech anchored to P-centers (approximated by stressed syllable vowel onsets) was significantly more intelligible than a matched anisochronous retiming, suggesting a facilitative role of periodicity defined on linguistically motivated units in processing speech in noise.
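
    The retiming manipulation can be approximated by stretching each inter-anchor segment so that anchors fall on a strictly periodic grid. A rough librosa-based sketch, assuming librosa's onset detector stands in for the study's syllable-onset anchors and passing very short segments through unstretched:

```python
import numpy as np
import librosa

def retime_isochronous(y, sr, anchors, rate_hz=2.5):
    """Stretch each inter-anchor segment onto a periodic grid spaced
    1/rate_hz apart; anchors are times in seconds (e.g. syllable onsets)."""
    period = 1.0 / rate_hz
    out = []
    for a, b in zip(anchors[:-1], anchors[1:]):
        seg = y[int(a * sr):int(b * sr)]
        if len(seg) < 2048:  # too short for the phase vocoder; keep as-is
            out.append(seg)
        else:
            # rate > 1 compresses, rate < 1 expands the segment to one period
            out.append(librosa.effects.time_stretch(seg, rate=(b - a) / period))
    return np.concatenate(out)

y, sr = librosa.load(librosa.ex("libri1"), sr=16000)  # any speech recording
onsets = librosa.onset.onset_detect(y=y, sr=sr, units="time")
iso = retime_isochronous(y, sr, onsets, rate_hz=4.0)
```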

  13. Tridimensional assessment of adductor spasmodic dysphonia pre- and post-treatment with Botulinum toxin.

    Science.gov (United States)

    Dejonckere, P H; Neumann, K J; Moerman, M B J; Martens, J P; Giordano, A; Manfredi, C

    2012-04-01

    Spasmodic dysphonia voices, like substitution voices, form a particular category of dysphonia that is not well suited to a standardized basic multidimensional assessment protocol, like the one proposed by the European Laryngological Society. Thirty-three exhaustive analyses were performed on voices of 19 patients diagnosed with adductor spasmodic dysphonia (SD), before and after treatment with Botulinum toxin. The speech material consisted of 40 short sentences phonetically selected for constant voicing. Seven perceptual parameters (traditional and dedicated) were blindly rated by a panel of experienced clinicians. Nine acoustic measures (mainly based on voicing evidence and periodicity) were obtained with a special analysis program suited for strongly irregular signals and validated with synthesized deviant voices. Patients also filled in a VHI questionnaire. Significant improvement is shown by all three approaches. The traditional GRB perceptual parameters appear to be adequate for these patients. Conversely, the special acoustic analysis program is successful in objectively quantifying the improved regularity of vocal fold vibration: the basic jitter remains the most valuable parameter, when reliably quantified. The VHI is well suited for assessing voice-related quality of life. Nevertheless, when considering pre-therapy and post-therapy changes, the current study illustrates a complete lack of correlation between the perceptual, acoustic, and self-assessment dimensions. Assessment of SD voices needs to be tridimensional.

  14. Perceptual-cognitive skill and the in situ performance of soccer players.

    Science.gov (United States)

    van Maarseveen, Mariëtte J J; Oudejans, Raôul R D; Mann, David L; Savelsbergh, Geert J P

    2018-02-01

    Many studies have shown that experts possess better perceptual-cognitive skills than novices (e.g., in anticipation, decision making, pattern recall), but it remains unclear whether a relationship exists between performance on those tests of perceptual-cognitive skill and actual on-field performance. In this study, we assessed the in situ performance of skilled soccer players and related the outcomes to measures of anticipation, decision making, and pattern recall. In addition, we examined gaze behaviour when performing the perceptual-cognitive tests to better understand whether the underlying processes were related when those perceptual-cognitive tasks were performed. The results revealed that on-field performance could not be predicted on the basis of performance on the perceptual-cognitive tests. Moreover, there were no strong correlations between the level of performance on the different tests. The analysis of gaze behaviour revealed differences in search rate, fixation duration, fixation order, gaze entropy, and percentage viewing time when performing the test of pattern recall, suggesting that it is driven by different processes to those used for anticipation and decision making. Altogether, the results suggest that the perceptual-cognitive tests may not be as strong determinants of actual performance as may have previously been assumed.

  15. Categorical speech processing in Broca's area: an fMRI study using multivariate pattern-based analysis.

    Science.gov (United States)

    Lee, Yune-Sang; Turkeltaub, Peter; Granger, Richard; Raizada, Rajeev D S

    2012-03-14

    Although much effort has been directed toward understanding the neural basis of speech processing, the neural processes involved in the categorical perception of speech have been relatively less studied, and many questions remain open. In this functional magnetic resonance imaging (fMRI) study, we probed the cortical regions mediating categorical speech perception using an advanced brain-mapping technique, whole-brain multivariate pattern-based analysis (MVPA). Normal healthy human subjects (native English speakers) were scanned while they listened to 10 consonant-vowel syllables along the /ba/-/da/ continuum. Outside of the scanner, individuals' own category boundaries were measured to divide the fMRI data into /ba/ and /da/ conditions per subject. The whole-brain MVPA revealed that Broca's area and the left pre-supplementary motor area evoked distinct neural activity patterns between the two perceptual categories (/ba/ vs /da/). Broca's area was also found when the same analysis was applied to another dataset (Raizada and Poldrack, 2007), which previously yielded the supramarginal gyrus using a univariate adaptation-fMRI paradigm. The consistent MVPA findings from two independent datasets strongly indicate that Broca's area participates in categorical speech perception, with a possible role of translating speech signals into articulatory codes. The difference in results between univariate and multivariate pattern-based analyses of the same data suggest that processes in different cortical areas along the dorsal speech perception stream are distributed on different spatial scales.
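
    At the level of a single region, the pattern-based analysis boils down to cross-validated decoding of the perceived category from multi-voxel activity. A generic scikit-learn sketch on simulated patterns (trial and voxel counts are assumptions; the study's whole-brain searchlight and preprocessing are omitted):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
# Hypothetical single-trial activity patterns from one region of interest
# (e.g., Broca's area): trials x voxels, labelled by the perceived category.
X = rng.normal(size=(200, 500))
y = rng.integers(0, 2, size=200)  # 0 = /ba/, 1 = /da/

decoder = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
acc = cross_val_score(decoder, X, y, cv=5).mean()
print(f"decoding accuracy: {acc:.2f} (chance = 0.50)")
```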

  16. Integrated approaches to perceptual learning.

    Science.gov (United States)

    Jacobs, Robert A

    2010-04-01

    New technologies and new ways of thinking have recently led to rapid expansions in the study of perceptual learning. We describe three themes shared by many of the nine articles included in this topic on Integrated Approaches to Perceptual Learning. First, perceptual learning cannot be studied on its own because it is closely linked to other aspects of cognition, such as attention, working memory, decision making, and conceptual knowledge. Second, perceptual learning is sensitive to both the stimulus properties of the environment in which an observer exists and to the properties of the tasks that the observer needs to perform. Moreover, the environmental and task properties can be characterized through their statistical regularities. Finally, the study of perceptual learning has important implications for society, including implications for science education and medical rehabilitation. Contributed articles relevant to each theme are summarized. Copyright © 2010 Cognitive Science Society, Inc.

  17. Voice parameters and videonasolaryngoscopy in children with vocal nodules: a longitudinal study, before and after voice therapy.

    Science.gov (United States)

    Valadez, Victor; Ysunza, Antonio; Ocharan-Hernandez, Esther; Garrido-Bustamante, Norma; Sanchez-Valerio, Araceli; Pamplona, Ma C

    2012-09-01

    Vocal Nodules (VN) are a functional voice disorder associated with voice misuse and abuse in children. There are few reports addressing vocal parameters in children with VN, especially after a period of vocal rehabilitation. The purpose of this study is to describe measurements of vocal parameters including Fundamental Frequency (FF), Shimmer (S), and Jitter (J), videonasolaryngoscopy examination and clinical perceptual assessment, before and after voice therapy in children with VN. Voice therapy was provided using visual support through Speech-Viewer software. Twenty patients with VN were studied. An acoustical analysis of voice was performed and compared with data from subjects from a control group matched by age and gender. Also, clinical perceptual assessment of voice and videonasolaryngoscopy were performed for all patients with VN. After a period of voice therapy, provided with visual support using Speech Viewer-III (SV-III-IBM) software, new acoustical analyses, perceptual assessments and videonasolaryngoscopies were performed. Before the onset of voice therapy, there was a significant difference (p < 0.05) in vocal parameters between patients and controls. After the voice therapy period, a significant improvement (p < 0.05) was observed, and vocal nodules were no longer discernible on the vocal folds in any of the cases. SV-III software seems to be a safe and reliable method for providing voice therapy in children with VN. Acoustic voice parameters, perceptual data and videonasolaryngoscopy were significantly improved after the speech therapy period was completed. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
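
    Fundamental frequency, jitter, and shimmer of the kind measured here are conventionally extracted with Praat. A short sketch using the parselmouth Python bindings with Praat's standard parameter defaults (the filename is a placeholder):

```python
import parselmouth
from parselmouth.praat import call

snd = parselmouth.Sound("sustained_a.wav")  # hypothetical recording

pitch = snd.to_pitch()
f0_mean = call(pitch, "Get mean", 0, 0, "Hertz")

# Praat's standard local jitter/shimmer over detected glottal pulses.
pulses = call(snd, "To PointProcess (periodic, cc)", 75, 500)
jitter = call(pulses, "Get jitter (local)", 0, 0, 0.0001, 0.02, 1.3)
shimmer = call([snd, pulses], "Get shimmer (local)",
               0, 0, 0.0001, 0.02, 1.3, 1.6)

print(f"F0 = {f0_mean:.1f} Hz, jitter = {jitter:.4f}, shimmer = {shimmer:.4f}")
```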

  18. Audiovisual speech facilitates voice learning.

    Science.gov (United States)

    Sheffert, Sonya M; Olson, Elizabeth

    2004-02-01

    In this research, we investigated the effects of voice and face information on the perceptual learning of talkers and on long-term memory for spoken words. In the first phase, listeners were trained over several days to identify voices from words presented auditorily or audiovisually. The training data showed that visual information about speakers enhanced voice learning, revealing cross-modal connections in talker processing akin to those observed in speech processing. In the second phase, the listeners completed an auditory or audiovisual word recognition memory test in which equal numbers of words were spoken by familiar and unfamiliar talkers. The data showed that words presented by familiar talkers were more likely to be retrieved from episodic memory, regardless of modality. Together, these findings provide new information about the representational code underlying familiar talker recognition and the role of stimulus familiarity in episodic word recognition.

  19. State of the art in perceptual design of hearing aids

    Science.gov (United States)

    Edwards, Brent W.; van Tasell, Dianne J.

    2002-05-01

    Hearing aid capabilities have increased dramatically over the past six years, in large part due to the development of small, low-power digital signal processing chips suitable for hearing aid applications. As hearing aid signal processing capabilities increase, there will be new opportunities to apply perceptually based knowledge to technological development. Most hearing loss compensation techniques in today's hearing aids are based on simple estimates of audibility and loudness. As our understanding of the psychoacoustical and physiological characteristics of sensorineural hearing loss improves, the result should be improved design of hearing aids and fitting methods. The state of the art in hearing aids will be reviewed, including form factors, user requirements, and technology that improves speech intelligibility, sound quality, and functionality. General areas of auditory perception that remain unaddressed by current hearing aid technology will be discussed.

  20. Neural Entrainment to Speech Modulates Speech Intelligibility

    NARCIS (Netherlands)

    Riecke, Lars; Formisano, Elia; Sorger, Bettina; Baskent, Deniz; Gaudrain, Etienne

    2018-01-01

    Speech is crucial for communication in everyday life. Speech-brain entrainment, the alignment of neural activity to the slow temporal fluctuations (envelope) of acoustic speech input, is a ubiquitous element of current theories of speech processing. Associations between speech-brain entrainment and

  1. Large Scale Functional Brain Networks Underlying Temporal Integration of Audio-Visual Speech Perception: An EEG Study.

    Science.gov (United States)

    Kumar, G Vinodh; Halder, Tamesh; Jaiswal, Amit K; Mukherjee, Abhishek; Roy, Dipanjan; Banerjee, Arpan

    2016-01-01

    Observable lip movements of the speaker influence perception of auditory speech. A classical example of this influence is reported by listeners who perceive an illusory (cross-modal) speech sound (McGurk-effect) when presented with incongruent audio-visual (AV) speech stimuli. Recent neuroimaging studies of AV speech perception accentuate the role of frontal, parietal, and the integrative brain sites in the vicinity of the superior temporal sulcus (STS) for multisensory speech perception. However, if and how does the network across the whole brain participates during multisensory perception processing remains an open question. We posit that a large-scale functional connectivity among the neural population situated in distributed brain sites may provide valuable insights involved in processing and fusing of AV speech. Varying the psychophysical parameters in tandem with electroencephalogram (EEG) recordings, we exploited the trial-by-trial perceptual variability of incongruent audio-visual (AV) speech stimuli to identify the characteristics of the large-scale cortical network that facilitates multisensory perception during synchronous and asynchronous AV speech. We evaluated the spectral landscape of EEG signals during multisensory speech perception at varying AV lags. Functional connectivity dynamics for all sensor pairs was computed using the time-frequency global coherence, the vector sum of pairwise coherence changes over time. During synchronous AV speech, we observed enhanced global gamma-band coherence and decreased alpha and beta-band coherence underlying cross-modal (illusory) perception compared to unisensory perception around a temporal window of 300-600 ms following onset of stimuli. During asynchronous speech stimuli, a global broadband coherence was observed during cross-modal perception at earlier times along with pre-stimulus decreases of lower frequency power, e.g., alpha rhythms for positive AV lags and theta rhythms for negative AV lags. Thus, our
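
    A simplified stand-in for the coherence analysis is the average magnitude-squared coherence over all sensor pairs within a frequency band. A SciPy sketch on simulated data (sensor count, duration, and band edges are illustrative; the study's time-resolved global coherence is more elaborate):

```python
import numpy as np
from itertools import combinations
from scipy.signal import coherence

def band_global_coherence(eeg, fs, band):
    """Mean magnitude-squared coherence across all sensor pairs,
    averaged over the frequencies inside `band` (lo, hi) in Hz."""
    lo, hi = band
    vals = []
    for i, j in combinations(range(eeg.shape[0]), 2):
        f, cxy = coherence(eeg[i], eeg[j], fs=fs, nperseg=fs)
        vals.append(cxy[(f >= lo) & (f <= hi)].mean())
    return float(np.mean(vals))

rng = np.random.default_rng(5)
eeg = rng.normal(size=(8, 4 * 256))  # 8 sensors, 4 s at 256 Hz
print(f"gamma-band global coherence: "
      f"{band_global_coherence(eeg, 256, (30, 60)):.3f}")
```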

  2. Simulation of Human Speech Production Applied to the Study and Synthesis of European Portuguese

    Directory of Open Access Journals (Sweden)

    Francisco A. C. Vaz

    2005-06-01

    A new articulatory synthesizer (SAPWindows), with a modular and flexible design, is described. A comprehensive acoustic model and a new interactive glottal source were implemented. Perceptual tests and simulations made possible by the synthesizer contributed to deepening our knowledge of one of the most important characteristics of European Portuguese, the nasal vowels. First attempts at incorporating models of frication into the articulatory synthesizer are presented, demonstrating the potential of performing fricative synthesis based on broad articulatory configurations. Synthesis of nonsense words and Portuguese words with vowels and nasal consonants is also shown. Despite not being capable of competing with mainstream concatenative speech synthesis, the anthropomorphic approach to speech synthesis, known as articulatory synthesis, proved to be a valuable tool for phonetics research and teaching. This was particularly true for the European Portuguese nasal vowels.

  3. Studying Real-World Perceptual Expertise

    Directory of Open Access Journals (Sweden)

    Jianhong eShen

    2014-08-01

    Significant insights into visual cognition have come from studying real-world perceptual expertise. Many have previously reviewed empirical findings and theoretical developments from this work. Here we instead provide a brief perspective on approaches, considerations, and challenges to studying real-world perceptual expertise. We discuss factors like choosing to use real-world versus artificial object domains of expertise, selecting a target domain of real-world perceptual expertise, recruiting experts, evaluating their level of expertise, and experimentally testing experts in the lab and online. Throughout our perspective, we highlight expert birding (also called birdwatching) as an example, as it has been used as a target domain for over two decades in the perceptual expertise literature.

  4. Longitudinal Study of Speech Perception, Speech, and Language for Children with Hearing Loss in an Auditory-Verbal Therapy Program

    Science.gov (United States)

    Dornan, Dimity; Hickson, Louise; Murdoch, Bruce; Houston, Todd

    2009-01-01

    This study examined the speech perception, speech, and language developmental progress of 25 children with hearing loss (mean Pure-Tone Average [PTA] 79.37 dB HL) in an auditory verbal therapy program. Children were tested initially and then 21 months later on a battery of assessments. The speech and language results over time were compared with…

  5. Performance Study of Objective Speech Quality Measurement for Modern Wireless-VoIP Communications

    Directory of Open Access Journals (Sweden)

    Chan Wai-Yip

    2009-01-01

    Wireless-VoIP communications introduce perceptual degradations that are not present with traditional VoIP communications. This paper investigates the effects of such degradations on the performance of three state-of-the-art standard objective quality measurement algorithms—PESQ, P.563, and an "extended" E-model. The comparative study suggests that measurement performance is significantly affected by acoustic background noise type and level as well as by speech codec and packet loss concealment strategy. On our data, PESQ attains superior overall performance, while P.563 and the E-model attain comparable performance figures.
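
    Of the three algorithms, PESQ is the easiest to reproduce: a clean reference and a degraded version of the same utterance yield a MOS-LQO score. A sketch using the open-source `pesq` package (filenames are placeholders; this is not necessarily the authors' toolchain):

```python
from scipy.io import wavfile
from pesq import pesq  # pip install pesq (ITU-T P.862 implementation)

fs, ref = wavfile.read("reference.wav")  # clean reference speech
_, deg = wavfile.read("degraded.wav")    # same utterance after codec/loss

# 'wb' = wideband P.862.2 (fs must be 16000); use 'nb' for 8-kHz narrowband.
score = pesq(fs, ref, deg, "wb")
print(f"PESQ MOS-LQO: {score:.2f}")
```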

  6. SCANDCLEFT RANDOMIZED TRIALS: SPEECH OUTCOMES IN 5-YEAR-OLDS WITH UCLP - velopharyngeal competency and hypernasality

    DEFF Research Database (Denmark)

    Lohmander, Anette; Persson, Christina; Willadsen, Elisabeth

    2017-01-01

    Background and aim: Adequate velopharyngeal function and speech are main goals in the treatment of cleft palate. The objective was to investigate if there were differences in velopharyngeal competency (VPC) and hypernasality at age 5 years in children with unilateral cleft lip and palate (UCLP)... Data from 391 children (136 girls, 255 boys) were available and perceptually analysed. The main outcome measures were VPC and hypernasality from blinded assessments. Results: There were no statistically significant differences between the prevalences in the arms in any of the trials. VPC: Trial 1, A: 58%, B: 61%; Trial 2, A: 57%, C: 54%; Trial 3, A: 35%, D: 51%. No hypernasality: Trial 1, A: 54%, B: 44%; Trial 2, A: 47%, C: 51%; Trial 3, A: 34%, D: 49%. Conclusions: No differences were found regarding VPC and hypernasality at age 5 years after different methods for primary palatal repair. The burden of care in terms...

  7. Eavesdropping with a Master: Leoš Janáček and the Music of Speech

    Directory of Open Access Journals (Sweden)

    Jonathan Pearl

    2006-07-01

    The composer Leoš Janáček (1854-1928) has been noted for his interest in speech melodies. Little discussion, however, has focused on the field methods that he used in gathering them, or on the products themselves. Janáček spent more than three decades transcribing thousands of what he termed nápěvky mluvy [tunelets of speech] in standard musical notation. The record that remains of these efforts is impressive both for its volume and its quality, as well as for its potential to reveal aspects of the perceptual overlap between music and language. Heretofore his pioneering efforts in the study of speech prosody and music perception have neither been recognized nor acknowledged. The present study provides a background for and an overview of the transcriptions, along with comparative musicological and linguistic analyses of the materials presented. With this analysis as a starting point, I indicate promising avenues for further collaborations between linguists and musicologists, seeking an integrated theory of music and language cognition.

  8. Perceptual-Auditory and Acoustical Analysis of the Voices of Transgender Women.

    Science.gov (United States)

    Schwarz, Karine; Fontanari, Anna Martha Vaitses; Costa, Angelo Brandelli; Soll, Bianca Machado Borba; da Silva, Dhiordan Cardoso; de Sá Villas-Bôas, Anna Paula; Cielo, Carla Aparecida; Bastilha, Gabriele Rodrigues; Ribeiro, Vanessa Veis; Dorfman, Maria Elza Kazumi Yamaguti; Lobato, Maria Inês Rodrigues

    2017-09-28

    Voice is an important gender marker in the transition process as a transgender individual accepts a new gender identity. The objectives of this study were to describe and relate aspects of a perceptual-auditory analysis and the fundamental frequency (F0) of male-to-female (MtF) transsexual individuals. A case-control study was carried out with individuals aged 19-52 years who attended the Gender Identity Program of the Hospital de Clínicas of Porto Alegre. Vocal recordings from the MtF transgender and cisgender individuals (vowel /a:/ and six phrases of the Consensus Auditory-Perceptual Evaluation of Voice [CAPE-V]) were edited and randomly coded before storage in a Dropbox folder. The voices (vowel /a:/) were analyzed by consensus on the same day by two speech therapist judges with more than 10 years of experience in the voice area, using the GRBASI perceptual-auditory vocal evaluation scale. Acoustic analysis of the voices was performed using the advanced Multi-Dimensional Voice Program software. The resonance focus and the degrees of masculinity and femininity for each voice recording were determined by the same judges by listening to the CAPE-V phrases. There were significant differences between the groups regarding a greater frequency of subjects with F0 between 80 and 150 Hz (P = 0.003) and a greater frequency of hypernasal resonant focus (P < 0.001) in the MtF cases, and a greater frequency of subjects with absence of roughness (P = 0.031) in the control group. The MtF group of individuals showed altered vertical resonant focus, more masculine voices, and lower fundamental frequencies. The control group showed a significant absence of roughness. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  9. Speech-enabled Computer-aided Translation

    DEFF Research Database (Denmark)

    Mesa-Lao, Bartolomé

    2014-01-01

    The present study has surveyed post-editor trainees’ views and attitudes before and after the introduction of speech technology as a front end to a computer-aided translation workbench. The aim of the survey was (i) to identify attitudes and perceptions among post-editor trainees before performing a post-editing task using automatic speech recognition (ASR); and (ii) to assess the degree to which post-editors’ attitudes and expectations to the use of speech technology changed after actually using it. The survey was based on two questionnaires: the first one administered before the participants...

  10. Investigating the Role of Working Memory in Speech-in-noise Identification for Listeners with Normal Hearing.

    Science.gov (United States)

    Füllgrabe, Christian; Rosen, Stuart

    2016-01-01

    With the advent of cognitive hearing science, increased attention has been given to individual differences in cognitive functioning and their explanatory power in accounting for inter-listener variability in understanding speech in noise (SiN). The psychological construct that has received most interest is working memory (WM), representing the ability to simultaneously store and process information. Common lore and theoretical models assume that WM-based processes subtend speech processing in adverse perceptual conditions, such as those associated with hearing loss or background noise. Empirical evidence confirms the association between WM capacity (WMC) and SiN identification in older hearing-impaired listeners. To assess whether WMC also plays a role when listeners without hearing loss process speech in acoustically adverse conditions, we surveyed published and unpublished studies in which the Reading-Span test (a widely used measure of WMC) was administered in conjunction with a measure of SiN identification. The survey revealed little or no evidence for an association between WMC and SiN performance. We also analysed new data from 132 normal-hearing participants sampled from across the adult lifespan (18-91 years), for a relationship between Reading-Span scores and identification of matrix sentences in noise. Performance on both tasks declined with age, and correlated weakly even after controlling for the effects of age and audibility (r = 0.39, p ≤ 0.001, one-tailed). However, separate analyses for different age groups revealed that the correlation was only significant for middle-aged and older groups but not for the young (< 40 years) participants.
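
    The control for age and audibility described above is a partial correlation, computable by regressing the covariates out of both variables and correlating the residuals. A NumPy/SciPy sketch on simulated data of roughly the study's sample size (all coefficients are invented):

```python
import numpy as np
from scipy import stats

def partial_corr(x, y, covariates):
    """Pearson correlation between the residuals of x and y after
    regressing out the covariates (here: age and audibility)."""
    Z = np.column_stack([np.ones(len(x))] + list(covariates))
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return stats.pearsonr(rx, ry)

rng = np.random.default_rng(6)
n = 132
age = rng.uniform(18, 91, n)
audibility = -0.3 * age + rng.normal(0, 5, n)        # worsens with age
reading_span = 40 - 0.2 * age + rng.normal(0, 4, n)  # WMC declines with age
sin_score = -8 + 0.1 * reading_span + 0.2 * audibility + rng.normal(0, 2, n)

r, p = partial_corr(reading_span, sin_score, [age, audibility])
print(f"partial r = {r:.2f}, p = {p:.3f}")
```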

  11. Perceptual learning modifies untrained pursuit eye movements.

    Science.gov (United States)

    Szpiro, Sarit F A; Spering, Miriam; Carrasco, Marisa

    2014-07-07

    Perceptual learning improves detection and discrimination of relevant visual information in mature humans, revealing sensory plasticity. Whether visual perceptual learning affects motor responses is unknown. Here we implemented a protocol that enabled us to address this question. We tested a perceptual response (motion direction estimation, in which observers overestimate motion direction away from a reference) and a motor response (voluntary smooth pursuit eye movements). Perceptual training led to greater overestimation and, remarkably, it modified untrained smooth pursuit. In contrast, pursuit training did not affect overestimation in either pursuit or perception, even though observers in both training groups were exposed to the same stimuli for the same time period. A second experiment revealed that estimation training also improved discrimination, indicating that overestimation may optimize perceptual sensitivity. Hence, active perceptual training is necessary to alter perceptual responses, and an acquired change in perception suffices to modify pursuit, a motor response. © 2014 ARVO.

  12. Speech comprehension difficulties in chronic tinnitus and its relation to hyperacusis

    Directory of Open Access Journals (Sweden)

    Veronika Vielsmeier

    2016-12-01

    Full Text Available Objective: Many tinnitus patients complain about difficulties regarding speech comprehension. In spite of the high clinical relevance, little is known about underlying mechanisms and predisposing factors. Here, we performed an exploratory investigation in a large sample of tinnitus patients to (1) estimate the prevalence of speech comprehension difficulties among tinnitus patients, (2) compare subjective reports of speech comprehension difficulties with objective measurements in a standardized speech comprehension test, and (3) explore underlying mechanisms by analyzing the relationship between speech comprehension difficulties and peripheral hearing function (pure tone audiogram), as well as with co-morbid hyperacusis as a central auditory processing disorder. Subjects and Methods: Speech comprehension was assessed in 361 tinnitus patients presenting between 07/2012 and 08/2014 at the Interdisciplinary Tinnitus Clinic at the University of Regensburg. The assessment included standard audiological assessment (pure tone audiometry, tinnitus pitch and loudness matching), the Goettingen sentence test (in quiet) for speech audiometric evaluation, two questions about hyperacusis, and two questions about speech comprehension in quiet and noisy environments (How would you rate your ability to understand speech?; How would you rate your ability to follow a conversation when multiple people are speaking simultaneously?). Results: Subjectively reported speech comprehension deficits are frequent among tinnitus patients, especially in noisy environments (cocktail party situation). 74.2% of all investigated patients showed disturbed speech comprehension (indicated by values above 21.5 dB SPL in the Goettingen sentence test). Subjective speech comprehension complaints (both in general and in noisy environments) were correlated with hearing level and with audiologically-assessed speech comprehension ability. In contrast, co-morbid hyperacusis was only correlated

  13. Toward a Quantitative Basis for Assessment and Diagnosis of Apraxia of Speech

    Science.gov (United States)

    Haley, Katarina L.; Jacks, Adam; de Riesthal, Michael; Abou-Khalil, Rima; Roth, Heidi L.

    2012-01-01

    Purpose: We explored the reliability and validity of 2 quantitative approaches to document presence and severity of speech properties associated with apraxia of speech (AOS). Method: A motor speech evaluation was administered to 39 individuals with aphasia. Audio-recordings of the evaluation were presented to 3 experienced clinicians to determine…

  14. Does seeing an Asian face make speech sound more accented?

    Science.gov (United States)

    Zheng, Yi; Samuel, Arthur G

    2017-08-01

    Prior studies have reported that seeing an Asian face makes American English sound more accented. The current study investigates whether this effect is perceptual, or if it instead occurs at a later decision stage. We first replicated the finding that showing static Asian and Caucasian faces can shift people's reports about the accentedness of speech accompanying the pictures. When we changed the static pictures to dubbed videos, reducing the demand characteristics, the shift in reported accentedness largely disappeared. By including unambiguous items along with the original ambiguous items, we introduced a contrast bias and actually reversed the shift, with the Asian-face videos yielding lower judgments of accentedness than the Caucasian-face videos. By changing to a mixed rather than blocked design, so that the ethnicity of the videos varied from trial to trial, we eliminated the difference in accentedness rating. Finally, we tested participants' perception of accented speech using the selective adaptation paradigm. After establishing that an auditory-only accented adaptor shifted the perception of how accented test words are, we found that no such adaptation effect occurred when the adapting sounds relied on visual information (Asian vs. Caucasian videos) to influence the accentedness of an ambiguous auditory adaptor. Collectively, the results demonstrate that visual information can affect the interpretation, but not the perception, of accented speech.

  15. Internet Video Telephony Allows Speech Reading by Deaf Individuals and Improves Speech Perception by Cochlear Implant Users

    Science.gov (United States)

    Mantokoudis, Georgios; Dähler, Claudia; Dubach, Patrick; Kompis, Martin; Caversaccio, Marco D.; Senn, Pascal

    2013-01-01

    Objective To analyze speech reading through Internet video calls by profoundly hearing-impaired individuals and cochlear implant (CI) users. Methods Speech reading skills of 14 deaf adults and 21 CI users were assessed using the Hochmair Schulz Moser (HSM) sentence test. We presented video simulations using different video resolutions (1280×720, 640×480, 320×240, 160×120 px), frame rates (30, 20, 10, 7, 5 frames per second (fps)), speech velocities (three different speakers), webcameras (Logitech Pro9000, C600 and C500) and image/sound delays (0–500 ms). All video simulations were presented with and without sound and in two screen sizes. Additionally, scores for live Skype™ video connection and live face-to-face communication were assessed. Results Higher frame rate (>7 fps), higher camera resolution (>640×480 px) and shorter picture/sound delay (<100 ms) were associated with increased speech perception scores. Scores were strongly dependent on the speaker but were not influenced by physical properties of the camera optics or the full screen mode. There is a significant median gain of +8.5%pts (p = 0.009) in speech perception for all 21 CI-users if visual cues are additionally shown. CI users with poor open set speech perception scores (n = 11) showed the greatest benefit under combined audio-visual presentation (median speech perception +11.8%pts, p = 0.032). Conclusion Webcameras have the potential to improve telecommunication of hearing-impaired individuals. PMID:23359119
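
    The reported cut-offs (frame rate above 7 fps, resolution above 640×480 px, audio/video delay under 100 ms) can be summarized as a simple configuration check. A hedged sketch with a hypothetical function name; the pixel-count comparison is one illustrative way to encode "higher than 640×480":

        # Check a video-call configuration against the cut-offs the abstract
        # associates with higher speech perception scores.
        def supports_speech_reading(width_px, height_px, fps, av_delay_ms):
            return (fps > 7
                    and width_px * height_px > 640 * 480
                    and av_delay_ms < 100)

        print(supports_speech_reading(1280, 720, 30, 50))   # True
        print(supports_speech_reading(320, 240, 5, 250))    # False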

  17. Deficits in audiovisual speech perception in normal aging emerge at the level of whole-word recognition.

    Science.gov (United States)

    Stevenson, Ryan A; Nelms, Caitlin E; Baum, Sarah H; Zurkovsky, Lilia; Barense, Morgan D; Newhouse, Paul A; Wallace, Mark T

    2015-01-01

    Over the next 2 decades, a dramatic shift in the demographics of society will take place, with a rapid growth in the population of older adults. One of the most common complaints with healthy aging is a decreased ability to successfully perceive speech, particularly in noisy environments. In such noisy environments, the presence of visual speech cues (i.e., lip movements) provide striking benefits for speech perception and comprehension, but previous research suggests that older adults gain less from such audiovisual integration than their younger peers. To determine at what processing level these behavioral differences arise in healthy-aging populations, we administered a speech-in-noise task to younger and older adults. We compared the perceptual benefits of having speech information available in both the auditory and visual modalities and examined both phoneme and whole-word recognition across varying levels of signal-to-noise ratio. For whole-word recognition, older adults relative to younger adults showed greater multisensory gains at intermediate SNRs but reduced benefit at low SNRs. By contrast, at the phoneme level both younger and older adults showed approximately equivalent increases in multisensory gain as signal-to-noise ratio decreased. Collectively, the results provide important insights into both the similarities and differences in how older and younger adults integrate auditory and visual speech cues in noisy environments and help explain some of the conflicting findings in previous studies of multisensory speech perception in healthy aging. These novel findings suggest that audiovisual processing is intact at more elementary levels of speech perception in healthy-aging populations and that deficits begin to emerge only at the more complex word-recognition level of speech signals. Copyright © 2015 Elsevier Inc. All rights reserved.

  18. Attentional capture under high perceptual load.

    Science.gov (United States)

    Cosman, Joshua D; Vecera, Shaun P

    2010-12-01

    Attentional capture by abrupt onsets can be modulated by several factors, including the complexity, or perceptual load, of a scene. We have recently demonstrated that observers are less likely to be captured by abruptly appearing, task-irrelevant stimuli when they perform a search that is high, as opposed to low, in perceptual load (Cosman & Vecera, 2009), consistent with perceptual load theory. However, recent results indicate that onset frequency can influence stimulus-driven capture, with infrequent onsets capturing attention more often than did frequent onsets. Importantly, in our previous task, an abrupt onset was present on every trial, and consequently, attentional capture might have been affected by both onset frequency and perceptual load. In the present experiment, we examined whether onset frequency influences attentional capture under conditions of high perceptual load. When onsets were presented frequently, we replicated our earlier results; attentional capture by onsets was modulated under conditions of high perceptual load. Importantly, however, when onsets were presented infrequently, we observed robust capture effects. These results conflict with a strong form of load theory and, instead, suggest that exposure to the elements of a task (e.g., abrupt onsets) combines with high perceptual load to modulate attentional capture by task-irrelevant information.

  19. Hearing assessment in pre-school children with speech delay.

    Science.gov (United States)

    Psillas, George; Psifidis, Anestis; Antoniadou-Hitoglou, Magda; Kouloulas, Athanasios

    2006-09-01

    The purpose of this study was to detect any underlying hearing loss among the healthy pre-school children with speech delay. 76 children, aged from 1 to 5 years, underwent a thorough audiological examination consisting of tympanometry, free field testing, otoacoustic emission recordings and auditory brainstem responses (ABRs). If hearing was normal, then they were evaluated by a child neurologist-psychiatrist. According to our findings, the children were classified into 3 groups; those with normal hearing levels (group I, 52 children, 68.4%), sensorineural hearing loss (group II, 22 children, 28.9%) and conductive hearing loss (group III, 2 children, 2.6%). In group I, speech delay was attributed to pervasive developmental disorder (PDD), which represents high-functioning autistic children (37 cases). Other causes were specific language impairment (SLI)-expressive (3 cases), bilingualism (2 cases), and unknown etiology (10 cases). More than half (59%) of the children diagnosed with PDD evidenced significant language impairment limited to more than two words. Children with SLI-expressive and bilingualism used a maximum of two words. In group II, 13 children suffered from profound hearing loss in both ears, 3 from severe, 3 had profound hearing loss in one ear and severe in the other, 2 from moderate, and 1 had moderate in one ear and severe in the other. No child had mild sensorineural hearing loss. The children with profound hearing loss in at least one ear had total language impairment using no word at all (10 cases), or a maximum of two words (6 cases). When hearing loss was moderate to severe, then the speech vocabulary was confined to several words (more than two words-6 cases). Only two children suffering from conductive hearing loss both presented with complete lack of speech. A great number of healthy pre-school children with speech delay were found to have normal hearing. In this case, the otolaryngologist should be aware of the possible underlying clinical

  20. Magnified Neural Envelope Coding Predicts Deficits in Speech Perception in Noise.

    Science.gov (United States)

    Millman, Rebecca E; Mattys, Sven L; Gouws, André D; Prendergast, Garreth

    2017-08-09

    Verbal communication in noisy backgrounds is challenging. Understanding speech in background noise that fluctuates in intensity over time is particularly difficult for hearing-impaired listeners with a sensorineural hearing loss (SNHL). The reduction in fast-acting cochlear compression associated with SNHL exaggerates the perceived fluctuations in intensity in amplitude-modulated sounds. SNHL-induced changes in the coding of amplitude-modulated sounds may have a detrimental effect on the ability of SNHL listeners to understand speech in the presence of modulated background noise. To date, direct evidence for a link between magnified envelope coding and deficits in speech identification in modulated noise has been absent. Here, magnetoencephalography was used to quantify the effects of SNHL on phase locking to the temporal envelope of modulated noise (envelope coding) in human auditory cortex. Our results show that SNHL enhances the amplitude of envelope coding in posteromedial auditory cortex, whereas it enhances the fidelity of envelope coding in posteromedial and posterolateral auditory cortex. This dissociation was more evident in the right hemisphere, demonstrating functional lateralization in enhanced envelope coding in SNHL listeners. However, enhanced envelope coding was not perceptually beneficial. Our results also show that both hearing thresholds and, to a lesser extent, magnified cortical envelope coding in left posteromedial auditory cortex predict speech identification in modulated background noise. We propose a framework in which magnified envelope coding in posteromedial auditory cortex disrupts the segregation of speech from background noise, leading to deficits in speech perception in modulated background noise. SIGNIFICANCE STATEMENT People with hearing loss struggle to follow conversations in noisy environments. Background noise that fluctuates in intensity over time poses a particular challenge. Using magnetoencephalography, we demonstrate
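
    For illustration, a minimal sketch of temporal-envelope extraction for an amplitude-modulated noise, the stimulus property whose cortical coding the study measured; all parameters are illustrative:

        # Recover the amplitude envelope of modulated noise via the analytic signal.
        import numpy as np
        from scipy.signal import hilbert

        fs = 16000
        t = np.arange(fs) / fs
        carrier = np.random.randn(fs)                    # 1 s of noise
        modulator = 1 + 0.8 * np.sin(2 * np.pi * 4 * t)  # 4 Hz, 80% depth
        stimulus = modulator * carrier

        # The analytic-signal magnitude approximates the amplitude envelope
        # that auditory cortex can phase-lock to.
        envelope = np.abs(hilbert(stimulus))
        print("envelope-modulator correlation:",
              round(np.corrcoef(envelope, modulator)[0, 1], 2))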

  1. Non-intrusive speech quality assessment in simplified e-model

    OpenAIRE

    Vozňák, Miroslav

    2012-01-01

    The E-model brings a modern approach to the computation of estimated quality, allowing for easy implementation. One of its advantages is that it can be applied in real time. The method is based on a mathematical computation model evaluating transmission path impairments influencing speech signal, especially delays and packet losses. These parameters, common in an IP network, can affect speech quality dramatically. The paper deals with a proposal for a simplified E-model and its pr...
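
    By way of illustration, a sketch of the commonly used simplified E-model computation (after ITU-T G.107 and the Cole–Rosenbluth delay approximation), not necessarily the paper's exact formulation: an R-factor is computed from one-way delay and packet loss, then mapped to an estimated MOS. The codec constants (Ie, Bpl) are typical published values for G.711 with packet-loss concealment, not figures taken from the paper:

        # Simplified E-model: R-factor from delay and loss, then MOS mapping.
        def r_factor(delay_ms, loss_pct, ie=0, bpl=25.1):
            # Delay impairment Id (extra penalty above 177.3 ms one-way delay)
            idd = 0.024 * delay_ms
            if delay_ms > 177.3:
                idd += 0.11 * (delay_ms - 177.3)
            # Effective equipment impairment Ie_eff grows with packet loss
            ie_eff = ie + (95 - ie) * loss_pct / (loss_pct + bpl)
            return 93.2 - idd - ie_eff

        def mos(r):
            if r < 0:
                return 1.0
            if r > 100:
                return 4.5
            return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6

        print(round(mos(r_factor(delay_ms=150, loss_pct=1)), 2))  # ~4.2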

  2. Robust speech perception: Recognize the familiar, generalize to the similar, and adapt to the novel

    Science.gov (United States)

    Kleinschmidt, Dave F.; Jaeger, T. Florian

    2016-01-01

    Successful speech perception requires that listeners map the acoustic signal to linguistic categories. These mappings are not only probabilistic, but change depending on the situation. For example, one talker’s /p/ might be physically indistinguishable from another talker’s /b/ (cf. lack of invariance). We characterize the computational problem posed by such a subjectively non-stationary world and propose that the speech perception system overcomes this challenge by (1) recognizing previously encountered situations, (2) generalizing to other situations based on previous similar experience, and (3) adapting to novel situations. We formalize this proposal in the ideal adapter framework: (1) to (3) can be understood as inference under uncertainty about the appropriate generative model for the current talker, thereby facilitating robust speech perception despite the lack of invariance. We focus on two critical aspects of the ideal adapter. First, in situations that clearly deviate from previous experience, listeners need to adapt. We develop a distributional (belief-updating) learning model of incremental adaptation. The model provides a good fit against known and novel phonetic adaptation data, including perceptual recalibration and selective adaptation. Second, robust speech recognition requires that listeners learn to represent the structured component of cross-situation variability in the speech signal. We discuss how these two aspects of the ideal adapter provide a unifying explanation for adaptation, talker-specificity, and generalization across talkers and groups of talkers (e.g., accents and dialects). The ideal adapter provides a guiding framework for future investigations into speech perception and adaptation, and more broadly language comprehension. PMID:25844873
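
    A minimal sketch of the belief-updating idea, assuming the simplest conjugate case: the listener's belief about a talker's category mean (e.g., the voice-onset time of /b/) is a Gaussian that is updated incrementally as tokens arrive. All numbers are illustrative, not the model's fitted parameters:

        # Conjugate (Kalman-style) update of a Gaussian belief about a category mean.
        def update_belief(prior_mean, prior_var, obs, obs_var):
            k = prior_var / (prior_var + obs_var)       # gain: trust in the new token
            post_mean = prior_mean + k * (obs - prior_mean)
            post_var = (1 - k) * prior_var
            return post_mean, post_var

        mean, var = 0.0, 100.0                 # vague prior from previous experience
        for vot in [12.0, 15.0, 9.0, 14.0]:    # hypothetical /b/ tokens (ms)
            mean, var = update_belief(mean, var, vot, obs_var=25.0)
            print(f"belief: mean={mean:5.2f} ms, var={var:6.2f}")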

  3. Speech discrimination difficulties in High-Functioning Autism Spectrum Disorder are likely independent of auditory hypersensitivity.

    Directory of Open Access Journals (Sweden)

    William Andrew Dunlop

    2016-08-01

    Full Text Available Autism Spectrum Disorder (ASD), characterised by impaired communication skills and repetitive behaviours, can also result in differences in sensory perception. Individuals with ASD often perform normally in simple auditory tasks but poorly compared to typically developed (TD) individuals on complex auditory tasks like discriminating speech from complex background noise. A common trait of individuals with ASD is hypersensitivity to auditory stimulation. No studies to our knowledge consider whether hypersensitivity to sounds is related to differences in speech-in-noise discrimination. We provide novel evidence that individuals with high-functioning ASD show poor performance compared to TD individuals in a speech-in-noise discrimination task with an attentionally demanding background noise, but not in a purely energetic noise. Further, we demonstrate in our small sample that speech-hypersensitivity does not appear to predict performance in the speech-in-noise task. The findings support the argument that an attentional deficit, rather than a perceptual deficit, affects the ability of individuals with ASD to discriminate speech from background noise. Finally, we piloted a novel questionnaire that measures difficulty hearing in noisy environments, and sensitivity to non-verbal and verbal sounds. Psychometric analysis using 128 TD participants provided novel evidence for a difference in sensitivity to non-verbal and verbal sounds, and these findings were reinforced by participants with ASD who also completed the questionnaire. The study was limited by a small and high-functioning sample of participants with ASD. Future work could test larger sample sizes and include lower-functioning ASD participants.

  4. Lingual–Alveolar Contact Pressure During Speech in Amyotrophic Lateral Sclerosis: Preliminary Findings

    Science.gov (United States)

    Knollhoff, Stephanie; Barohn, Richard J.

    2017-01-01

    Purpose This preliminary study on lingual–alveolar contact pressures (LACP) in people with amyotrophic lateral sclerosis (ALS) had several aims: (a) to evaluate whether the protocol induced fatigue, (b) to compare LACP during speech (LACP-Sp) and during maximum isometric pressing (LACP-Max) in people with ALS (PALS) versus healthy controls, (c) to compare the percentage of LACP-Max utilized during speech (%Max) for PALS versus controls, and (d) to evaluate relationships between LACP-Sp and LACP-Max with word intelligibility. Method Thirteen PALS and 12 healthy volunteers produced /t, d, s, z, l, n/ sounds while LACP-Sp was recorded. LACP-Max was obtained before and after the speech protocol. Word intelligibility was obtained from auditory–perceptual judgments. Results LACP-Max values measured before and after completion of the speech protocol did not differ. LACP-Sp and LACP-Max were statistically lower in the ALS bulbar group compared with controls and PALS with only spinal symptoms. There was no statistical difference between groups for %Max. LACP-Sp and LACP-Max were correlated with word intelligibility. Conclusions It was feasible to obtain LACP-Sp measures without inducing fatigue. Reductions in LACP-Sp and LACP-Max for bulbar speakers might reflect tongue weakness. Although confirmation of results is needed, the data indicate that individuals with high word intelligibility maintained LACP-Sp at or above 2 kPa and LACP-Max at or above 50 kPa. PMID:28335033

  6. Speech, Language, and Reading in 10-Year-Olds With Cleft: Associations With Teasing, Satisfaction With Speech, and Psychological Adjustment.

    Science.gov (United States)

    Feragen, Kristin Billaud; Særvold, Tone Kristin; Aukner, Ragnhild; Stock, Nicola Marie

    2017-03-01

    Objective: Despite the use of multidisciplinary services, little research has addressed issues involved in the care of those with cleft lip and/or palate across disciplines. The aim was to investigate associations between speech, language, reading, and reports of teasing, subjective satisfaction with speech, and psychological adjustment. Design: Cross-sectional data collected during routine, multidisciplinary assessments in a centralized treatment setting, including speech and language therapists and clinical psychologists. Participants: Children with cleft with palatal involvement aged 10 years from three birth cohorts (N = 170) and their parents. Measures: Speech: SVANTE-N. Language: Language 6-16 (sentence recall, serial recall, vocabulary, and phonological awareness). Reading: Word Chain Test and Reading Comprehension Test. Psychological measures: Strengths and Difficulties Questionnaire and extracts from the Satisfaction With Appearance Scale and Child Experience Questionnaire. Results: Reading skills were associated with self- and parent-reported psychological adjustment in the child. Subjective satisfaction with speech was associated with psychological adjustment, while not being consistently associated with speech therapists' assessments. Parent-reported teasing was found to be associated with lower levels of reading skills. Having a medical and/or psychological condition in addition to the cleft was found to affect speech, language, and reading significantly. Conclusions: Cleft teams need to be aware of speech, language, and/or reading problems as potential indicators of psychological risk in children with cleft. This study highlights the importance of multiple reports (self, parent, and specialist) and a multidisciplinary approach to cleft care and research.

  8. Speech recognition in natural background noise.

    Science.gov (United States)

    Meyer, Julien; Dentel, Laure; Meunier, Fanny

    2013-01-01

    In the real world, human speech recognition nearly always involves listening in background noise. The impact of such noise on speech signals and on intelligibility performance increases with the separation of the listener from the speaker. The present behavioral experiment provides an overview of the effects of such acoustic disturbances on speech perception in conditions approaching ecologically valid contexts. We analysed the intelligibility loss in spoken word lists with increasing listener-to-speaker distance in a typical low-level natural background noise. The noise was combined with the simple spherical amplitude attenuation due to distance, basically changing the signal-to-noise ratio (SNR). Therefore, our study draws attention to some of the most basic environmental constraints that have pervaded spoken communication throughout human history. We evaluated the ability of native French participants to recognize French monosyllabic words (spoken at 65.3 dB(A), reference at 1 meter) at distances between 11 and 33 meters, which corresponded to the SNRs most revealing of the progressive effect of the selected natural noise (-8.8 dB to -18.4 dB). Our results showed that in such conditions, identity of vowels is mostly preserved, with the striking peculiarity of the absence of confusion in vowels. The results also confirmed the functional role of consonants during lexical identification. The extensive analysis of recognition scores, confusion patterns and associated acoustic cues revealed that sonorant, sibilant and burst properties were the most important parameters influencing phoneme recognition. Altogether these analyses allowed us to extract a resistance scale from consonant recognition scores. We also identified specific perceptual consonant confusion groups depending on the place in the words (onset vs. coda). Finally our data suggested that listeners may access some acoustic cues of the CV transition, opening interesting perspectives for future studies.
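
    The distance manipulation can be made concrete with a short worked example: under simple spherical spreading, speech level falls by 20*log10(d) dB relative to the 1 m reference, so SNR(d) = 65.3 - 20*log10(d) - N. The noise level N used below is inferred from the SNRs reported in the abstract (-8.8 dB at 11 m, -18.4 dB at 33 m); it is not stated there directly:

        # SNR as a function of listener-to-speaker distance under spherical spreading.
        import math

        def snr_at(distance_m, ref_level_dba=65.3, noise_dba=53.3):
            speech_level = ref_level_dba - 20 * math.log10(distance_m)
            return speech_level - noise_dba

        for d in (11, 22, 33):
            print(f"{d:2d} m: SNR = {snr_at(d):6.1f} dB")
        # 11 m gives about -8.8 dB and 33 m about -18.4 dB, matching the
        # SNR range reported in the abstract.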

  9. Transient and sustained cortical activity elicited by connected speech of varying intelligibility

    Directory of Open Access Journals (Sweden)

    Tiitinen Hannu

    2012-12-01

    Full Text Available Background: The robustness of speech perception in the face of acoustic variation is founded on the ability of the auditory system to integrate the acoustic features of speech and to segregate them from background noise. This auditory scene analysis process is facilitated by top-down mechanisms, such as recognition memory for speech content. However, the cortical processes underlying these facilitatory mechanisms remain unclear. The present magnetoencephalography (MEG) study examined how the activity of auditory cortical areas is modulated by acoustic degradation and intelligibility of connected speech. The experimental design allowed for the comparison of cortical activity patterns elicited by acoustically identical stimuli which were perceived as either intelligible or unintelligible. Results: In the experiment, a set of sentences was presented to the subject in distorted, undistorted, and again in distorted form. The intervening exposure to undistorted versions of sentences rendered the initially unintelligible, distorted sentences intelligible, as evidenced by an increase from 30% to 80% in the proportion of sentences reported as intelligible. These perceptual changes were reflected in the activity of the auditory cortex, with the auditory N1m response (~100 ms) being more prominent for the distorted stimuli than for the intact ones. In the time range of the auditory P2m response (>200 ms), auditory cortex as well as regions anterior and posterior to this area generated a stronger response to sentences which were intelligible than unintelligible. During the sustained field (>300 ms), stronger activity was elicited by degraded stimuli in auditory cortex and by intelligible sentences in areas posterior to auditory cortex. Conclusions: The current findings suggest that the auditory system comprises bottom-up and top-down processes which are reflected in transient and sustained brain activity. It appears that analysis of acoustic features occurs

  10. Gender and vocal production mode discrimination using the high frequencies for speech and singing

    Science.gov (United States)

    Monson, Brian B.; Lotto, Andrew J.; Story, Brad H.

    2014-01-01

    Humans routinely produce acoustical energy at frequencies above 6 kHz during vocalization, but this frequency range is often not represented in communication devices and speech perception research. Recent advancements toward high-definition (HD) voice and extended bandwidth hearing aids have increased the interest in the high frequencies. The potential perceptual information provided by high-frequency energy (HFE) is not well characterized. We found that humans can accomplish tasks of gender discrimination and vocal production mode discrimination (speech vs. singing) when presented with acoustic stimuli containing only HFE at both amplified and normal levels. Performance in these tasks was robust in the presence of low-frequency masking noise. No substantial learning effect was observed. Listeners also were able to identify the sung and spoken text (excerpts from “The Star-Spangled Banner”) with very few exposures. These results add to the increasing evidence that the high frequencies provide at least redundant information about the vocal signal, suggesting that its representation in communication devices (e.g., cell phones, hearing aids, and cochlear implants) and speech/voice synthesizers could improve these devices and benefit normal-hearing and hearing-impaired listeners. PMID:25400613
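
    A hedged sketch of how HFE-only stimuli of this kind are typically constructed: high-pass filtering the vocal signal above roughly 6 kHz. The cut-off, filter order and stand-in signal below are illustrative choices, not the study's exact processing:

        # Isolate high-frequency energy (HFE) above ~6 kHz from a signal.
        import numpy as np
        from scipy.signal import butter, sosfiltfilt

        fs = 44100                                  # sampling rate (Hz)
        t = np.arange(fs) / fs
        # Stand-in "voice": a 200 Hz tone plus broadband noise
        signal = np.sin(2 * np.pi * 200 * t) + 0.1 * np.random.randn(fs)

        sos = butter(8, 6000, btype="highpass", fs=fs, output="sos")
        hfe_only = sosfiltfilt(sos, signal)         # keeps energy above 6 kHz

        rms = lambda x: np.sqrt(np.mean(x**2))
        print(f"full-band RMS: {rms(signal):.3f}, HFE-only RMS: {rms(hfe_only):.3f}")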

  11. Semantic Representations in 3D Perceptual Space

    Directory of Open Access Journals (Sweden)

    Suncica Zdravkovic

    2011-05-01

    Full Text Available Barsalou's (1999) perceptual theory of knowledge echoes the pre-20th century tradition of conceptualizing all knowledge as inherently perceptual. Hence conceptual space has an infinite number of dimensions and heavily relies on perceptual experience. Osgood's (1952) semantic differential technique was developed as a bridge between perception and semantics. We updated Osgood's methodology in order to investigate current issues in visual cognition by: (1) using a 2D rather than a 1D space to place the concepts, (2) having dimensions that were perceptual while the targets were conceptual, (3) coupling visual experience with another two perceptual domains (audition and touch), (4) analyzing the data using MDS (not factor analysis). In three experiments, subjects (N = 57) judged five concrete and five abstract words on seven bipolar scales in three perceptual modalities. The 2D space led to different patterns of response compared to the classic 1D space. MDS revealed that perceptual modalities are not equally informative for mapping word-meaning distances (Mantel min = −.23; Mantel max = .88). There were no reliable differences due to test administration modality (paper vs. computer), nor scale orientation. The present findings are consistent with multidimensionality of conceptual space, a perceptual basis for knowledge, and dynamic characteristics of concepts discussed in contemporary theories.

  12. Limited Cognitive Resources Explain a Trade-Off between Perceptual and Metacognitive Vigilance.

    Science.gov (United States)

    Maniscalco, Brian; McCurdy, Li Yan; Odegaard, Brian; Lau, Hakwan

    2017-02-01

    Why do experimenters give subjects short breaks in long behavioral experiments? Whereas previous studies suggest it is difficult to maintain attention and vigilance over long periods of time, it is unclear precisely what mechanisms benefit from rest after short experimental blocks. Here, we evaluate decline in both perceptual performance and metacognitive sensitivity (i.e., how well confidence ratings track perceptual decision accuracy) over time and investigate whether characteristics of prefrontal cortical areas correlate with these measures. Whereas a single-process signal detection model predicts that these two forms of fatigue should be strongly positively correlated, a dual-process model predicts that rates of decline may dissociate. Here, we show that these measures consistently exhibited negative or near-zero correlations, as if engaged in a trade-off relationship, suggesting that different mechanisms contribute to perceptual and metacognitive decisions. Despite this dissociation, the two mechanisms likely depend on common resources, which could explain their trade-off relationship. Based on structural MRI brain images of individual human subjects, we assessed gray matter volume in the frontal polar area, a region that has been linked to visual metacognition. Variability of frontal polar volume correlated with individual differences in behavior, indicating the region may play a role in supplying common resources for both perceptual and metacognitive vigilance. Additional experiments revealed that reduced metacognitive demand led to superior perceptual vigilance, providing further support for this hypothesis. Overall, results indicate that during breaks between short blocks, it is the higher-level perceptual decision mechanisms, rather than lower-level sensory machinery, that benefit most from rest. Perceptual task performance declines over time (the so-called vigilance decrement), but the relationship between vigilance in perception and metacognition has
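
    The separation between perceptual and metacognitive performance can be illustrated with a small signal-detection sketch: type-1 d′ indexes perceptual sensitivity, while simple type-2 rates index how well confidence tracks accuracy. Trial data below are simulated placeholders, not a full meta-d′ analysis:

        # Type-1 d' plus simple type-2 (confidence-accuracy) statistics.
        import numpy as np
        from scipy.stats import norm

        rng = np.random.default_rng(1)
        stim = rng.integers(0, 2, 400)                          # 0/1 stimulus class
        resp = np.where(rng.random(400) < 0.8, stim, 1 - stim)  # ~80% correct
        correct = resp == stim
        conf = np.where(correct, rng.random(400) < 0.7, rng.random(400) < 0.4)

        def dprime(stim, resp):
            hit = np.mean(resp[stim == 1] == 1)
            fa = np.mean(resp[stim == 0] == 1)
            clip = lambda p: np.clip(p, 1e-3, 1 - 1e-3)  # avoid infinite z-scores
            return norm.ppf(clip(hit)) - norm.ppf(clip(fa))

        type2_hit = conf[correct].mean()       # P(high confidence | correct)
        type2_fa = conf[~correct].mean()       # P(high confidence | error)
        print(f"type-1 d' = {dprime(stim, resp):.2f}")
        print(f"type-2: hit = {type2_hit:.2f}, FA = {type2_fa:.2f}")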

  13. The semantics of prosody: acoustic and perceptual evidence of prosodic correlates to word meaning.

    Science.gov (United States)

    Nygaard, Lynne C; Herold, Debora S; Namy, Laura L

    2009-01-01

    This investigation examined whether speakers produce reliable prosodic correlates to meaning across semantic domains and whether listeners use these cues to derive word meaning from novel words. Speakers were asked to produce phrases in infant-directed speech in which novel words were used to convey one of two meanings from a set of antonym pairs (e.g., big/small). Acoustic analyses revealed that some acoustic features were correlated with overall valence of the meaning. However, each word meaning also displayed a unique acoustic signature, and semantically related meanings elicited similar acoustic profiles. In two perceptual tests, listeners either attempted to identify the novel words with a matching meaning dimension (picture pair) or with mismatched meaning dimensions. Listeners inferred the meaning of the novel words significantly more often when prosody matched the word meaning choices than when prosody mismatched. These findings suggest that speech contains reliable prosodic markers to word meaning and that listeners use these prosodic cues to differentiate meanings. That prosody is semantic suggests a reconceptualization of traditional distinctions between linguistic and nonlinguistic properties of spoken language. Copyright © 2009 Cognitive Science Society, Inc.

  14. Musician advantage for speech-on-speech perception

    NARCIS (Netherlands)

    Başkent, Deniz; Gaudrain, Etienne

    Evidence for transfer of musical training to better perception of speech in noise has been mixed. Unlike speech-in-noise, speech-on-speech perception utilizes many of the skills that musical training improves, such as better pitch perception and stream segregation, as well as use of higher-level

  15. Assessing spoken word recognition in children who are deaf or hard of hearing: A translational approach

    OpenAIRE

    Kirk, Karen Iler; Prusick, Lindsay; French, Brian; Gotch, Chad; Eisenberg, Laurie S.; Young, Nancy

    2012-01-01

    Under natural conditions, listeners use both auditory and visual speech cues to extract meaning from speech signals containing many sources of variability. However, traditional clinical tests of spoken word recognition routinely employ isolated words or sentences produced by a single talker in an auditory-only presentation format. The more central cognitive processes used during multimodal integration, perceptual normalization and lexical discrimination that may contribute to individual varia...

  16. A Sociolinguistic Assessment of the Darwazi Speech Variety in Afghanistan

    Directory of Open Access Journals (Sweden)

    Simone Beck

    2013-01-01

    Full Text Available This paper presents a sociolinguistic assessment of the Darwāzi speech varieties (including Tangshewi) based on data collected during a survey conducted between August 31st and September 19th 2008 in the Darwāz area. The research was carried out under the auspices of the International Assistance Mission, a Non-Governmental Organization working in Afghanistan. The goal was to determine whether Dari, one of the two national languages, is adequate to be used in literature and primary school education, or whether the Darwāzi people would benefit from language development, including literature development and primary school education in the vernacular.

  17. Visual-perceptual impairment in children with cerebral palsy: a systematic review.

    Science.gov (United States)

    Ego, Anne; Lidzba, Karen; Brovedani, Paola; Belmonti, Vittorio; Gonzalez-Monge, Sibylle; Boudia, Baya; Ritz, Annie; Cans, Christine

    2015-04-01

    Visual perception is one of the cognitive functions often impaired in children with cerebral palsy (CP). The aim of this systematic literature review was to assess the frequency of visual-perceptual impairment (VPI) and its relationship with patient characteristics. Eligible studies were relevant papers assessing visual perception with five common standardized assessment instruments in children with CP published from January 1990 to August 2011. Of the 84 studies selected, 15 were retained. In children with CP, the proportion of VPI ranged from 40% to 50% and the mean visual perception quotient from 70 to 90. None of the studies reported a significant influence of CP subtype, IQ level, side of motor impairment, neuro-ophthalmological outcomes, or seizures. The severity of neuroradiological lesions seemed associated with VPI. The influence of prematurity was controversial, but a lower gestational age was more often associated with lower visual motor skills than with decreased visual-perceptual abilities. The impairment of visual perception in children with CP should be considered a core disorder within the CP syndrome. Further research, including a more systematic approach to neuropsychological testing, is needed to explore the specific impact of CP subgroups and of neuroradiological features on visual-perceptual development. © 2015 The Authors. Developmental Medicine & Child Neurology © 2015 Mac Keith Press.

  18. The role of the speech-language pathologist in home care.

    Science.gov (United States)

    Giles, Melanie; Barker, Mary; Hayes, Amanda

    2014-06-01

    Speech-language pathologists play an important role in the care of patients with speech, language, or swallowing difficulties that can result from a variety of medical conditions. This article describes how speech-language pathologists assess and treat these conditions and the red flags that suggest a referral to a speech-language pathologist is indicated.

  19. Recognizing speech in a novel accent: the motor theory of speech perception reframed.

    Science.gov (United States)

    Moulin-Frier, Clément; Arbib, Michael A

    2013-08-01

    The motor theory of speech perception holds that we perceive the speech of another in terms of a motor representation of that speech. However, when we have learned to recognize a foreign accent, it seems plausible that recognition of a word rarely involves reconstruction of the speech gestures of the speaker rather than the listener. To better assess the motor theory and this observation, we proceed in three stages. Part 1 places the motor theory of speech perception in a larger framework based on our earlier models of the adaptive formation of mirror neurons for grasping, and for viewing extensions of that mirror system as part of a larger system for neuro-linguistic processing, augmented by the present consideration of recognizing speech in a novel accent. Part 2 then offers a novel computational model of how a listener comes to understand the speech of someone speaking the listener's native language with a foreign accent. The core tenet of the model is that the listener uses hypotheses about the word the speaker is currently uttering to update probabilities linking the sound produced by the speaker to phonemes in the native language repertoire of the listener. This, on average, improves the recognition of later words. This model is neutral regarding the nature of the representations it uses (motor vs. auditory). It serves as a reference point for the discussion in Part 3, which proposes a dual-stream neuro-linguistic architecture to revisit claims for and against the motor theory of speech perception and the relevance of mirror neurons, and extracts some implications for the reframing of the motor theory.

  20. Generation and Perceptual Implicit Memory: Different Generation Tasks Produce Different Effects on Perceptual Priming

    Science.gov (United States)

    Mulligan, Neil W.; Dew, Ilana T. Z.

    2009-01-01

    The generation manipulation has been critical in delineating differences between implicit and explicit memory. In contrast to past research, the present experiments indicate that generating from a rhyme cue produces as much perceptual priming as does reading. This is demonstrated for 3 visual priming tasks: perceptual identification, word-fragment…

  1. INTEGRATING MACHINE TRANSLATION AND SPEECH SYNTHESIS COMPONENT FOR ENGLISH TO DRAVIDIAN LANGUAGE SPEECH TO SPEECH TRANSLATION SYSTEM

    Directory of Open Access Journals (Sweden)

    J. SANGEETHA

    2015-02-01

    Full Text Available This paper provides an interface between the machine translation and speech synthesis components for converting English speech to Tamil speech in an English-to-Tamil speech-to-speech translation system. The speech translation system consists of three modules: automatic speech recognition, machine translation and text-to-speech synthesis. Many procedures for integrating speech recognition and machine translation have been proposed, but the speech synthesis component has not yet received the same attention. In this paper, we focus on the integration of machine translation and speech synthesis, and report a subjective evaluation to investigate the impact of the speech synthesis and machine translation components and of their integration. Here we implement a hybrid machine translation system (a combination of rule-based and statistical machine translation) and a concatenative syllable-based speech synthesis technique. In order to retain the naturalness and intelligibility of the synthesized speech, Auto Associative Neural Network (AANN) prosody prediction is used in this work. The results of this system investigation demonstrate that the naturalness and intelligibility of the synthesized speech are strongly influenced by the fluency and correctness of the translated text.
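
    A minimal sketch of the three-stage pipeline the abstract describes (recognition, translation, synthesis, each feeding the next). The function names and stub bodies are hypothetical placeholders for the actual English recognizer, hybrid rule-based/statistical English-Tamil MT engine, and concatenative syllable-based Tamil synthesizer:

        # Speech-to-speech translation as a composition of three modules.
        def recognize_english(audio: bytes) -> str:
            return "hello world"                            # stub ASR output

        def translate_en_to_ta(text: str) -> str:
            return "<tamil translation of: " + text + ">"   # stub hybrid MT

        def synthesize_tamil(text: str) -> bytes:
            return text.encode("utf-8")                     # stub concatenative TTS

        def speech_to_speech(audio: bytes) -> bytes:
            # Each stage's output feeds the next; translation quality directly
            # bounds the naturalness and intelligibility of the final speech.
            return synthesize_tamil(translate_en_to_ta(recognize_english(audio)))

        print(speech_to_speech(b"\x00\x01"))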

  2. Self-Assessed Hearing Handicap in Older Adults with Poorer-than-Predicted Speech Recognition in Noise

    Science.gov (United States)

    Eckert, Mark A.; Matthews, Lois J.; Dubno, Judy R.

    2017-01-01

    Purpose: Even older adults with relatively mild hearing loss report hearing handicap, suggesting that hearing handicap is not completely explained by reduced speech audibility. Method: We examined the extent to which self-assessed ratings of hearing handicap using the Hearing Handicap Inventory for the Elderly (HHIE; Ventry & Weinstein, 1982)…

  3. Video gaming enhances psychomotor skills but not visuospatial and perceptual abilities in surgical trainees.

    Science.gov (United States)

    Kennedy, A M; Boyle, E M; Traynor, O; Walsh, T; Hill, A D K

    2011-01-01

    There is considerable interest in the identification and assessment of underlying aptitudes or innate abilities that could potentially predict excellence in the technical aspects of operating. However, before the assessment of innate abilities is introduced for high-stakes assessment (such as competitive selection into surgical training programs), it is essential to determine that these abilities are stable and unchanging and are not influenced by other factors, such as the use of video games. The aim of this study was to investigate whether experience playing video games will predict psychomotor performance on a laparoscopic simulator or scores on tests of visuospatial and perceptual abilities, and to examine the correlation, if any, between these innate abilities. Institutional ethical approval was obtained. Thirty-eight undergraduate medical students with no previous surgical experience were recruited. All participants completed a self-reported questionnaire that asked them to detail their video game experience. They then underwent assessment of their psychomotor, visuospatial, and perceptual abilities using previously validated tests. The results were analyzed using independent samples t tests to compare means and linear regression curves for subsequent analysis. Students who played video games for at least 7 hours per week demonstrated significantly better psychomotor skills than students who did not play video games regularly. However, there was no difference on measures of visuospatial and perceptual abilities. There was no correlation between psychomotor tests and visuospatial or perceptual tests. Regular video gaming correlates positively with psychomotor ability, but it does not seem to influence visuospatial or perceptual ability. This study suggests that video game experience might be beneficial to a future career in surgery. It also suggests that relevant surgical skills may be gained usefully outside the operating room in activities that are not

  4. Effects of Social Cognitive Impairment on Speech Disorder in Schizophrenia

    OpenAIRE

    Docherty, Nancy M.; McCleery, Amanda; Divilbiss, Marielle; Schumann, Emily B.; Moe, Aubrey; Shakeel, Mohammed K.

    2012-01-01

    Disordered speech in schizophrenia impairs social functioning because it impedes communication with others. Treatment approaches targeting this symptom have been limited by an incomplete understanding of its causes. This study examined the process underpinnings of speech disorder, assessed in terms of communication failure. Contributions of impairments in 2 social cognitive abilities, emotion perception and theory of mind (ToM), to speech disorder were assessed in 63 patients with schizophren...

  5. Reactive agents and perceptual ambiguity

    NARCIS (Netherlands)

    Dartel, M. van; Sprinkhuizen-Kuyper, I.G.; Postma, E.O.; Herik, H.J. van den

    2005-01-01

    Reactive agents are generally believed to be incapable of coping with perceptual ambiguity (i.e., identical sensory states that require different responses). However, a recent finding suggests that reactive agents can cope with perceptual ambiguity in a simple model (Nolfi, 2002). This paper

  6. Perceptual integration without conscious access.

    Science.gov (United States)

    Fahrenfort, Johannes J; van Leeuwen, Jonathan; Olivers, Christian N L; Hogendoorn, Hinze

    2017-04-04

    The visual system has the remarkable ability to integrate fragmentary visual input into a perceptually organized collection of surfaces and objects, a process we refer to as perceptual integration. Despite a long tradition of perception research, it is not known whether access to consciousness is required to complete perceptual integration. To investigate this question, we manipulated access to consciousness using the attentional blink. We show that, behaviorally, the attentional blink impairs conscious decisions about the presence of integrated surface structure from fragmented input. However, despite conscious access being impaired, the ability to decode the presence of integrated percepts remains intact, as shown through multivariate classification analyses of electroencephalogram (EEG) data. In contrast, when disrupting perception through masking, decisions about integrated percepts and decoding of integrated percepts are impaired in tandem, while leaving feedforward representations intact. Together, these data show that access consciousness and perceptual integration can be dissociated.
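
    A hedged sketch of the kind of multivariate decoding analysis referred to above: a cross-validated linear classifier tests whether trials containing an integrated percept can be discriminated from controls. Random numbers stand in for real EEG epochs; the feature count and injected effect size are illustrative:

        # Cross-validated linear decoding of simulated EEG trial data.
        import numpy as np
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(2)
        X = rng.normal(size=(200, 64))       # 200 trials, 64 EEG features
        y = rng.integers(0, 2, 200)          # integrated percept present/absent
        X[y == 1] += 0.3                     # inject a weak class difference

        scores = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=10)
        print(f"decoding accuracy: {scores.mean():.2f} (chance = 0.50)")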

  7. Making Room for Second Language Phonotactics: Effects of L2 Learning and Environment on First Language Speech Perception.

    Science.gov (United States)

    Carlson, Matthew T

    2018-04-01

    Language-specific restrictions on sound sequences in words can lead to automatic perceptual repair of illicit sound sequences. As an example, no Spanish words begin with /s/-consonant sequences ([#sC]), and where necessary (e.g., in foreign loanwords) [#sC] is repaired by inserting an initial [e] (cf. esnob, from English snob). As a result, Spanish speakers tend to perceive an illusory [e] before [#sC] sequences. Interestingly, this perceptual illusion is weaker in early Spanish-English bilinguals, whose other language, English, allows [#sC]. The present study explored whether this apparent influence of the English language on Spanish is restricted to early bilinguals, whose early language experience includes a mixture of both languages, or whether later learning of second language (L2) English can also induce a weakening of the first language (L1) perceptual illusion. Two groups of late Spanish-English bilinguals, immersed in Spanish or English, were tested on the same Spanish AX (same-different) discrimination task used in a study by Carlson et al. (2016) and their results compared with the Spanish monolinguals from Carlson et al.'s study. Like early bilinguals, late bilinguals exhibited a reduced impact of perceptual prothesis on discrimination accuracy. Additionally, late bilinguals, particularly in English immersion, were slowest when responding against the Spanish perceptual illusion. Robust L1 perceptual illusions thus appear to be malleable in the face of later L2 learning. It is argued that these results are consonant with the need for late bilinguals to navigate alternative, conflicting representations of the same acoustic material, even in unilingual L1 speech perception tasks.

  8. Methods and models for quantitative assessment of speech intelligibility in cross-language communication

    NARCIS (Netherlands)

    Wijngaarden, S.J. van; Steeneken, H.J.M.; Houtgast, T.

    2001-01-01

    To deal with the effects of nonnative speech communication on speech intelligibility, one must know the magnitude of these effects. To measure this magnitude, suitable test methods must be available. Many of the methods used in cross-language speech communication research are not very suitable for

  9. Acoustic analysis assessment in speech pathology detection

    Directory of Open Access Journals (Sweden)

    Panek Daria

    2015-09-01

    Full Text Available Automatic detection of voice pathologies enables non-invasive, low cost and objective assessments of the presence of disorders, as well as accelerating and improving the process of diagnosis and clinical treatment given to patients. In this work, a vector made up of 28 acoustic parameters is evaluated using principal component analysis (PCA), kernel principal component analysis (kPCA) and an auto-associative neural network (NLPCA) in four kinds of pathology detection (hyperfunctional dysphonia, functional dysphonia, laryngitis, vocal cord paralysis) using the a, i and u vowels, spoken at a high, low and normal pitch. The results indicate that the kPCA and NLPCA methods can be considered a step towards pathology detection of the vocal folds. The results show that such an approach provides acceptable results for this purpose, with the best efficiency levels of around 100%. The study brings together the most commonly used approaches to speech signal processing and leads to a comparison of the machine learning methods determining the health status of the patient.
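
    A minimal sketch of the dimensionality-reduction step being compared, assuming scikit-learn and random stand-ins for the 28-parameter acoustic vectors; component counts and kernel settings are illustrative:

        # Project 28-dimensional acoustic feature vectors with PCA and kernel PCA.
        import numpy as np
        from sklearn.decomposition import PCA, KernelPCA

        rng = np.random.default_rng(3)
        features = rng.normal(size=(100, 28))   # 100 recordings x 28 parameters

        pca = PCA(n_components=5).fit(features)
        low_dim = pca.transform(features)
        print("PCA explained variance:", pca.explained_variance_ratio_.round(2))

        kpca = KernelPCA(n_components=5, kernel="rbf", gamma=0.05)
        low_dim_k = kpca.fit_transform(features)
        print("kPCA output shape:", low_dim_k.shape)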

  11. Perceptual learning: top to bottom.

    Science.gov (United States)

    Amitay, Sygal; Zhang, Yu-Xuan; Jones, Pete R; Moore, David R

    2014-06-01

    Perceptual learning has traditionally been portrayed as a bottom-up phenomenon that improves encoding or decoding of the trained stimulus. Cognitive skills such as attention and memory are thought to drive, guide and modulate learning but are, with notable exceptions, not generally considered to undergo changes themselves as a result of training with simple perceptual tasks. Moreover, shifts in threshold are interpreted as shifts in perceptual sensitivity, with no consideration for non-sensory factors (such as response bias) that may contribute to these changes. Accumulating evidence from our own research and others shows that perceptual learning is a conglomeration of effects, with training-induced changes ranging from the lowest (noise reduction in the phase locking of auditory signals) to the highest (working memory capacity) level of processing, and includes contributions from non-sensory factors that affect decision making even on a "simple" auditory task such as frequency discrimination. We discuss our emerging view of learning as a process that increases the signal-to-noise ratio associated with perceptual tasks by tackling noise sources and inefficiencies that cause performance bottlenecks, and present some implications for training populations other than young, smart, attentive and highly-motivated college students. Crown Copyright © 2013. Published by Elsevier Ltd. All rights reserved.

  12. Speech endpoint detection with non-language speech sounds for generic speech processing applications

    Science.gov (United States)

    McClain, Matthew; Romanowski, Brian

    2009-05-01

    Non-language speech sounds (NLSS) are sounds produced by humans that do not carry linguistic information. Examples of these sounds are coughs, clicks, breaths, and filled pauses such as "uh" and "um" in English. NLSS are prominent in conversational speech, but can be a significant source of errors in speech processing applications. Traditionally, these sounds are ignored by speech endpoint detection algorithms, where speech regions are identified in the audio signal prior to processing. The ability to filter NLSS as a pre-processing step can significantly enhance the performance of many speech processing applications, such as speaker identification, language identification, and automatic speech recognition. In order to be used in all such applications, NLSS detection must be performed without the use of language models that provide knowledge of the phonology and lexical structure of speech. This is especially relevant to situations where the languages used in the audio are not known a priori. We present the results of preliminary experiments using data from American and British English speakers, in which segments of audio are classified as language speech sounds (LSS) or NLSS using a set of acoustic features designed for language-agnostic NLSS detection and a hidden Markov model (HMM) to model speech generation. The results of these experiments indicate that the features and model used are capable of detecting certain types of NLSS, such as breaths and clicks, while detection of other types of NLSS, such as filled pauses, will require future research.
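
    A minimal sketch of HMM-based LSS/NLSS classification in the spirit of this record (not the authors' implementation): two Gaussian HMMs are trained on placeholder feature sequences, and a segment is labeled by whichever model assigns it the higher log-likelihood. The hmmlearn package, the state count and the 13-dimensional features are all assumptions.

        import numpy as np
        from hmmlearn import hmm  # pip install hmmlearn

        rng = np.random.default_rng(1)
        train_lss = rng.normal(0.0, 1.0, size=(500, 13))   # placeholder LSS feature frames
        train_nlss = rng.normal(0.5, 1.5, size=(500, 13))  # placeholder NLSS feature frames

        lss_model = hmm.GaussianHMM(n_components=3, covariance_type="diag", n_iter=20)
        nlss_model = hmm.GaussianHMM(n_components=3, covariance_type="diag", n_iter=20)
        lss_model.fit(train_lss)
        nlss_model.fit(train_nlss)

        def classify(segment):
            """Label an audio segment (frames x features) as 'LSS' or 'NLSS'."""
            return "LSS" if lss_model.score(segment) > nlss_model.score(segment) else "NLSS"

        print(classify(rng.normal(0.0, 1.0, size=(40, 13))))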

  13. Speech Intelligibility in Noise Using Throat and Acoustic Microphones

    National Research Council Canada - National Science Library

    Acker-Mills, Barbara

    2004-01-01

    ... speech intelligibility. Speech intelligibility for signals generated by an acoustic microphone, a throat microphone, and the two microphones together was assessed using the Modified Rhyme Test (MRT...

  14. Examining Business Students' Career Preferences: A Perceptual Space Approach.

    Science.gov (United States)

    Soutar, Geoffrey N.; Clarke, Alexander W.

    1983-01-01

    Proposes a methodology for examining career preferences, which uses perceptual mapping techniques and external preference analysis to assess the attributes individuals believe are important. A study of 158 business students' career preferences suggested the methodology can be useful in analyzing reasons for career preferences. (WAS)

  15. Whole-exome sequencing supports genetic heterogeneity in childhood apraxia of speech.

    Science.gov (United States)

    Worthey, Elizabeth A; Raca, Gordana; Laffin, Jennifer J; Wilk, Brandon M; Harris, Jeremy M; Jakielski, Kathy J; Dimmock, David P; Strand, Edythe A; Shriberg, Lawrence D

    2013-10-02

    Childhood apraxia of speech (CAS) is a rare, severe, persistent pediatric motor speech disorder with associated deficits in sensorimotor, cognitive, language, learning and affective processes. Among other neurogenetic origins, CAS is the disorder segregating with a mutation in FOXP2 in a widely studied, multigenerational London family. We report the first whole-exome sequencing (WES) findings from a cohort of 10 unrelated participants, ages 3 to 19 years, with well-characterized CAS. As part of a larger study of children and youth with motor speech sound disorders, 32 participants were classified as positive for CAS on the basis of a behavioral classification marker using auditory-perceptual and acoustic methods that quantify the competence, precision and stability of a speaker's speech, prosody and voice. WES of 10 randomly selected participants was completed using the Illumina Genome Analyzer IIx Sequencing System. Image analysis, base calling, demultiplexing, read mapping, and variant calling were performed using Illumina software. Software developed in-house was used for variant annotation, prioritization and interpretation to identify those variants likely to be deleterious to neurodevelopmental substrates of speech-language development. Among potentially deleterious variants, clinically reportable findings of interest occurred on a total of five chromosomes (Chr3, Chr6, Chr7, Chr9 and Chr17), which included six genes either strongly associated with CAS (FOXP1 and CNTNAP2) or associated with disorders with phenotypes overlapping CAS (ATP13A4, CNTNAP1, KIAA0319 and SETX). A total of 8 (80%) of the 10 participants had clinically reportable variants in one or two of the six genes, with variants in ATP13A4, KIAA0319 and CNTNAP2 being the most prevalent. Similar to the results reported in emerging WES studies of other complex neurodevelopmental disorders, our findings from this first WES study of CAS are interpreted as support for heterogeneous genetic origins of CAS.

  16. The influence of environmental sound training on the perception of spectrally degraded speech and environmental sounds.

    Science.gov (United States)

    Shafiro, Valeriy; Sheft, Stanley; Gygi, Brian; Ho, Kim Thien N

    2012-06-01

    Perceptual training with spectrally degraded environmental sounds results in improved environmental sound identification, with benefits shown to extend to untrained speech perception as well. The present study extended those findings to examine longer-term training effects as well as effects of mere repeated exposure to sounds over time. Participants received two pretests (1 week apart) prior to a week-long environmental sound training regimen, which was followed by two posttest sessions, separated by another week without training. Spectrally degraded stimuli, processed with a four-channel vocoder, consisted of a 160-item environmental sound test, word and sentence tests, and a battery of basic auditory abilities and cognitive tests. Results indicated significant improvements in all speech and environmental sound scores between the initial pretest and the last posttest with performance increments following both exposure and training. For environmental sounds (the stimulus class that was trained), the magnitude of positive change that accompanied training was much greater than that due to exposure alone, with improvement for untrained sounds roughly comparable to the speech benefit from exposure. Additional tests of auditory and cognitive abilities showed that speech and environmental sound performance were differentially correlated with tests of spectral and temporal-fine-structure processing, whereas working memory and executive function were correlated with speech, but not environmental sound perception. These findings indicate generalizability of environmental sound training and provide a basis for implementing environmental sound training programs for cochlear implant (CI) patients.
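
    For readers unfamiliar with the stimulus processing, the following is a minimal sketch of a four-channel noise vocoder of the kind referenced in this record; the band edges and filter order are illustrative, not the study's actual settings.

        import numpy as np
        from scipy.signal import butter, filtfilt, hilbert

        def noise_vocode(x, fs, edges=(100, 500, 1200, 2500, 6000)):
            """Replace each analysis band of x with envelope-modulated noise."""
            rng = np.random.default_rng(0)
            out = np.zeros(len(x))
            for lo, hi in zip(edges[:-1], edges[1:]):
                b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
                env = np.abs(hilbert(filtfilt(b, a, x)))            # band envelope
                carrier = filtfilt(b, a, rng.normal(size=len(x)))   # band-limited noise
                out += env * carrier
            return out / np.max(np.abs(out))

        fs = 16000
        t = np.arange(fs) / fs
        degraded = noise_vocode(np.sin(2 * np.pi * 220 * t), fs)  # toy input signal

    Temporal envelopes survive this processing while spectral detail is discarded, which is why such stimuli approximate what cochlear implant users hear.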

  17. Impairments of speech fluency in Lewy body spectrum disorder.

    Science.gov (United States)

    Ash, Sharon; McMillan, Corey; Gross, Rachel G; Cook, Philip; Gunawardena, Delani; Morgan, Brianna; Boller, Ashley; Siderowf, Andrew; Grossman, Murray

    2012-03-01

    Few studies have examined connected speech in demented and non-demented patients with Parkinson's disease (PD). We assessed the speech production of 35 patients with Lewy body spectrum disorder (LBSD), including non-demented PD patients, patients with PD dementia (PDD), and patients with dementia with Lewy bodies (DLB), in a semi-structured narrative speech sample in order to characterize impairments of speech fluency and to determine the factors contributing to reduced speech fluency in these patients. Both demented and non-demented PD patients exhibited reduced speech fluency, characterized by reduced overall speech rate and long pauses between sentences. Reduced speech rate in LBSD correlated with measures of between-utterance pauses, executive functioning, and grammatical comprehension. Regression analyses related non-fluent speech, grammatical difficulty, and executive difficulty to atrophy in frontal brain regions. These findings indicate that multiple factors contribute to slowed speech in LBSD, and this is mediated in part by disease in frontal brain regions. Copyright © 2011 Elsevier Inc. All rights reserved.

  18. A Diagnostic Marker to Discriminate Childhood Apraxia of Speech from Speech Delay: IV. the Pause Marker Index

    Science.gov (United States)

    Shriberg, Lawrence D.; Strand, Edythe A.; Fourakis, Marios; Jakielski, Kathy J.; Hall, Sheryl D.; Karlsson, Heather B.; Mabie, Heather L.; McSweeny, Jane L.; Tilkens, Christie M.; Wilson, David L.

    2017-01-01

    Purpose: Three previous articles provided rationale, methods, and several forms of validity support for a diagnostic marker of childhood apraxia of speech (CAS), termed the pause marker (PM). Goals of the present article were to assess the validity and stability of the PM Index (PMI) to scale CAS severity. Method: PM scores and speech, prosody,…

  19. Perceptual learning and adult cortical plasticity.

    Science.gov (United States)

    Gilbert, Charles D; Li, Wu; Piech, Valentin

    2009-06-15

    The visual cortex retains the capacity for experience-dependent changes, or plasticity, of cortical function and cortical circuitry, throughout life. These changes constitute the mechanism of perceptual learning in normal visual experience and in recovery of function after CNS damage. Such plasticity can be seen at multiple stages in the visual pathway, including primary visual cortex. The manifestation of the functional changes associated with perceptual learning involves both long-term modification of cortical circuits during the course of learning, and short-term dynamics in the functional properties of cortical neurons. These dynamics are subject to top-down influences of attention, expectation and perceptual task. As a consequence, each cortical area is an adaptive processor, altering its function in accordance with immediate perceptual demands.

  20. Speech understanding in background noise with the two-microphone adaptive beamformer BEAM in the Nucleus Freedom Cochlear Implant System.

    Science.gov (United States)

    Spriet, Ann; Van Deun, Lieselot; Eftaxiadis, Kyriaky; Laneau, Johan; Moonen, Marc; van Dijk, Bas; van Wieringen, Astrid; Wouters, Jan

    2007-02-01

    This paper evaluates the benefit of the two-microphone adaptive beamformer BEAM in the Nucleus Freedom cochlear implant (CI) system for speech understanding in background noise by CI users. A double-blind evaluation of the two-microphone adaptive beamformer BEAM and a hardware directional microphone was carried out with five adult Nucleus CI users. The test procedure consisted of a pre- and post-test in the lab and a 2-wk trial period at home. In the pre- and post-test, the speech reception threshold (SRT) with sentences and the percentage correct phoneme scores for CVC words were measured in quiet and background noise at different signal-to-noise ratios. Performance was assessed for two different noise configurations (with a single noise source and with three noise sources) and two different noise materials (stationary speech-weighted noise and multitalker babble). During the 2-wk trial period at home, the CI users evaluated the noise reduction performance in different listening conditions by means of the SSQ questionnaire. In addition to the perceptual evaluation, the noise reduction performance of the beamformer was measured physically as a function of the direction of the noise source. Significant improvements of both the SRT in noise (average improvement of 5-16 dB) and the percentage correct phoneme scores (average improvement of 10-41%) were observed with BEAM compared to the standard hardware directional microphone. In addition, the SSQ questionnaire and subjective evaluation in controlled and real-life scenarios suggested a possible preference for the beamformer in noisy environments. The evaluation demonstrates that the adaptive noise reduction algorithm BEAM in the Nucleus Freedom CI system may significantly improve speech perception by cochlear implantees in noisy listening conditions. This is the first monolateral (adaptive) noise reduction strategy actually implemented in a mainstream commercial CI.
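
    BEAM itself is proprietary, but the following is a minimal sketch of the general idea behind a two-microphone adaptive noise canceller, here using an NLMS filter that adapts a noise reference to cancel interference in the primary signal; all signals, the tap count and the step size are invented for illustration.

        import numpy as np

        def nlms_cancel(primary, reference, taps=32, mu=0.1, eps=1e-8):
            """Adaptively subtract the filtered reference from the primary signal."""
            w = np.zeros(taps)
            out = np.zeros_like(primary)
            for n in range(taps, len(primary)):
                x = reference[n - taps:n][::-1]      # most recent reference frame
                e = primary[n] - w @ x               # enhanced output sample
                w += mu * e * x / (x @ x + eps)      # NLMS weight update
                out[n] = e
            return out

        rng = np.random.default_rng(4)
        noise = rng.normal(size=8000)
        speech = np.sin(2 * np.pi * 440 * np.arange(8000) / 8000.0)  # toy "speech"
        primary = speech + 0.8 * np.convolve(noise, [0.5, 0.3], mode="same")
        enhanced = nlms_cancel(primary, noise)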

  1. Cognitive structure of occupational risks represented by a perceptual map.

    Science.gov (United States)

    Cardoso-Junior, M M; Scarpel, R A

    2012-01-01

    The main focus of risk management is technical and rational analysis of operational risks and of those imposed by the occupational environment. This work seeks to contribute to the study of risk perception and to a better understanding of how a group of occupational safety students assesses a set of activities and environmental agents. To this end, theory grounded in the psychometric paradigm and multivariate analysis tools, mainly multidimensional scaling, generalized Procrustes analysis and facet theory, were used to construct a perceptual map of occupational risks. The results showed that the essential characteristics of the risks, which were initially split into 4 facets, were detected and maintained in the perceptual map. It was not possible to reveal the cognitive structure of the group, because the variability among the students was too high. Differences among the analyzed risks could not be detected in the perceptual map of the group either.
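
    A minimal sketch of the multidimensional scaling step used to build such a perceptual map (the generalized Procrustes and facet analyses are omitted); the risk labels and the dissimilarity matrix are invented for illustration.

        import numpy as np
        from sklearn.manifold import MDS

        risks = ["noise", "heights", "chemicals", "electricity"]
        # Symmetric matrix of mean rated dissimilarities between risks (0 = identical).
        D = np.array([[0.0, 0.8, 0.5, 0.7],
                      [0.8, 0.0, 0.9, 0.4],
                      [0.5, 0.9, 0.0, 0.6],
                      [0.7, 0.4, 0.6, 0.0]])

        coords = MDS(n_components=2, dissimilarity="precomputed",
                     random_state=0).fit_transform(D)
        for label, (x, y) in zip(risks, coords):
            print(f"{label}: ({x:+.2f}, {y:+.2f})")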

  2. Perceptual inference.

    Science.gov (United States)

    Aggelopoulos, Nikolaos C

    2015-08-01

    Perceptual inference refers to the ability to infer sensory stimuli from predictions that result from internal neural representations built through prior experience. Methods of Bayesian statistical inference and decision theory model cognition adequately by using error sensing either in guiding action or in "generative" models that predict the sensory information. In this framework, perception can be seen as a process qualitatively distinct from sensation, a process of information evaluation using previously acquired and stored representations (memories) that is guided by sensory feedback. The stored representations can be utilised as internal models of sensory stimuli enabling long term associations, for example in operant conditioning. Evidence for perceptual inference is contributed by such phenomena as the cortical co-localisation of object perception with object memory, the response invariance in the responses of some neurons to variations in the stimulus, as well as from situations in which perception can be dissociated from sensation. In the context of perceptual inference, sensory areas of the cerebral cortex that have been facilitated by a priming signal may be regarded as comparators in a closed feedback loop, similar to the better known motor reflexes in the sensorimotor system. The adult cerebral cortex can be regarded as similar to a servomechanism, in using sensory feedback to correct internal models, producing predictions of the outside world on the basis of past experience. Copyright © 2015 Elsevier Ltd. All rights reserved.

  3. Normative perceptual estimates for 91 healthy subjects age 60-75

    DEFF Research Database (Denmark)

    Wilms, Inge Linda; Nielsen, Simon

    2014-01-01

    Visual perception serves as the basis for much of the higher-level cognitive processing as well as human activity in general. Here we present normative estimates for the following components of visual perception: the visual perceptual threshold, the visual short-term memory capacity and the visual perceptual encoding/decoding speed (processing speed) of visual short-term memory, based on an assessment of 91 healthy subjects aged 60-75. The estimates are presented at total sample level as well as at gender level. The estimates were modelled from input from a whole-report assessment based on A Theory of Visual Attention (TVA) ... speed of Visual Short-term Memory (VSTM) but not the capacity of VSTM nor the visual threshold. The estimates will be useful for future studies into the effects of various types of intervention and training on cognition in general and visual attention in particular.
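
    A minimal sketch of the exponential accrual at the heart of TVA-style whole-report modelling, assuming the standard parameterization with processing speed v (items/s) and perceptual threshold t0 (s); the parameter values below are illustrative, not the study's estimates.

        import math

        def p_encoded(t, v, t0):
            """Probability an item is encoded into VSTM within exposure t (s),
            given processing speed v (items/s) and perceptual threshold t0 (s)."""
            return 0.0 if t <= t0 else 1.0 - math.exp(-v * (t - t0))

        print(round(p_encoded(0.1, 20.0, 0.02), 2))  # ~0.8 for an 80-ms effective exposure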

  4. Perceptual and cognitive effects of antipsychotics in first-episode schizophrenia: the potential impact of GABA concentration in the visual cortex.

    Science.gov (United States)

    Kelemen, Oguz; Kiss, Imre; Benedek, György; Kéri, Szabolcs

    2013-12-02

    Schizophrenia is characterized by anomalous perceptual experiences (e.g., sensory irritation, inundation, and flooding) and specific alterations in visual perception. We aimed to investigate the effects of short-term antipsychotic medication on these perceptual alterations. We assessed 28 drug-naïve first episode patients with schizophrenia and 20 matched healthy controls at baseline and follow-up 8 weeks later. Contrast sensitivity was measured with steady- and pulsed-pedestal tests. Participants also received a motion coherence task, the Structured Interview for Assessing Perceptual Anomalies (SIAPA), and the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS). Proton magnetic resonance spectroscopy was used to measure gamma-aminobutyric acid (GABA) levels in the occipital cortex (GABA/total creatine [Cr] ratio). Results revealed that, comparing baseline and follow-up values, patients with schizophrenia exhibited a marked sensitivity reduction on the steady-pedestal test at low spatial frequency. Anomalous perceptual experiences were also significantly ameliorated. Antipsychotic medications had no effect on motion perception. RBANS scores showed mild improvements. At baseline, but not at follow-up, patients with schizophrenia outperformed controls on the steady-pedestal test at low spatial frequency. The dysfunction of motion perception (higher coherence threshold in patients relative to controls) was similar at both assessments. There were reduced GABA levels in schizophrenia at both assessments, which were not related to perceptual functions. These results suggest that antipsychotics dominantly affect visual contrast sensitivity and anomalous perceptual experiences. The prominent dampening effect on low spatial frequency in the steady-pedestal test might indicate the normalization of putatively overactive magnocellular retino-geniculo-cortical pathways. © 2013.

  5. Reproducibility and validity of patient-rated assessment of speech, swallowing, and saliva control in Parkinson’s Disease

    NARCIS (Netherlands)

    Machiel Zwarts; Johanna Kalf; Bastiaan Bloem; George Borm; Marten Munneke; Bert de Swart

    2012-01-01

    To report on the development and psychometric evaluation of the Radboud Oral Motor Inventory for Parkinson's Disease (ROMP), a newly developed patient-rated assessment of speech, swallowing, and saliva control in patients with Parkinson's disease (PD). To evaluate reproducibility, 60 patients

  6. Adaptation and perceptual norms

    Science.gov (United States)

    Webster, Michael A.; Yasuda, Maiko; Haber, Sara; Leonard, Deanne; Ballardini, Nicole

    2007-02-01

    We used adaptation to examine the relationship between perceptual norms (the stimuli observers describe as psychologically neutral) and response norms (the stimulus levels that leave visual sensitivity in a neutral or balanced state). Adapting to stimuli on opposite sides of a neutral point (e.g. redder or greener than white) biases appearance in opposite ways. Thus the adapting stimulus can be titrated to find the unique adapting level that does not bias appearance. We compared these response norms to subjectively defined neutral points both within the same observer (at different retinal eccentricities) and between observers. These comparisons were made for visual judgments of color, image focus, and human faces, stimuli that are very different and may depend on very different levels of processing, yet which share the property that for each there is a well defined and perceptually salient norm. In each case the adaptation aftereffects were consistent with an underlying sensitivity basis for the perceptual norm. Specifically, response norms were similar to and thus covaried with the perceptual norm, and under common adaptation differences between subjectively defined norms were reduced. These results are consistent with models of norm-based codes and suggest that these codes underlie an important link between visual coding and visual experience.

  7. Assessing the applied benefits of perceptual training: Lessons from studies of training working-memory.

    Science.gov (United States)

    Jacoby, Nori; Ahissar, Merav

    2015-01-01

    In the 1980s to 1990s, studies of perceptual learning focused on the specificity of training to basic visual attributes such as retinal position and orientation. These studies were considered scientifically innovative since they suggested the existence of plasticity in the early stimulus-specific sensory cortex. Twenty years later, perceptual training has gradually shifted to potential applications, and research tends to be devoted to showing transfer. In this paper we analyze two key methodological issues related to the interpretation of transfer. The first has to do with the absence of a control group or the sole use of a test-retest group in traditional perceptual training studies. The second deals with claims of transfer based on the correlation between improvement on the trained and transfer tasks. We analyze examples from the general intelligence literature dealing with the impact on general intelligence of training on a working memory task. The re-analyses show that the reports of a significantly larger transfer of the trained group over the test-retest group fail to replicate when transfer is compared to an actively trained group. Furthermore, the correlations reported in this literature between gains on the trained and transfer tasks can be replicated even when no transfer is assumed.

  8. Parent-child interaction in motor speech therapy.

    Science.gov (United States)

    Namasivayam, Aravind Kumar; Jethava, Vibhuti; Pukonen, Margit; Huynh, Anna; Goshulak, Debra; Kroll, Robert; van Lieshout, Pascal

    2018-01-01

    This study measures the reliability and sensitivity of a modified Parent-Child Interaction Observation scale (PCIOs) used to monitor the quality of parent-child interaction. The scale is part of a home-training program employed with direct motor speech intervention for children with speech sound disorders. Eighty-four preschool-age children with speech sound disorders were provided either high-intensity (2×/week/10 weeks) or low-intensity (1×/week/10 weeks) motor speech intervention. Clinicians completed the PCIOs at the beginning, middle, and end of treatment. Inter-rater reliability (Kappa scores) was determined by an independent speech-language pathologist who assessed videotaped sessions at the midpoint of the treatment block. Intervention sensitivity of the scale was evaluated using a Friedman test for each item and then followed up with Wilcoxon pairwise comparisons where appropriate. We obtained fair-to-good inter-rater reliability (Kappa = 0.33-0.64) for the PCIOs using only video-based scoring. Child-related items were more strongly influenced by differences in treatment intensity than parent-related items, where a greater number of sessions positively influenced parent learning of treatment skills and child behaviors. The adapted PCIOs is reliable and sensitive enough to monitor the quality of parent-child interactions in a 10-week block of motor speech intervention with adjunct home therapy. Implications for rehabilitation: Parent-centered therapy is considered a cost-effective method of speech and language service delivery. However, parent-centered models may be difficult to implement for treatments such as developmental motor speech interventions that require a high degree of skill and training. For children with speech sound disorders and motor speech difficulties, a translated and adapted version of the parent-child observation scale was found to be sufficiently reliable and sensitive to assess changes in the quality of the parent-child interactions during

  9. Cognitive load effects on early visual perceptual processing.

    Science.gov (United States)

    Liu, Ping; Forte, Jason; Sewell, David; Carter, Olivia

    2018-05-01

    Contrast-based early visual processing has largely been considered to involve autonomous processes that do not need the support of cognitive resources. However, as spatial attention is known to modulate early visual perceptual processing, we explored whether cognitive load could similarly impact contrast-based perception. We used a dual-task paradigm to assess the impact of a concurrent working memory task on the performance of three different early visual tasks. The results from Experiment 1 suggest that cognitive load can modulate early visual processing. No effects of cognitive load were seen in Experiments 2 or 3. Together, the findings provide evidence that under some circumstances cognitive load effects can penetrate the early stages of visual processing and that higher cognitive function and early perceptual processing may not be as independent as was once thought.

  10. Perceptual context and individual differences in the language proficiency of preschool children.

    Science.gov (United States)

    Banai, Karen; Yifat, Rachel

    2016-02-01

    Although the contribution of perceptual processes to language skills during infancy is well recognized, the role of perception in linguistic processing beyond infancy is not well understood. In the experiments reported here, we asked whether manipulating the perceptual context in which stimuli are presented across trials influences how preschool children perform visual (shape-size identification; Experiment 1) and auditory (syllable identification; Experiment 2) tasks. Another goal was to determine whether the sensitivity to perceptual context can explain part of the variance in oral language skills in typically developing preschool children. Perceptual context was manipulated by changing the relative frequency with which target visual (Experiment 1) and auditory (Experiment 2) stimuli were presented in arrays of fixed size, and identification of the target stimuli was tested. Oral language skills were assessed using vocabulary, word definition, and phonological awareness tasks. Changes in perceptual context influenced the performance of the majority of children on both identification tasks. Sensitivity to perceptual context accounted for 7% to 15% of the variance in language scores. We suggest that context effects are an outcome of a statistical learning process. Therefore, the current findings demonstrate that statistical learning can facilitate both visual and auditory identification processes in preschool children. Furthermore, consistent with previous findings in infants and in older children and adults, individual differences in statistical learning were found to be associated with individual differences in language skills of preschool children. Copyright © 2015 Elsevier Inc. All rights reserved.

  11. Childhood apraxia of speech and multiple phonological disorders in Cairo-Egyptian Arabic speaking children: language, speech, and oro-motor differences.

    Science.gov (United States)

    Aziz, Azza Adel; Shohdi, Sahar; Osman, Dalia Mostafa; Habib, Emad Iskander

    2010-06-01

    Childhood apraxia of speech is a neurological childhood speech-sound disorder in which the precision and consistency of the movements underlying speech are impaired in the absence of neuromuscular deficits. Children with childhood apraxia of speech and those with multiple phonological disorder share some common phonological errors that can be misleading in diagnosis. This study asked whether there are significant differences in language, speech and non-speech oral performance between children with childhood apraxia of speech, children with multiple phonological disorder and typically developing children that could serve a differential diagnostic purpose. 30 pre-school children between the ages of 4 and 6 years served as participants. Each of these children represented one of 3 possible subject groups: Group 1: multiple phonological disorder; Group 2: suspected cases of childhood apraxia of speech; Group 3: control group with no communication disorder. Assessment procedures included parent interviews, testing of non-speech oral motor skills and testing of speech skills. Data showed that children with suspected childhood apraxia of speech scored significantly lower only in their expressive language abilities. Non-speech tasks did not identify significant differences between the childhood apraxia of speech and multiple phonological disorder groups, except for those which required two sequential motor performances. In speech tasks, both consonant and vowel accuracy were significantly lower and more inconsistent in the childhood apraxia of speech group than in the multiple phonological disorder group. Syllable number, shape and sequence accuracy differed significantly between the childhood apraxia of speech group and the other two groups. In addition, children with childhood apraxia of speech showed greater difficulty in processing prosodic features, indicating a clear need to address these variables in the differential diagnosis and treatment of children with childhood apraxia of speech.

  12. Music and Speech Perception in Children Using Sung Speech.

    Science.gov (United States)

    Nie, Yingjiu; Galvin, John J; Morikawa, Michael; André, Victoria; Wheeler, Harley; Fu, Qian-Jie

    2018-01-01

    This study examined music and speech perception in normal-hearing children with some or no musical training. Thirty children (mean age = 11.3 years), 15 with and 15 without formal music training, participated in the study. Music perception was measured using a melodic contour identification (MCI) task; stimuli were a piano sample or sung speech with a fixed timbre (same word for each note) or a mixed timbre (different words for each note). Speech perception was measured in quiet and in steady noise using a matrix-styled sentence recognition task; stimuli were naturally intonated speech or sung speech with a fixed pitch (same note for each word) or a mixed pitch (different notes for each word). Significant musician advantages were observed for MCI and speech in noise but not for speech in quiet. MCI performance was significantly poorer with the mixed timbre stimuli. Speech performance in noise was significantly poorer with the fixed or mixed pitch stimuli than with spoken speech. Across all subjects, age at testing and MCI performance were significantly correlated with speech performance in noise. MCI and speech performance in quiet were significantly poorer for children than for adults from a related study using the same stimuli and tasks; speech performance in noise was significantly poorer for young than for older children. Long-term music training appeared to benefit melodic pitch perception and speech understanding in noise in these pediatric listeners.

  13. “Biotecnological War” - A Conceptual And Perceptual Assessment Tool For Teaching Biotechnology And Protein Chemistry For Undergraduate Students In Biological Sciences.

    OpenAIRE

    C. R. C. Cruz et al.

    2017-01-01

    "Biotecnological War" board game, a conceptual and perceptual assessment tool for biotechnology and protein chemistry teaching for undergraduate students in biological sciences and related areas. It is a proposal initially conceived as an alternative complementary tool for biochemistry teaching of proteins and peptides, challenging students, aiming to review concepts transmitted in classroom, stimulating diverse student’s abilities, such as their creativity, competitiveness and resource manag...

  14. Association between unsafe driving performance and cognitive-perceptual dysfunction in older drivers.

    Science.gov (United States)

    Park, Si-Woon; Choi, Eun Seok; Lim, Mun Hee; Kim, Eun Joo; Hwang, Sung Il; Choi, Kyung-In; Yoo, Hyun-Chul; Lee, Kuem Ju; Jung, Hi-Eun

    2011-03-01

    To find an association between cognitive-perceptual problems of older drivers and unsafe driving performance during simulated automobile driving in a virtual environment. Cross-sectional study. A driver evaluation clinic in a rehabilitation hospital. Fifty-five drivers aged 65 years or older and 48 drivers in their late twenties to early forties. All participants underwent evaluation of cognitive-perceptual function and driving performance, and the results were compared between older and younger drivers. The association between cognitive-perceptual function and driving performance was analyzed. Cognitive-perceptual function was evaluated with the Cognitive Perceptual Assessment for Driving (CPAD), a computer-based assessment tool consisting of depth perception, sustained attention, divided attention, the Stroop test, the digit span test, field dependency, and trail-making test A and B. Driving performance was evaluated with use of a virtual reality-based driving simulator. During simulated driving, car crashes were recorded, and an occupational therapist observed unsafe performances in controlling speed, braking, steering, vehicle positioning, making lane changes, and making turns. Thirty-five older drivers did not pass the CPAD test, whereas all of the younger drivers passed the test. When using the driving simulator, a significantly greater number of older drivers experienced car crashes and demonstrated unsafe performance in controlling speed, steering, and making lane changes. CPAD results were associated with car crashes, steering, vehicle positioning, and making lane changes. Older drivers who did not pass the CPAD test are 4 times more likely to experience a car crash, 3.5 times more likely to make errors in steering, 2.8 times more likely to make errors in vehicle positioning, and 6.5 times more likely to make errors in lane changes than are drivers who passed the CPAD test. Unsafe driving performance and car crashes during simulated driving were more
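
    Statements such as "4 times more likely to experience a car crash" correspond to odds ratios computed from a 2x2 table of CPAD outcome versus driving outcome. A minimal sketch with invented cell counts (not the study's data):

        def odds_ratio(exposed_event, exposed_no_event, unexposed_event, unexposed_no_event):
            """Odds ratio for a 2x2 table (e.g. failed CPAD vs. crashed in the simulator)."""
            return (exposed_event / exposed_no_event) / (unexposed_event / unexposed_no_event)

        # Hypothetical counts: of 35 drivers who failed the CPAD, 20 crashed;
        # of 20 who passed, 5 crashed.
        print(odds_ratio(20, 15, 5, 15))  # -> 4.0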

  15. Identifying Residual Speech Sound Disorders in Bilingual Children: A Japanese-English Case Study

    Science.gov (United States)

    Preston, Jonathan L.; Seki, Ayumi

    2011-01-01

    Purpose: To describe (a) the assessment of residual speech sound disorders (SSDs) in bilinguals by distinguishing speech patterns associated with second language acquisition from patterns associated with misarticulations and (b) how assessment of domains such as speech motor control and phonological awareness can provide a more complete…

  16. Slow perceptual processing at the core of developmental dyslexia: a parameter-based assessment of visual attention.

    Science.gov (United States)

    Stenneken, Prisca; Egetemeir, Johanna; Schulte-Körne, Gerd; Müller, Hermann J; Schneider, Werner X; Finke, Kathrin

    2011-10-01

    The cognitive causes as well as the neurological and genetic basis of developmental dyslexia, a complex disorder of written language acquisition, are intensely discussed with regard to multiple-deficit models. Accumulating evidence has revealed dyslexics' impairments in a variety of tasks requiring visual attention. The heterogeneity of these experimental results, however, points to the need for measures that are sufficiently sensitive to differentiate between impaired and preserved attentional components within a unified framework. This first parameter-based group study of attentional components in developmental dyslexia addresses potentially altered attentional components that have recently been associated with parietal dysfunctions in dyslexia. We aimed to isolate the general attentional resources that might underlie reduced span performance, i.e., either a deficient working memory storage capacity, or a slowing in visual perceptual processing speed, or both. Furthermore, by analysing attentional selectivity in dyslexia, we addressed a potential lateralized abnormality of visual attention, i.e., a previously suggested rightward spatial deviation compared to normal readers. We investigated a group of high-achieving young adults with persisting dyslexia and matched normal readers in an experimental whole report and a partial report of briefly presented letter arrays. Possible deviations in the parametric values of the dyslexic compared to the control group were taken as markers for the underlying deficit. The dyslexic group showed a striking reduction in perceptual processing speed (by 26% compared to controls) while their working memory storage capacity was in the normal range. In addition, a spatial deviation of attentional weighting compared to the control group was confirmed in dyslexic readers, which was larger in participants with a more severe dyslexic disorder. In general, the present study supports the relevance of perceptual processing speed in disorders

  17. Effects of Semantic Context and Feedback on Perceptual Learning of Speech Processed through an Acoustic Simulation of a Cochlear Implant

    Science.gov (United States)

    Loebach, Jeremy L.; Pisoni, David B.; Svirsky, Mario A.

    2010-01-01

    The effect of feedback and materials on perceptual learning was examined in listeners with normal hearing who were exposed to cochlear implant simulations. Generalization was most robust when feedback paired the spectrally degraded sentences with their written transcriptions, promoting mapping between the degraded signal and its acoustic-phonetic…

  18. Time-Frequency Feature Representation Using Multi-Resolution Texture Analysis and Acoustic Activity Detector for Real-Life Speech Emotion Recognition

    Directory of Open Access Journals (Sweden)

    Kun-Ching Wang

    2015-01-01

    The classification of emotional speech is widely considered in speech-related research on human-computer interaction (HCI). The purpose of this paper is to present a novel feature extraction based on multi-resolution texture image information (MRTII). The MRTII feature set is derived from multi-resolution texture analysis for the characterization and classification of different emotions in a speech signal. The motivation is that emotions have different intensity values in different frequency bands. In terms of human visual perception, the texture properties of the multi-resolution spectrogram of emotional speech should be a good feature set for emotion classification in speech. Furthermore, multi-resolution texture analysis can give a clearer discrimination between emotions than uniform-resolution texture analysis. In order to provide high accuracy of emotional discrimination, especially in real life, an acoustic activity detection (AAD) algorithm must be applied in the MRTII-based feature extraction. Considering the presence of many blended emotions in real life, this paper makes use of two corpora of naturally occurring dialogs recorded in real-life call centers. Compared with traditional Mel-scale Frequency Cepstral Coefficients (MFCC) and state-of-the-art features, the MRTII features also improve the correct classification rates of the proposed systems across different language databases. Experimental results show that the proposed MRTII-based feature information, inspired by human visual perception of the spectrogram image, can provide significant classification performance for real-life emotional recognition in speech.
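
    A minimal sketch of extracting texture statistics from a speech spectrogram image, in the spirit of (but far simpler than) the MRTII idea above; it assumes scikit-image >= 0.19 for graycomatrix, and the input signal is a synthetic placeholder for real speech.

        import numpy as np
        from scipy.signal import spectrogram
        from skimage.feature import graycomatrix, graycoprops

        fs = 16000
        x = np.random.default_rng(2).normal(size=fs)  # placeholder for a speech signal
        _, _, S = spectrogram(x, fs=fs, nperseg=256)

        img = np.log1p(S)                             # log-compress the magnitudes
        img = (img / img.max() * 255).astype(np.uint8)  # quantize to an 8-bit image

        glcm = graycomatrix(img, distances=[1], angles=[0], levels=256,
                            symmetric=True, normed=True)
        features = {prop: graycoprops(glcm, prop)[0, 0]
                    for prop in ("contrast", "homogeneity", "energy")}
        print(features)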

  19. Comparison of Two Music Training Approaches on Music and Speech Perception in Cochlear Implant Users.

    Science.gov (United States)

    Fuller, Christina D; Galvin, John J; Maat, Bert; Başkent, Deniz; Free, Rolien H

    2018-01-01

    In normal-hearing (NH) adults, long-term music training may benefit music and speech perception, even when listening to spectro-temporally degraded signals as experienced by cochlear implant (CI) users. In this study, we compared two different music training approaches in CI users and their effects on speech and music perception, as it remains unclear which approach to music training might be best. The approaches differed in terms of music exercises and social interaction. For the pitch/timbre group, melodic contour identification (MCI) training was performed using computer software. For the music therapy group, training involved face-to-face group exercises (rhythm perception, musical speech perception, music perception, singing, vocal emotion identification, and music improvisation). For the control group, training involved group nonmusic activities (e.g., writing, cooking, and woodworking). Training consisted of weekly 2-hr sessions over a 6-week period. Speech intelligibility in quiet and noise, vocal emotion identification, MCI, and quality of life (QoL) were measured before and after training. The different training approaches appeared to offer different benefits for music and speech perception. Training effects were observed within-domain (better MCI performance for the pitch/timbre group), with little cross-domain transfer of music training (emotion identification significantly improved for the music therapy group). While training had no significant effect on QoL, the music therapy group reported better perceptual skills across training sessions. These results suggest that more extensive and intensive training approaches that combine pitch training with the social aspects of music therapy may further benefit CI users.

  20. Modeling Sluggishness in Binaural Unmasking of Speech for Maskers With Time-Varying Interaural Phase Differences.

    Science.gov (United States)

    Hauth, Christopher F; Brand, Thomas

    2018-01-01

    In studies investigating binaural processing in human listeners, relatively long and task-dependent time constants of a binaural window ranging from 10 ms to 250 ms have been observed. Such time constants are often thought to reflect "binaural sluggishness." In this study, the effect of binaural sluggishness on binaural unmasking of speech in stationary speech-shaped noise is investigated in 10 listeners with normal hearing. In order to design a masking signal with temporally varying binaural cues, the interaural phase difference of the noise was modulated sinusoidally with frequencies ranging from 0.25 Hz to 64 Hz. The lowest, that is the best, speech reception thresholds (SRTs) were observed for the lowest modulation frequency. SRTs increased with increasing modulation frequency up to 4 Hz. For higher modulation frequencies, SRTs remained constant in the range of 1 dB to 1.5 dB below the SRT determined in the diotic situation. The outcome of the experiment was simulated using a short-term binaural speech intelligibility model, which combines an equalization-cancellation (EC) model with the speech intelligibility index. This model segments the incoming signal into 23.2-ms time frames in order to predict release from masking in modulated noises. In order to predict the results from this study, the model required a further time constant applied to the EC mechanism representing binaural sluggishness. The best agreement with perceptual data was achieved using a temporal window of 200 ms in the EC mechanism.
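
    A minimal sketch of generating a masker whose interaural phase difference (IPD) is modulated sinusoidally, as described in this record; the sampling rate, modulation depth and modulation rate below are illustrative, not the study's exact settings.

        import numpy as np
        from scipy.signal import hilbert

        fs = 44100
        dur = 2.0
        fm = 4.0  # IPD modulation frequency in Hz
        t = np.arange(int(fs * dur)) / fs

        noise = np.random.default_rng(3).normal(size=t.size)
        analytic = hilbert(noise)                        # complex analytic signal
        ipd = (np.pi / 2) * np.sin(2 * np.pi * fm * t)   # time-varying IPD in radians

        # Split the phase modulation symmetrically across the two ears.
        left = np.real(analytic * np.exp(+1j * ipd / 2))
        right = np.real(analytic * np.exp(-1j * ipd / 2))
        stereo = np.stack([left, right], axis=1)
        print(stereo.shape)  # (samples, 2) masker for diotic speech presentation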