Murphy, Cristina F B; Pagan-Neves, Luciana O; Wertzner, Haydée F; Schochat, Eliane
Although research has demonstrated that children with specific language impairment (SLI) and reading disorder (RD) exhibit sustained attention deficits, no study has investigated sustained attention in children with speech sound disorder (SSD). Given the overlap of symptoms, such as phonological memory deficits, between these different language disorders (i.e., SLI, SSD and RD) and the relationships between working memory, attention and language processing, it is worthwhile to investigate whether deficits in sustained attention also occur in children with SSD. A total of 55 children (18 diagnosed with SSD (mean age 8.11 ± 1.231 years) and 37 typically developing children (mean age 8.76 ± 1.461 years)) were invited to participate in this study. Auditory and visual sustained-attention tasks were applied. Children with SSD performed worse on these tasks; they committed a greater number of auditory false alarms and exhibited a significant decline in performance over the course of the auditory detection task. The extent to which performance is related to auditory perceptual difficulties and probable working memory deficits is discussed. Further studies are needed to better understand the specific nature of these deficits and their clinical implications.
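Sustained-attention (continuous performance) tasks of this kind are commonly scored with signal detection measures, where rising false alarms lower sensitivity (d'). A minimal sketch with hypothetical counts (not the study's data):

```python
from scipy.stats import norm

def dprime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity (d') and criterion (c), with a log-linear correction so
    hit/false-alarm rates of exactly 0 or 1 stay finite."""
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z_hit, z_fa = norm.ppf(hit_rate), norm.ppf(fa_rate)
    return z_hit - z_fa, -0.5 * (z_hit + z_fa)

# Hypothetical block-by-block counts for one child: rising false alarms
# across blocks lower d', mirroring a decline in sustained attention.
for block, (h, m, fa, cr) in enumerate([(18, 2, 3, 17), (15, 5, 8, 12)], 1):
    d, c = dprime(h, m, fa, cr)
    print(f"block {block}: d' = {d:.2f}, c = {c:.2f}")
```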
Powell, Thomas W.
This article describes a procedure to aid in the clinical appraisal of child speech. The approach, based on the work by Dinnsen, Chin, Elbert, and Powell (1990; Some constraints on functionally disordered phonologies: Phonetic inventories and phonotactics. "Journal of Speech and Hearing Research", 33, 28-37), uses a railway idiom to track gains in…
Furlong, Lisa; Erickson, Shane; Morris, Meg E
With the current worldwide workforce shortage of Speech-Language Pathologists, new and innovative ways of delivering therapy to children with speech sound disorders are needed. Computer-based speech therapy may be an effective and viable means of addressing service access issues for children with speech sound disorders. To evaluate the efficacy of computer-based speech therapy programs for children with speech sound disorders, studies reporting the efficacy of computer-based speech therapy programs were identified via a systematic, computerised database search. Key study characteristics, results, main findings and details of computer-based speech therapy programs were extracted. The methodological quality was evaluated using a structured critical appraisal tool. Fourteen studies were identified, and a total of 11 computer-based speech therapy programs were evaluated. The results showed that computer-based speech therapy is associated with positive clinical changes for some children with speech sound disorders. There is a need for collaborative research between computer engineers and clinicians, particularly during the design and development of computer-based speech therapy programs. Evaluation using rigorous experimental designs is required to understand the benefits of computer-based speech therapy. The reader will be able to (1) discuss how computer-based speech therapy has the potential to improve service access for children with speech sound disorders, (2) explain the ways in which computer-based speech therapy programs may enhance traditional tabletop therapy and (3) compare the features of computer-based speech therapy programs designed for different client populations. Copyright © 2017 Elsevier Inc. All rights reserved.
Williams, A. Lynn, Ed.; McLeod, Sharynne, Ed.; McCauley, Rebecca J., Ed.
With detailed discussion and invaluable video footage of 23 treatment interventions for speech sound disorders (SSDs) in children, this textbook and DVD set should be part of every speech-language pathologist's professional preparation. Focusing on children with functional or motor-based speech disorders from early childhood through the early…
Preston, Jonathan L; Irwin, Julia R; Turcios, Jacqueline
Children with speech sound disorders may perceive speech differently than children with typical speech development. The nature of these speech differences is reviewed with an emphasis on assessing phoneme-specific perception for speech sounds that are produced in error. Category goodness judgment, or the ability to judge accurate and inaccurate tokens of speech sounds, plays an important role in phonological development. The software Speech Assessment and Interactive Learning System, which has been effectively used to assess preschoolers' ability to perform goodness judgments, is explored for school-aged children with residual speech errors (RSEs). However, data suggest that this particular task may not be sensitive to perceptual differences in school-aged children. The need for the development of clinical tools for assessment of speech perception in school-aged children with RSE is highlighted, and clinical suggestions are provided.
Jadir Mauro Galvao
The theme of sustainability has not yet become an integral part of the theoretical repertoire that underlies our most everyday actions, although it often visits our thoughts and permeates many of our speeches. The major event of 2012, the Rio+20 conference, gathered gazes from all corners of the planet around this burning theme, yet we still move forward timidly. Although we have no very clear idea of what the term sustainability encompasses, it does not sound entirely strange: we associate it with things like ecology, the planet, waste emitted by factory smokestacks, deforestation, recycling and global warming. Our goal in this article, however, is less to clarify the term conceptually and more to observe how it appears in the speeches of that conference. When the competent authorities talk about sustainability, what do they relate it to? We intend to investigate, in the lines and between the lines of these speeches, the assumptions associated with the term. To this end, we analyze the speech of the People's Summit, the opening speech of President Dilma and the emblematic speech of the President of Uruguay, José Pepe Mujica.
Corbeil, Marieve; Trehub, Sandra E.; Peretz, Isabelle
Infants prefer speech to non-vocal sounds and to non-human vocalizations, and they prefer happy-sounding speech to neutral speech. They also exhibit an interest in singing, but there is little knowledge of their relative interest in speech and singing. The present study explored infants' attention to unfamiliar audio samples of speech and singing. In Experiment 1, infants 4–13 months of age were exposed to happy-sounding infant-directed speech vs. hummed lullabies by the same woman. They listened significantly longer to the speech, which had considerably greater acoustic variability and expressiveness, than to the lullabies. In Experiment 2, infants of comparable age who heard the lyrics of a Turkish children's song spoken vs. sung in a joyful/happy manner did not exhibit differential listening. Infants in Experiment 3 heard the happily sung lyrics of the Turkish children's song vs. a version that was spoken in an adult-directed or affectively neutral manner. They listened significantly longer to the sung version. Overall, happy voice quality rather than vocal mode (speech or singing) was the principal contributor to infant attention, regardless of age. PMID:23805119
Joseph, Sabine; Iverson, Paul; Manohar, Sanjay; Fox, Zoe; Scott, Sophie K; Husain, Masud
Memory for speech sounds is a key component of models of verbal working memory (WM). But how good is verbal WM? Most investigations assess this using binary report measures to derive a fixed number of items that can be stored. However, recent findings in visual WM have challenged such "quantized" views by employing measures of recall precision with an analogue response scale. WM for speech sounds might rely on both continuous and categorical storage mechanisms. Using a novel speech matching paradigm, we measured WM recall precision for phonemes. Vowel qualities were sampled from a formant space continuum. A probe vowel had to be adjusted to match the vowel quality of a target on a continuous, analogue response scale. Crucially, this provided an index of the variability of a memory representation around its true value and thus allowed us to estimate how memories were distorted from the original sounds. Memory load affected the quality of speech sound recall in two ways. First, there was a gradual decline in recall precision with increasing number of items, consistent with the view that WM representations of speech sounds become noisier with an increase in the number of items held in memory, just as for vision. Based on multidimensional scaling (MDS), the level of noise appeared to be reflected in distortions of the formant space. Second, as memory load increased, there was evidence of greater clustering of participants' responses around particular vowels. A mixture model captured both continuous and categorical responses, demonstrating a shift from continuous to categorical memory with increasing WM load. This suggests that direct acoustic storage can be used for single items, but when more items must be stored, categorical representations must be used.
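The shift from continuous to categorical memory described above can be illustrated with a toy mixture fit: responses are modeled as drawn either from a Gaussian around the true target (continuous trace) or from a Gaussian around the nearest vowel prototype (categorical recoding), and EM estimates the categorical weight. The formant-space units, prototype locations and noise widths below are assumptions for illustration, not the study's data or model:

```python
import numpy as np

def fit_continuous_categorical(targets, responses, prototypes,
                               sigma_c=0.05, sigma_k=0.05, n_iter=50):
    """EM estimate of the mixing weight of a categorical response component
    (Gaussian around the nearest prototype) against a continuous one
    (Gaussian around the true target)."""
    targets = np.asarray(targets)
    responses = np.asarray(responses)
    prototypes = np.asarray(prototypes)
    nearest = prototypes[np.argmin(
        np.abs(np.subtract.outer(targets, prototypes)), axis=1)]

    def gauss(x, mu, s):
        return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))

    w = 0.5  # initial weight of the categorical component
    for _ in range(n_iter):
        p_cont = (1.0 - w) * gauss(responses, targets, sigma_c)
        p_cat = w * gauss(responses, nearest, sigma_k)
        resp = p_cat / (p_cont + p_cat + 1e-300)  # E-step responsibilities
        w = resp.mean()                           # M-step weight update
    return w

# Toy demonstration: responses clustered on targets (low load) vs on
# hypothetical vowel prototypes (high load) in a normalized formant space.
rng = np.random.default_rng(0)
targets = rng.uniform(0, 1, 500)
prototypes = np.array([0.2, 0.8])
nearest = prototypes[np.argmin(np.abs(np.subtract.outer(targets, prototypes)), axis=1)]
w_low = fit_continuous_categorical(targets, targets + rng.normal(0, 0.05, 500), prototypes)
w_high = fit_continuous_categorical(targets, nearest + rng.normal(0, 0.05, 500), prototypes)
print(f"categorical weight: low load {w_low:.2f}, high load {w_high:.2f}")
```

Under this toy model, the estimated categorical weight rises when responses cluster on prototypes, which is the qualitative signature the study reports at higher memory loads.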
McClain, Matthew; Romanowski, Brian
Non-language speech sounds (NLSS) are sounds produced by humans that do not carry linguistic information. Examples of these sounds are coughs, clicks, breaths, and filled pauses such as "uh" and "um" in English. NLSS are prominent in conversational speech, but can be a significant source of errors in speech processing applications. Traditionally, these sounds are ignored by speech endpoint detection algorithms, where speech regions are identified in the audio signal prior to processing. The ability to filter NLSS as a pre-processing step can significantly enhance the performance of many speech processing applications, such as speaker identification, language identification, and automatic speech recognition. In order to be used in all such applications, NLSS detection must be performed without the use of language models that provide knowledge of the phonology and lexical structure of speech. This is especially relevant to situations where the languages used in the audio are not known a priori. We present the results of preliminary experiments using data from American and British English speakers, in which segments of audio are classified as language speech sounds (LSS) or NLSS using a set of acoustic features designed for language-agnostic NLSS detection and a hidden Markov model (HMM) to model speech generation. The results of these experiments indicate that the features and model used are capable of detecting certain types of NLSS, such as breaths and clicks, while detection of other types of NLSS, such as filled pauses, will require future research.
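The paper's exact features and HMM topology are not given in the abstract; a generic sketch of the approach is a two-state HMM (LSS vs. NLSS) with Gaussian emissions over a hypothetical one-dimensional acoustic feature, decoded with Viterbi so that sticky transitions smooth frame-level decisions into segments:

```python
import numpy as np

def viterbi_two_state(loglik, log_trans, log_init):
    """Most likely state path for a two-state HMM.
    loglik: (T, 2) per-frame log-likelihoods; log_trans: (2, 2); log_init: (2,)."""
    T = loglik.shape[0]
    delta = np.zeros((T, 2))
    back = np.zeros((T, 2), dtype=int)
    delta[0] = log_init + loglik[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans  # scores[from, to]
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + loglik[t]
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path

def log_gauss(x, mu, sigma=0.5):
    return -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)

# Hypothetical 1-D feature: LSS frames near +1.0, NLSS frames near -1.0,
# with a burst of NLSS (e.g. a breath) in the middle of an utterance.
feat = np.concatenate([np.full(30, 1.0), np.full(10, -1.0), np.full(30, 1.0)])
feat = feat + np.random.default_rng(1).normal(0, 0.3, feat.size)
loglik = np.stack([log_gauss(feat, 1.0), log_gauss(feat, -1.0)], axis=1)
log_trans = np.log([[0.95, 0.05], [0.05, 0.95]])  # sticky states smooth the path
path = viterbi_two_state(loglik, log_trans, np.log([0.5, 0.5]))
print(path)  # 0 = language speech sound, 1 = non-language speech sound
```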
Lee, Alice S.; Gibbon, Fiona E.
Background: Children with developmental speech sound disorders have difficulties in producing the speech sounds of their native language. These speech difficulties could be due to structural, sensory or neurophysiological causes (e.g. hearing impairment), but more often the cause of the problem is unknown. One treatment approach used by speech-language therapists/pathologists is non-speech oral motor treatment (NSOMT). NSOMTs are non-speech activities that aim to stimulate or improve speech pr...
If it is well known that knowledge facilitates higher cognitive functions, such as visual and auditory word recognition, little is known about the influence of knowledge on detection, particularly in the auditory modality. Our study tested the influence of phonological and lexical knowledge on auditory detection. Words, pseudowords and complex non-phonological sounds, energetically matched as closely as possible, were presented at a range of presentation levels from subthreshold to clearly audible. The participants performed a detection task (Experiments 1 and 2) that was followed by a two-alternative forced-choice recognition task in Experiment 2. The results of this second task in Experiment 2 suggest correct recognition of words in the absence of detection with a subjective threshold approach. In the detection task of both experiments, phonological stimuli (words and pseudowords) were better detected than non-phonological stimuli (complex sounds) presented close to the auditory threshold. This finding suggests an advantage of speech for signal detection. An additional advantage of words over pseudowords was observed in Experiment 2, suggesting that lexical knowledge could also improve auditory detection when listeners had to recognize the stimulus in a subsequent task. Two simulations of detection performance performed on the sound signals confirmed that the advantage of speech over non-speech processing could not be attributed to energetic differences in the stimuli.
Falk, Simone; Rathcke, Tamara; Dalla Bella, Simone
Repetition can boost memory and perception. However, repeating the same stimulus several times in immediate succession also induces intriguing perceptual transformations and illusions. Here, we investigate the Speech to Song Transformation (S2ST), a massed repetition effect in the auditory modality which crosses the boundaries between language and music. In the S2ST, a phrase repeated several times shifts to being heard as sung. To better understand this unique cross-domain transformation, we examined the perceptual determinants of the S2ST, in particular the role of acoustics. In two experiments, the effects of two pitch properties and three rhythmic properties on the probability and speed of occurrence of the transformation were examined. Results showed that both pitch and rhythmic properties are key features fostering the transformation. However, some properties proved to be more conducive to the S2ST than others. Stable tonal targets that allowed for the perception of a musical melody led to the S2ST more often and more quickly than scalar intervals. Recurring durational contrasts arising from segmental grouping that favored a metrical interpretation of the stimulus also facilitated the S2ST. This was, however, not the case for a regular beat structure within and across repetitions. In addition, individual perceptual abilities predicted the likelihood of the S2ST. Overall, the study demonstrated that repetition enables listeners to reinterpret specific prosodic features of spoken utterances in terms of musical structures. The findings underline a tight link between language and music, but they also reveal important differences in the communicative functions of prosodic structure in the two domains.
McLeod, Sharynne; Harrison, Linda J.; McAllister, Lindy; McCormack, Jane
Purpose: To undertake a community (nonclinical) study to describe the speech of preschool children who had been identified by parents/teachers as having difficulties "talking and making speech sounds" and compare the speech characteristics of those who had and had not accessed the services of a speech-language pathologist (SLP). Method:…
Roman, Nicoleta; Wang, Deliang; Brown, Guy J.
At a cocktail party, one can selectively attend to a single voice and filter out all the other acoustical interferences. How to simulate this perceptual ability remains a great challenge. This paper describes a novel, supervised learning approach to speech segregation, in which a target speech signal is separated from interfering sounds using spatial localization cues: interaural time differences (ITD) and interaural intensity differences (IID). Motivated by the auditory masking effect, the notion of an "ideal" time-frequency binary mask is suggested, which selects the target if it is stronger than the interference in a local time-frequency (T-F) unit. It is observed that within a narrow frequency band, modifications to the relative strength of the target source with respect to the interference trigger systematic changes for estimated ITD and IID. For a given spatial configuration, this interaction produces characteristic clustering in the binaural feature space. Consequently, pattern classification is performed in order to estimate ideal binary masks. A systematic evaluation in terms of signal-to-noise ratio as well as automatic speech recognition performance shows that the resulting system produces masks very close to ideal binary ones. A quantitative comparison shows that the model yields significant improvement in performance over an existing approach. Furthermore, under certain conditions the model produces large speech intelligibility improvements with normal listeners.
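The ideal time-frequency binary mask itself is simple to compute when, as in this supervised setting, the premixing target and interference signals are available: the mask is 1 wherever the target's power exceeds the interference's in a T-F unit. A numpy sketch (frame size, hop and the toy tone signals are illustrative assumptions):

```python
import numpy as np

def stft_mag2(x, frame=256, hop=128):
    """Power spectrogram via a Hann-windowed short-time Fourier transform."""
    win = np.hanning(frame)
    n = 1 + (len(x) - frame) // hop
    frames = np.stack([x[i * hop:i * hop + frame] * win for i in range(n)])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def ideal_binary_mask(target, interference, lc_db=0.0):
    """1 where the target exceeds the interference by lc_db in a T-F unit."""
    t = stft_mag2(target)
    i = stft_mag2(interference)
    return (10 * np.log10(t + 1e-12) - 10 * np.log10(i + 1e-12)) > lc_db

# Hypothetical mixture: a low-frequency target tone vs. a high-frequency
# interferer; the mask keeps the target's band and rejects the interferer's.
fs = 8000
t = np.arange(fs) / fs  # 1 second
target = np.sin(2 * np.pi * 300 * t)
interf = np.sin(2 * np.pi * 2000 * t)
mask = ideal_binary_mask(target, interf)
```

Applying `mask` to the mixture's spectrogram before resynthesis is what "close to ideal" estimated masks are evaluated against in the paper.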
Namasivayam, Aravind Kumar; Pukonen, Margit; Goshulak, Debra; Yu, Vickie Y; Kadis, Darren S; Kroll, Robert; Pang, Elizabeth W; De Nil, Luc F
The current study was undertaken to investigate the impact of speech motor issues on the speech intelligibility of children with moderate to severe speech sound disorders (SSD) within the context of the PROMPT intervention approach. The word-level Children's Speech Intelligibility Measure (CSIM), the sentence-level Beginner's Intelligibility Test (BIT) and tests of speech motor control and articulation proficiency were administered to 12 children (3;11 to 6;7 years) before and after PROMPT therapy. PROMPT treatment was provided for 45 min twice a week for 8 weeks. Twenty-four naïve adult listeners aged 22-46 years judged the intelligibility of the words and sentences. For CSIM, each time a recorded word was played, the listeners were asked to look at a list of 12 words (multiple-choice format) and circle the word they heard; for BIT sentences, the listeners were asked to write down everything they heard. Words correctly circled (CSIM) or transcribed (BIT) were averaged across three naïve judges to calculate percentage speech intelligibility. Speech intelligibility at both the word and sentence level was significantly correlated with speech motor control, but not with articulatory proficiency. Further, the severity of speech motor planning and sequencing issues may potentially be a limiting factor in connected speech intelligibility, which highlights the need to target these issues early and directly in treatment. The reader will be able to: (1) outline the advantages and disadvantages of using word- and sentence-level speech intelligibility tests; (2) describe the impact of speech motor control and articulatory proficiency on speech intelligibility; and (3) describe how speech motor control and speech intelligibility data may provide critical information to aid treatment planning. Copyright © 2013 Elsevier Inc. All rights reserved.
Lewis, Barbara A.; Avrich, Allison A.; Freebairn, Lisa A.; Taylor, H. Gerry; Iyengar, Sudha K.; Stein, Catherine M.
Purpose: The present study examined associations of 5 endophenotypes (i.e., measurable skills that are closely associated with speech sound disorders and are useful in detecting genetic influences on speech sound production), oral motor skills, phonological memory, phonological awareness, vocabulary, and speeded naming, with 3 clinical criteria…
Preston, Jonathan L.; Ramsdell, Heather L.; Oller, D. Kimbrough; Edwards, Mary Louise; Tobin, Stephen J.
Purpose: To develop a system for numerically quantifying a speaker's phonetic accuracy through transcription-based measures. With a focus on normal and disordered speech in children, the authors describe a system for differentially weighting speech sound errors on the basis of various levels of phonetic accuracy using a Weighted Speech Sound…
Lewis, Barbara A.; Freebairn, Lisa A.; Hansen, Amy J.; Stein, Catherine M.; Shriberg, Lawrence D.; Iyengar, Sudha K.; Taylor, H. Gerry
The goal of this study was to classify children with speech sound disorders (SSD) empirically, using factor analytic techniques. Participants were 3-7-year olds enrolled in speech/language therapy (N=185). Factor analysis of an extensive battery of speech and language measures provided support for two distinct factors, representing the skill…
Myers-Schulz, Blake; Pujara, Maia; Wolf, Richard C; Koenigs, Michael
During much of the past century, it was widely believed that phonemes (the human speech sounds that constitute words) have no inherent semantic meaning, and that the relationship between a combination of phonemes (a word) and its referent is simply arbitrary. Although recent work has challenged this picture by revealing psychological associations between certain phonemes and particular semantic contents, the precise mechanisms underlying these associations have not been fully elucidated. Here we provide novel evidence that certain phonemes have an inherent, non-arbitrary emotional quality. Moreover, we show that the perceived emotional valence of certain phoneme combinations depends on a specific acoustic feature: the dynamic shift within the phonemes' first two frequency components. These data suggest a phoneme-relevant acoustic property influencing the communication of emotion in humans, and provide further evidence against previously held assumptions regarding the structure of human language. This finding has potential applications for a variety of social, educational, clinical, and marketing contexts.
Shimokura, Ryota; Matsui, Toshie; Takaki, Yuya; Nishimura, Tadashi; Yamanaka, Toshiaki; Hosoi, Hiroshi
The purpose of this study was to explore the differences in speech intelligibility in short-reverberant sound fields using deteriorated monosyllables. Generated using digital signal processing, deteriorated monosyllables can lack the redundancy of words, and thus may emphasize differences in sound fields in terms of speech clarity. Ten participants without any hearing disorders identified 100 monosyllables convolved with eight impulse responses measured in different short-reverberant sound fields (speech transmission index > 0.6 and reverberation time < 1 s), and we compared speech recognition scores between normal and deteriorated monosyllables. Deterioration was produced using low-pass filtering (cut-off frequency = 1600 Hz). Speech recognition scores associated with the deteriorated monosyllables were lower than those for the normal monosyllables. In addition, scores were more varied among the different sound fields, although this result was not significant according to an analysis of variance. In contrast, the variation among sound fields was significant for the normal monosyllables. When comparing the intelligibility scores to the acoustic parameters calculated from eight impulse responses, the speech recognition scores were the highest when the reverberant/direct sound energy ratio (R/D) was balanced. Although our deterioration procedure obscured differences in intelligibility score among the different sound fields, we have established that the R/D is a useful parameter for evaluating speech intelligibility in short-reverberant sound fields. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
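The reverberant/direct energy ratio can be estimated directly from a measured impulse response. A sketch under the common assumption that the direct sound is the energy in a short window around the main peak and everything later is reverberant (the paper's exact window may differ):

```python
import numpy as np

def reverberant_direct_ratio(ir, fs, direct_ms=5.0):
    """Reverberant-to-direct energy ratio (dB) of a room impulse response.
    Direct sound: energy within direct_ms after the main peak; everything
    later counts as reverberant energy."""
    peak = int(np.argmax(np.abs(ir)))
    split = peak + int(fs * direct_ms / 1000)
    direct = np.sum(ir[:split] ** 2)
    reverb = np.sum(ir[split:] ** 2)
    return 10 * np.log10(reverb / direct)

# Hypothetical impulse response: a unit direct peak followed by an
# exponentially decaying noise-like reverberant tail.
fs = 16000
t = np.arange(int(0.5 * fs)) / fs
rng = np.random.default_rng(0)
ir = rng.normal(0, 1, t.size) * np.exp(-t / 0.1) * 0.05
ir[0] = 1.0  # direct sound
print(f"R/D = {reverberant_direct_ratio(ir, fs):.1f} dB")
```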
Preston, Jonathan; Edwards, Mary Louise
Purpose: Some children with speech sound disorders (SSD) have difficulty with literacy-related skills, particularly phonological awareness (PA). This study investigates the PA skills of preschoolers with SSD by using a regression model to evaluate the degree to which PA can be concurrently predicted by types of speech sound errors. Method:…
Leech, Robert; Holt, Lori L.; Devlin, Joseph T.; Dick, Frederic
Regions of the human temporal lobe show greater activation for speech than for other sounds. These differences may reflect intrinsically specialized domain-specific adaptations for processing speech, or they may be driven by the significant expertise we have in listening to the speech signal. To test the expertise hypothesis, we used a video-game-based paradigm that tacitly trained listeners to categorize acoustically complex, artificial non-linguistic sounds. Before and after training, we us...
Pomaville, Frances M; Kladopoulos, Chris N
In this study, the authors examined the treatment efficacy of a behavioral speech therapy protocol for adult cochlear implant recipients. The authors used a multiple-baseline, across-behaviors and -participants design to examine the effectiveness of a therapy program based on behavioral principles and methods to improve the production of target speech sounds in 3 adults with cochlear implants. The authors included probe items in a baseline protocol to assess generalization of target speech sounds to untrained exemplars. Pretest and posttest scores from the Arizona Articulation Proficiency Scale, Third Revision (Arizona-3; Fudala, 2000) and measurement of speech errors during spontaneous speech were compared, providing additional measures of target behavior generalization. The results of this study provided preliminary evidence supporting the overall effectiveness and efficiency of a behavioral speech therapy program in increasing percent correct speech sound production in adult cochlear implant recipients. The generalization of newly trained speech skills to untrained words and to spontaneous speech was demonstrated. These preliminary findings support the application of behavioral speech therapy techniques for training speech sound production in adults with cochlear implants. Implications for future research and the development of aural rehabilitation programs for adult cochlear implant recipients are discussed.
Sørensen, Karsten Vandborg
We propose time-frequency domain methods for noise estimation and speech enhancement. A speech presence detection method is used to find connected time-frequency regions of speech presence. These regions are used by a noise estimation method, and both the speech presence decisions and the noise estimate are used in the speech enhancement method. Different attenuation rules are applied to regions with and without speech presence to achieve enhanced speech with natural-sounding attenuated background noise. The proposed speech enhancement method has a computational complexity low enough to make it feasible for application in hearing aids. An informal listening test shows that the proposed speech enhancement method has significantly higher mean opinion scores than minimum mean-square error log-spectral amplitude (MMSE-LSA) and decision-directed MMSE-LSA.
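The core idea of presence-dependent attenuation can be sketched in a few lines. This is a simplified stand-in (noise averaged over speech-absent frames, a Wiener-style gain in speech regions, and a fixed gain floor elsewhere), not the authors' exact attenuation rules or MMSE-LSA estimator:

```python
import numpy as np

def enhance(power, speech_present, gain_floor_db=-12.0):
    """Per-T-F amplitude gains with different rules for speech-present and
    speech-absent regions.
    power: (T, F) noisy power spectrogram; speech_present: (T,) booleans."""
    # Noise estimate: average power over frames judged speech-absent.
    noise = power[~speech_present].mean(axis=0)
    floor = 10 ** (gain_floor_db / 20)
    # Speech-absent regions get a fixed attenuation, leaving a
    # natural-sounding residual background instead of silence.
    gain = np.full_like(power, floor)
    # Speech-present regions get an SNR-dependent (Wiener-style) gain.
    snr = np.maximum(power[speech_present] / noise - 1.0, 0.0)
    gain[speech_present] = np.maximum(snr / (snr + 1.0), floor)
    return gain

# Toy example: unit-power noise everywhere, one strong speech component.
T, F = 20, 8
power = np.ones((T, F))
speech_present = np.zeros(T, dtype=bool)
speech_present[5:15] = True
power[5:15, 2] = 10.0  # speech energy in bin 2 during speech frames
g = enhance(power, speech_present)
```

Multiplying the noisy short-time spectrum by `g` before resynthesis yields the enhanced signal; the gain floor is what keeps the attenuated background from sounding unnatural.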
Priester, Gertrude H; Goorhuis-Brouwer, Sieneke M
The primary aim of our study is to investigate if there is an ordering in the speech sound development of children aged 3-6, similar to the ordering in general language development. The speech sound development of 1035 children was tested with a revised version of the Logo-Articulation Assessment. The data were analyzed with the Mokken Scale Program (MSP) in order to construct scales with satisfactory scalability (H-coefficient) and sufficient reliability (rho). The majority of children over 4.3 years of age turned out to have mastered most speech sounds. An ordering was only found in the youngest age group (3.8-4.3 years of age), for the sounds of /r/ in initial and final position and /s/ in initial position. This resulted in a set of scales. The scales developed for /r/ (in initial and final position) and /s/ were moderately scalable (H>0.43) and reliable (rho>0.83), and independent of gender. Moreover, we found variation in the judgment of speech sound development, which may have been due to where exactly the examiner was positioned during the assessment procedure: in front of the child, or sitting beside the child. We could not detect an ordering for all speech sounds. We only found an ordering for /r/ in initial and final position and /s/ in initial position. In the Mokken analysis we conducted, these scales turned out to be moderately strong and reliable. Our research also underlines that speech sound development is judged not only in an auditory sense, but judgment also depends on the visual interpretation of the listener. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Oliveira, Carla; Lousada, Marisa; Jesus, Luis M. T.
Children with speech sound disorders (SSD) represent a large number of speech and language therapists' caseloads. The intervention with children who have SSD can involve different therapy approaches, and these may be articulatory or phonologically based. Some international studies reveal a widespread application of articulatory based approaches in…
Klein, Harriet B.; Liu-Shea, May
Purpose: This study was designed to identify and describe between-word simplification patterns in the continuous speech of children with speech sound disorders. It was hypothesized that word combinations would reveal phonological changes that were unobserved with single words, possibly accounting for discrepancies between the intelligibility of…
Katz, William F; Mehta, Sonya
Pronunciation training studies have yielded important information concerning the processing of audiovisual (AV) information. Second language (L2) learners show increased reliance on bottom-up, multimodal input for speech perception (compared to monolingual individuals). However, little is known about the role of viewing one's own speech articulation processes during speech training. The current study investigated whether real-time, visual feedback for tongue movement can improve a speaker's learning of non-native speech sounds. An interactive 3D tongue visualization system based on electromagnetic articulography (EMA) was used in a speech training experiment. Native speakers of American English produced a novel speech sound (/ɖ/; a voiced, coronal, palatal stop) before, during, and after trials in which they viewed their own speech movements using the 3D model. Talkers' productions were evaluated using kinematic (tongue-tip spatial positioning) and acoustic (burst spectra) measures. The results indicated a rapid gain in accuracy associated with visual feedback training. The findings are discussed with respect to neural models for multimodal speech processing.
Plant, G L
Four subjects fitted with single-channel vibrotactile aids and provided with training in their use took part in a testing programme aimed at assessing their aided and unaided lipreading performance, their ability to detect segmental and suprasegmental features of speech, and the discrimination of common environmental sounds. The results showed that the vibrotactile aid provided very useful information as to speech and non-speech stimuli with the subjects performing best on those tasks where time/intensity cues provided sufficient information to enable identification. The implications of the study are discussed and a comparison made with those results reported for subjects using cochlear implants.
van Atteveldt, Nienke; Formisano, Elia; Goebel, Rainer; Blomert, Leo
Most people acquire literacy skills with remarkable ease, even though the human brain is not evolutionarily adapted to this relatively new cultural phenomenon. Associations between letters and speech sounds form the basis of reading in alphabetic scripts. We investigated the functional neuroanatomy
Lewis, Barbara A.; Freebairn, Lisa A.; Taylor, H. Gerry
Tests of phonology, semantics, and syntax were administered to 52 preschool children (ages 4-6) with speech sound disorders. Language impairment at school-age (ages 8-11) related to poor performance on preschool tests of syntax and nonsense word repetition, while reading impairment was predicted by poor performance in all preschool test domains.…
King, Amie M.; Hengst, Julie A.; DeThorne, Laura S.
Purpose: This study introduces an integrated multimodal intervention (IMI) and examines its effectiveness for the treatment of persistent and severe speech sound disorders (SSD) in young children. The IMI is an activity-based intervention that focuses simultaneously on increasing the "quantity" of a child's meaningful productions of target words…
Priester, G. H.; Post, W. J.; Goorhuis-Brouwer, S. M.
Objective: Comparison of normative data in English and Dutch speech sound development in young children. Research questions were: Which normative data are present concerning speech sound development in children between two and six years of age? In which way are the speech sounds examined? What are
Ito, Takayuki; Johns, Alexis R.; Ostry, David J.
Purpose: Somatosensory information associated with speech articulatory movements affects the perception of speech sounds and vice versa, suggesting an intimate linkage between speech production and perception systems. However, it is unclear which cortical processes are involved in the interaction between speech sounds and orofacial somatosensory…
Lee, Alice S-Y; Gibbon, Fiona E
Children with developmental speech sound disorders have difficulties in producing the speech sounds of their native language. These speech difficulties could be due to structural, sensory or neurophysiological causes (e.g. hearing impairment), but more often the cause of the problem is unknown. One treatment approach used by speech-language therapists/pathologists is non-speech oral motor treatment (NSOMT). NSOMTs are non-speech activities that aim to stimulate or improve speech production and treat specific speech errors. Examples include exercises such as smiling, pursing, blowing into horns, blowing bubbles, and lip massage, used to target lip mobility for the production of speech sounds involving the lips, such as /p/, /b/, and /m/. The efficacy of this treatment approach is controversial, and evidence regarding the efficacy of NSOMTs needs to be examined. To assess the efficacy of non-speech oral motor treatment (NSOMT) in treating children with developmental speech sound disorders who have speech errors. In April 2014 we searched the Cochrane Central Register of Controlled Trials (CENTRAL), Ovid MEDLINE (R) and Ovid MEDLINE In-Process & Other Non-Indexed Citations, EMBASE, Education Resources Information Center (ERIC), PsycINFO and 11 other databases. We also searched five trial and research registers, checked the reference lists of relevant titles identified by the search and contacted researchers to identify other possible published and unpublished studies. Randomised and quasi-randomised controlled trials that compared (1) NSOMT versus placebo or control; and (2) NSOMT as adjunctive treatment to speech intervention versus speech intervention alone, for children aged three to 16 years with developmental speech sound disorders, as judged by a speech and language therapist. Individuals with an intellectual disability (e.g. Down syndrome) or a physical disability were not excluded. The Trials Search Co-ordinator of the Cochrane Developmental, Psychosocial and
Huang, Norden E.
A new method for analyzing nonlinear and nonstationary data has been developed, and its natural applications are to speech and sound signals. The key part of the method is the Empirical Mode Decomposition (EMD), with which any complicated data set can be decomposed into a finite and often small number of Intrinsic Mode Functions (IMF). An IMF is defined as any function having the same number of zero-crossings and extrema, and also having symmetric envelopes defined by the local maxima and minima respectively. An IMF also admits a well-behaved Hilbert transform. This decomposition method is adaptive and therefore highly efficient. Since the decomposition is based on the local characteristic time scale of the data, it is applicable to nonlinear and nonstationary processes. With the Hilbert transform, the Intrinsic Mode Functions yield instantaneous frequencies as functions of time, which give sharp identification of embedded structures. This invention can be used to process all acoustic signals; specifically, it can process speech signals for speech synthesis, speaker identification and verification, speech recognition, and sound-signal enhancement and filtering. Additionally, the acoustical signals from machinery are essentially the way the machines are talking to us: the acoustical signals from machines, whether carried as sound through the air or as vibration on the machines themselves, can tell us their operating conditions. Thus, we can use the acoustic signal to diagnose the problems of machines.
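Once EMD has produced the IMFs, the Hilbert step that yields instantaneous frequency is compact. A sketch of that final step, with a pure tone standing in for a single well-behaved IMF (the sifting procedure itself is omitted):

```python
import numpy as np
from scipy.signal import hilbert

def instantaneous_frequency(imf, fs):
    """Instantaneous frequency (Hz) of one IMF via the analytic signal:
    f(t) = (1 / 2*pi) * d(phase)/dt."""
    phase = np.unwrap(np.angle(hilbert(imf)))
    return np.diff(phase) * fs / (2 * np.pi)

# A 50 Hz tone stands in for one IMF; its instantaneous frequency
# should sit at 50 Hz away from the signal edges.
fs = 1000.0
t = np.arange(0, 1, 1 / fs)
imf = np.sin(2 * np.pi * 50 * t)
f_inst = instantaneous_frequency(imf, fs)
```

The median is a robust summary here because the Hilbert transform distorts the phase estimate near the signal edges.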
Firszt, Jill B; Ulmer, John L; Gaggl, Wolfgang
Various methods in auditory neuroscience have been used to gain knowledge about the structure and function of the human auditory cortical system. Regardless of method, hemispheric differences are evident in the normal processing of speech sounds. This review article, augmented by the authors' own work, provides evidence that asymmetries exist in both cortical and subcortical structures of the human auditory system. Asymmetries are affected by stimulus type, for example, hemispheric activation patterns have been shown to change from right to left cortex as stimuli change from speech to nonspeech. In addition, the presence of noise has differential effects on the contribution of the two hemispheres. Modifications of typical asymmetric cortical patterns occur when pathology is present, as in hearing loss or tinnitus. We show that in response to speech sounds, individuals with unilateral hearing loss lose the normal asymmetric pattern due to both a decrease in contralateral hemispheric activity and an increase in the ipsilateral hemisphere. These studies demonstrate the utility of modern neuroimaging techniques in functional investigations of the human auditory system. Neuroimaging techniques may provide additional insight as to how the cortical auditory pathways change with experience, including sound deprivation (e.g., hearing loss) and sound experience (e.g., training). Such investigations may explain why some populations appear to be more vulnerable to changes in hemispheric symmetry such as children with learning problems and the elderly. Copyright 2006 Wiley-Liss, Inc.
…music information retrieval, sound analysis synthesis and perception, and speech processing of Indian languages. The Indian focus provided many interesting topics related to the Raga, from a music theory point of view to the instruments and the specific ornamentation of Indian classical singing. Another particular…, and the Department of Architecture, Design and Media Technology (ad:mt), University of Aalborg, Esbjerg, Denmark, and has taken place in France, Italy, Spain, and Denmark. Historically, CMMR offers a cross-disciplinary overview of current music information retrieval and sound modeling activities and related topics…
Understanding speech is based on neural representations of individual speech sounds. In humans, such representations are capable of supporting an automatic and memory-based mechanism for auditory change detection, as reflected by the mismatch negativity of event-related potentials. There are also findings of neural representations of speech sounds in animals, but it is not known whether these representations can support the change detection mechanism analogous to that underlying the mismatch negativity in humans. To this end, we presented synthesized spoken syllables to urethane-anesthetized rats while local field potentials were epidurally recorded above their primary auditory cortex. In an oddball condition, a deviant stimulus /ga/ or /ba/ (probability 1:12 for each) was rarely and randomly interspersed between frequent presentations of the standard stimulus /da/ (probability 10:12). In an equiprobable condition, 12 syllables, including /da/, /ga/, and /ba/, were presented in a random order (probability 1:12 for each). We found evoked responses of higher amplitude to the deviant /ba/, albeit not to /ga/, relative to the standard /da/ in the oddball condition. Furthermore, the responses to /ba/ were higher in amplitude in the oddball condition than in the equiprobable condition. The findings suggest that the anaesthetized rat's brain can form representations of human speech sounds, and that these representations can support the memory-based change detection mechanism analogous to that underlying the mismatch negativity in humans. Our findings show a striking parallel in speech processing between humans and rodents and may thus pave the way for feasible animal models of memory-based change detection.
Chen, Zhaocong; Wong, Francis C K; Jones, Jeffery A; Li, Weifeng; Liu, Peng; Chen, Xi; Liu, Hanjun
Speech perception and production are intimately linked. There is evidence that speech motor learning results in changes to auditory processing of speech. Whether speech motor control benefits from perceptual learning in speech, however, remains unclear. This event-related potential study investigated whether speech-sound learning can modulate the processing of feedback errors during vocal pitch regulation. Mandarin speakers were trained to perceive five Thai lexical tones while learning to associate pictures with spoken words over 5 days. Before and after training, participants produced sustained vowel sounds while they heard their vocal pitch feedback unexpectedly perturbed. As compared to the pre-training session, the magnitude of vocal compensation significantly decreased for the control group, but remained consistent for the trained group at the post-training session. However, the trained group had smaller and faster N1 responses to pitch perturbations and exhibited enhanced P2 responses that correlated significantly with their learning performance. These findings indicate that the cortical processing of vocal pitch regulation can be shaped by learning new speech-sound associations, suggesting that perceptual learning in speech can produce transfer effects that facilitate the neural mechanisms underlying the online monitoring of auditory feedback during vocal production.
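Compensation magnitudes in such pitch-perturbation paradigms are conventionally expressed in cents, a logarithmic ratio of produced F0 to a reference. A minimal sketch of that conversion (the frequency values are illustrative, not from the study):

```python
import math

def cents(f, f_ref):
    """Pitch deviation of frequency f from f_ref in cents;
    100 cents = one semitone, 1200 cents = one octave."""
    return 1200.0 * math.log2(f / f_ref)

# e.g. a talker compensating downward against an upward-shifted
# feedback signal, relative to a 220 Hz baseline
shift = cents(212.8, 220.0)   # negative: produced F0 fell below baseline
```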
Ellis, Daniel P. W.
Human listeners are very good at all kinds of sound detection and identification tasks, from understanding heavily accented speech to noticing a ringing phone underneath music playing at full blast. Efforts to duplicate these abilities on computer have been particularly intense in the area of speech recognition, and it is instructive to review which approaches have proved most powerful, and which major problems still remain. The features and models developed for speech have found applications in other audio recognition tasks, including musical signal analysis, and the problems of analyzing the general ``ambient'' audio that might be encountered by an auditorily endowed robot. This talk will briefly review statistical pattern recognition for audio signals, giving examples in several of these domains. Particular emphasis will be given to common aspects and lessons learned.
…within the related fields of interest. Frontiers of Research in Speech and Music (FRSM) has been organized in different parts of India every year since 1991. Previous conferences were held at ITC-SRA Kolkata, NPL New Delhi, BHU Varanasi, IIT Kanpur, Lucknow University, AIISH Mysore, IITM Gwalior, Utkal…, panel discussions, posters, and cultural events. We are pleased to announce that in light of the location in India there was a special focus on Indian speech and music. The melting pot of the FRSM and CMMR events gave rise to many interesting meetings with a focus on the field from different cultural…
Masso, Sarah; Baker, Elise; McLeod, Sharynne; Wang, Cen
Purpose: The aim of this study was to determine if polysyllable accuracy in preschoolers with speech sound disorders (SSD) was related to known predictors of later literacy development: phonological processing, receptive vocabulary, and print knowledge. Polysyllables--words of three or more syllables--are important to consider because unlike…
This article describes a school-based telehealth service delivery model and reports outcomes made by school-age students with speech sound disorders in a rural Ohio school district. Speech therapy using computer-based speech sound intervention materials was provided either by live interactive videoconferencing (telehealth) or by conventional side-by-side intervention. Progress was measured using pre- and post-intervention scores on the Goldman-Fristoe Test of Articulation-2 (Goldman & Fristoe, 2002). Students in both service delivery models made significant improvements in speech sound production, with students in the telehealth condition demonstrating greater mastery of their Individual Education Plan (IEP) goals. Live interactive videoconferencing thus appears to be a viable method for delivering intervention for speech sound disorders to children in a rural, public school setting. Keywords: Telehealth, telerehabilitation, videoconferencing, speech sound disorder, speech therapy, speech-language pathology; E-Helper
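The pre/post comparison described above is a classic paired design. A hedged sketch of how such within-student gains might be tested (the scores are fabricated for illustration, not the study's data):

```python
import numpy as np
from scipy import stats

# Illustrative pre/post articulation standard scores for eight students
pre  = np.array([70, 72, 68, 75, 71, 69, 74, 73])
post = np.array([76, 78, 75, 81, 77, 74, 80, 79])

# Paired t-test on within-student change; a significant positive t
# indicates improvement from pre- to post-intervention.
t_stat, p_value = stats.ttest_rel(post, pre)
```

With real data one would also report an effect size (e.g. the mean gain with its confidence interval), since significance alone says little about clinical relevance.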
Froyen, Dries; Van Atteveldt, Nienke; Bonte, Milene L; Blomert, Leo
Recently brain imaging evidence indicated that letter/speech-sound integration, necessary for establishing fluent reading, takes place in auditory association areas and that the integration is influenced by stimulus onset asynchrony (SOA) between the letter and the speech-sound. In the present
Macrae, Toby; Tyler, Ann A.
Purpose: The authors compared preschool children with co-occurring speech sound disorder (SSD) and language impairment (LI) to children with SSD only in their numbers and types of speech sound errors. Method: In this post hoc quasi-experimental study, independent samples t tests were used to compare the groups in the standard score from different…
Johnson, Erin Phinney; Pennington, Bruce F; Lowenstein, Joanna H; Nittrouer, Susan
Children with speech sound disorder (SSD) and reading disability (RD) have poor phonological awareness, a problem believed to arise largely from deficits in processing the sensory information in speech, specifically individual acoustic cues. However, such cues are details of acoustic structure. Recent theories suggest that listeners also need to be able to integrate those details to perceive linguistically relevant form. This study examined abilities of children with SSD, RD, and SSD+RD not only to process acoustic cues but also to recover linguistically relevant form from the speech signal. Ten- to 11-year-olds with SSD (n=17), RD (n=16), SSD+RD (n=17), and Controls (n=16) were tested to examine their sensitivity to (1) voice onset times (VOT); (2) spectral structure in fricative-vowel syllables; and (3) vocoded sentences. Children in all groups performed similarly with VOT stimuli, but children with disorders showed delays on other tasks, although the specifics of their performance varied. Children with poor phonemic awareness not only lack sensitivity to acoustic details, but are also less able to recover linguistically relevant forms. This is contrary to one of the main current theories of the relation between spoken and written language development. Readers will be able to (1) understand the role speech perception plays in phonological awareness, (2) distinguish between segmental and global structure analysis of speech perception, (3) describe differences and similarities in speech perception among children with speech sound disorder and/or reading disability, and (4) recognize the importance of broadening clinical interventions to focus on recognizing structure at all levels of speech analysis. Copyright © 2011 Elsevier Inc. All rights reserved.
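Sensitivity in discrimination tasks like these is often summarized with d′ from signal detection theory, which separates perceptual sensitivity from response bias. A minimal sketch (the hit and false-alarm rates are invented for illustration):

```python
from scipy.stats import norm

def d_prime(hit_rate, fa_rate):
    """Signal-detection sensitivity: d' = z(hits) - z(false alarms)."""
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

sensitivity = d_prime(0.90, 0.10)   # symmetric case, moderate sensitivity
```

In practice, rates of exactly 0 or 1 must be adjusted (e.g. with a log-linear correction) before taking z-scores, since the inverse normal is unbounded at those values.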
Engineer, Crystal T; Rahebi, Kimiya C; Borland, Michael S; Buell, Elizabeth P; Centanni, Tracy M; Fink, Melyssa K; Im, Kwok W; Wilson, Linda G; Kilgard, Michael P
Individuals with Rett syndrome have greatly impaired speech and language abilities. Auditory brainstem responses to sounds are normal, but cortical responses are highly abnormal. In this study, we used the novel rat Mecp2 knockout model of Rett syndrome to document the neural and behavioral processing of speech sounds. We hypothesized that both speech discrimination ability and the neural response to speech sounds would be impaired in Mecp2 rats. We expected that extensive speech training would improve speech discrimination ability and the cortical response to speech sounds. Our results reveal that speech responses across all four auditory cortex fields of Mecp2 rats were hyperexcitable, responded slower, and were less able to follow rapidly presented sounds. While Mecp2 rats could accurately perform consonant and vowel discrimination tasks in quiet, they were significantly impaired at speech sound discrimination in background noise. Extensive speech training improved discrimination ability. Training shifted cortical responses in both Mecp2 and control rats to favor the onset of speech sounds. While training increased the response to low frequency sounds in control rats, the opposite occurred in Mecp2 rats. Although neural coding and plasticity are abnormal in the rat model of Rett syndrome, extensive therapy appears to be effective. These findings may help to explain some aspects of communication deficits in Rett syndrome and suggest that extensive rehabilitation therapy might prove beneficial. Copyright © 2015 Elsevier Inc. All rights reserved.
Rogers, Jack C; Möttönen, Riikka; Boyles, Rowan; Watkins, Kate E
Perceiving speech engages parts of the motor system involved in speech production. The role of the motor cortex in speech perception has been demonstrated using low-frequency repetitive transcranial magnetic stimulation (rTMS) to suppress motor excitability in the lip representation and disrupt discrimination of lip-articulated speech sounds (Möttönen and Watkins, 2009). Another form of rTMS, continuous theta-burst stimulation (cTBS), can produce longer-lasting disruptive effects following a brief train of stimulation. We investigated the effects of cTBS on motor excitability and discrimination of speech and non-speech sounds. cTBS was applied for 40 s over either the hand or the lip representation of motor cortex. Motor-evoked potentials recorded from the lip and hand muscles in response to single pulses of TMS revealed no measurable change in motor excitability due to cTBS. This failure to replicate previous findings may reflect the unreliability of measurements of motor excitability related to inter-individual variability. We also measured the effects of cTBS on a listener's ability to discriminate: (1) lip-articulated speech sounds from sounds not articulated by the lips ("ba" vs. "da"); (2) two speech sounds not articulated by the lips ("ga" vs. "da"); and (3) non-speech sounds produced by the hands ("claps" vs. "clicks"). Discrimination of lip-articulated speech sounds was impaired between 20 and 35 min after cTBS over the lip motor representation. Specifically, discrimination of across-category ba-da sounds presented with an 800-ms inter-stimulus interval was reduced to chance level performance. This effect was absent for speech sounds that do not require the lips for articulation and non-speech sounds. Stimulation over the hand motor representation did not affect discrimination of speech or non-speech sounds. These findings show that stimulation of the lip motor representation disrupts discrimination of speech sounds in an articulatory feature
Terband, Hayo; Maassen, Bernardus; Maas, Edwin; van Lieshout, Pascal; Maassen, Ben; Terband, Hayo
The classification and differentiation of pediatric speech sound disorders (SSD) is one of the main questions in the field of speech and language pathology. Terms for classifying childhood SSD and motor speech disorders (MSD) refer to speech production processes, and a variety of methods of
Pihko, Elina; Kujala, Teija; Mickos, Annika; Antell, Henrik; Alku, Paavo; Byring, Roger; Korkman, Marit
Our objective was to study how well auditory evoked magnetic fields (EFs) reflect the behavioural discrimination of speech sounds in preschool children, and whether they reveal the same information as simultaneously recorded evoked potentials (EPs). EFs and EPs were recorded in 11 preschool children (mean age 6 years 9 months) using an oddball paradigm with two sets of speech stimuli, each consisting of one standard and two deviants. After the brain activity recording, children were tested on behavioural discrimination of the same stimuli presented in pairs. A mismatch negativity (MMN), calculated from difference curves, and its magnetic counterpart (MMNm), measured from the original responses, appeared only for those deviants that were behaviourally easiest to discriminate from the standards. In addition, EFs revealed significant differences between the locations of activation depending on the hemisphere and stimulus properties. EFs, in addition to reflecting sound-discrimination accuracy in a similar manner to EPs, also reflected the spatial differences in activation of the temporal lobes. These results suggest that both EPs and EFs are feasible for investigating the neural basis of sound discrimination in young children. Recording EFs, with their high spatial resolution, reveals information on the location of the activated neural sources.
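The MMN in such paradigms is read off the deviant-minus-standard difference wave. A hedged sketch of that computation (the latency window and the synthetic waveforms are illustrative assumptions, not the study's parameters):

```python
import numpy as np

def mmn_amplitude(standard, deviant, times, window=(0.10, 0.25)):
    """Simple MMN estimate: the most negative value of the
    deviant-minus-standard difference wave inside a latency
    window (times and window in seconds)."""
    diff = deviant - standard
    mask = (times >= window[0]) & (times <= window[1])
    return diff[mask].min()

# Synthetic averaged responses: the deviant carries a negative
# deflection (Gaussian, peak -2.5 uV) centered at 150 ms
times = np.linspace(0.0, 0.5, 501)
standard = np.zeros_like(times)
deviant = -2.5 * np.exp(-((times - 0.15) ** 2) / (2 * 0.02 ** 2))
amplitude = mmn_amplitude(standard, deviant, times)
```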
Collins, M J; Hurtig, R R
The usefulness of tactile devices as aids to lipreading has been established. However, maximum usefulness in reducing the ambiguity of lipreading cues and/or use of tactile devices as a substitute for audition may be dependent on phonemic recognition via tactile signals alone. In the present study, a categorical perception paradigm was used to evaluate tactile perception of speech sounds in comparison to auditory perception. The results show that speech signals delivered by tactile stimulation can be categorically perceived on a voice-onset time (VOT) continuum. The boundary for the voiced-voiceless distinction falls at longer VOTs for tactile than for auditory perception. It is concluded that the procedure is useful for determining characteristics of tactile perception and for prosthesis evaluation.
Möttönen, Riikka; van de Ven, Gido M; Watkins, Kate E
The earliest stages of cortical processing of speech sounds take place in the auditory cortex. Transcranial magnetic stimulation (TMS) studies have provided evidence that the human articulatory motor cortex contributes also to speech processing. For example, stimulation of the motor lip representation influences specifically discrimination of lip-articulated speech sounds. However, the timing of the neural mechanisms underlying these articulator-specific motor contributions to speech processing is unknown. Furthermore, it is unclear whether they depend on attention. Here, we used magnetoencephalography and TMS to investigate the effect of attention on specificity and timing of interactions between the auditory and motor cortex during processing of speech sounds. We found that TMS-induced disruption of the motor lip representation modulated specifically the early auditory-cortex responses to lip-articulated speech sounds when they were attended. These articulator-specific modulations were left-lateralized and remarkably early, occurring 60-100 ms after sound onset. When speech sounds were ignored, the effect of this motor disruption on auditory-cortex responses was nonspecific and bilateral, and it started later, 170 ms after sound onset. The findings indicate that articulatory motor cortex can contribute to auditory processing of speech sounds even in the absence of behavioral tasks and when the sounds are not in the focus of attention. Importantly, the findings also show that attention can selectively facilitate the interaction of the auditory cortex with specific articulator representations during speech processing.
Nishiura, Takanobu; Nakamura, Satoshi
Humans communicate with each other through speech by focusing on the target speech among environmental sounds in real acoustic environments. We can easily identify the target sound from other environmental sounds. For hands-free speech recognition, the identification of the target speech from environmental sounds is imperative. This mechanism may also be important for a self-moving robot to sense the acoustic environments and communicate with humans. Therefore, this paper first proposes hidden Markov model (HMM)-based environmental sound source identification. Environmental sounds are modeled by three states of HMMs and evaluated using 92 kinds of environmental sounds. The identification accuracy was 95.4%. This paper also proposes a new HMM composition method that composes speech HMMs and an HMM of categorized environmental sounds for robust environmental sound-added speech recognition. As a result of the evaluation experiments, we confirmed that the proposed HMM composition outperforms the conventional HMM composition with speech HMMs and a noise (environmental sound) HMM trained using noise periods prior to the target speech in a captured signal. [Work supported by Ministry of Public Management, Home Affairs, Posts and Telecommunications of Japan.]
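The identification scheme above scores each incoming sound against per-category HMMs and picks the best-scoring model. A minimal sketch of the scoring step with discrete emissions and the scaled forward algorithm (the three-state topology follows the abstract, but the two toy categories and the binary symbol alphabet are assumptions for illustration):

```python
import numpy as np

def forward_loglik(obs, A, B, pi):
    """Log-likelihood of a discrete observation sequence under an HMM
    (transition matrix A, emission matrix B, initial distribution pi),
    computed with the scaled forward algorithm."""
    alpha = pi * B[:, obs[0]]
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        s = alpha.sum()
        loglik += np.log(s)
        alpha /= s
    return loglik

# Two toy 3-state models over a binary acoustic symbol alphabet
A  = np.full((3, 3), 1 / 3)           # uniform transitions
pi = np.full(3, 1 / 3)
B_ring = np.array([[0.9, 0.1]] * 3)   # "phone ring": mostly symbol 0
B_slam = np.array([[0.1, 0.9]] * 3)   # "door slam": mostly symbol 1

obs = [0, 0, 0, 0]
scores = {name: forward_loglik(obs, A, B, pi)
          for name, B in [("ring", B_ring), ("slam", B_slam)]}
best = max(scores, key=scores.get)    # maximum-likelihood category
```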
Zoefel, Benedikt; VanRullen, Rufin
Phase entrainment of neural oscillations, the brain's adjustment to rhythmic stimulation, is a central component in recent theories of speech comprehension: the alignment between brain oscillations and speech sound improves speech intelligibility. However, phase entrainment to everyday speech sound could also be explained by oscillations passively following the low-level periodicities (e.g., in sound amplitude and spectral content) of auditory stimulation-and not by an adjustment to the speech rhythm per se. Recently, using novel speech/noise mixture stimuli, we have shown that behavioral performance can entrain to speech sound even when high-level features (including phonetic information) are not accompanied by fluctuations in sound amplitude and spectral content. In the present study, we report that neural phase entrainment might underlie our behavioral findings. We observed phase-locking between electroencephalogram (EEG) and speech sound in response not only to original (unprocessed) speech but also to our constructed "high-level" speech/noise mixture stimuli. Phase entrainment to original speech and speech/noise sound did not differ in the degree of entrainment, but rather in the actual phase difference between EEG signal and sound. Phase entrainment was not abolished when speech/noise stimuli were presented in reverse (which disrupts semantic processing), indicating that acoustic (rather than linguistic) high-level features play a major role in the observed neural entrainment. Our results provide further evidence for phase entrainment as a potential mechanism underlying speech processing and segmentation, and for the involvement of high-level processes in the adjustment to the rhythm of speech. Copyright © 2015 Elsevier Inc. All rights reserved.
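Phase entrainment of this kind is commonly quantified with a phase-locking value (PLV) between the EEG and the band-limited speech signal. A hedged sketch using Hilbert phases (the signals here are synthetic sinusoids standing in for narrowband EEG and sound, not real recordings):

```python
import numpy as np
from scipy.signal import hilbert

def phase_locking_value(x, y):
    """|mean resultant| of the instantaneous phase difference between
    two narrowband signals: 1 = perfect locking, ~0 = no consistent
    phase relation."""
    dphi = np.angle(hilbert(x)) - np.angle(hilbert(y))
    return np.abs(np.mean(np.exp(1j * dphi)))

fs = 500.0
t = np.arange(0, 2, 1 / fs)
locked   = phase_locking_value(np.sin(2 * np.pi * 5 * t),
                               np.sin(2 * np.pi * 5 * t + 1.0))  # fixed lag
unlocked = phase_locking_value(np.sin(2 * np.pi * 5 * t),
                               np.sin(2 * np.pi * 9 * t))        # drifting phase
```

Note that a fixed phase lag still yields PLV near 1; the measure is sensitive to the consistency of the phase relation, not to the phase difference itself, which is why the study could compare the actual phase values separately.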
Sound attenuation with conventional acoustic materials is subject to the mass law and requires massive and bulky structures at low frequencies. A possible alternative solution is provided by the use of metamaterials, which are artificial materials engineered to obtain properties and characteristics that cannot be found in natural materials. The theory and applications of metamaterials, already consolidated in electromagnetism, can be extended to acoustics; in particular, they can be applied to improve the properties of acoustical panels. The design of acoustic metasurfaces that could effectively control transmitted sound in unconventional ways appears a significant subject to investigate, given its wide-ranging possible applications. In this contribution, we investigate the application of a metasurface-inspired technique to achieve the acoustical insulation of an environment. The designed surface has subwavelength thickness and structuring and could be realized with cheap, lightweight and sustainable materials. We present a few examples of such structures and analyze their acoustical behavior by means of full-wave simulations.
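The mass law mentioned above can be made concrete: for a limp panel, transmission loss grows roughly 6 dB per doubling of either frequency or surface mass, which is why low-frequency insulation demands heavy walls. A sketch of the standard normal-incidence form (the air impedance constant is an assumed round value for 20 °C):

```python
import math

RHO_C = 415.0  # characteristic impedance of air, ~Pa*s/m at 20 C (assumed)

def mass_law_tl(f_hz, surface_mass):
    """Normal-incidence mass-law transmission loss (dB) for a limp
    panel of surface mass in kg/m^2 at frequency f_hz:
    TL = 10*log10(1 + (pi*f*m / rho*c)^2)."""
    x = math.pi * f_hz * surface_mass / RHO_C
    return 10.0 * math.log10(1.0 + x * x)

tl_1k = mass_law_tl(1000.0, 10.0)   # a 10 kg/m^2 panel at 1 kHz
```

Doubling either argument adds close to 6 dB in the high-frequency limit, the behavior metasurface-based panels aim to beat at low frequencies.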
Overby, Megan S; Trainin, Guy; Smit, Ann Bosma; Bernthal, John E; Nelson, Ron
This archival study examined the relationship between the speech sound production skill of kindergarten children and literacy outcomes in Grades 1-3 in a data set where most children's vocabulary skills were within normal limits, speech therapy was not provided until 2nd grade, and phonological awareness instruction was discouraged at the time data were collected. Data were accessed from the Templin Archive (2004), and the speech sound production skill of 272 kindergartners was examined relative to literacy outcomes in 1st and 2nd grade (reading) and 3rd grade (spelling). Kindergartners in the 7th percentile for speech sound production skill scored more poorly in 1st- and 2nd-grade reading and 3rd-grade spelling than did kindergartners with average speech sound production skill; kindergartners in the 98th percentile achieved superior literacy skills compared to the mean. Phonological awareness mediated the effects of speech sound production skill on reading and spelling; vocabulary did not account for any unique variance. Speech sound disorders appear to be an overt manifestation of a complex interaction among variables influencing literacy skills, including nonlanguage cognition, vocabulary, letter knowledge, and phonological awareness. These interrelationships hold across the range of speech sound production skill, as children with superior speech sound production skill experience superior literacy outcomes.
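The mediation claim above (phonological awareness carrying the effect of speech sound production on reading) corresponds to the standard product-of-coefficients indirect effect. A hedged sketch on simulated data (the coefficients and sample size are invented; a real archival analysis would add covariates and bootstrap confidence intervals):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Simulated standardized scores with a built-in indirect path:
# speech -> phonological awareness (a = 0.6) -> reading (b = 0.5),
# plus a small direct effect of speech on reading (0.1)
speech = rng.standard_normal(n)
pa = 0.6 * speech + rng.standard_normal(n)
reading = 0.5 * pa + 0.1 * speech + rng.standard_normal(n)

def ols(y, *predictors):
    """Least-squares coefficients (intercept first)."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    return np.linalg.lstsq(X, y, rcond=None)[0]

a = ols(pa, speech)[1]            # path: speech -> mediator
b = ols(reading, pa, speech)[1]   # path: mediator -> reading, controlling speech
indirect = a * b                  # product-of-coefficients mediation estimate
```

The estimate should recover the simulated indirect effect of 0.6 × 0.5 = 0.3 up to sampling noise.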
Luders, Eileen; Kurth, Florian; Pigdon, Lauren; Conti-Ramsden, Gina; Reilly, Sheena; Morgan, Angela T
Speech sound disorder (SSD) is common, yet its neurobiology is poorly understood. Recent studies indicate structural and functional anomalies in one or both hemispheres, which might be accompanied by alterations in inter-hemispheric connectivity. Indeed, abnormalities of the corpus callosum - the main fiber tract connecting the two hemispheres - have been linked to speech and language deficits in associated disorders such as stuttering, dyslexia, and aphasia. However, there is a dearth of studies examining the corpus callosum in SSD. Here, we investigated whether a sample of 18 children with SSD differed in callosal morphology from 18 typically developing children carefully matched for age. Significantly reduced dimensions of the corpus callosum, particularly in the callosal anterior third, were observed in children with SSD. These findings, indicating pronounced callosal aberrations in SSD, make an important contribution to an understudied field of research and may suggest that SSD is accompanied by atypical lateralization of speech and language function. Copyright © 2017 IBRO. Published by Elsevier Ltd. All rights reserved.
Froyen, Dries J W; Bonte, Milene L; van Atteveldt, Nienke; Blomert, Leo
In transparent alphabetic languages, the expected standard for complete acquisition of letter-speech sound associations is within one year of reading instruction. The neural mechanisms underlying the acquisition of letter-speech sound associations have, however, hardly been investigated. The present
McDowell, Kimberly D.; Carroll, Jeri
The purpose of this study was twofold: (1) to examine the relations between speech sound accuracy, vocabulary, and phonological awareness, and (2) to examine the effect of word properties of neighborhood density and phonotactic probability on word learning within a storybook context, for children with and without speech sound inaccuracies. Fifty…
Gillam, Sandra Laing; Ford, Mikenzi Bentley
The current study was designed to examine the relationships between performance on a nonverbal phoneme deletion task administered in a dynamic assessment format with performance on measures of phoneme deletion, word-level reading, and speech sound production that required verbal responses for school-age children with speech sound disorders (SSDs).…
Tkach, Jean A.; Chen, Xu; Freebairn, Lisa A.; Schmithorst, Vincent J.; Holland, Scott K.; Lewis, Barbara A.
Speech sound disorders (SSD) are the largest group of communication disorders observed in children. One explanation for these disorders is that children with SSD fail to form stable phonological representations when acquiring the speech sound system of their language due to poor phonological memory (PM). The goal of this study was to examine PM in…
McLeod, Sharynne; Goldstein, Brian
Multilingual Aspects of Speech Sound Disorders in Children explores both multilingual and multicultural aspects of children with speech sound disorders. The 30 chapters have been written by 44 authors from 16 different countries about 112 languages and dialects. The book is designed to translate research into clinical practice. It is divided into…
Fabiano-Smith, Leah; Cuzner, Suzanne Lea
The purpose of this study was to utilize a theoretical model of bilingual speech sound production as a framework for analyzing the speech of bilingual children with speech sound disorders. In order to distinguish speech difference from speech disorder, we examined between-language interaction on initial consonant deletion, an error pattern found cross-linguistically in the speech of children with speech sound disorders. Thirteen monolingual English-speaking and bilingual Spanish- and English-speaking preschoolers with speech sound disorders were audio-recorded during a single word picture-naming task and their recordings were phonetically transcribed. Initial consonant deletion errors were examined both quantitatively and qualitatively. An analysis of cross-linguistic effects and an analysis of phonemic complexity were performed. Monolingual English-speaking children exhibited initial consonant deletion at a significantly lower rate than bilingual children in their Spanish productions; however, no other quantitative differences were found across groups or languages. Qualitative differences yielded between-language interaction in the error patterns of bilingual children. Phonemic complexity appeared to play a role in initial consonant deletion. Evidence from the speech of bilingual children with speech sound disorders supports analysing bilingual speech using a cross-linguistic framework. Both theoretical and clinical implications are discussed.
Tracy M Centanni
In utero RNAi of the dyslexia-associated gene Kiaa0319 in rats (KIA-) degrades cortical responses to speech sounds and increases trial-by-trial variability in onset latency. We tested the hypothesis that KIA- rats would be impaired at speech sound discrimination. KIA- rats needed twice as much training in quiet conditions to perform at control levels and remained impaired at several speech tasks. Focused training using truncated speech sounds was able to normalize speech discrimination in quiet and background noise conditions. Training also normalized trial-by-trial neural variability and temporal phase locking. Cortical activity from speech-trained KIA- rats was sufficient to accurately discriminate between similar consonant sounds. These results provide the first direct evidence that reduced expression of the dyslexia-associated gene KIAA0319 can cause phoneme processing impairments similar to those seen in dyslexia and that intensive behavioral therapy can eliminate these impairments.
Tomblin, J. Bruce; Peng, Shu-Chen; Spencer, Linda J.; Lu, Nelson
Purpose: This study characterized the development of speech sound production in prelingually deaf children with a minimum of 8 years of cochlear implant (CI) experience. Method: Twenty-seven pediatric CI recipients' spontaneous speech samples from annual evaluation sessions were phonemically transcribed. Accuracy for these speech samples was…
Preston, Jonathan L.; Seki, Ayumi
Purpose: To describe (a) the assessment of residual speech sound disorders (SSDs) in bilinguals by distinguishing speech patterns associated with second language acquisition from patterns associated with misarticulations and (b) how assessment of domains such as speech motor control and phonological awareness can provide a more complete…
Eadie, Patricia; Morgan, Angela; Ukoumunne, Obioha C; Ttofari Eecen, Kyriaki; Wake, Melissa; Reilly, Sheena
The epidemiology of preschool speech sound disorder is poorly understood. Our aims were to determine: the prevalence of idiopathic speech sound disorder; the comorbidity of speech sound disorder with language and pre-literacy difficulties; and the factors contributing to speech outcome at 4 years. One thousand four hundred and ninety-four participants from an Australian longitudinal cohort completed speech, language, and pre-literacy assessments at 4 years. Prevalence of speech sound disorder (SSD) was defined by standard score performance of ≤79 on a speech assessment. Logistic regression examined predictors of SSD within four domains: child and family; parent-reported speech; cognitive-linguistic; and parent-reported motor skills. At 4 years the prevalence of speech disorder in an Australian cohort was 3.4%. Comorbidity with SSD was 40.8% for language disorder and 20.8% for poor pre-literacy skills. Sex, maternal vocabulary, socio-economic status, and family history of speech and language difficulties predicted SSD, as did 2-year speech, language, and motor skills. Together these variables provided good discrimination of SSD (area under the curve=0.78). This is the first epidemiological study to demonstrate prevalence of SSD at 4 years of age that was consistent with previous clinical studies. Early detection of SSD at 4 years should focus on family variables and speech, language, and motor skills measured at 2 years. © 2014 Mac Keith Press.
McKinnon, David H.; McLeod, Sharynne; Reilly, Sheena
Purpose: The aims of this study were threefold: to report teachers' estimates of the prevalence of speech disorders (specifically, stuttering, voice, and speech-sound disorders); to consider correspondence between the prevalence of speech disorders and gender, grade level, and socioeconomic status; and to describe the level of support provided to…
Preston, Jonathan L.; Hull, Margaret; Edwards, Mary Louise
Purpose: To determine if speech error patterns in preschoolers with speech sound disorders (SSDs) predict articulation and phonological awareness (PA) outcomes almost 4 years later. Method: Twenty-five children with histories of preschool SSDs (and normal receptive language) were tested at an average age of 4;6 (years;months) and were followed up…
Engineer, Crystal T; Shetake, Jai A; Engineer, Navzer D; Vrana, Will A; Wolf, Jordan T; Kilgard, Michael P
Many individuals with language learning impairments exhibit temporal processing deficits and degraded neural responses to speech sounds. Auditory training can improve both the neural and behavioral deficits, though significant deficits remain. Recent evidence suggests that vagus nerve stimulation (VNS) paired with rehabilitative therapies enhances both cortical plasticity and recovery of normal function. We predicted that pairing VNS with rapid tone trains would enhance the primary auditory cortex (A1) response to unpaired novel speech sounds. VNS was paired with tone trains 300 times per day for 20 days in adult rats. Responses to isolated speech sounds, compressed speech sounds, word sequences, and compressed word sequences were recorded in A1 following the completion of VNS-tone train pairing. Pairing VNS with rapid tone trains resulted in stronger, faster, and more discriminable A1 responses to speech sounds presented at conversational rates. This study extends previous findings by documenting that VNS paired with rapid tone trains altered the neural response to novel unpaired speech sounds. Future studies are necessary to determine whether pairing VNS with appropriate auditory stimuli could potentially be used to improve both neural responses to speech sounds and speech perception in individuals with receptive language disorders. Copyright © 2017 Elsevier Inc. All rights reserved.
The Computer Music Modeling and Retrieval (CMMR) 2011 conference was the 8th event of this international series, and the first that took place outside Europe. Since its beginnings in 2003, this conference has been co-organized by the Laboratoire de Mécanique et d'Acoustique in Marseille, France, and the Department of Architecture, Design and Media Technology (ad:mt), University of Aalborg, Esbjerg, Denmark, and has taken place in France, Italy, Spain, and Denmark. Historically, CMMR offers a cross-disciplinary overview of current music information retrieval and sound modeling activities and related topics within the related fields of interest. Frontiers of Research in Speech and Music (FRSM) has been organized in different parts of India every year since 1991. Previous conferences were held at ITC-SRA Kolkata, NPL New Delhi, BHU Varanasi, IIT Kanpur, Lucknow University, AIISH Mysore, IITM Gwalior, Utkal...
Gaither, Sarah E; Cohen-Goldberg, Ariel M; Gidney, Calvin L; Maddox, Keith B
Research has shown that priming one's racial identity can alter biracial individuals' social behavior, but can such priming also influence their speech? Language is often used as a marker of one's social group membership, and studies have shown that social context can affect the style of language that a person chooses to use, but this work has yet to be extended to the biracial population. Audio clips were extracted from a previous study involving biracial Black/White participants who had either their Black or White racial identity primed. Condition-blind coders rated Black-primed biracial participants as sounding significantly more Black and White-primed biracial participants as sounding significantly more White, both when listening to whole (Study 1a) and thin-sliced (Study 1b) clips. Further linguistic analyses (Studies 2a-c) were inconclusive regarding the features that differed between the two groups. Future directions regarding the need to investigate the intersections between social identity priming and language behavior with a biracial lens are discussed.
McLeod, Sharynne; Verdon, Sarah; Bowen, Caroline
A major challenge for the speech-language pathology profession in many cultures is to address the mismatch between the "linguistic homogeneity of the speech-language pathology profession and the linguistic diversity of its clientele" (Caesar & Kohler, 2007, p. 198). This paper outlines the development of the Multilingual Children with Speech Sound Disorders: Position Paper created to guide speech-language pathologists' (SLPs') facilitation of multilingual children's speech. An international expert panel was assembled comprising 57 researchers (SLPs, linguists, phoneticians, and speech scientists) with knowledge about multilingual children's speech, or children with speech sound disorders. Combined, they had worked in 33 countries and used 26 languages in professional practice. Fourteen panel members met for a one-day workshop to identify key points for inclusion in the position paper. Subsequently, 42 additional panel members participated online to contribute to drafts of the position paper. A thematic analysis was undertaken of the major areas of discussion using two data sources: (a) face-to-face workshop transcript (133 pages) and (b) online discussion artifacts (104 pages). Finally, a moderator with international expertise in working with children with speech sound disorders facilitated the incorporation of the panel's recommendations. The following themes were identified: definitions, scope, framework, evidence, challenges, practices, and consideration of a multilingual audience. The resulting position paper contains guidelines for providing services to multilingual children with speech sound disorders (http://www.csu.edu.au/research/multilingual-speech/position-paper). The paper is structured using the International Classification of Functioning, Disability and Health: Children and Youth Version (World Health Organization, 2007) and incorporates recommendations for (a) children and families, (b) SLPs' assessment and intervention, (c) SLPs' professional
Grogan-Johnson, Susan; Gabel, Rodney M; Taylor, Jacquelyn; Rowan, Lynne E; Alvares, Robin; Schenker, Jason
This article describes a school-based telehealth service delivery model and reports outcomes made by school-age students with speech sound disorders in a rural Ohio school district. Speech therapy using computer-based speech sound intervention materials was provided either by live interactive videoconferencing (telehealth), or conventional side-by-side intervention. Progress was measured using pre- and post-intervention scores on the Goldman Fristoe Test of Articulation-2 (Goldman & Fristoe, 2002). Students in both service delivery models made significant improvements in speech sound production, with students in the telehealth condition demonstrating greater mastery of their Individual Education Plan (IEP) goals. Live interactive videoconferencing thus appears to be a viable method for delivering intervention for speech sound disorders to children in a rural, public school setting.
The present study was intended to make electrophysiological investigations into the preattentive perception of native and non-native speech sounds. We recorded the mismatch negativity elicited by single-syllable changes of both native and non-native speech-sound contrasts in tonal languages. EEGs were recorded, and low-resolution brain electromagnetic tomography (LORETA) was utilized to explore the neural electrical activity. Our results suggested that the left hemisphere was predominant in the perception of native speech sounds, whereas the non-native speech sound was perceived predominantly by the right hemisphere, which may be explained by the specialization in processing the prosodic and emotional components of speech formed in this hemisphere.
Shofner, William P
The behavioral responses of chinchillas to noise-vocoded versions of naturally spoken speech sounds were measured using stimulus generalization and operant conditioning. Behavioral performance for speech generalization by chinchillas is compared to recognition by a group of human listeners for the identical speech sounds. The ability of chinchillas to generalize the vocoded versions as tokens of the natural speech sounds is far less than recognition by human listeners. In many cases, responses of chinchillas to noise-vocoded speech sounds were more similar to responses to band limited noise than to the responses to natural speech sounds. Chinchillas were also tested with a middle C musical note as played on a piano. Comparison of the responses of chinchillas for the middle C condition to the responses obtained for the speech conditions suggest that chinchillas may be more influenced by fundamental frequency than by formant structure. The differences between vocoded speech perception in chinchillas and human listeners may reflect differences in their abilities to resolve the formants along the cochlea. It is argued that lengthening of the cochlea during human evolution may have provided one of the auditory mechanisms that influenced the evolution of speech-specific mechanisms.
Although RaSTI is a good indicator of the speech intelligibility capability of auditoria and similar spaces, during the past 2-3 years it has been shown that RaSTI is not a robust predictor of sound system intelligibility performance. Instead, it is now recommended, within both national and international codes and standards, that full STI measurement and analysis be employed. However, new research is reported indicating that STI is neither as flawless nor as robust as many believe. The paper highlights a number of potential error mechanisms. It is shown that the measurement technique and signal excitation stimulus can have a significant effect on the overall result and accuracy, particularly where DSP-based equipment is employed. It is also shown that in its current state of development, STI is not capable of appropriately accounting for a number of fundamental speech and system attributes, including typical sound system frequency response variations and anomalies. This is particularly shown to be the case when a system is operating under reverberant conditions. Comparisons between actual system measurements and corresponding word score data are reported, where errors of up to 50% are found. The implications for VA and PA system performance verification will be discussed.
Earle, F Sayako; Myers, Emily B
Adults learning a new language are faced with a significant challenge: non-native speech sounds that are perceptually similar to sounds in one's native language can be very difficult to acquire. Sleep and native language interference, 2 factors that may help to explain this difficulty in acquisition, are addressed in 3 studies. Results of Experiment 1 showed that participants trained on a non-native contrast at night improved in discrimination 24 hr after training, while those trained in the morning showed no such improvement. Experiments 2 and 3 addressed the possibility that incidental exposure to perceptually similar native language speech sounds during the day interfered with maintenance in the morning group. Taken together, results show that the ultimate success of non-native speech sound learning depends not only on the similarity of learned sounds to the native language repertoire, but also on interference from native language sounds before sleep. (c) 2015 APA, all rights reserved.
Flaherty, Mary; Dent, Micheal L; Sawusch, James R
The influence of experience with human speech sounds on speech perception in budgerigars, vocal mimics whose speech exposure can be tightly controlled in a laboratory setting, was measured. Budgerigars were divided into groups that differed in auditory exposure and then tested on a cue-trading identification paradigm with synthetic speech. Phonetic cue trading is a perceptual phenomenon observed when changes on one cue dimension are offset by changes in another cue dimension while still maintaining the same phonetic percept. The current study examined whether budgerigars would trade the cues of voice onset time (VOT) and the first formant onset frequency when identifying syllable initial stop consonants and if this would be influenced by exposure to speech sounds. There were a total of four different exposure groups: No speech exposure (completely isolated), Passive speech exposure (regular exposure to human speech), and two Speech-trained groups. After the exposure period, all budgerigars were tested for phonetic cue trading using operant conditioning procedures. Birds were trained to peck keys in response to different synthetic speech sounds that began with "d" or "t" and varied in VOT and frequency of the first formant at voicing onset. Once training performance criteria were met, budgerigars were presented with the entire intermediate series, including ambiguous sounds. Responses on these trials were used to determine which speech cues were used, if a trading relation between VOT and the onset frequency of the first formant was present, and whether speech exposure had an influence on perception. Cue trading was found in all birds and these results were largely similar to those of a group of humans. Results indicated that prior speech experience was not a requirement for cue trading by budgerigars. The results are consistent with theories that explain phonetic cue trading in terms of a rich auditory encoding of the speech signal.
Tomblin, J Bruce; Peng, Shu-Chen; Spencer, Linda J; Lu, Nelson
This study characterized the development of speech sound production in prelingually deaf children with a minimum of 8 years of cochlear implant (CI) experience. Twenty-seven pediatric CI recipients' spontaneous speech samples from annual evaluation sessions were phonemically transcribed. Accuracy for these speech samples was evaluated in piecewise regression models. As a group, pediatric CI recipients showed steady improvement in speech sound production following implantation, but the improvement rate declined after 6 years of device experience. Piecewise regression models indicated that the slope estimating the participants' improvement rate was statistically greater than 0 during the first 6 years postimplantation, but not after 6 years. The group of pediatric CI recipients' accuracy of speech sound production after 4 years of device experience reasonably predicts their speech sound production after 5-10 years of device experience. The development of speech sound production in prelingually deaf children stabilizes after 6 years of device experience, and typically approaches a plateau by 8 years of device use. Early growth in speech before 4 years of device experience did not predict later rates of growth or levels of achievement. However, good predictions could be made after 4 years of device use.
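The piecewise regression used in the study above can be sketched in a minimal form. This is an illustrative reconstruction, not the authors' code or data: the fixed knot at 6 years of device experience, the hinge-term parameterization, and the simulated accuracy values are all assumptions introduced for the example.

```python
import numpy as np

def fit_piecewise(years, accuracy, knot=6.0):
    """Fit a two-segment linear model with a fixed breakpoint (knot).

    Design matrix columns: intercept, years of device experience, and a
    hinge term max(0, years - knot) that lets the slope change at the knot.
    Returns (intercept, slope_before_knot, slope_change_after_knot).
    """
    x = np.asarray(years, dtype=float)
    y = np.asarray(accuracy, dtype=float)
    hinge = np.maximum(0.0, x - knot)
    X = np.column_stack([np.ones_like(x), x, hinge])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Simulated accuracy: steady gains of 5 points/year, then a plateau after year 6,
# mirroring the pattern the study reports (values are invented for illustration).
years = np.arange(1, 11, dtype=float)
acc = np.where(years <= 6, 40 + 5 * years, 70.0)
b0, b1, b2 = fit_piecewise(years, acc)
# Pre-knot slope is b1; post-knot slope is b1 + b2, which is ~0 at a plateau.
```

Testing whether the post-knot slope (b1 + b2) differs from zero is what distinguishes continued growth from the plateau the authors describe.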
Myers, Emily B
Over the course of development, speech sounds that are contrastive in one's native language tend to become perceived categorically: that is, listeners are unaware of variation within phonetic categories while showing excellent sensitivity to speech sounds that span linguistically meaningful phonetic category boundaries. The end stage of this developmental process is that the perceptual systems that handle acoustic-phonetic information show special tuning to native language contrasts, and as such, category-level information appears to be present at even fairly low levels of the neural processing stream. Research on adults acquiring non-native speech categories offers an avenue for investigating the interplay of category-level information and perceptual sensitivities to these sounds as speech categories emerge. In particular, one can observe the neural changes that unfold as listeners learn not only to perceive acoustic distinctions that mark non-native speech sound contrasts, but also to map these distinctions onto category-level representations. An emergent literature on the neural basis of novel and non-native speech sound learning offers new insight into this question. In this review, I will examine this literature in order to answer two key questions. First, where in the neural pathway does sensitivity to category-level phonetic information first emerge over the trajectory of speech sound learning? Second, how do frontal and temporal brain areas work in concert over the course of non-native speech sound learning? Finally, in the context of this literature I will describe a model of speech sound learning in which rapidly-adapting access to categorical information in the frontal lobes modulates the sensitivity of stable, slowly-adapting responses in the temporal lobes.
Soskey, Laura N; Allen, Paul D; Bennetto, Loisa
One of the earliest observable impairments in autism spectrum disorder (ASD) is a failure to orient to speech and other social stimuli. Auditory spatial attention, a key component of orienting to sounds in the environment, has been shown to be impaired in adults with ASD. Additionally, specific deficits in orienting to social sounds could be related to increased acoustic complexity of speech. We aimed to characterize auditory spatial attention in children with ASD and neurotypical controls, and to determine the effect of auditory stimulus complexity on spatial attention. In a spatial attention task, target and distractor sounds were played randomly in rapid succession from speakers in a free-field array. Participants attended to a central or peripheral location, and were instructed to respond to target sounds at the attended location while ignoring nearby sounds. Stimulus-specific blocks evaluated spatial attention for simple non-speech tones, speech sounds (vowels), and complex non-speech sounds matched to vowels on key acoustic properties. Children with ASD had significantly more diffuse auditory spatial attention than neurotypical children when attending front, indicated by increased responding to sounds at adjacent non-target locations. No significant differences in spatial attention emerged based on stimulus complexity. Additionally, in the ASD group, more diffuse spatial attention was associated with more severe ASD symptoms but not with general inattention symptoms. Spatial attention deficits have important implications for understanding social orienting deficits and atypical attentional processes that contribute to core deficits of ASD. Autism Res 2017, 10: 1405-1416. © 2017 International Society for Autism Research, Wiley Periodicals, Inc.
Leech, Robert; Holt, Lori L; Devlin, Joseph T; Dick, Frederic
Regions of the human temporal lobe show greater activation for speech than for other sounds. These differences may reflect intrinsically specialized domain-specific adaptations for processing speech, or they may be driven by the significant expertise we have in listening to the speech signal. To test the expertise hypothesis, we used a video-game-based paradigm that tacitly trained listeners to categorize acoustically complex, artificial nonlinguistic sounds. Before and after training, we used functional MRI to measure how expertise with these sounds modulated temporal lobe activation. Participants' ability to explicitly categorize the nonspeech sounds predicted the change in pretraining to posttraining activation in speech-sensitive regions of the left posterior superior temporal sulcus, suggesting that emergent auditory expertise may help drive this functional regionalization. Thus, seemingly domain-specific patterns of neural activation in higher cortical regions may be driven in part by experience-based restructuring of high-dimensional perceptual space.
Allen, Melissa M
Clinicians do not have an evidence base they can use to recommend optimum intervention intensity for preschool children who present with speech sound disorder (SSD). This study examined the effect of dose frequency on phonological performance and the efficacy of the multiple oppositions approach. Fifty-four preschool children with SSD were randomly assigned to one of three intervention conditions. Two intervention conditions received the multiple oppositions approach either 3 times per week for 8 weeks (P3) or once weekly for 24 weeks (P1). A control (C) condition received a storybook intervention. Percentage of consonants correct (PCC) was evaluated at 8 weeks and after 24 sessions. PCC gain was examined after a 6-week maintenance period. The P3 condition had a significantly better phonological outcome than the P1 and C conditions at 8 weeks and than the P1 condition after 24 weeks. There were no significant differences between the P1 and C conditions. There was no significant difference between the P1 and P3 conditions in PCC gain during the maintenance period. Preschool children with SSD who received the multiple oppositions approach made significantly greater gains when they were provided with a more intensive dose frequency and when cumulative intervention intensity was held constant.
Peterson, Robin L; Pennington, Bruce F; Shriberg, Lawrence D; Boada, Richard
In this study, the authors evaluated literacy outcome in children with histories of speech sound disorder (SSD) who were characterized along 2 dimensions: broader language function and persistence of SSD. In previous studies, authors have demonstrated that each dimension relates to literacy but have not disentangled their effects. Two groups of children (86 SSD and 37 controls) were recruited at ages 5-6 and were followed longitudinally. The authors report the literacy of children with SSD at ages 7-9, compared with controls and national norms, and relative to language skill and SSD persistence (both measured at age 5-6). The SSD group demonstrated elevated rates of reading disability. Language skill but not SSD persistence predicted later literacy. However, SSD persistence was associated with phonological awareness impairments. Phonological awareness alone predicted literacy outcome less well than a model that also included syntax and nonverbal IQ. Results support previous literature findings that SSD history predicts literacy difficulties and that the association is strongest for SSD + language impairment (LI). Magnitude of phonological impairment alone did not determine literacy outcome, as predicted by the core phonological deficit hypothesis. Instead, consistent with a multiple deficit approach, phonological deficits appeared to interact with other cognitive factors in literacy development.
Degtyarev, Vladimir M.; Gusev, Mikhail N.
In this report, we offer several recommendations for choosing the parameters of the sound fragments that form the sound base used in a Russian text-to-speech synthesis system. It is no secret that the quality of concatenative synthesis is largely determined at the stage of speaker selection and preparation of the base of speaker voice samples. The recommendations were derived from a statistical analysis of a large volume of texts of various types, and they concern both individual sound fragments and groups of fragments. Sound parameters were extracted with an automatic linguistic processor that includes phonetic and prosodic transcribers. The duration, intensity, and fundamental frequency of sounds in various contexts and intonational contours were analyzed. A sound base produced according to these recommendations improves the intelligibility and naturalness of synthetic speech by minimizing the changes applied to the speaker's voice samples.
Tatiane Faria Barrozo
Full Text Available ABSTRACT INTRODUCTION: Considering the importance of auditory information for the acquisition and organization of phonological rules, the assessment of (central) auditory processing contributes to both the diagnosis and targeting of speech therapy in children with speech sound disorders. OBJECTIVE: To study phonological measures and (central) auditory processing of children with speech sound disorder. METHODS: Clinical and experimental study, with 21 subjects with speech sound disorder aged between 7.0 and 9.11 years, divided into two groups according to the presence of (central) auditory processing disorder. The assessment comprised tests of phonology, speech inconsistency, and metalinguistic abilities. RESULTS: The group with (central) auditory processing disorder demonstrated greater severity of speech sound disorder. The cutoff value obtained for the process density index was the one that best characterized the occurrence of phonological processes for children above 7 years of age. CONCLUSION: The comparison of the tests evaluated between the two groups showed differences in some phonological and metalinguistic abilities. Children with an index value above 0.54 demonstrated strong tendencies towards presenting a (central) auditory processing disorder, and this measure was effective in indicating the need for evaluation in children with speech sound disorder.
Baker, Elise; McLeod, Sharynne
Purpose: This article provides both a tutorial and a clinical example of how speech-language pathologists (SLPs) can conduct evidence-based practice (EBP) when working with children with speech sound disorders (SSDs). It is a companion paper to the narrative review of 134 intervention studies for children who have an SSD (Baker & McLeod, 2011).…
Baker, Elise; McLeod, Sharynne
Purpose: This article provides a comprehensive narrative review of intervention studies for children with speech sound disorders (SSD). Its companion paper (Baker & McLeod, 2011) provides a tutorial and clinical example of how speech-language pathologists (SLPs) can engage in evidence-based practice (EBP) for this clinical population. Method:…
Watts Pappas, Nicole; McAllister, Lindy; McLeod, Sharynne
Parental beliefs and experiences regarding involvement in speech intervention for their child with mild to moderate speech sound disorder (SSD) were explored using multiple, sequential interviews conducted during a course of treatment. Twenty-one interviews were conducted with seven parents of six children with SSD: (1) after their child's initial…
Munson, Benjamin; Krause, Miriam O. P.
Background: Psycholinguistic models of language production provide a framework for determining the locus of language breakdown that leads to speech-sound disorder (SSD) in children. Aims: To examine whether children with SSD differ from their age-matched peers with typical speech and language development (TD) in the ability phonologically to…
Harrison, Linda J.; McLeod, Sharynne; McAllister, Lindy; McCormack, Jane
This study sought to assess the level of correspondence between parent and teacher report of concern about young children's speech and specialist assessment of speech sound disorders (SSD). A sample of 157 children aged 4-5 years was recruited in preschools and long day care centres in Victoria and New South Wales (NSW). SSD was assessed…
McLeod, Sharynne; Crowe, Kathryn; Masso, Sarah; Baker, Elise; McCormack, Jane; Wren, Yvonne; Roulstone, Susan; Howland, Charlotte
Speech sound disorders are a common communication difficulty in preschool children. Teachers indicate difficulty identifying and supporting these children. The aim of this research was to describe speech and language characteristics of children identified by their parents and/or teachers as having possible communication concerns. 275 Australian 4-…
Jonathan L Preston
Full Text Available Ultrasound imaging is an adjunct to traditional speech therapy that has been shown to be beneficial in the remediation of speech sound errors. Ultrasound biofeedback can be utilized during therapy to provide clients additional knowledge about their tongue shapes when attempting to produce sounds that are in error. The additional feedback may assist children with childhood apraxia of speech in stabilizing motor patterns, thereby facilitating more consistent and accurate productions of sounds and syllables. However, due to its specialized nature, ultrasound visual feedback is a technology that is not widely available to clients. Short-term intensive treatment programs are one option that can be utilized to expand access to ultrasound biofeedback. Schema-based motor learning theory suggests that short-term intensive treatment programs (massed practice) may assist children in acquiring more accurate motor patterns. In this case series, three participants aged 10-14 diagnosed with childhood apraxia of speech attended 16 hours of speech therapy over a two-week period to address residual speech sound errors. Two participants had distortions on rhotic sounds, while the third participant demonstrated lateralization of sibilant sounds. During therapy, cues were provided to assist participants in obtaining a tongue shape that facilitated a correct production of the erred sound. Additional practice without ultrasound was also included. Results suggested that all participants showed signs of acquisition of sounds in error. Generalization and retention results were mixed. One participant showed generalization and retention of sounds that were treated; one showed generalization but limited retention; and the third showed no evidence of generalization or retention. Individual characteristics that may facilitate generalization are discussed. Short-term intensive treatment programs using ultrasound biofeedback may result in the acquisition of more accurate motor patterns.
Froyen, Dries; van Atteveldt, Nienke; Blomert, Leo
In contrast with, for example, audiovisual speech, the relation between the visual and auditory properties of letters and speech sounds is artificial and learned only by explicit instruction. The arbitrariness of the audiovisual link, together with the widespread usage of letter-speech sound pairs in
To, Carol K S; Mcleod, Sharynne; Cheung, Pamela S P
The aim of this article was to describe phonetic variations and sound changes in Hong Kong Cantonese (HKC) to provide speech-language pathologists with information about acceptable variants of standard pronunciations for speech sound assessments. Study 1 examined the pattern of variations and changes based on past diachronic research and historical written records. Nine phonetic variations were found. Five in syllable-initial and syllabic contexts: (1) [n-] → [l-], (2) [ŋ-] → Ø-, (3) Ø- → [ŋ-], (4) [k(w)ɔ-] → [kɔ-], (5) syllabic [ŋ̍] → [m̩]; and four in syllable-final contexts: (6) [-ŋ] → [-n], (7) [-n] → [-ŋ], (8) [-k] → [-t], (9) [-t] → [-k]. Historical records demonstrated the pattern of variation and changes in HKC across time. In study 2, a large-scale synchronic study of speakers of differing ages was undertaken to determine acceptable phonetic variations of HKC for speech sound assessments. In the synchronic study, single-words were elicited from 138 children (10;8-12;4) and 112 adults (18-45 years) who spoke Cantonese and lived in Hong Kong. Synchronic evidence demonstrated five acceptable variants in syllable-initial and syllabic contexts: (1) [n-] → [l-], (2) [ŋ-] → Ø-, (3) Ø- → [ŋ-], (4) [k(w)ɔ-] → [kɔ-] and (5) syllabic [ŋ̍] → [m̩] and four incomplete sound changes in syllable-final contexts: (6) [-ŋ] → [-n], (7) [-n] → [-ŋ], (8) [-k] → [-t] and (9) [-t] → [-k]. The incomplete sound changes may still be accepted as variants in speech sound assessments unless related speech problems are indicated.
Cummings, Alycia E.; Barlow, Jessica A.
The goal of this research programme was to evaluate the role of word lexicality in effecting phonological change in children's sound systems. Four children with functional speech sound disorders (SSDs) were enrolled in an across-subjects multiple baseline single-subject design; two were treated using high-frequency real words (RWs) and two were…
Preston, Jonathan L.; Leece, Megan C.; Maas, Edwin
Background: There is a need to develop effective interventions and to compare the efficacy of different interventions for children with residual speech-sound errors (RSSEs). Rhotics (the r-family of sounds) are frequently in error in American English-speaking children with RSSEs and are commonly targeted in treatment. One treatment approach involves…
Martinek, J; Tatar, M; Javorka, M
Objective monitoring of cough sounds over extended periods is an important step toward a better understanding of this symptom. Because ambulatory cough monitoring systems are not commercially available, we built our own monitoring system, which is able to distinguish between voluntary cough sounds and speech in healthy volunteers. Twenty-minute sound records were obtained using a portable digital voice recorder. Characteristics of the sound events were calculated in the time and frequency domains and by nonlinear analysis. Based on selected parameters, a classification tree was constructed to classify cough and non-cough sound events. We validated the usefulness of our algorithm against manual cough counts obtained by a trained observer. The median sensitivity was 100% (interquartile range 98-100) and the median specificity was 95% (interquartile range 90-97). In conclusion, we developed an algorithm that distinguishes between voluntary cough sounds and speech with a high degree of accuracy.
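The pipeline this abstract describes (per-event acoustic features feeding a classification tree) can be sketched roughly as follows. The features (event duration, zero-crossing rate) and the thresholds in the toy tree are illustrative assumptions, not the parameters derived in the study:

```python
import math
import random

def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return crossings / max(len(samples) - 1, 1)

def classify_event(samples, sample_rate):
    """Toy two-node decision tree labelling a sound event 'cough' or 'speech'.

    Thresholds are illustrative only. Coughs tend to be short, broadband
    bursts (high zero-crossing rate); voiced speech is longer and more
    periodic (lower zero-crossing rate).
    """
    duration_s = len(samples) / sample_rate
    zcr = zero_crossing_rate(samples)
    if duration_s < 0.6 and zcr > 0.2:
        return "cough"
    return "speech"

# Synthetic stand-ins: a short noise burst vs. a longer 150 Hz "voiced" tone.
random.seed(0)
fs = 8000
burst = [random.uniform(-1, 1) for _ in range(int(0.3 * fs))]
vowel = [math.sin(2 * math.pi * 150 * n / fs) for n in range(fs)]
print(classify_event(burst, fs))  # prints: cough
print(classify_event(vowel, fs))  # prints: speech
```

A real system would compute many more features (spectral, nonlinear) and fit the tree from labelled data rather than hand-picking splits.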
Engineer, Crystal T; Centanni, Tracy M; Im, Kwok W; Rahebi, Kimiya C; Buell, Elizabeth P; Kilgard, Michael P
Fragile X syndrome is the most common inherited form of intellectual disability and the leading genetic cause of autism. Impaired phonological processing in fragile X syndrome interferes with the development of language skills. Although auditory cortex responses are known to be abnormal in fragile X syndrome, it is not clear how these differences impact speech sound processing. This study provides the first evidence that the cortical representation of speech sounds is impaired in Fmr1 knockout rats, despite normal speech discrimination behavior. Evoked potentials and spiking activity in response to speech sounds, noise burst trains, and tones were significantly degraded in primary auditory cortex, anterior auditory field and the ventral auditory field. Neurometric analysis of speech evoked activity using a pattern classifier confirmed that activity in these fields contains significantly less information about speech sound identity in Fmr1 knockout rats compared to control rats. Responses were normal in the posterior auditory field, which is associated with sound localization. The greatest impairment was observed in the ventral auditory field, which is related to emotional regulation. Dysfunction in the ventral auditory field may contribute to poor emotional regulation in fragile X syndrome and may help explain the observation that later auditory evoked responses are more disturbed in fragile X syndrome compared to earlier responses. Rodent models of fragile X syndrome are likely to prove useful for understanding the biological basis of fragile X syndrome and for testing candidate therapies. Copyright © 2014 Elsevier B.V. All rights reserved.
Okuno, Hiroshi G.; Nakatani, Tomohiro; Kawabata, Takeshi [NTT Basic Research Laboratories, Kanagawa (Japan)]
This paper reports the preliminary results of experiments on listening to several sounds at once. Two issues are addressed: segregating speech streams from a mixture of sounds, and interfacing speech stream segregation with automatic speech recognition (ASR). Speech stream segregation (SSS) is modeled as a process of extracting harmonic fragments, grouping these extracted harmonic fragments, and substituting some sounds for non-harmonic parts of groups. This system is implemented by extending the harmonic-based stream segregation system reported at AAAI-94 and IJCAI-95. The main problem in interfacing SSS with HMM-based ASR is how to improve the recognition performance which is degraded by spectral distortion of segregated sounds caused mainly by the binaural input, grouping, and residue substitution. Our solution is to re-train the parameters of the HMM with training data binauralized for four directions, to group harmonic fragments according to their directions, and to substitute the residue of harmonic fragments for non-harmonic parts of each group. Experiments with 500 mixtures of two women's utterances of a word showed that the cumulative accuracy of word recognition up to the 10th candidate of each woman's utterance is, on average, 75%.
Joshi, M.; Iyer, M.; Gupta, N.; Barreto, A.
In multiple speaker environments such as teleconferences we observe a loss of intelligibility, particularly if the sound is monaural in nature. In this study, we exploit the "Cocktail Party Effect", where a person can isolate one sound above all others using sound localization and gender cues. To improve clarity of speech, each speaker is assigned a direction using Head Related Transfer Functions (HRTFs) which creates an auditory map of multiple conversations. A mixture of male and female voices is used to improve comprehension.
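Full HRTF rendering convolves each source with measured head-related impulse responses. As a rough sketch of the underlying idea, interaural time and level differences (ITD/ILD) alone can assign each speaker a direction; this is a simplification of HRTF processing, not the authors' implementation, and the head-model constants are textbook approximations:

```python
import math

SPEED_OF_SOUND = 343.0   # m/s
HEAD_RADIUS = 0.0875     # m, average adult head (assumption)

def spatialize(mono, sample_rate, azimuth_deg):
    """Render a mono signal to stereo using interaural time and level
    differences, a crude stand-in for full HRTF filtering.

    azimuth_deg: 0 = straight ahead, +90 = listener's right, -90 = left.
    """
    az = math.radians(azimuth_deg)
    # Woodworth approximation of the ITD for a spherical head.
    itd = (HEAD_RADIUS / SPEED_OF_SOUND) * (abs(az) + math.sin(abs(az)))
    delay = int(round(itd * sample_rate))
    # Sine-law level panning supplies the ILD component.
    gain_l = math.cos((az + math.pi / 2) / 2)
    gain_r = math.sin((az + math.pi / 2) / 2)
    pad = [0.0] * delay
    if azimuth_deg >= 0:   # source on the right: far (left) ear hears it later
        left = pad + [gain_l * s for s in mono]
        right = [gain_r * s for s in mono] + pad
    else:                  # source on the left: right ear delayed
        left = [gain_l * s for s in mono] + pad
        right = pad + [gain_r * s for s in mono]
    return left, right
```

Mixing several speakers, each spatialized to a distinct azimuth, yields the "auditory map" of conversations the abstract describes.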
Full Text Available Phonetic symbolism is the phenomenon of speech sounds evoking images based on sensory experiences; it is often discussed together with cross-modal correspondence. Using Garner's task, Hirata, Kita, and Ukita (2009) showed a cross-modal congruence between brightness and voiced/voiceless consonants in Japanese speech sounds, which is known as phonetic symbolism. In the present study, we examined the effect of the meaning of mimetics (lexical words whose sounds reflect their meanings, like “ding-dong”) in the Japanese language on cross-modal correspondence. We conducted an experiment with Chinese speech sounds with or without aspiration, with Chinese participants. Chinese vocabulary also contains mimetics, but the presence of aspiration is unrelated to the meaning of Chinese mimetics. As a result, Chinese speech sounds with aspiration, which resemble voiceless consonants, were matched with white, whereas those without aspiration were matched with black. This result is identical to the pattern found in Japanese speakers and consequently suggests that cross-modal correspondence occurs without the effect of the meaning of mimetics. Whether these cross-modal correspondences are based purely on the physical properties of speech sounds or are affected by phonetic properties remains a question for further study.
Crystal T Engineer
Full Text Available Children with autism often have language impairments and degraded cortical responses to speech. Extensive behavioral interventions can improve language outcomes and cortical responses. Prenatal exposure to the antiepileptic drug valproic acid (VPA) increases the risk for autism and language impairment. Prenatal exposure to VPA also causes weaker and delayed auditory cortex responses in rats. In this study, we document speech sound discrimination ability in VPA-exposed rats and document the effect of extensive speech training on auditory cortex responses. VPA-exposed rats were significantly impaired at consonant, but not vowel, discrimination. Extensive speech training resulted in both stronger and faster anterior auditory field responses compared to untrained VPA-exposed rats, and restored responses to control levels. This neural response improvement generalized to non-trained sounds. The rodent VPA model of autism may be used to improve the understanding of speech processing in autism and contribute to improving language outcomes.
Katz, William F.; Mehta, Sonya
Pronunciation training studies have yielded important information concerning the processing of audiovisual (AV) information. Second language (L2) learners show increased reliance on bottom-up, multimodal input for speech perception (compared to monolingual individuals). However, little is known about the role of viewing one's own speech articulation processes during speech training. The current study investigated whether real-time, visual feedback for tongue movement can improve a speaker's l...
the nature of the mechanisms and representations that mediate speech perception during infancy, childhood, and adulthood. The talks are currently being...for the way speech perception develops throughout childhood and, at the same time, account for the way speech is perceived by adults whose behavior was...of syllable-monitoring tasks with bilingual English/French speakers, we were able to determine that some segmentation strategies are universal (or
Full Text Available Congenital amusics, or tone-deaf individuals, show difficulty in perceiving and producing small pitch differences. While amusia has marked effects on music perception, its impact on speech perception is less clear. Here we test the hypothesis that individual differences in pitch perception affect judgment of emotion in speech, by applying band-pass filters to spoken statements of emotional speech. A norming study was first conducted on Mechanical Turk to ensure that the intended emotions from the Macquarie Battery for Evaluation of Prosody (MBEP) were reliably identifiable by US English speakers. The most reliably identified emotional speech samples were used in Experiment 1, in which subjects performed a psychophysical pitch discrimination task, and an emotion identification task under band-pass and unfiltered speech conditions. Results showed a significant correlation between pitch discrimination threshold and emotion identification accuracy for band-pass filtered speech, with amusics (defined here as those with a pitch discrimination threshold > 16 Hz) performing worse than controls. This relationship with pitch discrimination was not seen in unfiltered speech conditions. Given the dissociation between band-pass filtered and unfiltered speech conditions, we inferred that amusics may be compensating for poorer pitch perception by using speech cues that are filtered out in this manipulation.
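A band-pass manipulation like the one described can be approximated with a standard second-order (biquad) band-pass filter. The filter form below follows the widely used RBJ audio-EQ cookbook; the center frequency and Q are illustrative choices, not the exact filters used in the study:

```python
import math

def bandpass_biquad(signal, sample_rate, center_hz, q=1.0):
    """Second-order (biquad) band-pass filter, RBJ audio-EQ cookbook form.

    Passes energy near center_hz and attenuates frequencies far from it.
    With this coefficient choice the gain at center_hz equals Q.
    """
    w0 = 2 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    b0, b1, b2 = alpha, 0.0, -alpha
    a0, a1, a2 = 1 + alpha, -2 * math.cos(w0), 1 - alpha
    # Normalize by a0, then run the direct-form-I difference equation.
    b0, b1, b2, a1, a2 = (c / a0 for c in (b0, b1, b2, a1, a2))
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out
```

Applied to speech, such a filter removes spectral cues outside the passband, which is the kind of cue restriction the experiment relies on.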
Parsa, Vijay; Scollie, Susan; Glista, Danielle; Seelisch, Andreas
Frequency lowering technologies offer an alternative amplification solution for severe to profound high frequency hearing losses. While frequency lowering technologies may improve audibility of high frequency sounds, the very nature of this processing can affect the perceived sound quality. This article reports the results from two studies that investigated the impact of a nonlinear frequency compression (NFC) algorithm on perceived sound quality. In the first study, the cutoff frequency and compression ratio parameters of the NFC algorithm were varied, and their effect on the speech quality was measured subjectively with 12 normal hearing adults, 12 normal hearing children, 13 hearing impaired adults, and 9 hearing impaired children. In the second study, 12 normal hearing and 8 hearing impaired adult listeners rated the quality of speech in quiet, speech in noise, and music after processing with a different set of NFC parameters. Results showed that the cutoff frequency parameter had more impact on sound quality ratings than the compression ratio, and that the hearing impaired adults were more tolerant to increased frequency compression than normal hearing adults. No statistically significant differences were found in the sound quality ratings of speech-in-noise and music stimuli processed through various NFC settings by hearing impaired listeners. These findings suggest that there may be an acceptable range of NFC settings for hearing impaired individuals where sound quality is not adversely affected. These results may assist an Audiologist in clinical NFC hearing aid fittings for achieving a balance between high frequency audibility and sound quality.
Preston, Jonathan L; McAllister Byun, Tara; Boyce, Suzanne E; Hamilton, Sarah; Tiede, Mark; Phillips, Emily; Rivera-Campos, Ahmed; Whalen, Douglas H
Diagnostic ultrasound imaging has been a common tool in medical practice for several decades. It provides a safe and effective method for imaging structures internal to the body. There has been a recent increase in the use of ultrasound technology to visualize the shape and movements of the tongue during speech, both in typical speakers and in clinical populations. Ultrasound imaging of speech has greatly expanded our understanding of how sounds articulated with the tongue (lingual sounds) are produced. Such information can be particularly valuable for speech-language pathologists. Among other advantages, ultrasound images can be used during speech therapy to provide (1) illustrative models of typical (i.e. "correct") tongue configurations for speech sounds, and (2) a source of insight into the articulatory nature of deviant productions. The images can also be used as an additional source of feedback for clinical populations learning to distinguish their better productions from their incorrect productions, en route to establishing more effective articulatory habits. Ultrasound feedback is increasingly used by scientists and clinicians as both the expertise of the users increases and as the expense of the equipment declines. In this tutorial, procedures are presented for collecting ultrasound images of the tongue in a clinical context. We illustrate these procedures in an extended example featuring one common error sound, American English /r/. Images of correct and distorted /r/ are used to demonstrate (1) how to interpret ultrasound images, (2) how to assess tongue shape during production of speech sounds, (3) how to categorize tongue shape errors, and (4) how to provide visual feedback to elicit a more appropriate and functional tongue shape. We present a sample protocol for using real-time ultrasound images of the tongue for visual feedback to remediate speech sound errors. Additionally, example data are shown to illustrate outcomes with the procedure.
Wertzner, Haydée Fiszbein; Francisco, Danira Tavares; Pagan-Neves, Luciana de Oliveira
To describe the tongue shape for the /s/ and /∫/ sounds in three different groups of children with and without speech sound disorder. The six participants were divided into three groups: Group 1--two typically developing children; Group 2--two children with speech sound disorder presenting other phonological processes but not those involving the production of /∫/; and Group 3--two children with speech sound disorder presenting phonological processes associated with the presence of the phonological process of palatal fronting (these two children produced /∫/ as /s/); all were aged between 5 and 8 years old and were speakers of Brazilian Portuguese. The data were the words /'∫avi/ (key) and /'sapu/ (frog). Tongue contour was individually traced for the five productions of each target word. The analysis of the tongue contour pointed to evidence that both /s/ and /∫/ were produced using distinct tongue contours in G1 and G2. The production of these two groups was more stable than that of G3. The tongue contours for /s/ and /∫/ from the children in G3 were similar, indicating that their production was undifferentiated. The use of ultrasound in speech analysis was effective in confirming the speech-language pathologist's perceptual analysis of the sounds.
Verdon, Sarah; McLeod, Sharynne; Wong, Sandie
The speech and language therapy profession is required to provide services to increasingly multilingual caseloads. Much international research has focused on the challenges of speech and language therapists' (SLTs) practice with multilingual children. To draw on the experience and knowledge of experts in the field to: (1) identify aspirations for practice, (2) propose recommendations for working effectively with multilingual children with speech sound disorders, and (3) reconceptualize understandings of and approaches to practice. Fourteen members of the International Expert Panel on Multilingual Children's Speech met in Cork, Ireland, to discuss SLTs' practice with multilingual children with speech sound disorders. Panel members had worked in 18 countries and spoke nine languages. Transcripts of the 6-h discussion were analysed using Cultural-Historical Activity Theory (CHAT) as a heuristic framework to make visible the reality and complexities of SLTs' practice with multilingual children. Aspirations and recommendations for reconceptualizing approaches to practice with multilingual children with speech sound disorders included: (1) increased training for working with multilingual children, their families, and interpreters, (2) increased training for transcribing speech in many languages, (3) increased time and resources for SLTs working with multilingual children and (4) use of the International Classification of Functioning, Disability and Health (ICF-CY). The reality and complexities of practice identified in this paper highlight that it is not possible to formulate and implement one 'gold standard' method of assessment and intervention for all multilingual children with speech sound disorders. It is possible, however, to underpin practice with a framework that ensures comprehensive assessment, accurate diagnosis and effective intervention. This paper proposes that by working towards the aspirations of the Expert Panel, SLTs can be empowered to facilitate
Clark, Chagit E; Conture, Edward G; Walden, Tedra A; Lambert, Warren E
The purpose of this study was to assess the association between speech sound articulation and childhood stuttering in a relatively large sample of preschool-age children who do and do not stutter, using the Goldman-Fristoe Test of Articulation-2 (GFTA-2; Goldman & Fristoe, 2000). Participants included 277 preschool-age children who do (CWS; n=128, 101 males) and do not stutter (CWNS; n=149, 76 males). Generalized estimating equations (GEE) were performed to assess between-group (CWS versus CWNS) differences on the GFTA-2. Additionally, within-group correlations were performed to explore the relation between CWS' speech sound articulation abilities and their stuttering frequency and severity, as well as their sound prolongation index (SPI; Schwartz & Conture, 1988). No significant differences were found between the articulation scores of preschool-age CWS and CWNS. However, there was a small gender effect for the 5-year-old age group, with girls generally exhibiting better articulation scores than boys. Additional findings indicated no relation between CWS' speech sound articulation abilities and their stuttering frequency, severity, or SPI. Findings suggest no apparent association between speech sound articulation-as measured by one standardized assessment (GFTA-2)-and childhood stuttering for this sample of preschool-age children (N=277). After reading this article, the reader will be able to: (1) discuss salient issues in the articulation literature relative to children who stutter; (2) compare/contrast the present study's methodologies and main findings to those of previous studies that investigated the association between childhood stuttering and speech sound articulation; (3) identify future research needs relative to the association between childhood stuttering and speech sound development; (4) replicate the present study's methodology to expand this body of knowledge. Copyright © 2013 Elsevier Inc. All rights reserved.
Centanni, Tracy Michelle; Booker, Anne B; Chen, Fuyi; Sloan, Andrew M; Carraway, Ryan S; Rennaker, Robert L; LoTurco, Joseph J; Kilgard, Michael P
Dyslexia is the most common developmental language disorder and is marked by deficits in reading and phonological awareness. One theory of dyslexia suggests that the phonological awareness deficit is due to abnormal auditory processing of speech sounds. Variants in DCDC2 and several other neural migration genes are associated with dyslexia and may contribute to auditory processing deficits. In the current study, we tested the hypothesis that RNAi suppression of Dcdc2 in rats causes abnormal cortical responses to sound and impaired speech sound discrimination. In the current study, rats were subjected in utero to RNA interference targeting of the gene Dcdc2 or a scrambled sequence. Primary auditory cortex (A1) responses were acquired from 11 rats (5 with Dcdc2 RNAi; DC-) before any behavioral training. A separate group of 8 rats (3 DC-) were trained on a variety of speech sound discrimination tasks, and auditory cortex responses were acquired following training. Dcdc2 RNAi nearly eliminated the ability of rats to identify specific speech sounds from a continuous train of speech sounds but did not impair performance during discrimination of isolated speech sounds. The neural responses to speech sounds in A1 were not degraded as a function of presentation rate before training. These results suggest that A1 is not directly involved in the impaired speech discrimination caused by Dcdc2 RNAi. This result contrasts earlier results using Kiaa0319 RNAi and suggests that different dyslexia genes may cause different deficits in the speech processing circuitry, which may explain differential responses to therapy. Although dyslexia is diagnosed through reading difficulty, there is a great deal of variation in the phenotypes of these individuals. The underlying neural and genetic mechanisms causing these differences are still widely debated. In the current study, we demonstrate that suppression of a candidate-dyslexia gene causes deficits on tasks of rapid stimulus processing
Alvarsson, Jesper J
An acoustic environment contains sounds from various sound sources, some generally perceived as wanted, others as unwanted. This thesis examines the effects of wanted and unwanted sounds in acoustic environments, with regard to masking, stress recovery, and speech intelligibility. In urban settings, masking of unwanted sounds by sounds from water structures has been suggested as a way to improve the acoustic environment. However, Study I showed that the unwanted (road traffic) sound was bette...
Macrae, Toby; Tyler, Ann A; Lewis, Kerry E
The authors of this study examined relationships between measures of word and speech error variability and between these and other speech and language measures in preschool children with speech sound disorder (SSD). In this correlational study, 18 preschool children with SSD, age-appropriate receptive vocabulary, and normal oral motor functioning and hearing were assessed across 2 sessions. Experimental measures included word and speech error variability, receptive vocabulary, nonword repetition (NWR), and expressive language. Pearson product–moment correlation coefficients were calculated among the experimental measures. The correlation between word and speech error variability was slight and nonsignificant. The correlation between word variability and receptive vocabulary was moderate and negative, although nonsignificant. High word variability was associated with small receptive vocabularies. The correlations between speech error variability and NWR and between speech error variability and the mean length of children's utterances were moderate and negative, although both were nonsignificant. High speech error variability was associated with poor NWR and language scores. High word variability may reflect unstable lexical representations, whereas high speech error variability may reflect indistinct phonological representations. Preschool children with SSD who show abnormally high levels of different types of speech variability may require slightly different approaches to intervention.
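The Pearson product-moment correlations reported above can be computed directly from paired measures. A minimal stdlib-only sketch follows; the variability and vocabulary values are invented for illustration (they are not the study's data) and simply mimic the negative word-variability/receptive-vocabulary relationship described:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient between two samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores: higher word variability paired with smaller
# receptive vocabulary, as in the moderate negative correlation above.
variability = [0.9, 0.7, 0.6, 0.4, 0.2]
vocabulary = [40, 55, 60, 75, 90]
r = pearson_r(variability, vocabulary)  # strongly negative for this sample
```

Note that with a sample of 18 children, even a moderate r like those reported can fail to reach significance, which is consistent with the nonsignificant correlations described.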
within the related fields of interest. Frontiers of Research in Speech and Music (FRSM) has been organized in different parts of India every year since 1991. Previous conferences were held at ITC-SRA Kolkata, NPL New Delhi, BHU Varanasi, IIT Kanpur, Lucknow University, AIISH Mysore, IITM Gwalior, Utkal University Bhubaneswar, Annamalai University, and IIDL Thiruvananthapuram to promote research activities covering many interdisciplinary research areas such as physics, mathematics, speech, musicology, electronics and computer science and their practical application. Through this symposium indigenous speech... and Application, Utkal University, together with LMA and INCM (CNRS, France) and ad:mt, Aalborg University Esbjerg (Denmark). The conference featured prominent keynote speakers working in the areas of music information retrieval and automatic speech recognition, and the program of CMMR 2011 included paper sessions...
Jahncke, Helena; Björkeholm, Patrik; Marsh, John E; Odelius, Johan; Sörqvist, Patrik
Background speech is one of the most disturbing noise sources at shared workplaces in terms of both annoyance and performance-related disruption. Therefore, it is important to identify techniques that can efficiently protect performance against distraction. It is also important that the techniques are perceived as satisfactory and are subjectively evaluated as effective in their capacity to reduce distraction. The aim of the current study was to compare three methods of attenuating distraction from background speech: masking a background voice with nature sound through headphones, masking a background voice with other voices through headphones and merely wearing headphones (without masking) as a way to attenuate the background sound. Quiet was deployed as a baseline condition. Thirty students participated in an experiment employing a repeated measures design. Performance (serial short-term memory) was impaired by background speech (1 voice), but this impairment was attenuated when the speech was masked - and in particular when it was masked by nature sound. Furthermore, perceived workload was lowest in the quiet condition and significantly higher in all other sound conditions. Notably, the headphones tested as a sound-attenuating device (i.e. without masking) did not protect against the effects of background speech on performance and subjective workload. Nature sound was the only masking condition that worked as a protector of performance, at least in the context of the serial recall task. However, despite the attenuation of distraction by nature sound, perceived workload was still high - suggesting that it is difficult to find a masker that is both effective and perceived as satisfactory.
Constantly bombarded with input, the brain needs to filter relevant information from the irrelevant rest. A powerful tool may be represented by neural oscillations, which entrain their high-excitability phase to important input while their low-excitability phase attenuates irrelevant information. Indeed, the alignment between brain oscillations and speech improves intelligibility and helps dissociate speakers during a cocktail party. Although well-investigated, the contribution of low- and high-level processes to phase entrainment to speech sound has only recently begun to be understood. Here, we review those findings and concentrate on three main results: (1) Phase entrainment to speech sound is modulated by attention or predictions, likely supported by top-down signals and indicating higher-level processes involved in the brain's adjustment to speech. (2) As phase entrainment to speech can be observed without systematic fluctuations in sound amplitude or spectral content, it does not only reflect a passive steady-state ringing of the cochlea, but entails a higher-level process. (3) The role of intelligibility for phase entrainment is debated. Recent results suggest that intelligibility modulates the behavioral consequences of entrainment, rather than directly affecting the strength of entrainment in auditory regions. We conclude that phase entrainment to speech reflects a sophisticated mechanism: several high-level processes interact to optimally align neural oscillations with predicted events of high relevance, even when they are hidden in a continuous stream of background noise.
Maryn, Youri; Roy, Nelson
Auditory-perceptual evaluation of dysphonia may be influenced by the type of speech/voice task used to render judgements during the clinical evaluation, i.e., sustained vowels versus continuous speech. This study explored (a) differences in listener dysphonia severity ratings on the basis of speech/voice tasks, (b) the influence of speech/voice task on dysphonia severity ratings of stimuli that combined sustained vowels and continuous speech, and (c) the differences in inter-rater reliability of dysphonia severity ratings between both speech tasks. Five experienced listeners rated overall dysphonia severity in sustained vowels, continuous speech and concatenated speech samples elicited by 39 subjects with various voice disorders and degrees of hoarseness. Data confirmed that sustained vowels are rated significantly more dysphonic than continuous speech. Furthermore, dysphonia severity in concatenated speech samples is least determined by the sustained vowel. Finally, no significant difference was found in inter-rater reliability between dysphonia severity ratings of sustained vowels versus continuous speech. Based upon the results, both types of speech/voice tasks (i.e., sustained vowel and continuous speech) should be elicited and judged by clinicians in the auditory-perceptual rating of dysphonia severity.
Afshar, Mohamad Reza; Ghorbani, Ali; Rashedi, Vahid; Jalilevand, Nahid; Kamali, Mohamad
The aim of this study was to compare working memory span in Persian-speaking preschool children with speech sound disorder (SSD) and their typically speaking peers. Additionally, the study aimed to examine Non-Word Repetition (NWR), Forward Digit Span (FDS) and Backward Digit Span (BDS) in four groups of children with varying severity levels of SSD. The participants in this study comprised 35 children with SSD and 35 typically developing (TD) children, matched for age and sex, as a control group. The participants were between 48 and 72 months of age. Two components of working memory, the phonological loop and the central executive, were compared between the two groups. We used two tasks (NWR and FDS) to assess the phonological loop component, and one task (BDS) to assess the central executive component. Percentage of correct consonants (PCC) was used to calculate the severity of SSD. Significant differences were observed between the two groups in all tasks that assess working memory (p < 0.05). Comparison of working memory between the various severity groups indicated significant differences between different severities in both the NWR and FDS tasks among the SSD children (p < 0.05), but not in the BDS task (p > 0.05). The results showed that PCC scores in TD children were associated with NWR (p < 0.05), and in SSD children were associated with NWR and FDS (p < 0.05). The working memory skills were weaker in SSD children, in comparison to TD children. In addition, children with varying levels of severity of SSD differed in terms of NWR and FDS, but not BDS. Copyright © 2017 Elsevier B.V. All rights reserved.
Weninger, Felix; Eyben, Florian; Schuller, Björn W; Mortillaro, Marcello; Scherer, Klaus R
Without doubt, there is emotional information in almost any kind of sound received by humans every day: be it the affective state of a person transmitted by means of speech; the emotion intended by a composer while writing a musical piece, or conveyed by a musician while performing it; or the affective state connected to an acoustic event occurring in the environment, in the soundtrack of a movie, or in a radio play. In the field of affective computing, there is currently some loosely connected research concerning either of these phenomena, but a holistic computational model of affect in sound is still lacking. In turn, for tomorrow's pervasive technical systems, including affective companions and robots, it is expected to be highly beneficial to understand the affective dimensions of "the sound that something makes," in order to evaluate the system's auditory environment and its own audio output. This article aims at a first step toward a holistic computational model: starting from standard acoustic feature extraction schemes in the domains of speech, music, and sound analysis, we interpret the worth of individual features across these three domains, considering four audio databases with observer annotations in the arousal and valence dimensions. In the results, we find that by selection of appropriate descriptors, cross-domain arousal and valence regression is feasible, achieving significant correlations with the observer annotations of up to 0.78 for arousal (training on sound and testing on enacted speech) and 0.60 for valence (training on enacted speech and testing on music). The high degree of cross-domain consistency in encoding the two main dimensions of affect may be attributable to the co-evolution of speech and music from multimodal affect bursts, including the integration of nature sounds for expressive effects.
Huber, Rainer; Parsa, Vijay; Scollie, Susan
The performance of objective speech and audio quality measures for the prediction of the perceived quality of frequency-compressed speech in hearing aids is investigated in this paper. A number of existing quality measures have been applied to speech signals processed by a hearing aid, which compresses speech spectra along frequency in order to make information contained in higher frequencies audible for listeners with severe high-frequency hearing loss. Quality measures were compared with subjective ratings obtained from normal hearing and hearing impaired children and adults in an earlier study. High correlations were achieved with quality measures computed by quality models that are based on the auditory model of Dau et al., namely, the measure PSM, computed by the quality model PEMO-Q; the measure qc, computed by the quality model proposed by Hansen and Kollmeier; and the linear subcomponent of the HASQI. For the prediction of quality ratings by hearing impaired listeners, extensions of some models incorporating hearing loss were implemented and shown to achieve improved prediction accuracy. Results indicate that these objective quality measures can potentially serve as tools for assisting in initial setting of frequency compression parameters.
This study investigates signals from sustained phonation and text-dependent speech modalities for Parkinson's disease screening. Phonation corresponds to the vowel /a/ voicing task and speech to the pronunciation of a short sentence in the Lithuanian language. Signals were recorded through two channels simultaneously, namely, acoustic cardioid (AC) and smartphone (SP) microphones. Additional modalities were obtained by splitting the speech recording into voiced and unvoiced parts. Information in each modality is summarized by 18 well-known audio feature sets. Random forest (RF) is used as a machine learning algorithm, both for individual feature sets and for decision-level fusion. Detection performance is measured by the out-of-bag equal error rate (EER) and the cost of the log-likelihood ratio. The Essentia audio feature set was the best using the AC speech modality and the YAAFE audio feature set was the best using the SP unvoiced modality, achieving EERs of 20.30% and 25.57%, respectively. Fusion of all feature sets and modalities resulted in an EER of 19.27% for the AC and 23.00% for the SP channel. Non-linear projection of an RF-based proximity matrix into the 2D space enriched medical decision support by visualization.
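The equal error rate used as the detection metric above is the operating point where the false-alarm rate on healthy controls equals the miss rate on patients. A stdlib-only sketch with invented classifier scores (not the study's data or feature sets) shows how an EER can be read off a set of scores:

```python
def equal_error_rate(patient_scores, control_scores):
    """Scan candidate thresholds and return the (approximate) equal
    error rate: the point where the false-positive rate on controls
    is closest to the false-negative rate on patients.
    Higher scores are taken to indicate the disease class."""
    thresholds = sorted(set(patient_scores) | set(control_scores))
    best = None
    for t in thresholds:
        far = sum(s >= t for s in control_scores) / len(control_scores)  # false alarms
        frr = sum(s < t for s in patient_scores) / len(patient_scores)   # misses
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2)
    return best[1]

# Hypothetical decision scores from some detector:
patients = [0.9, 0.8, 0.75, 0.6, 0.3]
controls = [0.7, 0.4, 0.35, 0.2, 0.1]
eer = equal_error_rate(patients, controls)  # 0.2, i.e. 20%
```

In the study itself the EER is estimated out-of-bag from the random forest rather than on a held-out split; the thresholding logic is the same.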
Yang, Wonyoung; Hodgson, Murray
Reinforcing speech levels and controlling noise and reverberation are the ultimate acoustical goals of lecture-room design to achieve high speech intelligibility. The effects of sound absorption on these factors have opposite consequences for speech intelligibility. Here, novel ceiling baffles and reflectors were evaluated as a sound-control measure, using computer and 1/8-scale models of a lecture room with hard surfaces and excessive reverberation. Parallel ceiling baffles running front to back were investigated. They were expected to absorb reverberation incident on the ceiling from many angles, while leaving speech signals, reflecting from the ceiling to the back of the room, unaffected. Various baffle spacings and absorptions, central and side speaker positions, and receiver positions throughout the room, were considered. Reflective baffles controlled reverberation, with a minimum decrease of sound levels. Absorptive baffles reduced reverberation, but reduced speech levels significantly. Ceiling reflectors, in the form of obstacles of semicircular cross section, suspended below the ceiling, were also tested. These were either 7 m long and in parallel, front-to-back lines, or 0.8 m long and randomly distributed, with flat side up or down, and reflective or absorptive top surfaces. The long reflectors with flat side down and no absorption were somewhat effective; the other configurations were not.
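The trade-off described above, where absorption reduces reverberation but also removes useful reflected speech energy, is commonly estimated with Sabine's formula, RT60 = 0.161 V / A, where V is room volume and A the total absorption in metric sabins. A sketch with invented room dimensions and absorption coefficients (not values from the study):

```python
def sabine_rt60(volume_m3, surfaces):
    """Sabine reverberation time: RT60 = 0.161 * V / A, where A is the
    sum over surfaces of area times absorption coefficient (sabins)."""
    absorption = sum(area * alpha for area, alpha in surfaces)
    return 0.161 * volume_m3 / absorption

# Hypothetical 500 m^3 lecture room: (area m^2, absorption coefficient)
hard_room = [(200, 0.02), (120, 0.05), (80, 0.03)]  # hard surfaces only
treated = [(200, 0.02), (120, 0.05), (80, 0.60)]    # absorptive ceiling area
rt_hard = sabine_rt60(500, hard_room)      # long, excessive reverberation
rt_treated = sabine_rt60(500, treated)     # much shorter after treatment
```

The formula shows why reflective baffles are attractive: they scatter and redirect sound without adding absorption A, so reverberation at the ceiling is controlled while overall speech level drops less than with absorptive treatment.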
Zhao, Yunjing; Ma, Hongwei; Wang, Yueping; Gao, Hong; Xi, Chunyan; Hua, Tainyi; Zhao, Yaru; Qiu, Guangrong
FOXP2 was described as the first gene relevant to human speech and language disorders. The main objective of this study was to compare the distribution of FOXP2 gene polymorphisms between patients with speech sound disorder and healthy controls. Five FOXP2 polymorphisms, rs923875, rs2396722, rs1852469, rs17137124 and rs1456031, were analyzed in 150 patients with speech sound disorder according to DSM-IV, as well as in 140 healthy controls. Coding exons for key domains of FOXP2 were also sequenced in all the patients. Significant differences in the genotype (P = 0.001) and allele (P = 0.0025) frequencies of rs1852469 (located 5' upstream of the ATG initiator codon) were found between patients and controls. The excess of the T allele in the patients group remained significant after Bonferroni correction (P = 0.0126). Further investigations revealed a risk haplotype: rs2396722T/+rs1852469T. Our screening of key domains did not detect any point mutations in this sample. However, we detected a heterozygous triplet deletion in the glutamine-encoding region of exon 5 that alters the FOXP2 protein sequence in five probands. These changes are predicted to yield a polyglutamine tract reduction from 40 to 39 consecutive glutamines. Our data support a possible role of FOXP2 in the vulnerability to speech sound disorder, which adds further evidence to implicate this gene in speech and language functions. © 2010 The Authors. Psychiatry and Clinical Neurosciences © 2010 Japanese Society of Psychiatry and Neurology.
Nozza, Robert J.
Binaural masked thresholds for a speech sound (/ba/) were estimated under two interaural phase conditions in three age groups (infants, preschoolers, adults). Differences as a function of both age and condition and effects of reducing intensity for adults were significant in indicating possible developmental binaural hearing changes, especially…
Ruscello, Dennis M.
Purpose: This article examines nonspeech oral motor treatments (NSOMTs) in the population of clients with developmental speech sound disorders. NSOMTs are a collection of nonspeech methods and procedures that claim to influence tongue, lip, and jaw resting postures; increase strength; improve muscle tone; facilitate range of motion; and develop…
Daniel, Graham R.; McLeod, Sharynne
Teachers play a major role in supporting children's educational, social, and emotional development although may be unprepared for supporting children with speech sound disorders. Interviews with 34 participants including six focus children, their parents, siblings, friends, teachers and other significant adults in their lives highlighted…
Powell, Thomas W.
Purpose: The use of nonspeech oral motor treatments (NSOMTs) in the management of pediatric speech sound production disorders is controversial. This article serves as a prologue to a clinical forum that examines this topic in depth. Method: Theoretical, historical, and ethical issues are reviewed to create a series of clinical questions that…
Benedek-Wood, Elizabeth; McNaughton, David; Light, Janice
This study used a multiple probe across participants' research design to evaluate the effects of instruction on the acquisition of letter-sound correspondences (LSCs) by three young children with autism spectrum disorder and limited speech. All three children (ages 3-5 years) reached criterion for identifying the LSCs targeted during instruction,…
Hickok, G.; Okada, K.; Barr, W.; Pa, J.; Rogalsky, C.; Donnelly, K.; Barde, L.; Grant, A.
Data from lesion studies suggest that the ability to perceive speech sounds, as measured by auditory comprehension tasks, is supported by temporal lobe systems in both the left and right hemisphere. For example, patients with left temporal lobe damage and auditory comprehension deficits (i.e., Wernicke's aphasics), nonetheless comprehend isolated…
Svec, JG; Titze, IR; Popolo, PS
How accurately can sound pressure levels (SPLs) of speech be estimated from skin vibration of the neck? Measurements using a small accelerometer were carried out in 27 subjects (10 males and 17 females) who read Rainbow and Marvin Williams passages in soft, comfortable, and loud voice, while skin
In alphabetic languages, learning to associate speech-sounds with unfamiliar characters is a critical step in becoming a proficient reader. This dissertation aimed at expanding our knowledge of this learning process and its relation to dyslexia, with an emphasis on bridging the gap between
Skebo, Crysten M.; Lewis, Barbara A.; Freebairn, Lisa A.; Tag, Jessica; Ciesla, Allison Avrich; Stein, Catherine M.
Purpose: The relationship between phonological awareness, overall language, vocabulary, and nonlinguistic cognitive skills to decoding and reading comprehension was examined for students at 3 stages of literacy development (i.e., early elementary school, middle school, and high school). Students with histories of speech sound disorders (SSD) with…
Castro, Márcia Mathias de; Wertzner, Haydée Fiszbein
To analyze the effectiveness of stimulability as a complementary task in the diagnosis of speech sound disorders (SSD), and to describe the performance of children with sounds absent from the phonetic inventory according to stimulable absent sounds, severity, gender, age, and the occurrence of different phonological processes. Participants were 130 male and female children aged between 5 years and 10 years and 10 months, divided into two groups: a Research Group (RG), comprising 55 children with SSD, and a Control Group (CG), composed of 75 children with no speech and language disorders. Based on participants' performance on the Phonology test, the severity of the disorder was calculated through the Percentage of Consonants Correct - Revised (PCC-R), and the phonetic inventory was verified. The stimulability test was applied to each sound absent from the phonetic inventory, based on the imitation of single words. The RG was subdivided into RG1 (27 children who presented absent sounds) and RG2 (28 children with a complete inventory). None of the CG children presented absent sounds in the phonetic inventory, while 49% of the RG children did. The absences involved most of the language's sounds. PCC-R means were lower for RG1, indicating greater severity. In RG1, 22 children were stimulable, while five were not stimulable for any absent sound. There was an association between the most frequently occurring phonological processes and the need for stimulability assessment, indicating that the difficulty in producing absent sounds reflects difficulty with phonological representation. Stimulability is influenced by age, but not by gender. The stimulability test is effective in identifying stimulable children among those who present sounds absent from their phonetic inventory. Children with SSD and absent sounds have lower PCC-R, and therefore present a more severe disorder. Most children with absent sounds are stimulable, but may not be stimulable for complex syllable structures.
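The severity metric used above, Percentage of Consonants Correct, is at its core a simple ratio over scored consonants. A stdlib-only sketch with a hypothetical transcription sample (the sounds and scoring scheme below are invented for illustration; PCC-R additionally scores distortions as correct, which this symbol-matching toy cannot distinguish):

```python
def pcc(target_consonants, produced_consonants):
    """Percentage of consonants correct: the share of target consonants
    the child produced accurately, times 100. None marks an omission;
    a different symbol marks a substitution. Both count as errors here."""
    correct = sum(
        produced is not None and produced == target
        for target, produced in zip(target_consonants, produced_consonants)
    )
    return 100.0 * correct / len(target_consonants)

# Hypothetical sample: 10 target consonants with 2 substitutions
# ('s' -> 't', 'r' -> 'w') and 1 omission (None).
targets = ['s', 'p', 'k', 't', 'r', 'm', 'n', 'd', 'l', 'b']
produced = ['t', 'p', 'k', 't', 'w', 'm', 'n', 'd', None, 'b']
score = pcc(targets, produced)  # 7 of 10 correct -> 70.0
```

Lower scores indicate more severe disorder, which is why the RG1 children, whose inventories were missing sounds, showed lower PCC-R means.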
Vick, Jennell C; Campbell, Thomas F; Shriberg, Lawrence D; Green, Jordan R; Truemper, Klaus; Rusiewicz, Heather Leavy; Moore, Christopher A
The purpose of the study was to determine whether distinct subgroups of preschool children with speech sound disorders (SSD) could be identified using a subgroup discovery algorithm (SUBgroup discovery via Alternate Random Processes, or SUBARP). Of specific interest was finding evidence of a subgroup of SSD exhibiting performance consistent with atypical speech motor control. Ninety-seven preschool children with SSD completed speech and nonspeech tasks. Fifty-three kinematic, acoustic, and behavioral measures from these tasks were input to SUBARP. Two distinct subgroups were identified from the larger sample. The 1st subgroup (76%; population prevalence estimate = 67.8%-84.8%) did not have characteristics that would suggest atypical speech motor control. The 2nd subgroup (10.3%; population prevalence estimate = 4.3%-16.5%) exhibited significantly higher variability in measures of articulatory kinematics and poor ability to imitate iambic lexical stress, suggesting atypical speech motor control. Both subgroups were consistent with classes of SSD in the Speech Disorders Classification System (SDCS; Shriberg et al., 2010a). Characteristics of children in the larger subgroup were consistent with the proportionally large SDCS class termed speech delay; characteristics of children in the smaller subgroup were consistent with the SDCS subtype termed motor speech disorder-not otherwise specified. The authors identified candidate measures to identify children in each of these groups.
Ygual-Fernandez, A; Cervera-Merida, J F
In the treatment of speech disorders by means of speech therapy two antagonistic methodological approaches are applied: non-verbal ones, based on oral motor exercises (OME), and verbal ones, which are based on speech processing tasks with syllables, phonemes and words. In Spain, OME programmes are called 'programas de praxias', and are widely used and valued by speech therapists. To review the studies conducted on the effectiveness of OME-based treatments applied to children with speech disorders and the theoretical arguments that could justify, or not, their usefulness. Over the last few decades evidence has been gathered about the lack of efficacy of this approach to treat developmental speech disorders and pronunciation problems in populations without any neurological alteration of motor functioning. The American Speech-Language-Hearing Association has advised against its use taking into account the principles of evidence-based practice. The knowledge gathered to date on motor control shows that the pattern of mobility and its corresponding organisation in the brain are different in speech and other non-verbal functions linked to nutrition and breathing. Neither the studies on their effectiveness nor the arguments based on motor control studies recommend the use of OME-based programmes for the treatment of pronunciation problems in children with developmental language disorders.
Aravamudhan, Radhika; Lotto, Andrew J; Hawks, John W
Williams [(1986). "Role of dynamic information in the perception of coarticulated vowels," Ph.D. thesis, University of Connecticut, Storrs, CT] demonstrated that nonspeech contexts had no influence on pitch judgments of nonspeech targets, whereas context effects were obtained when listeners were instructed to perceive the sounds as speech. On the other hand, Holt et al. [(2000). "Neighboring spectral content influences vowel identification," J. Acoust. Soc. Am. 108, 710-722] showed that nonspeech contexts were sufficient to elicit context effects in speech targets. The current study tested a hypothesis that could explain the varying effectiveness of nonspeech contexts: context effects are obtained only when there are well-established perceptual categories for the target stimuli. Experiment 1 examined context effects in speech and nonspeech signals using four series of stimuli: steady-state vowels that perceptually spanned from /ʊ/ to /ɪ/ in isolation and in the context of /w/ (with no steady-state portion), and two nonspeech sine-wave series that mimicked the acoustics of the speech series. In agreement with previous work, context effects were obtained for speech contexts and targets but not for nonspeech analogs. Experiment 2 tested predictions of the hypothesis by testing for nonspeech context effects after the listeners had been trained to categorize the sounds. Following training, context-dependent categorization was obtained for nonspeech stimuli in the training group. These results are presented within a general perceptual-cognitive framework for speech perception research.
, such as human interaction, perception and cognition, and much more. CMMR built its strength on its open and multidisciplinary approach to these fields and the interaction of researchers with expertise in the CMMR areas. As such, CMMR evolves with the researchers and their openness to new trends and directions... to researchers, academicians and industrialists to enhance their knowledge and to interact with each other to share their knowledge and experience in the latest developments in the fields. Participation in FRSM has always encouraged researchers to contribute toward achieving the objectives of the symposium...
Elmer, Stefan; Klein, Carina; Kühnis, Jürg; Liem, Franziskus; Meyer, Martin; Jäncke, Lutz
In this study, we used high-density EEG to evaluate whether speech and music expertise has an influence on the categorization of expertise-related and unrelated sounds. With this purpose in mind, we compared the categorization of speech, music, and neutral sounds between professional musicians, simultaneous interpreters (SIs), and controls in response to morphed speech-noise, music-noise, and speech-music continua. Our hypothesis was that music and language expertise would strengthen the memory representations of prototypical sounds, which act as a perceptual magnet for morphed variants. This means that the prototype would "attract" variants. This so-called magnet effect should be manifested by an increased assignment of morphed items to the trained category, by a reduced maximal slope of the psychometric function, as well as by differential event-related brain responses reflecting memory comparison processes (i.e., N400 and P600 responses). As a main result, we provide the first evidence for a domain-specific behavioral bias of musicians and SIs toward the trained categories, namely music and speech. In addition, SIs showed a bias toward musical items, indicating that interpreting training has a generic influence on the cognitive representation of spectrotemporal signals with similar acoustic properties to speech sounds. Notably, EEG measurements revealed clearly distinct N400 and P600 responses to both prototypical and ambiguous items between the three groups at anterior, central, and posterior scalp sites. These differential N400 and P600 responses represent synchronous activity occurring across widely distributed brain networks, and indicate a dynamical recruitment of memory processes that vary as a function of training and expertise.
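The "reduced maximal slope of the psychometric function" used above as a marker of the magnet effect can be made concrete. Assuming a logistic psychometric function along a morph continuum (the functional form, boundary location x0, and steepness k below are illustrative choices, not values from the study), the maximal slope occurs at the category boundary and equals k/4:

```python
import numpy as np

def logistic_psychometric(x, x0=0.5, k=10.0):
    """P(respond 'trained category') along a 0..1 morph continuum.
    x0 = category boundary, k = steepness; both values are illustrative."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

# Numerically estimate the maximal slope along the continuum.
x = np.linspace(0.0, 1.0, 10001)
p = logistic_psychometric(x)
max_slope = np.max(np.gradient(p, x))

# For a logistic function the analytic maximum slope is k/4, reached at x = x0,
# so a shallower maximal slope corresponds to a less sharp category boundary.
print(round(max_slope, 3))  # ≈ k/4 = 2.5 for k = 10
```

A magnet effect would show up here as a smaller fitted k for the expert group's trained continuum, i.e. a flatter transition between categories.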
McCandliss, Bruce D
Background: Neural systems show habituation responses at multiple levels, including relatively abstract language categories. Dishabituation (responses to non-habituated stimuli) can provide a window into the structure of these categories, without requiring an overt task. Methods: We used an event-related fMRI design with short-interval habituation trials, in which trains of stimuli were presented passively during 1.5-second intervals of relative silence between clustered scans. Trains of four identical stimuli (standard trials) and trains of three identical stimuli followed by a stimulus from a different phonetic category (deviant trials) were presented. This paradigm allowed us to measure and compare the time course of overall responses to speech, and responses to phonetic change. Results: Comparisons between responses to speech and silence revealed strong responses throughout the extent of superior temporal gyrus (STG) bilaterally. Comparisons between deviant and standard trials revealed dishabituation responses in a restricted region of left posterior STG, near the border with supramarginal gyrus (SMG). Novelty responses to deviant trials were also observed in right frontal regions and hippocampus. Conclusion: A passive dishabituation paradigm provides results similar to studies requiring overt responses. This paradigm can readily be extended for the study of pre-attentive processing of speech in populations such as children and second-language learners whose overt behavior is often difficult to interpret because of ancillary task demands.
Bellis, T J; Nicol, T; Kraus, N
Hemispheric asymmetries in the processing of elemental speech sounds appear to be critical for normal speech perception. This study investigated the effects of age on the hemispheric asymmetry observed in the neurophysiological responses to speech stimuli in three groups of normal-hearing, right-handed subjects: children (ages 8-11 years), young adults (ages 20-25 years), and older adults (ages > 55 years). Peak-to-peak response amplitudes of the auditory cortical P1-N1 complex obtained over right and left temporal lobes were examined to determine the degree of left/right asymmetry in the neurophysiological responses elicited by synthetic speech syllables in each of the three subject groups. In addition, mismatch negativity (MMN) responses, which are elicited by acoustic change, were obtained. Whereas children and young adults demonstrated larger P1-N1-evoked response amplitudes over the left temporal lobe than over the right, responses from elderly subjects were symmetrical. In contrast, MMN responses, which reflect an echoic memory process, were symmetrical in all subject groups. The differences observed in the neurophysiological responses were accompanied by a finding of significantly poorer ability to discriminate speech syllables involving rapid spectrotemporal changes in the older adult group. This study demonstrates a biological, age-related change in the neural representation of basic speech sounds and suggests one possible underlying mechanism for the speech perception difficulties exhibited by aging adults. Furthermore, results of this study support previous findings suggesting a dissociation between neural mechanisms underlying those processes that reflect the basic representation of sound structure and those that represent auditory echoic memory and stimulus change.
Lan, Yizhou; Li, Will X. Y.
Category formation in human perception is a vital part of cognitive ability. The disciplines of neuroscience and linguistics, however, seldom mention it in the marrying of the two. The present study reviews the neurological view of language acquisition as normalization of the incoming speech signal, and attempts to suggest how speech sound category formation may connect personality with second language speech perception. Through a questionnaire, ego boundary (thick or thin), a correlate found to be related to category formation, was shown to be a positive indicator of personality types. Following the qualitative study, thick-boundary and thin-boundary English learners native in Cantonese were given a speech-signal perception test using an ABX discrimination task protocol. Results showed that thick-boundary learners performed significantly lower in accuracy rate than thin-boundary learners. It was implied that differences in personality do have an impact on language learning. PMID:24757425
Hayiou-Thomas, Marianna E.; Carroll, Julia M.; Leavett, Ruth; Hulme, Charles; Snowling, Margaret J.
Background: This study considers the role of early speech difficulties in literacy development, in the context of additional risk factors. Method: Children were identified with speech sound disorder (SSD) at the age of 3½ years, on the basis of performance on the Diagnostic Evaluation of Articulation and Phonology. Their literacy skills were…
Rinne, T; Alho, K; Alku, P; Holi, M; Sinkkonen, J; Virtanen, J; Bertrand, O; Näätänen, R
Hemispheric specialization of human speech processing has been found in brain imaging studies using fMRI and PET. Due to the restricted time resolution, these methods cannot, however, determine the stage of auditory processing at which this specialization first emerges. We used a dense electrode array covering the whole scalp to record the mismatch negativity (MMN), an event-related brain potential (ERP) automatically elicited by occasional changes in sounds, which ranged from non-phonetic (tones) to phonetic (vowels). MMN can be used to probe auditory central processing on a millisecond scale with no attention-dependent task requirements. Our results indicate that speech processing occurs predominantly in the left hemisphere at the early, pre-attentive level of auditory analysis.
Background and Aim: Some children with speech sound disorder (SSD) have difficulty with phonological awareness skills; therefore, the purpose of this study was to survey the correlation between phonological processes and phonological awareness. Methods: Twenty-one children with speech sound disorder, aged between 5 and 6, participated in this cross-sectional study. They were recruited from speech therapy clinics at the Tehran University of Medical Sciences and selected using the convenience sampling method. Language, speech sound, and phonological awareness skills were investigated with the Test of Language Development-third edition (TOLD-3), the Persian diagnostic evaluation articulation and phonology test, and the phonological awareness test. Both Pearson's and Spearman's correlations were used to analyze the data. Results: There was a significant correlation between the atypical phonological processes and alliteration awareness (p=0.005), rhyme awareness (p=0.009), blending phonemes (p=0.006), identification of words with the same initial phoneme (p=0.007), and identification of words with the same final phoneme (p=0.007). Analyzing the correlation separately on the basis of phoneme and syllable structure showed a significant correlation between the atypical phoneme structure and alliteration awareness (p=0.001), rhyme awareness (p=0.008), blending phonemes (p=0.029), identification of words with the same initial phoneme (p=0.007), and identification of words with the same final phoneme (p=0.003). Conclusion: Results revealed a relationship between phonological processes and phonological awareness in children with speech sound disorder. Poor phonological awareness was associated with atypical phonological processes, especially at the phoneme level.
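As a minimal illustration of the two statistics named above, here is a toy Pearson/Spearman computation in pure Python. The data are invented for illustration (they are not the study's measurements), and the rank step does not handle ties:

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks (no tie handling)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = float(rank + 1)
        return r
    return pearson(ranks(x), ranks(y))

processes = [2.0, 4.0, 5.0, 7.0, 9.0]    # hypothetical atypical-process counts
awareness = [1.0, 3.0, 4.0, 8.0, 20.0]   # monotonically related but nonlinear
print(round(spearman(processes, awareness), 3))  # 1.0: the ranks agree perfectly
```

The contrast matters for data like these: Spearman is 1.0 because the relationship is perfectly monotonic, while Pearson is below 1.0 because it is not linear, which is presumably why the study reports both.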
Ruaa Osama Hariri
Children with Attention-Deficit/Hyperactivity Disorder (ADHD) often have co-existing learning disabilities and developmental weaknesses or delays in some areas, including speech (Rief, 2005). Seeing that phonological disorders include articulation errors and other forms of speech disorders, studies pertaining to children with ADHD symptoms who demonstrate signs of phonological disorders in their native Arabic language are lacking. The purpose of this study is to provide a description of Arabic language deficits and to present a theoretical model of potential associations between phonological language deficits and ADHD. Dodd and McCormack's (1995) four-subgroup classification of speech disorder and the phonological disorders pertaining to the Arabic language provided by a Saudi Institute for Speech and Hearing are examined within the theoretical framework. Since intervention may improve articulation and focuses a child's attention on the sound structure of words, findings in this study are based on the assumption that children with ADHD may acquire phonology for their Arabic language in the same way, and following the same developmental stages, as intelligible children. Both quantitative and qualitative analyses have shown that the ADHD group analyzed in this study had indeed failed to acquire most of their Arabic consonants as they should have. Keywords: speech sound disorder, attention-deficit/hyperactivity, developmental disorder, phonological disorder, language disorder/delay, language impairment
Lewis, Barbara A; Freebairn, Lisa; Tag, Jessica; Ciesla, Allison A; Iyengar, Sudha K; Stein, Catherine M; Taylor, H Gerry
In this study, the authors determined adolescent speech, language, and literacy outcomes of individuals with histories of early childhood speech sound disorders (SSD), with and without comorbid language impairment (LI), and examined factors associated with these outcomes. This study used a prospective longitudinal design. Participants with SSD (n = 170), enrolled in early childhood (4-6 years), were followed into adolescence (11-18 years) and compared to individuals with no histories of speech or language impairment (no SSD; n = 146) on measures of speech, language, and literacy. Comparisons were made between adolescents with early childhood histories of no SSD, SSD only, and SSD plus LI, as well as between adolescents with no SSD, resolved SSD, and persistent SSD. Individuals with early childhood SSD and comorbid LI had poorer outcomes than those with histories of SSD only or no SSD. Poorer language and literacy outcomes in adolescence were associated with multiple factors, including persistent speech sound problems, lower nonverbal intelligence, and lower socioeconomic status. Adolescents with persistent SSD had higher rates of comorbid LI and reading disability than the no-SSD and resolved-SSD groups. Risk factors for language and literacy problems in adolescence include an early history of LI, persistent SSD, lower nonverbal cognitive ability, and social disadvantage.
David, Marion; Lavandier, Mathieu; Grimault, Nicolas; Oxenham, Andrew J
Differences in fundamental frequency (F0) between voiced sounds are known to be a strong cue for stream segregation. However, speech consists of both voiced and unvoiced sounds, and less is known about whether and how the unvoiced portions are segregated. This study measured listeners' ability to integrate or segregate sequences of consonant-vowel tokens, comprising a voiceless fricative and a vowel, as a function of the F0 difference between interleaved sequences of tokens. A performance-based measure was used, in which listeners detected the presence of a repeated token either within one sequence or between the two sequences (measures of voluntary and obligatory streaming, respectively). The results showed a systematic increase of voluntary stream segregation as the F0 difference between the two interleaved sequences increased from 0 to 13 semitones, suggesting that F0 differences allowed listeners to segregate speech sounds, including the unvoiced portions. In contrast to the consistent effects of voluntary streaming, the trend towards obligatory stream segregation at large F0 differences failed to reach significance. Listeners were no longer able to perform the voluntary-streaming task reliably when the unvoiced portions were removed from the stimuli, suggesting that the unvoiced portions were used and correctly segregated in the original task. The results demonstrate that streaming based on F0 differences occurs for natural speech sounds, and that the unvoiced portions are correctly assigned to the corresponding voiced portions. Copyright © 2016 Elsevier B.V. All rights reserved.
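The F0 separations reported above (0 to 13 semitones) can be translated into frequency ratios with the standard equal-temperament relation; this is a generic acoustics identity, not code from the study:

```python
import math

def semitones_to_ratio(n):
    """Convert an F0 difference in semitones to a frequency ratio."""
    return 2.0 ** (n / 12.0)

def ratio_to_semitones(f1, f2):
    """F0 difference between two frequencies, in semitones."""
    return 12.0 * math.log2(f2 / f1)

# The largest separation used in the study, 13 semitones, is just over an octave:
print(round(semitones_to_ratio(13), 3))  # 2.119, i.e. the higher F0 is ~2.12x the lower
```

So a 13-semitone difference between the two interleaved sequences means their F0s differ by slightly more than a factor of two, a large separation by voice-pitch standards.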
Dillier, Norbert; Lai, Wai Kong
The Nucleus® 5 System Sound Processor (CP810, Cochlear™, Macquarie University, NSW, Australia) contains two omnidirectional microphones. They can be configured as a fixed directional microphone combination (called Zoom) or as an adaptive beamformer (called Beam), which adjusts the directivity continuously to maximally reduce the interfering noise. Initial evaluation studies with the CP810 had compared performance and usability of the new processor in comparison with the Freedom™ Sound Processor (Cochlear™) for speech in quiet and noise for a subset of the processing options. This study compares the two processing options suggested for use in noisy environments, Zoom and Beam, for various sound field conditions using a standardized speech-in-noise matrix test (Oldenburg sentences test). Nine German-speaking subjects who had previously been using the Freedom speech processor and subsequently were upgraded to the CP810 device participated in this series of additional evaluation tests. The speech reception threshold (SRT, for 50% speech intelligibility in noise) was determined using sentences presented via loudspeaker at 65 dB SPL in front of the listener and noise presented either via the same loudspeaker (S0N0) or at 90 degrees at either the ear with the sound processor (S0NCI+) or the opposite unaided ear (S0NCI-). The fourth noise condition consisted of three uncorrelated noise sources placed at 90, 180 and 270 degrees. The noise level was adjusted through an adaptive procedure to yield the signal-to-noise ratio at which 50% of the words in the sentences were correctly understood. In spatially separated speech and noise conditions, both Zoom and Beam improved the SRT significantly. For single noise sources, either ipsilateral or contralateral to the cochlear implant sound processor, average improvements with Beam of 12.9 and 7.9 dB in SRT were found. The average SRT of -8 dB for Beam in the diffuse noise condition (uncorrelated noise from both sides and
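The adaptive procedure described above (adjusting the noise level until 50% of the words are understood) can be sketched as a simple 1-up/1-down track. This is a generic illustration with an idealized listener model; the Oldenburg matrix test's actual adaptive rule and scoring differ in detail:

```python
import math
import random

def simulate_srt(true_srt=-8.0, trials=200, step=2.0, seed=1):
    """Generic 1-up/1-down adaptive track converging on the SNR that yields
    50% intelligibility (the SRT). Listener model and step rule are
    illustrative, not the actual matrix-test procedure."""
    rng = random.Random(seed)
    snr = 0.0  # start at 0 dB SNR
    track = []
    for _ in range(trials):
        # Idealized listener: steep logistic psychometric function around the true SRT.
        p_correct = 1.0 / (1.0 + math.exp(-3.0 * (snr - true_srt)))
        correct = rng.random() < p_correct
        snr += -step if correct else step  # harder after success, easier after failure
        track.append(snr)
    # Estimate the SRT as the mean SNR over the second half of the track.
    half = track[len(track) // 2:]
    return sum(half) / len(half)

print(round(simulate_srt(), 1))  # settles near the true SRT of -8 dB
```

A 1-up/1-down rule converges on the 50% point of the psychometric function by construction, which is why the abstract can equate the tracked SNR with "50% of the words correctly understood".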
Mcleod, Sharynne; Baker, Elise
A survey of 231 Australian speech-language pathologists (SLPs) was undertaken to describe practices regarding assessment, analysis, target selection, intervention, and service delivery for children with speech sound disorders (SSD). The participants typically worked in private practice, education, or community health settings, and 67.6% had a waiting list for services. For each child, most of the SLPs spent 10-40 min in pre-assessment activities, 30-60 min undertaking face-to-face assessments, and 30-60 min completing paperwork after assessments. During an assessment, SLPs typically conducted a parent interview, carried out single-word speech sampling, collected a connected speech sample, and used informal tests. They also determined children's stimulability and estimated intelligibility. With multilingual children, informal assessment procedures and English-only tests were commonly used, and SLPs relied on family members or interpreters to assist. Common analysis techniques included determination of phonological processes; substitutions, omissions, distortions, and additions (SODA); and phonetic inventory. Participants placed high priority on selecting target sounds that were stimulable, early developing, and in error across all word positions, and 60.3% felt very confident or confident selecting an appropriate intervention approach. Eight intervention approaches were frequently used: auditory discrimination, minimal pairs, cued articulation, phonological awareness, traditional articulation therapy, auditory bombardment, Nuffield Centre Dyspraxia Programme, and core vocabulary. Children typically received individual therapy with an SLP in a clinic setting. Parents often observed and participated in sessions, and SLPs typically included siblings and grandparents in intervention sessions. Parent training and home programs were more frequently used than group therapy. Two-thirds kept up to date by reading journal articles monthly or every 6 months. There were many similarities with
Ikeda, Yumiko; Yahata, Noriaki; Takahashi, Hidehiko; Koeda, Michihiko; Asai, Kunihiko; Okubo, Yoshiro; Suzuki, Hidenori
Comprehending conversation in a crowd requires appropriate orienting and sustainment of auditory attention to, and discrimination of, the target speaker. While a multitude of cognitive functions such as voice perception and language processing work in concert to subserve this ability, it is still unclear which cognitive components critically determine successful discrimination of speech sounds under constantly changing auditory conditions. To investigate this, we present a functional magnetic resonance imaging (fMRI) study of changes in cerebral activities associated with varying challenge levels of speech discrimination. Subjects participated in a diotic listening paradigm that presented them with two news stories read simultaneously but independently by a target speaker and a distracting speaker of incongruent or congruent sex. We found that the voice of a distracter of congruent rather than incongruent sex made listening more challenging, resulting in enhanced activities mainly in the left temporal and frontal gyri. Further, the activities at the left inferior, left anterior superior and right superior loci in the temporal gyrus were shown to be significantly correlated with accuracy of the discrimination performance. The present results suggest that the subregions of the bilateral temporal gyri play a key role in the successful discrimination of speech under constantly changing auditory conditions as encountered in daily life. Copyright © 2010 Elsevier Ireland Ltd and the Japan Neuroscience Society. All rights reserved.
The Computer Music Modeling and Retrieval (CMMR) 2011 conference was the 8th event of this international series, and the first that took place outside Europe. Since its beginnings in 2003, this conference has been co-organized by the Laboratoire de Mécanique et d'Acoustique in Marseille, France......, panel discussions, posters, and cultural events. We are pleased to announce that in light of the location in India there was a special focus on Indian speech and music. The melting pot of the FRSM and CMMR events gave rise to many interesting meetings with a focus on the field from different cultural...... a splendid opportunity to keep up to date on these issues. We would like to thank the Program Committee members for their valuable paper reports and thank all the participants who made CMMR 2011 an exciting and original event. In particular, we would like to acknowledge the organizers and participants...
Koyama, S; Gunji, A; Yabe, H; Yamada, R A; Oiwa, S; Kubo, R; Kakigi, R
The backward masking effect on non-native consonants by a following vowel was examined using neuromagnetic responses to synthesized speech sounds. Native speakers of Japanese were presented with sequences of frequent (85%) and infrequent (15%) speech sounds (/ra/ and /la/, respectively; Japanese has no /l/-/r/ contrast). The duration of the stimuli was 110 ms in a short session and 150 ms in a long session. In the short session, the stimuli were terminated in the course of the transition from the consonant to the vowel to diminish the masking effect from the vowel part. A distinct magnetic counterpart of mismatch negativity (MMNm) was observed for the short session, whereas a smaller MMNm was observed for the long session.
Clausen, Marit Carolin; Fox-Boyer, Anette
Children with speech sound disorders (SSD) are a heterogeneous group in terms of severity, underlying causes, speech characteristics and response to intervention. The correct identification and remediation of SSD is of particular importance since children with persisting SSD are placed at risk...... socially, academically and vocationally (McCormack et al., 2009). Thus, speech analysis should accurately detect whether a child has SSD or not, and should furthermore provide information about the type of disorder present. The classification into distinct subgroups of SSD should support clinicians...... disorders, Danish, classification References Clausen, M. C. (2014). LogoFoVa - Logopædisk Udredning af Fonologiske Vanskeligheder. Copenhagen: Dansk Psykologisk Forlag. Clausen, M.C. & Fox-Boyer, A.V. (in preparation). The Phonological Development in Danish-speaking children: a normative study. Dodd, B...
van Atteveldt, Nienke M; Formisano, Elia; Blomert, Leo; Goebel, Rainer
Temporal proximity is a critical determinant for cross-modal integration by multisensory neurons. Information content may serve as an additional binding factor for more complex or less natural multisensory information. Letters and speech sounds, which form the basis of literacy acquisition, are not naturally related but associated through explicit learning. We investigated the relative importance of temporal proximity and information content on the integration of letters and speech sounds by manipulating both factors within the same functional magnetic resonance imaging (fMRI) design. The results reveal significant interactions between temporal proximity and content congruency in anterior and posterior auditory association cortex, indicating that temporal synchrony is critical for the integration of letters and speech sounds. The temporal profiles for multisensory integration in the auditory association cortex resemble those demonstrated for single multisensory neurons in different brain structures and animal species. This similarity suggests that basic neural integration rules apply to the binding of multisensory information that is not naturally related but overlearned during literacy acquisition. Furthermore, the present study shows the suitability of fMRI to study temporal aspects of multisensory neural processing.
Nadia Vilela; Tatiane Faria Barrozo; Luciana de Oliveira Pagan-Neves; Seisse Gabriela Gandolfi Sanches; Haydée Fiszbein Wertzner; Renata Mota Mamede Carvallo
OBJECTIVE: To identify a cutoff value based on the Percentage of Consonants Correct-Revised index that could indicate the likelihood of a child with a speech-sound disorder also having a (central) auditory processing disorder. METHODS: Language, audiological and (central) auditory processing evaluations were administered. The participants were 27 subjects with speech-sound disorders aged 7 to 10 years and 11 months who were divided into two different groups according to their (central) audi...
Verhaert, Nicolas; Lazard, Diane S; Gnansia, Dan; Bébéar, Jean-Pierre; Romanet, Philippe; Meyer, Bernard; Péan, Vincent; Mollard, Dominique; Truy, Eric
In this prospective study the outcome of the Digisonic® SP Binaural cochlear implant (CI), a device enabling electric stimulation of both cochleae by a single receiver, was evaluated in 14 postlingually deafened adults after 12 months of use. Speech perception was tested using French disyllabic words in quiet and in speech-shaped noise at +10 dB signal-to-noise ratio. Horizontal sound localization in quiet was tested using pink noise coming from 5 loudspeakers, from -90 to +90° along the azimuth. Speech scores in quiet were 76% (±19.5 SD) in the bilateral condition, 62% (±24 SD) for the better ear alone and 43.5% (±27 SD) for the poorer ear alone. Speech scores in noise were 60% (±27.5 SD), 46% (±28 SD) and 28% (±25 SD), respectively, in the same conditions. Statistical analysis showed a significant advantage of bilateral use in quiet and in noise. Sound localization accuracy improved significantly when using the device in the bilateral condition, with an average root mean square of 35°. Compared with published outcomes of usual bilateral cochlear implantation, this device could be a valuable alternative to two CIs. Prospective controlled trials, comparing the Digisonic SP Binaural CI with standard bilateral cochlear implantation, are mandatory to evaluate their respective advantages and cost-effectiveness. Copyright © 2012 S. Karger AG, Basel.
Koyama, S; Gunji, A; Yabe, H; Oiwa, S; Akahane-Yamada, R; Kakigi, R; Näätänen, R
Evoked magnetic responses to speech sounds [R. Näätänen, A. Lehtokoski, M. Lennes, M. Cheour, M. Huotilainen, A. Iivonen, M. Vainio, P. Alku, R.J. Ilmoniemi, A. Luuk, J. Allik, J. Sinkkonen and K. Alho, Language-specific phoneme representations revealed by electric and magnetic brain responses. Nature, 385 (1997) 432-434.] were recorded from 13 Japanese subjects (right-handed). Infrequently presented vowels ([o]) among repetitive vowels ([e]) elicited the magnetic counterpart of mismatch negativity, MMNm (bilateral, nine subjects; left hemisphere alone, three subjects; right hemisphere alone, one subject). The estimated source of the MMNm was stronger in the left than in the right auditory cortex. The sources were located more posteriorly in the left than in the right auditory cortex. These findings are consistent with the results obtained in Finnish [R. Näätänen, A. Lehtokoski, M. Lennes, M. Cheour, M. Huotilainen, A. Iivonen, M. Vainio, P. Alku, R.J. Ilmoniemi, A. Luuk, J. Allik, J. Sinkkonen and K. Alho, Language-specific phoneme representations revealed by electric and magnetic brain responses. Nature, 385 (1997) 432-434.] [T. Rinne, K. Alho, P. Alku, M. Holi, J. Sinkkonen, J. Virtanen, O. Bertrand and R. Näätänen, Analysis of speech sounds is left-hemisphere predominant at 100-150 ms after sound onset. Neuroreport, 10 (1999) 1113-1117.] and English [K. Alho, J.F. Connolly, M. Cheour, A. Lehtokoski, M. Huotilainen, J. Virtanen, R. Aulanko and R.J. Ilmoniemi, Hemispheric lateralization in preattentive processing of speech sounds. Neurosci. Lett., 258 (1998) 9-12.] subjects. Instead of the P1m observed in Finnish [M. Tervaniemi, A. Kujala, K. Alho, J. Virtanen, R.J. Ilmoniemi and R. Näätänen, Functional specialization of the human auditory cortex in processing phonetic and musical sounds: A magnetoencephalographic (MEG) study. Neuroimage, 9 (1999) 330-336.] and English [K. Alho, J. F. Connolly, M. Cheour, A. Lehtokoski, M. Huotilainen, J. Virtanen, R. Aulanko
Skahan, Sarah M; Watson, Maggie; Lof, Gregory L
This study examined assessment procedures used by speech-language pathologists (SLPs) when assessing children suspected of having speech sound disorders (SSD). This national survey also determined the information participants obtained from clients' speech samples, evaluation of non-native English speakers, and time spent on assessment. One thousand surveys were mailed to a randomly selected group of SLPs, self-identified as having worked with children with SSD. A total of 333 (33%) surveys were returned. The assessment tasks most frequently used included administering a commercial test, estimating intelligibility, assessing stimulability, and conducting a hearing screening. The amount of time dedicated to assessment activities (e.g., administering formal tests, contacting parents) varied across participants and was significantly related to years of experience but not caseload size. Most participants reported using informal assessment procedures, or English-only standardized tests, when evaluating non-native English speakers. Most participants provided assessments that met federal guidelines to qualify children for special education services; however, additional assessment may be needed to create comprehensive treatment plans for their clients. These results provide a unique perspective on the assessment of children suspected of having SSD and should be helpful to SLPs as they examine their own assessment practices.
The present study attempts to investigate Indonesian EFL teachers' and native English speakers' perceptions of mispronunciations of English sounds by Indonesian EFL learners. For this purpose, a paper-form questionnaire consisting of 32 target mispronunciations was distributed to Indonesian secondary school teachers of English and also to native English speakers. An analysis of the respondents' perceptions has discovered that 14 out of the 32 target mispronunciations are pedagogically significant in pronunciation instruction. A further analysis of the reasons for these major mispronunciations has reconfirmed the prevalence of interference of learners' native language in their English pronunciation as a major cause of mispronunciations. It has also revealed Indonesian EFL teachers' tendency to overestimate the seriousness of their learners' pronunciations. Based on these findings, the study makes suggestions for better English pronunciation teaching in Indonesia or other EFL countries.
Aravena, Sebastián; Tijms, Jurgen; Snellings, Patrick; van der Molen, Maurits W
In this study, we examined the learning of letter-speech sound correspondences within an artificial script and performed an experimental analysis of letter-speech sound learning among dyslexic and normal readers vis-à-vis phonological awareness, rapid automatized naming, reading, and spelling. Participants were provided with 20 min of training aimed at learning eight new basic letter-speech sound correspondences, followed by a short assessment of mastery of the correspondences and word-reading ability in this unfamiliar script. Our results demonstrated that brief training is moderately successful in differentiating dyslexic readers from normal readers in their ability to learn letter-speech sound correspondences. The normal readers outperformed the dyslexic readers for accuracy and speed on a letter-speech sound matching task, as well as on a word-reading task containing familiar words written in the artificial orthography. Importantly, the new artificial script-related measures were related to phonological awareness and rapid automatized naming and made a unique contribution in predicting individual differences in reading and spelling ability. Our results are consistent with the view that a fundamental letter-speech sound learning deficit is a key factor in dyslexia.
This study tested the hypothesis that children with speech sound disorder (SSD) have generalized slowed motor speeds. It evaluated associations among oral and hand motor speeds and measures of speech (articulation and phonology) and language (receptive vocabulary, sentence comprehension, sentence imitation) in 11 children with moderate to severe SSD…
Waring, R.; Knight, R.
Background: Children with speech sound disorders (SSD) form a heterogeneous group who differ in terms of the severity of their condition, underlying cause, speech errors, involvement of other aspects of the linguistic system and treatment response. To date there is no universal and agreed-upon classification system. Instead, a number of…
Haro, Martín; Serrà, Joan; Herrera, Perfecto; Corral, Alvaro
Timbre is a key perceptual feature that allows discrimination between different sounds. Timbral sensations are highly dependent on the temporal evolution of the power spectrum of an audio signal. In order to quantitatively characterize such sensations, the shape of the power spectrum has to be encoded in a way that preserves certain physical and perceptual properties. Therefore, it is common practice to encode short-time power spectra using psychoacoustical frequency scales. In this paper, we study and characterize the statistical properties of such encodings, here called timbral code-words. In particular, we report on rank-frequency distributions of timbral code-words extracted from 740 hours of audio coming from disparate sources such as speech, music, and environmental sounds. Analogously to text corpora, we find a heavy-tailed Zipfian distribution with exponent close to one. Importantly, this distribution is found independently of different encoding decisions and regardless of the audio source. Further analysis of the intrinsic characteristics of the most and least frequent code-words reveals that the most frequent code-words tend to have a more homogeneous structure. We also find that the speech and music databases have specific, distinctive code-words while, in the case of the environmental sounds, such database-specific code-words are not present. Finally, we find that a Yule-Simon process with memory provides a reasonable quantitative approximation for our data, suggesting the existence of a common simple generative mechanism for all considered sound sources.
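The rank-frequency analysis reported in this abstract can be sketched in a few lines. The sketch below assumes the audio has already been tokenized into code-words upstream, and the least-squares log-log fit is an illustrative estimator of the Zipf exponent, not the authors' method.

```python
import math
from collections import Counter

def zipf_exponent(tokens):
    """Estimate the Zipf exponent of a token sequence.

    Ranks code-words by frequency and fits a straight line to the
    log-log rank-frequency curve by least squares. An exponent close
    to one indicates a Zipfian distribution, as reported for timbral
    code-words.
    """
    counts = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(r) for r in range(1, len(counts) + 1)]  # log rank
    ys = [math.log(c) for c in counts]                     # log frequency
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope  # Zipf's law: frequency proportional to rank**(-exponent)
```

On a synthetic corpus where token i occurs about 1000/i times, the estimate comes out close to one, matching the behaviour the paper describes for real audio.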
The purpose of this study was to determine the influence of the compression time constants in a multi-channel compression hearing aid on both subjectively assessed speech intelligibility and sound quality in realistic binaural acoustical situations for normal-hearing and hearing-impaired listeners. A nonlinear hearing aid with 15 independent compression channels of approximated critical bandwidth was simulated on a personal computer. Various everyday-life situations containing different sounds, such as speech and speech in noise, were recorded binaurally through original hearing aid microphones placed in BTE hearing aid cases. Two experiments were run with normal-hearing and hearing-impaired subjects. For each subject, hearing thresholds were established using in situ audiometry. The static I/O-curve parameters in all channels of the hearing aid were then adjusted so that normal speech received an insertion gain corresponding to the NAL-R formula (Byrne & Dillon, 1986). The compression ratio was kept constant at 2.1:1. In the first experiment, with six normal-hearing and six hearing-impaired subjects, the hearing aid was programmed to four different settings by changing only the compression time constants while all the parameters describing the static nonlinear I/O-curve were kept constant. The compression threshold was set to a very low value. In the second experiment, with seven normal-hearing and eight hearing-impaired subjects, the hearing aid was programmed to four settings by changing the release time constants and the compression threshold while all other remaining parameters were kept constant. Using a complete A/B pair-comparison procedure, subjects were presented binaurally with the amplified sounds and asked to subjectively assess their preference for each hearing aid setting with regard to speech intelligibility and sound quality. In Experiment 1, all subjects showed a significant preference for the longest release time (4 sec) over the two
Astikainen, Piia; Mällo, Tanel; Ruusuvirta, Timo; Näätänen, Risto
Human infants are able to detect changes in grammatical rules in a speech sound stream. Here, we tested whether rats have a comparable ability by using an electrophysiological measure that has been shown to reflect higher-order auditory cognition even before it becomes manifest at the behavioral level. Urethane-anesthetized rats were presented with a stream of sequences consisting of three pseudowords presented at a fast pace. Frequently presented “standard” sequences had 16 variants which all had the same structure. They were occasionally replaced by acoustically novel “deviant” sequences of two different types: structurally consistent and inconsistent sequences. Two stimulus conditions were presented to separate animal groups. In one stimulus condition, the standard and the pattern-obeying deviant sequences had an AAB structure, while the pattern-violating deviant sequences had an ABB structure. In the other stimulus condition, these assignments were reversed. During the stimulus presentation, local-field potentials were recorded from the dura, above the auditory cortex. Two temporally separate differential brain responses to the deviant sequences reflected the detection of the deviant speech sound sequences. The first response was elicited by both types of deviant sequences and most probably reflected their acoustical novelty. The second response was elicited specifically by the structurally inconsistent (pattern-violating) deviant sequences, suggesting that rats were able to detect changes in the pattern of a three-syllable speech sound sequence (i.e., the location of the reduplicated element in the sequence). Since all the deviant sound sequences were constructed of novel items, our findings indicate that, similarly to the human brain, the rat brain has the ability to automatically generalize extracted structural information to new items. PMID:25452712
Veugen, Lidwien C E; Chalupper, Josef; Mens, Lucas H M; Snik, Ad F M; van Opstal, A John
This study aimed to improve access to high-frequency interaural level differences (ILD) by applying extreme frequency compression (FC) in the hearing aid (HA) of 13 bimodal listeners, using a cochlear implant (CI) and conventional HA in opposite ears. An experimental signal-adaptive frequency-lowering algorithm was tested, compressing frequencies above 160 Hz into the individual audible range of residual hearing, but only for consonants (adaptive FC), thus protecting vowel formants, with the aim to preserve speech perception. In a cross-over design with at least 5 weeks of acclimatization between sessions, bimodal performance with and without adaptive FC was compared for horizontal sound localization, speech understanding in quiet and in noise, and vowel, consonant and voice-pitch perception. On average, adaptive FC did not significantly affect any of the test results. Yet, two subjects who were fitted with a relatively weak frequency compression ratio showed improved horizontal sound localization. After the study, four subjects preferred adaptive FC, four preferred standard frequency mapping, and four had no preference. Noteworthy, the subjects preferring adaptive FC were those with the best performance on all tasks, both with and without adaptive FC. On a group level, extreme adaptive FC did not change sound localization and speech understanding in bimodal listeners. Possible reasons are too-strong compression ratios, insufficient residual hearing, or that the adaptive switching, although preserving vowel perception, may have been ineffective in producing consistent ILD cues. Individual results suggested that two subjects were able to integrate the frequency-compressed HA input with that of the CI and benefitted from enhanced binaural cues for horizontal sound localization.
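The core frequency-lowering idea in this abstract, compressing components above a cutoff toward the audible range, can be illustrated with a minimal static mapping. The 160 Hz cutoff comes from the study; the 3.0 compression ratio is an illustrative assumption, and the study's algorithm was additionally signal-adaptive, engaging only during consonant segments.

```python
def compress_frequency(f_hz, cutoff_hz=160.0, ratio=3.0):
    """Map an input frequency to its frequency-compressed output.

    A minimal sketch of static frequency lowering: components at or
    below the cutoff pass through unchanged, while components above it
    are compressed toward the cutoff by a fixed ratio, so high-frequency
    energy lands within a listener's residual hearing range.
    """
    if f_hz <= cutoff_hz:
        return f_hz
    return cutoff_hz + (f_hz - cutoff_hz) / ratio
```

For example, with these illustrative settings, a 3160 Hz component maps to 1160 Hz, while anything at or below 160 Hz is left untouched, which is why vowel fundamentals survive while fricative energy is relocated.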
Van der Merwe, Anita; le Roux, Mia
The objective of this article is to create awareness amongst speech-language pathologists and audiologists in South Africa regarding the difference between the sound systems of Germanic languages and the sound systems of South African Bantu languages. A brief overview of the sound systems of two Bantu languages, namely isiZulu and Setswana, is provided. These two languages are representative of the Nguni language group and the Sotho group respectively. Consideration is given to the notion of language-specific symptoms of speech, language and hearing disorders in addition to universal symptoms. The possible impact of speech production, language and hearing disorders on the ability to produce and perceive speech in these languages, and the challenges that this holds for research and clinical practice, are pointed out.
Ott, Cyrill G M; Jäncke, Lutz
Musicians and musically untrained individuals have been shown to differ in a variety of functional brain processes such as auditory analysis and sensorimotor interaction. At the same time, internally operating forward models are assumed to enable the organism to discriminate the sensory outcomes of self-initiated actions from other sensory events by deriving predictions from efference copies of motor commands about forthcoming sensory consequences. As a consequence, sensory responses to stimuli that are triggered by a self-initiated motor act are suppressed relative to the same but externally initiated stimuli, a phenomenon referred to as motor-induced suppression (MIS) of sensory cortical feedback. Moreover, MIS in the auditory domain has been shown to be modulated by the predictability of certain properties such as frequency or stimulus onset. The present study compares auditory processing of predictable and unpredictable self-initiated 0-delay speech sounds and piano tones between musicians and musical laymen by means of an event-related potential (ERP) and topographic pattern analysis (TPA) [microstate analysis or evoked potential (EP) mapping] approach. As in previous research on the topic of MIS, the amplitudes of the auditory event-related potential (AEP) N1 component were significantly attenuated for predictable and unpredictable speech sounds in both experimental groups to a comparable extent. On the other hand, AEP N1 amplitudes were enhanced for unpredictable self-initiated piano tones in both experimental groups similarly and MIS did not develop for predictable self-initiated piano tones at all. The more refined EP mapping revealed that the microstate exhibiting a typical auditory N1-like topography was significantly shorter in musicians when speech sounds and piano tones were self-initiated and predictable. In contrast, non-musicians only exhibited shorter auditory N1-like microstate durations in response to self-initiated and predictable piano tones
Iliadou, Vasiliki Vivian; Chermak, Gail D; Bamiou, Doris-Eva
According to the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition, diagnosis of speech sound disorder (SSD) requires a determination that it is not the result of other congenital or acquired conditions, including hearing loss or neurological conditions that may present with similar symptomatology. To examine peripheral and central auditory function for the purpose of determining whether a peripheral or central auditory disorder was an underlying factor or contributed to the child's SSD. Central auditory processing disorder clinic pediatric case reports. Three clinical cases are reviewed of children with diagnosed SSD who were referred for audiological evaluation by their speech-language pathologists as a result of slower than expected progress in therapy. Audiological testing revealed auditory deficits involving peripheral auditory function or the central auditory nervous system. These cases demonstrate the importance of increasing awareness among professionals of the need to fully evaluate the auditory system to identify auditory deficits that could contribute to a patient's speech sound (phonological) disorder. Audiological assessment in cases of suspected SSD should not be limited to pure-tone audiometry given its limitations in revealing the full range of peripheral and central auditory deficits, deficits which can compromise treatment of SSD. American Academy of Audiology.
Intartaglia, Bastien; White-Schwoch, Travis; Kraus, Nina; Schön, Daniele
Growing evidence shows that music and language experience affect the neural processing of speech sounds throughout the auditory system. Recent work has mainly focused on the benefits induced by musical practice on the processing of native language or tonal foreign languages, which rely on pitch processing. The aim of the present study was to take this research a step further by investigating the effect of music training on the processing of English sounds by foreign listeners. We recorded subcortical electrophysiological responses to an English syllable in three groups of participants: native speakers, non-native nonmusicians, and non-native musicians. Native speakers had enhanced neural processing of the formant frequencies of speech compared to non-native nonmusicians, suggesting that automatic encoding of these relevant speech cues is sensitive to language experience. Most strikingly, in non-native musicians, neural responses to the formant frequencies did not differ from those of native speakers, suggesting that musical training may compensate for the lack of language experience by strengthening the neural encoding of important acoustic information. Language and music experience seem to induce a selective sensory gain along acoustic dimensions that are functionally relevant: here, formant frequencies that are crucial for phoneme discrimination.
Rødvik, Arne K
The aim of this pilot study was to identify the most common speech sound confusions of 5 Norwegian cochlear-implanted, post-lingually deafened adults. We played recorded nonwords, aCa, iCi and bVb, to our informants, asked them to repeat what they heard, recorded their repetitions and transcribed these phonetically. We arranged the collected data in confusion matrices to find the most common and most uncommon speech sound confusions. We found that the voiced and unvoiced consonants are seldom confused. We also found that there was a higher rate of consonant confusion for the iCi words than for the aCa words. The most frequent confusions were [eta] perceived as [n], [m] perceived as [n] and [upsilon] perceived as [n]. For the consonants, manner of articulation was rarely confused, but place of articulation was often confused. An exception to this was the confusion of [l] and [n], which differ only in manner of articulation. The latter is in accordance with reports we get from clinicians. We postulate that this is caused by the speech processing of the cochlear implant. We found less confusion of the vowels, which can be explained by the fact that vowels have much higher energy and longer duration than most of the consonants. The most frequent confusions were [a:] perceived as [see text] and [u:] perceived as [see text]. [e:], [i:] and [see text] were never confused with other vowels.
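The confusion-matrix tallying this abstract describes can be sketched directly: pair each intended phoneme with the phoneme the listener reported, count the pairs, and read off the most frequent off-diagonal entries. The phoneme labels below are illustrative placeholders, not the study's transcriptions.

```python
from collections import Counter

def confusion_matrix(intended, perceived):
    """Tally phoneme confusions from paired intended/perceived responses.

    Returns a Counter mapping (intended, perceived) pairs to counts,
    the structure used to find the most common speech sound confusions.
    """
    return Counter(zip(intended, perceived))

def most_confused(matrix, n=3):
    """Return the n most frequent off-diagonal entries, i.e. the
    actual confusions (correct identifications are excluded)."""
    errors = {pair: c for pair, c in matrix.items() if pair[0] != pair[1]}
    return sorted(errors.items(), key=lambda kv: -kv[1])[:n]
```

With real transcriptions in place of the toy labels, `most_confused` surfaces patterns like the [m]-perceived-as-[n] confusion reported above.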
Hayiou-Thomas, M.E.; Carroll, Julia M.; Leavett, Ruth; Hulme, Charles; Snowling, Margaret Jean
BACKGROUND: This study considers the role of early speech difficulties in literacy development, in the context of additional risk factors. METHOD: Children were identified with speech sound disorder (SSD) at the age of 3½ years, on the basis of performance on the Diagnostic Evaluation of Articulation and Phonology. Their literacy skills were assessed at the start of formal reading instruction (age 5½), using measures of phoneme awareness, word-level reading and spelling; and 3 years later (ag...
McCormack, Jane; Baker, Elise; Masso, Sarah; Crowe, Kathryn; McLeod, Sharynne; Wren, Yvonne; Roulstone, Sue
Implementation fidelity refers to the degree to which an intervention or programme adheres to its original design. This paper examines implementation fidelity in the Sound Start Study, a clustered randomised controlled trial of computer-assisted support for children with speech sound disorders (SSD). Sixty-three children with SSD in 19 early childhood centres received computer-assisted support (Phoneme Factory Sound Sorter [PFSS] - Australian version). Educators facilitated the delivery of PFSS targeting phonological error patterns identified by a speech-language pathologist. Implementation data were gathered via (1) the computer software, which recorded when and how much intervention was completed over 9 weeks; (2) educators' records of practice sessions; and (3) scoring of fidelity (intervention procedure, competence and quality of delivery) from videos of intervention sessions. Less than one-third of children received the prescribed number of days of intervention, while approximately one-half participated in the prescribed number of intervention plays. Computer data differed from educators' data for total number of days and plays in which children participated; the degree of match was lower as data became more specific. Fidelity to intervention procedures, competency and quality of delivery was high. Implementation fidelity may impact intervention outcomes and so needs to be measured in intervention research; however, the way in which it is measured may impact on data.
Earle, F Sayako; Landi, Nicole; Myers, Emily B
Sleep is important for memory consolidation and contributes to the formation of new perceptual categories. This study examined sleep as a source of variability in typical learners' ability to form new speech sound categories. We trained monolingual English speakers to identify a set of non-native speech sounds at 8 PM, and assessed their ability to identify and discriminate between these sounds immediately after training and at 8 AM the following day. We tracked sleep duration overnight and found that light sleep duration predicted gains in identification performance, while total sleep duration predicted gains in discrimination ability. Participants obtained an average of less than 6 h of sleep, pointing to the degree of sleep deprivation as a potential factor. Behavioral measures were associated with ERP indexes of neural sensitivity to the learned contrast. These results demonstrate that the relative success in forming new perceptual categories depends on the duration of post-training sleep.
Vogel, Adam P; Morgan, Angela T
The importance and utility of objective evidence-based measurement of the voice is well documented. Therefore, greater consideration needs to be given to the factors that influence the quality of voice and speech recordings. This manuscript aims to bring together the many features that affect acoustically acquired voice and speech. Specifically, the paper considers the practical requirements of individual speech acquisition configurations through examining issues relating to hardware, software and microphone selection, the impact of environmental noise, analogue to digital conversion and file format as well as the acoustic measures resulting from varying levels of signal integrity. The type of recording environment required by a user is often dictated by a variety of clinical and experimental needs, including: the acoustic measures being investigated; portability of equipment; an individual's budget; and the expertise of the user. As the quality of recorded signals is influenced by many factors, awareness of these issues is essential. This paper aims to highlight the importance of these methodological considerations to those previously uninitiated with voice and speech acoustics. With current technology, the highest quality recording would be made using a stand-alone hard disc recorder, an independent mixer to attenuate the incoming signal, and insulated wiring combined with a high quality microphone in an anechoic chamber or sound treated room.
Francisco, Danira Tavares; Wertzner, Haydée Fiszbein
This study describes the criteria that are used in ultrasound to measure the differences between the tongue contours that produce [s] and [ʃ] sounds in the speech of adults, typically developing children (TDC), and children with speech sound disorder (SSD) with the phonological process of palatal fronting. Overlapping images of the tongue contours that resulted from 35 subjects producing the [s] and [ʃ] sounds were analysed to select 11 spokes on the radial grid that were spread over the tongue contour. The difference between the mean contour of the [s] and [ʃ] sounds was calculated for each spoke. A cluster analysis produced groups with some consistency in the pattern of articulation across subjects; it differentiated adults from TDC to some extent and identified children with SSD with a high level of success. Children with SSD were less likely to show differentiation of the tongue contours between the articulation of [s] and [ʃ].
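The per-spoke measurement described above can be sketched as follows. The sketch assumes each tongue contour is represented as a list of radial distances sampled along the same spokes of the radial grid (11 in the study); it illustrates the described measurement, not the authors' implementation.

```python
def spoke_differences(contours_s, contours_sh):
    """Per-spoke difference between mean [s] and mean [sh] tongue contours.

    Each contour is a list of radial distances along a fixed set of
    spokes. The repetitions of each sound are averaged spoke by spoke,
    then the mean contours are subtracted; values near zero on all
    spokes suggest undifferentiated articulation of the two sounds.
    """
    n = len(contours_s[0])
    mean_s = [sum(c[i] for c in contours_s) / len(contours_s) for i in range(n)]
    mean_sh = [sum(c[i] for c in contours_sh) / len(contours_sh) for i in range(n)]
    return [ms - msh for ms, msh in zip(mean_s, mean_sh)]
```

Feeding these per-spoke differences into a clustering step, as the study does, would then group speakers by how strongly they separate the two articulations.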
Wren, Yvonne; Harding, Sam; Goldbart, Juliet; Roulstone, Sue
Multiple interventions have been developed to address speech sound disorder (SSD) in children. Many of these have been evaluated, but the evidence for them has not been considered within a model which categorizes types of intervention. The opportunity to carry out a systematic review of interventions for SSD arose as part of a larger-scale study of interventions for primary speech and language impairment in preschool children. To review systematically the evidence for interventions for SSD in preschool children and to categorize them within a classification of interventions for SSD. Relevant search terms were used to identify intervention studies published up to 2012, with the following inclusion criteria: participants were aged between 2 years and 5 years, 11 months; they exhibited speech, language and communication needs; and a primary outcome measure of speech was used. Studies that met the inclusion criteria were quality appraised using the single case experimental design (SCED) or PEDro-P, depending on their methodology. Those judged to be high quality were classified according to the primary focus of intervention. The final review included 26 studies. Case series was the most common research design. Categorization to the classification system for interventions showed that cognitive-linguistic and production approaches to intervention were the most frequently reported. The highest-graded evidence was for three studies within the auditory-perceptual and integrated categories. The evidence for intervention for preschool children with SSD is focused on seven out of 11 subcategories of interventions. Although all the studies included in the review were good quality as defined by quality appraisal checklists, they mostly represented lower-graded evidence. Higher-graded studies are needed to understand clearly the strength of evidence for different interventions.
Ibrahim, Iman; Parsa, Vijay; Macpherson, Ewan; Cheesman, Margaret
Wireless synchronization of the digital signal processing (DSP) features between two hearing aids in a bilateral hearing aid fitting is a fairly new technology. This technology is expected to preserve the differences in time and intensity between the two ears by co-ordinating the bilateral DSP features such as multichannel compression, noise reduction, and adaptive directionality. The purpose of this study was to evaluate the benefits of wireless communication as implemented in two commercially available hearing aids. More specifically, this study measured speech intelligibility and sound localization abilities of normal hearing and hearing impaired listeners using bilateral hearing aids with wireless synchronization of multichannel Wide Dynamic Range Compression (WDRC). Twenty subjects participated; 8 had normal hearing and 12 had bilaterally symmetrical sensorineural hearing loss. Each individual completed the Hearing in Noise Test (HINT) and a sound localization test with two types of stimuli. No specific benefit from wireless WDRC synchronization was observed for the HINT; however, hearing impaired listeners had better localization with the wireless synchronization. Binaural wireless technology in hearing aids may improve localization abilities although the possible effect appears to be small at the initial fitting. With adaptation, the hearing aids with synchronized signal processing may lead to an improvement in localization and speech intelligibility. Further research is required to demonstrate the effect of adaptation to the hearing aids with synchronized signal processing on different aspects of auditory performance.
Leifholz, M; Margolf-Hackl, S; Kreikemeier, S; Kiessling, J
The acceptance of hearing aids by users with high frequency hearing loss still represents a problem. Processing algorithms that shift high frequency signal components into an audible frequency range are proposed as a solution. We looked into the issue of whether frequency compression becomes more beneficial with increasing high frequency hearing loss and/or for users with cochlear dead regions (DR). A total of 20 hearing aid candidates were assessed audiometrically and classified into two test groups in terms of their hearing loss and the presence of DR. The subjects then evaluated four hearing aid settings that differed solely in the degree of frequency compression. Speech recognition threshold measurements and subjective sound quality ratings were carried out for all four settings. Data showed that 15 of the 20 test subjects understood fricatives with a high frequency spectrum component better, since they were able to distinguish between the two logatomes "Afa" and "Asa". No correlation was found between the beneficial effect of frequency compression and the degree of high frequency hearing loss or the presence of DR. Subjective sound quality ratings indicated no clear preference, but excessive frequency compression was generally deemed counterproductive. Frequency compression may be appropriate for hearing aid users with high frequency hearing loss and can improve speech recognition. The degree of frequency compression required to achieve maximal benefit varies from case to case and has to be optimized on an individual basis.
Unicomb, Rachael; Hewat, Sally; Spencer, Elizabeth; Harrison, Elisabeth
Speech sound disorders reportedly co-occur in young children who stutter at a substantial rate. Despite this, there is a paucity of scientific research available to support a treatment approach when these disorders co-exist. Similarly, little is known about how clinicians are currently working with this caseload given that best practice for the treatment of both disorders in isolation has evolved in recent years. This study used a qualitative approach to explore current clinical management and rationales when working with children who have co-occurring stuttering and speech sound disorder. Thirteen participant SLPs engaged in semi-structured telephone interviews. Interview data were analysed based on principles derived from grounded theory. Several themes were identified including multi-faceted assessment, workplace challenges, weighing-up the evidence, and direct intervention. The core theme, clinical reasoning, highlighted the participants' main concern, that not enough is known about this caseload on which to base decisions about intervention. There was consensus that little is available in the research literature to guide decisions relating to service delivery. These findings highlight the need for further research to provide evidence-based guidelines for clinical practice with this caseload.
Becker, Frank; Reinvang, Ivar
To investigate changes in brain activation related to tone and speech sound processing during aphasia rehabilitation. Longitudinal study investigating patients with stroke, subarachnoid hemorrhage and traumatic brain injury 3 and 7 months post-injury. Eight patients with aphasia, reflecting a wide range of auditory comprehension impairment. Token test and Norwegian Basic Aphasia Assessment were used to measure auditory comprehension function. Brain event-related potentials were recorded in passive paradigms with harmonically rich tones and syllables in order to obtain the mismatch negativity component that reflects automatic stimulus discrimination. In an active syllable discrimination paradigm, stimulus feature integration (N1), attended stimulus discrimination and classification (N2), and target detection (P3) were studied. Auditory comprehension scores improved approximately 10% during the observation period. Ipsilesional frontal P3- and N2-amplitude increased significantly. A significant shift in topographical distribution from the contralesional to the ipsilesional hemisphere was observed for the N2 component. The study of individual waveforms indicates inter-individual differences in reorganization after brain injury. Hemispherical distribution of brain activation correlating with speech sound processing in aphasia can change during the first months after brain injury. Event-related potentials are a potentially useful method for detecting individual activation patterns relevant to recovery in aphasia rehabilitation.
The purpose of this study was to explore developmental changes, in terms of spectral fluctuations and temporal periodicity, with Japanese- and English-learning infants. Three age groups (15, 20, and 24 months) were selected, because infants diversify phonetic inventories with age. Natural speech of the infants was recorded. We utilized a critical-band-filter bank, which simulated the frequency resolution in adults’ auditory periphery. First, the correlations between the critical-band outputs represented by factor analysis were observed in order to see how the critical bands should be connected to each other, if a listener is to differentiate sounds in infants’ speech. In the following analysis, we analyzed the temporal fluctuations of factor scores by calculating autocorrelations. The present analysis identified three factors observed in adult speech at 24 months of age in both linguistic environments. These three factors were shifted to a higher frequency range corresponding to the smaller vocal tract size of the infants. The results suggest that the vocal tract structures of the infants had developed to an adult-like configuration by 24 months of age in both language environments. The amount of utterances with a periodic nature of shorter duration increased with age in both environments. This trend was clearer in the Japanese environment.
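The periodicity analysis described above reduces to computing autocorrelations of slowly varying factor-score time series. A sketch under simplified assumptions (a synthetic 5 Hz signal stands in for a factor-score series, and an assumed frame rate is used; the critical-band filtering and factor-analysis steps are omitted):

```python
import numpy as np

# Normalized autocorrelation of a (factor-score) time series, used to detect
# temporal periodicity. The signal and frame rate are synthetic stand-ins.

def autocorr(x):
    """Normalized autocorrelation for non-negative lags."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    r = np.correlate(x, x, mode="full")[len(x) - 1:]
    return r / r[0]

fs = 100.0                            # assumed frame rate (Hz)
t = np.arange(0, 2.0, 1.0 / fs)
scores = np.sin(2 * np.pi * 5.0 * t)  # 5 Hz periodic fluctuation

r = autocorr(scores)
# Skip the lag-0 peak: find the first local minimum, then the peak after it.
first_min = int(np.argmax(np.diff(r) > 0))
peak_lag = first_min + int(np.argmax(r[first_min:int(fs)]))
print(peak_lag / fs)  # ≈ 0.2 s, i.e. a 5 Hz periodicity
```

The lag of the first autocorrelation peak gives the dominant period; shorter peak lags correspond to the shorter-period utterances the study reports increasing with age.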
McGrath, Lauren M; Pennington, Bruce F; Willcutt, Erik G; Boada, Richard; Shriberg, Lawrence D; Smith, Shelley D
Few studies have investigated the role of gene x environment interactions (G x E) in speech, language, and literacy disorders. Currently, there are two theoretical models, the diathesis-stress model and the bioecological model, that make opposite predictions about the expected direction of G x E, because environmental risk factors may either strengthen or weaken the effect of genes on phenotypes. The purpose of the current study was to test for G x E at two speech sound disorder and reading disability linkage peaks using a sib-pair linkage design and continuous measures of socioeconomic status, home language/literacy environment, and number of ear infections. The interactions were tested using composite speech, language, and preliteracy phenotypes and previously identified linkage peaks on 6p22 and 15q21. Results showed five G x E interactions at the 6p22 and 15q21 locations across several phenotypes and environmental measures. Four of the five interactions were consistent with the bioecological model of G x E. Each of these four interactions involved environmental measures of the home language/literacy environment. The only interaction that was consistent with the diathesis-stress model was one involving the number of ear infections as the environmental risk variable. The direction of these interactions and possible interpretations are explored in the discussion.
Cyrill Guy Martin Ott
Musicians and musically untrained individuals have been shown to differ in a variety of functional brain processes such as auditory analysis and sensorimotor interaction. At the same time, internally operating forward models are assumed to enable the organism to discriminate the sensory outcomes of self-initiated actions from other sensory events by deriving predictions from efference copies of motor commands about forthcoming sensory consequences. As a consequence, sensory responses to stimuli that are triggered by a self-initiated motor act are suppressed relative to the same but externally initiated stimuli, a phenomenon referred to as motor-induced suppression (MIS) of sensory cortical feedback. Moreover, MIS in the auditory domain has been shown to be modulated by the predictability of certain properties such as frequency or stimulus onset. The present study compares auditory processing of predictable and unpredictable self-initiated zero-delay speech sounds and piano tones between musicians and musical laymen by means of an event-related potential (ERP) and topographic pattern analysis (microstate analysis or EP mapping) approach. Taken together, our findings suggest that besides the known effect of MIS, internally operating forward models also facilitate early acoustic analysis of complex tones by means of faster processing time, as indicated by shorter auditory N1-like microstate durations in the first ~200 ms after stimulus onset. In addition, musicians seem to profit from this facilitation also during the analysis of speech sounds, as indicated by comparable auditory N1-like microstate duration patterns between speech and piano conditions. In contrast, non-musicians did not show such an effect.
Dietrich, Christiane; Swingley, Daniel; Werker, Janet F
One of the first steps infants take in learning their native language is to discover its set of speech-sound categories. This early development is shown when infants begin to lose the ability to differentiate some of the speech sounds their language does not use, while retaining or improving discrimination of language-relevant sounds. However, this aspect of early phonological tuning is not sufficient for language learning. Children must also discover which of the phonetic cues that are used in their language serve to signal lexical distinctions. Phonetic variation that is readily discriminable to all children may indicate two different words in one language but only one word in another. Here, we provide evidence that the language background of 1.5-year-olds affects their interpretation of phonetic variation in word learning, and we show that young children interpret salient phonetic variation in language-specific ways. Three experiments with a total of 104 children compared Dutch- and English-learning 18-month-olds' responses to novel words varying in vowel duration or vowel quality. Dutch learners interpreted vowel duration as lexically contrastive, but English learners did not, in keeping with properties of Dutch and English. Both groups performed equivalently when differentiating words varying in vowel quality. Thus, at one and a half years, children's phonological knowledge already guides their interpretation of salient phonetic variation. We argue that early phonological learning is not just a matter of maintaining the ability to distinguish language-relevant phonetic cues. Learning also requires phonological interpretation at appropriate levels of linguistic analysis.
Fraga González, G.; Žarić, G.; Tijms, J.; Bonte, M.; Blomert, L.; van der Molen, M.W.
A recent account of dyslexia assumes that a failure to develop automated letter-speech sound integration might be responsible for the observed lack of reading fluency. This study uses a pre-test-training-post-test design to evaluate the effects of a training program based on letter-speech sound
Carney, Laurel H.; Li, Tianhao; McDonough, Joyce M.
Current models for neural coding of vowels are typically based on linear descriptions of the auditory periphery, and fail at high sound levels and in background noise. These models rely on either auditory nerve discharge rates or phase locking to temporal fine structure. However, both discharge rates and phase locking saturate at moderate to high sound levels, and phase locking is degraded in the CNS at middle to high frequencies. The fact that speech intelligibility is robust over a...
Erath, Byron; Plesniak, Michael
Human speech is made possible by the air flow interaction with the vocal folds. During phonation, asymmetries in the glottal flow field may arise from flow phenomena (e.g. the Coanda effect) as well as from pathological vocal fold motion (e.g. unilateral paralysis). In this study, the effects of flow asymmetries on glottal sound sources were investigated. Dynamically-programmable 7.5 times life-size vocal fold models with 2 degrees-of-freedom (linear and rotational) were constructed to provide a first-order approximation of vocal fold motion. Important parameters (Reynolds, Strouhal, and Euler numbers) were scaled to physiological values. Normal and abnormal vocal fold motions were synthesized, and the velocity field and instantaneous transglottal pressure drop were measured. Variability in the glottal jet trajectory necessitated sorting of the data according to the resulting flow configuration. The dipole sound source is related to the transglottal pressure drop via acoustic analogies. Variations in the transglottal pressure drop (and subsequently the dipole sound source) arising from flow asymmetries are discussed.
Asp, Filip; Jakobsson, Anne-Marie; Berninger, Erik
Unilateral hearing loss (UHL) occurs in 25% of cases of congenital sensorineural hearing loss. Due to the unilaterally reduced audibility associated with UHL, everyday demanding listening situations may be disrupted despite normal hearing in one ear. The aim of this study was to quantify acute changes in recognition of speech in spatially separate competing speech and sound localization accuracy, and relate those changes to two levels of temporary induced UHL (UHL30 and UHL43; suffixes denote the average hearing threshold across 0.5, 1, 2, and 4 kHz) for 8 normal-hearing adults. A within-subject repeated-measures design was used (normal binaural conditions, UHL30 and UHL43). The main outcome measures were the threshold for 40% correct speech recognition and the overall variance in sound localization accuracy quantified by an Error Index (0 = perfect performance, 1.0 = random performance). Distinct and statistically significant deterioration in speech recognition occurred (2.0 dB increase in threshold, p < 0.05), while sound localization was additionally impaired (Error Index increase of 0.33, p < 0.01) with an associated large increase in individual variability. Qualitative analyses on a subject-by-subject basis showed that high-frequency audibility was important for speech recognition, while low-frequency audibility was important for horizontal sound localization accuracy. While the data might not be entirely applicable to individuals with long-standing UHL, the results suggest a need for intervention for mild-to-moderate UHL. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
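The abstract defines the Error Index only by its endpoints (0 = perfect, 1.0 = random). One plausible construction, shown purely as an illustration and not as the study's actual formula, normalizes the mean absolute localization error by the error expected from random responding over the loudspeaker array:

```python
# Hypothetical 0-to-1 localization Error Index for a horizontal loudspeaker
# array: mean absolute error between actual and perceived source angles,
# normalized by the error expected from uniformly random responses. This only
# illustrates the "0 = perfect, 1.0 = random" normalization idea.

def error_index(actual, perceived, speaker_angles):
    """actual/perceived: per-trial source and response angles in degrees."""
    mae = sum(abs(a - p) for a, p in zip(actual, perceived)) / len(actual)
    # Expected absolute error if responses were drawn uniformly from the array.
    chance = sum(abs(a - s) for a in actual for s in speaker_angles) / (
        len(actual) * len(speaker_angles))
    return mae / chance

angles = [-90, -60, -30, 0, 30, 60, 90]
print(error_index([0, 30, -30], [0, 30, -30], angles))   # 0.0 (perfect)
print(error_index([0, 30, -30], [30, 60, 0], angles))    # ≈ 0.55
```

Values above 1.0 would indicate responding worse than chance; an increase of 0.33, as reported, corresponds to a substantial loss of localization accuracy.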
Warlaumont, Anne S; Finnegan, Megan K
At around 7 months of age, human infants begin to reliably produce well-formed syllables containing both consonants and vowels, a behavior called canonical babbling. Over subsequent months, the frequency of canonical babbling continues to increase. How the infant's nervous system supports the acquisition of this ability is unknown. Here we present a computational model that combines a spiking neural network, reinforcement-modulated spike-timing-dependent plasticity, and a human-like vocal tract to simulate the acquisition of canonical babbling. Like human infants, the model's frequency of canonical babbling gradually increases. The model is rewarded when it produces a sound that is more auditorily salient than sounds it has previously produced. This is consistent with data from human infants indicating that contingent adult responses shape infant behavior and with data from deaf and tracheostomized infants indicating that hearing, including hearing one's own vocalizations, is critical for canonical babbling development. Reward receipt increases the level of dopamine in the neural network. The neural network contains a reservoir with recurrent connections and two motor neuron groups, one agonist and one antagonist, which control the masseter and orbicularis oris muscles, promoting or inhibiting mouth closure. The model learns to increase the number of salient, syllabic sounds it produces by adjusting the base level of muscle activation and increasing their range of activity. Our results support the possibility that through dopamine-modulated spike-timing-dependent plasticity, the motor cortex learns to harness its natural oscillations in activity in order to produce syllabic sounds. It thus suggests that learning to produce rhythmic mouth movements for speech production may be supported by general cortical learning mechanisms. The model makes several testable predictions and has implications for our understanding not only of how syllabic vocalizations develop in
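The dopamine-modulated spike-timing-dependent plasticity at the heart of the model can be summarized in a few lines: spike pairings write to a decaying eligibility trace on each synapse, and reward (dopamine, delivered when a salient syllabic sound is produced) converts that trace into a weight change. A minimal sketch with illustrative constants that are not the model's actual parameters:

```python
import math

# Reward-modulated STDP sketch: pairings accumulate in an eligibility trace;
# the weight changes only when dopamine is present. Constants are illustrative.

A_PLUS, A_MINUS = 1.0, -0.5   # STDP amplitudes (potentiation / depression)
TAU_STDP = 20.0               # ms, spike-pairing time constant
TAU_ELIG = 1000.0             # ms, eligibility-trace decay
LEARNING_RATE = 0.01

def stdp(dt_ms):
    """Pairing term: potentiate if pre precedes post (dt > 0), else depress."""
    if dt_ms > 0:
        return A_PLUS * math.exp(-dt_ms / TAU_STDP)
    return A_MINUS * math.exp(dt_ms / TAU_STDP)

class Synapse:
    def __init__(self, w=0.5):
        self.w = w
        self.elig = 0.0

    def on_spike_pair(self, dt_ms):
        self.elig += stdp(dt_ms)

    def step(self, dt_ms, dopamine):
        # Reward gates plasticity: no dopamine, no weight change.
        self.w += LEARNING_RATE * dopamine * self.elig
        self.elig *= math.exp(-dt_ms / TAU_ELIG)

syn = Synapse()
syn.on_spike_pair(+5.0)            # pre fired 5 ms before post
syn.step(dt_ms=1.0, dopamine=0.0)  # no reward yet: weight stays at 0.5
w_before = syn.w
syn.step(dt_ms=1.0, dopamine=1.0)  # salient sound produced -> reward arrives
print(syn.w > w_before)  # True
```

Because the trace decays slowly relative to the pairing window, a reward arriving some time after the causal spike pairing can still strengthen the synapses that contributed to the rewarded vocalization.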
Overby, Megan S; Masterson, Julie J; Preston, Jonathan L
This archival investigation examined the relationship between preliteracy speech sound production skill (SSPS) and spelling in Grade 3 using a dataset in which children's receptive vocabulary was generally within normal limits, speech therapy was not provided until Grade 2, and phonological awareness instruction was discouraged at the time data were collected. Participants (N = 250), selected from the Templin Archive (Templin, 2004), varied on prekindergarten SSPS. Participants' real word spellings in Grade 3 were evaluated using a metric of linguistic knowledge, the Computerized Spelling Sensitivity System (Masterson & Apel, 2013). Relationships between kindergarten speech error types and later spellings also were explored. Prekindergarten children in the lowest SSPS subgroup (7th percentile) scored poorest among articulatory subgroups on both individual spelling elements (phonetic elements, junctures, and affixes) and acceptable spelling (using relatively more omissions and illegal spelling patterns). Within the 7th percentile subgroup, there were no statistical spelling differences between those with mostly atypical speech sound errors and those with mostly typical speech sound errors. Findings were consistent with predictions from dual route models of spelling that SSPS is one of many variables associated with spelling skill and that children with impaired SSPS are at risk for spelling difficulty.
Vilela, Nadia; Barrozo, Tatiane Faria; Pagan-Neves, Luciana de Oliveira; Sanches, Seisse Gabriela Gandolfi; Wertzner, Haydée Fiszbein; Carvallo, Renata Mota Mamede
To identify a cutoff value based on the Percentage of Consonants Correct-Revised index that could indicate the likelihood of a child with a speech-sound disorder also having a (central) auditory processing disorder. Language, audiological and (central) auditory processing evaluations were administered. The participants were 27 subjects with speech-sound disorders aged 7 to 10 years and 11 months who were divided into two different groups according to their (central) auditory processing evaluation results. When a (central) auditory processing disorder was present in association with a speech disorder, the children tended to have lower scores on phonological assessments. A greater severity of speech disorder was related to a greater probability of the child having a (central) auditory processing disorder. The use of a cutoff value for the Percentage of Consonants Correct-Revised index successfully distinguished between children with and without a (central) auditory processing disorder. The severity of speech-sound disorder in children was influenced by the presence of (central) auditory processing disorder. The attempt to identify a cutoff value based on a severity index was successful.
OBJECTIVE: To identify a cutoff value based on the Percentage of Consonants Correct-Revised index that could indicate the likelihood of a child with a speech-sound disorder also having a (central) auditory processing disorder. METHODS: Language, audiological and (central) auditory processing evaluations were administered. The participants were 27 subjects with speech-sound disorders aged 7 to 10 years and 11 months who were divided into two different groups according to their (central) auditory processing evaluation results. RESULTS: When a (central) auditory processing disorder was present in association with a speech disorder, the children tended to have lower scores on phonological assessments. A greater severity of speech disorder was related to a greater probability of the child having a (central) auditory processing disorder. The use of a cutoff value for the Percentage of Consonants Correct-Revised index successfully distinguished between children with and without a (central) auditory processing disorder. CONCLUSIONS: The severity of speech-sound disorder in children was influenced by the presence of (central) auditory processing disorder. The attempt to identify a cutoff value based on a severity index was successful.
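For readers unfamiliar with the index, PCC-R scores distortions as correct and counts only omissions and substitutions as errors; a cutoff on this percentage then serves as the screening criterion. A sketch with a placeholder cutoff value, since the study's actual cutoff is not stated in the abstract:

```python
# Sketch of the Percentage of Consonants Correct-Revised (PCC-R) index and a
# cutoff-based screen. In PCC-R, distortions count as correct; only omissions
# and substitutions are errors. The cutoff below is a placeholder, not the
# value identified by the study.

def pcc_r(consonant_scores):
    """consonant_scores: per-consonant judgments, each one of
    'correct', 'distortion', 'substitution', or 'omission'."""
    correct = sum(1 for s in consonant_scores if s in ("correct", "distortion"))
    return 100.0 * correct / len(consonant_scores)

def flag_for_apd_evaluation(consonant_scores, cutoff=85.0):
    """Lower PCC-R (more severe SSD) -> refer for (central) auditory
    processing evaluation. cutoff is a hypothetical placeholder."""
    return pcc_r(consonant_scores) < cutoff

scores = ["correct"] * 80 + ["distortion"] * 10 + ["substitution"] * 10
print(pcc_r(scores))                    # 90.0
print(flag_for_apd_evaluation(scores))  # False
```

Because distortions are excluded from the error count, PCC-R tracks the substitution-and-omission errors most associated with severity, which is why a single cutoff on it can separate the two groups in the study.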
Masterson, Julie J.; Preston, Jonathan L.
Purpose This archival investigation examined the relationship between preliteracy speech sound production skill (SSPS) and spelling in Grade 3 using a dataset in which children's receptive vocabulary was generally within normal limits, speech therapy was not provided until Grade 2, and phonological awareness instruction was discouraged at the time data were collected. Method Participants (N = 250), selected from the Templin Archive (Templin, 2004), varied on prekindergarten SSPS. Participants' real word spellings in Grade 3 were evaluated using a metric of linguistic knowledge, the Computerized Spelling Sensitivity System (Masterson & Apel, 2013). Relationships between kindergarten speech error types and later spellings also were explored. Results Prekindergarten children in the lowest SSPS subgroup (7th percentile) scored poorest among articulatory subgroups on both individual spelling elements (phonetic elements, junctures, and affixes) and acceptable spelling (using relatively more omissions and illegal spelling patterns). Within the 7th percentile subgroup, there were no statistical spelling differences between those with mostly atypical speech sound errors and those with mostly typical speech sound errors. Conclusions Findings were consistent with predictions from dual route models of spelling that SSPS is one of many variables associated with spelling skill and that children with impaired SSPS are at risk for spelling difficulty. PMID:26380965
Peter, Beate; Raskind, Wendy H.
Purpose: To evaluate phenotypic expressions of speech sound disorder (SSD) in multigenerational families with evidence of familial forms of SSD. Method: Members of five multigenerational families (N = 36) produced rapid sequences of monosyllables and disyllables and tapped computer keys with repetitive and alternating movements. Results: Measures…
Lousada, M.; Jesus, Luis M. T.; Hall, A.; Joffe, V.
Background: The effectiveness of two treatment approaches (phonological therapy and articulation therapy) for treatment of 14 children, aged 4;0-6;7 years, with phonologically based speech-sound disorder (SSD) has been previously analysed with severity outcome measures (percentage of consonants correct score, percentage occurrence of phonological…
Wren, Yvonne; Miller, Laura L.; Peters, Tim J.; Emond, Alan; Roulstone, Sue
Purpose: The purpose of this study was to determine prevalence and predictors of persistent speech sound disorder (SSD) in children aged 8 years after disregarding children presenting solely with common clinical distortions (i.e., residual errors). Method: Data from the Avon Longitudinal Study of Parents and Children (Boyd et al., 2012) were used.…
Nixon, Jessie Sophia
Spoken communication involves transmission of a message which takes physical form in acoustic waves. Within any given language, acoustic cues pattern in language-specific ways along language-specific acoustic dimensions to create speech sound contrasts. These cues are utilized by listeners to
Goswami, Usha; Fosker, Tim; Huss, Martina; Mead, Natasha; Szucs, Denes
Across languages, children with developmental dyslexia have a specific difficulty with the neural representation of the sound structure (phonological structure) of speech. One likely cause of their difficulties with phonology is a perceptual difficulty in auditory temporal processing (Tallal, 1980). Tallal (1980) proposed that basic auditory…
Kovelman, Ioulia; Yip, Jonathan C; Beck, Erica L
Over the course of language acquisition, the brain becomes specialized in the perception of native language speech sounds or phonemes. As a result, adult speakers are highly efficient at processing their native language, but may struggle to perceive some non-native phonemes. This specialization is thought to arise from changes that occur in a person's brain as a result of maturation and language experience. In this study, adult native speakers of English were asked to discriminate between phonemes of varying degrees of difference from English (similar to English: Tagalog /na/-/ŋa/; different from English: Ndebele /k||i/-/k!i/), as their brain activity was measured using functional near infrared spectroscopy imaging. The left inferior frontal region showed activation only during the native condition; this finding is discussed in the context of developmental and adult neuroimaging work and suggests that the left inferior frontal region is critical for perceiving native phoneme contrasts during development and in adulthood.
Farquharson, Kelly; Hogan, Tiffany P; Bernthal, John E
The aim of this study was to explore the role of working memory processes as a possible cognitive underpinning of persistent speech sound disorders (SSD). Forty school-aged children were enrolled; 20 children with persistent SSD (P-SSD) and 20 typically developing children. Children participated in three working memory tasks, one to target each of the components in Baddeley's working memory model: phonological loop, visual spatial sketchpad and central executive. Children with P-SSD performed poorly only on the phonological loop tasks compared to their typically developing age-matched peers. However, mediation analyses revealed that the relation between working memory and P-SSD was reliant upon nonverbal intelligence. These results suggest that co-morbid low-average nonverbal intelligence is linked to poor working memory in children with P-SSD. Theoretical and clinical implications are discussed.
Lee, Alice; Gibbon, Fiona E; Kearney, Elaine; Murphy, Doris
There is evidence that complete tongue-palate contact across the palate during production of vowels can be observed in some children with speech disorders associated with cleft palate in the English-speaking and Japanese-speaking populations. Although it has been shown that this is not a feature of typical vowel articulation in English-speaking adults, tongue-palate contact during vowel production in typical children and English-speaking children with speech sound disorders (SSD) have not been reported in detail. Therefore, this study sought to determine whether complete tongue-palate contact occurs during production of five selected vowels in 10 children with SSD and eight typically-developing children. The results showed that none of the typical children had complete contact across the palate during any of the vowels. However, of the 119 vowels produced by the children with SSD, 24% showed complete contact across the palate during at least a portion of the vowel segment. The results from the typically-developing children suggest that complete tongue-palate contact is an atypical articulatory feature. However, the evidence suggests that this pattern occurs relatively frequently in children with SSD. Further research is needed to determine the prevalence, cause, and perceptual consequence of complete tongue-palate contact.
Brumbaugh, Klaire Mann; Smit, Ann Bosma
In a national survey, speech-language pathologists (SLPs) were asked about service delivery and interventions they use with children ages 3-6 who have speech sound disorder (SSD). The survey was e-mailed to 2,084 SLPs who worked in pre-elementary settings across the United States. Of these, 24% completed part or all of the survey, with 18% completing the entire survey. SLPs reported that they provided children ages 3-6 who had SSD with 30 or 60 min of treatment time weekly, regardless of group or individual setting. More SLPs indicated that they used traditional intervention than other types of intervention. However, many SLPs also reported using aspects of phonological interventions and providing phonological awareness training. Fewer SLPs indicated that they used nonspeech oral motor exercises than in a previous survey (Lof & Watson, 2008). Recently graduated SLPs were no more familiar with recent advances in phonological intervention than were their more experienced colleagues. This study confirms previous findings (Mullen & Schooling, 2010) about the amount of service provided to children ages 3-6 who have SSD. Issues related to the use of traditional and phonological intervention with children who have phonological disorder are discussed, along with concerns related to evidence-based practice and research needs.
Stein, Catherine M; Millard, Christopher; Kluge, Amy; Miscimarra, Lara E; Cartier, Kevin C; Freebairn, Lisa A; Hansen, Amy J; Shriberg, Lawrence D; Taylor, H Gerry; Lewis, Barbara A; Iyengar, Sudha K
Despite a growing body of evidence indicating that speech sound disorder (SSD) has an underlying genetic etiology, researchers have not yet identified specific genes predisposing to this condition. The speech and language deficits associated with SSD are shared with several other disorders, including dyslexia, autism, Prader-Willi Syndrome (PWS), and Angelman's Syndrome (AS), raising the possibility of gene sharing. Furthermore, we previously demonstrated that dyslexia and SSD share genetic susceptibility loci. The present study assesses the hypothesis that SSD also shares susceptibility loci with autism and PWS. To test this hypothesis, we examined linkage between SSD phenotypes and microsatellite markers on the chromosome 15q14-21 region, which has been associated with autism, PWS/AS, and dyslexia. Using SSD as the phenotype, we replicated linkage to the 15q14 region (P=0.004). Further modeling revealed that this locus influenced oral-motor function, articulation and phonological memory, and that linkage at D15S118 was potentially influenced by a parent-of-origin effect (LOD score increase from 0.97 to 2.17, P=0.0633). These results suggest shared genetic determinants in this chromosomal region for SSD, autism, and PWS/AS.
Munson, Benjamin; Krause, Miriam O.P.
Background: Psycholinguistic models of language production provide a framework for determining the locus of the language breakdown that leads to speech sound disorder (SSD) in children. Aims: This experiment examined whether children with SSD differ from their age-matched peers with typical speech and language development (TD) in the ability to phonologically encode lexical items that have been accessed from memory. Methods & Procedures: Thirty-six children (18 with TD, 18 with SSD) viewed pictures while listening to interfering words (IWs) or a nonlinguistic auditory stimulus presented over headphones either 150 ms prior to, concurrent with, or 150 ms after picture presentation. The phonological similarity of the IW and the pictures’ names varied. Picture-naming latency, accuracy, and duration were tallied. Outcomes & Results: All children named pictures more quickly in the presence of an IW identical to the picture’s name than in the other conditions. At the +150 ms stimulus onset asynchrony, pictures were named more quickly when the IW shared phonemes with the picture’s name than when it was phonologically unrelated to the picture’s name. The size of this effect was similar for children with SSD and children with TD. Variation across children in the magnitude of inhibition and facilitation on cross-modal priming tasks was more strongly affected by the size of the expressive and receptive lexicons than by speech-production accuracy. Conclusions & Implications: Results suggest that SSD is not associated with reduced phonological encoding ability, at least as reflected by cross-modal naming tasks. PMID:27432488
Sugden, Eleanor; Baker, Elise; Munro, Natalie; Williams, A Lynn
Internationally, speech and language therapists (SLTs) are involving parents and providing home tasks in intervention for phonology-based speech sound disorder (SSD). To ensure that SLTs' involvement of parents is guided by empirical research, a review of peer-reviewed published evidence is needed. To provide SLTs and researchers with a comprehensive appraisal and analysis of peer-reviewed published intervention research reporting parent involvement and the provision of home tasks in intervention studies for children with phonology-based SSD. A systematic search and review was conducted. Academic databases were searched for peer-reviewed research papers published between 1979 and 2013 reporting on phonological intervention for SSD. Of the 176 papers that met the criteria, 61 were identified that reported on the involvement of parents and/or home tasks within the intervention. These papers were analysed using a quality appraisal tool. Details regarding the involvement of parents and home tasks were extracted and analysed to provide a summary of these practices within the evidence base. Parents have been involved in intervention research for phonology-based SSD. However, most of the peer-reviewed published papers reporting this research have provided limited details regarding what this involved. This paucity of information presents challenges for SLTs wishing to integrate external evidence into their clinical services and clinical decision-making. It also raises issues regarding treatment fidelity for researchers wishing to replicate published intervention research. The range of tasks in which parents were involved, and the limited details reported in the literature, present challenges for SLTs wanting to involve parents in intervention. Further high-quality research reporting more detail regarding the involvement of parents and home tasks in intervention for SSD is needed. © 2016 Royal College of Speech and Language Therapists.
Bram Van Dun
Cortical auditory evoked potentials (CAEPs) are an emerging tool for hearing aid fitting evaluation in young children who cannot provide reliable behavioral feedback. It is therefore useful to determine the relationship between the sensation level of speech sounds and the detection sensitivity of CAEPs, which is the ratio between the number of detections and the sum of detections and non-detections. Twenty-five sensorineurally hearing-impaired infants with an age range of 8 to 30 months were tested once, 18 aided and 7 unaided. First, behavioral thresholds of the speech stimuli /m/, /g/, and /t/ were determined using visual reinforcement orientation audiometry. Afterwards, the same speech stimuli were presented at 55, 65, and 75 dB sound pressure level, and CAEPs were recorded. An automatic statistical detection paradigm was used for CAEP detection. For sensation levels above 0, 10, and 20 dB respectively, detection sensitivities were equal to 72±10, 75±10, and 78±12%. In 79% of the cases, automatic detection P-values became smaller when the sensation level was increased by 10 dB. The results of this study suggest that the presence or absence of CAEPs can provide some indication of the audibility of a speech sound for infants with sensorineural hearing loss. The detection of a CAEP might provide confidence, to a degree commensurate with the detection probability, that the infant is detecting that sound at the level presented. When testing infants where the audibility of speech sounds has not been established behaviorally, the lack of a cortical response indicates the possibility, but by no means a certainty, that the sensation level is 10 dB or less.
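The detection-sensitivity measure defined in this abstract is simply detections over all recordings (detections plus non-detections). A minimal sketch of the computation; the counts are hypothetical and chosen only for illustration, not taken from the study:

```python
def detection_sensitivity(detections, non_detections):
    """Ratio of detected CAEPs to all recordings (detections + non-detections)."""
    total = detections + non_detections
    if total == 0:
        raise ValueError("no recordings")
    return detections / total

# Hypothetical counts: 18 of 25 recordings yield a statistically detected CAEP.
print(detection_sensitivity(18, 7))  # 0.72, i.e. 72%
```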
Nora Maria Raschle
Studies in sleeping newborns and infants propose that the superior temporal sulcus is involved in speech processing soon after birth. Speech processing also implicitly requires the analysis of the human voice, which conveys both linguistic and extra-linguistic information. However, due to the technical and practical challenges of neuroimaging young children, evidence of neural correlates of speech and/or voice processing in toddlers and young children remains scarce. In the current study, we used functional magnetic resonance imaging (fMRI) in 20 typically developing preschool children (average age = 5.8 y; range 5.2-6.8 y) to investigate brain activation during judgments about vocal identity versus the initial speech sound of spoken object words. fMRI results reveal common brain regions responsible for voice-specific and speech-sound-specific processing of spoken object words, including bilateral primary and secondary language areas of the brain. Contrasting voice-specific with speech-sound-specific processing predominantly activates the anterior part of the right-hemispheric superior temporal sulcus. Furthermore, the right STS is functionally correlated with left-hemispheric temporal and right-hemispheric prefrontal regions. This finding underlines the importance of the right superior temporal sulcus as a temporal voice area and indicates that this brain region is specialized, and functions similarly to adults', by the age of five. We thus extend previous knowledge of voice-specific regions and their functional connections to the young brain, which may further our understanding of the neuronal mechanisms of speech-specific processing in children with developmental disorders, such as autism or specific language impairment.
Peter, Beate; Larkin, Tara; Stoel-Gammon, Carol
Perceptual similarities of musical tones separated by octave intervals are known as octave equivalence (OE). Peter et al. [(2008). Proceedings of the Fourth Conference on Speech Prosody, edited S. Maduerira, C. Reis, and P. Barbosa, Luso-Brazilian Association of Speech Sciences, Campinas, pp. 731-734] found evidence of octave-shifted pitch matching (OSPM) in children during verbal imitation tasks, implying OE in speech tokens. This study evaluated the role of lexical stress and speech sound disorder (SSD) in OSPM. Eleven children with SSD and 11 controls imitated low-pitched nonwords. Stimulus/response f(0) ratios were computed. OSPM was expressed preferentially in stressed vowels. SSD was associated with reduced expression of OSPM in unstressed vowels only. Results are consistent with the psycholinguistic prominence of lexical stress and prosodic deficits in SSD.
This study examined whether rapid temporal auditory processing, verbal working memory capacity, non-verbal intelligence, executive functioning, musical ability and prior foreign language experience predicted how well native English speakers (N=120) discriminated Norwegian tonal and vowel contrasts, as well as a non-speech analogue of the tonal contrast and a native vowel contrast presented over noise. Results confirmed a male advantage for temporal and tonal processing, and also revealed that temporal processing was associated with both non-verbal intelligence and speech processing. In contrast, effects of musical ability on non-native speech-sound processing and of inhibitory control on vowel discrimination were not mediated by temporal processing. These results suggest that individual differences in non-native speech-sound processing are to some extent determined by temporal auditory processing ability, in which males perform better, but are also determined by a host of other abilities that are deployed flexibly depending on the characteristics of the target sounds.
Kempe, Vera; Bublitz, Dennis; Brooks, Patricia J
Is the observed link between musical ability and non-native speech-sound processing due to enhanced sensitivity to acoustic features underlying both musical and linguistic processing? To address this question, native English speakers (N = 118) discriminated Norwegian tonal contrasts and Norwegian vowels. Short tones differing in temporal, pitch, and spectral characteristics were used to measure sensitivity to the various acoustic features implicated in musical and speech processing. Musical ability was measured using Gordon's Advanced Measures of Music Audiation. Results showed that sensitivity to specific acoustic features played a role in non-native speech-sound processing: Controlling for non-verbal intelligence, prior foreign language-learning experience, and sex, sensitivity to pitch and spectral information partially mediated the link between musical ability and discrimination of non-native vowels and lexical tones. The findings suggest that while sensitivity to certain acoustic features partially mediates the relationship between musical ability and non-native speech-sound processing, complex tests of musical ability also tap into other shared mechanisms. © 2014 The British Psychological Society.
Dominik R Bach
Rising sound intensity often signals an approaching sound source and can serve as a powerful warning cue, eliciting phasic attention, perception biases and emotional responses. How the evaluation of approaching sounds unfolds over time remains elusive. Here, we capitalised on the temporal resolution of magnetoencephalography (MEG) to investigate in humans the dynamic encoding of perceiving approaching and receding sounds. We compared magnetic responses to intensity envelopes of complex sounds to those of white noise sounds, in which intensity change is not perceived as approaching. Sustained magnetic fields over temporal sensors tracked intensity change in complex sounds in an approximately linear fashion, an effect not seen for intensity change in white noise sounds, or for overall intensity. Hence, these fields are likely to track approach/recession, but not the apparent (instantaneous) distance of the sound source, or its intensity as such. The bilateral inferior temporal gyrus and right temporo-parietal junction emerged as likely sources of this activity. Our results indicate that discrete temporal cortical areas parametrically encode behavioural significance in moving sound sources, with the signal unfolding in a manner reminiscent of evidence accumulation. This may aid understanding of how acoustic percepts are evaluated as behaviourally relevant, and highlights a crucial role for these cortical areas.
Bram Van Dun
Background: Cortical auditory evoked potentials (CAEPs) are an emerging tool for hearing aid fitting evaluation in young children who cannot provide reliable behavioral feedback. It is therefore useful to determine the relationship between the sensation level of speech sounds and the detection sensitivity of CAEPs.
Design and methods: Twenty-five sensorineurally hearing-impaired infants with an age range of 8 to 30 months were tested once, 18 aided and 7 unaided. First, behavioral thresholds of the speech stimuli /m/, /g/, and /t/ were determined using visual reinforcement orientation audiometry (VROA). Afterwards, the same speech stimuli were presented at 55, 65, and 75 dB SPL, and CAEP recordings were made. An automatic statistical detection paradigm was used for CAEP detection.
Results: For sensation levels above 0, 10, and 20 dB respectively, detection sensitivities were equal to 72 ± 10, 75 ± 10, and 78 ± 12%. In 79% of the cases, automatic detection p-values became smaller when the sensation level was increased by 10 dB.
Conclusions: The results of this study suggest that the presence or absence of CAEPs can provide some indication of the audibility of a speech sound for infants with sensorineural hearing loss. The detection of a CAEP provides confidence, to a degree commensurate with the detection probability, that the infant is detecting that sound at the level presented. When testing infants where the audibility of speech sounds has not been established behaviorally, the lack of a cortical response indicates the possibility, but by no means a certainty, that the sensation level is 10 dB or less.
Barrozo, Tatiane Faria; Pagan-Neves, Luciana de Oliveira; Pinheiro da Silva, Joyce; Wertzner, Haydée Fiszbein
The purpose of the study was to determine the sensitivity and specificity of, and to establish cutoff points for, the severity index Percentage of Consonants Correct - Revised (PCC-R) in Brazilian Portuguese-speaking children with and without speech sound disorders. Participants were 72 children between 5:0 and 7:11 (years:months) of age: 36 children without speech and language complaints and 36 children with speech sound disorders. The PCC-R was applied to the figure-naming and word-imitation tasks that are part of the ABFW Child Language Test. Results were statistically analyzed: ROC curves were constructed, and the sensitivity and specificity values of the index were verified. The group of children without speech sound disorders presented greater PCC-R values in both tasks, regardless of the gender of the participants. The cutoff value observed for the picture-naming task was 93.4%, with a sensitivity of 0.89 and a specificity of 0.94 (age independent). For the word-imitation task, results were age dependent: for the age group ≤6:5 years old, the cutoff value was 91.0% (sensitivity of 0.77 and specificity of 0.94), and for the age group >6:5 years old, the cutoff value was 93.9% (sensitivity of 0.93 and specificity of 0.94). Given the high sensitivity and specificity of the PCC-R, we can conclude that the index was effective in discriminating and identifying children with and without speech sound disorders.
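The cutoff logic behind these results (classify a child as having a probable disorder when PCC-R falls below the cutoff, then score that classification against the clinical diagnosis) can be sketched as follows. Only the 93.4% picture-naming cutoff comes from the abstract; the scores and labels are invented for illustration:

```python
def classify(pcc_r, cutoff=93.4):
    """Flag a probable speech sound disorder when PCC-R is below the cutoff."""
    return pcc_r < cutoff

def sensitivity_specificity(scores, labels, cutoff=93.4):
    """scores: PCC-R values; labels: True if the child has a confirmed SSD."""
    flagged = [classify(s, cutoff) for s in scores]
    tp = sum(1 for f, l in zip(flagged, labels) if f and l)
    fn = sum(1 for f, l in zip(flagged, labels) if not f and l)
    tn = sum(1 for f, l in zip(flagged, labels) if not f and not l)
    fp = sum(1 for f, l in zip(flagged, labels) if f and not l)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical data: four children with SSD, four without.
scores = [85.0, 90.2, 92.8, 95.0, 94.1, 96.3, 97.5, 91.0]
labels = [True, True, True, True, False, False, False, False]
sens, spec = sensitivity_specificity(scores, labels)
print(sens, spec)  # 0.75 0.75 on this toy data
```

In the study itself, the cutoff is the point on the ROC curve that best trades off these two quantities over the full sample.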
Strait, Dana L; O'Connell, Samantha; Parbery-Clark, Alexandra; Kraus, Nina
The perception and neural representation of acoustically similar speech sounds underlie language development. Music training hones the perception of minute acoustic differences that distinguish sounds; this training may generalize to speech processing given that adult musicians have enhanced neural differentiation of similar speech syllables compared with nonmusicians. Here, we asked whether this neural advantage in musicians is present early in life by assessing musically trained and untrained children as young as age 3. We assessed auditory brainstem responses to the speech syllables /ba/ and /ga/ as well as auditory and visual cognitive abilities in musicians and nonmusicians across 3 developmental time-points: preschoolers, school-aged children, and adults. Cross-phase analyses objectively measured the degree to which subcortical responses differed to these speech syllables in musicians and nonmusicians for each age group. Results reveal that musicians exhibit enhanced neural differentiation of stop consonants early in life and with as little as a few years of training. Furthermore, the extent of subcortical stop consonant distinction correlates with auditory-specific cognitive abilities (i.e., auditory working memory and attention). Results are interpreted according to a corticofugal framework for auditory learning in which subcortical processing enhancements are engendered by strengthened cognitive control over auditory function in musicians. © The Author 2013. Published by Oxford University Press. All rights reserved.
Priester, G.H.; Post, W.J.; Goorhuis-Brouwer, S.M.
Objective: Analysis of examination procedure and diagnosis of articulation problems by speech therapists. Study design: Survey study. Materials and methods: Eighty-five Dutch speech therapists (23% response), working in private practices or involved in language screening procedures in Youth Health
Hrastelj, Laura; Knight, Rachael-Anne
Background: A pattern of ingressive substitutions for word-final sibilants can be identified in a small number of cases in child speech disorder, with growing evidence suggesting it is a phonological difficulty, despite the unusual surface form. Phonological difficulty implies a problem with the cognitive process of organizing speech into sound…
Emily B. J. Coffey
Speech-in-noise (SIN) perception is a complex cognitive skill that affects social, vocational, and educational activities. Poor SIN ability particularly affects young and elderly populations, yet varies considerably even among healthy young adults with normal hearing. Although SIN skills are known to be influenced by top-down processes that can selectively enhance lower-level sound representations, the complementary role of feed-forward mechanisms and their relationship to musical training is poorly understood. Using a paradigm that minimizes the main top-down factors that have been implicated in SIN performance, such as working memory, we aimed to better understand how robust encoding of periodicity in the auditory system (as measured by the frequency-following response; FFR) contributes to SIN perception. Using magnetoencephalography, we found that the strength of encoding at the fundamental frequency in the brainstem, thalamus, and cortex is correlated with SIN accuracy. The amplitude of the slower cortical P2 wave was previously also shown to be related to SIN accuracy and FFR strength; we use MEG source localization to show that the P2 wave originates in a temporal region anterior to that of the cortical FFR. We also confirm that the observed enhancements were related to the extent and timing of musicianship. These results are consistent with the hypothesis that basic feed-forward sound encoding affects SIN perception by providing better information to later processing stages, and that modifying this process may be one mechanism through which musical training might enhance the auditory networks that subserve both musical and language functions.
Lopez-Poveda, Enrique A; Eustaquio-Martín, Almudena; Stohl, Joshua S; Wolford, Robert D; Schatzer, Reinhold; Gorospe, José M; Ruiz, Santiago Santa Cruz; Benito, Fernando; Wilson, Blake S
We have recently proposed a binaural cochlear implant (CI) sound processing strategy inspired by the contralateral medial olivocochlear reflex (the MOC strategy) and shown that it improves intelligibility in steady-state noise (Lopez-Poveda et al., 2016, Ear Hear 37:e138-e148). The aim here was to evaluate possible speech-reception benefits of the MOC strategy for speech maskers, a more natural type of interferer. Speech reception thresholds (SRTs) were measured in six bilateral and two single-sided deaf CI users with the MOC strategy and with a standard (STD) strategy. SRTs were measured in unilateral and bilateral listening conditions, and for target and masker stimuli located at azimuthal angles of (0°, 0°), (-15°, +15°), and (-90°, +90°). Mean SRTs were 2-5 dB better with the MOC than with the STD strategy for spatially separated target and masker sources. For bilateral CI users, the MOC strategy (1) facilitated the intelligibility of speech in competition with spatially separated speech maskers in both unilateral and bilateral listening conditions; and (2) led to an overall improvement in spatial release from masking in the two listening conditions. Insofar as speech is a more natural type of interferer than steady-state noise, the present results suggest that the MOC strategy is a promising approach for CI users. Copyright © 2017. Published by Elsevier B.V.
Gildersleeve-Neumann, Christina; Goldstein, Brian A
The effect of bilingual service delivery on treatment of speech sound disorders (SSDs) in bilingual children is largely unknown. Bilingual children with SSDs are typically provided intervention in only one language, although research suggests dual-language instruction for language disorders is best practice for bilinguals. This study examined cross-linguistic generalization of bilingual intervention in treatment of two 5-year-old sequential bilingual boys with SSDs (one with Childhood Apraxia of Speech), hypothesizing that selecting and treating targets in both languages would result in significant overall change in their English and Spanish speech systems. A multiple baseline across behaviours design was used to measure treatment effectiveness for two targets per child. Children received treatment 2-3 times per week for 8 weeks and in Spanish for at least 2 of every 3 days. Ongoing treatment performance was measured in probes in both languages; overall speech skills were compared pre- and post-treatment. Both children's speech improved in both languages with similar magnitude; there was improvement in some non-treated errors. Treating both languages had an overall positive effect on these bilingual children's speech. Future bilingual intervention research should explore alternating treatments designs, efficiency of monolingual vs bilingual treatment, different language and bilingual backgrounds, and between-group comparisons.
Ferrucci, Juliana Lopes; Mangilli, Laura Davison; Sassi, Fernanda Chiarion; Limongi, Suelly Cecilia Olivan; Andrade, Claudia Regina Furquim de
This study aimed to investigate international scientific papers published on the subject of cervical auscultation and its use in speech therapy. The study involved a qualitative review of the literature spanning the last 10 years. Articles were selected from the PubMed database using the following keywords: cervical auscultation, swallowing and swallowing disorders. Studies were included if they were conducted on adult humans (over 18 years of age) and written in English. Each citation retrieved from the database was analyzed independently by each of the study researchers to ascertain its relevance for inclusion in the study. The methodology involved formulating the research question, locating and selecting studies, and critically evaluating the articles according to the precepts of the Cochrane Handbook. As a result, 35 studies were identified; 13 articles were analyzed because they allowed access to the full text and were directly related to the subject. We found that the studies were performed with groups of healthy subjects and subjects with different types of underlying pathology. Some studies compared the patterns found in the different groups. Some of the research sought to study the pattern of swallowing sounds in relation to different factors: evaluator experience, the specificity and sensitivity of the method, and how to improve the technique of cervical auscultation through the use of instruments other than the stethoscope. The conclusion of this critical analysis is that cervical auscultation is an important tool to be used in conjunction with other assessment methods in the routine clinical evaluation of swallowing.
Skebo, Crysten M.; Lewis, Barbara A.; Freebairn, Lisa A.; Tag, Jessica; Ciesla, Allison Avrich; Stein, Catherine M.
Purpose: The relationship of phonological awareness, overall language, vocabulary, and nonlinguistic cognitive skills to decoding and reading comprehension was examined for students at 3 stages of literacy development (i.e., early elementary school, middle school, and high school). Students with histories of speech sound disorders (SSD) with and without language impairment (LI) were compared to students without histories of SSD or LI (typical language; TL). Method: In a cross-sectional design, students ages 7;0 (years;months) to 17;9 completed tests that measured reading, language, and nonlinguistic cognitive skills. Results: For the TL group, phonological awareness predicted decoding at early elementary school, and overall language predicted reading comprehension at early elementary school and both decoding and reading comprehension at middle school and high school. For the SSD-only group, vocabulary predicted both decoding and reading comprehension at early elementary school, and overall language predicted both decoding and reading comprehension at middle school and decoding at high school. For the SSD and LI group, overall language predicted decoding at all 3 literacy stages and reading comprehension at early elementary school and middle school, and vocabulary predicted reading comprehension at high school. Conclusion: Although similar skills contribute to reading across the age span, the relative importance of these skills changes with children’s literacy stages. PMID:23833280
Lewis, Barbara A; Short, Elizabeth J; Iyengar, Sudha K; Taylor, H Gerry; Freebairn, Lisa; Tag, Jessica; Avrich, Allison A; Stein, Catherine M
The purpose of this study was to examine the association of speech-sound disorders (SSD) with symptoms of attention-deficit/hyperactivity disorder (ADHD) by the severity of the SSD and the mode of transmission of SSD within the pedigrees of children with SSD. The participants were 412 children who were enrolled in a longitudinal family study of SSD. Children were grouped on the basis of the severity of their SSD as determined by their scores on the Goldman-Fristoe Test of Articulation and history of an SSD. Five severity groups were compared: no SSD, resolved SSD, mild SSD, mild-moderate SSD, and moderate-severe SSD. Participants were also coded for comorbid language impairment (LI), based on scores on a standardized language test. Pedigrees of children were considered to represent bilineal inheritance of disorders if there was a history for SSD on both the maternal and paternal sides of the family. Parents completed the ADHD rating scale and a developmental questionnaire for each of their children. Children with moderate-severe SSD had higher ratings on the inattention and hyperactive/impulsivity scales than children with no SSD. Children whose family pedigrees demonstrated bilineal inheritance had higher ratings of inattention than children without bilineal inheritance. To determine the best predictors of ADHD ratings, multiple linear regression analyses were conducted. LI was more predictive of ADHD symptoms than SSD severity, bilineal inheritance of SSD, age, or gender. Findings support that LI rather than SSD is associated with ADHD.
Endowing machines with sensing capabilities similar to those of humans is a prevalent quest in engineering and computer science. In the pursuit of making computers sense their surroundings, a huge effort has been devoted to allowing machines and computers to acquire, process, analyze and understand their environment in a human-like way. Focusing on the sense of hearing, the ability of computers to sense their acoustic environment as humans do goes by the name of machine hearing. To achieve this ambitious aim, the representation of the audio signal is of paramount importance. In this paper, we present an up-to-date review of the most relevant audio feature extraction techniques developed to analyze the most common audio signals: speech, music and environmental sounds. Besides revisiting classic approaches for completeness, we include the latest advances in the field based on new domains of analysis together with novel bio-inspired proposals. These approaches are described following a taxonomy that organizes them according to their physical or perceptual basis, subsequently divided by the domain of computation (time, frequency, wavelet, image-based, cepstral, or other domains). The description of the approaches is accompanied by recent examples of their application to machine-hearing-related problems.
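As one concrete instance of the cepstral-domain features this review covers, the real cepstrum of an audio frame is the inverse transform of its log-magnitude spectrum. A stdlib-only sketch under simplifying assumptions (a direct O(n²) DFT, no windowing or mel warping, unlike production front ends):

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of a real-valued frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def real_cepstrum(frame):
    """Inverse DFT of the log-magnitude spectrum of a frame (real cepstrum)."""
    spectrum = dft(frame)
    log_mag = [math.log(abs(c) + 1e-12) for c in spectrum]  # epsilon avoids log(0)
    n = len(log_mag)
    # The log-magnitude spectrum of a real frame is real and even, so the
    # inverse DFT is real up to floating-point rounding; keep the real part.
    return [sum(log_mag[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]

# Toy frame: one cycle of a sine sampled at 16 points.
frame = [math.sin(2 * math.pi * t / 16) for t in range(16)]
ceps = real_cepstrum(frame)
print(len(ceps))  # 16 cepstral coefficients, one per sample in the frame
```

The low-order coefficients summarize spectral envelope and the higher-order ones capture fine periodicity, which is why cepstral features (typically mel-warped, as MFCCs) are a staple of the speech and music analyzers surveyed in the paper.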
Wellman, Rachel L.; Lewis, Barbara A.; Freebairn, Lisa A.; Avrich, Allison A.; Hansen, Amy J.; Stein, Catherine M.
Purpose: The main purpose of this study was to examine how children with isolated speech sound disorders (SSDs; n = 20), children with combined SSDs and language impairment (LI; n = 20), and typically developing children (n = 20), ages 3;3 (years;months) to 6;6, differ in narrative ability. The second purpose was to determine if early narrative ability predicts school-age (8–12 years) literacy skills. Method: This study employed a longitudinal cohort design. The children completed a narrative retelling task before their formal literacy instruction began. The narratives were analyzed and compared for group differences. Performance on these early narratives was then used to predict the children’s reading decoding, reading comprehension, and written language ability at school age. Results: Significant group differences were found in children’s (a) ability to answer questions about the story, (b) use of story grammars, and (c) number of correct and irrelevant utterances. Regression analysis demonstrated that measures of story structure and accuracy were the best predictors of the decoding of real words, reading comprehension, and written language. Measures of syntax and lexical diversity were the best predictors of the decoding of nonsense words. Conclusion: Combined SSDs and LI, and not isolated SSDs, impact a child’s narrative abilities. Narrative retelling is a useful task for predicting which children may be at risk for later literacy problems. PMID:21969531
Lewis, Barbara A; Avrich, Allison A; Freebairn, Lisa A; Hansen, Amy J; Sucheston, Lara E; Kuo, Iris; Taylor, H Gerry; Iyengar, Sudha K; Stein, Catherine M
To demonstrate that early childhood speech sound disorders (SSD) and later school-age reading, written expression, and spelling skills are influenced by shared endophenotypes that may be in part genetic. Children with SSD and their siblings were assessed at early childhood (ages 4-6 years) and followed at school age (7-12 years). The relationship of shared endophenotypes with early childhood SSD and school-age outcomes and the shared genetic influences on these outcomes were examined. Structural equation modeling demonstrated that oral motor skills, phonological awareness, phonological memory, vocabulary, and speeded naming have varying influences on reading decoding, spelling, spoken language, and written expression at school age. Genetic linkage studies demonstrated linkage for reading, spelling, and written expression measures to regions on chromosomes 1, 3, 6, and 15 that were previously linked to oral motor skills, articulation, phonological memory, and vocabulary at early childhood testing. Endophenotypes predict school-age literacy outcomes over and above that predicted by clinical diagnoses of SSD or language impairment. Findings suggest that these shared endophenotypes and common genetic influences affect early childhood SSD and later school-age reading, spelling, spoken language, and written expression skills.
Hayiou-Thomas, Marianna E; Carroll, Julia M; Leavett, Ruth; Hulme, Charles; Snowling, Margaret J
This study considers the role of early speech difficulties in literacy development, in the context of additional risk factors. Children were identified with speech sound disorder (SSD) at the age of 3½ years, on the basis of performance on the Diagnostic Evaluation of Articulation and Phonology. Their literacy skills were assessed at the start of formal reading instruction (age 5½), using measures of phoneme awareness, word-level reading and spelling; and 3 years later (age 8), using measures of word-level reading, spelling and reading comprehension. The presence of early SSD conferred a small but significant risk of poor phonemic skills and spelling at the age of 5½ and of poor word reading at the age of 8. Furthermore, within the group with SSD, the persistence of speech difficulties to the point of school entry was associated with poorer emergent literacy skills, and children with 'disordered' speech errors had poorer word reading skills than children whose speech errors indicated 'delay'. In contrast, the initial severity of SSD was not a significant predictor of reading development. Beyond the domain of speech, the presence of a co-occurring language impairment was strongly predictive of literacy skills and having a family risk of dyslexia predicted additional variance in literacy at both time-points. Early SSD alone has only modest effects on literacy development but when additional risk factors are present, these can have serious negative consequences, consistent with the view that multiple risks accumulate to predict reading disorders. © 2016 The Authors. Journal of Child Psychology and Psychiatry published by John Wiley & Sons Ltd on behalf of Association for Child and Adolescent Mental Health.
Estis, Julie M; Parisi, Julie A; Moore, Robert E; Brungart, Douglas S
Speech intelligibility performance with an in-the-ear microphone embedded in a custom-molded deep-insertion earplug was compared with results obtained using a free-field microphone. Intelligibility differences between microphones were further analyzed to assess whether reduced intelligibility was specific to certain sound classes. 36 participants completed the Modified Rhyme Test using recordings made with each microphone. While speech intelligibility for both microphones was highly accurate, intelligibility with the free-field microphone was significantly better than with the in-the-ear microphone. There were significant effects of place and manner of sound production. Significant differences in recognition among specific phonemes were also revealed. Implications included modifying the in-the-ear microphone to transmit more high frequency energy. Use of the in-the-ear microphone was limited by significant loss of high-frequency energy of the speech signal which resulted in reduced intelligibility for some sounds; however, the in-the-ear microphone is a promising technology for effective communication in military environments.
Archila-Suerte, Pilar; Zevin, Jason; Hernandez, Arturo E
This study investigates the roles of age of acquisition (AoA), socioeducational status (SES), and second language (L2) proficiency in the neural processing of L2 speech sounds. In a task of pre-attentive listening and passive viewing, Spanish-English bilinguals and a control group of English monolinguals listened to English syllables while watching a film of natural scenery. Eight regions of interest were selected from brain areas involved in speech perception and executive processes. The regions of interest were examined in two separate two-way ANOVAs (AoA × SES; AoA × L2 proficiency). The results showed that AoA was the main variable affecting the neural response in L2 speech processing. Direct comparisons between AoA groups of equivalent SES and proficiency level enhanced the intensity and magnitude of the results. These results suggest that AoA, more than SES and proficiency level, determines which brain regions are recruited for the processing of second language speech sounds. Copyright © 2014 Elsevier Inc. All rights reserved.
Apel, Kenn; Lawrence, Jessika
Purpose: In this study, the authors compared the morphological awareness abilities of children with speech sound disorder (SSD) and children with typical speech skills and examined how morphological awareness ability predicted word-level reading and spelling performance above other known contributors to literacy development. Method: Eighty-eight…
Brown, Andrew D.; Beemer, Brianne T.; Greene, Nathaniel T.; Argo, Theodore; Meegan, G. Douglas; Tollin, Daniel J.
Hearing protection devices (HPDs) such as earplugs offer to mitigate noise exposure and reduce the incidence of hearing loss among persons frequently exposed to intense sound. However, distortions of spatial acoustic information and reduced audibility of low-intensity sounds caused by many existing HPDs can make their use untenable in high-risk (e.g., military or law enforcement) environments where auditory situational awareness is imperative. Here we assessed (1) sound source localization accuracy using a head-turning paradigm, (2) speech-in-noise recognition using a modified version of the QuickSIN test, and (3) tone detection thresholds using a two-alternative forced-choice task. Subjects were 10 young normal-hearing males. Four different HPDs were tested (two active, two passive), including two new and previously untested devices. Relative to unoccluded (control) performance, all tested HPDs significantly degraded performance across tasks, although one active HPD slightly improved high-frequency tone detection thresholds and did not degrade speech recognition. Behavioral data were examined with respect to head-related transfer functions measured using a binaural manikin with and without tested HPDs in place. Data reinforce previous reports that HPDs significantly compromise a variety of auditory perceptual facilities, particularly sound localization due to distortions of high-frequency spectral cues that are important for the avoidance of front-back confusions. PMID:26313145
Full Text Available , “Recognizing speech of goats, wolves, sheep and non-natives,” Speech Communication, vol. 35, no. 1, pp. 71–79, 2001. L. M. Tomokiyo, “Handling non-native speech in LVCSR: A preliminary study,” in Proceedings of the EUROCALL/CALICO/ISCA workshop...
Monaghan, Padraic; Christiansen, Morten H.
There are numerous models of how speech segmentation may proceed in infants acquiring their first language. We present a framework for considering the relative merits and limitations of these various approaches. We then present a model of speech segmentation that aims to reveal important sources of information for speech segmentation, and to…
Lima, César F; Garrett, Carolina; Castro, São Luís
Does emotion processing in music and speech prosody recruit common neurocognitive mechanisms? To examine this question, we implemented a cross-domain comparative design in Parkinson's disease (PD). Twenty-four patients and 25 controls performed emotion recognition tasks for music and spoken sentences. In music, patients had impaired recognition of happiness and peacefulness, and intact recognition of sadness and fear; this pattern was independent of general cognitive and perceptual abilities. In speech, patients had a small global impairment, which was significantly mediated by executive dysfunction. Hence, PD affected differently musical and prosodic emotions. This dissociation indicates that the mechanisms underlying the two domains are partly independent.
McLeod, Sharynne; Baker, Elise; McCormack, Jane; Wren, Yvonne; Roulstone, Sue; Crowe, Kathryn; Masso, Sarah; White, Paul; Howland, Charlotte
Purpose: The aim was to evaluate the effectiveness of computer-assisted input-based intervention for children with speech sound disorders (SSD). Method: The Sound Start Study was a cluster-randomized controlled trial. Seventy-nine early childhood centers were invited to participate, 45 were recruited, and 1,205 parents and educators of 4- and…
D'Ausilio, Alessandro; Bufalari, Ilaria; Salmas, Paola; Fadiga, Luciano
Listening to speech recruits a network of fronto-temporo-parietal cortical areas. Classical models hold that anterior (motor) sites are involved in speech production, whereas posterior sites are involved in comprehension. This functional segregation is increasingly challenged by action-perception theories suggesting that brain circuits for speech articulation and speech perception are functionally interdependent. Recent studies report that speech listening elicits motor activity analogous to production. However, the motor system could be crucially recruited only under certain conditions that make speech discrimination hard. Here, by using event-related double-pulse transcranial magnetic stimulation (TMS) over lip and tongue motor areas, we present data suggesting that the motor system may play a role in discriminating speech signals in noisy, but crucially not in noise-free, environments. Copyright © 2011 Elsevier Srl. All rights reserved.
Full Text Available The acquisition of letter-speech sound associations is one of the basic requirements for fluent reading acquisition, and its failure may contribute to reading difficulties in developmental dyslexia. Here we investigated event-related potential (ERP) measures of letter-speech sound integration in 9-year-old typical and dyslexic readers and specifically tested their relation to individual differences in reading fluency. We employed an audiovisual oddball paradigm in typical readers (n = 20), dysfluent (n = 18) and severely dysfluent (n = 18) dyslexic children. In one auditory and two audiovisual conditions the Dutch spoken vowels /a/ and /o/ were presented as standard and deviant stimuli. In audiovisual blocks, the letter 'a' was presented either simultaneously (AV0) or 200 ms before (AV200) vowel sound onset. Across the three groups of children, vowel deviancy in auditory blocks elicited comparable mismatch negativity (MMN) and late negativity (LN) responses. In typical readers, both audiovisual conditions (AV0 and AV200) led to enhanced MMN and LN amplitudes. In both dyslexic groups, the audiovisual LN effects were mildly reduced. Most interestingly, individual differences in reading fluency were correlated with MMN latency in the AV0 condition. A further analysis revealed that this effect was driven by a short-lived MMN effect encompassing only the N1 window in severely dysfluent dyslexics versus a longer MMN effect encompassing both the N1 and P2 windows in the other two groups. Our results confirm and extend previous findings in dyslexic children by demonstrating a deficient pattern of letter-speech sound integration that depends on the level of reading dysfluency. These findings underscore the importance of considering individual differences across the entire spectrum of reading skills in addition to group differences between typical and dyslexic readers.
Eramudugolla, Ranmalee; Henderson, Rachel; Mattingley, Jason B.
Integration of simultaneous auditory and visual information about an event can enhance our ability to detect that event. This is particularly evident in the perception of speech, where the articulatory gestures of the speaker's lips and face can significantly improve the listener's detection and identification of the message, especially when that…
Pisoni, David B.
This paper reviews some of the major evidence and arguments currently available to support the view that human speech perception may require the use of specialized neural mechanisms for perceptual analysis. Experiments using synthetically produced speech signals with adults are briefly summarized and extensions of these results to infants and other organisms are reviewed with an emphasis towards detailing those aspects of speech perception that may require some need for specialized species-specific processors. Finally, some comments on the role of early experience in perceptual development are provided as an attempt to identify promising areas of new research in speech perception. PMID:399200
McGrath, Lauren M; Hutaff-Lee, Christa; Scott, Ashley; Boada, Richard; Shriberg, Lawrence D; Pennington, Bruce F
This study focuses on the comorbidity between attention-deficit/hyperactivity disorder (ADHD) symptoms and speech sound disorder (SSD). SSD is a developmental disorder characterized by speech production errors that impact intelligibility. Previous research addressing this comorbidity has typically used heterogeneous groups of speech-language disordered children. This study employed more precise speech-language diagnostic criteria and examined ADHD symptomatology in 108 SSD children between the ages of 4 and 7 years old with specific language impairment (SLI) (n = 23, 14 males, 9 females) and without SLI (n = 85, 49 males, 36 females). We also examined whether a subcategory of SSD, persistent (n = 39, 25 males, 14 females) versus normalized SSD (n = 67, 38 males, 29 females), was associated with ADHD and/or interacted with SLI to predict ADHD symptomatology. Results indicated that participants in the SSD + SLI group had higher rates of inattentive ADHD symptoms than those in the SSD-only and control groups. In addition, an unexpected interaction emerged such that children with SLI and normalized-SSD had significantly higher ADHD inattentive ratings than the other subgroups. A proposed explanation for this interaction is discussed.
Li, Xinyan; Zhao, Dan; Li, Junwei; Xu, Yousheng
Self-sustained thermoacoustic oscillations most often arise from the coupling between unsteady heat release and acoustic waves. Large-amplitude oscillations are desirable in thermoacoustic engine systems. However, they are undesirable in many other systems such as aero-engine afterburners, rocket motors, ramjets, and gas turbines, since the oscillations may become so intense that they cause structural damage and costly mission failure. In this work, we experimentally investigate the "anti-sound" approach to damping Rijke-type thermoacoustic oscillations by actuating a monopole-like sound source. For this, four different least-mean-square (LMS) algorithms are used to determine the "anti-sound" signal that drives the actuator, and their performance is compared. It is found that the LMS-based "anti-sound" approach is able to minimize the thermoacoustic oscillations, even when the operating conditions are slightly changed; the sound pressure level is reduced by 45 dB. Finally, a numerical model is developed to gain insight into the interaction between the monopole sound source and the system. Unsteady heat release from the flame is assumed to be caused by flame-surface variations resulting from the oncoming acoustic fluctuations. By linearizing the flame model and recasting it into the classical time-lag N-τ formulation, the thermoacoustic system transfer function is calculated. Compared with experimental measurements obtained by injecting broad-band white noise, good agreement is obtained.
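As a rough illustration of the kind of update rule an LMS-based controller applies, here is a minimal single-channel LMS filter sketch; all names and parameter values are hypothetical, and practical active-control systems typically use filtered-x LMS variants that also model the secondary (actuator-to-sensor) path, which this sketch omits:

```python
import numpy as np

def lms_cancel(reference, disturbance, n_taps=32, mu=0.01):
    """Minimal LMS adaptive filter: adapts FIR weights w so that the
    actuator output y = w . x cancels the measured disturbance d.
    Returns the residual error signal e = d - y at the sensor."""
    w = np.zeros(n_taps)       # adaptive FIR weights
    x_buf = np.zeros(n_taps)   # recent reference samples
    e = np.zeros(len(disturbance))
    for n in range(len(disturbance)):
        x_buf[1:] = x_buf[:-1]          # shift in the newest sample
        x_buf[0] = reference[n]
        y = w @ x_buf                   # "anti-sound" estimate
        e[n] = disturbance[n] - y       # residual pressure
        w += mu * e[n] * x_buf          # gradient-descent weight update
    return e
```

Driving the actuator so that the residual e shrinks is the essence of the "anti-sound" approach; the step size mu trades convergence speed against stability.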
Verdon, Sarah; McLeod, Sharynne; Wong, Sandie
Background: The speech and language therapy profession is required to provide services to increasingly multilingual caseloads. Much international research has focused on the challenges of speech and language therapists' (SLTs) practice with multilingual children. Aims: To draw on the experience and knowledge of experts in the field to: (1)…
Lousada, M; Jesus, Luis M T; Hall, A; Joffe, V
The effectiveness of two treatment approaches (phonological therapy and articulation therapy) for the treatment of 14 children, aged 4;0-6;7 years, with phonologically based speech-sound disorder (SSD) has previously been analysed with severity outcome measures (percentage of consonants correct score, percentage occurrence of phonological processes and phonetic inventory). Considering that the ultimate goal of intervention for children with phonologically based SSD is to improve intelligibility, it is curious that intervention studies focusing on children's phonology do not routinely use intelligibility as an outcome measure. It is therefore important that the impact of interventions on speech intelligibility is explored. This paper investigates the effectiveness of the two treatment approaches (phonological therapy and articulation therapy) using intelligibility measures, both in single words and in continuous speech, as the primary outcome. Fourteen children with phonologically based SSD participated in the intervention. The children were randomly assigned to phonological therapy or articulation therapy (seven children in each group). Two assessment methods were used for measuring intelligibility: a word identification task (for single words) and a rating scale (for continuous speech). Twenty-one unfamiliar adults listened to and judged the children's intelligibility. Reliability analyses showed overall high agreement between listeners across both methods. Significant improvements in intelligibility were noted in both single words (paired t(6)=4.409, p=0.005) and continuous speech (asymptotic Z=2.371, p=0.018) for the group receiving phonological therapy pre- to post-treatment, but no differences in intelligibility were found for those receiving articulation therapy pre- to post-treatment, either for single words (paired t(6)=1.763, p=0.128) or continuous speech (asymptotic Z=1.442, p=0.149). Intelligibility measures were sensitive enough to show changes in the
Robertson, William C
Muddled about what makes music? Stuck on the study of harmonics? Dumbfounded by how sound gets around? Now you no longer have to struggle to teach concepts you really don't grasp yourself. Sound takes an intentionally light touch to help out all those adults (science teachers, parents wanting to help with homework, home-schoolers) seeking the necessary scientific background to teach middle-school physics with confidence. The book introduces sound waves and uses that model to explain sound-related occurrences. Starting with the basics of what causes sound and how it travels, you'll learn how musical instruments work, how sound waves add and subtract, how the human ear works, and even why you can sound like a Munchkin when you inhale helium. Sound is the fourth book in the award-winning Stop Faking It! series, published by NSTA Press. Like the other popular volumes, it is written by irreverent educator Bill Robertson, who offers this Sound recommendation: one of the coolest activities is whacking a spinning metal rod...
Dubnov, Shlomo; Rodet, Xavier
This work investigates aperiodicities that occur in the sustained portion of a musical-instrument sound played by a human player, due to synchronous versus asynchronous deviations of the partial phases. Using additive sinusoidal analysis, the phases of individual partials are precisely extracted and their correlation statistics and coupling effects are analyzed. It is shown that various musical instruments exhibit different phase-coupling characteristics. The effect of phase coupling is compared to analysis by means of higher-order statistics, and it is shown that the two methods are closely related mathematically. Following a detailed analysis of phase coupling for various musical instruments, it is suggested that phase coupling is an important characteristic of the sustained portion of the sound of individual musical instruments, and possibly even of instrumental families. Interesting differences in phase deviations were found for the flute, trumpet and cello. For the cello, the effect of vibrato is examined by comparing the analysis of a closed-string sound played with natural vibrato to that of an open-string sound containing no vibrato. Finally, a possible model for phase deviations in the cello is presented and a simulation of phase fluctuations under this model is performed.
Yeung, H Henny; Werker, Janet F
One of the central themes in the study of language acquisition is the gap between the linguistic knowledge that learners demonstrate, and the apparent inadequacy of linguistic input to support induction of this knowledge. One of the first linguistic abilities in the course of development to exemplify this problem is in speech perception: specifically, learning the sound system of one's native language. Native-language sound systems are defined by meaningful contrasts among words in a language, yet infants learn these sound patterns before any significant numbers of words are acquired. Previous approaches to this learning problem have suggested that infants can learn phonetic categories from statistical analysis of auditory input, without regard to word referents. Experimental evidence presented here suggests instead that young infants can use visual cues present in word-labeling situations to categorize phonetic information. In Experiment 1, 9-month-old English-learning infants failed to discriminate two non-native phonetic categories, establishing baseline performance in a perceptual discrimination task. In Experiment 2, these infants succeeded at discrimination after watching contrasting visual cues (i.e., videos of two novel objects) paired consistently with the two non-native phonetic categories. In Experiment 3, these infants failed at discrimination after watching the same visual cues, but paired inconsistently with the two phonetic categories. At an age before which memory of word labels is demonstrated in the laboratory, 9-month-old infants use contrastive pairings between objects and sounds to influence their phonetic sensitivity. Phonetic learning may have a more functional basis than previous statistical learning mechanisms assume: infants may use cross-modal associations inherent in social contexts to learn native-language phonetic categories.
Sayles, Mark; Stasiak, Arkadiusz; Winter, Ian M
The auditory system typically processes information from concurrently active sound sources (e.g., two voices speaking at once), in the presence of multiple delayed, attenuated and distorted sound-wave reflections (reverberation). Brainstem circuits help segregate these complex acoustic mixtures into "auditory objects." Psychophysical studies demonstrate a strong interaction between reverberation and fundamental-frequency (F0) modulation, leading to impaired segregation of competing vowels when segregation is on the basis of F0 differences. Neurophysiological studies of complex-sound segregation have concentrated on sounds with steady F0s, in anechoic environments. However, F0 modulation and reverberation are quasi-ubiquitous. We examine the ability of 129 single units in the ventral cochlear nucleus (VCN) of the anesthetized guinea pig to segregate the concurrent synthetic vowel sounds /a/ and /i/, based on temporal discharge patterns under closed-field conditions. We address the effects of added real-room reverberation, F0 modulation, and the interaction of these two factors, on brainstem neural segregation of voiced speech sounds. A firing-rate representation of single-vowels' spectral envelopes is robust to the combination of F0 modulation and reverberation: local firing-rate maxima and minima across the tonotopic array code vowel-formant structure. However, single-vowel F0-related periodicity information in shuffled inter-spike interval distributions is significantly degraded in the combined presence of reverberation and F0 modulation. Hence, segregation of double-vowels' spectral energy into two streams (corresponding to the two vowels), on the basis of temporal discharge patterns, is impaired by reverberation; specifically when F0 is modulated. All unit types (primary-like, chopper, onset) are similarly affected. These results offer neurophysiological insights into the perceptual organization of complex acoustic scenes under realistically challenging listening conditions.
Moser, Dana; Fridriksson, Julius; Bonilha, Leonardo; Healy, Eric W; Baylis, Gordon; Baker, Julie M; Rorden, Chris
Two primary areas of damage have been implicated in apraxia of speech (AOS) based on the time post-stroke: (1) the left inferior frontal gyrus (IFG) in acute patients, and (2) the left anterior insula (aIns) in chronic patients. While AOS is widely characterized as a disorder in motor speech planning, little is known about the specific contributions of each of these regions in speech. The purpose of this study was to investigate cortical activation during speech production with a specific focus on the aIns and the IFG in normal adults. While undergoing sparse fMRI, 30 normal adults completed a 30-minute speech-repetition task consisting of three-syllable nonwords that contained either (a) English (native) syllables or (b) non-English (novel) syllables. When the novel syllable productions were compared to the native syllable productions, greater neural activation was observed in the aIns and IFG, particularly during the first 10 min of the task when novelty was the greatest. Although activation in the aIns remained high throughout the task for novel productions, greater activation was clearly demonstrated when the initial 10 min was compared to the final 10 min of the task. These results suggest increased activity within an extensive neural network, including the aIns and IFG, when the motor speech system is taxed, such as during the production of novel speech. We speculate that the amount of left aIns recruitment during speech production may be related to the internal construction of the motor speech unit such that the degree of novelty/automaticity would result in more or less demands respectively. The role of the IFG as a storehouse and integrative processor for previously acquired routines is also discussed.
Sound has the power to soothe, excite, warn, protect, and inform. Indeed, the transmission and reception of audio signals pervade our daily lives. Readers will examine the mechanics and properties of sound and find an overview of the "interdisciplinary science called acoustics." Also covered are the functions and diseases of the human ear.
Ruaa Osama Hariri
Children with Attention-Deficiency/Hyperactive Disorder (ADHD) often have co-existing learning disabilities and developmental weaknesses or delays in some areas including speech (Rief, 2005). Seeing that phonological disorders include articulation errors and other forms of speech disorders, studies pertaining to children with ADHD symptoms who demonstrate signs of phonological disorders in their native Arabic language are lacking. The purpose of this study is to provide a description of Arabi...
Abdulmohsen A. Dashti
Full Text Available In light of sociolinguistic phonological change, the following study investigates the [j] sound in the speech of Kuwaitis as the predominant form characterizing the sedentary population, which is made up of both indigenous and non-indigenous groups, while [ʤ] is the realisation of the Bedouins, who are also part of the indigenous population. Although [ʤ] is the classical variant, it has for some time been regarded by Kuwaitis as the stigmatized form, and [j] as the one that carries prestige. This study examines the change of status of [j] and [ʤ] in the speech of Kuwaitis. The main hypothesis is that [j] no longer carries prestige. To test this hypothesis, 40 Kuwaitis of different genders, ages, educational backgrounds, and social networks were spontaneously chosen to be interviewed. Their speech was phonetically transcribed and accordingly was quantitatively and qualitatively analyzed. Results indicate that the [j] variant is undergoing a change of status, and that social parameters, together with the significant political and social changes that Kuwait has undergone recently, have triggered this linguistic shift.
Eicher, J D; Stein, C M; Deng, F; Ciesla, A A; Powers, N R; Boada, R; Smith, S D; Pennington, B F; Iyengar, S K; Lewis, B A; Gruen, J R
A major milestone of child development is the acquisition and use of speech and language. Communication disorders, including speech sound disorder (SSD), can impair a child's academic, social and behavioral development. Speech sound disorder is a complex, polygenic trait with a substantial genetic component. However, specific genes that contribute to SSD remain largely unknown. To identify associated genes, we assessed the association of the DYX2 dyslexia risk locus and markers in neurochemical signaling genes (e.g., nicotinic and dopaminergic) with SSD and related endophenotypes. We first performed separate primary associations in two independent samples - Cleveland SSD (210 affected and 257 unaffected individuals in 127 families) and Denver SSD (113 affected individuals and 106 unaffected individuals in 85 families) - and then combined results by meta-analysis. DYX2 markers, specifically those in the 3' untranslated region of DCDC2 (P = 1.43 × 10(-4) ), showed the strongest associations with phonological awareness. We also observed suggestive associations of dopaminergic-related genes ANKK1 (P = 1.02 × 10(-2) ) and DRD2 (P = 9.22 × 10(-3) ) and nicotinic-related genes CHRNA3 (P = 2.51 × 10(-3) ) and BDNF (P = 8.14 × 10(-3) ) with case-control status and articulation. Our results further implicate variation in putative regulatory regions in the DYX2 locus, particularly in DCDC2, influencing language and cognitive traits. The results also support previous studies implicating variation in dopaminergic and nicotinic neural signaling influencing human communication and cognitive development. Our findings expand the literature showing genetic factors (e.g., DYX2) contributing to multiple related, yet distinct neurocognitive domains (e.g., dyslexia, language impairment, and SSD). How these factors interactively yield different neurocognitive and language-related outcomes remains to be elucidated. © 2015 The Authors. Genes, Brain and Behavior published by
Cristina Murphy; Luciana Pagan-Neves; Haydee Wertzner; Eliane Schochat
This study aimed to compare the effects of a non-linguistic auditory intervention approach with a phonological intervention approach on the phonological skills of children with speech sound disorder. A total of 17 children, aged 7-12 years, with speech sound disorder were randomly allocated to either the non-linguistic auditory temporal intervention group (n = 10, average age 7.7 ± 1.2) or phonological intervention group (n = 7, average age 8.6 ± 1.2). The intervention outcomes included audit...
Furlong, Lisa M; Morris, Meg E; Erickson, Shane; Serry, Tanya A
Although mobile apps are readily available for speech sound disorders (SSD), their validity has not been systematically evaluated. This evidence-based appraisal will critically review and synthesize current evidence on available therapy apps for use by children with SSD. The main aims are to (1) identify the types of apps currently available for Android and iOS mobile phones and tablets, and (2) to critique their design features and content using a structured quality appraisal tool. This protocol paper presents and justifies the methods used for a systematic review of mobile apps that provide intervention for use by children with SSD. The primary outcomes of interest are (1) engagement, (2) functionality, (3) aesthetics, (4) information quality, (5) subjective quality, and (6) perceived impact. Quality will be assessed by 2 certified practicing speech-language pathologists using a structured quality appraisal tool. Two app stores will be searched from the 2 largest operating platforms, Android and iOS. Systematic methods of knowledge synthesis shall include searching the app stores using a defined procedure, data extraction, and quality analysis. This search strategy shall enable us to determine how many SSD apps are available for Android and for iOS compatible mobile phones and tablets. It shall also identify the regions of the world responsible for the apps' development, the content and the quality of offerings. Recommendations will be made for speech-language pathologists seeking to use mobile apps in their clinical practice. This protocol provides a structured process for locating apps and appraising the quality, as the basis for evaluating their use in speech pathology for children in English-speaking nations.
One of the most important development banks financing private initiatives in the Central and Eastern European countries is the European Bank for Reconstruction and Development (EBRD). As an international financial institution, the EBRD plays a very important role in the development of many sectors, such as agribusiness, energy efficiency, financial institutions, manufacturing, municipal and environmental infrastructure, natural resources, power and energy, property and tourism, telecommunications, information technology and media, and transport. Its objectives are to promote the transition to market economies by investing mainly in the private sector, to mobilize significant foreign direct investment, to support privatization, restructuring and better municipal services to improve people’s lives, and to encourage environmentally sound and sustainable development. The present article focuses on the last objective, namely the bank’s commitment to promote environmentally sound and sustainable development, and briefly presents the EBRD environmental policy, because the EBRD, unlike other development banks, has strong and imperative regulations regarding this issue. This is why all potential EBRD beneficiaries must prove that their projects are environmentally sound.
Blau, Vera C; van Atteveldt, Nienke; Ekkebus, Michel; Goebel, Rainer; Blomert, Leo
Developmental dyslexia is a specific reading and spelling deficit affecting 4% to 10% of the population. Advances in understanding its origin support a core deficit in phonological processing characterized by difficulties in segmenting spoken words into their minimally discernable speech segments
Smulders, Yvette E.; Rinia, Albert B.; Pourier, Vanessa E C; Van Zon, Alice; Van Zanten, Gijsbert A.; Stegeman, Inge; Scherf, Fanny W A C; Smit, Adriana L.; Topsakal, Vedat; Tange, Rinze A.; Grolman, Wilko
The Advanced Bionics® (AB)-York crescent of sound is a new test setup that comprises speech intelligibility in noise and localization tests that represent everyday listening situations. One of its tests is the Sentence Test with Adaptive Randomized Roving levels (STARR) with sentences and noise
Blau, Vera C; Reithler, Joel; van Atteveldt, Nienke; Seitz, Jochen; Gerretsen, Patty; Goebel, Rainer; Blomert, Leo
Learning to associate auditory information of speech sounds with visual information of letters is a first and critical step for becoming a skilled reader in alphabetic languages. Nevertheless, it remains largely unknown which brain areas subserve the learning and automation of such associations. Here, we employ functional magnetic resonance…
McLeod, Sharynne; Daniel, Graham; Barr, Jacqueline
Children interact with people in context: including home, school, and in the community. Understanding children's relationships within context is important for supporting children's development. Using child-friendly methodologies, the purpose of this research was to understand the lives of children with speech sound disorder (SSD) in context.…
Fraga González, G.; Žarić, G.; Tijms, J.; Bonte, M.; van der Molen, M.W.
We use a neurocognitive perspective to discuss the contribution of learning letter-speech sound (L-SS) associations and visual specialization in the initial phases of reading in dyslexic children. We review findings from associative learning studies on related cognitive skills important for
Pivik, R. T.; Andres, Aline; Badger, Thomas M.
Early post-natal nutrition influences later development, but there are no studies comparing brain function in healthy infants as a function of dietary intake even though the major infant diets differ significantly in nutrient composition. We studied brain responses (event-related potentials; ERPs) to speech sounds for infants who were fed either…
Hoffmann, Pablo F.; Møller, Anders Kalsgaard; Christensen, Flemming
A hear-through headset is formed by mounting miniature microphones on small insert earphones. This type of ear-wear technology enables the user to hear the sound sources and acoustics of the surroundings as close to real life as possible, with the additional feature that computer-generated audio signals can be superimposed via earphone reproduction. An important aspect of the hear-through headset is its transparency, i.e. how close to real life the electronically amplified sounds can be perceived. Here we report experiments conducted to evaluate the auditory transparency of a hear-through headset...
Sices, Laura; Taylor, H Gerry; Freebairn, Lisa; Hansen, Amy; Lewis, Barbara
Disorders of articulation or speech-sound disorders (SSD) are common in early childhood. Children with these disorders may be at risk for reading difficulties because they may have poor auditory, phonologic, and verbal memory skills. We sought to characterize the reading and writing readiness of preschool children with SSD and identify factors associated with preliteracy skills. Subjects were 125 children aged 3 to 6 years with moderate to severe SSD; 53% had comorbid language impairment (LI). Reading readiness was measured with the Test of Early Reading Ability-2 (TERA) and writing skills with the Test of Early Written Language-2 (TEWL), which assessed print concept knowledge. Linear regression was used to examine the association between SSD severity and TERA and TEWL scores and analysis of variance to examine the effect of comorbid LI. Performance on a battery of speech and language tests was reduced by way of factor analysis to composites for articulation, narrative, grammar, and word knowledge skills. Early reading and writing scores were significantly lower for children with comorbid LI but were not related to SSD severity once language status was taken into account. Composites for grammar and word knowledge were related to performance on the TERA and TEWL, even after adjusting for Performance IQ. Below average language skills in preschool place a child at risk for deficits in preliteracy skills, which may have implications for the later development of reading disability. Preschool children with SSD and LI may benefit from instruction in preliteracy skills in addition to language therapy.
Marie, Céline; Kujala, Teija; Besson, Mireille
The aim of this experiment was twofold. Our first goal was to determine whether linguistic expertise influences the pre-attentive [as reflected by the Mismatch Negativity (MMN)] and the attentive processing (as reflected by behavioural discrimination accuracy) of non-speech, harmonic sounds. The second was to directly compare the effects of linguistic and musical expertise. To this end, we compared non-musician native speakers of a quantity language, Finnish, in which duration is a phonemically contrastive cue, with French musicians and French non-musicians. Results revealed that pre-attentive and attentive processing of duration deviants was enhanced in Finnish non-musicians and French musicians compared to French non-musicians. By contrast, MMN in French musicians was larger than in both Finns and French non-musicians for frequency deviants, whereas no between-group differences were found for intensity deviants. By showing similar effects of linguistic and musical expertise, these results argue in favor of common processing of duration in music and speech. Copyright © 2010 Elsevier Srl. All rights reserved.
Peter, Beate; Raskind, Wendy H.
Purpose: To evaluate phenotypic expressions of speech sound disorder (SSD) in multigenerational families with evidence of familial forms of SSD. Method: Members of five multigenerational families (N = 36) produced rapid sequences of monosyllables and disyllables and tapped computer keys with repetitive and alternating movements. Results: Measures of repetitive and alternating motor speed were correlated within and between the two motor systems. Repetitive and alternating motor speeds increased in children and decreased in adults as a function of age. In two families with children who had severe speech deficits consistent with disrupted praxis, slowed alternating, but not repetitive, oral movements characterized most of the affected children and adults with a history of SSD, and slowed alternating hand movements were seen in some of the biologically related participants as well. Conclusion: Results are consistent with a familial motor-based SSD subtype with incomplete penetrance, motivating new clinical questions about motor-based intervention not only in the oral but also the limb system. PMID:21909176
O. Yu. Kydashev
This paper presents a detailed description of an agglomerative clustering system for speech segments based on the Bayesian information criterion. Numerical experiment results with different acoustic features, as well as with full and diagonal covariance matrices, are given. A diarization error rate (DER) of 6.4% was achieved with the designed system on audio records of radio «Svoboda».
At the turn of the twentieth century, the sound of presidential address changed from an orotund style to an instructional style. The orotund style had featured the careful pronunciation of consonants, elongated vowels, trilled r's and repeated declamations. The instructional style, on the other hand, mimicked the conversational lectures of the…
Spendrup, Sara; Hunter, Erik; Isgren, Ellinor
Nature sounds are increasingly used by some food retailers to enhance in-store ambiance and potentially even influence sustainable food choices. An in-store, 2 × 3 between-subject full factorial experiment conducted on 627 customers over 12 days tested whether nature sound directly and indirectly influenced willingness to buy (WTB) sustainable foods. The results show that nature sounds positively and directly influence WTB organic foods in groups of customers (men) that have relatively low initial intentions to buy. Indirectly, we did not find support for the effect of nature sound on influencing mood or connectedness to nature (CtN). However, we show that information on the product's sustainability characteristics moderates the relationship between CtN and WTB in certain groups. Namely, when CtN is high, sustainability information positively moderated WTB both organic and climate friendly foods in men. Conversely, when CtN was low, men expressed lower WTB organic and climate friendly foods than identical, albeit conventionally labelled products. Consequently, our study concludes that nature sounds might be an effective, yet subtle in-store tool to use on groups of consumers who might otherwise respond negatively to more overt forms of sustainable food information. Copyright © 2016 Elsevier Ltd. All rights reserved.
Haapala, Sini; Niemitalo-Haapola, Elina; Raappana, Antti; Kujala, Tiia; Suominen, Kalervo; Kujala, Teija; Jansson-Verkasalo, Eira
pattern of discriminating small speech sound contrasts in children with RAOM. The results suggest that childhood RAOM does not affect the central auditory pathway integrity or sound encoding. However, RAOM may lead to aberrant preattentive discrimination of sound features even when the peripheral auditory input is normal. These results are clinically significant because even transient problems with auditory processing may delay language development.
Background: The speech signal contains both information about phonological features, such as place of articulation, and non-phonological features, such as speaker identity. These are different aspects of the 'what'-processing stream (speaker vs. speech content), and here we show that they can be further segregated as they may occur in parallel but within different neural substrates. Subjects listened to two different vowels, each spoken by two different speakers. During one block, they were asked to identify a given vowel irrespective of the speaker (phonological categorization), while during the other block the speaker had to be identified irrespective of the vowel (speaker categorization). Auditory evoked fields were recorded using 148-channel magnetoencephalography (MEG), and magnetic source imaging was obtained for 17 subjects. Results: During phonological categorization, a vowel-dependent difference of N100m source location perpendicular to the main tonotopic gradient replicated previous findings. In speaker categorization, the relative mapping of vowels remained unchanged but sources were shifted towards more posterior and more superior locations. Conclusions: These results imply that the N100m reflects the extraction of abstract invariants from the speech signal. This part of the processing is accomplished in auditory areas anterior to AI, which are part of the auditory 'what' system. This network seems to include spatially separable modules for identifying the phonological information and for associating it with a particular speaker that are activated in synchrony but within different regions, suggesting that 'what' processing can be more adequately modeled by a stream of parallel stages. The relative activation of the parallel processing stages can be modulated by attentional or task demands.
MacGregor, Lucy J; Corley, Martin; Donaldson, David I
Silent pauses are a common form of disfluency in speech yet little attention has been paid to them in the psycholinguistic literature. The present paper investigates the consequences of such silences for listeners, using an Event-Related Potential (ERP) paradigm. Participants heard utterances ending in predictable or unpredictable words, some of which included a disfluent silence before the target. In common with previous findings using er disfluencies, the N400 difference between predictable and unpredictable words was attenuated for the utterances that included silent pauses, suggesting a reduction in the relative processing benefit for predictable words. An earlier relative negativity, topographically distinct from the N400 effect and identifiable as a Phonological Mismatch Negativity (PMN), was found for fluent utterances only. This suggests that only in the fluent condition did participants perceive the phonology of unpredictable words to mismatch with their expectations. By contrast, for disfluent utterances only, unpredictable words gave rise to a late left frontal positivity, an effect previously observed following ers and disfluent repetitions. We suggest that this effect reflects the engagement of working memory processes that occurs when fluent speech is resumed. Using a surprise recognition memory test, we also show that listeners were more likely to recognise words which had been encountered after silent pauses, demonstrating that silence affects not only the process of language comprehension but also its eventual outcome. We argue that, from a listener's perspective, one critical feature of disfluency is the temporal delay which it adds to the speech signal. Copyright © 2010 Elsevier Ltd. All rights reserved.
Yang, Wonyoung; Hodgson, Murray
Speech-intelligibility tests auralized in a virtual classroom were used to investigate the optimal reverberation times for verbal communication for normal-hearing and hearing-impaired adults. The idealized classroom had simple geometry, uniform surface absorption, and an approximately diffuse sound field. It contained a speech source, a listener at a receiver position, and a noise source located at one of two positions. The relative output levels of the speech and noise sources were varied, along with the surface absorption and the corresponding reverberation time. The binaural impulse responses of the speech and noise sources in each classroom configuration were convolved with Modified Rhyme Test (MRT) and babble-noise signals. The resulting signals were presented to normal-hearing and hearing-impaired adult subjects to identify the configurations that gave the highest speech intelligibilities for the two groups. For both subject groups, when the speech source was closer to the listener than the noise source, the optimal reverberation time was zero. When the noise source was closer to the listener than the speech source, the optimal reverberation time included both zero and nonzero values. The results generally support previous theoretical results.
Plinkert, P K; Zenner, H P
Direct observations of the basilar membrane movements show that sound perception can no longer be regarded as a passive process: vulnerable, energy-consuming amplification processes are required in the cochlea. The outer hair cells (OHC) fulfil this demand morphologically and functionally. These sensory cells have a double role: they perceive sound and thus modulate the cochlear biomechanics through their motile activity. The key event of sound transduction is performed by the inner hair cells (IHC) after active sound amplification in the OHC. The control of the OHC is assured by the efferent olivocochlear fibres which release acetylcholine (ACh) and gamma-aminobutyric acid (GABA) into the synaptic cleft at the basal pole of the OHC. Nicotinergic acetylcholine and GABA receptors within the outer cell membrane of OHC were identified and characterised. The application of the neurotransmitter GABA to the basal pole of vital OHC leads to a reversible elongation of the cylindrical cell body while ACh induces a reversible, slow contraction of the sensory cells. These two neurotransmitters are supposed to counteract in the control of the cochlear amplifier. The reciprocal distribution of ACh and GABA receptors and their counteracting function (contraction vs elongation) has an additional impact on the modulation of OHC function. The result is an even more diversified control of the cochlear amplifier. The energy-consuming cochlear amplifications are reflected by an epiphenomenon, i.e. the otoacoustic emissions (OAE). These are emitted by the cochlea and can be divided into "spontaneous OAE", "transitory evoked OAE" (TEOAE), "stimulus frequency OAE" and "distortion product OAE". The TEOAE are now an integrated part of audiological diagnosis.(ABSTRACT TRUNCATED AT 250 WORDS)
Wang, David J; Trehub, Sandra E; Volkova, Anna; van Lieshout, Pascal
Cochlear implants have enabled many congenitally or prelingually deaf children to acquire their native language and communicate successfully on the basis of electrical rather than acoustic input. Nevertheless, degraded spectral input provided by the device reduces the ability to perceive emotion in speech. We compared the vocal imitations of 5- to 7-year-old deaf children who were highly successful bilateral implant users with those of a control sample of children who had normal hearing. First, the children imitated several happy and sad sentences produced by a child model. When adults in Experiment 1 rated the similarity of imitated to model utterances, ratings were significantly higher for the hearing children. Both hearing and deaf children produced poorer imitations of happy than sad utterances because of difficulty matching the greater pitch modulation of the happy versions. When adults in Experiment 2 rated electronically filtered versions of the utterances, which obscured the verbal content, ratings of happy and sad utterances were significantly differentiated for deaf as well as hearing children. The ratings of deaf children, however, were significantly less differentiated. Although deaf children's utterances exhibited culturally typical pitch modulation, their pitch modulation was reduced relative to that of hearing children. One practical implication is that therapeutic interventions for deaf children could expand their focus on suprasegmental aspects of speech perception and production, especially intonation patterns.
Watts, Christopher R.; Awan, Shaheen N.
Purpose: In this study, the authors evaluated the diagnostic value of spectral/cepstral measures to differentiate dysphonic from nondysphonic voices using sustained vowels and continuous speech samples. Methodology: Thirty-two age- and gender-matched individuals (16 participants with dysphonia and 16 controls) were recorded reading a standard…
This thesis investigates the way adults and children perceive speech. With adult listeners, the question was whether speech is perceived categorically (categorical speech perception). With children, the question was whether there are age-related differences between the weights assigned to
The aim of the study was twofold: to describe self-reported habits of ICT use in everyday life and to analyze feelings and behavior triggered by ICT and speech deprivation. The study was conducted on three randomly selected groups of students with different tasks: the Without Speaking (W/S) group (n = 10) spent a day without talking to anyone; the Without Technology (W/T) group (n = 13) spent a day without using any kind of ICT; and the third group was a control group (n = 10) with no restrictions. The participants’ task in all groups was to write a diary detailing their feelings, thoughts and behaviors related to their group’s conditions. Before the experiment, students reported their ICT-related habits. Right after groups were assigned, they reported their task-related impressions. During the experiment, participants wrote diary records at three time-points. All participants used ICT on a daily basis, and most were online all the time. Dominant ICT activities were communication with friends and family and studying, followed by listening to music and watching films. Speech deprivation was a more difficult task than ICT deprivation, resulting in more drop-outs and more negative emotions. However, participants in W/S expected the task to be difficult, and some of them actually reported positive experiences, although for others it was a very difficult, lonesome and terrifying experience. About half of the students in W/T claimed that the task was more difficult than they had expected, and some of them realized that they are dysfunctional without technology, and probably addicted to it.
The main message of this paper is that the sound management of public finances is one of the key factors of sustainable development, and that the modern concept of public financial management sets much higher requirements and standards than the classical approach limited to state budgeting. This paper presents the basic aspects of good and effective management of public finances, which is of special importance for Serbia and the countries of Southeastern Europe. Serbia has made significant progress in public financial management relative to the situation fifteen years ago, but much remains to be done, and the greatest challenges now center on public debt management, public investment management and fiscal risk management.
Peter, Beate; Matsushita, Mark; Oda, Kaori; Raskind, Wendy
In 10 cases of 2p15p16.1 microdeletions reported worldwide to date, shared phenotypes included growth retardation, craniofacial and skeletal dysmorphic traits, internal organ defects, intellectual disability, nonverbal or low verbal status, abnormal muscle tone, and gross motor delays. The size of the deletions ranged from 0.3 to 5.7 Mb, where the smallest deletion involved the BCL11A, PAPOLG, and REL genes. Here we report on an 11-year-old male with a heterozygous de novo 0.2 Mb deletion containing a single gene, BCL11A, and a phenotype characterized by childhood apraxia of speech and dysarthria in the presence of general oral and gross motor dyspraxia and hypotonia as well as expressive language and mild intellectual delays. BCL11A is situated within the dyslexia susceptibility candidate region 3 (DYX3) candidate region on chromosome 2. The present case is the first to involve a single gene within the microdeletion region and a phenotype restricted to a subset of the traits observed in other cases with more extensive deletions. © 2014 Wiley Periodicals, Inc.
Peter, Beate; Matsushita, Mark; Raskind, Wendy H
The aim of this pilot study was to investigate a measure of motor sequencing deficit as a potential endophenotype of speech sound disorder (SSD) in a multigenerational family with evidence of familial SSD. In a multigenerational family with evidence of a familial motor-based SSD, affectation status and a measure of motor sequencing during oral motor testing were obtained. To further investigate the role of motor sequencing as an endophenotype for genetic studies, parametric and nonparametric linkage analyses were carried out using a genome-wide panel of 404 microsatellites. In seven of the 10 family members with available data, SSD affectation status and motor sequencing status coincided. Linkage analysis revealed four regions of interest, 6p21, 7q32, 7q36, and 8q24, primarily identified with the measure of motor sequencing ability. The 6p21 region overlaps with a locus implicated in rapid alternating naming in a recent genome-wide dyslexia linkage study. The 7q32 locus contains a locus implicated in dyslexia. The 7q36 locus borders on a gene known to affect the component traits of language impairment. The results are consistent with a motor-based endophenotype of SSD that would be informative for genetic studies. The linkage results in this first genome-wide study in a multigenerational family with SSD warrant follow-up in additional families and with fine mapping or next-generation approaches to gene identification.
Fraga González, Gorka; Žarić, Gojko; Tijms, Jurgen; Bonte, Milene; van der Molen, Maurits W.
A recent account of dyslexia assumes that a failure to develop automated letter-speech sound integration might be responsible for the observed lack of reading fluency. This study uses a pre-test-training-post-test design to evaluate the effects of a training program based on letter-speech sound associations with a special focus on gains in reading fluency. A sample of 44 children with dyslexia and 23 typical readers, aged 8 to 9, was recruited. Children with dyslexia were randomly allocated to either the training program group (n = 23) or a waiting-list control group (n = 21). The training intensively focused on letter-speech sound mapping and consisted of 34 individual sessions of 45 minutes over a five month period. The children with dyslexia showed substantial reading gains for the main word reading and spelling measures after training, improving at a faster rate than typical readers and waiting-list controls. The results are interpreted within the conceptual framework assuming a multisensory integration deficit as the most proximal cause of dysfluent reading in dyslexia. Trial Registration: ISRCTN register ISRCTN12783279 PMID:26629707
Apel, Kenn; Lawrence, Jessika
In this study, the authors compared the morphological awareness abilities of children with speech sound disorder (SSD) and children with typical speech skills and examined how morphological awareness ability predicted word-level reading and spelling performance above other known contributors to literacy development. Eighty-eight first-grade students--44 students with SSD and no known history of language deficiencies, and 44 students with typical speech and language skills--completed an assessment battery designed to measure speech sound production, morphological awareness, phonemic awareness, letter-name knowledge, receptive vocabulary, word-level reading, and spelling abilities. The children with SSD scored significantly lower than did their counterparts on the morphological awareness measures as well as on phonemic awareness, word-level reading, and spelling tasks. Regression analyses suggested that morphological awareness predicted significant unique variance on the spelling measure for both groups and on the word-level reading measure for the children with typical skills. These results suggest that children with SSD may present with a general linguistic awareness insufficiency, which puts them at risk for difficulties with literacy and literacy-related tasks.
Jost, Lea B; Eberhard-Moscicka, Aleksandra K; Pleisch, Georgette; Heusser, Veronica; Brandeis, Daniel; Zevin, Jason D; Maurer, Urs
Learning a foreign language in a natural immersion context with high exposure to the new language has been shown to change the way speech sounds of that language are processed at the neural level. It remains unclear, however, to what extent this is also the case for classroom-based foreign language learning, particularly in children. To this end, we presented a mismatch negativity (MMN) experiment during EEG recordings as part of a longitudinal developmental study: 38 monolingual (Swiss) German-speaking children (7.5 years) were tested shortly before they started to learn English at school and followed up one year later. Moreover, 22 (Swiss) German adults were recorded. Instead of the positive mismatch response originally found in children, an MMN emerged when a high-pass filter of 3 Hz was applied. The overlap of a slow-wave positivity with the MMN indicates that two concurrent mismatch processes were elicited in children. The children's MMN in response to the non-native speech contrast was smaller than that to the native speech contrast, irrespective of foreign language learning, suggesting that no additional neural resources were committed to processing the foreign language speech sound after one year of classroom-based learning. Copyright © 2015 Elsevier Ltd. All rights reserved.
Bonnard, Damien; Lautissier, Sylvie; Bosset-Audoit, Amélie; Coriat, Géraldine; Beraha, Max; Maunoury, Antoine; Martel, Jacques; Darrouzet, Vincent; Bébéar, Jean-Pierre; Dauman, René
An alternative to bilateral cochlear implantation is offered by the Neurelec Digisonic(®) SP Binaural cochlear implant, which allows stimulation of both cochleae within a single device. The purpose of this prospective study was to compare a group of Neurelec Digisonic(®) SP Binaural implant users (denoted BINAURAL group, n = 7) with a group of bilateral adult cochlear implant users (denoted BILATERAL group, n = 6) in terms of speech perception, sound localization, and self-assessment of health status and hearing disability. Speech perception was assessed using word recognition at 60 dB SPL in quiet and in a 'cocktail party' noise delivered through five loudspeakers in the hemi-sound field facing the patient (signal-to-noise ratio = +10 dB). The sound localization task was to determine the source of a sound stimulus among five speakers positioned between -90° and +90° from midline. Change in health status was assessed using the Glasgow Benefit Inventory and hearing disability was evaluated with the Abbreviated Profile of Hearing Aid Benefit. Speech perception was not statistically different between the two groups, even though there was a trend in favor of the BINAURAL group (mean percent word recognition in the BINAURAL and BILATERAL groups: 70 vs. 56.7% in quiet, 55.7 vs. 43.3% in noise). There was also no significant difference with regard to performance in sound localization and self-assessment of health status and hearing disability. On the basis of the BINAURAL group's performance in hearing tasks involving the detection of interaural differences, implantation with the Neurelec Digisonic(®) SP Binaural implant may be considered to restore effective binaural hearing. Based on these first comparative results, this device seems to provide benefits similar to those of traditional bilateral cochlear implantation, with a new approach to stimulate both auditory nerves. Copyright © 2013 S. Karger AG, Basel.
David, Marion; Lavandier, Mathieu; Grimault, Nicolas; Oxenham, Andrew J
Differences in spatial cues, including interaural time differences (ITDs), interaural level differences (ILDs) and spectral cues, can lead to stream segregation of alternating noise bursts. It is unknown how effective such cues are for streaming sounds with realistic spectro-temporal variations. In particular, it is not known whether the high-frequency spectral cues associated with elevation remain sufficiently robust under such conditions. To answer these questions, sequences of consonant-vowel tokens were generated and filtered by non-individualized head-related transfer functions to simulate the cues associated with different positions in the horizontal and median planes. A discrimination task showed that listeners could discriminate changes in interaural cues both when the stimulus remained constant and when it varied between presentations. However, discrimination of changes in spectral cues was much poorer in the presence of stimulus variability. A streaming task, based on the detection of repeated syllables in the presence of interfering syllables, revealed that listeners can use both interaural and spectral cues to segregate alternating syllable sequences, despite the large spectro-temporal differences between stimuli. However, only the full complement of spatial cues (ILDs, ITDs, and spectral cues) resulted in obligatory streaming in a task that encouraged listeners to integrate the tokens into a single stream.
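The two interaural cues manipulated above can be illustrated in isolation. The following is a minimal sketch of our own (it does not reproduce the study's non-individualized HRTF filtering, which additionally imposes the frequency-dependent spectral cues of elevation): it lateralizes a mono token by delaying one ear's copy (ITD) and attenuating it (ILD).

```python
import numpy as np

def apply_itd_ild(mono, itd_samples, ild_db):
    """Impose lateralization cues on a mono token: the right ear receives
    the signal itd_samples later (ITD) and attenuated by ild_db decibels
    (ILD), pushing the perceived image toward the left ear."""
    left = np.concatenate([mono, np.zeros(itd_samples)])
    right = np.concatenate([np.zeros(itd_samples), mono])
    right *= 10.0 ** (-ild_db / 20.0)  # convert dB to an amplitude ratio
    return np.stack([left, right])     # shape: (2, len(mono) + itd_samples)

# Example: a 10-sample ITD (about 0.6 ms at 16 kHz) and a 6 dB ILD
token = np.sin(2 * np.pi * 440 * np.arange(1600) / 16000)
stereo = apply_itd_ild(token, itd_samples=10, ild_db=6.0)
```

Real head-related transfer functions combine both cues with spectral filtering in a frequency-dependent way; this sketch only isolates the two broadband interaural cues.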
The aim of the present study was to evaluate MED-EL's Fine Structure Processing (FSP) strategy in comparison with their variations of the standard Continuous Interleaved Sampling (CIS) strategy, denoted CIS+ and High Definition CIS (HDCIS). Twenty experienced adult CI users participated in the study in connection with upgrading to a new speech processor and at a two-year follow-up. Blinded paired comparisons between FSP and HDCIS were performed for speech intelligibility and music sound quality. Standard speech recognition tests in quiet and in noise were also administered to monitor the participants' actual performance and to evaluate long-term outcomes. Overall, the paired-comparison results showed no significant differences between the strategies; however, the total numbers of significant individual preferences were 11 FSP vs. 12 HDCIS for speech, and 4 FSP vs. 15 HDCIS for music. The average speech recognition score decreased significantly after one month with FSP, but after two years there was no significant difference compared to the initial results with CIS+. Owing to the large individual differences in subjective preference, and the fact that the FSP strategy was not superior to the CIS variations, recipients should be given the opportunity of choosing between the strategies.
Background: Children with phonological difficulties currently spend a long time on waiting lists for speech and language therapy. Research has shown that early intervention is imperative for this population in improving their speech (Broomfield and Dodd 2005). This situation has motivated clinicians to find an alternative method of service delivery. A home program that requires little input from clinicians, with parents as the agents of therapy, would provide immediate t...
Liu, C; Wheeler, B C; O'Brien, W D; Lansing, C R; Bilger, R C; Jones, D L; Feng, A S
This paper describes algorithms for signal extraction for use as a front-end of telecommunication devices, speech recognition systems, as well as hearing aids that operate in noisy environments. The development was based on some independent, hypothesized theories of the computational mechanics of biological systems in which directional hearing is enabled mainly by binaural processing of interaural directional cues. Our system uses two microphones as input devices and a signal processing method based on the two input channels. The signal processing procedure comprises two major stages: (i) source localization, and (ii) cancellation of noise sources based on knowledge of the locations of all sound sources. The source localization, detailed in our previous paper [Liu et al., J. Acoust. Soc. Am. 108, 1888 (2000)], was based on a well-recognized biological architecture comprising a dual delay-line and a coincidence detection mechanism. This paper focuses on description of the noise cancellation stage. We designed a simple subtraction method which, when strategically employed over the dual delay-line structure in the broadband manner, can effectively cancel multiple interfering sound sources and consequently enhance the desired signal. We obtained an 8-10 dB enhancement for the desired speech in the situations of four talkers in the anechoic acoustic test (or 7-10 dB enhancement in the situations of six talkers in the computer simulation) when all the sounds were equally intense and temporally aligned.
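The subtraction stage can be illustrated with a stripped-down, two-source version of the idea (a sketch of our own; the paper's system applies the subtraction strategically across a full dual delay-line and in sub-bands). If the target reaches both microphones simultaneously while the interferer reaches one microphone d samples later, delaying the first channel by d aligns the interferer across channels so that it cancels on subtraction:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16000
target = rng.standard_normal(n)      # desired talker, on-axis: zero interchannel delay
interferer = rng.standard_normal(n)  # off-axis talker
d = 5                                # interferer's interchannel delay, in samples

# Two-microphone mixture (circular delay keeps the algebra exact in this toy)
left = target + interferer
right = target + np.roll(interferer, d)

# Delay-and-subtract: align the interferer across channels, then subtract.
# The interferer cancels exactly; the target survives as a comb-filtered copy.
output = right - np.roll(left, d)
```

The cancellation leaves the target as `target - np.roll(target, d)`, i.e. comb-filtered rather than untouched; applying the same subtraction over a dual delay-line in the broadband manner, as the paper describes, is what allows multiple interferers to be nulled while the desired speech is enhanced.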
Van der Torn, M; Van Gogh, CDL; Verdonck-de Leeuw, IMD; Festen, JM; Verkerke, GJ; Mahieu, HF
Background. A pneumatic artificial sound source incorporated in a regular tracheoesophageal shunt valve may improve alaryngeal voice quality. Methods. In 20 laryngectomees, categorized by sex and pharyngoesophageal segment tonicity, a prototype sound-producing voice prosthesis (SPVP) was evaluated.
Grogan-Johnson, Sue; Schmidt, Anna Marie; Schenker, Jason; Alvares, Robin; Rowan, Lynne E.; Taylor, Jacquelyn
Telepractice has the potential to provide greater access to speech-language intervention services for children with communication impairments. Substantiation of this delivery model is necessary for telepractice to become an accepted alternative delivery model. This study investigated the progress made by school-age children with speech sound…
Vickers, Deborah A; Backus, Bradford C; Macdonald, Nora K; Rostamzadeh, Niloofar K; Mason, Nisha K; Pandya, Roshni; Marriage, Josephine E; Mahon, Merle H
The assessment of the combined effect of classroom acoustics and sound field amplification (SFA) on children's speech perception within the "live" classroom poses a challenge to researchers. The goals of this study were to determine: (1) Whether personal response system (PRS) hand-held voting cards, together with a closed-set speech perception test (Chear Auditory Perception Test [CAPT]), provide an appropriate method for evaluating speech perception in the classroom; (2) Whether SFA provides better access to the teacher's speech than without SFA for children, taking into account vocabulary age, middle ear dysfunction or ear-canal wax, and home language. Forty-four children from two school-year groups, year 2 (aged 6 years 11 months to 7 years 10 months) and year 3 (aged 7 years 11 months to 8 years 10 months) were tested in two classrooms, using a shortened version of the four-alternative consonant discrimination section of the CAPT. All children used a PRS to register their chosen response, which they selected from four options displayed on the interactive whiteboard. The classrooms were located in a 19th-century school in central London, United Kingdom. Each child sat at their usual position in the room while target speech stimuli were presented either in quiet or in noise. The target speech was presented from the front of the classroom at 65 dBA (calibrated at 1 m) and the presented noise level was 46 dBA measured at the center of the classroom. The older children had an additional noise condition with a noise level of 52 dBA. All conditions were presented twice, once with SFA and once without SFA and the order of testing was randomized. White noise from the teacher's right-hand side of the classroom and International Speech Test Signal from the teacher's left-hand side were used, and the noises were matched at the center point of the classroom (10sec averaging [A-weighted]). Each child's expressive vocabulary age and middle ear status were measured
Raschle, Nora Maria; Smith, Sara Ashley; Zuk, Jennifer; Dauvermann, Maria Regina; Figuccio, Michael Joseph; Gaab, Nadine
.... However, due to technical and practical challenges when neuroimaging young children, evidence of neural correlates of speech and/or voice processing in toddlers and young children remains scarce...
Murphy, Cristina F B; Pagan-Neves, Luciana O; Wertzner, Haydée F; Schochat, Eliane
This study aimed to compare the effects of a non-linguistic auditory intervention approach with a phonological intervention approach on the phonological skills of children with speech sound disorder (SSD). A total of 17 children, aged 7-12 years, with SSD were randomly allocated to either the non-linguistic auditory temporal intervention group (n = 10, average age 7.7 ± 1.2) or phonological intervention group (n = 7, average age 8.6 ± 1.2). The intervention outcomes included auditory-sensory measures (auditory temporal processing skills) and cognitive measures (attention, short-term memory, speech production, and phonological awareness skills). The auditory approach focused on non-linguistic auditory training (e.g., backward masking and frequency discrimination), whereas the phonological approach focused on speech sound training (e.g., phonological organization and awareness). Both interventions consisted of 12 45-min sessions delivered twice per week, for a total of 9 h. Intra-group analysis demonstrated that the auditory intervention group showed significant gains in both auditory and cognitive measures, whereas no significant gain was observed in the phonological intervention group. No significant improvement on phonological skills was observed in any of the groups. Inter-group analysis demonstrated significant differences between the improvement following training for both groups, with a more pronounced gain for the non-linguistic auditory temporal intervention in one of the visual attention measures and both auditory measures. Therefore, both analyses suggest that although the non-linguistic auditory intervention approach appeared to be the most effective intervention approach, it was not sufficient to promote the enhancement of phonological skills.
Although there are no standards or guidelines for sound field testing, it is recognized that such testing is an integral part of audiologic evaluation. This paper has reviewed some of the problems in sound field testing, as well as possible solutions to those problems. This review may be summarized as follows: 1. The environment in which sound field testing is conducted is an integral part of the test procedure; thus, the ambient noise and reverberation characteristics of the test room must be known. The test room must have ambient noise levels below the level at which the test signals will occur. 2. The listener must be seated so that the SPL of the test signal is known at that listener's pinna. Thus, care must be taken to exclude anything between the ear of the listener and the loudspeaker, and the height of the loudspeaker must be appropriate for the listener. (Note: If the loudspeakers are raised or lowered, it may be necessary to recalibrate.) The near/far field and direct/reverberant field boundaries should be identified and the listener positioned between those two boundaries. 3. The acoustic properties of the test signal must be defined clearly. An FM signal is best for assessing threshold of hearing. The examiner should measure the SPL and verify the spectral characteristics of the signal. The frequency of calibration measurements should be identical to that used for earphones, generally once every 3 months. Finally, it is important to understand the potential interaction between the test environment, the signal, and the listener when testing in the sound field. If the problems are understood and compensations are made it should be possible to obtain reliable and useful auditory information in the sound field.
Mandelli, Maria Luisa; Caverzasi, Eduardo; Binney, Richard J; Henry, Maya L; Lobach, Iryna; Block, Nikolas; Amirbekian, Bagrat; Dronkers, Nina; Miller, Bruce L; Henry, Roland G; Gorno-Tempini, Maria Luisa
In primary progressive aphasia (PPA), speech and language difficulties are caused by neurodegeneration of specific brain networks. In the nonfluent/agrammatic variant (nfvPPA), motor speech and grammatical deficits are associated with atrophy in a left fronto-insular-striatal network previously implicated in speech production. In vivo dissection of the crossing white matter (WM) tracts within this "speech production network" is complex and has rarely been performed in health or in PPA. We hypothesized that damage to these tracts would be specific to nfvPPA and would correlate with differential aspects of the patients' fluency abilities. We prospectively studied 25 PPA and 21 healthy individuals who underwent extensive cognitive testing and 3 T MRI. Using residual bootstrap Q-ball probabilistic tractography on high angular resolution diffusion-weighted imaging (HARDI), we reconstructed pathways connecting posterior inferior frontal, inferior premotor, insula, supplementary motor area (SMA) complex, striatum, and standard ventral and dorsal language pathways. We extracted tract-specific diffusion tensor imaging (DTI) metrics to assess changes across PPA variants and perform brain-behavioral correlations. Significant WM changes in the left intrafrontal and frontostriatal pathways were found in nfvPPA, but not in the semantic or logopenic variants. Correlations between tract-specific DTI metrics with cognitive scores confirmed the specific involvement of this anterior-dorsal network in fluency and suggested a preferential role of a posterior premotor-SMA pathway in motor speech. This study shows that left WM pathways connecting the speech production network are selectively damaged in nfvPPA and suggests that different tracts within this system are involved in subcomponents of fluency. These findings emphasize the emerging role of diffusion imaging in the differential diagnosis of neurodegenerative diseases. Copyright © 2014 the authors.
This study examines the relationship between different types of language learning aptitude (measured via the LLAMA test) and adult second language (L2) learners' attainment in speech production in English-as-a-foreign-language (EFL) classrooms. Picture descriptions elicited from 50 Japanese EFL learners from varied proficiency levels were analyzed…
Paradise, J L; Dollaghan, C A; Campbell, T F; Feldman, H M; Bernard, B S; Colborn, D K; Rockette, H E; Janosky, J E; Pitcairn, D L; Sabo, D L; Kurs-Lasky, M; Smith, C G
As part of a prospective study of possible effects of early-life otitis media on speech, language, cognitive, and psychosocial development, we tested relationships between children's cumulative duration of middle ear effusion (MEE) in their first 3 years of life and their scores on measures of language, speech sound production, and cognition at 3 years of age. We enrolled 6350 healthy infants by 2 months of age who presented for primary care at 1 of 2 urban hospitals or 1 of 2 small town/rural and 4 suburban private pediatric practices. We intensively monitored the children's middle ear status by pneumatic otoscopy, supplemented by tympanometry, throughout their first 3 years of life; we monitored the validity of the otoscopic observations on an ongoing basis; and we treated children for otitis media according to specified guidelines. Children who met specified minimum criteria regarding the persistence of MEE became eligible for a clinical trial in which they were assigned randomly to undergo tympanostomy tube placement either promptly or after a defined extended period if MEE remained present. From among those remaining, we selected randomly, within sociodemographic strata, a sample of 241 children who represented a spectrum of MEE experience from having no MEE to having MEE whose cumulative duration fell just short of meeting randomization criteria. In subjects so selected, the estimated duration of MEE ranged from none to 65.6% of the first year of life and 44.8% of the first 3 years of life. In these 241 children we assessed language development, speech sound production, and cognition at 3 years of age, using both formal tests and conversational samples. We found weak to moderate, statistically significant negative correlations between children's cumulative durations of MEE in their first year of life or in age periods that included their first year of life, and their scores on formal tests of receptive vocabulary and verbal aspects of cognition at 3 years of
Christmann, Corinna A; Berti, Stefan; Steinbrink, Claudia; Lachmann, Thomas
We compared processing of speech and non-speech by means of the mismatch negativity (MMN). For this purpose, the MMN elicited by vowels was compared to those elicited by two non-speech stimulus types: spectrally rotated vowels, having the same stimulus complexity as the speech stimuli, and sounds based on the bands of formants of the vowels, representing non-speech stimuli of lower complexity as compared to the other stimulus types. This design allows controlling for effects of stimulus complexity when comparing neural correlates of processing speech to non-speech. Deviants within a modified multi-feature design differed either in duration or spectral property. Moreover, the difficulty to discriminate between the standard and the two deviants was controlled for each stimulus type by means of an additional active discrimination task. Vowels elicited a larger MMN compared to both non-speech stimulus types, supporting the concept of language-specific phoneme representations and the role of the participants' prior experience. Copyright © 2014 Elsevier Inc. All rights reserved.
McNeill, Brigid C; Wolter, Julie; Gillon, Gail T
This study explored the specific nature of a spelling impairment in children with speech sound disorder (SSD) in relation to metalinguistic predictors of spelling development. The metalinguistic (phoneme, morphological, and orthographic awareness) and spelling development of 28 children ages 6-8 years with a history of inconsistent SSD were compared to those of their age-matched (n = 28) and reading-matched (n = 28) peers. Analysis of the literacy outcomes of children within the cohort with persistent (n = 18) versus resolved (n = 10) SSD was also conducted. The age-matched peers outperformed the SSD group on all measures. Children with SSD performed comparably to their reading-matched peers on metalinguistic measures but exhibited lower spelling scores. Children with persistent SSD generally had less favorable outcomes than children with resolved SSD; however, even children with resolved SSD performed poorly on normative spelling measures. Children with SSD have a specific difficulty with spelling that is not commensurate with their metalinguistic and reading ability. Although low metalinguistic awareness appears to inhibit these children's spelling development, other factors should be considered, such as nonverbal rehearsal during spelling attempts and motoric ability. Integration of speech-production and spelling-intervention goals is important to enhance literacy outcomes for this group.
McLeod, Sharynne; Daniel, Graham; Barr, Jacqueline
Children interact with people in context: including home, school, and in the community. Understanding children's relationships within context is important for supporting children's development. Using child-friendly methodologies, the purpose of this research was to understand the lives of children with speech sound disorder (SSD) in context. Thirty-four interviews were undertaken with six school-aged children identified with SSD, and their siblings, friends, parents, grandparents, and teachers. Interview transcripts, questionnaires, and children's drawings were analyzed to reveal that these children experienced the world in context dependent ways (private vs. public worlds). Family and close friends typically provided a safe, supportive environment where children could be themselves and participate in typical childhoods. In contrast, when out of these familiar contexts, the children often were frustrated, embarrassed, and withdrawn, their relationships changed, and they were unable to get their message across in public contexts. Speech-language pathology assessment and intervention could be enhanced by interweaving the valuable insights of children, siblings, friends, parents, teachers, and other adults within children's worlds to more effectively support these children in context. 1. Recognize that children with SSD experience the world in different ways, depending on whether they are in private or public contexts. 2. Describe the changes in the roles of family and friends when children with SSD are in public contexts. 3. Discover the position of the child as central in Bronfenbrenner’s bioecological model. 4. Identify principles of child-friendly research. 5. Recognize the importance of considering the child in context during speech-language pathology assessment and intervention. Crown Copyright © 2012. Published by Elsevier Inc. All rights reserved.
Kaganov, A Sh; Kir'yanov, P A
The objective of the present publication was to discuss the possibility of applying cybernetic modeling methods to overcome the apparent discrepancy between two kinds of speech records: the initial ones (e.g., obtained in the course of special investigative activities) and the voice prints obtained from the persons subjected to criminalistic examination. The paper is based on literature sources and the materials of original criminalistic examinations performed by the authors.
McAloone, Timothy Charles
… These two schools of environmental research practice are mirrored in the way in which industry approaches environmental problems. Since the definition in 1987 of Sustainable Development, efforts have been made to relate the goals and ideals of sustainability to the domain of product development, thus adding new dimensions, such as social and moral values, to the original agenda of environmental improvement. The redefinition of the role of the product developer, from environmentally conscious product developer to sustainably aware product developer, has led to new insights into the way in which products are developed and used, and to where environmental effects occur in the lifetime of a product. The role of the product developer is thus more complex in relation to sustainability, as the focus for improvement of a product may not (and very often does not) lie in the physical artefactual…
Dr. Chokri Smaoui
Fossilization is a universal phenomenon that has attracted the attention of teachers and researchers alike. In this regard, the aim of this study is to investigate a supposedly fossilized feature in Tunisian learners' performance, namely the pronunciation of the /3:/ sound among Intermediate Tunisian English Students (ITES). It tries to show whether ITES pronounce it correctly or whether it is often replaced by another phoneme. The study also tries to show the reasons behind fossilization. It is conjectured that L1 interference, lack of exposure to L2 input, and the absence of pronunciation teaching methods are the main factors behind this fossilized pronunciation. Finally, the study tries to apply the audio-articulation method to remedy this type of fossilization. This method contains many drills that can help learners articulate better and consequently produce more intelligible sounds.
Laaksonen, Juha-Pertti; Rieger, Jana; Happonen, Risto-Pekka; Harris, Jeffrey; Seikaly, Hadi
The purpose of this study was to use acoustic analyses to describe speech outcomes over the course of 1 year after radial forearm free flap (RFFF) reconstruction of the tongue. Eighteen Canadian English-speaking females and males with reconstruction for oral cancer had speech samples recorded (pre-operative, and 1 month, 6 months, and 1 year post-operative). Acoustic characteristics of formants (F1, F2), fundamental frequency (F0), and duration of 699 vowel and diphthong tokens were analysed. Furthermore, the changes in size of the vowel space area were studied, as well as the effects of radiation therapy (RT) and inclusion of the floor of the mouth (FOM) in the reconstruction. RFFF reconstruction was found to affect several characteristics in males, and a minimal number of variables in females. General signs of reduced ability to articulate were not observed. RT and FOM had no differing effects compared to non-RT or non-FOM. There were individual differences between patients.
Reed, Phil; Howell, Peter; Sackin, Stevie; Pizzimenti, Lisa; Rosen, Stuart
The voiceless affricate/fricative contrast has played an important role in developing auditory theories of speech perception. This type of theory draws some of its support from experimental data on animals. However, nothing is known about differential responding of affricate/fricative continua by animals. In the current study, the ability of hooded rats to "label" an affricate/fricative continuum was tested. Transfer (without retraining) to analogous nonspeech continua was also tested. The nonspeech continua were chosen so that if transfer occurred, it would indicate whether the animals had learned to use rise time or duration cues to differentiate affricates from fricatives. The data from 9 of 10 rats indicated that rats can discriminate between these cues and do so in a similar manner to human subjects. The data from 9 of 10 rats also demonstrated that the rise time of the stimulus was the basis of the discrimination; the remaining rat appeared to use duration. PMID:14674729
Cummins, Nicholas; Schmitt, Maximilian; Amiriparian, Shahin; Krajewski, Jarek; Schuller, Bjorn
A combination of passive, non-invasive and non-intrusive smart monitoring technologies is currently transforming healthcare. These technologies will soon be able to provide immediate health-related feedback for a range of illnesses and conditions. Such tools would be game-changing for serious public health concerns, such as seasonal cold and flu, for which early diagnosis and social isolation play a key role in reducing the spread. In this regard, this paper explores, for the first time, the automated classification of individuals with Upper Respiratory Tract Infections (URTI) using recorded speech samples. Key results indicate that our classifiers achieve results similar to those seen in related health-based detection tasks, demonstrating the promise of computational paralinguistic analysis for the detection of URTI-related illnesses.
Takagi, M; Takahashi, M; Narita, E; Shimooka, S
Growing children communicate by language. If rapidly growing children lose their deciduous teeth very early in life, their language and pronunciation may be seriously affected. The authors conducted a series of tests to determine how the voice changes when deciduous teeth are extracted. The results may be summarized as follows. 1. There was no significant difference in the formants of vowels on "a-gyo", the first line of the Japanese syllabary, but an appreciable difference was recognized between children (four to six years old) with missing anterior teeth and those with missing posterior teeth when consonants preceded and followed these vowels. 2. Among the vowel formants, there was a marked difference in the vowel "i" when children were fitted with no appliance. 3. The strength of voice components in each frequency range was compared between children with missing anterior teeth and those with missing posterior teeth. It differed widely in "o ka a sa n" (which means "Mother") and "a-gyo" when children were fitted with no appliance. These findings indicate that the pronunciation of "o ka a sa n" and "a-gyo" can be recovered to some extent if the children are provided with an appliance. However, sound analysis indicates that their ability to pronounce sounds on "ka-gyo", "sa-gyo" and "ta-gyo" (the second, third and fourth lines of the Japanese syllabary) can hardly be fully recovered.
Background: Auditory sustained responses have recently been suggested to reflect neural processing of speech sounds in the auditory cortex. As periodic fluctuations below the pitch range are important for speech perception, it is necessary to investigate how low-frequency periodic sounds are processed in the human auditory cortex. Auditory sustained responses have been shown to be sensitive to temporal regularity, but the relationship between the amplitudes of auditory evoked sustained responses and the repetition rates of auditory inputs remains elusive. As the temporal and spectral features of sounds enhance different components of sustained responses, previous studies with click trains and vowel stimuli presented diverging results. In order to investigate the effect of repetition rate on cortical responses, we analyzed the auditory sustained fields evoked by periodic and aperiodic noises using magnetoencephalography. Results: Sustained fields were elicited by white noise and repeating frozen noise stimuli with repetition rates of 5, 10, 50, 200 and 500 Hz. The sustained field amplitudes were significantly larger for all the periodic stimuli than for white noise. Although the sustained field amplitudes showed a rising and falling pattern within the repetition rate range, the response amplitudes to the 5 Hz repetition rate were significantly larger than to 500 Hz. Conclusions: The enhanced sustained field responses to periodic noises show that cortical sensitivity to periodic sounds is maintained for a wide range of repetition rates. Persistence of periodicity sensitivity below the pitch range suggests that, in addition to processing the fundamental frequency of voice, sustained field generators can also resolve low-frequency temporal modulations in the speech envelope.
Van Wijk, A.J.M.
According to the author, there is no energy crisis; nor is there an energy shortage. Three observations illustrate this proposition: (1) We are wasting about 98% of our energy; (2) In one hour, the earth receives more energy from the sun than we consume worldwide in one year; and (3) sustainable energy is all around us. Next, the observations are elaborated and a plan is launched to set up a Green Campus: a living lab, an inspiring place where businesses and university can meet and a place where everyone can get an impression of the energy systems of the future. This way the author is hoping to take a next, important step in the realization of his dream, which is a sustainable energy system for all.
Gorka Fraga González
We use a neurocognitive perspective to discuss the contribution of learning letter-speech sound (L-SS) associations and visual specialization in the initial phases of reading in dyslexic children. We review findings from associative learning studies on related cognitive skills important for establishing and consolidating L-SS associations. Then we review brain potential studies, including our own, that yielded two markers associated with reading fluency. Here we show that the marker related to visual specialization (N170) predicts word and pseudoword reading fluency in children who received additional practice in the processing of morphological word structure. Conversely, L-SS integration (indexed by mismatch negativity, MMN) may only remain important when direct orthography-to-semantic conversion is not possible, such as in pseudoword reading. In addition, the correlation between these two markers supports the notion that multisensory integration facilitates visual specialization. Finally, we review the role of implicit learning and executive functions in audiovisual learning in dyslexia. Implications for remedial research are discussed and suggestions for future studies are presented.
Many disorders can affect our ability to speak and communicate. They range from saying sounds incorrectly to being completely unable to speak or understand speech. Causes include hearing disorders and deafness, and voice problems such as dysphonia or ...
We used event-related brain potentials (ERPs) to study effects of selective attention on the processing of attended and unattended spoken syllables and letters. Participants were presented with syllables randomly occurring in the left or right ear and spoken by different voices, and with a concurrent foveal stream of consonant letters written in darker or lighter fonts. During auditory phonological and non-phonological tasks, they responded to syllables in a designated ear starting with a vowel and spoken by female voices, respectively. These syllables occurred infrequently among standard syllables starting with a consonant and spoken by male voices. During visual phonological and non-phonological tasks, they responded to consonant letters with names starting with a vowel and to letters written in dark fonts, respectively. These letters occurred infrequently among standard letters with names starting with a consonant and written in light fonts. To examine genuine effects of attention and task on ERPs not overlapped by ERPs associated with target processing or deviance detection, these effects were studied only in ERPs to auditory and visual standards. During selective listening to syllables in a designated ear, ERPs to the attended syllables were negatively displaced during both phonological and non-phonological auditory tasks. Selective attention to letters elicited an early negative displacement and a subsequent positive displacement of ERPs to attended letters, the latter being larger during the visual phonological than non-phonological task, suggesting a higher demand for attention during the visual phonological task. Active suppression of unattended speech during the auditory phonological and non-phonological tasks and during the visual phonological task was suggested by a rejection positivity to unattended syllables. We also found evidence for suppression of the processing of task-irrelevant visual stimuli in visual ERPs during auditory tasks involving
In this speech the author makes two main recommendations. The first message concerns the necessity of full mobilization in favor of sustainable development, from government policy and enterprise management to individual behavior. He then presents three main axes for mobilizing enterprises (reinforcing information on the environmental and social impact of economic activities, developing sustainable investments, and developing environmental sponsorship). The second message concerns the necessity of placing the environment within economic growth through the development of ecology and eco-technology. (A.L.B.)
Jerger, Susan; Damian, Markus F.; McAlpine, Rachel P.; Abdi, Herve
To communicate, children must discriminate and identify speech sounds. Because visual speech plays an important role in this process, we explored how visual speech influences phoneme discrimination and identification by children. Critical items had intact visual speech (e.g. baez) coupled to non-intact (excised onsets) auditory speech (signified…
Treille, Avril; Vilain, Coriandre; Sato, Marc
Recent magneto-encephalographic and electro-encephalographic studies provide evidence for cross-modal integration during audio-visual and audio-haptic speech perception, with speech gestures viewed or felt from manual tactile contact with the speaker's face. Given the temporal precedence of the haptic and visual signals on the acoustic signal in these studies, the observed modulation of N1/P2 auditory evoked responses during bimodal compared to unimodal speech perception suggests that relevant and predictive visual and haptic cues may facilitate auditory speech processing. To further investigate this hypothesis, auditory evoked potentials were compared during auditory-only, audio-visual and audio-haptic speech perception in live dyadic interactions between a listener and a speaker. In line with previous studies, auditory evoked potentials were attenuated and speeded up during both audio-haptic and audio-visual compared to auditory-only speech perception. Importantly, the observed latency and amplitude reduction did not significantly depend on the degree of visual and haptic recognition of the speech targets. Altogether, these results further demonstrate cross-modal interactions between the auditory, visual and haptic speech signals. Although they do not contradict the hypothesis that visual and haptic sensory inputs convey predictive information with respect to the incoming auditory speech input, these results suggest that, at least in live conversational interactions, systematic conclusions on sensory predictability in bimodal speech integration have to be taken with caution, with the extraction of predictive cues likely depending on the variability of the speech stimuli.
Rumbach, Anna F; Rose, Tanya A; Cheah, Mynn
To explore Australian speech-language pathologists' use of non-speech oral motor exercises, and rationales for using/not using non-speech oral motor exercises in clinical practice. A total of 124 speech-language pathologists practising in Australia, working with paediatric and/or adult clients with speech sound difficulties, completed an online survey. The majority of speech-language pathologists reported that they did not use non-speech oral motor exercises when working with paediatric or adult clients with speech sound difficulties. However, more than half of the speech-language pathologists working with adult clients who have dysarthria reported using non-speech oral motor exercises with this population. The most frequently reported rationale for using non-speech oral motor exercises in speech sound difficulty management was to improve awareness/placement of articulators. The majority of speech-language pathologists agreed there is no clear clinical or research evidence base to support non-speech oral motor exercise use with clients who have speech sound difficulties. This study provides an overview of Australian speech-language pathologists' reported use and perceptions of non-speech oral motor exercises' applicability and efficacy in treating paediatric and adult clients who have speech sound difficulties. The research findings provide speech-language pathologists with insight into how and why non-speech oral motor exercises are currently used, and add to the knowledge base regarding Australian speech-language pathology practice of non-speech oral motor exercises in the treatment of speech sound difficulties. Implications for Rehabilitation Non-speech oral motor exercises refer to oral motor activities which do not involve speech, but involve the manipulation or stimulation of oral structures including the lips, tongue, jaw, and soft palate. Non-speech oral motor exercises are intended to improve the function (e.g., movement, strength) of oral structures.
Vouloumanos, Athena; Gelfand, Hanna M.
The ability to decode atypical and degraded speech signals as intelligible is a hallmark of speech perception. Human adults can perceive sounds as speech even when they are generated by a variety of nonhuman sources including computers and parrots. We examined how infants perceive the speech-like vocalizations of a parrot. Further, we examined how…
Comparison of speech discrimination in noise and directional hearing with 2 different sound processors of a bone-anchored hearing system in adults with unilateral severe or profound sensorineural hearing loss.
Wesarg, Thomas; Aschendorff, Antje; Laszig, Roland; Beck, Rainer; Schild, Christian; Hassepass, Frederike; Kroeger, Stefanie; Hocke, Thomas; Arndt, Susan
To evaluate and compare the benefit of a bone-anchored hearing implant with 2 different sound processors in adult patients with unilateral severe to profound sensorineural hearing loss (UHL). Prospective crossover design. Tertiary referral center. Eleven adults with UHL and normal hearing in the contralateral ear were assigned to 2 groups. All subjects were unilaterally implanted with a bone-anchored hearing implant and were initially fitted with 2 different sound processors (SP-1 and SP-2). SP-1 is a multichannel device equipped with an omnidirectional microphone and relatively simple digital signal-processing technology and provides a user-adjustable overall gain and tone control with compression limiting. SP-2 is a fully channel-by-channel programmable device, which can be set with nonlinear dynamic range compression or linear amplification. In addition, SP-2 features automatic noise management, an automatic multichannel directional microphone, microphone position compensation, and an implementation of prescription rules for different types of hearing losses, one of them unilateral deafness. After at least 1-month use of the initial processor, both groups were fitted with the alternative processor. Speech discrimination in noise and localization tests were performed at baseline visit before surgery, after at least 1-month use of the initial processor, and after at least 2-week use of the alternative processor. Relative to unaided baseline, SP-2 enabled significantly better overall speech discrimination results, whereas there was no overall improvement with SP-1. There was no difference in speech discrimination between SP-1 and SP-2 in all spatial settings. Sound localization was comparably poor at baseline and with both processors but significantly better than chance level for all 3 conditions. Patients with UHL have an overall objective benefit for speech discrimination in noise using a bone-anchored hearing implant with SP-2. In contrast, there is no overall
Civier, Oren; Tasko, Stephen M.; Guenther, Frank H.
This paper investigates the hypothesis that stuttering may result in part from impaired readout of feedforward control of speech, which forces persons who stutter (PWS) to produce speech with a motor strategy that is weighted too much toward auditory feedback control. Over-reliance on feedback control leads to production errors which, if they grow large enough, can cause the motor system to “reset” and repeat the current syllable. This hypothesis is investigated using computer simulations of a “neurally impaired” version of the DIVA model, a neural network model of speech acquisition and production. The model’s outputs are compared to published acoustic data from PWS’ fluent speech, and to combined acoustic and articulatory movement data collected from the dysfluent speech of one PWS. The simulations mimic the errors observed in the PWS subject’s speech, as well as the repairs of these errors. Additional simulations were able to account for enhancements of fluency gained by slowed/prolonged speech and masking noise. Together these results support the hypothesis that many dysfluencies in stuttering are due to a bias away from feedforward control and toward feedback control. PMID:20831971
Josephs, Keith A; Duffy, Joseph R; Strand, Edyth A; Whitwell, Jennifer L; Layton, Kenneth F; Parisi, Joseph E; Hauser, Mary F; Witte, Robert J; Boeve, Bradley F; Knopman, David S; Dickson, Dennis W; Jack, Jr, Clifford R; Petersen, Ronald C
Apraxia of speech (AOS) is a motor speech disorder characterized by slow speaking rate, abnormal prosody and distorted sound substitutions, additions, repetitions and prolongations, sometimes accompanied by groping...
Larsen, Ole Næsbye; Wahlberg, Magnus
There is no difference in principle between the infrasonic and ultrasonic sounds, which are inaudible to humans (or other animals), and the sounds that we can hear. In all cases, sound is a wave of pressure and particle oscillations propagating through an elastic medium, such as air. This chapter is about the physical laws that govern how animals produce sound signals and how physical principles determine the signals' frequency content and sound level, the nature of the sound field (sound pressure versus particle vibrations), as well as the directional properties of the emitted signal. Many of these properties are dictated by simple physical relationships between the size of the sound emitter and the wavelength of emitted sound. The wavelengths of the signals need to be sufficiently short in relation to the size of the emitter to allow for the efficient production of propagating sound pressure waves.
This dissertation addresses the problem of speech synthesis and speech production modelling based on the fundamental principles of human speech production. Unlike the conventional source-filter model, which assumes the independence of the excitation and the acoustic filter, we treat the entire vocal apparatus as one system consisting of a fluid dynamic aspect and a mechanical part. We model the vocal tract by a three-dimensional moving geometry. We also model the sound propagation inside the vocal apparatus as a three-dimensional nonplane-wave propagation inside a viscous fluid described by Navier-Stokes equations. In our work, we first propose a combined minimum energy and minimum jerk criterion to estimate the dynamic vocal tract movements during speech production. Both theoretical error bound analysis and experimental results show that this method can achieve very close match at the target points and avoid the abrupt change in articulatory trajectory at the same time. Second, a mechanical vocal fold model is used to compute the excitation signal of the vocal tract. The advantage of this model is that it is closely coupled with the vocal tract system based on fundamental aerodynamics. As a result, we can obtain an excitation signal with much more detail than the conventional parametric vocal fold excitation model. Furthermore, strong evidence of source-tract interaction is observed. Finally, we propose a computational model of the fricative and stop types of sounds based on the physical principles of speech production. The advantage of this model is that it uses an exogenous process to model the additional nonsteady and nonlinear effects due to the flow mode, which are ignored by the conventional source- filter speech production model. A recursive algorithm is used to estimate the model parameters. Experimental results show that this model is able to synthesize good quality fricative and stop types of sounds. Based on our dissertation work, we carefully argue
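As a point of reference for the minimum-jerk part of the combined criterion described above, the point-to-point minimum-jerk trajectory has a well-known closed form, the quintic profile below. This is an illustrative sketch only, not the dissertation's combined minimum-energy and minimum-jerk estimator:

```python
def min_jerk(x0, xf, t, T):
    """Minimum-jerk position at time t for a movement from x0 to xf
    of duration T. The quintic 10*tau**3 - 15*tau**4 + 6*tau**5 has
    zero velocity and acceleration at both endpoints."""
    tau = t / T
    s = 10 * tau**3 - 15 * tau**4 + 6 * tau**5
    return x0 + (xf - x0) * s

# The profile passes through the midpoint at mid-time:
print(min_jerk(0.0, 1.0, 0.5, 1.0))  # → 0.5
```

Sampling such a profile for each articulatory coordinate yields smooth target-to-target movements without abrupt changes in the trajectory, which is the behavior the estimator is designed to favor.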
Rabaglia, Cristina D; Maglio, Sam J; Krehm, Madelaine; Seok, Jin H; Trope, Yaacov
Human languages may be more than completely arbitrary symbolic systems. A growing literature supports sound symbolism, or the existence of consistent, intuitive relationships between speech sounds and specific concepts. Prior work establishes that these sound-to-meaning mappings can shape language-related judgments and decisions, but do their effects generalize beyond the merely linguistic and truly color how we navigate our environment? We examine this possibility, relating a predominant sound symbolic distinction (vowel frontness) to a novel associate (spatial proximity) in five studies. We show that changing one vowel in a label can influence estimations of distance, impacting judgment, perception, and action. The results (1) provide the first experimental support for a relationship between vowels and spatial distance and (2) demonstrate that sound-to-meaning mappings have outcomes that extend beyond language and can, through a single sound, influence how we perceive and behave toward objects in the world. Copyright © 2016 Elsevier B.V. All rights reserved.
Haydée Fiszbein Wertzner
PURPOSE: To describe articulatory indexes for the different types of speech errors and to verify the existence of a preferred error type in children with speech sound disorder, according to the presence or absence of a history of otitis media. METHODS: Participants in this prospective, cross-sectional study were 21 subjects aged between 5 years 2 months and 7 years 9 months with a diagnosis of speech sound disorder. Subjects were grouped according to history of otitis media: experimental group 1 (GE1) comprised 14 subjects with a history of otitis media, and experimental group 2 (GE2) comprised seven subjects without such a history. The number of speech errors (distortions, omissions and substitutions) and the articulatory indexes were calculated, and the data were submitted to statistical analysis. RESULTS: GE1 and GE2 differed in index performance when the two phonology tasks applied were compared. In all analyses, the indexes assessing substitutions indicated the error type most frequently committed by children with speech sound disorder. CONCLUSION: The indexes were effective in indicating substitution as the most frequent error in children with speech sound disorder. The higher occurrence of speech errors observed in picture naming among children with a history of otitis media indicates that such errors are possibly associated with difficulty in phonological representation caused by the transient hearing loss these children experienced.
Parker, Martin; Chapman, A.
A sound installation developed to accompany Anna Chapman's sound installation "Subjects for Melancholy Retrospection", presented at the Edinburgh Art Festival in August 2010 at the National Trust for Scotland's Newhailes estate.
Homae, Fumitaka; Watanabe, Hama; Taga, Gentaro
Infants often pay special attention to speech sounds, and they appear to detect key features of these sounds. To investigate the neural foundation of speech perception in infants, we measured cortical activation using near-infrared spectroscopy. We presented the following three types of auditory stimuli while 3-month-old infants watched a silent…
Alpern, Ramona Lenny
A speech therapist discusses an approach that combines articulation and language skills, beginning with sounds and progressing through multisensory activities to words, phrases, sentences, controlled conversation, and free-flowing conversation. The approach uses therapy based on speech therapy's historical foundations. (CL)
Christmann, Corinna A.; Lachmann, Thomas; Steinbrink, Claudia
Purpose: It is unknown whether phonological deficits are the primary cause of developmental dyslexia or whether they represent a secondary symptom resulting from impairments in processing basic acoustic parameters of speech. This might be due, in part, to methodological difficulties. Our aim was to overcome two of these difficulties: the…
Bhatara, Anjali; Boll-Avetisyan, Natalie; Agus, Trevor; Höhle, Barbara; Nazzi, Thierry
Language experience clearly affects the perception of speech, but little is known about whether these differences in perception extend to non-speech sounds. In this study, we investigated rhythmic perception of non-linguistic sounds in speakers of French and German using a grouping task, in which complexity (variability in sounds, presence of…
nasality accent by which all important speech sounds are characterized as nasal or non-nasal (4). In some languages, such as Hindi or Gujarati (5), some...so far by Professor Stevens and his group are Hindi, Gujarati, Bengali, Portuguese, English, and French. French listeners in the perceptual study
Douglas M Shiller
Auditory input is essential for normal speech development and plays a key role in speech production throughout the life span. In traditional models, auditory input plays two critical roles: (1) establishing the acoustic correlates of speech sounds that serve, in part, as the targets of speech production, and (2) serving as a source of feedback about a talker's own speech outcomes. This talk will focus on both of these roles, describing a series of studies that examine the capacity of children and adults to adapt to real-time manipulations of auditory feedback during speech production. In one study, we examined sensory and motor adaptation to a manipulation of auditory feedback during production of the fricative "s". In contrast to prior accounts, adaptive changes were observed not only in speech motor output but also in subjects' perception of the sound. In a second study, speech adaptation was examined following a period of auditory-perceptual training targeting the perception of vowels. The perceptual training was found to systematically improve subjects' motor adaptation response to altered auditory feedback during speech production. The results of both studies support the idea that perceptual and motor processes are tightly coupled in speech production learning, and that the degree and nature of this coupling may change with development.
Chang, Yen-Liang; Hung, Chao-Ho; Chen, Po-Yueh; Chen, Wei-Chang; Hung, Shih-Han
Acoustic analysis is often used in speech evaluation but seldom for the evaluation of oral prostheses designed for reconstruction of surgical defect. This study aimed to introduce the application of acoustic analysis for patients with velopharyngeal insufficiency (VPI) due to oral surgery and rehabilitated with oral speech-aid prostheses. The pre- and postprosthetic rehabilitation acoustic features of sustained vowel sounds from two patients with VPI were analyzed and compared with the acoustic analysis software Praat. There were significant differences in the octave spectrum of sustained vowel speech sound between the pre- and postprosthetic rehabilitation. Acoustic measurements of sustained vowels for patients before and after prosthetic treatment showed no significant differences for all parameters of fundamental frequency, jitter, shimmer, noise-to-harmonics ratio, formant frequency, F1 bandwidth, and band energy difference. The decrease in objective nasality perceptions correlated very well with the decrease in dips of the spectra for the male patient with a higher speech bulb height. Acoustic analysis may be a potential technique for evaluating the functions of oral speech-aid prostheses, which eliminates dysfunctions due to the surgical defect and contributes to a high percentage of intelligible speech. Octave spectrum analysis may also be a valuable tool for detecting changes in nasality characteristics of the voice during prosthetic treatment of VPI. Copyright © 2014. Published by Elsevier B.V.
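The jitter and shimmer parameters listed above have simple "local" definitions: the mean absolute difference between consecutive glottal periods (or peak amplitudes), relative to the mean. A simplified sketch of those definitions; the study itself used Praat, and the sequences below are hypothetical:

```python
def jitter_local(periods):
    """Local jitter (%): mean absolute difference between consecutive
    glottal periods, divided by the mean period."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return 100 * (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

def shimmer_local(amplitudes):
    """Local shimmer (%): the same ratio computed over peak amplitudes."""
    diffs = [abs(a - b) for a, b in zip(amplitudes, amplitudes[1:])]
    return 100 * (sum(diffs) / len(diffs)) / (sum(amplitudes) / len(amplitudes))

# Hypothetical period sequence (seconds) from a sustained vowel:
periods = [0.0100, 0.0102, 0.0099, 0.0101, 0.0100]
print(round(jitter_local(periods), 2))  # → 1.99
```

Perfectly periodic input gives zero for both measures; rising values indicate cycle-to-cycle instability of the voice source.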
McCormack, Jane; McLeod, Sharynne; McAllister, Lindy; Harrison, Linda J.
Purpose: The purpose of this article was to understand the experience of speech impairment (speech sound disorders) in everyday life as described by children with speech impairment and their communication partners. Method: Interviews were undertaken with 13 preschool children with speech impairment (mild to severe) and 21 significant others…
Margot I. Visser-Bochane; Dr. Margreet R. Luinge; Sijmen A. Reijneveld; W.P. Krijnen; Dr. C.P. van der Schans
Speech-language disorders, which include speech sound disorders and language disorders, are common in early childhood. These problems, and in particular language problems, frequently go underdiagnosed because current screening instruments have no satisfactory psychometric properties. Recent research
Full Text Available A method of detecting speech events in a multiple-sound-source condition using audio and video information is proposed. For detecting speech events, sound localization using a microphone array and human tracking by stereo vision is combined by a Bayesian network. From the inference results of the Bayesian network, information on the time and location of speech events can be known. The information on the detected speech events is then utilized in the robust speech interface. A maximum likelihood adaptive beamformer is employed as a preprocessor of the speech recognizer to separate the speech signal from environmental noise. The coefficients of the beamformer are kept updated based on the information of the speech events. The information on the speech events is also used by the speech recognizer for extracting the speech segment.
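The abstract describes a maximum likelihood adaptive beamformer; as a simpler, hedged illustration of the underlying idea, a delay-and-sum beamformer aligns the microphone channels for an assumed source direction and averages them (array geometry and signals below are illustrative):

```python
import numpy as np

def delay_and_sum(signals, fs, mic_positions, direction, c=343.0):
    """Align and average array channels for a plane wave from `direction`
    (a unit vector); `signals` is (n_mics, n_samples), positions in metres."""
    delays = mic_positions @ direction / c        # arrival-time offsets (s)
    delays -= delays.min()                        # make all delays >= 0
    shifts = np.round(delays * fs).astype(int)    # integer-sample delays
    n = signals.shape[1]
    out = np.zeros(n)
    for sig, s in zip(signals, shifts):
        out[s:] += sig[:n - s] if s > 0 else sig  # shift, then accumulate
    return out / len(signals)

# For a broadside source (no inter-mic delay) the output equals the input.
fs = 16000
x = np.sin(2 * np.pi * 500 * np.arange(fs // 10) / fs)
mics = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0]])   # two mics, 10 cm apart
broadside = np.array([0.0, 1.0, 0.0])
y = delay_and_sum(np.vstack([x, x]), fs, mics, broadside)
```

An adaptive (e.g. maximum likelihood) beamformer replaces the fixed averaging weights with weights estimated from the noise statistics; the steering logic is the same.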
Cartwright, Julyan H. E.; González, Diego L.; Piro, Oreste
We apply results from nonlinear dynamics to an old problem in acoustical physics: the mechanism of the perception of the pitch of sounds, especially the sounds known as complex tones that are important for music and speech intelligibility.
Bruderer, Alison G; Danielson, D Kyle; Kandhadai, Padmapriya; Werker, Janet F
The influence of speech production on speech perception is well established in adults. However, because adults have a long history of both perceiving and producing speech, the extent to which the perception-production linkage is due to experience is unknown. We addressed this issue by asking whether articulatory configurations can influence infants' speech perception performance. To eliminate influences from specific linguistic experience, we studied preverbal, 6-mo-old infants and tested the discrimination of a nonnative, and hence never-before-experienced, speech sound distinction. In three experimental studies, we used teething toys to control the position and movement of the tongue tip while the infants listened to the speech sounds. Using ultrasound imaging technology, we verified that the teething toys consistently and effectively constrained the movement and positioning of infants' tongues. With a looking-time procedure, we found that temporarily restraining infants' articulators impeded their discrimination of a nonnative consonant contrast but only when the relevant articulator was selectively restrained to prevent the movements associated with producing those sounds. Our results provide striking evidence that even before infants speak their first words and without specific listening experience, sensorimotor information from the articulators influences speech perception. These results transform theories of speech perception by suggesting that even at the initial stages of development, oral-motor movements influence speech sound discrimination. Moreover, an experimentally induced "impairment" in articulator movement can compromise speech perception performance, raising the question of whether long-term oral-motor impairments may impact perceptual development.
Müller-Deile, J; Kortmann, T; Hoppe, U; Hessel, H; Morsnowski, A
The aim of this multicenter clinical field study was to assess the benefits of the new Freedom 24 sound processor for cochlear implant (CI) users implanted with the Nucleus 24 cochlear implant system. The study included 48 postlingually profoundly deaf experienced CI users who demonstrated speech comprehension performance with their current speech processor on the Oldenburg sentence test (OLSA) in quiet conditions of at least 80% correct scores and who were able to perform adaptive speech threshold testing using the OLSA in noisy conditions. Following baseline measures of speech comprehension performance with their current speech processor, subjects were upgraded to the Freedom 24 speech processor. After a take-home trial period of at least 2 weeks, subject performance was evaluated by measuring the speech reception threshold with the Freiburg multisyllabic word test and speech intelligibility with the Freiburg monosyllabic word test at 50 dB and 70 dB in the sound field. The results demonstrated highly significant benefits for speech comprehension with the new speech processor. Significant benefits for speech comprehension were also demonstrated with the new speech processor when tested in competing background noise. In contrast, use of the Abbreviated Profile of Hearing Aid Benefit (APHAB) did not prove to be a suitably sensitive assessment tool for comparative subjective self-assessment of hearing benefits with each processor. Use of the preprocessing algorithm known as adaptive dynamic range optimization (ADRO) in the Freedom 24 led to additional improvements over the standard upgrade map for speech comprehension in quiet and showed equivalent performance in noise. Through use of the preprocessing beam-forming algorithm BEAM, subjects demonstrated a highly significant improvement in signal-to-noise ratio for speech comprehension thresholds (i.e., signal-to-noise ratio for 50% speech comprehension scores) when tested with an adaptive procedure using the Oldenburg
Regulatory sound insulation requirements for dwellings exist in more than 30 countries in Europe. In some countries, requirements have existed since the 1950s. Findings from comparative studies show that sound insulation descriptors and requirements represent a high degree of diversity … and initiate, where needed, improvement of sound insulation of new and existing dwellings in Europe to the benefit of the inhabitants and the society. A European COST Action TU0901, "Integrating and Harmonizing Sound Insulation Aspects in Sustainable Urban Housing Constructions", has been established and runs 2009-2013. The main objectives of TU0901 are to prepare proposals for harmonized sound insulation descriptors and for a European sound classification scheme with a number of quality classes for dwellings. Findings from the studies provide input for the discussions in COST TU0901. Data collected from 24 …
Haydée Fiszbein Wertzner
PURPOSE: To determine whether the severity index that measures the percentage of consonants correct distinguishes children with speech sound disorder with respect to measures of stimulability and speech inconsistency, as well as the presence of familial history and a history of recurrent otitis media. METHODS: Fifteen subjects aged between 5 years and 7 years 11 months, all diagnosed with speech sound disorder, participated in the study. The Percentage of Consonants Correct - Revised (PCC-R) index was calculated separately for the word imitation and picture naming tasks. From these tasks, the need to administer the stimulability probe was also computed, according to criteria proposed in previous research. The speech inconsistency probe allowed the subjects to be classified as consistent or inconsistent. The data were submitted to statistical analysis. RESULTS: Comparing the PCC-R values measured in the naming and imitation tasks, a difference was observed regarding the need for stimulability testing. For the speech inconsistency probe, there was no evidence of this relationship. No difference in PCC-R was found when considering the presence of otitis media or familial history. CONCLUSION: The study indicates that the children who required the stimulability probe showed lower PCC-R values. However, with respect to the speech inconsistency probe and to otitis or familial history, the PCC-R did not differentiate among the children.
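The PCC-R index used above is a straightforward proportion. A minimal sketch, assuming per-consonant scoring in which (as in the revised index) distortions are counted as correct:

```python
def pcc_r(consonants):
    """Percentage of Consonants Correct - Revised.
    `consonants` is a list of scores per target consonant:
    'correct', 'distortion' (counted as correct in PCC-R),
    'substitution', or 'omission'."""
    correct = sum(1 for c in consonants if c in ("correct", "distortion"))
    return 100.0 * correct / len(consonants)

# Illustrative sample: 20 target consonants, 2 true errors.
sample = ["correct"] * 17 + ["substitution", "omission", "distortion"]
print(round(pcc_r(sample), 1))  # 18 of 20 scored correct -> 90.0
```

The scoring labels here are hypothetical field names; a real transcription workflow would derive them from aligned target and produced phone sequences.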
Anderson, Carolyn; Cohen, Wendy
Background: Children's speech sound development is assessed by comparing speech production with the typical development of speech sounds based on a child's age and developmental profile. One widely used method of sampling is to elicit a single-word sample along with connected speech. Words produced spontaneously rather than imitated may give a…
Jerry D. Gibson
Speech compression is a key technology underlying digital cellular communications, VoIP, voicemail, and voice response systems. We trace the evolution of speech coding based on the linear prediction model, highlight the key milestones in speech coding, and outline the structures of the most important speech coding standards. Current challenges, future research directions, fundamental limits on performance, and the critical open problem of speech coding for emergency first responders are all discussed.
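The linear prediction model referred to above treats each speech sample as a weighted sum of the previous few samples. A hedged sketch: real coders estimate the predictor with the autocorrelation method and the Levinson-Durbin recursion, but plain least squares illustrates the same idea:

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Fit s[n] ~ sum_k a[k] * s[n-1-k] by least squares: a minimal
    stand-in for the Levinson-Durbin recursion used in real coders."""
    y = frame[order:]
    X = np.column_stack([frame[order - 1 - k:-1 - k] for k in range(order)])
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return a

# A sampled sinusoid obeys s[n] = 2*cos(w)*s[n-1] - s[n-2] exactly,
# so a 2nd-order predictor recovers it with near-zero residual.
fs, f0 = 8000, 440
s = np.sin(2 * np.pi * f0 * np.arange(256) / fs)
a = lpc_coefficients(s, 2)
residual = s[2:] - (a[0] * s[1:-1] + a[1] * s[:-2])
print(np.max(np.abs(residual)))  # ~0
```

A coder then transmits the quantized predictor coefficients plus a compact description of the residual, which is where the bit-rate savings come from.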
Brian N Pasley
How the human auditory system extracts perceptually relevant acoustic features of speech is unknown. To address this question, we used intracranial recordings from nonprimary auditory cortex in the human superior temporal gyrus to determine what acoustic information in speech sounds can be reconstructed from population neural activity. We found that slow and intermediate temporal fluctuations, such as those corresponding to syllable rate, were accurately reconstructed using a linear model based on the auditory spectrogram. However, reconstruction of fast temporal fluctuations, such as syllable onsets and offsets, required a nonlinear sound representation based on temporal modulation energy. Reconstruction accuracy was highest within the range of spectro-temporal fluctuations that have been found to be critical for speech intelligibility. The decoded speech representations allowed readout and identification of individual words directly from brain activity during single trial sound presentations. These findings reveal neural encoding mechanisms of speech acoustic parameters in higher order human auditory cortex.
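The linear reconstruction model described above can be illustrated with simulated data: learn a linear map from population activity back to the spectrogram on training frames, then decode held-out activity. Everything below (dimensions, noise level) is illustrative, not the study's actual parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated recording: 20 "electrodes", each a noisy linear mixture of a
# 16-channel auditory spectrogram over 500 time frames.
T, F, E = 500, 16, 20
spec = rng.random((T, F))                                  # stimulus spectrogram
neural = spec @ rng.normal(size=(F, E)) + 0.01 * rng.normal(size=(T, E))

# Fit the linear reconstruction filter on the first half of the frames ...
G, *_ = np.linalg.lstsq(neural[:250], spec[:250], rcond=None)
# ... and reconstruct the held-out spectrogram from neural activity alone.
recon = neural[250:] @ G

corr = np.corrcoef(recon.ravel(), spec[250:].ravel())[0, 1]
print(round(corr, 3))  # close to 1 in this near-noiseless simulation
```

Real decoding models also include time-lagged copies of the neural activity and regularization; both are omitted here to keep the mapping visible.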
Miller, C; Massey, N; Miller, Corey; Karaali, Orhan; Massey, Noel
We describe the approach to linguistic variation taken by the Motorola speech synthesizer. A pan-dialectal pronunciation dictionary is described, which serves as the training data for a neural network-based letter-to-sound converter. Subsequent to dictionary retrieval or letter-to-sound generation, pronunciations are submitted to a neural network-based postlexical module. The postlexical module has been trained on aligned dictionary pronunciations and hand-labeled narrow phonetic transcriptions. This architecture permits the learning of individual postlexical variation, and can be retrained for each speaker whose voice is being modeled for synthesis. Learning variation in this way can result in greater naturalness for the synthetic speech that is produced by the system.
Kadambe, Shubha L.; Srinivasan, Pramila
Our objective is to demonstrate the applicability of adaptive wavelets for speech applications. In particular, we discuss two applications, namely, classification of unvoiced sounds and speaker identification. First, a method to classify unvoiced sounds using adaptive wavelets, which would help in developing a unified algorithm to classify phonemes (speech sounds), is described. Next, the applicability of adaptive wavelets to identify speakers using very short speech data (one pitch period) is exhibited. The described text-independent phoneme based speaker identification algorithm identifies a speaker by first modeling phonemes and then by clustering all the phonemes belonging to the same speaker into one class. For both applications, we use feed-forward neural network architecture. We demonstrate the performance of both unvoiced sounds classifier and speaker identification algorithms by using representative real speech examples.
Stasenko, Alena; Bonn, Cory; Teghipco, Alex; Garcea, Frank E; Sweet, Catherine; Dombovy, Mary; McDonough, Joyce; Mahon, Bradford Z
The debate about the causal role of the motor system in speech perception has been reignited by demonstrations that motor processes are engaged during the processing of speech sounds. Here, we evaluate which aspects of auditory speech processing are affected, and which are not, in a stroke patient with dysfunction of the speech motor system. We found that the patient showed a normal phonemic categorical boundary when discriminating two non-words that differ by a minimal pair (e.g., ADA-AGA). However, using the same stimuli, the patient was unable to identify or label the non-word stimuli (using a button-press response). A control task showed that he could identify speech sounds by speaker gender, ruling out a general labelling impairment. These data suggest that while the motor system is not causally involved in perception of the speech signal, it may be used when other cues (e.g., meaning, context) are not available.
National schemes for sound classification of dwellings exist in more than ten countries in Europe, typically published as national standards. The schemes define quality classes reflecting different levels of acoustical comfort. Main criteria concern airborne and impact sound insulation between … dwellings, facade sound insulation and installation noise. The schemes have been developed, implemented and revised gradually since the early 1990s. However, due to lack of coordination between countries, there are significant discrepancies, and new standards and revisions continue to increase the diversity … is needed, and a European COST Action TU0901, "Integrating and Harmonizing Sound Insulation Aspects in Sustainable Urban Housing Constructions", has been established and runs 2009-2013, one of the main objectives being to prepare a proposal for a European sound classification scheme with a number of quality …
The masking effect of a piano composition, played at different speeds and in different octaves, on speech-perception thresholds was investigated in 15 normal-hearing and 14 moderately-hearing-impaired subjects. Running speech (just follow conversation, JFC) testing and use of hearing aids increased the everyday validity of the findings. A comparison was made with standard audiometric noises [International Collegium of Rehabilitative Audiology (ICRA) noise and speech spectrum-filtered noise (SPN)]. All masking sounds, music or noise, were presented at the same equivalent sound level (50 dBA). The results showed a significant effect of piano performance speed and octave (P<.01). Low octave and fast tempo had the largest effect; high octave and slow tempo, the smallest. Music had a lower masking effect than did ICRA noise with two or six speakers at normal vocal effort (P<.01) and SPN (P<.05). Subjects with hearing loss had higher masked thresholds than the normal-hearing subjects (P<.01), but there were smaller differences between masking conditions (P<.01). It is pointed out that music offers an interesting opportunity for studying masking under realistic conditions, where spectral and temporal features can be varied independently. The results have implications for composing music with vocal parts, designing acoustic environments and creating a balance between speech perception and privacy in social settings.
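Presenting every masker at the same equivalent sound level amounts to level normalization. A minimal sketch that equates RMS level in dB re full scale (the frequency weighting implied by "dBA" is deliberately omitted, and the signals are stand-ins, not the study's stimuli):

```python
import numpy as np

def normalize_to_db(signal, target_db):
    """Scale `signal` so its RMS level is `target_db` dB re full scale (1.0).
    Frequency weighting (the 'A' in dBA) is omitted in this sketch."""
    rms = np.sqrt(np.mean(signal ** 2))
    return signal * (10 ** (target_db / 20) / rms)

rng = np.random.default_rng(1)
noise = rng.standard_normal(48000)          # stand-in for an ICRA-like noise
music = 0.05 * rng.standard_normal(48000)   # stand-in for the piano masker

levels = []
for masker in (noise, music):
    out = normalize_to_db(masker, -30.0)
    levels.append(20 * np.log10(np.sqrt(np.mean(out ** 2))))
print([round(level, 1) for level in levels])  # [-30.0, -30.0]
```

After normalization, both maskers sit at the same nominal level even though their original amplitudes differed by a factor of twenty.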
Eisner, F.; Melinger, A.; Weber, A.C.
The perception of speech sounds can be re-tuned through a mechanism of lexically driven perceptual learning after exposure to instances of atypical speech production. This study asked whether this re-tuning is sensitive to the position of the atypical sound within the word. We investigated
Galilee, Alena; Stefanidou, Chrysi; McCleery, Joseph P
Previous event-related potential (ERP) research utilizing oddball stimulus paradigms suggests diminished processing of speech versus non-speech sounds in children with an Autism Spectrum Disorder (ASD). However, brain mechanisms underlying these speech processing abnormalities, and to what extent they are related to poor language abilities in this population remain unknown. In the current study, we utilized a novel paired repetition paradigm in order to investigate ERP responses associated with the detection and discrimination of speech and non-speech sounds in 4- to 6-year old children with ASD, compared with gender and verbal age matched controls. ERPs were recorded while children passively listened to pairs of stimuli that were either both speech sounds, both non-speech sounds, speech followed by non-speech, or non-speech followed by speech. Control participants exhibited N330 match/mismatch responses measured from temporal electrodes, reflecting speech versus non-speech detection, bilaterally, whereas children with ASD exhibited this effect only over temporal electrodes in the left hemisphere. Furthermore, while the control groups exhibited match/mismatch effects at approximately 600 ms (central N600, temporal P600) when a non-speech sound was followed by a speech sound, these effects were absent in the ASD group. These findings suggest that children with ASD fail to activate right hemisphere mechanisms, likely associated with social or emotional aspects of speech detection, when distinguishing non-speech from speech stimuli. Together, these results demonstrate the presence of atypical speech versus non-speech processing in children with ASD when compared with typically developing children matched on verbal age.
Sheets, Boyd V.
This monograph on the anatomical and physiological aspects of the speech mechanism stresses the importance of a general understanding of the process of verbal communication. Contents include "Positions of the Body,""Basic Concepts Linked with the Speech Mechanism,""The Nervous System,""The Respiratory System--Sound-Power Source,""The…
Holzrichter, John F.
A non-acoustic sensor is used to measure a user's speech and then broadcasts an obscuring acoustic signal diminishing the user's vocal acoustic output intensity and/or distorting the voice sounds making them unintelligible to persons nearby. The non-acoustic sensor is positioned proximate or contacting a user's neck or head skin tissue for sensing speech production information.
Cunningham, Dana Aliel
Speech prosody is a multi-faceted dimension of speech which can be measured and analyzed in a variety of ways. In this study, the speech prosody of Mandarin L1 speakers, English L2 speakers, and English L1 speakers was assessed by trained raters who listened to sound clips of the speakers responding to a graph prompt and reading a short passage.…
Saito, Akie; Inoue, Tomoyoshi
The so-called syllable position effect in speech errors has been interpreted as reflecting constraints posed by the frame structure of a given language, which is separately operating from linguistic content during speech production. The effect refers to the phenomenon that when a speech error occurs, replaced and replacing sounds tend to be in the same position within a syllable or word. Most of the evidence for the effect comes from analyses of naturally occurring speech errors in Indo-European languages, and there are few studies examining the effect in experimentally elicited speech errors and in other languages. This study examined whether experimentally elicited sound errors in Japanese exhibits the syllable position effect. In Japanese, the sub-syllabic unit known as "mora" is considered to be a basic sound unit in production. Results showed that the syllable position effect occurred in mora errors, suggesting that the frame constrains the ordering of sounds during speech production.
Crowe, Kathryn; Cumming, Tamara; McCormack, Jane; Baker, Elise; McLeod, Sharynne; Wren, Yvonne; Roulstone, Sue; Masso, Sarah
Early childhood educators are frequently called on to support preschool-aged children with speech sound disorders and to engage these children in activities that target their speech production. This study explored factors that acted as facilitators and/or barriers to the provision of computer-based support for children with speech sound disorders…
Lederman, Norman; Hendricks, Paula
A multisensory sound lab has been developed at the Model Secondary School for the Deaf (District of Columbia). A special floor allows vibrations to be felt, and a spectrum analyzer displays frequencies and harmonics visually. The lab is used for science education, auditory training, speech therapy, music and dance instruction, and relaxation…
Hasse Jørgensen, Stina
About Speech Matters - Katarina Gregos, the Greek curator's exhibition at the Danish Pavilion, the Venice Biennale 2011.
Speech-to-Speech (STS) is one form of Telecommunications Relay Service (TRS). TRS is a service that allows persons with hearing and speech disabilities ...
Schaefer, R.S.; Beijer, L.J.; Seuskens, W.L.J.B.; Rietveld, A.C.M.; Sadakata, M.
Visualizing acoustic features of speech has proven helpful in speech therapy; however, it is as yet unclear how to create intuitive and fitting visualizations. To better understand the mappings from speech sound aspects to visual space, a large web-based experiment (n = 249) was performed to
Hambly, Helen; Wren, Yvonne; McLeod, Sharynne; Roulstone, Sue
Background: Children who are bilingual and have speech sound disorder are likely to be under-referred, possibly due to confusion about typical speech acquisition in bilingual children. Aims: To investigate what is known about the impact of bilingualism on children's acquisition of speech in English to facilitate the identification and treatment of…
Bosker, Hans Rutger
In conversation, our own speech and that of others follow each other in rapid succession. Effects of the surrounding context on speech perception are well documented but, despite the ubiquity of the sound of our own voice, it is unknown whether our own speech also influences our perception of other talkers. This study investigated context effects…
Vouloumanos, Athena; Hauser, Marc D.; Werker, Janet F.; Martin, Alia
Human neonates prefer listening to speech compared to many nonspeech sounds, suggesting that humans are born with a bias for speech. However, neonates' preference may derive from properties of speech that are not unique but instead are shared with the vocalizations of other species. To test this, thirty neonates and sixteen 3-month-olds were…
Shafiro, Valeriy; Sheft, Stanley; Kuvadia, Sejal; Gygi, Brian
Purpose: The study investigated the effect of a short computer-based environmental sound training regimen on the perception of environmental sounds and speech in experienced cochlear implant (CI) patients. Method: Fourteen CI patients with the average of 5 years of CI experience participated. The protocol consisted of 2 pretests, 1 week apart,…
Theunissen, Frédéric E; Elie, Julie E
We might be forced to listen to a high-frequency tone at our audiologist's office or we might enjoy falling asleep with a white-noise machine, but the sounds that really matter to us are the voices of our companions or music from our favourite radio station. The auditory system has evolved to process behaviourally relevant natural sounds. Research has shown not only that our brain is optimized for natural hearing tasks but also that using natural sounds to probe the auditory system is the best way to understand the neural computations that enable us to comprehend speech or appreciate music.
Trento, Stefano; Götzen, Amalia De
This paper is an initial attempt to study the world of sound effects for motion pictures, also known as Foley sounds. Through several audio and audio-video tests we have compared both Foley and real sounds originated by an identical action. The main purpose was to evaluate whether sound effects...
Interestingly, anatomical studies of the adult human brain indicate that specialized regions of the brain analyse different types of sounds. Music, speech and environmental noise are processed in areas that are anatomically distinct. However, the reasons for this kind of functional organization are not clearly identified.
Mitterer, H.A.; McQueen, J.M.
Understanding foreign speech is difficult, in part because of unusual mappings between sounds and words. It is known that listeners in their native language can use lexical knowledge (about how words ought to sound) to learn how to interpret unusual speech-sounds. We therefore investigated whether
… machines; the part of speech technology which is concerned with automatically … combined to build up a given word. The Swahili phoneme is divided … Vowel clusters: unlike in English, two (or three) written vowels that follow each other never merge together to form a single sound. Each keeps its own sound. Example: 'ou' is …
… therapists help people of all ages with different speech and language disorders. Here are some of them: articulation (say: ar-tik-yuh-LAY-shun) disorders: This is when a kid has trouble saying certain sounds or saying words correctly. "Run" might come out ...
Lo, Lap-Yan; Luk, H M; Thompson, Nigel
Sound symbolism suggests a non-arbitrary relationship between speech sounds and the concepts to which those sounds refer (Hinton, Nichols, & Ohala, 2006). Supporting evidence comes primarily from studies investigating how speech sounds relate to semantically compatible visual concepts. The present study therefore attempted to examine sound symbolism in the context of tactile perception. Contrary to the propositions of sound symbolism, participants in Experiment 1 did not consistently assign names with plosive consonants to objects with curved frames. Experiment 2, however, found that names with fricative consonants were more likely to be applied to materials with rough surfaces. The results suggested the existence of a direct relationship between speech sounds and their referent concepts that could be crucial in revealing the phenomenon of sound symbolism. A future study was also proposed to examine the contributions of mouth shape and airflow to associations between speech sounds and tactile feelings.
Savithri, S. R.
The study reviewed Sanskrit books written between 1500 BC and 1904 AD concerning diseases, speech pathology, and audiology. Details are provided of the ancient Indian system of disease classification, the classification of speech sounds, causes of speech disorders, and treatment of speech and language disorders. (DB)
Maslowski, M.; Meyer, A.S.; Bosker, H.R.
Speech rate is known to modulate perception of temporally ambiguous speech sounds. For instance, a vowel may be perceived as short when the immediate speech context is slow, but as long when the context is fast. Yet, effects of long-term tracking of speech rate are largely unexplored. Two
Little is known about the perception of speech sounds by native Danish listeners. However, the Danish sound system differs in several interesting ways from the sound systems of other languages. For instance, Danish is characterized, among other features, by a rich vowel inventory and by different … to interesting differences in speech perception and acquisition of Danish adults and infants when compared to English. The book is useful for professionals as well as students of linguistics, psycholinguistics and phonetics/phonology, or anyone else who may be interested in language.
Tapas Kumar Patra; Biplab Patra; Puspanjali Mohapatra
This paper presents a method to design a text-to-speech conversion module in Matlab using simple matrix operations. First, several similar-sounding words are recorded through a microphone using a recording program in the Matlab window, and the recorded sounds are saved in .wav format in the directory. The recorded sounds are then sampled, and the sampled values are separated into their constituent phonetics. The separated syllables are then concatenated to reconstruct the desired w...
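The concatenation step described above can be sketched in a few lines: join recorded unit waveforms with a short linear cross-fade so there is no click at the boundary. The "syllables" below are synthetic tones standing in for recordings, and the sketch is in Python rather than the paper's Matlab:

```python
import numpy as np

def concatenate_units(units, fs, fade_ms=10):
    """Join recorded unit waveforms with a linear cross-fade at each
    boundary: a minimal concatenative synthesizer."""
    n_fade = int(fs * fade_ms / 1000)
    fade_in = np.linspace(0.0, 1.0, n_fade)
    out = units[0].astype(float)
    for unit in units[1:]:
        unit = unit.astype(float)
        # Overlap the tail of the output with the head of the next unit.
        out[-n_fade:] = out[-n_fade:] * fade_in[::-1] + unit[:n_fade] * fade_in
        out = np.concatenate([out, unit[n_fade:]])
    return out

fs = 16000
# Hypothetical "recorded syllables": two 0.2 s tones standing in for speech.
ka = np.sin(2 * np.pi * 300 * np.arange(int(0.2 * fs)) / fs)
ta = np.sin(2 * np.pi * 500 * np.arange(int(0.2 * fs)) / fs)
word = concatenate_units([ka, ta], fs)
print(len(word))  # 0.4 s minus one 10 ms overlap -> 6240 samples
```

Because the cross-fade weights sum to one at every sample, the joined signal never exceeds the amplitude of its inputs.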
Grimshaw, Mark; Garner, Tom Alexander
We make the case in this essay that sound that is imagined is both a perception and as much a sound as that perceived through external stimulation. To argue this, we look at the evidence from auditory science, neuroscience, and philosophy, briefly present some new conceptual thinking on sound that accounts for this view, and then use this to look at what the future might hold in the context of imagining sound and developing technology.
ter Schure, S.M.M.
Newborn infants are sensitive to combinations of visual and auditory speech. Does this ability to match sounds and sights affect how infants learn the sounds of their native language? And are visual articulations the only type of visual information that can influence sound learning? This
Henry, James A; Zaugg, Tara L; Myers, Paula J; Schechter, Martin A
Management of tinnitus generally involves educational counseling, stress reduction, and/or the use of therapeutic sound. This article focuses on therapeutic sound, which can involve three objectives: (a) producing a sense of relief from tinnitus-associated stress (using soothing sound); (b) passively diverting attention away from tinnitus by reducing contrast between tinnitus and the acoustic environment (using background sound); and (c) actively diverting attention away from tinnitus (using interesting sound). Each of these goals can be accomplished using three different types of sound-broadly categorized as environmental sound, music, and speech-resulting in nine combinations of uses of sound and types of sound to manage tinnitus. The authors explain the uses and types of sound, how they can be combined, and how the different combinations are used with Progressive Audiologic Tinnitus Management. They also describe how sound is used with other sound-based methods of tinnitus management (Tinnitus Masking, Tinnitus Retraining Therapy, and Neuromonics).
Basu, Madhavi L.; Surprenant, Aimee M.
Specific Language Impairment (SLI) is a developmental language disorder in which children demonstrate varying degrees of difficulties in acquiring a spoken language. One possible underlying cause is that children with SLI have deficits in processing sounds that are of short duration or that are presented rapidly. Studies so far have compared their performance on speech and nonspeech sounds of unequal complexity. Hence, it is still unclear whether the deficit is specific to the perception of speech sounds or whether it more generally affects auditory function. The current study aims to answer this question by comparing the performance of children with SLI on speech and nonspeech sounds synthesized from sine-wave stimuli. The children will be tested using the classic categorical perception paradigm that includes both the identification and discrimination of stimuli along a continuum. If there is a deficit in the performance on both speech and nonspeech tasks, it will show that these children have a deficit in processing complex sounds. Poor performance on only the speech sounds will indicate that the deficit is more related to language. The findings will offer insights into the exact nature of the speech perception deficits in children with SLI. [Work supported by ASHF.]
Raca, Gordana; Baas, Becky S; Kirmani, Salman; Laffin, Jennifer J; Jackson, Craig A; Strand, Edythe A; Jakielski, Kathy J; Shriberg, Lawrence D
We report clinical findings that extend the phenotype of the ~550 kb 16p11.2 microdeletion syndrome to include a rare, severe, and persistent pediatric speech sound disorder termed Childhood Apraxia of Speech (CAS...
Baart, M.; Bortfeld, H.; Vroomen, J.
The correspondence between auditory speech and lip-read information can be detected based on a combination of temporal and phonetic cross-modal cues. Here, we determined the point in developmental time at which children start to effectively use phonetic information to match a speech sound with one
One of the goals of text-to-speech (TTS) systems is to produce natural-sounding synthesised speech. Towards this end various natural language processing (NLP) tasks are performed to model the prosodic aspects of the TTS voice. One of the fundamental...
Bruderer, Alison G.; Danielson, D. Kyle; Kandhadai, Padmapriya; Werker, Janet F.
The influence of speech production on speech perception is well established in adults. However, because adults have a long history of both perceiving and producing speech, the extent to which the perception–production linkage is due to experience is unknown. We addressed this issue by asking whether articulatory configurations can influence infants’ speech perception performance. To eliminate influences from specific linguistic experience, we studied preverbal, 6-mo-old infants and tested the discrimination of a nonnative, and hence never-before-experienced, speech sound distinction. In three experimental studies, we used teething toys to control the position and movement of the tongue tip while the infants listened to the speech sounds. Using ultrasound imaging technology, we verified that the teething toys consistently and effectively constrained the movement and positioning of infants’ tongues. With a looking-time procedure, we found that temporarily restraining infants’ articulators impeded their discrimination of a nonnative consonant contrast but only when the relevant articulator was selectively restrained to prevent the movements associated with producing those sounds. Our results provide striking evidence that even before infants speak their first words and without specific listening experience, sensorimotor information from the articulators influences speech perception. These results transform theories of speech perception by suggesting that even at the initial stages of development, oral–motor movements influence speech sound discrimination. Moreover, an experimentally induced “impairment” in articulator movement can compromise speech perception performance, raising the question of whether long-term oral–motor impairments may impact perceptual development.
Nielsen, Andreas Brinch; Hansen, Lars Kai; Kjems, U
A sound classification model is presented that can classify signals into music, noise and speech. The model extracts the pitch of the signal using the harmonic product spectrum. Based on the pitch estimate and a pitch error measure, features are created and used in a probabilistic model with a soft-max output function. Both linear and quadratic inputs are used. The model is trained on 2 hours of sound and tested on publicly available data. A test classification error below 0.05 with 1 s classification windows is achieved. Furthermore, it is shown that linear input performs as well as a quadratic, and that even though classification gets marginally better, not much is achieved by increasing the window size beyond 1 s.
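The harmonic product spectrum pitch extraction named in this abstract can be sketched as follows. This is a minimal Python illustration, not the paper's implementation; the window choice, harmonic count, and test tone are assumptions.

```python
import numpy as np

def hps_pitch(signal, sr, n_harmonics=3):
    """Estimate f0 with the harmonic product spectrum: decimate the
    magnitude spectrum by factors 2..n and multiply the copies, so
    energy at the harmonics reinforces the bin of the fundamental."""
    windowed = signal * np.hanning(len(signal))
    spectrum = np.abs(np.fft.rfft(windowed))
    hps = spectrum.copy()
    for h in range(2, n_harmonics + 1):
        decimated = spectrum[::h]
        hps[:len(decimated)] *= decimated
    hps = hps[:len(spectrum) // n_harmonics]
    peak = np.argmax(hps[1:]) + 1          # skip the DC bin
    return peak * sr / len(signal)

# A 440 Hz tone with two overtones, 1 s at 16 kHz (1 Hz bin spacing)
sr = 16000
t = np.arange(sr) / sr
tone = sum(np.sin(2 * np.pi * 440 * k * t) / k for k in (1, 2, 3))
print(hps_pitch(tone, sr))  # 440.0
```

The paper additionally derives a pitch-error measure from this estimate and feeds both into a probabilistic classifier; only the pitch front end is shown here.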
Begault, Durand R.; Wenzel, Elizabeth M.
Three-dimensional acoustic display systems have recently been developed that synthesize virtual sound sources over headphones based on filtering by head-related transfer functions (HRTFs), the direction-dependent spectral changes caused primarily by the pinnae. In this study, 11 inexperienced subjects judged the apparent spatial location of headphone-presented speech stimuli filtered with nonindividualized HRTFs. About half of the subjects 'pulled' their judgments toward either the median or the lateral-vertical planes, and estimates were almost always elevated. Individual differences were pronounced for the distance judgments; 15 to 46 percent of stimuli were heard inside the head, with the shortest estimates near the median plane. The results suggest that most listeners can obtain useful azimuth information from speech stimuli filtered by nonindividualized HRTFs. Measurements of localization error and reversal rates are comparable with a previous study that used broadband noise stimuli.
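The HRTF filtering underlying such displays amounts to convolving a mono signal with a left/right pair of head-related impulse responses. Below is a minimal sketch with toy impulse responses standing in for measured, nonindividualized HRTFs; the delay and gain values are illustrative assumptions that capture only interaural time and level differences, not the pinna spectral cues the study describes.

```python
import numpy as np

def binaural_render(mono, hrir_left, hrir_right):
    """Spatialize a mono signal by convolving it with a pair of
    head-related impulse responses (time-domain HRTFs)."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right], axis=1)  # (samples, 2) stereo

# Toy HRIRs for a source on the right: the right ear hears it earlier
# and louder (interaural time/level differences). Measured HRTFs would
# also impose direction-dependent spectral shaping from the pinnae.
sr = 44100
itd_samples = int(0.0006 * sr)            # ~0.6 ms interaural delay
hrir_r = np.zeros(64); hrir_r[0] = 1.0
hrir_l = np.zeros(64); hrir_l[itd_samples] = 0.5

speech = np.random.default_rng(0).standard_normal(sr // 10)
stereo = binaural_render(speech, hrir_l, hrir_r)
print(stereo.shape)  # (4473, 2)
```

In practice the HRIR pair is selected (or interpolated) per target azimuth/elevation from a measured set, which is exactly where the individual differences reported in the study enter.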
Santoro, Roberta; Moerel, Michelle; De Martino, Federico; Goebel, R.; Ugurbil, Kamil; Yacoub, Essa; Formisano, Elia
Functional neuroimaging research provides detailed observations of the response patterns that natural sounds (e.g. human voices and speech, animal cries, environmental sounds) evoke in the human brain. The computational and representational mechanisms underlying these observations, however, remain
Ylinen, Sari; Uther, Maria; Latvala, Antti; Vepsalainen, Sara; Iverson, Paul; Akahane-Yamada, Reiko; Naatanen, Risto
Foreign-language learning is a prime example of a task that entails perceptual learning. The correct comprehension of foreign-language speech requires the correct recognition of speech sounds. The most difficult speech-sound contrasts for foreign-language learners often are the ones that have multiple phonetic cues, especially if the cues are…
Ravishankar, C., Hughes Network Systems, Germantown, MD
Speech is the predominant means of communication between human beings, and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage of the speech signal getting corrupted by noise, cross-talk and distortion. Long-haul transmissions, which use repeaters to compensate for the loss in signal strength on transmission links, also increase the associated noise and distortion. On the other hand, digital transmission is relatively immune to noise, cross-talk and distortion, primarily because of the capability to faithfully regenerate the digital signal at each repeater based purely on a binary decision. Hence the end-to-end performance of the digital link becomes essentially independent of the length and operating frequency bands of the link. From a transmission point of view, then, digital transmission has been the preferred approach due to its higher immunity to noise. The need to carry digital speech became extremely important from a service provision point of view as well. Modern requirements have introduced the need for robust, flexible and secure services that can carry a multitude of signal types (such as voice, data and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term speech coding often refers to techniques that represent or code speech signals either directly as a waveform or as a set of parameters obtained by analyzing the speech signal. In either case, the codes are transmitted to the distant end where speech is reconstructed or synthesized using the received set of codes. A more generic term that is often used interchangeably with speech coding is voice coding. This term is more generic in the sense that the
Jung, Youngsin; Duffy, Joseph R; Josephs, Keith A
Primary progressive aphasia is a neurodegenerative syndrome characterized by progressive language dysfunction. The majority of primary progressive aphasia cases can be classified into three subtypes: nonfluent/agrammatic, semantic, and logopenic variants. Each variant presents with unique clinical features, and is associated with distinctive underlying pathology and neuroimaging findings. Unlike primary progressive aphasia, apraxia of speech is a disorder that involves inaccurate production of sounds secondary to impaired planning or programming of speech movements. Primary progressive apraxia of speech is a neurodegenerative form of apraxia of speech, and it should be distinguished from primary progressive aphasia given its discrete clinicopathological presentation. Recently, there have been substantial advances in our understanding of these speech and language disorders. The clinical, neuroimaging, and histopathological features of primary progressive aphasia and apraxia of speech are reviewed in this article. The distinctions among these disorders for accurate diagnosis are increasingly important from a prognostic and therapeutic standpoint.
Faundez-Zanuy, Marcos; Esposito, Antonietta; Cordasco, Gennaro; Drugman, Thomas; Solé-Casals, Jordi; Morabito, Francesco
This book presents recent advances in nonlinear speech processing beyond nonlinear techniques. It shows that it exploits heuristic and psychological models of human interaction in order to succeed in the implementations of socially believable VUIs and applications for human health and psychological support. The book takes into account the multifunctional role of speech and what is “outside of the box” (see Björn Schuller’s foreword). To this aim, the book is organized in 6 sections, each collecting a small number of short chapters reporting advances “inside” and “outside” themes related to nonlinear speech research. The themes emphasize theoretical and practical issues for modelling socially believable speech interfaces, ranging from efforts to capture the nature of sound changes in linguistic contexts and the timing nature of speech; labors to identify and detect speech features that help in the diagnosis of psychological and neuronal disease, attempts to improve the effectiveness and performa...
This article discusses the change in premise that digitally produced sound brings about and how digital technologies more generally have changed our relationship to the musical artifact, not simply in degree but in kind. It demonstrates how our acoustical conceptions are thoroughly challenged by the digital production of sound and, by questioning the ontological basis for digital sound, turns our understanding of the core term substance upside down.
T. E. Braudo
The purpose of this article is to acquaint specialists working with children having developmental disorders with age-related norms for speech development. Many well-known linguists and psychologists have studied speech ontogenesis (logogenesis). Speech is a higher mental function, which integrates many functional systems. Speech development in infants during the first months after birth is ensured by innate hearing and the emerging ability to fix the gaze on the face of an adult. Innate emotional reactions also develop during this period, turning into nonverbal forms of communication. At about 6 months a baby starts to pronounce some syllables; at 7–9 months he or she repeats various sound combinations pronounced by adults. At 10–11 months a baby begins to react to words addressed to him or her. The first words usually appear at an age of 1 year; this is the start of the stage of active speech development. At this time it is acceptable if a child confuses or rearranges sounds, distorts or misses them. By the age of 1.5 years a child begins to understand abstract explanations of adults. Significant vocabulary enlargement occurs between 2 and 3 years; grammatical structures of the language are formed during this period (a child starts to use phrases and sentences). Preschool age (3–7 years old) is characterized by incorrect but steadily improving pronunciation of sounds and phonemic perception. The vocabulary increases; abstract speech and retelling are formed. Children over 7 years old continue to improve grammar, writing and reading skills. The described stages may not have strict age boundaries, as they depend not only on environment, but also on the child's mental constitution, heredity and character.
Fuchs, H. V.; Möser, M.
Sound absorption indicates the transformation of sound energy into heat. It is, for instance, employed to design the acoustics in rooms. The noise emitted by machinery and plants shall be reduced before arriving at a workplace; auditoria such as lecture rooms or concert halls require a certain reverberation time. Such design goals are realised by installing absorbing components at the walls with well-defined absorption characteristics, which are adjusted for corresponding demands. Sound absorbers also play an important role in acoustic capsules, ducts and screens to avoid sound immission from noise intensive environments into the neighbourhood.
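The reverberation-time design goal mentioned above is classically estimated with Sabine's formula, T60 ≈ 0.161·V/A, where V is the room volume and A the total equivalent absorption area. A minimal sketch; the room volume, surface areas, and absorption coefficients below are made-up illustrative values, not data from the text.

```python
def sabine_rt60(volume_m3, surfaces):
    """Sabine reverberation time T60 = 0.161 * V / A [s], where A is the
    equivalent absorption area: the sum over surfaces of area [m^2]
    times the surface's absorption coefficient (0..1)."""
    absorption_area = sum(area * alpha for area, alpha in surfaces)
    return 0.161 * volume_m3 / absorption_area

# Hypothetical lecture room: 500 m^3, with walls, an absorbing ceiling,
# and a hard floor (assumed coefficients).
room = [(200, 0.10),   # walls
        (100, 0.60),   # absorbing ceiling panels
        (100, 0.05)]   # hard floor
print(round(sabine_rt60(500, room), 2))  # 0.95
```

This shows the basic trade-off the abstract describes: adding absorbing components with well-defined coefficients pulls T60 toward a target value for the intended use of the room.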
Hjortkjær, Jens; Kassuba, Tanja; Madsen, Kristoffer Hougaard
In everyday sound environments, we recognize sound sources and events by attending to relevant aspects of an acoustic input. Evidence about the cortical mechanisms involved in extracting relevant category information from natural sounds is, however, limited to speech. Here, we used functional MRI … discriminability. Action categories were more accurately decoded in auditory cortex when subjects identified action information. Conversely, the material of the same sound sources was decoded with higher accuracy in the inferior frontal cortex during material identification. Representational similarity analyses …
Tierney, Cheryl D; Pitterle, Kathleen; Kurtz, Marie; Nakhla, Mark; Todorow, Carlyn
Childhood apraxia of speech is a neurologic speech sound disorder in which children have difficulty constructing words and sounds due to poor motor planning and coordination of the articulators required for speech sound production. We report the case of a 3-year-old boy strongly suspected to have childhood apraxia of speech at 18 months of age who used multimodal communication to facilitate language development throughout his work with a speech language pathologist. In 18 months of an intensive structured program, he exhibited atypical rapid improvement, progressing from having no intelligible speech to achieving age-appropriate articulation. We suspect that early introduction of sign language by family proved to be a highly effective form of language development, that when coupled with intensive oro-motor and speech sound therapy, resulted in rapid resolution of symptoms.
Wind turbines are an environmentally friendly and sustainable power source. Unfortunately, the noise impact can cause deteriorated living conditions for nearby residents. The audibility of wind turbine sound is influenced by ambient sound. This thesis deals with some aspects of noise from wind turbines. Ambient sounds influence the audibility of wind turbine noise. Models for assessing two commonly occurring natural ambient sounds namely vegetation sound and sound from breaking waves are pres...
Grosse, Julian; van de Par, Steven; Trahiotis, Constantine
The ability to localize sound sources in reverberant environments is dependent upon first-arriving information, an outcome commonly termed "the precedence effect." For example, in laboratory experiments, the combination of a leading (direct) sound followed by a lagging (reflected) sound is localized in the direction of the leading sound. This study was designed to measure the degree to which stimulus compactness/diffuseness (i.e., coherence as represented by interaural cross correlation) of leading and lagging sounds influences performance. The compactness/diffuseness of leading or lagging sounds was varied by either presenting a sound from a single loudspeaker or by presenting mutually uncorrelated versions of similar sounds from nine adjacent loudspeakers. In separate experiments, the listener's task was to point to the perceived location of leading and lagging 10-ms long low-pass filtered white noises or 2-s long tokens of speech. The leading and lagging stimuli were presented either from speakers located directly in front of the listeners or from speakers located ±45° to the right or left. The results indicate that leading compact (coherent) sounds influence perceived location more so than do leading diffuse (incoherent) sounds. This was true independent of whether the sounds were Gaussian noises or tokens of speech.
… enunciation, accent, and pronunciation may adversely affect sound conveyance and speech intelligibility during respirator wear. It was postulated that an … having three sounds in a consonant-vowel-consonant sequence. The MRT requires listeners to correctly identify single-syllable words spoken by a … respirator wearer. Sound output level, enunciation, accent, and pronunciation of the respirator-wearing test speaker may adversely affect the sound …
Evans, S; McGettigan, C; Agnew, ZK; Rosen, S; Scott, SK
Spoken conversations typically take place in noisy environments, and different kinds of masking sounds place differing demands on cognitive resources. Previous studies, examining the modulation of neural activity associated with the properties of competing sounds, have shown that additional speech streams engage the superior temporal gyrus. However, the absence of a condition in which target speech was heard without additional masking made it difficult to identify brain networks specific to masking and to ascertain the extent to which competing speech was processed equivalently to target speech. In this study, we scanned young healthy adults with continuous functional Magnetic Resonance Imaging (fMRI) whilst they listened to stories masked by sounds that differed in their similarity to speech. We show that auditory attention and control networks are activated during attentive listening to masked speech in the absence of an overt behavioural task. We demonstrate that competing speech is processed predominantly in the left hemisphere within the same pathway as target speech but is not treated equivalently within that stream, and that individuals who perform better in speech-in-noise tasks activate the left mid-posterior superior temporal gyrus more. Finally, we identify neural responses associated with the onset of sounds in the auditory environment; this activity was found within right-lateralised frontal regions, consistent with a phasic alerting response. Taken together, these results provide a comprehensive account of the neural processes involved in listening in noise.
Büchler, Michael; Allegro, Silvia; Launer, Stefan; Dillier, Norbert
A sound classification system for the automatic recognition of the acoustic environment in a hearing aid is discussed. The system distinguishes the four sound classes "clean speech," "speech in noise," "noise," and "music." A number of features that are inspired by auditory scene analysis are extracted from the sound signal. These features describe amplitude modulations, spectral profile, harmonicity, amplitude onsets, and rhythm. They are evaluated together with different pattern classifiers. Simple classifiers, such as rule-based and minimum-distance classifiers, are compared with more complex approaches, such as Bayes classifier, neural network, and hidden Markov model. Sounds from a large database are employed for both training and testing of the system. The achieved recognition rates are very high except for the class "speech in noise." Problems arise in the classification of compressed pop music, strongly reverberated speech, and tonal or fluctuating noises.
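A minimum-distance classifier of the kind compared in this study assigns a feature vector to the class whose mean feature vector is nearest. A minimal sketch; the two features and the class means below are illustrative assumptions, not the paper's feature set or trained values.

```python
import numpy as np

def features(frame, sr):
    """Crude 2-D feature vector for an audio frame: spectral centroid
    [Hz] and a normalized envelope-variability measure, loosely in the
    spirit of spectral-profile and amplitude-modulation features."""
    spec = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), 1 / sr)
    centroid = (freqs * spec).sum() / (spec.sum() + 1e-12)
    envelope = np.abs(frame)
    return np.array([centroid, envelope.std() / (envelope.mean() + 1e-12)])

def minimum_distance_classify(x, class_means):
    """Return the class label whose mean feature vector is closest
    (Euclidean distance) to x."""
    names = list(class_means)
    dists = [np.linalg.norm(x - class_means[n]) for n in names]
    return names[int(np.argmin(dists))]

# Hypothetical class means learned from training data
means = {"speech": np.array([1500.0, 0.9]),
         "noise":  np.array([4000.0, 0.2])}
print(minimum_distance_classify(np.array([1600.0, 0.8]), means))  # speech
```

The study's point is that even such simple classifiers can be compared against Bayes classifiers, neural networks, and hidden Markov models on the same feature set; only the simplest baseline is sketched here.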
Saigusa, Hideto; Yamaguchi, Satoshi; Nakamura, Tsuyoshi; Komachi, Taro; Kadosono, Osamu; Ito, Hiroyuki; Saigusa, Makoto; Niimi, Seiji
Amyotrophic lateral sclerosis (ALS) is a progressive debilitating neurological disease. ALS disturbs the quality of life by affecting speech, swallowing and free mobility of the arms without affecting intellectual function. It is therefore of significance to improve intelligibility and quality of speech sounds, especially for ALS patients with slowly progressive courses. Currently, however, there is no effective or established approach to improve speech disorder caused by ALS. We investigated a surgical procedure to improve speech disorder for some patients with neuromuscular diseases with velopharyngeal closure incompetence. In this study, we performed the surgical procedure for two patients suffering from severe speech disorder caused by slowly progressing ALS. The patients suffered from speech disorder with hypernasality and imprecise and weak articulation during a 6-year course (patient 1) and a 3-year course (patient 2) of slowly progressing ALS. We narrowed bilateral lateral palatopharyngeal wall at velopharyngeal port, and performed this surgery under general anesthesia without muscle relaxant for the two patients. Postoperatively, intelligibility and quality of their speech sounds were greatly improved within one month without any speech therapy. The patients were also able to generate longer speech phrases after the surgery. Importantly, there was no serious complication during or after the surgery. In summary, we performed bilateral narrowing of lateral palatopharyngeal wall as a speech surgery for two patients suffering from severe speech disorder associated with ALS. With this technique, improved intelligibility and quality of speech can be maintained for longer duration for the patients with slowly progressing ALS.
Schmidbauer, O; Casacuberta, F; Castro, M J; Hegerl, G; Höge, H; Sanchez, J A; Zlokarnik, I
In this paper we demonstrate the feasibility and usefulness of articulation-based approaches in two major areas of speech technology: speech recognition and speech synthesis. Our articulatory recognition model estimates probabilities of categories of manner and place of articulation, which establish the articulatory feature vector. The transformation from the articulatory level to the symbolic level is performed by hidden Markov models or multi-layer perceptrons. Evaluations show that the articulatory approach is a good basis for speaker-independent and speaker-adaptive speech recognition. We are now working on a more realistic articulatory model for speech recognition. An algorithm based on an analysis-by-synthesis model maps the acoustic signal to 10 articulatory parameters which describe the position of the articulators. EMA (electro-magnetic articulograph) measurements recorded at the University of Munich provide good initial estimates of tongue coordinates. In order to improve articulatory speech synthesis we investigated an accurate physical model for the generation of the glottal source with the aid of a numerical simulation. This model takes into account nonlinear vortical flow and its interaction with sound waves. The simulation results can be used to improve the articulatory synthesis model developed by Ishizaka and Flanagan (1972).
Willadsen, Elisabeth; Chapman, Kathy
The purpose of this chapter is to provide an overview of speech development of children with cleft palate +/- cleft lip. The chapter will begin with a discussion of the impact of clefting on speech. Next, we will provide a brief description of those factors impacting speech development for this population of children. Finally, research examining various aspects of speech development of infants and young children with cleft palate (birth to age five) will be reviewed. This final section will be organized by typical stages of speech sound development (e.g., prespeech, the early word stage …
Park, H K; Bradley, J S
Subjective ratings of the audibility, annoyance, and loudness of music and speech sounds transmitted through 20 different simulated walls were used to identify better single-number ratings of airborne sound insulation. The first part of this research considered standard measures such as the sound transmission class and the weighted sound reduction index (R(w)), as well as variations of these measures [H. K. Park and J. S. Bradley, J. Acoust. Soc. Am. 126, 208-219 (2009)]. This paper considers a number of other measures, including signal-to-noise ratios related to the intelligibility of speech and measures related to the loudness of sounds. An exploration of the importance of the included frequencies showed that the optimum ranges of included frequencies were different for speech and music sounds. Measures related to speech intelligibility were useful indicators of responses to speech sounds but were not as successful for music sounds. A-weighted level differences, signal-to-noise ratios and an A-weighted sound transmission loss measure were good predictors of responses when the included frequencies were optimized for each type of sound. The addition of new spectrum adaptation terms to R(w) values was found to be the most practical approach for achieving more accurate predictions of subjective ratings of transmitted speech and music sounds.
Wormsley, W. E.
This article introduces a group of six papers on sustainability of programs for visually handicapped persons in developing countries. Sustainability is discussed from an anthropological perspective, noting the importance of a social soundness analysis and a social impact assessment, enemies of sustainability, and the need for broad local input in…
Møller, Martin Bo; Olsen, Martin
Sound zones, i.e. spatially confined regions of individual audio content, can be created by appropriate filtering of the desired audio signals reproduced by an array of loudspeakers. The challenge of designing filters for sound zones is twofold: first, the filtered responses should generate an acoustic separation between the control regions; secondly, the pre- and post-ringing as well as spectral deterioration introduced by the filters should be minimized. The tradeoff between acoustic separation and filter ringing is the focus of this paper. A weighted L2-norm penalty is introduced in the sound …
Riecke, Lars; Formisano, Elia; Sorger, Bettina; Başkent, Deniz; Gaudrain, Etienne
Speech is crucial for communication in everyday life. Speech-brain entrainment, the alignment of neural activity to the slow temporal fluctuations (envelope) of acoustic speech input, is a ubiquitous element of current theories of speech processing. Associations between speech-brain entrainment and
Oppenheim, Gary M.; Dell, Gary S.
Inner speech is typically characterized as either the activation of abstract linguistic representations or a detailed articulatory simulation that lacks only the production of sound. We present a study of the ‘speech errors’ that occur during the inner recitation of tongue-twister-like phrases. Two forms of inner speech were tested: inner speech without articulatory movements and articulated (mouthed) inner speech. While mouthing one’s inner speech could reasonably be assumed to require more articulatory planning, prominent theories assume that such planning should not affect the experience of inner speech and consequently the errors that are ‘heard’ during its production. The errors occurring in articulated inner speech exhibited the phonemic similarity effect and lexical bias effect, two speech-error phenomena that, in overt speech, have been localized to an articulatory-feature processing level and a lexical-phonological level, respectively. In contrast, errors in unarticulated inner speech did not exhibit the phonemic similarity effect—just the lexical bias effect. The results are interpreted as support for a flexible abstraction account of inner speech. This conclusion has ramifications for the embodiment of language and speech and for the theories of speech production.
Gold, Ellen Reid
Presents a guide to recorded sound and television collections available to scholars, researchers, and teachers in mass communication, political rhetoric, public address, folklore, theatre, intercultural communication, oral interpretation, speech, and drama. Focuses on spoken word recordings, such as newscasts and speeches, and includes recordings…
Lametti, Daniel R; Krol, Sonia A; Shiller, Douglas M; Ostry, David J
The perception of speech is notably malleable in adults, yet alterations in perception seem to have little impact on speech production. However, we hypothesized that speech perceptual training might immediately influence speech motor learning. To test this, we paired a speech perceptual-training task with a speech motor-learning task. Subjects performed a series of perceptual tests designed to measure and then manipulate the perceptual distinction between the words head and had. Subjects then produced head with the sound of the vowel altered in real time so that they heard themselves through headphones producing a word that sounded more like had. In support of our hypothesis, the amount of motor learning in response to the voice alterations depended on the perceptual boundary acquired through perceptual training. The studies show that plasticity in adults' speech perception can have immediate consequences for speech production in the context of speech learning. © The Author(s) 2014.
Crocker, Malcolm J.; Jacobsen, Finn
This chapter is an overview, intended for readers with no special knowledge about this particular topic. The chapter deals with all aspects of sound intensity and its measurement, from the fundamental theoretical background to practical applications of the measurement technique.
Rao, K Sreenivasa
This book discusses the contribution of articulatory and excitation source information in discriminating sound units. The authors focus on the excitation source component of speech, and the dynamics of various articulators during speech production, for enhancement of speech recognition (SR) performance. Speech recognition is analyzed for read, extempore, and conversation modes of speech. Five groups of articulatory features (AFs) are explored for speech recognition, in addition to conventional spectral features. Each chapter provides the motivation for exploring the specific feature for the SR task, discusses the methods to extract those features, and finally suggests appropriate models to capture the sound-unit-specific knowledge from the proposed features. The authors close by discussing various combinations of spectral, articulatory and source features, and the desired models to enhance the performance of SR systems.
Russo, Nicole; Nicol, Trent; Trommer, Barbara; Zecker, Steve; Kraus, Nina
Language impairment is a hallmark of autism spectrum disorders (ASD). The origin of the deficit is poorly understood although deficiencies in auditory processing have been detected in both perception and cortical encoding of speech sounds. Little is known about the processing and transcription of speech sounds at earlier (brainstem) levels or…
Suter, 1989b). Speech sounds are called phonemes. These phonemes can be divided into two groups: vowels and consonants. Vowel sounds tend to ... and are generally of the form consonant-vowel-consonant (CVC). The lists were generated to form 50 related ensembles, each ensemble consisting of 6 ... words. The vowel in each ensemble is fixed, and the words vary in either the initial or the final phoneme. For each stimulus word the listener must
Rämö, Jussi; Christensen, Lasse; Bech, Søren
This paper focuses on validating a perceptual distraction model, which aims to predict a user's perceived distraction caused by audio-on-audio interference. Originally, the distraction model was trained with music targets and interferers using a simple loudspeaker setup consisting of only two loudspeakers. Recently, the model was successfully validated in a complex personal sound-zone system with music targets and speech interferers. In this paper, a second round of validations was conducted by physically altering the sound-zone system and running listening experiments utilizing both of the two sound zones within the sound-zone system, thus validating the model using a different sound-zone system with both speech-on-music and music-on-speech stimulus sets. The results show that the model performance is equally good in both zones, i.e., with both speech-on-music and music-on-speech stimuli.
Terband, H.R.; van Brenk, F.J.; van Doornik-van der Zee, J.C.
Background/purpose: Several studies indicate a close relation between auditory and speech motor functions in children with speech sound disorders (SSD). The aim of this study was to investigate the ability to compensate and adapt for perturbed auditory feedback in children with SSD compared to
Trouvain, Jürgen; Truong, Khiet Phuong; Devillers, L.; Schuller, B.; Batliner, A.; Rosso, P.; Douglas-Cowie, E.; Cowie, R.; Pelachaud, C.
Conversations do not only consist of spoken words; they also contain non-verbal vocalisations. Since there is no standard for defining and classifying (possible) non-speech sounds, the annotations for these vocalisations differ greatly across corpora of conversational speech. There seems to
McCormack, Jane; McAllister, Lindy; McLeod, Sharynne; Harrison, Linda
This study describes the experience of childhood speech impairment (speech sound disorder) from the perspective of two young men and their mothers. Semi-structured interviews were conducted with the four participants, with questions framed around the "International classification of functioning, disability and health" (ICF; WHO, 2001) to gain a…
Shriberg, Lawrence D.; Lewis, Barbara A.; Tomblin, J. Bruce; McSweeny, Jane L.; Karlsson, Heather B.; Scheer, Alison R.
Converging evidence supports the hypothesis that the most common subtype of childhood speech sound disorder (SSD) of currently unknown origin is genetically transmitted. We report the first findings toward a set of diagnostic markers to differentiate this proposed etiological subtype (provisionally termed "speech delay-genetic") from other…
Shriberg, Lawrence D.; Paul, Rhea; Black, Lois M.; van Santen, Jan P.
In a sample of 46 children aged 4 to 7 years with Autism Spectrum Disorder (ASD) and intelligible speech, there was no statistical support for the hypothesis of concomitant Childhood Apraxia of Speech (CAS). Perceptual and acoustic measures of participants’ speech, prosody, and voice were compared with data from 40 typically-developing children, 13 preschool children with Speech Delay, and 15 participants aged 5 to 49 years with CAS in neurogenetic disorders. Speech Delay and Speech Errors, respectively, were modestly and substantially more prevalent in participants with ASD than reported population estimates. Double dissociations in speech, prosody, and voice impairments in ASD were interpreted as consistent with a speech attunement framework, rather than with the motor speech impairments that define CAS. Key Words: apraxia, dyspraxia, motor speech disorder, speech sound disorder PMID:20972615
Visser-Bochane, Margot I.; Luinge, Margreet R.; Reijneveld, Sijmen A.; Krijnen, W.P.; Schans, van der, C.P.
Speech-language disorders, which include speech sound disorders and language disorders, are common in early childhood. These problems, and in particular language problems, frequently go underdiagnosed because current screening instruments lack satisfactory psychometric properties. Recent research describes consensus among healthcare professionals on clinical signs of atypical speech-language development. The aim of this study is to construct a scale with characteristics from different doma...
Anne Birgitta Nilsen
The manifesto of the Norwegian terrorist Anders Behring Breivik is based on the “Eurabia” conspiracy theory. This theory is a key starting point for hate speech amongst many right-wing extremists in Europe, but also has ramifications beyond these environments. In brief, proponents of the Eurabia theory claim that Muslims are occupying Europe and destroying Western culture, with the assistance of the EU and European governments. By contrast, members of Al-Qaeda and other extreme Islamists promote the conspiracy theory “the Crusade” in their hate speech directed against the West. Proponents of the latter theory argue that the West is leading a crusade to eradicate Islam and Muslims, a crusade that is similarly facilitated by their governments. This article presents analyses of texts written by right-wing extremists and Muslim extremists in an effort to shed light on how hate speech promulgates conspiracy theories in order to spread hatred and intolerance. The aim of the article is to contribute to a more thorough understanding of the nature of hate speech by applying rhetorical analysis. Rhetorical analysis is chosen because it offers a means of understanding the persuasive power of speech. It is thus a suitable tool to describe how hate speech works to convince and persuade. The concepts from rhetorical theory used in this article are ethos, logos and pathos. The concept of ethos is used to pinpoint factors that contributed to Osama bin Laden's impact, namely factors that lent credibility to his promotion of the conspiracy theory of the Crusade. In particular, Bin Laden projected common sense, good morals and good will towards his audience. He seemed to have coherent and relevant arguments; he appeared to possess moral credibility; and his use of language demonstrated that he wanted the best for his audience. The concept of pathos is used to define hate speech, since hate speech targets its audience's emotions. In hate speech it is the
Hu, Weiping; Lai, Kefang; Du, Minghui; Chen, Ruchong; Zhong, Shijung; Chen, Rongchang; Zhong, Nanshan
Cough is one of the most common symptoms of many respiratory diseases; the intensity and frequency characteristics of cough sounds carry important clinical information. To use this information, we need to differentiate the cough sound from other sounds such as speech, throat clearing and nose clearing. In this paper, based on Empirical Mode Decomposition (EMD) and the Hidden Markov Model (HMM), we propose a novel method to analyze and detect cough sounds. Employing the adaptive dyadic filter-bank property of EMD, we obtained the mean energy distribution in the frequency domain of the signals in order to analyze the statistical characteristics of cough sounds and of other sounds not accompanied by cough, and then we found the optimal characteristics for recognition using an HMM. Experiments on clinical data showed that this method effectively improved the detection rate of cough sounds.
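The feature described in the abstract above, a mean energy distribution over dyadic frequency bands, can be sketched as follows. This is a simplified illustration, not the authors' implementation: it replaces EMD's adaptive decomposition with a plain DFT split into fixed dyadic bands, and the names (`band_energy_distribution`, `n_bands`) are invented for the example.

```python
import math

def band_energy_distribution(signal, n_bands=4):
    """Normalized energy distribution across dyadic frequency bands of a
    signal, computed with a direct DFT (O(N^2), fine for short frames).
    The fixed halving of the spectrum mimics, very roughly, the dyadic
    filter-bank behaviour of EMD described in the paper."""
    n = len(signal)
    # Direct DFT power spectrum (positive frequencies only).
    mags = []
    for k in range(n // 2):
        re = sum(signal[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(signal[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mags.append(re * re + im * im)
    # Band edges: top half of the spectrum, then the next quarter, etc.
    edges = [len(mags)]
    for _ in range(n_bands):
        edges.append(edges[-1] // 2)
    energies = [sum(mags[lo:hi]) for hi, lo in zip(edges, edges[1:])]
    total = sum(energies) or 1.0
    return [e / total for e in energies]
```

A low-frequency tone concentrates its energy in the lowest band (the last entry), which is the kind of between-class contrast the HMM is then trained on.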
Lewis, Barbara A.; Freebairn, Lisa A.; Hansen, Amy J.; Iyenger, Sudha K.; Taylor, H. Gerry
Purpose: The primary aim of this study was to examine differences in speech/language and written language skills between children with suspected childhood apraxia of speech (CAS) and children with other speech-sound disorders at school age. Method: Ten children (7 males and 3 females) who were clinically diagnosed with CAS (CAS group) were…
Meronen, Auli; Tiippana, Kaisa; Westerholm, Jari; Ahonen, Timo
Purpose: The effect of the signal-to-noise ratio (SNR) on the perception of audiovisual speech in children with and without developmental language disorder (DLD) was investigated by varying the noise level and the sound intensity of acoustic speech. The main hypotheses were that the McGurk effect (in which incongruent visual speech alters the…
Arweiler, Iris; Buchholz, Jörg
The auditory system takes advantage of early reflections (ERs) in a room by integrating them with the direct sound (DS) and thereby increasing the effective speech level. In the present paper the benefit from realistic ERs on speech intelligibility in diffuse speech-shaped noise was investigated...
Gerrits, Ellen; de Bree, Elise
Speech perception and speech production were examined in 3-year-old Dutch children at familial risk of developing dyslexia. Their performance in speech sound categorisation and their production of words was compared to that of age-matched children with specific language impairment (SLI) and typically developing controls. We found that speech…
Benesty, Jacob; Chen, Jingdong
We live in a noisy world! In all applications (telecommunications, hands-free communications, recording, human-machine interfaces, etc.) that require at least one microphone, the signal of interest is usually contaminated by noise and reverberation. As a result, the microphone signal has to be "cleaned" with digital signal processing tools before it is played out, transmitted, or stored. This book is about speech enhancement. Different well-known and state-of-the-art methods for noise reduction, with one or multiple microphones, are discussed. By speech enhancement, we mean not only noise red
Elliott, Emily M.; Bhagat, Shaum P.; Lynn, Sharon D.
This study investigated the effects of irrelevant sounds on the serial recall performance of visually presented digits in a sample of children diagnosed with (central) auditory processing disorders [(C)APD] and age- and span-matched control groups. The irrelevant sounds used were samples of tones and speech. Memory performance was significantly…
Knowledge extraction by just listening to sounds is a distinctive property. A speech signal is a more effective means of communication than text because blind and visually impaired persons can also respond to sounds. This paper aims to develop a cost-effective and user-friendly optical character recognition (OCR)-based speech synthesis system. The OCR-based speech synthesis system has been developed using Laboratory Virtual Instrument Engineering Workbench (LabVIEW) 7.1.
Weisser, Adam; Rindel, Jens Holger
The acoustics of small rooms has been studied with emphasis on sound quality, boominess and boxiness when the rooms are used for speech or music. Seven rooms with very different characteristics have been used for the study. Subjective listening tests were made using binaural recordings ... ratings. The classical bass ratio definitions showed poor correlation with all subjective ratings. The overall sound quality ratings gave different results for speech and music. For speech the preferred mean RT should be as low as possible, whereas for music there was found a preferred range between 0 ...
Carbonell, Kathy M.
One of the lasting concerns in audiology is the unexplained individual differences in speech perception performance even for individuals with similar audiograms. One proposal is that there are cognitive/perceptual individual differences underlying this vulnerability and that these differences are present in normal hearing (NH) individuals but do not reveal themselves in studies that use clear speech produced in quiet (because of a ceiling effect). However, previous studies have failed to uncover cognitive/perceptual variables that explain much of the variance in NH performance on more challenging degraded speech tasks. This lack of strong correlations may be due either to examining the wrong measures (e.g., working memory capacity) or to there being no reliable differences in degraded speech performance in NH listeners (i.e., variability in performance is due to measurement noise). The proposed project has three aims. The first is to establish whether there are reliable individual differences in degraded speech performance for NH listeners that are sustained both across degradation types (speech in noise, compressed speech, noise-vocoded speech) and across multiple testing sessions. The second aim is to establish whether there are reliable differences in NH listeners' ability to adapt their phonetic categories based on short-term statistics, both across tasks and across sessions. The final aim is to determine whether performance on degraded speech perception tasks is correlated with performance on phonetic adaptability tasks, thus establishing a possible explanatory variable for individual differences in speech perception for NH and hearing-impaired listeners.
Bresin, Roberto; Askenfelt, Anders; Friberg, Anders; Hansen, Kjetil; Ternström, Sten
The SMC Sound and Music Computing group at KTH (formerly the Music Acoustics group) is part of the Department of Speech Music and Hearing, School of Computer Science and Communication. In this short report we present the current status of the group mainly focusing on its research.
Explorations and analysis of soundscapes have, since Canadian R. Murray Schafer's work during the early 1970's, developed into various established research and artistic disciplines. The interest in sonic environments is today present within a broad range of contemporary art projects and in architectural design. Aesthetics, psychoacoustics, perception, and cognition are all present in this expanding field, embracing such categories as soundscape composition, sound art, sonic art, sound design, sound studies and auditory culture. Of greatest significance to the overall field is the investigation ...
Imai, Mutsumi; Kita, Sotaro
Sound symbolism is a non-arbitrary relationship between speech sounds and meaning. We review evidence that, contrary to the traditional view in linguistics, sound symbolism is an important design feature of language, which affects online processing of language, and most importantly, language acquisition. We propose the sound symbolism bootstrapping hypothesis, claiming that (i) pre-verbal infants are sensitive to sound symbolism, due to a biologically endowed ability to map and integrate multi-modal input, (ii) sound symbolism helps infants gain referential insight for speech sounds, (iii) sound symbolism helps infants and toddlers associate speech sounds with their referents to establish a lexical representation and (iv) sound symbolism helps toddlers learn words by allowing them to focus on referents embedded in a complex scene, alleviating Quine's problem. We further explore the possibility that sound symbolism is deeply related to language evolution, drawing the parallel between historical development of language across generations and ontogenetic development within individuals. Finally, we suggest that sound symbolism bootstrapping is a part of a more general phenomenon of bootstrapping by means of iconic representations, drawing on similarities and close behavioural links between sound symbolism and speech-accompanying iconic gesture. © 2014 The Author(s) Published by the Royal Society. All rights reserved.
Duelund Mortensen, Peder
Presentation of project results from the Interreg research project Sound Settlements on the development of sustainability in social housing in Copenhagen, Malmö, Helsingborg and Lund, together with European examples of best practice.
Speech intelligibility (SI) is important for different fields of research, engineering and diagnostics in order to quantify very different phenomena, such as the quality of recordings, communication and playback devices, the reverberation of auditoria, characteristics of hearing impairment, the benefit of using hearing aids, or combinations of these.
In order for speech to be informative and communicative, segmental and suprasegmental variation is mandatory. Only this leads to meaningful words and sentences. The building blocks are not stable entities put next to each other (like beads on a string or like printed text); rather, there are gradual
Benesty, Jacob; Jensen, Jesper Rindom; Christensen, Mads Græsbøll
and their performance bounded and assessed in terms of noise reduction and speech distortion. The book shows how various filter designs can be obtained in this framework, including the maximum SNR, Wiener, LCMV, and MVDR filters, and how these can be applied in various contexts, like in single-channel and multichannel...
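As a concrete illustration of the simplest of the filter designs named above, here is a minimal single-channel Wiener-style gain rule. This is a textbook sketch under the assumption of known per-bin noise power estimates, not code from the book, and the function name and inputs are illustrative.

```python
def wiener_gain(noisy_psd, noise_psd, floor=1e-3):
    """Per-bin Wiener filter gain H[k] = 1 - N[k]/Y[k], clamped below by a
    small spectral floor to limit musical-noise artifacts. Inputs are lists
    of power spectral density estimates for the noisy signal and the noise."""
    gains = []
    for y, n in zip(noisy_psd, noise_psd):
        # Bins dominated by noise are attenuated; clean bins pass through.
        g = 1.0 - n / y if y > 0.0 else 0.0
        gains.append(min(1.0, max(floor, g)))
    return gains
```

The `floor` parameter embodies the noise-reduction-versus-speech-distortion trade-off discussed in the book: a larger floor distorts speech less but removes less noise.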
Iarocci, Grace; Rombough, Adrienne; Yager, Jodi; Weeks, Daniel J.; Chua, Romeo
The bimodal perception of speech sounds was examined in children with autism as compared to mental age--matched typically developing (TD) children. A computer task was employed wherein only the mouth region of the face was displayed and children reported what they heard or saw when presented with consonant-vowel sounds in unimodal auditory…
de Jong, Franciska M.G.; Gauvain, Jean-Luc; den Hartog, Jurgen; den Hartog, Jeremy; Netter, Klaus
This paper describes the Olive project which aims to support automated indexing of video material by use of human language technologies. Olive is making use of speech recognition to automatically derive transcriptions of the sound tracks, generating time-coded linguistic elements which serve as the
G. Vinodh Kumar; Tamesh Halder; Amit Kumar Jaiswal; Abhishek Mukherjee; Dipanjan Roy; Arpan Banerjee
Observable lip movements of the speaker influence perception of auditory speech. A classical example of this influence is reported by listeners who perceive an illusory (cross-modal) speech sound (McGurk-effect) when presented with incongruent audio-visual (AV) speech stimuli. Recent neuroimaging studies of AV speech perception accentuate the role of frontal, parietal, and the integrative brain sites in the vicinity of the superior temporal sulcus (STS) for multisensory speech perception. How...
A. Goedegebure (Andre)
Hearing-aid users often continue to have problems with poor speech understanding in difficult acoustical conditions. Another generally acknowledged problem is that certain sounds become too loud whereas other sounds are still not audible. Dynamic range compression is a signal processing
Baart, Martijn; Bortfeld, Heather; Vroomen, Jean
The correspondence between auditory speech and lip-read information can be detected based on a combination of temporal and phonetic cross-modal cues. Here, we determined the point in developmental time at which children start to effectively use phonetic information to match a speech sound with one of two articulating faces. We presented 4- to 11-year-olds (N=77) with three-syllabic sine-wave speech replicas of two pseudo-words that were perceived as non-speech and asked them to match the sounds with the corresponding lip-read video. At first, children had no phonetic knowledge about the sounds, and matching was thus based on the temporal cues that are fully retained in sine-wave speech. Next, we trained all children to perceive the phonetic identity of the sine-wave speech and repeated the audiovisual (AV) matching task. Only at around 6.5 years of age did the benefit of having phonetic knowledge about the stimuli become apparent, thereby indicating that AV matching based on phonetic cues presumably develops more slowly than AV matching based on temporal cues. Copyright © 2014 Elsevier Inc. All rights reserved.
Second Sound - The Role of Elastic Waves. R Srinivasan. General Article, Resonance - Journal of Science Education, Volume 4, Issue 6, June 1999, pp. 15-19. Permanent link: http://www.ias.ac.in/article/fulltext/reso/004/06/0015-0019
Second Sound - Waves of Entropy and Temperature. R Srinivasan. General Article, Resonance - Journal of Science Education, Volume 4, Issue 3, March 1999, pp. 16-24. Permanent link: http://www.ias.ac.in/article/fulltext/reso/004/03/0016-0024
Frankle, Christen M.
There is provided an apparatus and method for assisting speech recovery in people with inability to speak due to aphasia, apraxia or another condition with similar effect. A hollow, rigid, thin-walled tube with semi-circular or semi-elliptical cut out shapes at each open end is positioned such that one end mates with the throat/voice box area of the neck of the assistor and the other end mates with the throat/voice box area of the assisted. The speaking person (assistor) makes sounds that produce standing wave vibrations at the same frequency in the vocal cords of the assisted person. Driving the assisted person's vocal cords with the assisted person being able to hear the correct tone enables the assisted person to speak by simply amplifying the vibration of membranes in their throat.
This paper considers a medium as a substantial translator: an intermediary between the producers and receivers of a communicational act. A medium is a material support to the spiritual potential of human sources. If the medium is a support to meaning, then the relations between different media can be interpreted as a space for making sense of these meanings, a generator of sense: it means that the interaction of substances creates an intermedial space that conceives of a contextualization of specific meaningful elements in order to combine them into the sense of a communicational intervention. The theater itself is multimedia. A theatrical event is a communicational act based on a combination of several autonomous structures: text, scenography, light design, sound, directing, literary interpretation, speech, and, of course, the one that contains all of these: the actor in a human body. The actor is a physical and symbolic, anatomic, and emblematic figure in the synesthetic theatrical act because he reunites in his body all the essential principles and components of theater itself. The actor is an audio-visual being, made of kinetic energy, speech, and human spirit. The actor’s body, as a source, instrument, and goal of the theater, becomes an intersection of sound and light. However, theater as intermedial art is no intermediate practice; it must be seen as interposing bodies between conceivers and receivers, between authors and auditors. The body is not self-evident; the body in contemporary art forms is being redefined as a privilege. The art needs bodily dimensions to explore the medial qualities of substances: because it is alive, it returns to studying biology. The fact that theater is an archaic art form is also the purest promise of its future.
Clarke, Jeanne; Gaudrain, Etienne; Chatterjee, Monita; Başkent, Deniz
Phonemic restoration, or top-down repair of speech, is the ability of the brain to perceptually reconstruct missing speech sounds using remaining speech features, linguistic knowledge and context. This usually occurs in conditions where the interrupted speech is perceived as continuous. The main
Rämö, Jussi; Christensen, Lasse; Bech, Søren
This paper focuses on validating a perceptual distraction model, which aims to predict a user's perceived distraction caused by audio-on-audio interference, e.g., two competing audio sources within the same listening space. Originally, the distraction model was trained with music-on-music stimuli using a simple loudspeaker setup, consisting of only two loudspeakers, one for the target sound source and the other for the interfering sound source. Recently, the model was successfully validated in a complex personal sound-zone system with speech-on-music stimuli. A second round of validations was conducted by physically altering the sound-zone system and running a set of new listening experiments utilizing two sound zones within the sound-zone system, thus validating the model using a different sound-zone system with both speech-on-music and music-on-speech stimulus sets. Preliminary results show ...
Jiao, Mingke; Lu, Guohua; Jing, Xijing; Li, Sheng; Li, Yanfeng; Wang, Jianqi
Different speech detection sensors have been developed over the years but they are limited by the loss of high frequency speech energy, and have restricted non-contact detection due to the lack of penetrability. This paper proposes a novel millimeter microwave radar sensor to detect speech signals. The utilization of a high operating frequency and a superheterodyne receiver contributes to the high sensitivity of the radar sensor for small sound vibrations. In addition, the penetrability of microwaves allows the novel sensor to detect speech signals through nonmetal barriers. Results show that the novel sensor can detect high frequency speech energies and that the speech quality is comparable to traditional microphone speech. Moreover, the novel sensor can detect speech signals through a nonmetal material of a certain thickness between the sensor and the subject. Thus, the novel speech sensor expands traditional speech detection techniques and provides an exciting alternative for broader application prospects.
Speech recognition systems require segmentation of the speech waveform into fundamental acoustic units. Segmentation is a process of decomposing the speech signal into smaller units. Speech segmentation can be done using wavelets, fuzzy methods, artificial neural networks or hidden Markov models. Speech segmentation is the process of breaking a continuous stream of sound into basic units, such as words, phonemes or syllables, that can be recognized. Segmentation can also be used to distinguish different types of audio signals in large amounts of audio data, often referred to as audio classification. Speech segmentation can be divided into two categories based on whether the algorithm uses previous knowledge of the data to process the speech: blind segmentation and aided segmentation. The major issues with connected speech recognition algorithms are that the vocabulary grows larger with variation in the combination of words in the connected speech, and that the complexity of finding the best match for a given test pattern increases. To overcome these issues, the connected speech has to be segmented into words using the attributes of speech. A methodology using the temporal feature Short Term Energy is proposed and compared with an existing algorithm called the Dynamic Thresholding segmentation algorithm, which uses a spectrogram image of the connected speech for segmentation.
van der Torn, M; Verdonck-de Leeuw, I M; Festen, J M; de Vries, M P; Mahieu, H F
In order to improve voice quality in female laryngectomees and/or laryngectomees with a hypotonic pharyngo-oesophageal segment, a sound-producing voice prosthesis was designed. The new source of voice consists of either one or two bent silicone lips which perform an oscillatory movement driven by the expired pulmonary air that flows along the outward-striking lips through the tracheo-oesophageal shunt valve. Four different prototypes of this pneumatic sound source were evaluated in vitro and in two female laryngectomees, testing the feasibility and characteristics of this new mechanism for alternative alaryngeal voice production. In vivo evaluation included acoustic analyses of both sustained vowels and read-aloud prose, videofluoroscopy, speech rate, and registration of tracheal phonatory pressure and vocal intensity. The mechanism proved feasible and did not result in unacceptable airflow resistance. The average pitch of voice increased and clarity improved in female laryngectomees. Pitch regulation of this prosthetic voice is possible with sufficient modulation to avoid monotony. The quality of voice attained through the sound-producing voice prostheses depends on a patient's ability to let pulmonary air flow easily through the pharyngo-oesophageal segment without evoking the low-frequency mucosal vibrations that form the regular tracheo-oesophageal shunt voice. These initial experimental and clinical results provide directions for the future development of sound-producing voice prostheses. A single relatively long lip in a container with a rectangular lumen that hardly protrudes from the voice prosthesis may have the most promising characteristics.
Full Text Available One of the major problems concerning the evolution of human language is to understand how sounds became associated with meaningful gestures. It has been proposed that the circuit controlling gestures and speech evolved from a circuit involved in the control of arm and mouth movements related to ingestion. This circuit contributed to the evolution of spoken language, moving from a system of communication based on arm gestures. The discovery of mirror neurons has provided strong support for the gestural theory of speech origin because they offer a natural substrate for the embodiment of language and create a direct link between the sender and receiver of a message. Behavioural studies indicate that manual gestures are linked to mouth movements used for syllable emission. Grasping with the hand selectively affected movement of inner or outer parts of the mouth according to syllable pronunciation, and hand postures, in addition to hand actions, influenced the control of mouth grasp and vocalization. Gestures and words are also related to each other. It was found that when producing communicative gestures (emblems), the intention to interact directly with a conspecific was transferred from gestures to words, inducing modifications in voice parameters. Transfer effects of the meaning of representational gestures were found on both vocalizations and meaningful words. It has been concluded that the results of our studies suggest the existence of a system relating gesture to vocalization that was the precursor of a more general system reciprocally relating gesture to word.
Rao, Preeti; van Dinther, R.; Veldhuis, Raymond N.J.; Kohlrausch, A.
Both in speech synthesis and in sound coding it is often beneficial to have a measure that predicts whether, and to what extent, two sounds are different. This paper addresses the problem of estimating the perceptual effects of small modifications to the spectral envelope of a harmonic sound. A
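The abstract above is truncated before the measure is defined, so as a generic illustration of comparing the spectral envelopes of two sounds, here is a minimal sketch using a cepstrally smoothed envelope and an RMS log-spectral distance in dB. The envelope method, the cepstral cutoff, and the distance are assumptions for the sketch, not the authors' measure.

```python
import numpy as np

def spectral_envelope(frame, n_ceps=30):
    """Cepstrally smoothed log-magnitude spectral envelope (dB):
    keep only the low-quefrency cepstral coefficients."""
    spec = np.abs(np.fft.fft(frame)) + 1e-12
    log_spec = 20.0 * np.log10(spec)
    ceps = np.fft.ifft(log_spec).real
    ceps[n_ceps:-n_ceps] = 0.0          # discard fine harmonic structure
    return np.fft.fft(ceps).real[: len(frame) // 2]

def log_spectral_distance(a, b):
    """RMS difference (dB) between two spectral envelopes, as a
    simple stand-in for a perceptual difference measure."""
    ea, eb = spectral_envelope(a), spectral_envelope(b)
    return float(np.sqrt(np.mean((ea - eb) ** 2)))
```

For example, uniformly doubling a sound's amplitude shifts its log envelope by about 6 dB everywhere, so the distance to the original is about 6 dB; a perceptually motivated measure would additionally weight spectral regions by their audibility.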
Jerger, Susan; Damian, Markus F; McAlpine, Rachel P; Abdi, Hervé
To communicate, children must discriminate and identify speech sounds. Because visual speech plays an important role in this process, we explored how visual speech influences phoneme discrimination and identification by children. Critical items had intact visual speech (e.g. bæz) coupled to non-intact (excised onsets) auditory speech (signified by /-b/æz). Children discriminated syllable pairs that differed in intactness (i.e. bæz:/-b/æz) and identified non-intact nonwords (/-b/æz). We predicted that visual speech would cause children to perceive the non-intact onsets as intact, resulting in more same responses for discrimination and more intact (i.e. bæz) responses for identification in the audiovisual than auditory mode. Visual speech for the easy-to-speechread /b/ but not for the difficult-to-speechread /g/ boosted discrimination and identification (about 35-45%) in children from four to fourteen years. The influence of visual speech on discrimination was uniquely associated with the influence of visual speech on identification and receptive vocabulary skills.
Vitevitch, Michael S
The influence of phonological similarity neighborhoods on the speed and accuracy of speech production was investigated with speech-error elicitation and picture-naming tasks. The results from 2 speech-error elicitation techniques, the spoonerisms of laboratory-induced predisposition technique (B. J. Baars, 1992; B. J. Baars & M. T. Motley, 1974; M. T. Motley & B. J. Baars, 1976) and tongue twisters, showed that more errors were elicited for words with few similar-sounding words (i.e., a sparse neighborhood) than for words with many similar-sounding words (i.e., a dense neighborhood). The results from 3 picture-naming tasks showed that words with sparse neighborhoods were also named more slowly than words with dense neighborhoods. These findings demonstrate that multiple word forms are activated simultaneously and influence the speed and accuracy of speech production. The implications of these findings for current models of speech production are discussed.
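Phonological neighborhood density, the variable manipulated in the study above, is conventionally computed with the one-phoneme rule: two words are neighbors if one can be turned into the other by substituting, adding, or deleting a single phoneme. A minimal sketch of that count, with words represented as tuples of phoneme symbols (the function names and the toy ARPAbet-style lexicon are illustrative assumptions):

```python
def is_neighbor(w1, w2):
    """True if w2 differs from w1 by exactly one phoneme
    substitution, addition, or deletion."""
    if w1 == w2:
        return False
    if len(w1) == len(w2):                       # substitution
        return sum(a != b for a, b in zip(w1, w2)) == 1
    short, long_ = sorted((w1, w2), key=len)
    if len(long_) - len(short) != 1:
        return False
    return any(long_[:i] + long_[i + 1:] == short    # deletion/addition
               for i in range(len(long_)))

def neighborhood_density(word, lexicon):
    """Count the phonological neighbors of `word` in `lexicon`."""
    return sum(is_neighbor(word, w) for w in lexicon)
```

For instance, against a lexicon containing "bat", "cap", "cats" and "dog", the word "cat" has three neighbors (one substitution at onset, one at coda, and one addition); density counts like this are what distinguish the sparse and dense conditions in the tasks above.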