WorldWideScience

Sample records for continuous speech stream

  1. Knockdown of Dyslexia-Gene Dcdc2 Interferes with Speech Sound Discrimination in Continuous Streams.

    Science.gov (United States)

    Centanni, Tracy Michelle; Booker, Anne B; Chen, Fuyi; Sloan, Andrew M; Carraway, Ryan S; Rennaker, Robert L; LoTurco, Joseph J; Kilgard, Michael P

    2016-04-27

    Dyslexia is the most common developmental language disorder and is marked by deficits in reading and phonological awareness. One theory of dyslexia suggests that the phonological awareness deficit is due to abnormal auditory processing of speech sounds. Variants in DCDC2 and several other neural migration genes are associated with dyslexia and may contribute to auditory processing deficits. In the current study, we tested the hypothesis that RNAi suppression of Dcdc2 in rats causes abnormal cortical responses to sound and impaired speech sound discrimination. Rats were subjected in utero to RNA interference targeting of the gene Dcdc2 or a scrambled sequence. Primary auditory cortex (A1) responses were acquired from 11 rats (5 with Dcdc2 RNAi; DC-) before any behavioral training. A separate group of 8 rats (3 DC-) were trained on a variety of speech sound discrimination tasks, and auditory cortex responses were acquired following training. Dcdc2 RNAi nearly eliminated the ability of rats to identify specific speech sounds from a continuous train of speech sounds but did not impair performance during discrimination of isolated speech sounds. The neural responses to speech sounds in A1 were not degraded as a function of presentation rate before training. These results suggest that A1 is not directly involved in the impaired speech discrimination caused by Dcdc2 RNAi. This result contrasts with earlier results using Kiaa0319 RNAi and suggests that different dyslexia genes may cause different deficits in the speech processing circuitry, which may explain differential responses to therapy. Although dyslexia is diagnosed through reading difficulty, there is a great deal of variation in the phenotypes of these individuals. The underlying neural and genetic mechanisms causing these differences are still widely debated. In the current study, we demonstrate that suppression of a candidate-dyslexia gene causes deficits on tasks of rapid stimulus processing

  2. Rapid Statistical Learning Supporting Word Extraction From Continuous Speech.

    Science.gov (United States)

    Batterink, Laura J

    2017-07-01

    The identification of words in continuous speech, known as speech segmentation, is a critical early step in language acquisition. This process is partially supported by statistical learning, the ability to extract patterns from the environment. Given that speech segmentation represents a potential bottleneck for language acquisition, patterns in speech may be extracted very rapidly, without extensive exposure. This hypothesis was examined by exposing participants to continuous speech streams composed of novel repeating nonsense words. Learning was measured on-line using a reaction time task. After merely one exposure to an embedded novel word, learners demonstrated significant learning effects, as revealed by faster responses to predictable than to unpredictable syllables. These results demonstrate that learners gained sensitivity to the statistical structure of unfamiliar speech on a very rapid timescale. This ability may play an essential role in early stages of language acquisition, allowing learners to rapidly identify word candidates and "break in" to an unfamiliar language.
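
    As a rough illustration of the "statistical structure" this abstract refers to, the sketch below computes forward transitional probabilities between adjacent syllables in a continuous stream. The three nonsense words, the stream length, and the function name `transitional_probabilities` are hypothetical and are not taken from the study; this is one common way to quantify syllable predictability, not the author's analysis.

```python
from collections import Counter
import random


def transitional_probabilities(syllables):
    """Return P(next | current) for each adjacent syllable pair in the stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor (all but the last one).
    first_counts = Counter(syllables[:-1])
    return {
        (a, b): n / first_counts[a]
        for (a, b), n in pair_counts.items()
    }


# Hypothetical nonsense words (illustrative only, not the study's stimuli).
words = [("tu", "pi", "ro"), ("go", "la", "bu"), ("bi", "da", "ku")]

rng = random.Random(0)  # fixed seed so the stream is reproducible
stream = [syl for _ in range(300) for syl in rng.choice(words)]

tp = transitional_probabilities(stream)
within_word = tp[("tu", "pi")]  # transition inside a word
across_words = max(p for (a, b), p in tp.items() if a == "ro")  # word boundary
```

    Under these assumptions, transitions inside a word are fully predictable while transitions across a word boundary are not, which is the kind of contrast a learner could exploit to find word candidates.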

  3. Continuous sampling from distributed streams

    DEFF Research Database (Denmark)

    Cormode, Graham; Muthukrishnan, S.; Yi, Ke

    2012-01-01

    A fundamental problem in data management is to draw and maintain a sample of a large data set, for approximate query answering, selectivity estimation, and query planning. With large, streaming data sets, this problem becomes particularly difficult when the data is shared across multiple distributed sites. The main challenge is to ensure that a sample is drawn uniformly across the union of the data while minimizing the communication needed to run the protocol on the evolving data. At the same time, it is also necessary to make the protocol lightweight, by keeping the space and time costs low for each participant. In this article, we present communication-efficient protocols for continuously maintaining a sample (both with and without replacement) from k distributed streams. These apply to the case when we want a sample from the full streams, and to the sliding window cases of only the W most...
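
    For orientation, the single-site version of this problem can be sketched with classic reservoir sampling, which keeps a uniform sample without replacement over one stream. This is only the centralized baseline, not the communication-efficient distributed protocol the article presents; the function name `reservoir_sample` and its parameters are assumptions made for illustration.

```python
import random


def reservoir_sample(stream, s, rng=None):
    """Maintain a uniform sample of size s (without replacement) over a stream."""
    rng = rng or random.Random()
    sample = []
    for i, item in enumerate(stream):
        if i < s:
            sample.append(item)  # fill the reservoir first
        else:
            j = rng.randint(0, i)  # uniform over 0..i inclusive
            if j < s:
                sample[j] = item  # item i enters with probability s / (i + 1)
    return sample


sample = reservoir_sample(range(10_000), s=5, rng=random.Random(42))
```

    The distributed setting adds the difficulty that no single site sees the whole stream, so a direct use of this routine would require forwarding every item to a coordinator.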

  4. Between-Word Simplification Patterns in the Continuous Speech of Children with Speech Sound Disorders

    Science.gov (United States)

    Klein, Harriet B.; Liu-Shea, May

    2009-01-01

    Purpose: This study was designed to identify and describe between-word simplification patterns in the continuous speech of children with speech sound disorders. It was hypothesized that word combinations would reveal phonological changes that were unobserved with single words, possibly accounting for discrepancies between the intelligibility of…

  5. Automatic transcription of continuous speech into syllable-like units ...

    Indian Academy of Sciences (India)

    style HMM models are generated for each of the clusters during training. During testing .... manual segmentation at syllable-like units followed by isolated style recognition of continuous speech ..... obtaining demisyllabic reference patterns.

  6. The Cortical Organization of Speech Processing: Feedback Control and Predictive Coding the Context of a Dual-Stream Model

    Science.gov (United States)

    Hickok, Gregory

    2012-01-01

    Speech recognition is an active process that involves some form of predictive coding. This statement is relatively uncontroversial. What is less clear is the source of the prediction. The dual-stream model of speech processing suggests that there are two possible sources of predictive coding in speech perception: the motor speech system and the…

  7. Spoken Word Recognition of Chinese Words in Continuous Speech

    Science.gov (United States)

    Yip, Michael C. W.

    2015-01-01

    The present study examined the role that the positional probability of syllables plays in the recognition of spoken words in continuous Cantonese speech. Because some sounds occur more frequently at the beginning or ending position of Cantonese syllables than others, this kind of probabilistic information about syllables may cue the locations…

  8. Relative Contributions of the Dorsal vs. Ventral Speech Streams to Speech Perception are Context Dependent: a lesion study

    Directory of Open Access Journals (Sweden)

    Corianne Rogalsky

    2014-04-01

    The neural basis of speech perception has been debated for over a century. While it is generally agreed that the superior temporal lobes are critical for the perceptual analysis of speech, a major current topic is whether the motor system contributes to speech perception, with several conflicting findings attested. In a dorsal-ventral speech stream framework (Hickok & Poeppel 2007), this debate is essentially about the roles of the dorsal versus ventral speech processing streams. A major roadblock in characterizing the neuroanatomy of speech perception is task-specific effects. For example, much of the evidence for dorsal stream involvement comes from syllable discrimination type tasks, which have been found to behaviorally doubly dissociate from auditory comprehension tasks (Baker et al. 1981). Discrimination task deficits could be a result of difficulty perceiving the sounds themselves, which is the typical assumption, or of failures in temporary maintenance of the sensory traces, or in the comparison and/or the decision process. Similar complications arise in perceiving sentences: the extent of inferior frontal (i.e. dorsal stream) activation during listening to sentences increases as a function of increased task demands (Love et al. 2006). Another complication is the stimulus: much evidence for dorsal stream involvement uses speech samples lacking semantic context (CVs, non-words). The present study addresses these issues in a large-scale lesion-symptom mapping study. 158 patients with focal cerebral lesions from the Multi-site Aphasia Research Consortium underwent a structural MRI or CT scan, as well as an extensive psycholinguistic battery. Voxel-based lesion symptom mapping was used to compare the neuroanatomy involved in the following speech perception tasks with varying phonological, semantic, and task loads: (i) two discrimination tasks of syllables (non-words and words, respectively), (ii) two auditory comprehension tasks

  9. Mapping a lateralisation gradient within the ventral stream for auditory speech perception

    OpenAIRE

    Karsten Specht

    2013-01-01

    Recent models on speech perception propose a dual stream processing network, with a dorsal stream, extending from the posterior temporal lobe of the left hemisphere through inferior parietal areas into the left inferior frontal gyrus, and a ventral stream that is assumed to originate in the primary auditory cortex in the upper posterior part of the temporal lobe and to extend towards the anterior part of the temporal lobe, where it may connect to the ventral part of the inferior frontal gyrus...

  10. Mapping a lateralization gradient within the ventral stream for auditory speech perception

    OpenAIRE

    Specht, Karsten

    2013-01-01

    Recent models on speech perception propose a dual-stream processing network, with a dorsal stream, extending from the posterior temporal lobe of the left hemisphere through inferior parietal areas into the left inferior frontal gyrus, and a ventral stream that is assumed to originate in the primary auditory cortex in the upper posterior part of the temporal lobe and to extend toward the anterior part of the temporal lobe, where it may connect to the ventral part of the inferior frontal gyrus....

  11. Mapping from Speech to Images Using Continuous State Space Models

    DEFF Research Database (Denmark)

    Lehn-Schiøler, Tue; Hansen, Lars Kai; Larsen, Jan

    2005-01-01

    In this paper a system that transforms speech waveforms to animated faces is proposed. The system relies on continuous state space models to perform the mapping; this makes it possible to ensure video with no sudden jumps and allows continuous control of the parameters in 'face space'. The performance of the system is critically dependent on the number of hidden variables: with too few variables the model cannot represent data, and with too many overfitting is noticed. Simulations are performed on recordings of 3-5 sec. video sequences with sentences from the Timit database. From a subjective point of view the model is able to construct an image sequence from an unknown noisy speech sequence even though the number of training examples is limited.

  12. Temporal Context in Speech Processing and Attentional Stream Selection: A Behavioral and Neural perspective

    Science.gov (United States)

    Zion Golumbic, Elana M.; Poeppel, David; Schroeder, Charles E.

    2012-01-01

    The human capacity for processing speech is remarkable, especially given that information in speech unfolds over multiple time scales concurrently. Similarly notable is our ability to filter out extraneous sounds and focus our attention on one conversation, epitomized by the ‘Cocktail Party’ effect. Yet, the neural mechanisms underlying on-line speech decoding and attentional stream selection are not well understood. We review findings from behavioral and neurophysiological investigations that underscore the importance of the temporal structure of speech for achieving these perceptual feats. We discuss the hypothesis that entrainment of ambient neuronal oscillations to speech’s temporal structure, across multiple time-scales, serves to facilitate its decoding and underlies the selection of an attended speech stream over other competing input. In this regard, speech decoding and attentional stream selection are examples of ‘active sensing’, emphasizing an interaction between proactive and predictive top-down modulation of neuronal dynamics and bottom-up sensory input. PMID:22285024

  13. Auditory, Visual and Audiovisual Speech Processing Streams in Superior Temporal Sulcus.

    Science.gov (United States)

    Venezia, Jonathan H; Vaden, Kenneth I; Rong, Feng; Maddox, Dale; Saberi, Kourosh; Hickok, Gregory

    2017-01-01

    The human superior temporal sulcus (STS) is responsive to visual and auditory information, including sounds and facial cues during speech recognition. We investigated the functional organization of STS with respect to modality-specific and multimodal speech representations. Twenty younger adult participants were instructed to perform an oddball detection task and were presented with auditory, visual, and audiovisual speech stimuli, as well as auditory and visual nonspeech control stimuli in a block fMRI design. Consistent with a hypothesized anterior-posterior processing gradient in STS, auditory, visual and audiovisual stimuli produced the largest BOLD effects in anterior, posterior and middle STS (mSTS), respectively, based on whole-brain, linear mixed effects and principal component analyses. Notably, the mSTS exhibited preferential responses to multisensory stimulation, as well as speech compared to nonspeech. Within the mid-posterior and mSTS regions, response preferences changed gradually from visual, to multisensory, to auditory moving posterior to anterior. Post hoc analysis of visual regions in the posterior STS revealed that a single subregion bordering the mSTS was insensitive to differences in low-level motion kinematics yet distinguished between visual speech and nonspeech based on multi-voxel activation patterns. These results suggest that auditory and visual speech representations are elaborated gradually within anterior and posterior processing streams, respectively, and may be integrated within the mSTS, which is sensitive to more abstract speech information within and across presentation modalities. The spatial organization of STS is consistent with processing streams that are hypothesized to synthesize perceptual speech representations from sensory signals that provide convergent information from visual and auditory modalities.

  14. Streaming Video Games: Copyright Infringement or Protected Speech?

    Directory of Open Access Journals (Sweden)

    Eirik Evert Elias Jungar

    2016-12-01

    Streaming video games, that is, broadcasting video game play live on the internet, is incredibly popular. Millions tune into twitch.tv daily to watch eSport tournaments, follow their favourite streamer, and chat with other viewers. But all is not rosy in the world of streaming games. Recently, some game developers have aggressively exercised their copyright to, firstly, claim part of the streamers’ revenue, and secondly, control the context in which their game is shown. The article analyzes whether game developers have, and should have, such rights under EU copyright law. Reaching the conclusion that video game streams infringe the game developer’s right to communicate their works to the public, I argue that freedom of expression can and should be used to rein in their rights in certain cases. Subjecting the lawfulness of streams to game developers’ good will risks stifling the expressions of streamers. The streamers, their audience, and even the copyright holders, would be worse off for it.

  15. Streaming of Continuous Media for Distance Education Systems

    Science.gov (United States)

    Dashti, Ali; Safar, Maytham

    2007-01-01

    Distance education created new challenges regarding the delivery of large size isochronous continuous streaming media (SM) objects. In this paper, we consider the design of a framework for customized SM presentations, where each presentation consists of a number of SM objects that should be retrieved and displayed to the user in a coherent…

  16. Mapping a lateralisation gradient within the ventral stream for auditory speech perception

    Directory of Open Access Journals (Sweden)

    Karsten Specht

    2013-10-01

    Recent models on speech perception propose a dual stream processing network, with a dorsal stream, extending from the posterior temporal lobe of the left hemisphere through inferior parietal areas into the left inferior frontal gyrus, and a ventral stream that is assumed to originate in the primary auditory cortex in the upper posterior part of the temporal lobe and to extend towards the anterior part of the temporal lobe, where it may connect to the ventral part of the inferior frontal gyrus. This article describes and reviews the results from a series of complementary functional magnetic resonance imaging (fMRI) studies that aimed to trace the hierarchical processing network for speech comprehension within the left and right hemisphere with a particular focus on the temporal lobe and the ventral stream. As hypothesised, the results demonstrate a bilateral involvement of the temporal lobes in the processing of speech signals. However, an increasing leftward asymmetry was detected from auditory-phonetic to lexico-semantic processing and along the posterior-anterior axis, thus forming a lateralisation gradient. This increasing leftward lateralisation was particularly evident for the left superior temporal sulcus (STS) and more anterior parts of the temporal lobe.

  17. Mapping a lateralization gradient within the ventral stream for auditory speech perception.

    Science.gov (United States)

    Specht, Karsten

    2013-01-01

    Recent models on speech perception propose a dual-stream processing network, with a dorsal stream, extending from the posterior temporal lobe of the left hemisphere through inferior parietal areas into the left inferior frontal gyrus, and a ventral stream that is assumed to originate in the primary auditory cortex in the upper posterior part of the temporal lobe and to extend toward the anterior part of the temporal lobe, where it may connect to the ventral part of the inferior frontal gyrus. This article describes and reviews the results from a series of complementary functional magnetic resonance imaging studies that aimed to trace the hierarchical processing network for speech comprehension within the left and right hemisphere with a particular focus on the temporal lobe and the ventral stream. As hypothesized, the results demonstrate a bilateral involvement of the temporal lobes in the processing of speech signals. However, an increasing leftward asymmetry was detected from auditory-phonetic to lexico-semantic processing and along the posterior-anterior axis, thus forming a "lateralization" gradient. This increasing leftward lateralization was particularly evident for the left superior temporal sulcus and more anterior parts of the temporal lobe.

  18. Online and unsupervised face recognition for continuous video stream

    Science.gov (United States)

    Huo, Hongwen; Feng, Jufu

    2009-10-01

    We present a novel online face recognition approach for video stream in this paper. Our method includes two stages: pre-training and online training. In the pre-training phase, our method observes interactions, collects batches of input data, and attempts to estimate their distributions (Box-Cox transformation is adopted here to normalize rough estimates). In the online training phase, our method incrementally improves classifiers' knowledge of the face space and updates it continuously with incremental eigenspace analysis. The performance achieved by our method shows its great potential in video stream processing.

  19. Lexical decoder for continuous speech recognition: sequential neural network approach

    International Nuclear Information System (INIS)

    Iooss, Christine

    1991-01-01

    The work presented in this dissertation concerns the study of a connectionist architecture to treat sequential inputs. In this context, the model proposed by J.L. Elman, a recurrent multilayer network, is used. Its abilities and its limits are evaluated. Modifications are made in order to treat erroneous or noisy sequential inputs and to classify patterns. The application context of this study concerns the realisation of a lexical decoder for analytical multi-speaker continuous speech recognition. Lexical decoding is completed from lattices of phonemes which are obtained after an acoustic-phonetic decoding stage relying on a K Nearest Neighbors search technique. Tests are done on sentences formed from a lexicon of 20 words. The results obtained show the ability of the proposed connectionist model to take into account the sequentiality at the input level, to memorize the context and to treat noisy or erroneous inputs. (author)
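
    To make the idea of "memorizing the context" concrete, here is a minimal sketch of one step of a simple recurrent (Elman-style) layer, in which the previous hidden state is fed back alongside the current input. The weights, layer sizes, and the function name `elman_step` are hypothetical; the dissertation's actual architecture, modifications, and training procedure are not reproduced here.

```python
import math


def elman_step(x, h_prev, W_in, W_rec, b):
    """One recurrent step: h = tanh(W_in @ x + W_rec @ h_prev + b)."""
    h = []
    for i in range(len(b)):
        total = b[i]
        total += sum(W_in[i][j] * x[j] for j in range(len(x)))
        total += sum(W_rec[i][k] * h_prev[k] for k in range(len(h_prev)))
        h.append(math.tanh(total))
    return h


# Hypothetical tiny weights: 2 inputs, 2 hidden units.
W_in = [[0.5, -0.3], [0.1, 0.8]]
W_rec = [[0.2, 0.0], [0.0, 0.2]]
b = [0.0, 0.0]

h = [0.0, 0.0]  # the context starts empty
states = []
for x in [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]:
    h = elman_step(x, h, W_in, W_rec, b)
    states.append(h)
```

    The first and third inputs are identical, yet the resulting hidden states differ because the third step also sees the context left by the earlier inputs.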

  20. Recognition of speaker-dependent continuous speech with KEAL

    Science.gov (United States)

    Mercier, G.; Bigorgne, D.; Miclet, L.; Le Guennec, L.; Querre, M.

    1989-04-01

    A description of the speaker-dependent continuous speech recognition system KEAL is given. An unknown utterance is recognized by means of the following procedures: acoustic analysis, phonetic segmentation and identification, word and sentence analysis. The combination of feature-based, speaker-independent coarse phonetic segmentation with speaker-dependent statistical classification techniques is one of the main design features of the acoustic-phonetic decoder. The lexical access component is essentially based on a statistical dynamic programming technique which aims at matching a phonemic lexical entry containing various phonological forms, against a phonetic lattice. Sentence recognition is achieved by use of a context-free grammar and a parsing algorithm derived from Earley's parser. A speaker adaptation module allows some of the system parameters to be adjusted by matching known utterances with their acoustical representation. The task to be performed, described by its vocabulary and its grammar, is given as a parameter of the system. Continuously spoken sentences extracted from a 'pseudo-Logo' language are analyzed and results are presented.

  1. Comparing Measures of Voice Quality from Sustained Phonation and Continuous Speech

    Science.gov (United States)

    Gerratt, Bruce R.; Kreiman, Jody; Garellek, Marc

    2016-01-01

    Purpose: The question of what type of utterance--a sustained vowel or continuous speech--is best for voice quality analysis has been extensively studied but with equivocal results. This study examines whether previously reported differences derive from the articulatory and prosodic factors occurring in continuous speech versus sustained phonation.…

  2. Segment-based acoustic models for continuous speech recognition

    Science.gov (United States)

    Ostendorf, Mari; Rohlicek, J. R.

    1993-07-01

    This research aims to develop new and more accurate stochastic models for speaker-independent continuous speech recognition, by extending previous work in segment-based modeling and by introducing a new hierarchical approach to representing intra-utterance statistical dependencies. These techniques, which are more costly than traditional approaches because of the large search space associated with higher order models, are made feasible through rescoring a set of HMM-generated N-best sentence hypotheses. We expect these different modeling techniques to result in improved recognition performance over that achieved by current systems, which handle only frame-based observations and assume that these observations are independent given an underlying state sequence. In the fourth quarter of the project, we have completed the following: (1) ported our recognition system to the Wall Street Journal task, a standard task in the ARPA community; (2) developed an initial dependency-tree model of intra-utterance observation correlation; and (3) implemented baseline language model estimation software. Our initial results on the Wall Street Journal task are quite good and represent significantly improved performance over most HMM systems reporting on the Nov. 1992 5k vocabulary test set.
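
    The N-best rescoring strategy mentioned in this abstract can be sketched as follows: a first pass proposes a short list of sentence hypotheses, and a costlier model then re-ranks only that list. The `Hypothesis` class, the example sentences, the scores, and the interpolation weight are all hypothetical; the report does not specify how its scores are combined.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    hmm_score: float      # first-pass log score (hypothetical values below)
    segment_score: float  # log score from the costlier second-pass model


def rescore(nbest, weight=0.5):
    """Re-rank an N-best list by a weighted sum of the two log scores."""
    def combined(h):
        return (1.0 - weight) * h.hmm_score + weight * h.segment_score
    return sorted(nbest, key=combined, reverse=True)


nbest = [
    Hypothesis("recognize speech", hmm_score=-10.0, segment_score=-4.0),
    Hypothesis("wreck a nice beach", hmm_score=-9.0, segment_score=-8.0),
]
best = rescore(nbest)[0]
```

    Because the second-pass model scores only N hypotheses rather than the full search space, its higher per-hypothesis cost stays manageable.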

  3. Continuity-Aware Scheduling Algorithm for Scalable Video Streaming

    Directory of Open Access Journals (Sweden)

    Atinat Palawan

    2016-05-01

    The consumer demand for retrieving and delivering visual content through consumer electronic devices has increased rapidly in recent years. The quality of video in packet networks is susceptible to certain traffic characteristics: average bandwidth availability, loss, delay and delay variation (jitter). This paper presents a scheduling algorithm that modifies the stream of scalable video to combat jitter. The algorithm provides unequal look-ahead by safeguarding the base layer (without the need for overhead) of the scalable video. The results of the experiments show that our scheduling algorithm reduces the number of frames with a violated deadline and significantly improves the continuity of the video stream without compromising the average Y Peak Signal-to-Noise Ratio (PSNR).
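
    For readers unfamiliar with the quality metric named here, the sketch below computes PSNR from the mean squared error between reference and decoded samples, using the standard definition 10·log10(MAX² / MSE). The sample values and the function name `psnr` are hypothetical, and the paper's own measurement setup is not described in the abstract.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(reference) != len(distorted):
        raise ValueError("inputs must have the same length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 8-bit luma (Y) samples from a reference and a decoded frame.
ref = [52, 55, 61, 66]
dec = [54, 55, 60, 66]
value = psnr(ref, dec)
```

    Averaging this value over the Y plane of every frame gives the kind of "average Y PSNR" figure the abstract reports.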

  4. Device for continuous analysis of a stream of material

    International Nuclear Information System (INIS)

    Krampe, G.

    1981-01-01

    A radioactive radiation source and a radioactive detector are associated, as a unit, with equipment for conveying coal or other material in a continuous stream. One part of the conveying path or the whole path lies in the irradiation zone of the source, and the detector receives the radiation reflected by the material. The radiation source and the detector are carried by impacting means situated on the conveying path in such a way as to deflect the material from a portion of the conveying means travelling in a first direction, on to another portion travelling in a second direction intersecting the first direction. (author)

  5. Continuous-speech segmentation at the beginning of language acquisition: electrophysiological evidence

    NARCIS (Netherlands)

    Kooijman, V.M.

    2007-01-01

    Word segmentation, or detecting word boundaries in continuous speech, is not an easy task. Spoken language does not contain silences to indicate word boundaries and words partly overlap due to coarticulation. Still, adults listening to their native language perceive speech as individual words. They

  6. A Bit Stream Scalable Speech/Audio Coder Combining Enhanced Regular Pulse Excitation and Parametric Coding

    Science.gov (United States)

    Riera-Palou, Felip; den Brinker, Albertus C.

    2007-12-01

    This paper introduces a new audio and speech broadband coding technique based on the combination of a pulse excitation coder and a standardized parametric coder, namely, MPEG-4 high-quality parametric coder. After presenting a series of enhancements to regular pulse excitation (RPE) to make it suitable for the modeling of broadband signals, it is shown how pulse and parametric codings complement each other and how they can be merged to yield a layered bit stream scalable coder able to operate at different points in the quality bit rate plane. The performance of the proposed coder is evaluated in a listening test. The major result is that the extra functionality of the bit stream scalability does not come at the price of a reduced performance since the coder is competitive with standardized coders (MP3, AAC, SSC).

  7. A Bit Stream Scalable Speech/Audio Coder Combining Enhanced Regular Pulse Excitation and Parametric Coding

    Directory of Open Access Journals (Sweden)

    Albertus C. den Brinker

    2007-01-01

    This paper introduces a new audio and speech broadband coding technique based on the combination of a pulse excitation coder and a standardized parametric coder, namely, MPEG-4 high-quality parametric coder. After presenting a series of enhancements to regular pulse excitation (RPE) to make it suitable for the modeling of broadband signals, it is shown how pulse and parametric codings complement each other and how they can be merged to yield a layered bit stream scalable coder able to operate at different points in the quality bit rate plane. The performance of the proposed coder is evaluated in a listening test. The major result is that the extra functionality of the bit stream scalability does not come at the price of a reduced performance since the coder is competitive with standardized coders (MP3, AAC, SSC).

  8. Optimizing Cost of Continuous Overlapping Queries over Data Streams by Filter Adaption

    KAUST Repository

    Xie, Qing; Zhang, Xiangliang; Li, Zhixu; Zhou, Xiaofang

    2016-01-01

    The problem we aim to address is the optimization of cost management for executing multiple continuous queries on data streams, where each query is defined by several filters, each of which monitors certain status of the data stream. Specially

  9. A Russian Keyword Spotting System Based on Large Vocabulary Continuous Speech Recognition and Linguistic Knowledge

    Directory of Open Access Journals (Sweden)

    Valentin Smirnov

    2016-01-01

    The paper describes the key concepts of a word spotting system for Russian based on large vocabulary continuous speech recognition. Key algorithms and system settings are described, including the pronunciation variation algorithm, and the experimental results on the real-life telecom data are provided. The description of system architecture and the user interface is provided. The system is based on CMU Sphinx open-source speech recognition platform and on the linguistic models and algorithms developed by Speech Drive LLC. The effective combination of baseline statistic methods, real-world training data, and the intensive use of linguistic knowledge led to a quality result applicable to industrial use.

  10. Exploiting Speech for Automatic TV Delinearization: From Streams to Cross-Media Semantic Navigation

    Directory of Open Access Journals (Sweden)

    Guinaudeau Camille

    2011-01-01

    The gradual migration of television from broadcast diffusion to Internet diffusion offers countless possibilities for the generation of rich navigable contents. However, it also raises numerous scientific issues regarding delinearization of TV streams and content enrichment. In this paper, we study how speech can be used at different levels of the delinearization process, using automatic speech transcription and natural language processing (NLP) for the segmentation and characterization of TV programs and for the generation of semantic hyperlinks in videos. Transcript-based video delinearization requires natural language processing techniques robust to transcription peculiarities, such as transcription errors, and to domain and genre differences. We therefore propose to modify classical NLP techniques, initially designed for regular texts, to improve their robustness in the context of TV delinearization. We demonstrate that the modified NLP techniques can efficiently handle various types of TV material and be exploited for program description, for topic segmentation, and for the generation of semantic hyperlinks between multimedia contents. We illustrate the concept of cross-media semantic navigation with a description of our news navigation demonstrator presented during the NEM Summit 2009.

  11. Radiation streaming: the continuing problem of shield design

    International Nuclear Information System (INIS)

    Avery, A.F.

    1977-01-01

    The practical problems of shield design are reviewed and the major difficulties are shown to be those associated with streaming problems. The situations in which streaming occurs in various types of reactor are described, including LMFBRs and fusion devices, and examples are given of ways in which the problems have been solved.

  12. Elastic execution of continuous mapreduce jobs over data streams

    DEFF Research Database (Denmark)

    2015-01-01

    There is provided a set of methods describing how to elastically change the resources used by a MapReduce job on streaming data while executing...

  13. The role of continuous low-frequency harmonicity cues for interrupted speech perception in bimodal hearing.

    Science.gov (United States)

    Oh, Soo Hee; Donaldson, Gail S; Kong, Ying-Yee

    2016-04-01

    Low-frequency acoustic cues have been shown to enhance speech perception by cochlear-implant users, particularly when target speech occurs in a competing background. The present study examined the extent to which a continuous representation of low-frequency harmonicity cues contributes to bimodal benefit in simulated bimodal listeners. Experiment 1 examined the benefit of restoring a continuous temporal envelope to the low-frequency ear while the vocoder ear received a temporally interrupted stimulus. Experiment 2 examined the effect of providing continuous harmonicity cues in the low-frequency ear as compared to restoring a continuous temporal envelope in the vocoder ear. Findings indicate that bimodal benefit for temporally interrupted speech increases when continuity is restored to either or both ears. The primary benefit appears to stem from the continuous temporal envelope in the low-frequency region providing additional phonetic cues related to manner and F1 frequency; a secondary contribution is provided by low-frequency harmonicity cues when a continuous representation of the temporal envelope is present in the low-frequency, or both ears. The continuous temporal envelope and harmonicity cues of low-frequency speech are thought to support bimodal benefit by facilitating identification of word and syllable boundaries, and by restoring partial phonetic cues that occur during gaps in the temporally interrupted stimulus.

  14. Cognitive Bias for Learning Speech Sounds From a Continuous Signal Space Seems Nonlinguistic

    Directory of Open Access Journals (Sweden)

    Sabine van der Ham

    2015-10-01

    Full Text Available When learning language, humans have a tendency to produce more extreme distributions of speech sounds than those observed most frequently: In rapid, casual speech, vowel sounds are centralized, yet cross-linguistically, peripheral vowels occur almost universally. We investigate whether adults’ generalization behavior reveals selective pressure for communication when they learn skewed distributions of speech-like sounds from a continuous signal space. The domain-specific hypothesis predicts that the emergence of sound categories is driven by a cognitive bias to make these categories maximally distinct, resulting in more skewed distributions in participants’ reproductions. However, our participants showed more centered distributions, which goes against this hypothesis, indicating that there are no strong innate linguistic biases that affect learning these speech-like sounds. The centralization behavior can be explained by a lack of communicative pressure to maintain categories.

  15. Neural speech recognition: continuous phoneme decoding using spatiotemporal representations of human cortical activity

    Science.gov (United States)

    Moses, David A.; Mesgarani, Nima; Leonard, Matthew K.; Chang, Edward F.

    2016-10-01

    Objective. The superior temporal gyrus (STG) and neighboring brain regions play a key role in human language processing. Previous studies have attempted to reconstruct speech information from brain activity in the STG, but few of them incorporate the probabilistic framework and engineering methodology used in modern speech recognition systems. In this work, we describe the initial efforts toward the design of a neural speech recognition (NSR) system that performs continuous phoneme recognition on English stimuli with arbitrary vocabulary sizes using the high gamma band power of local field potentials in the STG and neighboring cortical areas obtained via electrocorticography. Approach. The system implements a Viterbi decoder that incorporates phoneme likelihood estimates from a linear discriminant analysis model and transition probabilities from an n-gram phonemic language model. Grid searches were used in an attempt to determine optimal parameterizations of the feature vectors and Viterbi decoder. Main results. The performance of the system was significantly improved by using spatiotemporal representations of the neural activity (as opposed to purely spatial representations) and by including language modeling and Viterbi decoding in the NSR system. Significance. These results emphasize the importance of modeling the temporal dynamics of neural responses when analyzing their variations with respect to varying stimuli and demonstrate that speech recognition techniques can be successfully leveraged when decoding speech from neural signals. Guided by the results detailed in this work, further development of the NSR system could have applications in the fields of automatic speech recognition and neural prosthetics.
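    As a rough illustration of the decoding step described in this record, the sketch below runs Viterbi decoding over per-frame phoneme log-likelihoods combined with transition log-probabilities. It is not the authors' implementation: the names `viterbi_decode` and `log_table`, the use of a bigram (rather than a general n-gram) transition table, and the plain-list data layout are all assumptions made for illustration.

```python
import math


def log_table(rows):
    """Convert a table of probabilities into log-probabilities (hypothetical helper)."""
    return [[math.log(p) for p in row] for row in rows]


def viterbi_decode(log_lik, log_trans, log_prior):
    """Return the most probable state sequence.

    log_lik[t][s]   : log-likelihood of frame t under phoneme state s
    log_trans[i][j] : log P(next state j | current state i), a bigram model (assumption)
    log_prior[s]    : log P(first state is s)
    """
    n_states = len(log_prior)
    # Best log-score of any path ending in each state after the first frame.
    score = [log_prior[s] + log_lik[0][s] for s in range(n_states)]
    backpointers = []
    for frame in log_lik[1:]:
        pointers, new_score = [], []
        for j in range(n_states):
            # Best predecessor i for landing in state j at this frame.
            best_i = max(range(n_states), key=lambda i: score[i] + log_trans[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + log_trans[best_i][j] + frame[j])
        backpointers.append(pointers)
        score = new_score
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

    With strongly "sticky" transitions, a single ambiguous frame is smoothed over, whereas uniform transitions reduce the decoder to a per-frame argmax. That difference is one way to see why adding a language model can change the decoded sequence.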

  16. Continuous turbidity monitoring in streams of northwestern California

    Science.gov (United States)

    Rand Eads; Jack Lewis

    2002-01-01

    Abstract - Redwood Sciences Laboratory, a field office of the USDA Forest Service, Pacific Southwest Research Station has developed and refined methods and instrumentation to monitor turbidity and suspended sediment in streams of northern California since 1996. Currently we operate 21 stations and have provided assistance in the installation of 6 gaging stations for...

  17. The development of multisensory speech perception continues into the late childhood years.

    Science.gov (United States)

    Ross, Lars A; Molholm, Sophie; Blanco, Daniella; Gomez-Ramirez, Manuel; Saint-Amour, Dave; Foxe, John J

    2011-06-01

    Observing a speaker's articulations substantially improves the intelligibility of spoken speech, especially under noisy listening conditions. This multisensory integration of speech inputs is crucial to effective communication. Appropriate development of this ability has major implications for children in classroom and social settings, and deficits in it have been linked to a number of neurodevelopmental disorders, especially autism. It is clear from structural imaging studies that there is a prolonged maturational course within regions of the perisylvian cortex that persists into late childhood, and these regions have been firmly established as being crucial to speech and language functions. Given this protracted maturational timeframe, we reasoned that multisensory speech processing might well show a similarly protracted developmental course. Previous work in adults has shown that audiovisual enhancement in word recognition is most apparent within a restricted range of signal-to-noise ratios (SNRs). Here, we investigated when these properties emerge during childhood by testing multisensory speech recognition abilities in typically developing children aged between 5 and 14 years, and comparing them with those of adults. By parametrically varying SNRs, we found that children benefited significantly less from observing visual articulations, displaying considerably less audiovisual enhancement. The findings suggest that improvement in the ability to recognize speech-in-noise and in audiovisual integration during speech perception continues quite late into the childhood years. The implication is that a considerable amount of multisensory learning remains to be achieved during the later schooling years, and that explicit efforts to accommodate this learning may well be warranted. European Journal of Neuroscience © 2011 Federation of European Neuroscience Societies and Blackwell Publishing Ltd. No claim to original US government works.

  18. Distributed continuous media streaming - Using redundant hierarchy (RED-Hi) servers

    OpenAIRE

    Shah, Mohammad Ahmed

    2014-01-01

    ABSTRACT: The first part of this thesis provides a survey of continuous media servers, including discussions on streaming protocols, models and techniques. In the second part, a novel distributed media streaming system is introduced. In order to manage the traffic in a fault-tolerant and effective manner, a hierarchical topology, the so-called redundant hierarchy (RED-Hi), is used. The proposed system works in three steps, namely, object location, path reservation and object delivery. Simulations ar...

  19. Decoding the attended speech stream with multi-channel EEG: implications for online, daily-life applications

    Science.gov (United States)

    Mirkovic, Bojana; Debener, Stefan; Jaeger, Manuela; De Vos, Maarten

    2015-08-01

    Objective. Recent studies have provided evidence that temporal envelope driven speech decoding from high-density electroencephalography (EEG) and magnetoencephalography recordings can identify the attended speech stream in a multi-speaker scenario. The present work replicated the previous high density EEG study and investigated the necessary technical requirements for practical attended speech decoding with EEG. Approach. Twelve normal hearing participants attended to one out of two simultaneously presented audiobook stories, while high density EEG was recorded. An offline iterative procedure eliminating those channels contributing the least to decoding provided insight into the necessary channel number and optimal cross-subject channel configuration. Aiming towards the future goal of near real-time classification with an individually trained decoder, the minimum duration of training data necessary for successful classification was determined by using a chronological cross-validation approach. Main results. Close replication of the previously reported results confirmed the method robustness. Decoder performance remained stable from 96 channels down to 25. Furthermore, for less than 15 min of training data, the subject-independent (pre-trained) decoder performed better than an individually trained decoder did. Significance. Our study complements previous research and provides information suggesting that efficient low-density EEG online decoding is within reach.
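    The iterative channel-reduction idea in this record can be sketched as a greedy backward elimination. This is a generic sketch, not the procedure used in the study: the function name `backward_eliminate`, the user-supplied `score_fn` (standing in for decoding accuracy on held-out data), and the rule "drop the channel whose removal hurts the score least" are assumptions.

```python
def backward_eliminate(channels, score_fn, keep):
    """Greedily remove channels until only `keep` remain.

    channels : list of channel labels
    score_fn : callable taking a list of channels and returning a score
               (higher is better); a stand-in for decoder performance
    keep     : number of channels to retain (must be at least 1)

    Returns the retained channels and the (subset, score) history.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    selected = list(channels)
    history = [(tuple(selected), score_fn(selected))]
    while len(selected) > keep:
        best_subset, best_score = None, float("-inf")
        for ch in selected:
            # Score the configuration with this one channel removed.
            subset = [c for c in selected if c != ch]
            s = score_fn(subset)
            if s > best_score:
                best_subset, best_score = subset, s
        selected = best_subset
        history.append((tuple(selected), best_score))
    return selected, history
```

    The returned history records the score after each removal, which is the kind of curve one would inspect to see how far the channel count can be reduced before performance degrades.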

  20. Selective particle and cell capture in a continuous flow using micro-vortex acoustic streaming.

    Science.gov (United States)

    Collins, David J; Khoo, Bee Luan; Ma, Zhichao; Winkler, Andreas; Weser, Robert; Schmidt, Hagen; Han, Jongyoon; Ai, Ye

    2017-05-16

    Acoustic streaming has emerged as a promising technique for refined microscale manipulation, where strong rotational flow can give rise to particle and cell capture. In contrast to hydrodynamically generated vortices, acoustic streaming is rapidly tunable, highly scalable and requires no external pressure source. Though streaming is typically ignored or minimized in most acoustofluidic systems that utilize other acoustofluidic effects, we maximize the effect of acoustic streaming in a continuous flow using a high-frequency (381 MHz), narrow-beam focused surface acoustic wave. This results in rapid fluid streaming, with velocities orders of magnitude greater than that of the lateral flow, to generate fluid vortices that extend the entire width of a 400 μm wide microfluidic channel. We characterize the forces relevant for vortex formation in a combined streaming/lateral flow system, and use these acoustic streaming vortices to selectively capture 2 μm particles from a mixed suspension with 1 μm particles and human breast adenocarcinoma cells (MDA-231) from red blood cells.

  1. Two-Step Fair Scheduling of Continuous Media Streams over Error-Prone Wireless Channels

    Science.gov (United States)

    Oh, Soohyun; Lee, Jin Wook; Park, Taejoon; Jo, Tae-Chang

    In wireless cellular networks, streaming of continuous media (with strict QoS requirements) over wireless links is challenging due to their inherent unreliability characterized by location-dependent, bursty errors. To address this challenge, we present a two-step scheduling algorithm for a base station to provide streaming of continuous media to wireless clients over the error-prone wireless links. The proposed algorithm is capable of minimizing the packet loss rate of individual clients in the presence of error bursts, by transmitting packets in a round-robin manner and also adopting a mechanism for channel prediction and swapping.
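    The record does not spell out the two steps, so the following is only a guess at the general shape of round-robin scheduling with prediction-based swapping, not the published algorithm. The function `schedule`, the callback `predicted_good`, and the rule that a skipped client keeps its place at the head of the queue are all assumptions.

```python
from collections import deque


def schedule(clients, predicted_good, slots):
    """Build a transmission plan, one client (or None) per slot.

    clients        : client ids in round-robin order
    predicted_good : callable (client, slot) -> bool, a stand-in for
                     a channel predictor
    slots          : number of time slots to plan
    """
    queue = deque(clients)
    plan = []
    for slot in range(slots):
        # Step 1: round-robin order is given by the queue.
        # Step 2: swap in the first client whose channel is predicted good.
        pick = next((c for c in queue if predicted_good(c, slot)), None)
        if pick is None:
            plan.append(None)  # every channel is predicted bad; stay idle
            continue
        # The served client moves to the back; skipped clients keep their
        # position, so they are served as soon as their channel recovers.
        queue.remove(pick)
        queue.append(pick)
        plan.append(pick)
    return plan
```

    In this sketch a client caught in an error burst does not lose its turn permanently; it is served first once its predicted channel state improves.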

  2. System optimization for continuous on-stream elemental analysis using low-output isotopic neutron sources

    International Nuclear Information System (INIS)

    Rizk, R.A.M.

    1989-01-01

    In continuous on-stream neutron activation analysis, the material to be analyzed may be continuously recirculated in a closed loop system between an activation source and a shielded detector. In this paper an analytical formulation of the detector response for such a system is presented. This formulation should be useful in optimizing the system design parameters for specific applications. A study has been made of all parameters that influence the detector response during on-stream analysis. Feasibility applications of the method to solutions of manganese and vanadium using a 5 μg Cf-252 neutron source are demonstrated. (author)

  3. Technique for producing a continuous interference-free stream of Argon-41 in air

    International Nuclear Information System (INIS)

    Tseng, T.-T.; Jester, W.A.

    1984-01-01

    A monitoring system was developed for the detection of I-131 in the presence of orders of magnitude higher concentrations of radioactive noble gas. During the course of this work, a technique was developed for producing a continuous air stream of Ar-41 required for testing this concept. The Ar-41 stream is produced by the neutron activation of air using a research reactor. The Ar-41 content of the air stream can be varied by many orders of magnitude by varying the reactor power level and the rate at which the air is pumped through a vertically positioned tube in or in front of the reactor. It was found that the neutrons also activate other air constituents, producing undesirable interference radionuclides. Selective filtering techniques have therefore been developed to remove these interference radionuclides from the Ar-41 air stream.

  4. Efficient Processing of Continuous Skyline Query over Smarter Traffic Data Stream for Cloud Computing

    Directory of Open Access Journals (Sweden)

    Wang Hanning

    2013-01-01

    Full Text Available The analysis and processing of multisource real-time transportation data streams lay a foundation for smart transportation's sensibility, interconnection, integration, and real-time decision making. The strong computing ability and effective mass data management provided by cloud computing make it feasible to handle continuous Skyline queries over massive, distributed, uncertain transportation data streams. In this paper, we give a layered architecture for smart transportation data processing and formalize the description of continuous Skyline queries over smart transportation data. In addition, we propose the mMR-SUDS algorithm (a Skyline query algorithm for uncertain transportation stream data based on micro-batches in MapReduce), built on sliding window division and the proposed architecture.

  5. VoIP Speech Encryption System Using Stream Cipher with Chaotic ...

    African Journals Online (AJOL)

    pc

    2018-03-22

    Mar 22, 2018 ... The technologies of Internet doesn't give any security mechanism and there is ... VoIP system, both digital (e.g., PC, PDA) and analog (e.g., telephone) devices ... the protection to speech through traditional encryption schemes ...

  6. Discrimination and streaming of speech sounds based on differences in interaural and spectral cues.

    Science.gov (United States)

    David, Marion; Lavandier, Mathieu; Grimault, Nicolas; Oxenham, Andrew J

    2017-09-01

    Differences in spatial cues, including interaural time differences (ITDs), interaural level differences (ILDs) and spectral cues, can lead to stream segregation of alternating noise bursts. It is unknown how effective such cues are for streaming sounds with realistic spectro-temporal variations. In particular, it is not known whether the high-frequency spectral cues associated with elevation remain sufficiently robust under such conditions. To answer these questions, sequences of consonant-vowel tokens were generated and filtered by non-individualized head-related transfer functions to simulate the cues associated with different positions in the horizontal and median planes. A discrimination task showed that listeners could discriminate changes in interaural cues both when the stimulus remained constant and when it varied between presentations. However, discrimination of changes in spectral cues was much poorer in the presence of stimulus variability. A streaming task, based on the detection of repeated syllables in the presence of interfering syllables, revealed that listeners can use both interaural and spectral cues to segregate alternating syllable sequences, despite the large spectro-temporal differences between stimuli. However, only the full complement of spatial cues (ILDs, ITDs, and spectral cues) resulted in obligatory streaming in a task that encouraged listeners to integrate the tokens into a single stream.

  7. Integrating the pulse of the riverscape and landscape: modelling stream metabolism using continuous dissolved oxygen measurements

    Science.gov (United States)

    Soulsby, C.; Birkel, C.; Malcolm, I.; Tetzlaff, D.

    2013-12-01

    Stream metabolism is a fundamental pulse of the watershed which reflects both the in-stream environment and its connectivity with the wider landscape. We used high quality, continuous (15 minute), long-term (>3 years) measurement of stream dissolved oxygen (DO) concentrations to estimate photosynthetic productivity (P) and system respiration (R) in forest and moorland reaches of an upland stream with peaty soils. We calibrated a simple five parameter numerical oxygen mass balance model driven by radiation, stream and air temperature, stream depth and re-aeration capacity. This used continuous 24-hour periods for the whole time series to identify behavioural simulations where DO simulations were re-produced sufficiently well to be considered reasonable representations of ecosystem functioning. Results were evaluated using a seasonal Regional Sensitivity Analysis and a co-linearity index for parameter sensitivity. This showed that >95 % of the behavioural models for the moorland and forest sites were identifiable and able to infer in-stream processes from the DO time series for almost half of all measured days at both sites. Days when the model failed to simulate DO levels successfully provided invaluable insight into time periods when other factors are likely to disrupt in-stream metabolic processes; these include (a) flood events when scour reduces the biomass of benthic primary producers, (b) periods of high water colour in higher summer/autumn flows and (c) low flow periods when hyporheic respiration is evident. Monthly P/R ratios <1 indicate a heterotrophic system with both sites exhibiting similar temporal patterns; with a maximum in February and a second peak during summer months. However, the estimated net ecosystem productivity (NPP) suggests that the moorland reach without riparian tree cover is likely to be a much larger source of carbon to the atmosphere (122 mmol C m-2 d-1) compared to the forested reach (64 mmol C m-2 d-1). The study indicates the value

  8. Optimizing Cost of Continuous Overlapping Queries over Data Streams by Filter Adaption

    KAUST Repository

    Xie, Qing

    2016-01-12

    The problem we aim to address is the optimization of cost management for executing multiple continuous queries on data streams, where each query is defined by several filters, each of which monitors a certain status of the data stream. In particular, a filter can be shared by different queries and can be expensive to evaluate. The conventional objective for such a problem is to minimize the overall execution cost of solving all queries by planning the order of filter evaluation under a shared strategy. However, in a streaming scenario, the characteristics of data items may change over time, which brings uncertainty to the outcome of individual filter evaluations and affects both the query execution plan and the overall execution cost. In our work, considering the influence of this uncertain variation in data characteristics, we propose a framework for dynamically adjusting the filter ordering for query execution on data streams, with a focus on cost management. By incrementally monitoring and analyzing the results of filter evaluation, our proposed approach adapts to varying stream behavior and adjusts the ordering of filter evaluation so as to optimize the execution cost. To achieve satisfactory performance and efficiency, we also discuss the trade-off between the adaptivity of our framework and the overhead incurred by filter adaption. The experimental results on synthetic and two real data sets (traffic and multimedia) show that our framework can effectively reduce and balance the overall query execution cost and maintain high adaptivity in a streaming scenario.
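    To make the idea of adaptive filter ordering concrete, the sketch below handles a single conjunctive query and reorders its filters from observed outcomes. It does not reproduce the framework in this record (which also covers filters shared across queries). It swaps in the classic rule for independent filters: evaluate in ascending order of cost / (1 - pass rate). The class `AdaptiveFilter`, the function `process`, and the Laplace-smoothed pass-rate estimate are assumptions.

```python
class AdaptiveFilter:
    """A filter with a fixed evaluation cost and a running pass-rate estimate."""

    def __init__(self, name, cost, predicate):
        self.name = name
        self.cost = cost
        self.predicate = predicate
        self.seen = 0
        self.passed = 0

    def pass_rate(self):
        # Laplace smoothing keeps the estimate strictly between 0 and 1.
        return (self.passed + 1) / (self.seen + 2)

    def rank(self):
        # Lower rank means "evaluate earlier": cheap and likely to reject.
        return self.cost / (1.0 - self.pass_rate())

    def evaluate(self, item):
        ok = bool(self.predicate(item))
        self.seen += 1
        self.passed += ok
        return ok


def process(item, filters):
    """Evaluate all filters as a conjunction; return (matched, cost_spent)."""
    spent = 0.0
    for f in sorted(filters, key=AdaptiveFilter.rank):
        spent += f.cost
        if not f.evaluate(item):
            return False, spent  # short-circuit on the first rejection
    return True, spent
```

    Because the statistics are updated on every evaluation, the ordering drifts as the stream's characteristics change. A real system would also need to bound the overhead of re-sorting, which is the trade-off the abstract mentions.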

  9. SlipStream: automated provisioning and continuous deployment in the cloud

    CERN Multimedia

    CERN. Geneva

    2012-01-01

    Cloud technology is now everywhere. Beyond the hype, it provides a real opportunity to improve the engineering of software systems. Lately the DevOps movement has also gained momentum; it takes an agile approach to bringing developers and system administrators closer together to better engineer software systems. In this context, this presentation focuses on new tools for exploiting cloud services (private and public) in order to create a continuous flow between software commits and fully deployed and configured software systems, automatically and on-demand. To illustrate this, we present SlipStream and StratusLab. SlipStream is a new product developed by SixSq, able to create virtual machines and orchestrate multi-machine deployments. SlipStream started from an idea developed in the context of the ETICS project, led by CERN. StratusLab is an open-source IaaS distribution, able to create public and private clouds. This presentation will also describe a case study where SlipStream dep...

  10. CC_TRS: Continuous Clustering of Trajectory Stream Data Based on Micro Cluster Life

    Directory of Open Access Journals (Sweden)

    Musaab Riyadh

    2017-01-01

    Full Text Available The rapid spread of positioning devices leads to the generation of massive spatiotemporal trajectory data. In some scenarios, spatiotemporal data are received as a stream. Clustering of stream data is beneficial for applications such as traffic management and weather forecasting. In this article, an algorithm for Continuous Clustering of Trajectory Stream Data Based on Micro Cluster Life is proposed. The algorithm consists of two phases. In the online phase, temporal micro clusters are used to store summarized spatiotemporal information for each group of similar segments. The clustering task in the online phase is based on temporal micro cluster lifetime instead of the time window technique, which divides stream data into time bins and clusters each bin separately. In the offline phase, a density-based clustering approach is used to generate macro clusters from the temporal micro clusters. The evaluation of the proposed algorithm on real data sets shows its efficiency and effectiveness and shows that it is an efficient alternative to the time window technique.

  11. Continuing Inequity through Neoliberalism: The Conveyance of White Dominance in the Educational Policy Speeches of President Barack Obama

    Science.gov (United States)

    Hairston, Thomas W.

    2013-01-01

    The purpose of this critical discourse analysis is to examine how the political speeches and statements of President Barack Obama knowingly or unknowingly continue practices and policies of White privilege within educational policy and practice by constructing education in a neoliberal frame. With presidents having the ability to communicate…

  12. A Selectivity based approach to Continuous Pattern Detection in Streaming Graphs

    Energy Technology Data Exchange (ETDEWEB)

    Choudhury, Sutanay; Holder, Larry; Chin, George; Agarwal, Khushbu; Feo, John T.

    2015-05-27

    Cyber security is one of the most significant technical challenges in current times. Detecting adversarial activities and preventing theft of intellectual property and customer data are high priorities for corporations and government agencies around the world. Cyber defenders need to analyze massive-scale, high-resolution network flows to identify, categorize, and mitigate attacks involving networks spanning institutional and national boundaries. Many of the cyber attacks can be described as subgraph patterns, with prominent examples being insider infiltrations (path queries), denial of service (parallel paths) and malicious spreads (tree queries). This motivates us to explore subgraph matching on streaming graphs in a continuous setting. The novelty of our work lies in using the subgraph distributional statistics collected from the streaming graph to determine the query processing strategy. We introduce a "Lazy Search" algorithm where the search strategy is decided on a vertex-to-vertex basis depending on the likelihood of a match in the vertex neighborhood. We also propose a metric named "Relative Selectivity" that is used to select between different query processing strategies. Our experiments performed on real online news, network traffic stream and a synthetic social network benchmark demonstrate 10-100x speedups over non-incremental, selectivity agnostic approaches.

  13. A continuous-flow system for measuring in vitro oxygen and nitrogen metabolism in separated stream communities

    DEFF Research Database (Denmark)

    Prahl, C.; Jeppesen, E.; Sand-Jensen, Kaj

    1991-01-01

    on the stream bank, consists of several macrophyte and sediment chambers equipped with a double-flow system that ensures an internal water velocity close to that in the stream and which, by continuously renewing the water, mimics diel fluctuation in stream temperature and water chemistry. Water temperature...... production and dark respiration occurred at similar rates (6-7g O2 m-2 day-1), net balance being about zero. Inorganic nitrogen was consumed both by the sediment and to a greater extent by the macrophytes, the diel average consumption being 1g N m-2 day-1. 3. The sum of the activity in the macrophyte...... and sediment chambers corresponded to the overall activity of the stream section as determined by upstream/downstream mass balance. This indicates that the results obtained with the continuous-flow chambers realistically describe the oxygen and the nitrogen metabolism of the stream....

  14. Continuous ammonium enrichment of a woodland stream: uptake kinetics, leaf decomposition, and nitrification

    Energy Technology Data Exchange (ETDEWEB)

    Newbold, J D; Elwood, J W; Schulze, M S; Stark, R W; Barmeier, J C

    1983-01-01

    In order to test for nitrogen limitation and examine ammonium uptake by stream sediments, ammonium hydroxide was added continuously at concentrations averaging 100 μg l-1 for 70 days to a second-order reach of Walker Branch, an undisturbed woodland stream in Tennessee. Ammonium uptake during the first 4 h of addition corresponded to adsorption kinetics rather than to first-order uptake or to Michaelis-Menten kinetics. However, the calculated adsorption partition coefficient was two to four orders of magnitude greater than values reported for physical adsorption of ammonium, suggesting that the uptake was largely biotic. Mass balance indicated that the uptake of ammonium from the water could be accounted for by increased nitrogen content in benthic organic detritus. Nitrification, inferred from longitudinal gradients in NO3, began soon after enrichment and increased dramatically near the end of the experiment. Both ammonium and nitrate concentrations dropped quickly to near background levels when input ceased, indicating little desorption or nitrification of excess nitrogen stored in the reach. There was no evidence of nitrogen limitation as measured by weight loss, oxygen consumption, phosphorus content, and macroinvertebrate density of red oak leaf packs, or by chlorophyll content and aufwuchs biomass on plexiglass slides. A continuous phosphorus enrichment 1 year earlier had demonstrated phosphorus limitation in Walker Branch. 38 references, 6 figures, 3 tables.

  15. Real-time continuous visual biofeedback in the treatment of speech breathing disorders following childhood traumatic brain injury: report of one case.

    Science.gov (United States)

    Murdoch, B E; Pitt, G; Theodoros, D G; Ward, E C

    1999-01-01

    The efficacy of traditional and physiological biofeedback methods for modifying abnormal speech breathing patterns was investigated in a child with persistent dysarthria following severe traumatic brain injury (TBI). An A-B-A-B single-subject experimental research design was utilized to provide the subject with two exclusive periods of therapy for speech breathing, based on traditional therapy techniques and physiological biofeedback methods, respectively. Traditional therapy techniques included establishing optimal posture for speech breathing, explanation of the movement of the respiratory muscles, and a hierarchy of non-speech and speech tasks focusing on establishing an appropriate level of sub-glottal air pressure, and improving the subject's control of inhalation and exhalation. The biofeedback phase of therapy utilized variable inductance plethysmography (or Respitrace) to provide real-time, continuous visual biofeedback of ribcage circumference during breathing. As in traditional therapy, a hierarchy of non-speech and speech tasks were devised to improve the subject's control of his respiratory pattern. Throughout the project, the subject's respiratory support for speech was assessed both instrumentally and perceptually. Instrumental assessment included kinematic and spirometric measures, and perceptual assessment included the Frenchay Dysarthria Assessment, Assessment of Intelligibility of Dysarthric Speech, and analysis of a speech sample. The results of the study demonstrated that real-time continuous visual biofeedback techniques for modifying speech breathing patterns were not only effective, but superior to the traditional therapy techniques for modifying abnormal speech breathing patterns in a child with persistent dysarthria following severe TBI. These results show that physiological biofeedback techniques are potentially useful clinical tools for the remediation of speech breathing impairment in the paediatric dysarthric population.

  16. On the Use of Evolutionary Algorithms to Improve the Robustness of Continuous Speech Recognition Systems in Adverse Conditions

    Directory of Open Access Journals (Sweden)

    Sid-Ahmed Selouani

    2003-07-01

    Limiting the decrease in performance due to acoustic environment changes remains a major challenge for continuous speech recognition (CSR) systems. We propose a novel approach which combines the Karhunen-Loève transform (KLT) in the mel-frequency domain with a genetic algorithm (GA) to enhance the data representing corrupted speech. The idea consists of projecting noisy speech parameters onto the space generated by the genetically optimized principal axis issued from the KLT. The enhanced parameters increase the recognition rate for highly interfering noise environments. The proposed hybrid technique, when included in the front-end of an HTK-based CSR system, outperforms the conventional recognition process in severe interfering car noise environments for a wide range of signal-to-noise ratios (SNRs) varying from 16 dB to −4 dB. We also showed the effectiveness of the KLT-GA method in recognizing speech subject to telephone channel degradations.

  17. STREAM

    DEFF Research Database (Denmark)

    Godsk, Mikkel

    This paper presents a flexible model, ‘STREAM’, for transforming higher science education into blended and online learning. The model is inspired by ideas of active and collaborative learning and builds on feedback strategies well-known from Just-in-Time Teaching, Flipped Classroom, and Peer Instruction. The aim of the model is to provide both a concrete and comprehensible design toolkit for adopting and implementing educational technologies in higher science teaching practice and at the same time comply with diverse ambitions. As opposed to the above-mentioned feedback strategies, the STREAM model supports a relatively diverse use of educational technologies and may also be used to transform teaching into completely online learning. So far both teachers and educational developers have positively received the model and the initial design experiences show promise.

  18. Calibration of an Acoustic Sensor (Geophone) for Continuous Bedload Monitoring in Mountainous Streams

    Science.gov (United States)

    Tsakiris, A. G.; Papanicolaou, T.

    2010-12-01

    Measurement of bedload rates is a crucial component in the study of alluvial processes in mountainous streams. Stream restoration efforts, the validation of morphodynamic models and the calibration of empirical transport formulae rely on accurate bedload transport measurements. Bedload measurements using traditional methods (e.g. samplers, traps) are time consuming, resource intensive and not always feasible, especially at higher flow conditions. These limitations could potentially be addressed by acoustic instruments, which may provide unattended, continuous bedload measurements even at higher flow conditions, provided that these instruments are properly calibrated. The objective of this study is to calibrate an acoustic instrument (geophone) for performing bedload measurements in a well-monitored laboratory environment at conditions corresponding to the low flow regime in mountainous streams. The geophone was manufactured by ClampOn® and was attached to the bottom of a steel plate with dimensions 0.15x0.15 m. The geophone registers the energy of the acoustic signal produced by the movement of the bedload particles over the steel plate with a time resolution of one second. The plate-sensor system was installed in an acrylic housing such that the steel plate top surface was at the same level as the surface of a flat porous bed consisting of unisize spheres with diameter 19.1 mm. Unisize spherical glass particles, 15.9 mm in diameter, were preplaced along a 2 m long section upstream of the sensor, and were entrained over the steel plate. In these experiments, the geophone records spanned the complete experiment duration. Plan view video of the particle movement over the steel plate was recorded via an overhead camera, and was used to calculate the actual bedload rate over the steel plate. Synchronized analysis of this plan view video and the geophone time series revealed that the geophone detected 62% of the bedload particles passing over the steel plate, which triggered

  19. Fishing in Speech Stream

    DEFF Research Database (Denmark)

    Juel Henrichsen, Peter

    2011-01-01

    We present a learning device able to deduce a set of Danish color and shape terms. Only two data sources are available to the learner: A phonetic transcription of a human informant solving a description task, and a minimal formal model of the picture being described. The system thus contains no p...

  20. Quantifying in-stream nitrate reaction rates using continuously-collected water quality data

    Science.gov (United States)

    Matthew Miller; Anthony Tesoriero; Paul Capel

    2016-01-01

    High frequency in situ nitrate data from three streams of varying hydrologic condition, land use, and watershed size were used to quantify the mass loading of nitrate to streams from two sources – groundwater discharge and event flow – at a daily time step for one year. These estimated loadings were used to quantify temporally-variable in-stream nitrate processing ...

  1. Learning a Continuous-Time Streaming Video QoE Model.

    Science.gov (United States)

    Ghadiyaram, Deepti; Pan, Janice; Bovik, Alan C

    2018-05-01

    Over-the-top adaptive video streaming services are frequently impacted by fluctuating network conditions that can lead to rebuffering events (stalling events) and sudden bitrate changes. These events visually impact video consumers' quality of experience (QoE) and can lead to consumer churn. The development of models that can accurately predict viewers' instantaneous subjective QoE under such volatile network conditions could potentially enable the more efficient design of quality-control protocols for media-driven services, such as YouTube, Amazon, Netflix, and so on. However, most existing models only predict a single overall QoE score on a given video and are based on simple global video features, without accounting for relevant aspects of human perception and behavior. We have created a QoE evaluator, called the time-varying QoE Indexer, that accounts for interactions between stalling events, analyzes the spatial and temporal content of a video, predicts the perceptual video quality, models the state of the client-side data buffer, and consequently predicts continuous-time quality scores that agree quite well with human opinion scores. The new QoE predictor also embeds the impact of relevant human cognitive factors, such as memory and recency, and their complex interactions with the video content being viewed. We evaluated the proposed model on three different video databases and attained standout QoE prediction performance.

  2. A Measure of the Auditory-perceptual Quality of Strain from Electroglottographic Analysis of Continuous Dysphonic Speech: Application to Adductor Spasmodic Dysphonia.

    Science.gov (United States)

    Somanath, Keerthan; Mau, Ted

    2016-11-01

    (1) To develop an automated algorithm to analyze electroglottographic (EGG) signal in continuous dysphonic speech, and (2) to identify EGG waveform parameters that correlate with the auditory-perceptual quality of strain in the speech of patients with adductor spasmodic dysphonia (ADSD). Software development with application in a prospective controlled study. EGG was recorded from 12 normal speakers and 12 subjects with ADSD reading excerpts from the Rainbow Passage. Data were processed by a new algorithm developed with the specific goal of analyzing continuous dysphonic speech. The contact quotient, pulse width, a new parameter peak skew, and various contact closing slope quotient and contact opening slope quotient measures were extracted. EGG parameters were compared between normal and ADSD speech. Within the ADSD group, intra-subject comparison was also made between perceptually strained syllables and unstrained syllables. The opening slope quotient SO7525 distinguished strained syllables from unstrained syllables in continuous speech within individual subjects with ADSD. The standard deviations, but not the means, of contact quotient, EGGW50, peak skew, and SO7525 were different between normal and ADSD speakers. The strain-stress pattern in continuous speech can be visualized as color gradients based on the variation of EGG parameter values. EGG parameters may provide a within-subject measure of vocal strain and serve as a marker for treatment response. The addition of EGG to multidimensional assessment may lead to improved characterization of the voice disturbance in ADSD. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  3. Musician advantage for speech-on-speech perception

    NARCIS (Netherlands)

    Başkent, Deniz; Gaudrain, Etienne

    Evidence for transfer of musical training to better perception of speech in noise has been mixed. Unlike speech-in-noise, speech-on-speech perception utilizes many of the skills that musical training improves, such as better pitch perception and stream segregation, as well as use of higher-level

  4. Research of Features of the Phonetic System of Speech and Identification of Announcers on the Voice

    Directory of Open Access Journals (Sweden)

    Roman Aleksandrovich Vasilyev

    2013-02-01

    This work proposes a method for the phonetic analysis of speech: extracting a list of elementary speech units, such as individual phonemes, from a continuous stream of informal conversation by a specific announcer. A practical algorithm for identifying the announcer, that is, determining which of a set of announcers is speaking, is also described.

  5. Developing an Effective Model for Predicting Spatially and Temporally Continuous Stream Temperatures from Remotely Sensed Land Surface Temperatures

    Directory of Open Access Journals (Sweden)

    Kristina M. McNyset

    2015-12-01

    Although water temperature is important to stream biota, it is difficult to collect in a spatially and temporally continuous fashion. We used remotely-sensed Land Surface Temperature (LST) data to estimate mean daily stream temperature for every confluence-to-confluence reach in the John Day River, OR, USA for a ten-year period. Models were built at three spatial scales: site-specific, subwatershed, and basin-wide. Model quality was assessed using jackknife and cross-validation. Model metrics for linear regressions of the predicted vs. observed data across all sites and years: site-specific r² = 0.95, Root Mean Squared Error (RMSE) = 1.25 °C; subwatershed r² = 0.88, RMSE = 2.02 °C; and basin-wide r² = 0.87, RMSE = 2.12 °C. Similar analyses were conducted using 2012 eight-day composite LST and eight-day mean stream temperature in five watersheds in the interior Columbia River basin. Mean model metrics across all basins: r² = 0.91, RMSE = 1.29 °C. Sensitivity analyses indicated accurate basin-wide models can be parameterized using data from as few as four temperature logger sites. This approach generates robust estimates of stream temperature through time for broad spatial regions for which there is only spatially and temporally patchy observational data, and may be useful for managers and researchers interested in stream biota.
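
    The two model metrics quoted in this record, r² and RMSE for predicted vs. observed values, can be illustrated with a minimal sketch. This is not the authors' code: the function names and the temperature values below are hypothetical, and r² is assumed here to be the squared Pearson correlation of a simple linear regression of predicted on observed values.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between paired observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """r^2 of a simple linear regression of predicted on observed values,
    computed as the squared Pearson correlation coefficient."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sxy = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    sxx = sum((o - mean_o) ** 2 for o in observed)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    return sxy ** 2 / (sxx * syy)


# Hypothetical daily mean stream temperatures in degrees C (made-up values,
# not data from the study).
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 17.0]

print("RMSE (deg C):", rmse(observed, predicted))
print("r^2:", r_squared(observed, predicted))
```

    RMSE is expressed in the units of the measurement (°C here), whereas r² is unitless, which is why the record reports both.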

  6. Evaluation of Measurements Collected with Multi-Parameter Continuous Water-Quality Monitors in Selected Illinois Streams, 2001-03

    Science.gov (United States)

    Groschen, George E.; King, Robin B.

    2005-01-01

    Eight streams, representing a wide range of environmental and water-quality conditions across Illinois, were monitored from July 2001 to October 2003 for five water-quality parameters as part of a pilot study by the U.S. Geological Survey (USGS) in cooperation with the Illinois Environmental Protection Agency (IEPA). Continuous recording multi-parameter water-quality monitors were installed to collect data on water temperature, dissolved-oxygen concentrations, specific conductivity, pH, and turbidity. The monitors were near USGS streamflow-gaging stations where stage and streamflow are continuously recorded. During the study period, the data collected for these five parameters generally met the data-quality objectives established by the USGS and IEPA at all eight stations. A similar pilot study during this period for measurement of chlorophyll concentrations failed to achieve the data-quality objectives. Of all the sensors used, the temperature sensors provided the most accurate and reliable measurements (generally within ±5 percent of a calibrated thermometer reading). Signal adjustments and calibration of all other sensors are dependent upon an accurate and precise temperature measurement. The dissolved-oxygen sensors were the next most reliable during the study and were responsive to changing conditions and accurate at all eight stations. Specific conductivity was the third most accurate and reliable measurement collected from the multi-parameter monitors. Specific conductivity at the eight stations varied widely, from less than 40 microsiemens (µS) at Rayse Creek near Waltonville to greater than 3,500 µS at Salt Creek at Western Springs. In individual streams, specific conductivity often changed quickly (greater than 25 percent in less than 3 hours) and the sensors generally provided good to excellent record of these variations at all stations. The widest range of specific-conductivity measurements was in Salt Creek at Western Springs in the Greater Chicago

  7. Guidelines for the collection of continuous stream water-temperature data in Alaska

    Science.gov (United States)

    Toohey, Ryan C.; Neal, Edward G.; Solin, Gary L.

    2014-01-01

    Objectives of stream monitoring programs differ considerably among many of the academic, Federal, state, tribal, and non-profit organizations in the state of Alaska. Broad inclusion of stream-temperature monitoring can provide an opportunity for collaboration in the development of a statewide stream-temperature database. Statewide and regional coordination could reduce overall monitoring cost, while providing better analyses at multiple spatial and temporal scales to improve resource decision-making. Increased adoption of standardized protocols and data-quality standards may allow for validation of historical modeling efforts with better projection calibration. For records of stream water temperature to be generally consistent, unbiased, and reproducible, data must be collected and analyzed according to documented protocols. Collection of water-temperature data requires definition of data-quality objectives, good site selection, proper selection of instrumentation, proper installation of sensors, periodic site visits to maintain sensors and download data, pre- and post-deployment verification against an NIST-certified thermometer, potential data corrections, and proper documentation, review, and approval. A study created to develop a quality-assurance project plan, data-quality objectives, and a database management plan that includes procedures for data archiving and dissemination could provide a means to standardize a statewide stream-temperature database in Alaska. Protocols can be modified depending on desired accuracy or specific needs of data collected. This document is intended to guide users in collecting time series water-temperature data in Alaskan streams and draws extensively on the broader protocols already published by the U.S. Geological Survey.

  8. Interdependent processing and encoding of speech and concurrent background noise.

    Science.gov (United States)

    Cooper, Angela; Brouwer, Susanne; Bradlow, Ann R

    2015-05-01

    Speech processing can often take place in adverse listening conditions that involve the mixing of speech and background noise. In this study, we investigated processing dependencies between background noise and indexical speech features, using a speeded classification paradigm (Garner, 1974; Exp. 1), and whether background noise is encoded and represented in memory for spoken words in a continuous recognition memory paradigm (Exp. 2). Whether or not the noise spectrally overlapped with the speech signal was also manipulated. The results of Experiment 1 indicated that background noise and indexical features of speech (gender, talker identity) cannot be completely segregated during processing, even when the two auditory streams are spectrally nonoverlapping. Perceptual interference was asymmetric, whereby irrelevant indexical feature variation in the speech signal slowed noise classification to a greater extent than irrelevant noise variation slowed speech classification. This asymmetry may stem from the fact that speech features have greater functional relevance to listeners, and are thus more difficult to selectively ignore than background noise. Experiment 2 revealed that a recognition cost for words embedded in different types of background noise on the first and second occurrences only emerged when the noise and the speech signal were spectrally overlapping. Together, these data suggest integral processing of speech and background noise, modulated by the level of processing and the spectral separation of the speech and noise.

  9. Continuous Distributed Top-k Monitoring over High-Speed Rail Data Stream in Cloud Computing Environment

    Directory of Open Access Journals (Sweden)

    Hanning Wang

    2013-01-01

    In a cloud computing environment, real-time mass data on high-speed rail, collected through intensive monitoring by large-scale sensing equipment, provides strong support for the safety and maintenance of high-speed rail. In this paper, we focus on a continuous distributed Top-k algorithm over multisource distributed data streams for high-speed rail monitoring. Specifically, we formalize a Top-k monitoring model for high-speed rail and propose DTMR, a Top-k monitoring algorithm that works with random, continuous, or strictly monotone aggregation functions. Extensive experiments showed DTMR to be valid.

  10. Bilingual Mothers' Language Choice in Child-directed Speech: Continuity and Change

    OpenAIRE

    De Houwer, Annick; Bornstein, Marc H.

    2016-01-01

    An important aspect of Family Language Policy in bilingual families is parental language choice. Little is known about the continuity in parental language choice and the factors affecting it. This longitudinal study explores maternal language choice over time. Thirty-one bilingual mothers provided reports of what language(s) they spoke with their children. Mother-child interactions were videotaped when children were pre-verbal (5M), producing words in two languages (20M), and fluent speakers ...

  11. An apparatus for separating and continuously recovering a particulate material carried by a gas stream

    International Nuclear Information System (INIS)

    Becker, W.R.; Dada, A.G.; Dehollander, W.R.; Sloat, R.J.

    1974-01-01

    Description is given of an apparatus adapted to separate and recover a particulate material carried by hot corrosive gases. The apparatus comprises a flow-channel connected to a gas stream source carrying a particulate material, first and second tubes connected to said flow-channel, filtering devices, recovery containers and flow-restricting valves. This can be applied to the recovery of uranium oxides generated by flame reactions.

  12. Current Training and Continuing Education Needs of Preschool and School-Based Speech-Language Pathologists regarding Children with Cleft Lip/Palate

    Science.gov (United States)

    Bedwinek, Anne P.; Kummer, Ann W.; Rice, Gale B.; Grames, Lynn Marty

    2010-01-01

    Purpose: The purpose of this study was to obtain information regarding the education and experience of preschool and school-based speech-language pathologists (SLPs) regarding the assessment and treatment of children born with cleft lip and/or palate and to determine their continuing education needs in this area. Method: A 16-item mixed-methods…

  13. Bilingual Mothers' Language Choice in Child-directed Speech: Continuity and Change.

    Science.gov (United States)

    De Houwer, Annick; Bornstein, Marc H

    2016-01-01

    An important aspect of Family Language Policy in bilingual families is parental language choice. Little is known about the continuity in parental language choice and the factors affecting it. This longitudinal study explores maternal language choice over time. Thirty-one bilingual mothers provided reports of what language(s) they spoke with their children. Mother-child interactions were videotaped when children were pre-verbal (5M), producing words in two languages (20M), and fluent speakers (53M). All children had heard two languages from birth in the home. Most mothers reported addressing children in the same single language. Observational data confirmed mothers' use of mainly a single language in interactions with their children, but also showed the occasional use of the other language in over half the sample when children were 20 months. Once children were 53 months mothers again used only the same language they reported speaking to children. These findings reveal a possible effect of children's overall level of language development and demonstrate the difficulty of adhering to a strict "one person, one language" policy. The fact that there was longitudinal continuity in the language most mothers mainly spoke with children provided children with cumulative language input learning opportunities.

  14. Ecosystem and physiological scales of microbial responses to nutrients in a detritus-based stream: results of a 5-year continuous enrichment

    Science.gov (United States)

    Keller Suberkropp; Vladislav Gulis; Amy D. Rosemond; Jonathan Benstead

    2010-01-01

    Our study examined the response of leaf detritus–associated microorganisms (both bacteria and fungi) to a 5-yr continuous nutrient enrichment of a forested headwater stream. Leaf litter dominates detritus inputs to such streams and, on a system-wide scale, serves as the key substrate for microbial colonization. We determined physiological responses as microbial biomass...

  15. Single-channel in-ear-EEG detects the focus of auditory attention to concurrent tone streams and mixed speech

    Science.gov (United States)

    Fiedler, Lorenz; Wöstmann, Malte; Graversen, Carina; Brandmeyer, Alex; Lunner, Thomas; Obleser, Jonas

    2017-06-01

    Objective. Conventional, multi-channel scalp electroencephalography (EEG) allows the identification of the attended speaker in concurrent-listening (‘cocktail party’) scenarios. This implies that EEG might provide valuable information to complement hearing aids with some form of EEG and to install a level of neuro-feedback. Approach. To investigate whether a listener’s attentional focus can be detected from single-channel hearing-aid-compatible EEG configurations, we recorded EEG from three electrodes inside the ear canal (‘in-Ear-EEG’) and additionally from 64 electrodes on the scalp. In two different, concurrent listening tasks, participants (n = 7) were fitted with individualized in-Ear-EEG pieces and were either asked to attend to one of two dichotically-presented, concurrent tone streams or to one of two diotically-presented, concurrent audiobooks. A forward encoding model was trained to predict the EEG response at single EEG channels. Main results. Each individual participant’s attentional focus could be detected from the single-channel EEG response recorded from short-distance configurations consisting only of a single in-Ear-EEG electrode and an adjacent scalp-EEG electrode. The differences in neural responses to attended and ignored stimuli were consistent in morphology (i.e. polarity and latency of components) across subjects. Significance. In sum, our findings show that the EEG response from a single-channel, hearing-aid-compatible configuration provides valuable information to identify a listener’s focus of attention.

  16. Collective plasma effects associated with the continuous injection model of solar flare particle streams

    Science.gov (United States)

    Vlahos, L.; Papadopoulos, K.

    1979-01-01

    A modified continuous injection model for impulsive solar flares that includes self-consistent plasma nonlinearities based on the concept of marginal stability is presented. A quasi-stationary state is established, composed of a hot truncated electron Maxwellian distribution confined by acoustic turbulence on the top of the loop and energetic electron beams precipitating in the chromosphere. It is shown that the radiation properties of the model are in accordance with observations.

  17. The natural statistics of audiovisual speech.

    Directory of Open Access Journals (Sweden)

    Chandramouli Chandrasekaran

    2009-07-01

    Humans, like other animals, are exposed to a continuous stream of signals, which are dynamic, multimodal, extended, and time varying in nature. This complex input space must be transduced and sampled by our sensory systems and transmitted to the brain where it can guide the selection of appropriate actions. To simplify this process, it's been suggested that the brain exploits statistical regularities in the stimulus space. Tests of this idea have largely been confined to unimodal signals and natural scenes. One important class of multisensory signals for which a quantitative input space characterization is unavailable is human speech. We do not understand what signals our brain has to actively piece together from an audiovisual speech stream to arrive at a percept versus what is already embedded in the signal structure of the stream itself. In essence, we do not have a clear understanding of the natural statistics of audiovisual speech. In the present study, we identified the following major statistical features of audiovisual speech. First, we observed robust correlations and close temporal correspondence between the area of the mouth opening and the acoustic envelope. Second, we found the strongest correlation between the area of the mouth opening and vocal tract resonances. Third, we observed that both area of the mouth opening and the voice envelope are temporally modulated in the 2-7 Hz frequency range. Finally, we show that the timing of mouth movements relative to the onset of the voice is consistently between 100 and 300 ms. We interpret these data in the context of recent neural theories of speech which suggest that speech communication is a reciprocally coupled, multisensory event, whereby the outputs of the signaler are matched to the neural processes of the receiver.

  18. Left Superior Temporal Gyrus Is Coupled to Attended Speech in a Cocktail-Party Auditory Scene.

    Science.gov (United States)

    Vander Ghinst, Marc; Bourguignon, Mathieu; Op de Beeck, Marc; Wens, Vincent; Marty, Brice; Hassid, Sergio; Choufani, Georges; Jousmäki, Veikko; Hari, Riitta; Van Bogaert, Patrick; Goldman, Serge; De Tiège, Xavier

    2016-02-03

    Using a continuous listening task, we evaluated the coupling between the listener's cortical activity and the temporal envelopes of different sounds in a multitalker auditory scene using magnetoencephalography and corticovocal coherence analysis. Neuromagnetic signals were recorded from 20 right-handed healthy adult humans who listened to five different recorded stories (attended speech streams), one without any multitalker background (No noise) and four mixed with a "cocktail party" multitalker background noise at four signal-to-noise ratios (5, 0, -5, and -10 dB) to produce speech-in-noise mixtures, here referred to as Global scene. Coherence analysis revealed that the modulations of the attended speech stream, presented without multitalker background, were coupled at ∼0.5 Hz to the activity of both superior temporal gyri, whereas the modulations at 4-8 Hz were coupled to the activity of the right supratemporal auditory cortex. In cocktail party conditions, with the multitalker background noise, the coupling was at both frequencies stronger for the attended speech stream than for the unattended Multitalker background. The coupling strengths decreased as the Multitalker background increased. During the cocktail party conditions, the ∼0.5 Hz coupling became left-hemisphere dominant, compared with bilateral coupling without the multitalker background, whereas the 4-8 Hz coupling remained right-hemisphere lateralized in both conditions. The brain activity was not coupled to the multitalker background or to its individual talkers. The results highlight the key role of listener's left superior temporal gyri in extracting the slow ∼0.5 Hz modulations, likely reflecting the attended speech stream within a multitalker auditory scene. When people listen to one person in a "cocktail party," their auditory cortex mainly follows the attended speech stream rather than the entire auditory scene. However, how the brain extracts the attended speech stream from the whole

  19. Thermo-fluid-dynamics of turbulent boundary layer over a moving continuous flat sheet in a parallel free stream

    Science.gov (United States)

    Afzal, Bushra; Noor Afzal Team; Bushra Afzal Team

    2014-11-01

    The momentum and thermal turbulent boundary layers over a continuous moving sheet subjected to a free stream have been analyzed using a two-layer (inner wall and outer wake) theory at large Reynolds number. The present work is based on the open Reynolds equations of momentum and heat transfer without any closure model such as eddy viscosity or mixing length. The matching of inner and outer layers has been carried out by the Izakson-Millikan-Kolmogorov hypothesis. The matching for velocity and temperature profiles yields the logarithmic laws and power laws in the overlap region of inner and outer layers, along with friction factor and heat transfer laws. Uniformly valid solutions for velocity, Reynolds shear stress, temperature and thermal Reynolds heat flux have been proposed by introducing the outer wake functions due to momentum and thermal boundary layers. Comparisons with experimental data for velocity profile, temperature profile, skin friction and heat transfer are presented. In the outer non-linear layers, the lowest order momentum and thermal boundary layer equations have also been analyzed by using an eddy viscosity closure model, and the results are compared with experimental data. Retired Professor, Embassy Hotel, Rasal Ganj, Aligarh 202001 India.

  20. Dansk Rapport: Work Stream 3: Fokus gruppe interviews: Militante fra den anden side Side: Demokratiske kræfter mod hate-speech i Danmark

    OpenAIRE

    Siim, Birte; Larsen, Jeppe Fuglsang; Meret, Susi

    2014-01-01

    The purpose of this national report is to analyze the role of social movements/organizations/initiatives in the struggle against racism, discrimination, hate-speech and behavior in Denmark. The first part includes a brief summary of the Danish political landscape for the democratic anti-bodies. This is followed by a mapping of voluntary movements/groups/organizations comparing the diverse policies and strategies towards racism, discrimination and hate-speech and behavior as well as the kind o...

  1. Speech Problems

    Science.gov (United States)

    ... a person's ability to speak clearly. Some Common Speech and Language Disorders Stuttering is a problem that ...

  2. Effective Connectivity Hierarchically Links Temporoparietal and Frontal Areas of the Auditory Dorsal Stream with the Motor Cortex Lip Area during Speech Perception

    Science.gov (United States)

    Murakami, Takenobu; Restle, Julia; Ziemann, Ulf

    2012-01-01

    A left-hemispheric cortico-cortical network involving areas of the temporoparietal junction (TPJ) and the posterior inferior frontal gyrus (pIFG) is thought to support sensorimotor integration of speech perception into articulatory motor activation, but how this network links with the lip area of the primary motor cortex (M1) during speech…

  3. Speech Compression

    Directory of Open Access Journals (Sweden)

    Jerry D. Gibson

    2016-06-01

    Speech compression is a key technology underlying digital cellular communications, VoIP, voicemail, and voice response systems. We trace the evolution of speech coding based on the linear prediction model, highlight the key milestones in speech coding, and outline the structures of the most important speech coding standards. Current challenges, future research directions, fundamental limits on performance, and the critical open problem of speech coding for emergency first responders are all discussed.

  4. Estimated fecal coliform bacteria concentrations using near real-time continuous water-quality and streamflow data from five stream sites in Chester County, Pennsylvania, 2007–16

    Science.gov (United States)

    Senior, Lisa A.

    2017-09-15

    Several streams used for recreational activities, such as fishing, swimming, and boating, in Chester County, Pennsylvania, are known to have periodic elevated concentrations of fecal coliform bacteria, a type of bacteria used to indicate the potential presence of fecally related pathogens that may pose health risks to humans exposed through water contact. The availability of near real-time continuous stream discharge, turbidity, and other water-quality data for some streams in the county presents an opportunity to use surrogates to estimate near real-time concentrations of fecal coliform (FC) bacteria and thus provide some information about associated potential health risks during recreational use of streams. The U.S. Geological Survey (USGS), in cooperation with the Chester County Health Department (CCHD) and the Chester County Water Resources Authority (CCWRA), has collected discrete stream samples for analysis of FC concentrations during March–October annually at or near five gaging stations where near real-time continuous data on stream discharge, turbidity, and water temperature have been collected since 2007 (or since 2012 at 2 of the 5 stations). In 2014, the USGS, in cooperation with the CCWRA and CCHD, began to develop regression equations to estimate FC concentrations using available near real-time continuous data. Regression equations included possible explanatory variables of stream discharge, turbidity, water temperature, and seasonal factors calculated using Julian Day with base-10 logarithmic (log) transformations of selected variables. The regression equations were developed using the data from 2007 to 2015 (101–106 discrete bacteria samples per site) for three gaging stations on Brandywine Creek (West Branch Brandywine Creek at Modena, East Branch Brandywine Creek below Downingtown, and Brandywine Creek at Chadds Ford) and from 2012 to 2015 (37–38 discrete bacteria samples per site) for one station each on French Creek near Phoenixville and

  5. Continuous analytical control of the streaming waters in a uranium treatment plant and of various chemical products using automatic discharge valves

    International Nuclear Information System (INIS)

    Archimbaud, M.; Simeon, C.

    1968-01-01

    This report describes a method for controlling the streaming waters produced by the Pierrelatte Centre; it is based on continuous analysis, with simultaneous recording of the species liable to be found accidentally in the corresponding hydrological circuits (chlorides, fluorides, chromium VI, uranium). An alarm set off at pre-determined thresholds leads to an automatic cutting off of the discharge valves; the outward flow of the waters is thus interrupted. This study has shown the various applications which can be found for this water control method, and gives an idea of the cost price. (authors) [fr

  6. Common neural substrates support speech and non-speech vocal tract gestures.

    Science.gov (United States)

    Chang, Soo-Eun; Kenney, Mary Kay; Loucks, Torrey M J; Poletto, Christopher J; Ludlow, Christy L

    2009-08-01

    The issue of whether speech is supported by the same neural substrates as non-speech vocal tract gestures has been contentious. In this fMRI study we tested whether producing non-speech vocal tract gestures in humans shares the same functional neuroanatomy as non-sense speech syllables. Production of non-speech vocal tract gestures, devoid of phonological content but similar to speech in that they had familiar acoustic and somatosensory targets, was compared to the production of speech syllables without meaning. Brain activation related to overt production was captured with BOLD fMRI using a sparse sampling design for both conditions. Speech and non-speech were compared using voxel-wise whole brain analyses, and ROI analyses focused on frontal and temporoparietal structures previously reported to support speech production. Results showed substantial activation overlap between speech and non-speech function in regions. Although non-speech gesture production showed greater extent and amplitude of activation in the regions examined, both speech and non-speech showed comparable left laterality in activation for both target perception and production. These findings posit a more general role of the previously proposed "auditory dorsal stream" in the left hemisphere--to support the production of vocal tract gestures that are not limited to speech processing.

  7. Data-driven analysis of functional brain interactions during free listening to music and speech.

    Science.gov (United States)

    Fang, Jun; Hu, Xintao; Han, Junwei; Jiang, Xi; Zhu, Dajiang; Guo, Lei; Liu, Tianming

    2015-06-01

    Natural stimulus functional magnetic resonance imaging (N-fMRI), such as fMRI acquired while participants watch video streams or listen to audio streams, has been increasingly used to investigate functional mechanisms of the human brain in recent years. One of the fundamental challenges in functional brain mapping based on N-fMRI is to model the brain's functional responses to continuous, naturalistic and dynamic natural stimuli. To address this challenge, in this paper we present a data-driven approach to exploring functional interactions in the human brain during free listening to music and speech streams. Specifically, we model the brain responses using N-fMRI by measuring the functional interactions on large-scale brain networks with intrinsically established structural correspondence, and perform music and speech classification tasks to guide the systematic identification of consistent and discriminative functional interactions when multiple subjects were listening to music and speech in multiple categories. The underlying premise is that the functional interactions derived from N-fMRI data of multiple subjects should exhibit both consistency and discriminability. Our experimental results show that a variety of brain systems including attention, memory, auditory/language, emotion, and action networks are among the most relevant brain systems involved in classic music, pop music and speech differentiation. Our study provides an alternative approach to investigating the human brain's mechanism in comprehension of complex natural music and speech.

  8. Speech Matters

    DEFF Research Database (Denmark)

    Hasse Jørgensen, Stina

    2011-01-01

    About Speech Matters - Katarina Gregos, the Greek curator's exhibition at the Danish Pavilion, the Venice Biennale 2011.

  9. Speech-to-Speech Relay Service

    Science.gov (United States)

    Speech-to-Speech (STS) is one form of Telecommunications Relay Service (TRS). TRS is a service that allows persons with hearing and speech disabilities ...

  10. Apraxia of Speech

    Science.gov (United States)

    Apraxia of speech (AOS), also known as acquired ...

  11. Introductory speeches

    International Nuclear Information System (INIS)

    2001-01-01

    This CD is a multimedia presentation of the safety upgrading programme of the Bohunice V1 NPP. This chapter consists of an introductory commentary and 4 introductory speeches (video records): (1) Introductory speech of Vincent Pillar, Board chairman and director general of Slovak electric, Plc. (SE); (2) Introductory speech of Stefan Schmidt, director of SE - Bohunice Nuclear power plants; (3) Introductory speech of Jan Korec, Board chairman and director general of VUJE Trnava, Inc. - Engineering, Design and Research Organisation, Trnava; (4) Introductory speech of Dietrich Kuschel, Senior vice-president of FRAMATOME ANP Project and Engineering

  12. Productivity of Stream Definitions

    NARCIS (Netherlands)

    Endrullis, Jörg; Grabmayer, Clemens; Hendriks, Dimitri; Isihara, Ariya; Klop, Jan

    2007-01-01

    We give an algorithm for deciding productivity of a large and natural class of recursive stream definitions. A stream definition is called ‘productive’ if it can be evaluated continuously in such a way that a uniquely determined stream is obtained as the limit. Whereas productivity is undecidable

  13. Productivity of stream definitions

    NARCIS (Netherlands)

    Endrullis, J.; Grabmayer, C.A.; Hendriks, D.; Isihara, A.; Klop, J.W.

    2008-01-01

    We give an algorithm for deciding productivity of a large and natural class of recursive stream definitions. A stream definition is called ‘productive’ if it can be evaluated continually in such a way that a uniquely determined stream in constructor normal form is obtained as the limit. Whereas

  14. Peculiarities of the Continuous Glucose Monitoring Data Stream and Their Impact on Developing Closed-Loop Control Technology

    OpenAIRE

    Kovatchev, Boris; Clarke, William

    2008-01-01

    Therapeutic advances in type 1 diabetes (T1DM) are currently focused on developing a closed-loop control system using a continuous glucose monitor (CGM), subcutaneous insulin delivery, and a control algorithm. Because a CGM assesses blood glucose indirectly (and therefore often inaccurately), it limits the effectiveness of the controller. In order to improve the quality of CGM data, a series of analyses are suggested. These analyses evaluate and compensate for CGM errors, assess risks associa...

  15. Estimation of Constituent Concentrations, Loads, and Yields in Streams of Johnson County, Northeast Kansas, Using Continuous Water-Quality Monitoring and Regression Models, October 2002 through December 2006

    Science.gov (United States)

    Rasmussen, Teresa J.; Lee, Casey J.; Ziegler, Andrew C.

    2008-01-01

    Johnson County is one of the most rapidly developing counties in Kansas. Population growth and expanding urban land use affect the quality of county streams, which are important for human and environmental health, water supply, recreation, and aesthetic value. This report describes estimates of streamflow and constituent concentrations, loads, and yields in relation to watershed characteristics in five Johnson County streams using continuous in-stream sensor measurements. Specific conductance, pH, water temperature, turbidity, and dissolved oxygen were monitored in five watersheds from October 2002 through December 2006. These continuous data were used in conjunction with discrete water samples to develop regression models for continuously estimating concentrations of other constituents. Continuous regression-based concentrations were estimated for suspended sediment, total suspended solids, dissolved solids and selected major ions, nutrients (nitrogen and phosphorus species), and fecal-indicator bacteria. Continuous daily, monthly, seasonal, and annual loads were calculated from concentration estimates and streamflow. The data are used to describe differences in concentrations, loads, and yields and to explain these differences relative to watershed characteristics. Water quality at the five monitoring sites varied according to hydrologic conditions; contributing drainage area; land use (including degree of urbanization); relative contributions from point and nonpoint constituent sources; and human activity within each watershed. Dissolved oxygen (DO) concentrations were less than the Kansas aquatic-life-support criterion of 5.0 mg/L less than 10 percent of the time at all sites except Indian Creek, which had DO concentrations less than the criterion about 15 percent of the time. Concentrations of suspended sediment, chloride (winter only), indicator bacteria, and pesticides were substantially larger during periods of increased streamflow. Suspended

  16. Sleep Disrupts High-Level Speech Parsing Despite Significant Basic Auditory Processing.

    Science.gov (United States)

    Makov, Shiri; Sharon, Omer; Ding, Nai; Ben-Shachar, Michal; Nir, Yuval; Zion Golumbic, Elana

    2017-08-09

    parsing are also preserved. We used a novel approach for studying the depth of speech processing across wakefulness and sleep while tracking neuronal activity with EEG. We found that responses to the auditory sound stream remained intact; however, the sleeping brain did not show signs of hierarchical parsing of the continuous stream of syllables into words, phrases, and sentences. The results suggest that sleep imposes a functional barrier between basic sensory processing and high-level cognitive processing. This paradigm also holds promise for studying residual cognitive abilities in a wide array of unresponsive states. Copyright © 2017 the authors 0270-6474/17/377772-10$15.00/0.

  17. Audio-Visual Speech Recognition Using MPEG-4 Compliant Visual Features

    Directory of Open Access Journals (Sweden)

    Petar S. Aleksic

    2002-11-01

    We describe an audio-visual automatic continuous speech recognition system, which significantly improves speech recognition performance over a wide range of acoustic noise levels, as well as under clean audio conditions. The system utilizes facial animation parameters (FAPs) supported by the MPEG-4 standard for the visual representation of speech. We also describe a robust and automatic algorithm we have developed to extract FAPs from visual data, which does not require hand labeling or extensive training procedures. Principal component analysis (PCA) was performed on the FAPs in order to decrease the dimensionality of the visual feature vectors, and the derived projection weights were used as visual features in the audio-visual automatic speech recognition (ASR) experiments. Both single-stream and multistream hidden Markov models (HMMs) were used to model the ASR system, integrate audio and visual information, and perform relatively large vocabulary (approximately 1000 words) speech recognition experiments. The experiments performed use clean audio data and audio data corrupted by stationary white Gaussian noise at various SNRs. The proposed system reduces the word error rate (WER) by 20% to 23% relative to audio-only speech recognition WERs, at various SNRs (0–30 dB) with additive white Gaussian noise, and by 19% relative to audio-only speech recognition WER under clean audio conditions.
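    The "relative" WER reductions quoted above are ratios against the audio-only baseline, not absolute percentage-point differences. A minimal sketch of that arithmetic follows; the function name and the sample WER values are illustrative assumptions and are not taken from the paper.

```python
def relative_wer_reduction(wer_baseline: float, wer_system: float) -> float:
    """Relative reduction in word error rate with respect to a baseline.

    Both arguments are WERs in the same unit (e.g. percent).
    """
    if wer_baseline <= 0:
        raise ValueError("baseline WER must be positive")
    return (wer_baseline - wer_system) / wer_baseline


# Hypothetical values: audio-only WER of 25.0%, audio-visual WER of 20.0%.
# The absolute gain is 5 points; the relative gain is 5 / 25.
reduction = relative_wer_reduction(25.0, 20.0)
print(f"relative WER reduction: {reduction:.0%}")
```

    Reporting the relative figure makes results comparable across SNR conditions whose baseline error rates differ widely.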

  18. Neuronal basis of speech comprehension.

    Science.gov (United States)

    Specht, Karsten

    2014-01-01

    Verbal communication does not rely only on the simple perception of auditory signals. It is rather a parallel and integrative processing of linguistic and non-linguistic information, involving temporal and frontal areas in particular. This review describes the inherent complexity of auditory speech comprehension from a functional-neuroanatomical perspective. The review is divided into two parts. In the first part, structural and functional asymmetry of language relevant structures will be discussed. The second part of the review will discuss recent neuroimaging studies, which coherently demonstrate that speech comprehension processes rely on a hierarchical network involving the temporal, parietal, and frontal lobes. Further, the results support the dual-stream model for speech comprehension, with a dorsal stream for auditory-motor integration, and a ventral stream for extracting meaning but also the processing of sentences and narratives. Specific patterns of functional asymmetry between the left and right hemisphere can also be demonstrated. The review article concludes with a discussion on interactions between the dorsal and ventral streams, particularly the involvement of motor related areas in speech perception processes, and outlines some remaining unresolved issues. This article is part of a Special Issue entitled Human Auditory Neuroimaging. Copyright © 2013 Elsevier B.V. All rights reserved.

  19. The stream of experience when watching artistic movies. Dynamic aesthetic effects revealed by the continuous evaluation procedure (CEP)

    Directory of Open Access Journals (Sweden)

    Claudia Muth

    2015-03-01

    Research in perception and appreciation is often focused on snapshots, stills of experience. Static approaches allow for multidimensional assessment, but are unable to catch the crucial dynamics of affective and perceptual processes; for instance, aesthetic phenomena such as the ‘Aesthetic-Aha’ (the increase in liking after the sudden detection of Gestalt), effects of expectation, or Berlyne’s idea that ‘disorientation’ with a ‘promise of success’ elicits interest. We conducted empirical studies on indeterminate artistic movies depicting the evolution and metamorphosis of Gestalt and investigated (i) the effects of sudden perceptual insights on liking; that is, Aesthetic Aha-effects, (ii) the dynamics of interest before moments of insight, and (iii) the dynamics of complexity before and after moments of insight. Via the so-called Continuous Evaluation Procedure (CEP) enabling analogous evaluation in a continuous way, participants assessed the material on two aesthetic dimensions blockwise either in a gallery or a laboratory. The material’s inherent dynamics were described via assessments of liking, interest, determinacy and surprise along with a computational analysis on the variable complexity. We identified moments of insight as peaks in determinacy and surprise. Statistically significant changes in liking and interest demonstrated that: (i) insights increase liking, (ii) interest already increases 1,500 ms before such moments of insight, supporting the idea that it is evoked by an expectation of understanding, and (iii) insights occur during increasing complexity. We propose a preliminary model of dynamics in liking and interest with regard to complexity and perceptual insight and discuss descriptions of participants’ experiences of insight. Our results point to the importance of systematic analyses of dynamics in art perception and appreciation.

  20. The stream of experience when watching artistic movies. Dynamic aesthetic effects revealed by the Continuous Evaluation Procedure (CEP).

    Science.gov (United States)

    Muth, Claudia; Raab, Marius H; Carbon, Claus-Christian

    2015-01-01

    Research in perception and appreciation is often focused on snapshots, stills of experience. Static approaches allow for multidimensional assessment, but are unable to catch the crucial dynamics of affective and perceptual processes; for instance, aesthetic phenomena such as the "Aesthetic-Aha" (the increase in liking after the sudden detection of Gestalt), effects of expectation, or Berlyne's idea that "disorientation" with a "promise of success" elicits interest. We conducted empirical studies on indeterminate artistic movies depicting the evolution and metamorphosis of Gestalt and investigated (i) the effects of sudden perceptual insights on liking; that is, "Aesthetic Aha"-effects, (ii) the dynamics of interest before moments of insight, and (iii) the dynamics of complexity before and after moments of insight. Via the so-called Continuous Evaluation Procedure (CEP) enabling analogous evaluation in a continuous way, participants assessed the material on two aesthetic dimensions blockwise either in a gallery or a laboratory. The material's inherent dynamics were described via assessments of liking, interest, determinacy, and surprise along with a computational analysis on the variable complexity. We identified moments of insight as peaks in determinacy and surprise. Statistically significant changes in liking and interest demonstrated that: (i) insights increase liking, (ii) interest already increases 1500 ms before such moments of insight, supporting the idea that it is evoked by an expectation of understanding, and (iii) insights occur during increasing complexity. We propose a preliminary model of dynamics in liking and interest with regard to complexity and perceptual insight and discuss descriptions of participants' experiences of insight. Our results point to the importance of systematic analyses of dynamics in art perception and appreciation.

  1. Speech coding

    Energy Technology Data Exchange (ETDEWEB)

    Ravishankar, C., Hughes Network Systems, Germantown, MD

    1998-05-08

    Speech is the predominant means of communication between human beings, and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage of the speech signal getting corrupted by noise, cross-talk and distortion. Long-haul transmissions, which use repeaters to compensate for the loss in signal strength on transmission links, also increase the associated noise and distortion. On the other hand, digital transmission is relatively immune to noise, cross-talk and distortion, primarily because of the capability to faithfully regenerate the digital signal at each repeater purely based on a binary decision. Hence end-to-end performance of the digital link essentially becomes independent of the length and operating frequency bands of the link. Hence, from a transmission point of view, digital transmission has been the preferred approach due to its higher immunity to noise. The need to carry digital speech became extremely important from a service provision point of view as well. Modern requirements have introduced the need for robust, flexible and secure services that can carry a multitude of signal types (such as voice, data and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term Speech Coding often refers to techniques that represent or code speech signals either directly as a waveform or as a set of parameters by analyzing the speech signal. In either case, the codes are transmitted to the distant end where speech is reconstructed or synthesized using the received set of codes. A more generic term that is applicable to these techniques and is often used interchangeably with speech coding is the term voice coding. This term is more generic in the sense that the

  2. Continuous preparation of Fe3O4 nanoparticles through Impinging Stream-Rotating Packed Bed reactor and their electrochemistry detection toward heavy metal ions

    Energy Technology Data Exchange (ETDEWEB)

    Fan, Hong-Lei [Shanxi Province Key Laboratory of Higee-Oriented Chemical Engineering, North University of China, Taiyuan, 030051 (China); Zhou, Shao-Feng [Shanxi Province Key Laboratory of Functional Nanocomposites, North University of China, Taiyuan, 030051 (China); Gao, Jing [Shanxi Province Key Laboratory of Higee-Oriented Chemical Engineering, North University of China, Taiyuan, 030051 (China); Liu, You-Zhi, E-mail: lyzzhongxin@126.com [Shanxi Province Key Laboratory of Higee-Oriented Chemical Engineering, North University of China, Taiyuan, 030051 (China)

    2016-06-25

    We report the continuous preparation of Fe3O4 nanoparticles (Fe3O4 NPs) and their electrochemical behavior toward heavy metal ions. The Fe3O4 NPs were fabricated through a novel Impinging Stream-Rotating Packed Bed reactor with a high production rate of 2.23 kg/hour. The as-prepared Fe3O4 NPs were quasi-spherical with a mean diameter of about 10 nm and showed the characteristics of superparamagnetism, with a saturated magnetization of 60.5 emu/g. The electrochemical response of the as-prepared Fe3O4 NPs toward heavy metal ions was evaluated using square wave anodic stripping voltammetry (SWASV) analysis. The results indicated that the modified electrode could be used for individual detection of Pb(II), Cu(II), Hg(II) and Cd(II). In particular, the modified electrode exhibited selective detection of Pb(II) with a higher sensitivity of 14.9 μA/μM, while the responses to Cu(II), Hg(II) and Cd(II) were negligible. In addition, the modified electrode showed good stability and potential practical applicability in the electrochemical determination of Pb(II). These results offer a simple method for the continuous preparation of sensing materials for the electrochemical detection of toxic metal ions through the technology of process intensification. - Highlights: • Fe3O4 nanoparticles were continuously prepared through an IS-RPB reactor. • The Fe3O4 nanoparticles showed selective detection of heavy metal ions. • It exhibited favorable sensitivity (14.9 μA μM⁻¹) and LOD (0.119 μM) for Pb(II). • The as-prepared nanoparticles showed favorable potential application.

  3. Continuous preparation of Fe3O4 nanoparticles through Impinging Stream-Rotating Packed Bed reactor and their electrochemistry detection toward heavy metal ions

    International Nuclear Information System (INIS)

    Fan, Hong-Lei; Zhou, Shao-Feng; Gao, Jing; Liu, You-Zhi

    2016-01-01

    We report the continuous preparation of Fe3O4 nanoparticles (Fe3O4 NPs) and their electrochemical behavior toward heavy metal ions. The Fe3O4 NPs were fabricated through a novel Impinging Stream-Rotating Packed Bed reactor with a high production rate of 2.23 kg/hour. The as-prepared Fe3O4 NPs were quasi-spherical with a mean diameter of about 10 nm and showed the characteristics of superparamagnetism, with a saturated magnetization of 60.5 emu/g. The electrochemical response of the as-prepared Fe3O4 NPs toward heavy metal ions was evaluated using square wave anodic stripping voltammetry (SWASV) analysis. The results indicated that the modified electrode could be used for individual detection of Pb(II), Cu(II), Hg(II) and Cd(II). In particular, the modified electrode exhibited selective detection of Pb(II) with a higher sensitivity of 14.9 μA/μM, while the responses to Cu(II), Hg(II) and Cd(II) were negligible. In addition, the modified electrode showed good stability and potential practical applicability in the electrochemical determination of Pb(II). These results offer a simple method for the continuous preparation of sensing materials for the electrochemical detection of toxic metal ions through the technology of process intensification. - Highlights: • Fe3O4 nanoparticles were continuously prepared through an IS-RPB reactor. • The Fe3O4 nanoparticles showed selective detection of heavy metal ions. • It exhibited favorable sensitivity (14.9 μA μM⁻¹) and LOD (0.119 μM) for Pb(II). • The as-prepared nanoparticles showed favorable potential application.

  4. New insights into agricultural pesticide pollution through a complete and continuous pesticide screening during one growing season in five small Swiss streams

    Science.gov (United States)

    Mangold, Simon; Doppler, Tobias; Spycher, Simon; Langer, Miriam; Junghans, Marion; Kunz, Manuel; Stamm, Christian; Singer, Heinz

    2017-04-01

    Agricultural pesticides are regularly found in many surface waters draining agricultural areas. Due to large fluctuations in concentration over time and the potentially high number of pesticides, it is difficult to obtain a complete overview of the real pollution level. This collaborative project between research, federal and cantonal authorities in Switzerland aimed for a comprehensive assessment of pesticide pollution in five small agricultural streams to tackle this knowledge gap. The five streams are located in catchments (1.5 to 9 km²) with intensive agriculture covering a wide range of crops including vegetables, vineyards and orchards. Twelve-hour composite samples were collected continuously from March until the end of August 2015 with automatic sampling devices, yielding 360 samples per site. Using precipitation and water level data, we differentiated between discharge events and low-flow periods. Samples from discharge events were measured individually, whereas samples taken during dry weather were pooled for the analysis. This procedure resulted in a complete concentration profile over the entire monitoring period covered by 34 to 60 samples per site. The analysis, using liquid chromatography coupled to high resolution mass spectrometry, involved a target screening of about 220 pesticides. The measured concentrations were compared to chronic and acute environmental quality standards (EQS values), resulting in risk quotients (RQs), which are the ratios between measured concentrations and the respective EQS values. Despite the small size of the catchments, we observed a large pesticide diversity in all of them, with 68 to 103 detected compounds per study area. At all sites, chronic EQS values were exceeded. However, the exposure levels varied substantially among catchments. Maximum chronic RQs per site ranged between 1.1 and 48.8 and the duration of EQS exceedance varied between 2 weeks and 5.5 months. Additionally, the data reveal (very) high concentration
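    The risk quotient described in this abstract is the ratio of a measured concentration to the corresponding EQS value, with values above 1 indicating an exceedance. The sketch below illustrates that ratio and a per-compound maximum over a series of samples; the function names, the data layout and the example numbers are assumptions made for illustration and do not come from the study.

```python
from typing import Dict, List


def risk_quotient(concentration: float, eqs: float) -> float:
    """RQ = measured concentration / environmental quality standard."""
    if eqs <= 0:
        raise ValueError("EQS must be positive")
    return concentration / eqs


def max_risk_quotients(
    samples: List[Dict[str, float]], eqs: Dict[str, float]
) -> Dict[str, float]:
    """Largest RQ per compound across all samples.

    Each sample maps compound name -> concentration (same unit as the EQS).
    Compounds without an EQS value are skipped.
    """
    result: Dict[str, float] = {}
    for sample in samples:
        for compound, conc in sample.items():
            if compound not in eqs:
                continue
            rq = risk_quotient(conc, eqs[compound])
            result[compound] = max(rq, result.get(compound, 0.0))
    return result


# Hypothetical compounds "A" and "B" with made-up concentrations and EQS values.
samples = [{"A": 0.5, "B": 2.0}, {"A": 3.0}]
eqs = {"A": 1.0, "B": 4.0}
for compound, rq in max_risk_quotients(samples, eqs).items():
    status = "exceeds EQS" if rq > 1 else "below EQS"
    print(compound, rq, status)
```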

  5. Mutual Information Based Dynamic Integration of Multiple Feature Streams for Robust Real-Time LVCSR

    Science.gov (United States)

    Sato, Shoei; Kobayashi, Akio; Onoe, Kazuo; Homma, Shinichi; Imai, Toru; Takagi, Tohru; Kobayashi, Tetsunori

    We present a novel method of integrating the likelihoods of multiple feature streams, representing different acoustic aspects, for robust speech recognition. The integration algorithm dynamically calculates a frame-wise stream weight so that a higher weight is given to a stream that is robust to a variety of noisy environments or speaking styles. Such a robust stream is expected to show discriminative ability. A conventional method proposed for the recognition of spoken digits calculates the weights from the entropy of the whole set of HMM states. This paper extends the dynamic weighting to a real-time large-vocabulary continuous speech recognition (LVCSR) system. The proposed weight is calculated in real-time from mutual information between an input stream and active HMM states in a search space without an additional likelihood calculation. Furthermore, the mutual information takes the width of the search space into account by calculating the marginal entropy from the number of active states. In this paper, we integrate three features that are extracted through auditory filters by taking into account the human auditory system's ability to extract amplitude and frequency modulations. Accordingly, features representing energy, amplitude drift, and resonant frequency drifts are integrated. These features are expected to provide complementary clues for speech recognition. Speech recognition experiments on field reports and spontaneous commentary from Japanese broadcast news showed that the proposed method reduced error words by 9.2% in field reports and 4.7% in spontaneous commentaries relative to the best result obtained from a single stream.
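    The idea of frame-wise stream weighting can be sketched as follows. This is not the authors' formula: it assumes, purely for illustration, that each stream's score is the gap between the entropy of a uniform distribution over the N active states (log N) and the entropy of that stream's state posteriors, and that the normalised scores weight the per-stream log-likelihoods. All function names and numbers are hypothetical.

```python
import math
from typing import List, Sequence


def entropy(posteriors: Sequence[float]) -> float:
    """Shannon entropy (nats) of a posterior distribution over active states."""
    return -sum(p * math.log(p) for p in posteriors if p > 0)


def stream_weights(posteriors_per_stream: List[Sequence[float]]) -> List[float]:
    """Frame-wise weights: a stream with sharper posteriors gets a larger weight.

    Score per stream = log(N) - H(posteriors), i.e. how far the posteriors are
    from uniform over the N active states (an assumed, simplified criterion).
    """
    scores = []
    for post in posteriors_per_stream:
        score = math.log(len(post)) - entropy(post)
        scores.append(max(0.0, score))  # guard against rounding below zero
    total = sum(scores)
    if total <= 1e-12:
        # No stream is informative in this frame: fall back to equal weights.
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


def combine_log_likelihoods(
    log_likes: Sequence[float], weights: Sequence[float]
) -> float:
    """Weighted sum of per-stream log-likelihoods for one state and frame."""
    return sum(w * ll for w, ll in zip(weights, log_likes))


# One frame, two streams, four active states (made-up posteriors).
weights = stream_weights([[0.97, 0.01, 0.01, 0.01], [0.25, 0.25, 0.25, 0.25]])
print(weights)
print(combine_log_likelihoods([-1.0, -3.0], weights))
```

    Tying the reference entropy to the number of active states means the score adapts as the search space widens or narrows, which is the property the abstract attributes to its marginal-entropy term.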

  6. The role of high-level processes for oscillatory phase entrainment to speech sound

    Directory of Open Access Journals (Sweden)

    Benedikt Zoefel

    2015-12-01

    Constantly bombarded with input, the brain has the need to filter out relevant information while ignoring the irrelevant rest. A powerful tool may be represented by neural oscillations which entrain their high-excitability phase to important input while their low-excitability phase attenuates irrelevant information. Indeed, the alignment between brain oscillations and speech improves intelligibility and helps dissociating speakers during a cocktail party. Although well-investigated, the contribution of low- and high-level processes to phase entrainment to speech sound has only recently begun to be understood. Here, we review those findings, and concentrate on three main results: (1) Phase entrainment to speech sound is modulated by attention or predictions, likely supported by top-down signals and indicating higher-level processes involved in the brain’s adjustment to speech. (2) As phase entrainment to speech can be observed without systematic fluctuations in sound amplitude or spectral content, it does not only reflect a passive steady-state ringing of the cochlea, but entails a higher-level process. (3) The role of intelligibility for phase entrainment is debated. Recent results suggest that intelligibility modulates the behavioral consequences of entrainment, rather than directly affecting the strength of entrainment in auditory regions. We conclude that phase entrainment to speech reflects a sophisticated mechanism: Several high-level processes interact to optimally align neural oscillations with predicted events of high relevance, even when they are hidden in a continuous stream of background noise.

  7. Continuous preparation of nanoscale zero-valent iron using impinging stream-rotating packed bed reactor and their application in reduction of nitrobenzene

    Science.gov (United States)

    Jiao, Weizhou; Qin, Yuejiao; Luo, Shuai; Feng, Zhirong; Liu, Youzhi

    2017-02-01

    Nanoscale zero-valent iron (nZVI) was continuously prepared by high-gravity reaction precipitation through a novel impinging stream-rotating packed bed (IS-RPB). Reactant solutions of FeSO4 and NaBH4 were conducted into the IS-RPB with flow rates of 60 L/h and rotating speed of 1000 r/min for the preparation of nZVI. As-prepared nZVI obtained by IS-RPB had a quasi-spherical morphology and was almost uniformly distributed with a particle size of 10-20 nm. The reactivity of nZVI was estimated by the degradation of 100 ml nitrobenzene (NB) with initial concentration of 250 mg/L. The optimum dosage of nZVI obtained by IS-RPB was 4.0 g/L as the NB could be completely removed within 10 min, which is a reduction of 20% compared with nZVI obtained by a stirred tank reactor (STR). The reduction of NB and production of aniline (AN) followed pseudo-first-order kinetics, and the pseudo-first-order rate constants were 0.0147 and 0.0034 s⁻¹, respectively. Furthermore, the as-prepared nZVI using IS-RPB reactor in this work can be used within a relatively wide pH range of 1-9.
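
    The pseudo-first-order model quoted in this abstract can be worked through numerically. The sketch below uses the reported NB rate constant and initial concentration with the standard first-order decay law C(t) = C0 exp(-k t); the function names are hypothetical, and applying the law over the whole reaction time is an assumption made for illustration.

```python
import math

# Values taken from the abstract.
K_NB = 0.0147   # pseudo-first-order rate constant for NB reduction, s^-1
C0_NB = 250.0   # initial nitrobenzene concentration, mg/L


def remaining(c0, k, t_seconds):
    """Concentration left after t seconds under first-order decay."""
    return c0 * math.exp(-k * t_seconds)


def half_life(k):
    """Time in seconds for the concentration to halve: ln(2) / k."""
    return math.log(2.0) / k


nb_half_life = half_life(K_NB)                   # about 47 s
nb_after_5_min = remaining(C0_NB, K_NB, 300.0)   # about 3 mg/L
```

    Under these assumptions roughly 99% of the NB is gone after five minutes, which is consistent in order of magnitude with the complete removal within 10 min reported above.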

  8. Continuous preparation of nanoscale zero-valent iron using impinging stream-rotating packed bed reactor and their application in reduction of nitrobenzene

    Energy Technology Data Exchange (ETDEWEB)

    Jiao, Weizhou, E-mail: jwz0306@126.com; Qin, Yuejiao [North University of China, Shanxi Province Key Laboratory of Higee-Oriented Chemical Engineering (China)]; Luo, Shuai [Virginia Polytechnic Institute and State University, Department of Civil and Environmental Engineering (United States)]; Feng, Zhirong; Liu, Youzhi [North University of China, Shanxi Province Key Laboratory of Higee-Oriented Chemical Engineering (China)]

    2017-02-15

    Nanoscale zero-valent iron (nZVI) was continuously prepared by high-gravity reaction precipitation through a novel impinging stream-rotating packed bed (IS-RPB). Reactant solutions of FeSO4 and NaBH4 were conducted into the IS-RPB with flow rates of 60 L/h and rotating speed of 1000 r/min for the preparation of nZVI. As-prepared nZVI obtained by IS-RPB had a quasi-spherical morphology and was almost uniformly distributed with a particle size of 10–20 nm. The reactivity of nZVI was estimated by the degradation of 100 ml nitrobenzene (NB) with initial concentration of 250 mg/L. The optimum dosage of nZVI obtained by IS-RPB was 4.0 g/L as the NB could be completely removed within 10 min, which is a reduction of 20% compared with nZVI obtained by a stirred tank reactor (STR). The reduction of NB and production of aniline (AN) followed pseudo-first-order kinetics, and the pseudo-first-order rate constants were 0.0147 and 0.0034 s⁻¹, respectively. Furthermore, the as-prepared nZVI using IS-RPB reactor in this work can be used within a relatively wide pH range of 1–9.

  9. Neural Entrainment to Speech Modulates Speech Intelligibility

    NARCIS (Netherlands)

    Riecke, Lars; Formisano, Elia; Sorger, Bettina; Baskent, Deniz; Gaudrain, Etienne

    2018-01-01

    Speech is crucial for communication in everyday life. Speech-brain entrainment, the alignment of neural activity to the slow temporal fluctuations (envelope) of acoustic speech input, is a ubiquitous element of current theories of speech processing. Associations between speech-brain entrainment and

  10. Visual Input Enhances Selective Speech Envelope Tracking in Auditory Cortex at a ‘Cocktail Party’

    Science.gov (United States)

    Golumbic, Elana Zion; Cogan, Gregory B.; Schroeder, Charles E.; Poeppel, David

    2013-01-01

    Our ability to selectively attend to one auditory signal amidst competing input streams, epitomized by the ‘Cocktail Party’ problem, continues to stimulate research from various approaches. How this demanding perceptual feat is achieved from a neural systems perspective remains unclear and controversial. It is well established that neural responses to attended stimuli are enhanced compared to responses to ignored ones, but responses to ignored stimuli are nonetheless highly significant, leading to interference in performance. We investigated whether congruent visual input of an attended speaker enhances cortical selectivity in auditory cortex, leading to diminished representation of ignored stimuli. We recorded magnetoencephalographic (MEG) signals from human participants as they attended to segments of natural continuous speech. Using two complementary methods of quantifying the neural response to speech, we found that viewing a speaker’s face enhances the capacity of auditory cortex to track the temporal speech envelope of that speaker. This mechanism was most effective in a ‘Cocktail Party’ setting, promoting preferential tracking of the attended speaker, whereas without visual input no significant attentional modulation was observed. These neurophysiological results underscore the importance of visual input in resolving perceptual ambiguity in a noisy environment. Since visual cues in speech precede the associated auditory signals, they likely serve a predictive role in facilitating auditory processing of speech, perhaps by directing attentional resources to appropriate points in time when to-be-attended acoustic input is expected to arrive. PMID:23345218

  11. Streams with Strahler Stream Order

    Data.gov (United States)

    Minnesota Department of Natural Resources — Stream segments with Strahler stream order values assigned. As of 01/08/08 the linework is from the DNR24K stream coverages and will not match the updated...

  12. Contributions of speech science to the technology of man-machine voice interactions

    Science.gov (United States)

    Lea, Wayne A.

    1977-01-01

    Research in speech understanding was reviewed. Plans which include prosodics research, phonological rules for speech understanding systems, and continued interdisciplinary phonetics research are discussed. Improved acoustic phonetic analysis capabilities in speech recognizers are suggested.

  13. Persistent blood stream infection in patients supported with a continuous-flow left ventricular assist device is associated with an increased risk of cerebrovascular accidents.

    Science.gov (United States)

    Trachtenberg, Barry H; Cordero-Reyes, Andrea M; Aldeiri, Molham; Alvarez, Paulino; Bhimaraj, Arvind; Ashrith, Guha; Elias, Barbara; Suarez, Erik E; Bruckner, Brian; Loebe, Matthias; Harris, Richard L; Zhang, J Yi; Torre-Amione, Guillermo; Estep, Jerry D

    2015-02-01

    Common adverse events in patients supported with continuous-flow left ventricular assist devices (CF-LVAD) include infections and cerebrovascular accidents (CVA). Some studies have suggested a possible association between blood stream infection (BSI) and CVA. Medical records of patients who received Heartmate II (HMII) CF-LVADs in 2008-2012 at a single center were reviewed. CVA was categorized as either hemorrhagic (HCVA) or ischemic (ICVA). BSI was divided into persistent (pBSI) and nonpersistent (non-pBSI). pBSI was defined as BSI with the same organism on repeated blood culture >72 hours from initial blood culture despite antibiotics. Univariate and multivariate analyses were performed to determine predictors. A total of 149 patients had HMII implanted; 76% were male, and the overall mean age was 55.4 ± 13 years. There were a total of 19 (13%) patients who had CVA (7 HCVA and 12 ICVA) at a median of 295 days (range 5-1,096 days) after implantation. There were a total of 28 (19%) patients with pBSI and 17 (11%) patients with non-pBSI. Patients with pBSI had a trend toward greater BMI (31 kg/m² vs 27 kg/m²; P = .09), and longer duration of support (1,019 d vs 371 d; P < .001) compared with those with non-pBSI. Persistent BSI was associated with an increased risk of mortality and with all-cause CVA on multivariate analysis (odds ratio [OR] 5.97; P = .003) as well as persistent Pseudomonas aeruginosa infection (OR 4.54; P = .048). Persistent BSI is not uncommon in patients supported by CF-LVAD and is highly associated with all-cause CVA and increased all-cause mortality. Copyright © 2015 Elsevier Inc. All rights reserved.

  14. Continuous water quality monitoring to determine the cause of coral reef ecosystem degradation for coastal windward Oahu streams during 2002 (NODC Accession 0001070)

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Kaneohe and Waimanalo streams on the windward side of the island of Oahu in the Hawaiian Islands have been hardened to prevent flooding. The hardening process has...

  15. Continuous water quality monitoring to determine the cause of coral reef ecosystem degradation for coastal Windward Oahu streams during 2002 (NODC Accession 0001070)

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Kaneohe and Waimanalo streams on the windward side of the island of Oahu in the Hawaiian Islands have been hardened to prevent flooding. The hardening process has...

  16. Speech Research

    Science.gov (United States)

    Several articles addressing topics in speech research are presented. The topics include: exploring the functional significance of physiological tremor: a biospectroscopic approach; differences between experienced and inexperienced listeners to deaf speech; a language-oriented view of reading and its disabilities; phonetic factors in letter detection; categorical perception; short-term recall by deaf signers of American Sign Language; a common basis for auditory sensory storage in perception and immediate memory; phonological awareness and verbal short-term memory; initiation versus execution time during manual and oral counting by stutterers; trading relations in the perception of speech by five-year-old children; the role of the strap muscles in pitch lowering; phonetic validation of distinctive features; consonants and syllable boundaries; and vowel information in postvocalic frictions.

  17. Hate speech

    Directory of Open Access Journals (Sweden)

    Anne Birgitta Nilsen

    2014-12-01

    The manifesto of the Norwegian terrorist Anders Behring Breivik is based on the “Eurabia” conspiracy theory. This theory is a key starting point for hate speech amongst many right-wing extremists in Europe, but also has ramifications beyond these environments. In brief, proponents of the Eurabia theory claim that Muslims are occupying Europe and destroying Western culture, with the assistance of the EU and European governments. By contrast, members of Al-Qaeda and other extreme Islamists promote the conspiracy theory “the Crusade” in their hate speech directed against the West. Proponents of the latter theory argue that the West is leading a crusade to eradicate Islam and Muslims, a crusade that is similarly facilitated by their governments. This article presents analyses of texts written by right-wing extremists and Muslim extremists in an effort to shed light on how hate speech promulgates conspiracy theories in order to spread hatred and intolerance. The aim of the article is to contribute to a more thorough understanding of hate speech’s nature by applying rhetorical analysis. Rhetorical analysis is chosen because it offers a means of understanding the persuasive power of speech. It is thus a suitable tool to describe how hate speech works to convince and persuade. The concepts from rhetorical theory used in this article are ethos, logos and pathos. The concept of ethos is used to pinpoint factors that contributed to Osama bin Laden's impact, namely factors that lent credibility to his promotion of the conspiracy theory of the Crusade. In particular, Bin Laden projected common sense, good morals and good will towards his audience. He seemed to have coherent and relevant arguments; he appeared to possess moral credibility; and his use of language demonstrated that he wanted the best for his audience. The concept of pathos is used to define hate speech, since hate speech targets its audience's emotions. In hate speech it is the

  18. Speech enhancement

    CERN Document Server

    Benesty, Jacob; Chen, Jingdong

    2006-01-01

    We live in a noisy world! In all applications (telecommunications, hands-free communications, recording, human-machine interfaces, etc.) that require at least one microphone, the signal of interest is usually contaminated by noise and reverberation. As a result, the microphone signal has to be "cleaned" with digital signal processing tools before it is played out, transmitted, or stored. This book is about speech enhancement. Different well-known and state-of-the-art methods for noise reduction, with one or multiple microphones, are discussed. By speech enhancement, we mean not only noise red

  19. Speech Intelligibility

    Science.gov (United States)

    Brand, Thomas

    Speech intelligibility (SI) is important for different fields of research, engineering and diagnostics in order to quantify very different phenomena like the quality of recordings, communication and playback devices, the reverberation of auditoria, characteristics of hearing impairment, benefit using hearing aids or combinations of these things.

  20. Infants' preference for native audiovisual speech dissociated from congruency preference.

    Directory of Open Access Journals (Sweden)

    Kathleen Shaw

    Although infant speech perception is often studied in isolated modalities, infants' experience with speech is largely multimodal (i.e., speech sounds they hear are accompanied by articulating faces). Across two experiments, we tested infants' sensitivity to the relationship between the auditory and visual components of audiovisual speech in their native (English) and non-native (Spanish) language. In Experiment 1, infants' looking times were measured during a preferential looking task in which they saw two simultaneous visual speech streams articulating a story, one in English and the other in Spanish, while they heard either the English or the Spanish version of the story. In Experiment 2, looking times from another group of infants were measured as they watched single displays of congruent and incongruent combinations of English and Spanish audio and visual speech streams. Findings demonstrated an age-related increase in looking towards the native relative to non-native visual speech stream when accompanied by the corresponding (native) auditory speech. This increase in native language preference did not appear to be driven by a difference in preference for native vs. non-native audiovisual congruence as we observed no difference in looking times at the audiovisual streams in Experiment 2.

  1. 78 FR 49693 - Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...

    Science.gov (United States)

    2013-08-15

    ...-Speech Services for Individuals with Hearing and Speech Disabilities, Report and Order (Order), document...] Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services; Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and Speech Disabilities...

  2. Sixteen-Month-Old Infants' Segment Words from Infant- and Adult-Directed Speech

    Science.gov (United States)

    Mani, Nivedita; Pätzold, Wiebke

    2016-01-01

    One of the first challenges facing the young language learner is the task of segmenting words from a natural language speech stream, without prior knowledge of how these words sound. Studies with younger children find that children find it easier to segment words from fluent speech when the words are presented in infant-directed speech, i.e., the…

  3. Stream Crossings

    Data.gov (United States)

    Vermont Center for Geographic Information — Physical measurements and attributes of stream crossing structures and adjacent stream reaches which are used to provide a relative rating of aquatic organism...

  4. Akamai Streaming

    OpenAIRE

    ECT Team, Purdue

    2007-01-01

    Akamai offers world-class streaming media services that enable Internet content providers and enterprises to succeed in today's Web-centric marketplace. They deliver live event Webcasts (complete with video production, encoding, and signal acquisition services), streaming media on demand, 24/7 Webcasts and a variety of streaming application services based upon their EdgeAdvantage.

  5. Automatic speech recognition (ASR) based approach for speech therapy of aphasic patients: A review

    Science.gov (United States)

    Jamal, Norezmi; Shanta, Shahnoor; Mahmud, Farhanahani; Sha'abani, MNAH

    2017-09-01

    This paper reviews the state of the art of automatic speech recognition (ASR) based approaches for speech therapy of aphasic patients. Aphasia is a condition in which the affected person suffers from speech and language disorder resulting from a stroke or brain injury. Since there is a growing body of evidence indicating the possibility of improving the symptoms at an early stage, ASR based solutions are increasingly being researched for speech and language therapy. ASR is a technology that transfers human speech into transcript text by matching with the system's library. This is particularly useful in speech rehabilitation therapies as they provide accurate, real-time evaluation for speech input from an individual with speech disorder. ASR based approaches for speech therapy recognize the speech input from the aphasic patient and provide real-time feedback response to their mistakes. However, the accuracy of ASR is dependent on many factors such as phoneme recognition, speech continuity, speaker and environmental differences as well as our depth of knowledge on human language understanding. Hence, the review examines recent developments of ASR technologies and their performance for individuals with speech and language disorders.

  6. Prediction and constraint in audiovisual speech perception.

    Science.gov (United States)

    Peelle, Jonathan E; Sommers, Mitchell S

    2015-07-01

    During face-to-face conversational speech listeners must efficiently process a rapid and complex stream of multisensory information. Visual speech can serve as a critical complement to auditory information because it provides cues to both the timing of the incoming acoustic signal (the amplitude envelope, influencing attention and perceptual sensitivity) and its content (place and manner of articulation, constraining lexical selection). Here we review behavioral and neurophysiological evidence regarding listeners' use of visual speech information. Multisensory integration of audiovisual speech cues improves recognition accuracy, particularly for speech in noise. Even when speech is intelligible based solely on auditory information, adding visual information may reduce the cognitive demands placed on listeners through increasing the precision of prediction. Electrophysiological studies demonstrate that oscillatory cortical entrainment to speech in auditory cortex is enhanced when visual speech is present, increasing sensitivity to important acoustic cues. Neuroimaging studies also suggest increased activity in auditory cortex when congruent visual information is available, but additionally emphasize the involvement of heteromodal regions of posterior superior temporal sulcus as playing a role in integrative processing. We interpret these findings in a framework of temporally-focused lexical competition in which visual speech information affects auditory processing to increase sensitivity to acoustic information through an early integration mechanism, and a late integration stage that incorporates specific information about a speaker's articulators to constrain the number of possible candidates in a spoken utterance. Ultimately it is words compatible with both auditory and visual information that most strongly determine successful speech perception during everyday listening. Thus, audiovisual speech perception is accomplished through multiple stages of integration

  7. Prediction and constraint in audiovisual speech perception

    Science.gov (United States)

    Peelle, Jonathan E.; Sommers, Mitchell S.

    2015-01-01

    During face-to-face conversational speech listeners must efficiently process a rapid and complex stream of multisensory information. Visual speech can serve as a critical complement to auditory information because it provides cues to both the timing of the incoming acoustic signal (the amplitude envelope, influencing attention and perceptual sensitivity) and its content (place and manner of articulation, constraining lexical selection). Here we review behavioral and neurophysiological evidence regarding listeners' use of visual speech information. Multisensory integration of audiovisual speech cues improves recognition accuracy, particularly for speech in noise. Even when speech is intelligible based solely on auditory information, adding visual information may reduce the cognitive demands placed on listeners through increasing precision of prediction. Electrophysiological studies demonstrate oscillatory cortical entrainment to speech in auditory cortex is enhanced when visual speech is present, increasing sensitivity to important acoustic cues. Neuroimaging studies also suggest increased activity in auditory cortex when congruent visual information is available, but additionally emphasize the involvement of heteromodal regions of posterior superior temporal sulcus as playing a role in integrative processing. We interpret these findings in a framework of temporally-focused lexical competition in which visual speech information affects auditory processing to increase sensitivity to auditory information through an early integration mechanism, and a late integration stage that incorporates specific information about a speaker's articulators to constrain the number of possible candidates in a spoken utterance. Ultimately it is words compatible with both auditory and visual information that most strongly determine successful speech perception during everyday listening. Thus, audiovisual speech perception is accomplished through multiple stages of integration, supported

  8. Speech and audio processing for coding, enhancement and recognition

    CERN Document Server

    Togneri, Roberto; Narasimha, Madihally

    2015-01-01

    This book describes the basic principles underlying the generation, coding, transmission and enhancement of speech and audio signals, including advanced statistical and machine learning techniques for speech and speaker recognition with an overview of the key innovations in these areas. Key research undertaken in speech coding, speech enhancement, speech recognition, emotion recognition and speaker diarization is also presented, along with recent advances and new paradigms in these areas. · Offers readers a single-source reference on the significant applications of speech and audio processing to speech coding, speech enhancement and speech/speaker recognition. Enables readers involved in algorithm development and implementation issues for speech coding to understand the historical development and future challenges in speech coding research; · Discusses speech coding methods yielding bit-streams that are multi-rate and scalable for Voice-over-IP (VoIP) Networks; · ...

  9. Hidden neural networks: application to speech recognition

    DEFF Research Database (Denmark)

    Riis, Søren Kamaric

    1998-01-01

    We evaluate the hidden neural network HMM/NN hybrid on two speech recognition benchmark tasks; (1) task independent isolated word recognition on the Phonebook database, and (2) recognition of broad phoneme classes in continuous speech from the TIMIT database. It is shown how hidden neural networks...

  10. Speech disorders - children

    Science.gov (United States)

    ... disorder; Voice disorders; Vocal disorders; Disfluency; Communication disorder - speech disorder; Speech disorder - stuttering ... evaluation tools that can help identify and diagnose speech disorders: Denver Articulation Screening Examination Goldman-Fristoe Test of ...

  11. Perception of Intersensory Synchrony in Audiovisual Speech: Not that Special

    Science.gov (United States)

    Vroomen, Jean; Stekelenburg, Jeroen J.

    2011-01-01

    Perception of intersensory temporal order is particularly difficult for (continuous) audiovisual speech, as perceivers may find it difficult to notice substantial timing differences between speech sounds and lip movements. Here we tested whether this occurs because audiovisual speech is strongly paired ("unity assumption"). Participants made…

  12. Speech Processing.

    Science.gov (United States)

    1983-05-01

    The VDE system developed had the capability of recognizing up to 248 separate words in syntactic structures. 4 The two systems described are isolated...AND SPEAKER RECOGNITION by M.J.Hunt 5 ASSESSMENT OF SPEECH SYSTEMS by R.K.Moore 6 A SURVEY OF CURRENT EQUIPMENT AND RESEARCH by J.S.Bridle...TECHNOLOGY IN NAVY TRAINING SYSTEMS by R.Breaux, M.Blind and R.Lynchard 10 GENERAL REVIEW OF MILITARY APPLICATIONS OF VOICE PROCESSING DR. BRUNO

  13. Speech Recognition

    Directory of Open Access Journals (Sweden)

    Adrian Morariu

    2009-01-01

    This paper presents a method of speech recognition by pattern recognition techniques. Learning consists in determining the unique characteristics of a word (cepstral coefficients) by eliminating those characteristics that are different from one word to another. For learning and recognition, the system will build a dictionary of words by determining the characteristics of each word to be used in the recognition. Determining the characteristics of an audio signal consists of the following steps: noise removal, sampling, applying a Hamming window, switching to the frequency domain through a Fourier transform, calculating the magnitude spectrum, filtering the data, and determining the cepstral coefficients.
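
    The core of the feature pipeline listed in this abstract (Hamming window, Fourier transform, magnitude spectrum, cepstral coefficients) can be sketched as below. The abstract does not say how the cepstral coefficients are derived from the magnitude spectrum, so the sketch assumes the textbook real cepstrum (inverse transform of the log magnitude); the noise-removal and filtering steps are omitted, a naive DFT is used to keep the example dependency-free, and all names and the test signal are hypothetical.

```python
import cmath
import math


def hamming(n):
    """Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform; the inverse divides by n."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def real_cepstrum(frame, floor=1e-12):
    """Window the frame, take log|DFT|, then transform back (real cepstrum)."""
    windowed = [s * w for s, w in zip(frame, hamming(len(frame)))]
    magnitude = [abs(c) for c in dft(windowed)]
    log_magnitude = [math.log(max(m, floor)) for m in magnitude]  # floor avoids log(0)
    return [c.real for c in dft(log_magnitude, inverse=True)]


# Hypothetical 64-sample frame: two sinusoids standing in for a speech frame.
frame = [math.sin(2.0 * math.pi * 5 * t / 64) + 0.5 * math.sin(2.0 * math.pi * 12 * t / 64)
         for t in range(64)]
coeffs = real_cepstrum(frame)[:13]  # keep the first 13 coefficients as features
```

    A dictionary of words, as described above, would then store one such coefficient vector (or a sequence of them, one per frame) for each word.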

  14. An analysis of machine translation and speech synthesis in speech-to-speech translation system

    OpenAIRE

    Hashimoto, K.; Yamagishi, J.; Byrne, W.; King, S.; Tokuda, K.

    2011-01-01

    This paper provides an analysis of the impacts of machine translation and speech synthesis on speech-to-speech translation systems. The speech-to-speech translation system consists of three components: speech recognition, machine translation and speech synthesis. Many techniques for integration of speech recognition and machine translation have been proposed. However, speech synthesis has not yet been considered. Therefore, in this paper, we focus on machine translation and speech synthesis, ...

  15. Speech Communication and Liberal Education.

    Science.gov (United States)

    Bradley, Bert E.

    1979-01-01

    Argues for the continuation of liberal education over career-oriented programs. Defines liberal education as one that develops abilities that transcend occupational concerns, and that enables individuals to cope with shifts in values, vocations, careers, and the environment. Argues that speech communication makes a significant contribution to…

  16. Speech recognition implementation in radiology

    International Nuclear Information System (INIS)

    White, Keith S.

    2005-01-01

    Continuous speech recognition (SR) is an emerging technology that allows direct digital transcription of dictated radiology reports. The SR systems are being widely deployed in the radiology community. This is a review of technical and practical issues that should be considered when implementing an SR system. (orig.)

  17. State of the Art. Work Stream 3 - the Danish Report: Militante fra den anden side. Anti-bodies og hate-speech og adfærd i Danmark.

    OpenAIRE

    Siim, Birte; Larsen, Jeppe Fuglsang; Meret, Susi

    2014-01-01

    The purpose of the State of the Art (SOA) is to gain knowledge about the Danish context on organisations, groups and movements in civil society countering hate speech, institutional racism and exclusionary practices and to identify gaps in national research on the issue that can be explored through field work, interviews and group discussions/dialogues, possibly to be debated at roundtable convening in the autumn of 2014. The SOA gives an overview of the role of the state, institutions, the po...

  18. Speech and Language Delay

    Science.gov (United States)

    ... What is a speech and language delay? A speech and language delay ...

  19. Stream systems.

    Science.gov (United States)

    Jack E. Williams; Gordon H. Reeves

    2006-01-01

    Restored, high-quality streams provide innumerable benefits to society. In the Pacific Northwest, high-quality stream habitat often is associated with an abundance of salmonid fishes such as chinook salmon (Oncorhynchus tshawytscha), coho salmon (O. kisutch), and steelhead (O. mykiss). Many other native...

  20. Voice-associated static face image releases speech from informational masking.

    Science.gov (United States)

    Gao, Yayue; Cao, Shuyang; Qu, Tianshu; Wu, Xihong; Li, Haifeng; Zhang, Jinsheng; Li, Liang

    2014-06-01

    In noisy, multipeople talking environments such as a cocktail party, listeners can use various perceptual and/or cognitive cues to improve recognition of target speech against masking, particularly informational masking. Previous studies have shown that temporally prepresented voice cues (voice primes) improve recognition of target speech against speech masking but not noise masking. This study investigated whether static face image primes that have become target-voice associated (i.e., facial images linked through associative learning with voices reciting the target speech) can be used by listeners to unmask speech. The results showed that in 32 normal-hearing younger adults, temporally prepresenting a voice-priming sentence with the same voice reciting the target sentence significantly improved the recognition of target speech that was masked by irrelevant two-talker speech. When a person's face photograph image became associated with the voice reciting the target speech by learning, temporally prepresenting the target-voice-associated face image significantly improved recognition of target speech against speech masking, particularly for the last two keywords in the target sentence. Moreover, speech-recognition performance under the voice-priming condition was significantly correlated to that under the face-priming condition. The results suggest that learned facial information on talker identity plays an important role in identifying the target-talker's voice and facilitating selective attention to the target-speech stream against the masking-speech stream. © 2014 The Institute of Psychology, Chinese Academy of Sciences and Wiley Publishing Asia Pty Ltd.

  1. Continuous plutonium(IV) oxalate precipitation, filtration, and calcination process. [From product streams from Redox, Purex, or Recuplex solvent extraction plants]

    Energy Technology Data Exchange (ETDEWEB)

    Beede, R L

    1956-09-27

    A continuous plutonium (IV) oxalate precipitation, filtration, and calcination process has been developed. Continuous and batch decomposition of the oxalate in the filtrates has been demonstrated. The processes have been demonstrated in prototype equipment. Plutonium (IV) oxalate was precipitated continuously at room temperature by the concurrent addition of plutonium (IV) nitrate feed and oxalic acid into the pan of a modified rotary drum filter. The plutonium (IV) oxalate was calcined to plutonium dioxide, which could be readily hydrofluorinated. Continuous decomposition of the oxalate in synthetic plutonium (IV) oxalate filtrates containing plutonium (IV) oxalate solids was demonstrated using co-current flow in a U-shaped reactor. Feeds containing from 10 to 100 g/L Pu, as plutonium (IV) nitrate, and 1.0 to 6.5 M HNO3, respectively, can be processed. One molar oxalic acid is used as the precipitant. Temperatures of 20 to 35 °C for the precipitation and filtration are satisfactory. Plutonium (IV) oxalate can be calcined at 300 to 400 °C in a screw-type drier-calciner to plutonium dioxide and hydrofluorinated at 450 to 550 °C. Plutonium dioxide exceeding purity requirements has been produced in the prototype equipment. Advantages of continuous precipitation and filtration are: uniform plutonium (IV) oxalate, improved filtration characteristics, elimination of heating and cooling facilities, and higher capacities through a single unit. Advantages of the screw-type drier-calciner are the continuous production of an oxide satisfactory for feed for the proposed plant vibrating tube hydrofluorinator, and ease of coupling continuous precipitation and filtration to this proposed hydrofluorinator. Continuous decomposition of oxalate in filtrates offers advantages in decreasing filtrate storage requirements when coupled to a filtrate concentrator. (JGB)

  2. An ALE meta-analysis on the audiovisual integration of speech signals.

    Science.gov (United States)

    Erickson, Laura C; Heeg, Elizabeth; Rauschecker, Josef P; Turkeltaub, Peter E

    2014-11-01

    The brain improves speech processing through the integration of audiovisual (AV) signals. Situations involving AV speech integration may be crudely dichotomized into those where auditory and visual inputs contain (1) equivalent, complementary signals (validating AV speech) or (2) inconsistent, different signals (conflicting AV speech). This simple framework may allow the systematic examination of broad commonalities and differences between AV neural processes engaged by various experimental paradigms frequently used to study AV speech integration. We conducted an activation likelihood estimation metaanalysis of 22 functional imaging studies comprising 33 experiments, 311 subjects, and 347 foci examining "conflicting" versus "validating" AV speech. Experimental paradigms included content congruency, timing synchrony, and perceptual measures, such as the McGurk effect or synchrony judgments, across AV speech stimulus types (sublexical to sentence). Colocalization of conflicting AV speech experiments revealed consistency across at least two contrast types (e.g., synchrony and congruency) in a network of dorsal stream regions in the frontal, parietal, and temporal lobes. There was consistency across all contrast types (synchrony, congruency, and percept) in the bilateral posterior superior/middle temporal cortex. Although fewer studies were available, validating AV speech experiments were localized to other regions, such as ventral stream visual areas in the occipital and inferior temporal cortex. These results suggest that while equivalent, complementary AV speech signals may evoke activity in regions related to the corroboration of sensory input, conflicting AV speech signals recruit widespread dorsal stream areas likely involved in the resolution of conflicting sensory signals. Copyright © 2014 Wiley Periodicals, Inc.

  3. Stream Evaluation

    Data.gov (United States)

    Kansas Data Access and Support Center — Digital representation of the map accompanying the "Kansas stream and river fishery resource evaluation" (R.E. Moss and K. Brunson, 1981. U.S. Fish and Wildlife...

  4. The speech signal segmentation algorithm using pitch synchronous analysis

    Directory of Open Access Journals (Sweden)

    Amirgaliyev Yedilkhan

    2017-03-01

    Parameterization of the speech signal using analysis algorithms synchronized with the pitch frequency is discussed. Speech parameterization is performed using the average zero-crossing count function and the signal energy function. Parameterization results are used to segment the speech signal and to isolate the segments with stable spectral characteristics. Segmentation results can be used to generate a digital voice pattern of a person or be applied in automatic speech recognition. The stages needed for continuous speech segmentation are described.
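
    The two parameterization functions named in this abstract (zero-crossing count and signal energy) can be illustrated with a minimal sketch. It assumes fixed-length frames rather than the pitch-synchronous frames the article describes, and the function names, the frame length, and the energy threshold are all hypothetical choices made for illustration only.

```python
def frame_features(samples, frame_len):
    """Return (mean energy, zero-crossing count) for each full frame.

    Assumption: fixed-length, non-overlapping frames; the article itself
    uses frames synchronized with the pitch period.
    """
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        # A crossing is counted whenever adjacent samples differ in sign.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        feats.append((energy, crossings))
    return feats


def segment(samples, frame_len, energy_threshold):
    """Return (first_frame, end_frame_exclusive) runs whose energy exceeds the threshold."""
    feats = frame_features(samples, frame_len)
    segments = []
    run_start = None
    for i, (energy, _) in enumerate(feats):
        active = energy > energy_threshold
        if active and run_start is None:
            run_start = i
        elif not active and run_start is not None:
            segments.append((run_start, i))
            run_start = None
    if run_start is not None:
        segments.append((run_start, len(feats)))
    return segments
```

    In this sketch only the energy drives the segment boundaries; the zero-crossing count is returned so that a caller could, for example, separate noisy unvoiced frames from voiced ones.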

  5. THE ONTOGENESIS OF SPEECH DEVELOPMENT

    Directory of Open Access Journals (Sweden)

    T. E. Braudo

    2017-01-01

    The purpose of this article is to acquaint specialists working with children who have developmental disorders with age-related norms for speech development. Many well-known linguists and psychologists have studied speech ontogenesis (logogenesis). Speech is a higher mental function that integrates many functional systems. Speech development in infants during the first months after birth is ensured by innate hearing and the emerging ability to fix the gaze on the face of an adult. Innate emotional reactions also develop during this period, turning into nonverbal forms of communication. At about 6 months a baby starts to pronounce some syllables; at 7–9 months the baby repeats various sound combinations pronounced by adults. At 10–11 months a baby begins to react to words addressed to him/her. The first words usually appear at the age of 1 year; this is the start of the stage of active speech development. At this time it is acceptable if a child confuses or rearranges sounds, distorts them, or omits them. By the age of 1.5 years a child begins to understand abstract explanations from adults. Significant vocabulary growth occurs between 2 and 3 years; the grammatical structures of the language are formed during this period (a child starts to use phrases and sentences). Preschool age (3–7 y. o.) is characterized by incorrect but steadily improving pronunciation of sounds and phonemic perception. The vocabulary increases; abstract speech and retelling are formed. Children over 7 y. o. continue to improve their grammar, writing, and reading skills. The described stages may not have strict age boundaries, since they depend not only on the environment but also on the child’s mental constitution, heredity, and character.

  6. Speech and Communication Disorders

    Science.gov (United States)

    ... to being completely unable to speak or understand speech. Causes include Hearing disorders and deafness Voice problems, ... or those caused by cleft lip or palate Speech problems like stuttering Developmental disabilities Learning disorders Autism ...

  7. Continuous speech recognition with sparse coding

    CSIR Research Space (South Africa)

    Smit, WJ

    2009-04-01

    ... generative model. The spike train is classified by making use of a spike train model and dynamic programming. It is computationally expensive to find a sparse code. We use an iterative subset selection algorithm with quadratic programming for this process...

  8. Audio stream classification for multimedia database search

    Science.gov (United States)

    Artese, M.; Bianco, S.; Gagliardi, I.; Gasparini, F.

    2013-03-01

    Search and retrieval of huge archives of multimedia data is a challenging task. A classification step is often used to reduce the number of entries on which to perform the subsequent search. In particular, when new entries are continuously added to the database, a fast classification based on simple threshold evaluation is desirable. In this work we present a CART-based (Classification And Regression Tree [1]) classification framework for audio streams belonging to multimedia databases. The database considered is the Archive of Ethnography and Social History (AESS) [2], which is mainly composed of popular songs and other audio records describing popular traditions handed down from generation to generation, such as traditional fairs and customs. The peculiarities of this database are that it is continuously updated, that the audio recordings are acquired in unconstrained environments, and that it is difficult for a non-expert human user to create the ground truth labels. In our experiments, half of all the available audio files were randomly extracted and used as the training set. The remaining ones were used as the test set. The classifier was trained to distinguish among three different classes: speech, music, and song. All the audio files in the dataset had previously been manually labeled by domain experts into the three classes defined above.

  9. Statistical learning of speech, not music, in congenital amusia.

    Science.gov (United States)

    Peretz, Isabelle; Saffran, Jenny; Schön, Daniele; Gosselin, Nathalie

    2012-04-01

    The acquisition of both speech and music uses general principles: learners extract statistical regularities present in the environment. Yet, individuals who suffer from congenital amusia (commonly called tone-deafness) have experienced lifelong difficulties in acquiring basic musical skills, while their language abilities appear essentially intact. One possible account for this dissociation between music and speech is that amusics lack normal experience with music. If given appropriate exposure, amusics might be able to acquire basic musical abilities. To test this possibility, a group of 11 adults with congenital amusia, and their matched controls, were exposed to a continuous stream of syllables or tones for 21 minutes. Their task was to try to identify three-syllable nonsense words or three-tone motifs having an identical statistical structure. The results of five experiments show that amusics can learn novel words as easily as controls, whereas they systematically fail on musical materials. Thus, inappropriate musical exposure cannot fully account for the musical disorder. Implications of the results for the domain specificity of statistical learning are discussed. © 2012 New York Academy of Sciences.

  10. Free Speech Yearbook 1978.

    Science.gov (United States)

    Phifer, Gregg, Ed.

    The 17 articles in this collection deal with theoretical and practical freedom of speech issues. The topics include: freedom of speech in Marquette Park, Illinois; Nazis in Skokie, Illinois; freedom of expression in the Confederate States of America; Robert M. LaFollette's arguments for free speech and the rights of Congress; the United States…

  11. Providing Continuous Assurance

    NARCIS (Netherlands)

    Kocken, Jonne; Hulstijn, Joris

    2017-01-01

    It has been claimed that continuous assurance can be attained by combining continuous monitoring by management, with continuous auditing of data streams and the effectiveness of internal controls by an external auditor. However, we find that in existing literature the final step to continuous

  12. Speech in spinocerebellar ataxia.

    Science.gov (United States)

    Schalling, Ellika; Hartelius, Lena

    2013-12-01

    Spinocerebellar ataxias (SCAs) are a heterogeneous group of autosomal dominant cerebellar ataxias clinically characterized by progressive ataxia, dysarthria and a range of other concomitant neurological symptoms. Only a few studies include detailed characterization of speech symptoms in SCA. Speech symptoms in SCA resemble ataxic dysarthria but symptoms related to phonation may be more prominent. One study to date has shown an association between differences in speech and voice symptoms related to genotype. More studies of speech and voice phenotypes are motivated, to possibly aid in clinical diagnosis. In addition, instrumental speech analysis has been demonstrated to be a reliable measure that may be used to monitor disease progression or therapy outcomes in possible future pharmacological treatments. Intervention by speech and language pathologists should go beyond assessment. Clinical guidelines for management of speech, communication and swallowing need to be developed for individuals with progressive cerebellar ataxia. Copyright © 2013 Elsevier Inc. All rights reserved.

  13. Digital speech processing using Matlab

    CERN Document Server

    Gopi, E S

    2014-01-01

    Digital Speech Processing Using Matlab deals with digital speech pattern recognition, speech production model, speech feature extraction, and speech compression. The book is written in a manner that is suitable for beginners pursuing basic research in digital speech processing. Matlab illustrations are provided for most topics to enable better understanding of concepts. This book also deals with the basic pattern recognition techniques (illustrated with speech signals using Matlab) such as PCA, LDA, ICA, SVM, HMM, GMM, BPN, and KSOM.

  14. Cortical Representations of Speech in a Multitalker Auditory Scene.

    Science.gov (United States)

    Puvvada, Krishna C; Simon, Jonathan Z

    2017-09-20

    The ability to parse a complex auditory scene into perceptual objects is facilitated by a hierarchical auditory system. Successive stages in the hierarchy transform an auditory scene of multiple overlapping sources, from peripheral tonotopically based representations in the auditory nerve, into perceptually distinct auditory-object-based representations in the auditory cortex. Here, using magnetoencephalography recordings from men and women, we investigate how a complex acoustic scene consisting of multiple speech sources is represented in distinct hierarchical stages of the auditory cortex. Using systems-theoretic methods of stimulus reconstruction, we show that the primary-like areas in the auditory cortex contain dominantly spectrotemporal-based representations of the entire auditory scene. Here, both attended and ignored speech streams are represented with almost equal fidelity, and a global representation of the full auditory scene with all its streams is a better candidate neural representation than that of individual streams being represented separately. We also show that higher-order auditory cortical areas, by contrast, represent the attended stream separately and with significantly higher fidelity than unattended streams. Furthermore, the unattended background streams are more faithfully represented as a single unsegregated background object rather than as separated objects. Together, these findings demonstrate the progression of the representations and processing of a complex acoustic scene up through the hierarchy of the human auditory cortex. SIGNIFICANCE STATEMENT Using magnetoencephalography recordings from human listeners in a simulated cocktail party environment, we investigate how a complex acoustic scene consisting of multiple speech sources is represented in separate hierarchical stages of the auditory cortex. We show that the primary-like areas in the auditory cortex use a dominantly spectrotemporal-based representation of the entire auditory

  15. Improvement of intelligibility of ideal binary-masked noisy speech by adding background noise.

    Science.gov (United States)

    Cao, Shuyang; Li, Liang; Wu, Xihong

    2011-04-01

    When a target-speech/masker mixture is processed with the signal-separation technique, ideal binary mask (IBM), intelligibility of target speech is remarkably improved in both normal-hearing listeners and hearing-impaired listeners. Intelligibility of speech can also be improved by filling in speech gaps with un-modulated broadband noise. This study investigated whether intelligibility of target speech in the IBM-treated target-speech/masker mixture can be further improved by adding a broadband-noise background. The results of this study show that following the IBM manipulation, which remarkably released target speech from speech-spectrum noise, foreign-speech, or native-speech masking (experiment 1), adding a broadband-noise background with the signal-to-noise ratio no less than 4 dB significantly improved intelligibility of target speech when the masker was either noise (experiment 2) or speech (experiment 3). The results suggest that since adding the noise background shallows the areas of silence in the time-frequency domain of the IBM-treated target-speech/masker mixture, the abruption of transient changes in the mixture is smoothed and the perceived continuity of target-speech components becomes enhanced, leading to improved target-speech intelligibility. The findings are useful for advancing computational auditory scene analysis, hearing-aid/cochlear-implant designs, and understanding of speech perception under "cocktail-party" conditions.
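
    For readers unfamiliar with the ideal binary mask, the following minimal sketch shows the standard idea: a time-frequency unit of the mixture is kept only when the target's energy exceeds the masker's by more than a local criterion. The function names, the list-of-lists energy layout, and the 0 dB default criterion are assumptions made for illustration; they are not taken from the study, and the noise-background step that the study investigates is not modelled here.

```python
import math


def ideal_binary_mask(target_energy, masker_energy, lc_db=0.0):
    """Build a 0/1 mask over time-frequency units.

    target_energy and masker_energy are equally shaped lists of rows of
    non-negative energies (hypothetical layout: one row per time frame).
    A unit is kept when its local SNR in dB exceeds lc_db.
    """
    mask = []
    for t_row, m_row in zip(target_energy, masker_energy):
        row = []
        for t, m in zip(t_row, m_row):
            if t <= 0:
                keep = False          # no target energy: always discard
            elif m <= 0:
                keep = True           # target only: always keep
            else:
                keep = 10.0 * math.log10(t / m) > lc_db
            row.append(1 if keep else 0)
        mask.append(row)
    return mask


def apply_mask(mixture, mask):
    """Zero out the mixture units that the mask discards."""
    return [
        [x * k for x, k in zip(x_row, k_row)]
        for x_row, k_row in zip(mixture, mask)
    ]
```

    The mask is "ideal" because it needs the target and masker separately before mixing, which is only possible in a laboratory setting; the silent gaps it leaves in the time-frequency plane are the regions the study fills with broadband noise.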

  16. Audio-visual speech timing sensitivity is enhanced in cluttered conditions.

    Directory of Open Access Journals (Sweden)

    Warrick Roseboom

    2011-04-01

    Events encoded in separate sensory modalities, such as audition and vision, can seem to be synchronous across a relatively broad range of physical timing differences. This may suggest that the precision of audio-visual timing judgments is inherently poor. Here we show that this is not necessarily true. We contrast timing sensitivity for isolated streams of audio and visual speech, and for streams of audio and visual speech accompanied by additional, temporally offset, visual speech streams. We find that the precision with which synchronous streams of audio and visual speech are identified is enhanced by the presence of additional streams of asynchronous visual speech. Our data suggest that timing perception is shaped by selective grouping processes, which can result in enhanced precision in temporally cluttered environments. The imprecision suggested by previous studies might therefore be a consequence of examining isolated pairs of audio and visual events. We argue that when an isolated pair of cross-modal events is presented, they tend to group perceptually and to seem synchronous as a consequence. We have revealed greater precision by providing multiple visual signals, possibly allowing a single auditory speech stream to group selectively with the most synchronous visual candidate. The grouping processes we have identified might be important in daily life, such as when we attempt to follow a conversation in a crowded room.

  17. The functional anatomy of speech perception: Dorsal and ventral processing pathways

    Science.gov (United States)

    Hickok, Gregory

    2003-04-01

    Drawing on recent developments in the cortical organization of vision, and on data from a variety of sources, Hickok and Poeppel (2000) have proposed a new model of the functional anatomy of speech perception. The model posits that early cortical stages of speech perception involve auditory fields in the superior temporal gyrus bilaterally (although asymmetrically). This cortical processing system then diverges into two broad processing streams, a ventral stream, involved in mapping sound onto meaning, and a dorsal stream, involved in mapping sound onto articulatory-based representations. The ventral stream projects ventrolaterally toward inferior posterior temporal cortex which serves as an interface between sound and meaning. The dorsal stream projects dorsoposteriorly toward the parietal lobe and ultimately to frontal regions. This network provides a mechanism for the development and maintenance of "parity" between auditory and motor representations of speech. Although the dorsal stream represents a tight connection between speech perception and speech production, it is not a critical component of the speech perception process under ecologically natural listening conditions. Some degree of bi-directionality in both the dorsal and ventral pathways is also proposed. A variety of recent empirical tests of this model have provided further support for the proposal.

  18. Online collaboration environments in telemedicine applications of speech therapy.

    Science.gov (United States)

    Pierrakeas, C; Georgopoulos, V; Malandraki, G

    2005-01-01

    The use of telemedicine in speech and language pathology provides patients in rural and remote areas with access to quality rehabilitation services that are sufficient, accessible, and user-friendly leading to new possibilities in comprehensive and long-term, cost-effective diagnosis and therapy. This paper discusses the use of online collaboration environments for various telemedicine applications of speech therapy which include online group speech therapy scenarios, multidisciplinary clinical consulting team, and online mentoring and continuing education.

  19. Gesture and Speech in Interaction - 4th edition (GESPIN 4)

    OpenAIRE

    Ferré, Gaëlle; Mark, Tutton

    2015-01-01

    The fourth edition of Gesture and Speech in Interaction (GESPIN) was held in Nantes, France. With more than 40 papers, these proceedings show just what a flourishing field of enquiry gesture studies continues to be. The keynote speeches of the conference addressed three different aspects of multimodal interaction: gesture and grammar, gesture acquisition, and gesture and social interaction. In a talk entitled Qualities of event construal in speech and gesture: Aspect and...

  20. Recognizing speech in a novel accent: the motor theory of speech perception reframed.

    Science.gov (United States)

    Moulin-Frier, Clément; Arbib, Michael A

    2013-08-01

    The motor theory of speech perception holds that we perceive the speech of another in terms of a motor representation of that speech. However, when we have learned to recognize a foreign accent, it seems plausible that recognition of a word rarely involves reconstruction of the speech gestures of the speaker rather than the listener. To better assess the motor theory and this observation, we proceed in three stages. Part 1 places the motor theory of speech perception in a larger framework based on our earlier models of the adaptive formation of mirror neurons for grasping, and for viewing extensions of that mirror system as part of a larger system for neuro-linguistic processing, augmented by the present consideration of recognizing speech in a novel accent. Part 2 then offers a novel computational model of how a listener comes to understand the speech of someone speaking the listener's native language with a foreign accent. The core tenet of the model is that the listener uses hypotheses about the word the speaker is currently uttering to update probabilities linking the sound produced by the speaker to phonemes in the native language repertoire of the listener. This, on average, improves the recognition of later words. This model is neutral regarding the nature of the representations it uses (motor vs. auditory). It serves as a reference point for the discussion in Part 3, which proposes a dual-stream neuro-linguistic architecture to revisit claims for and against the motor theory of speech perception and the relevance of mirror neurons, and extracts some implications for the reframing of the motor theory.
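
    The core tenet described in Part 2 (using the hypothesized word to update probabilities that link heard sounds to native phonemes) can be illustrated with a toy count-based sketch. This is not the authors' model: the class name, the one-to-one alignment of sounds with phonemes, and the additive smoothing prior are all assumptions introduced here for illustration.

```python
from collections import defaultdict


class AccentAdapter:
    """Toy table of P(native phoneme | heard sound), updated from word hypotheses."""

    def __init__(self, prior=1.0):
        # prior is a hypothetical smoothing pseudo-count per phoneme.
        self.prior = prior
        self.counts = defaultdict(lambda: defaultdict(float))

    def update(self, heard_sounds, hypothesized_phonemes, weight=1.0):
        """Credit each heard sound to the phoneme the word hypothesis puts there.

        Assumes the two sequences are already aligned one-to-one; weight can
        carry the listener's confidence in the word hypothesis.
        """
        for sound, phoneme in zip(heard_sounds, hypothesized_phonemes):
            self.counts[sound][phoneme] += weight

    def prob(self, sound, phoneme, inventory):
        """Smoothed probability that a heard sound realizes a given phoneme."""
        row = self.counts[sound]
        total = sum(row.get(p, 0.0) for p in inventory) + self.prior * len(inventory)
        return (row.get(phoneme, 0.0) + self.prior) / total
```

    After a listener has matched a few accented words to their native forms, later occurrences of the same accented sound are mapped to the intended phoneme with higher probability, which is the mechanism the abstract credits with improving recognition of later words.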

  1. Knowledge discovery from data streams

    CERN Document Server

    Gama, Joao

    2010-01-01

    Since the beginning of the Internet age and the increased use of ubiquitous computing devices, the large volume and continuous flow of distributed data have imposed new constraints on the design of learning algorithms. Exploring how to extract knowledge structures from evolving and time-changing data, Knowledge Discovery from Data Streams presents a coherent overview of state-of-the-art research in learning from data streams. The book covers the fundamentals that are imperative to understanding data streams and describes important applications, such as TCP/IP traffic, GPS data, sensor networks,

  2. Speech Alarms Pilot Study

    Science.gov (United States)

    Sandor, Aniko; Moses, Haifa

    2016-01-01

    Speech alarms have been used extensively in aviation and included in International Building Codes (IBC) and National Fire Protection Association's (NFPA) Life Safety Code. However, they have not been implemented on space vehicles. Previous studies conducted at NASA JSC showed that speech alarms lead to faster identification and higher accuracy. This research evaluated updated speech and tone alerts in a laboratory environment and in the Human Exploration Research Analog (HERA) in a realistic setup.

  3. Objective assessment of stream segregation abilities of CI users as a function of electrode separation

    DEFF Research Database (Denmark)

    Paredes Gallardo, Andreu; Madsen, Sara Miay Kim; Dau, Torsten

    Auditory streaming is a perceptual process by which the human auditory system organizes sounds from different sources into perceptually meaningful elements. Segregation of sound sources is important, among others, for understanding speech in noisy environments, which is especially challenging...

  4. Ear, Hearing and Speech

    DEFF Research Database (Denmark)

    Poulsen, Torben

    2000-01-01

    An introduction is given to the anatomy and the function of the ear, basic psychoacoustic matters (hearing threshold, loudness, masking), the speech signal and speech intelligibility. The lecture note is written for the course: Fundamentals of Acoustics and Noise Control (51001)...

  5. Principles of speech coding

    CERN Document Server

    Ogunfunmi, Tokunbo

    2010-01-01

    It is becoming increasingly apparent that all forms of communication, including voice, will be transmitted through packet-switched networks based on the Internet Protocol (IP). Therefore, the design of modern devices that rely on speech interfaces, such as cell phones and PDAs, requires a complete and up-to-date understanding of the basics of speech coding. Outlines key signal processing algorithms used to mitigate impairments to speech quality in VoIP networks. Offering a detailed yet easily accessible introduction to the field, Principles of Speech Coding provides an in-depth examination of the

  6. Speech disorder prevention

    Directory of Open Access Journals (Sweden)

    Miladis Fornaris-Méndez

    2017-04-01

    Language therapy has shifted from a medical focus to a preventive focus. However, difficulties are evident in carrying out this preventive task, because more attention is devoted to the correction of language disorders. Because speech disorders are the most frequently occurring dysfunction, preventive work to avoid their appearance acquires special importance. Speech education from early childhood makes it easier to prevent the appearance of speech disorders in children. The objective of the present work is to offer different activities for the prevention of speech disorders.

  7. Pure apraxia of speech due to infarct in premotor cortex.

    Science.gov (United States)

    Patira, Riddhi; Ciniglia, Lauren; Calvert, Timothy; Altschuler, Eric L

    Apraxia of speech (AOS) is now recognized as an articulation disorder distinct from dysarthria and aphasia. Various lesions have been associated with AOS in studies that are limited in precise localization due to variability in size and type of pathology. We present a case of pure AOS in the setting of an acute stroke to localize more precisely than ever before the brain area responsible for AOS, the dorsal premotor cortex (dPMC). The dPMC is in a unique position to plan and coordinate speech production by virtue of its connection with the nearby motor cortex harboring the corticobulbar tract, the supplementary motor area, the inferior frontal operculum, and the temporo-parietal area via the dorsal stream of the dual-stream model of speech processing. The role of the dPMC is further supported as part of the dorsal stream in the dual-stream model of speech processing as well as a controller in the hierarchical state feedback control model. Copyright © 2017 Polish Neurological Society. Published by Elsevier Urban & Partner Sp. z o.o. All rights reserved.

  8. Auditory and Cognitive Factors Underlying Individual Differences in Aided Speech-Understanding among Older Adults

    Directory of Open Access Journals (Sweden)

    Larry E. Humes

    2013-10-01

    This study was designed to address individual differences in aided speech understanding among a relatively large group of older adults. The group of older adults consisted of 98 adults (50 female and 48 male) ranging in age from 60 to 86 (mean = 69.2). Hearing loss was typical for this age group and about 90% had not worn hearing aids. All subjects completed a battery of tests, including cognitive (6 measures), psychophysical (17 measures), and speech-understanding (9 measures), as well as the Speech, Spatial and Qualities of Hearing (SSQ) self-report scale. Most of the speech-understanding measures made use of competing speech and the non-speech psychophysical measures were designed to tap phenomena thought to be relevant for the perception of speech in competing speech (e.g., stream segregation, modulation-detection interference). All measures of speech understanding were administered with spectral shaping applied to the speech stimuli to fully restore audibility through at least 4000 Hz. The measures used were demonstrated to be reliable in older adults and, when compared to a reference group of 28 young normal-hearing adults, age-group differences were observed on many of the measures. Principal-components factor analysis was applied successfully to reduce the number of independent and dependent (speech understanding) measures for a multiple-regression analysis. Doing so yielded one global cognitive-processing factor and five non-speech psychoacoustic factors (hearing loss, dichotic signal detection, multi-burst masking, stream segregation, and modulation detection) as potential predictors. To this set of six potential predictor variables were added subject age, Environmental Sound Identification (ESI), and performance on the text-recognition-threshold (TRT) task (a visual analog of interrupted speech recognition). These variables were used to successfully predict one global aided speech-understanding factor, accounting for about 60% of the variance.

  9. Segmentation, Diarization and Speech Transcription: Surprise Data Unraveled

    NARCIS (Netherlands)

    Huijbregts, M.A.H.

    2008-01-01

    In this thesis, research on large vocabulary continuous speech recognition for unknown audio conditions is presented. For automatic speech recognition systems based on statistical methods, it is important that the conditions of the audio used for training the statistical models match the conditions

  10. Speech interaction strategies for a humanoid assistant

    Directory of Open Access Journals (Sweden)

    Stüker Sebastian

    2018-01-01

    The goal of SecondHands, a H2020 project, is to design a robot that can offer help to a maintenance technician in a proactive manner. The robot is to act as a second pair of hands that can assist the technician when he is in need of help. In order for the robot to be of real help to the technician, it needs to understand his needs and follow his commands. Interaction via speech is a crucial part of this. Due to the nature of the situation in which the interactions take place (the technician often needs to speak to the robot while under stress and performing strenuous physical labor), the classical turn-based interaction schemes need to be transformed into dialogue systems that perform stream processing, anticipating user intentions and correcting themselves as more information becomes available, in order to respond rapidly. To meet these demands, we are developing low-latency, streaming-based automatic speech recognition systems in combination with recurrent-neural-network-based Natural Language Understanding systems that perform slot filling and intent recognition, so that the robot can provide assistance rapidly, partly on the basis of speculative classifications that are then refined as more speech becomes available.

  11. Cortical oscillations and entrainment in speech processing during working memory load

    DEFF Research Database (Denmark)

    Hjortkjær, Jens; Märcher-Rørsted, Jonatan; Fuglsang, Søren A

    2018-01-01

    Neuronal oscillations are thought to play an important role in working memory (WM) and speech processing. Listening to speech in real-life situations is often cognitively demanding but it is unknown whether WM load influences how auditory cortical activity synchronizes to speech features. Here, we...... developed an auditory n-back paradigm to investigate cortical entrainment to speech envelope fluctuations under different degrees of WM load. We measured the electroencephalogram, pupil dilations and behavioural performance from 22 subjects listening to continuous speech with an embedded n-back task....... The speech stimuli consisted of long spoken number sequences created to match natural speech in terms of sentence intonation, syllabic rate and phonetic content. To burden different WM functions during speech processing, listeners performed an n-back task on the speech sequences in different levels...

  12. Intervention for Childhood Apraxia of Speech: A Single-Case Study

    Science.gov (United States)

    Martikainen, Anna-Leena; Korpilahti, Pirjo

    2011-01-01

    The underlying nature and diagnosis of childhood apraxia of speech (CAS) still requires clarification. However, the label "CAS" or "suspected CAS" continues to be assigned to a group of children with speech problems, and speech and language therapists need to be aware of effective treatment for these children. The aim of this study was to assess…

  13. Tuning Neural Phase Entrainment to Speech.

    Science.gov (United States)

    Falk, Simone; Lanzilotti, Cosima; Schön, Daniele

    2017-08-01

    Musical rhythm positively impacts on subsequent speech processing. However, the neural mechanisms underlying this phenomenon are so far unclear. We investigated whether carryover effects from a preceding musical cue to a speech stimulus result from a continuation of neural phase entrainment to periodicities that are present in both music and speech. Participants listened and memorized French metrical sentences that contained (quasi-)periodic recurrences of accents and syllables. Speech stimuli were preceded by a rhythmically regular or irregular musical cue. Our results show that the presence of a regular cue modulates neural response as estimated by EEG power spectral density, intertrial coherence, and source analyses at critical frequencies during speech processing compared with the irregular condition. Importantly, intertrial coherences for regular cues were indicative of the participants' success in memorizing the subsequent speech stimuli. These findings underscore the highly adaptive nature of neural phase entrainment across fundamentally different auditory stimuli. They also support current models of neural phase entrainment as a tool of predictive timing and attentional selection across cognitive domains.
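    The intertrial coherence mentioned in this abstract is commonly computed as the length of the mean unit phase vector across trials. The following is a minimal sketch of that general definition, not the authors' analysis pipeline; the function name and the example phase values are assumptions made for illustration.

```python
import cmath
import math


def intertrial_coherence(phases):
    """Length of the mean unit phase vector across trials.

    `phases` holds one phase angle (radians) per trial at a fixed
    frequency and time point. The result is close to 0 when phases are
    spread evenly around the circle and 1 when they are identical.
    """
    if not phases:
        raise ValueError("need at least one trial phase")
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vector)


# Hypothetical phase values, chosen only to show the two extremes.
aligned = [0.10, 0.12, 0.08, 0.11]
scattered = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
print(intertrial_coherence(aligned))    # close to 1
print(intertrial_coherence(scattered))  # close to 0
```

    In practice the per-trial phases would come from a time-frequency decomposition of the EEG, which is outside the scope of this sketch.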

  14. The Hierarchical Cortical Organization of Human Speech Processing.

    Science.gov (United States)

    de Heer, Wendy A; Huth, Alexander G; Griffiths, Thomas L; Gallant, Jack L; Theunissen, Frédéric E

    2017-07-05

    natural speech. Both cerebral hemispheres were actively involved in speech processing in large and equal amounts. Also, the transformation from spectral features to semantic elements occurs early in the cortical speech-processing stream. Our experimental and analytical approaches are important alternatives and complements to standard approaches that use segmented speech and block designs, which report more laterality in speech processing and associated semantic processing to higher levels of cortex than reported here. Copyright © 2017 the authors 0270-6474/17/376539-19$15.00/0.

  15. Collective speech acts

    NARCIS (Netherlands)

    Meijers, A.W.M.; Tsohatzidis, S.L.

    2007-01-01

    From its early development in the 1960s, speech act theory always had an individualistic orientation. It focused exclusively on speech acts performed by individual agents. Paradigmatic examples are ‘I promise that p’, ‘I order that p’, and ‘I declare that p’. There is a single speaker and a single

  16. Private Speech in Ballet

    Science.gov (United States)

    Johnston, Dale

    2006-01-01

    Authoritarian teaching practices in ballet inhibit the use of private speech. This paper highlights the critical importance of private speech in the cognitive development of young ballet students, within what is largely a non-verbal art form. It draws upon research by Russian psychologist Lev Vygotsky and contemporary socioculturalists, to…

  17. Free Speech Yearbook 1980.

    Science.gov (United States)

    Kane, Peter E., Ed.

    The 11 articles in this collection deal with theoretical and practical freedom of speech issues. The topics covered are (1) the United States Supreme Court and communication theory; (2) truth, knowledge, and a democratic respect for diversity; (3) denial of freedom of speech in Jock Yablonski's campaign for the presidency of the United Mine…

  18. Illustrated Speech Anatomy.

    Science.gov (United States)

    Shearer, William M.

    Written for students in the fields of speech correction and audiology, the text deals with the following: structures involved in respiration; the skeleton and the processes of inhalation and exhalation; phonation and pitch, the larynx, and esophageal speech; muscles involved in articulation; muscles involved in resonance; and the anatomy of the…

  19. Free Speech. No. 38.

    Science.gov (United States)

    Kane, Peter E., Ed.

    This issue of "Free Speech" contains the following articles: "Daniel Schorr Relieved of Reporting Duties" by Laurence Stern, "The Sellout at CBS" by Michael Harrington, "Defending Dan Schorr" by Tom Wicker, "Speech to the Washington Press Club, February 25, 1976" by Daniel Schorr, "Funds…

  20. Synergetic Organization in Speech Rhythm

    Science.gov (United States)

    Cummins, Fred

    The Speech Cycling Task is a novel experimental paradigm developed together with Robert Port and Keiichi Tajima at Indiana University. In a task of this sort, subjects repeat a phrase containing multiple prominent, or stressed, syllables in time with an auditory metronome, which can be simple or complex. A phase-based collective variable is defined in the acoustic speech signal. This paper reports on two experiments using speech cycling which together reveal many of the hallmarks of hierarchically coupled oscillatory processes. The first experiment requires subjects to place the final stressed syllable of a small phrase at specified phases within the overall Phrase Repetition Cycle (PRC). It is clearly demonstrated that only three patterns, characterized by phases around 1/3, 1/2 or 2/3, are reliably produced, and these points are attractors for other target phases. The system is thus multistable, and the attractors correspond to stable couplings between the metrical foot and the PRC. A second experiment examines the behavior of these attractors at increased rates. Faster rates lead to mode jumps between attractors. Previous experiments have also illustrated hysteresis as the system moves from one mode to the next. The dynamical organization is particularly interesting from a modeling point of view, as there is no single part of the speech production system which cycles at the level of either the metrical foot or the phrase repetition cycle. That is, there is no continuous kinematic observable in the system. Nonetheless, there is strong evidence that the macroscopic behavior of the entire production system is correctly described as hierarchically coupled oscillators. There are many parallels between this organization and the forms of inter-limb coupling observed in locomotion and rhythmic manual tasks.
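    The phase of a stressed syllable within the Phrase Repetition Cycle can be illustrated with a small calculation. The sketch below assumes the simplest possible definition (syllable onset time as a fraction of the cycle period); the abstract does not give the exact collective variable, so the function names and onset times here are hypothetical.

```python
# Attractor phases reported in the abstract above.
ATTRACTORS = (1 / 3, 1 / 2, 2 / 3)


def relative_phase(cycle_onset, next_cycle_onset, syllable_onset):
    """Position of a syllable onset within one cycle, as a fraction in [0, 1)."""
    period = next_cycle_onset - cycle_onset
    if period <= 0:
        raise ValueError("next_cycle_onset must come after cycle_onset")
    return ((syllable_onset - cycle_onset) / period) % 1.0


def nearest_attractor(phase):
    """Attractor phase closest to the observed phase."""
    return min(ATTRACTORS, key=lambda a: abs(a - phase))


# Hypothetical onset times in seconds: the cycle lasts 1.5 s and the
# final stressed syllable starts 0.8 s into it.
phase = relative_phase(0.0, 1.5, 0.8)
print(round(phase, 3), nearest_attractor(phase))
```

    A produced phase that drifts toward one of the three values, regardless of the target phase, is the kind of pattern the abstract describes as attraction.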

  1. Speech recognition systems on the Cell Broadband Engine

    Energy Technology Data Exchange (ETDEWEB)

    Liu, Y; Jones, H; Vaidya, S; Perrone, M; Tydlitat, B; Nanda, A

    2007-04-20

    In this paper we describe our design, implementation, and first results of a prototype connected-phoneme-based speech recognition system on the Cell Broadband Engine™ (Cell/B.E.). Automatic speech recognition decodes speech samples into plain text (other representations are possible) and must process samples at real-time rates. Fortunately, the computational tasks involved in this pipeline are highly data-parallel and can receive significant hardware acceleration from vector-streaming architectures such as the Cell/B.E. Identifying and exploiting these parallelism opportunities is challenging, but also critical to improving system performance. We observed, from our initial performance timings, that a single Cell/B.E. processor can recognize speech from thousands of simultaneous voice channels in real time, a channel density that is orders of magnitude greater than the capacity of existing software speech recognizers based on CPUs (central processing units). This result emphasizes the potential for Cell/B.E.-based speech recognition and will likely lead to the future development of production speech systems using Cell/B.E. clusters.

  2. A survey of systems for massive stream analytics

    OpenAIRE

    Singh, Maninder Pal; Hoque, Mohammad A.; Tarkoma, Sasu

    2016-01-01

    The immense growth of data demands switching from traditional data processing solutions to systems that can process a continuous stream of real-time data. Various applications employ stream processing systems to provide solutions to emerging Big Data problems. Open-source solutions such as Storm, Spark Streaming, and S4 are attempts to answer key stream processing questions. The recent introduction of real-time stream processing commercial solutions such as Amazon Kinesis, IBM Infospher...

  3. Speech Production and Speech Discrimination by Hearing-Impaired Children.

    Science.gov (United States)

    Novelli-Olmstead, Tina; Ling, Daniel

    1984-01-01

    Seven hearing impaired children (five to seven years old) assigned to the Speakers group made highly significant gains in speech production and auditory discrimination of speech, while Listeners made only slight speech production gains and no gains in auditory discrimination. Combined speech and auditory training was more effective than auditory…

  4. Prevalence of Speech Disorders in Arak Primary School Students, 2014-2015

    Directory of Open Access Journals (Sweden)

    Abdoreza Yavari

    2016-09-01

    Background: Speech disorders may cause irreparable damage to a child's speech and language development from a psychosocial point of view. Voice, speech sound production, and fluency disorders are speech disorders that may result from delay or impairment in the speech motor control mechanism, central nervous system disorders, improper language stimulation, or voice abuse. Materials and Methods: This study examined the prevalence of speech disorders in 1393 Arakian students in the 1st to 6th grades of primary school. After collecting continuous speech samples and administering picture description, passage reading, and phonetic tests, we recorded the pathological signs of stuttering, articulation disorder, and voice disorders on a special sheet. Results: The prevalence of articulation, voice, and stuttering disorders was 8%, 3.5%, and 1%, respectively, and the overall prevalence of speech disorders was 11.9%. The prevalence of speech disorders decreased as grade level increased. Speech disorders were found in 12.2% of boys and 11.7% of girls in primary school in Arak. Conclusion: The prevalence of speech disorders among primary school students in Arak is similar to that in Kermanshah, but lower than that reported in many similar studies in Iran. It seems that racial and cultural diversity has some effect on increasing the prevalence of speech disorders in Arak city.

  5. C-SPARQL : SPARQL for continuous querying

    OpenAIRE

    Barbieri, Davide Francesco; Braga, Daniele; Ceri, Stefano; Valle, Emanuele Della; Grossniklaus, Michael

    2009-01-01

    C-SPARQL is an extension of SPARQL to support continuous queries, registered and continuously executed over RDF data streams, considering windows of such streams. Supporting streams in RDF format guarantees interoperability and opens up important applications, in which reasoners can deal with knowledge that evolves over time. We present C-SPARQL by means of examples in Urban Computing.
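    The idea of a query that is registered once and re-evaluated over a window of a stream can be sketched without any RDF tooling. The example below is plain Python, not C-SPARQL syntax and not the authors' engine; the class name, the `enters` predicate, and the district identifiers are hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class SlidingWindowQuery:
    """Re-evaluates a simple aggregate over the last `range_s` seconds of a triple stream."""

    def __init__(self, range_s, predicate):
        self.range_s = range_s
        self.predicate = predicate
        self.window = deque()  # (timestamp, (subject, predicate, object)) in arrival order

    def push(self, timestamp, triple):
        """Add one timestamped triple, expire old ones, and return the fresh result."""
        self.window.append((timestamp, triple))
        cutoff = timestamp - self.range_s
        while self.window and self.window[0][0] <= cutoff:
            self.window.popleft()
        return self.evaluate()

    def evaluate(self):
        """Count, per object, the in-window triples that use the watched predicate."""
        counts = {}
        for _, (_subject, pred, obj) in self.window:
            if pred == self.predicate:
                counts[obj] = counts.get(obj, 0) + 1
        return counts


# Hypothetical urban-computing stream: cars entering city districts.
query = SlidingWindowQuery(range_s=60, predicate="enters")
print(query.push(0, ("car1", "enters", "districtA")))
print(query.push(30, ("car2", "enters", "districtA")))
print(query.push(70, ("car3", "enters", "districtB")))  # the triple from t=0 has expired
```

    A real engine would also let the window slide in fixed steps and would match full graph patterns rather than a single predicate; this sketch only shows the register-once, evaluate-repeatedly behaviour.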

  6. Language-universal constraints on speech segmentation

    NARCIS (Netherlands)

    Norris, D.; McQueen, J.M.; Cutler, A.; Butterfield, S.; Kearns, R.K.

    2001-01-01

    Two word-spotting experiments are reported that examine whether the Possible-Word Constraint (PWC; Norris, McQueen, Cutler & Butterfield, 1997) is a language-specific or language-universal strategy for the segmentation of continuous speech. The PWC disfavors parses which leave an impossible residue

  7. Inner Speech's Relationship With Overt Speech in Poststroke Aphasia.

    Science.gov (United States)

    Stark, Brielle C; Geva, Sharon; Warburton, Elizabeth A

    2017-09-18

    Relatively preserved inner speech alongside poor overt speech has been documented in some persons with aphasia (PWA), but the relationship of overt speech with inner speech is still largely unclear, as few studies have directly investigated these factors. The present study investigates the relationship of relatively preserved inner speech in aphasia with selected measures of language and cognition. Thirty-eight persons with chronic aphasia (27 men, 11 women; average age 64.53 ± 13.29 years, time since stroke 8-111 months) were classified as having relatively preserved inner and overt speech (n = 21), relatively preserved inner speech with poor overt speech (n = 8), or not classified due to insufficient measurements of inner and/or overt speech (n = 9). Inner speech scores (by group) were correlated with selected measures of language and cognition from the Comprehensive Aphasia Test (Swinburn, Porter, & Al, 2004). The group with poor overt speech showed a significant relationship of inner speech with overt naming (r = .95, p speech and language and cognition factors were not significant for the group with relatively good overt speech. As in previous research, we show that relatively preserved inner speech is found alongside otherwise severe production deficits in PWA. PWA with poor overt speech may rely more on preserved inner speech for overt picture naming (perhaps due to shared resources with verbal working memory) and for written picture description (perhaps due to reliance on inner speech due to perceived task difficulty). Assessments of inner speech may be useful as a standard component of aphasia screening, and therapy focused on improving and using inner speech may prove clinically worthwhile. https://doi.org/10.23641/asha.5303542.

  8. On-stream chemical element monitor

    International Nuclear Information System (INIS)

    Averitt, O.R.; Dorsch, R.R.

    1979-01-01

    An apparatus and method for on-stream chemical element monitoring are described wherein a multiplicity of sample streams are flowed continuously through individual analytical cells and fluorescence analyses are performed on the sample streams in sequence, together with a method of controlling the time duration of each analysis as a function of the concomitant radiation exposure of a preselected perforate reference material interposed in the sample-radiation source path

  9. Syntactic error modeling and scoring normalization in speech recognition: Error modeling and scoring normalization in the speech recognition task for adult literacy training

    Science.gov (United States)

    Olorenshaw, Lex; Trawick, David

    1991-01-01

    The purpose was to develop a speech recognition system to be able to detect speech which is pronounced incorrectly, given that the text of the spoken speech is known to the recognizer. Better mechanisms are provided for using speech recognition in a literacy tutor application. Using a combination of scoring normalization techniques and cheater-mode decoding, a reasonable acceptance/rejection threshold was provided. In continuous speech, the system was tested to be able to provide above 80 pct. correct acceptance of words, while correctly rejecting over 80 pct. of incorrectly pronounced words.

  10. Environmental Contamination of Normal Speech.

    Science.gov (United States)

    Harley, Trevor A.

    1990-01-01

    Environmentally contaminated speech errors (irrelevant words or phrases derived from the speaker's environment and erroneously incorporated into speech) are hypothesized to occur at a high level of speech processing, but with a relatively late insertion point. The data indicate that speech production processes are not independent of other…

  11. Continuous analytical control of the streaming waters in a uranium treatment plant and of various chemical products using automatic discharge valves; Controle par analyse en continu des eaux de ruissellement d'une usine traitant de l'uranium et divers produits chimiques avec commande automatique des vannes de decharge

    Energy Technology Data Exchange (ETDEWEB)

    Archimbaud, M; Simeon, C [Commissariat a l' Energie Atomique, Pierrelatte (France)

    1968-07-01

    This report describes a method for controlling the streaming waters produced by the Pierrelatte Centre; it is based on continuous analysis, with simultaneous recording of the species liable to be found accidentally in the corresponding hydrological circuits (chlorides, fluorides, chromium VI, uranium). An alarm set off at pre-determined thresholds leads to an automatic cutting off of the discharge valves; the outward flow of the waters is thus interrupted. This study has shown the various applications which can be found for this water control method, and gives an idea of the cost price. (authors)

  12. APPRECIATING SPEECH THROUGH GAMING

    Directory of Open Access Journals (Sweden)

    Mario T Carreon

    2014-06-01

    This paper discusses the Speech and Phoneme Recognition as an Educational Aid for the Deaf and Hearing Impaired (SPREAD) application and the ongoing research on its deployment as a tool for motivating deaf and hearing impaired students to learn and appreciate speech. This application uses the Sphinx-4 voice recognition system to analyze the vocalization of the student and provide prompt feedback on their pronunciation. The packaging of the application as an interactive game aims to provide additional motivation for the deaf and hearing impaired student through visual motivation for them to learn and appreciate speech.

  13. Global Freedom of Speech

    DEFF Research Database (Denmark)

    Binderup, Lars Grassme

    2007-01-01

    , as opposed to a legal norm, that curbs exercises of the right to free speech that offend the feelings or beliefs of members from other cultural groups. The paper rejects the suggestion that acceptance of such a norm is in line with liberal egalitarian thinking. Following a review of the classical liberal...... egalitarian reasons for free speech - reasons from overall welfare, from autonomy and from respect for the equality of citizens - it is argued that these reasons outweigh the proposed reasons for curbing culturally offensive speech. Currently controversial cases such as that of the Danish Cartoon Controversy...

  14. Internet images of the speech pathology profession.

    Science.gov (United States)

    Byrne, Nicole

    2017-06-05

    Objective: The Internet provides the general public with information about speech pathology services, including client groups and service delivery models, as well as the professionals providing the services. Although this information assists the general public and other professionals to both access and understand speech pathology services, it also potentially provides information about speech pathology as a prospective career, including the types of people who are speech pathologists (i.e. demographics). The aim of the present study was to collect baseline data on how the speech pathology profession was presented via images on the Internet. Methods: A pilot prospective observational study using content analysis methodology was conducted to analyse publicly available Internet images related to the speech pathology profession. The terms 'Speech Pathology' and 'speech pathologist' were used to represent both the profession and the professional, resulting in the identification of 200 images. These images were considered across a range of areas, including who was in the image (e.g. professional, client, significant other), the technology used and the types of intervention. Results: The majority of images showed both a client and a professional (i.e. speech pathologist). While the professional was predominantly presented as female, the gender of the client was more evenly distributed. The clients were more likely to be preschool or school aged; however, male speech pathologists were presented as providing therapy to selected age groups (i.e. school aged and younger adults). Images were predominantly of individual therapy and the few group images that were presented were all paediatric. Conclusion: Current images of speech pathology continue to portray narrow professional demographics and client groups (e.g. paediatrics). Promoting images of wider scope to fully represent the depth and breadth of speech pathology professional practice may assist in attracting a more diverse

  15. Improving on hidden Markov models: An articulatorily constrained, maximum likelihood approach to speech recognition and speech coding

    Energy Technology Data Exchange (ETDEWEB)

    Hogden, J.

    1996-11-05

    The goal of the proposed research is to test a statistical model of speech recognition that incorporates the knowledge that speech is produced by relatively slow motions of the tongue, lips, and other speech articulators. This model is called Maximum Likelihood Continuity Mapping (Malcom). Many speech researchers believe that by using constraints imposed by articulator motions, we can improve or replace the current hidden Markov model based speech recognition algorithms. Unfortunately, previous efforts to incorporate information about articulation into speech recognition algorithms have suffered because (1) slight inaccuracies in our knowledge or the formulation of our knowledge about articulation may decrease recognition performance, (2) small changes in the assumptions underlying models of speech production can lead to large changes in the speech derived from the models, and (3) collecting measurements of human articulator positions in sufficient quantity for training a speech recognition algorithm is still impractical. The most interesting (and in fact, unique) quality of Malcom is that, even though Malcom makes use of a mapping between acoustics and articulation, Malcom can be trained to recognize speech using only acoustic data. By learning the mapping between acoustics and articulation using only acoustic data, Malcom avoids the difficulties involved in collecting articulator position measurements and does not require an articulatory synthesizer model to estimate the mapping between vocal tract shapes and speech acoustics. Preliminary experiments that demonstrate that Malcom can learn the mapping between acoustics and articulation are discussed. Potential applications of Malcom aside from speech recognition are also discussed. Finally, specific deliverables resulting from the proposed research are described.

  16. Neural entrainment to rhythmically-presented auditory, visual and audio-visual speech in children

    Directory of Open Access Journals (Sweden)

    Alan James Power

    2012-07-01

    Auditory cortical oscillations have been proposed to play an important role in speech perception. It is suggested that the brain may take temporal ‘samples’ of information from the speech stream at different rates, phase-resetting ongoing oscillations so that they are aligned with similar frequency bands in the input (‘phase locking’). Information from these frequency bands is then bound together for speech perception. To date, there are no explorations of neural phase-locking and entrainment to speech input in children. However, it is clear from studies of language acquisition that infants use both visual speech information and auditory speech information in learning. In order to study neural entrainment to speech in typically-developing children, we use a rhythmic entrainment paradigm (underlying 2 Hz or delta rate) based on repetition of the syllable ba, presented in either the auditory modality alone, the visual modality alone, or as auditory-visual speech (via a talking head). To ensure attention to the task, children aged 13 years were asked to press a button as fast as possible when the ba stimulus violated the rhythm for each stream type. Rhythmic violation depended on delaying the occurrence of a ba in the isochronous stream. Neural entrainment was demonstrated for all stream types, and individual differences in standardized measures of language processing were related to auditory entrainment at the theta rate. Further, there was significant modulation of the preferred phase of auditory entrainment in the theta band when visual speech cues were present, indicating cross-modal phase resetting. The rhythmic entrainment paradigm developed here offers a method for exploring individual differences in oscillatory phase locking during development. In particular, a method for assessing neural entrainment and cross-modal phase resetting would be useful for exploring developmental learning difficulties thought to involve temporal sampling

  17. Differential modulation of auditory responses to attended and unattended speech in different listening conditions.

    Science.gov (United States)

    Kong, Ying-Yee; Mullangi, Ala; Ding, Nai

    2014-10-01

    This study investigates how top-down attention modulates neural tracking of the speech envelope in different listening conditions. In the quiet conditions, a single speech stream was presented and the subjects paid attention to the speech stream (active listening) or watched a silent movie instead (passive listening). In the competing speaker (CS) conditions, two speakers of opposite genders were presented diotically. Ongoing electroencephalographic (EEG) responses were measured in each condition and cross-correlated with the speech envelope of each speaker at different time lags. In quiet, active and passive listening resulted in similar neural responses to the speech envelope. In the CS conditions, however, the shape of the cross-correlation function was remarkably different between the attended and unattended speech. The cross-correlation with the attended speech showed stronger N1 and P2 responses but a weaker P1 response compared to the cross-correlation with the unattended speech. Furthermore, the N1 response to the attended speech in the CS condition was enhanced and delayed compared with the active listening condition in quiet, while the P2 response to the unattended speaker in the CS condition was attenuated compared with the passive listening in quiet. Taken together, these results demonstrate that top-down attention differentially modulates envelope-tracking neural activity at different time lags and suggest that top-down attention can both enhance the neural responses to the attended sound stream and suppress the responses to the unattended sound stream. Copyright © 2014 Elsevier B.V. All rights reserved.
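    Cross-correlating a neural response with the speech envelope at different time lags can be sketched as a Pearson correlation computed after shifting one signal against the other. This is a minimal illustration under assumed conditions (equal sampling rate, lags counted in samples, synthetic signals); it is not the authors' processing chain, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def lagged_xcorr(envelope, response, max_lag):
    """Correlation between envelope[t] and response[t + lag] for lag = 0..max_lag samples."""
    result = {}
    for lag in range(max_lag + 1):
        n = min(len(envelope), len(response) - lag)
        if n < 2:
            break
        result[lag] = pearson(envelope[:n], response[lag:lag + n])
    return result


# Synthetic check: the "response" is the envelope delayed by 5 samples.
envelope = [math.sin(0.3 * t) + 0.5 * math.sin(0.11 * t) for t in range(200)]
response = [0.0] * 5 + envelope[:-5]
xcorr = lagged_xcorr(envelope, response, max_lag=10)
print(max(xcorr, key=xcorr.get))  # lag with the strongest correlation
```

    With real recordings, peaks and troughs of this function at particular lags are what the abstract refers to as P1, N1, and P2 responses; comparing the functions obtained for the attended and unattended speaker is the contrast the study reports.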

  18. Reliance on auditory feedback in children with childhood apraxia of speech.

    Science.gov (United States)

    Iuzzini-Seigel, Jenya; Hogan, Tiffany P; Guarino, Anthony J; Green, Jordan R

    2015-01-01

    Children with childhood apraxia of speech (CAS) have been hypothesized to continuously monitor their speech through auditory feedback to minimize speech errors. We used an auditory masking paradigm to determine the effect of attenuating auditory feedback on speech in 30 children: 9 with CAS, 10 with speech delay, and 11 with typical development. The masking only affected the speech of children with CAS as measured by voice onset time and vowel space area. These findings provide preliminary support for greater reliance on auditory feedback among children with CAS. Readers of this article should be able to (i) describe the motivation for investigating the role of auditory feedback in children with CAS; (ii) report the effects of feedback attenuation on speech production in children with CAS, speech delay, and typical development, and (iii) understand how the current findings may support a feedforward program deficit in children with CAS. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  19. Charisma in business speeches

    DEFF Research Database (Denmark)

    Niebuhr, Oliver; Brem, Alexander; Novák-Tót, Eszter

    2016-01-01

    to business speeches. Consistent with the public opinion, our findings are indicative of Steve Jobs being a more charismatic speaker than Mark Zuckerberg. Beyond previous studies, our data suggest that rhythm and emphatic accentuation are also involved in conveying charisma. Furthermore, the differences...... between Steve Jobs and Mark Zuckerberg and the investor- and customer-related sections of their speeches support the modern understanding of charisma as a gradual, multiparametric, and context-sensitive concept....

  20. Speech spectrum envelope modeling

    Czech Academy of Sciences Publication Activity Database

    Vích, Robert; Vondra, Martin

    Vol. 4775, - (2007), s. 129-137 ISSN 0302-9743. [COST Action 2102 International Workshop. Vietri sul Mare, 29.03.2007-31.03.2007] R&D Projects: GA AV ČR(CZ) 1ET301710509 Institutional research plan: CEZ:AV0Z20670512 Keywords : speech * speech processing * cepstral analysis Subject RIV: JA - Electronics ; Optoelectronics, Electrical Engineering Impact factor: 0.302, year: 2005

  1. Speech rhythms and multiplexed oscillatory sensory coding in the human brain.

    Directory of Open Access Journals (Sweden)

    Joachim Gross

    2013-12-01

    Cortical oscillations are likely candidates for segmentation and coding of continuous speech. Here, we monitored continuous speech processing with magnetoencephalography (MEG) to unravel the principles of speech segmentation and coding. We demonstrate that speech entrains the phase of low-frequency (delta, theta) and the amplitude of high-frequency (gamma) oscillations in the auditory cortex. Phase entrainment is stronger in the right and amplitude entrainment is stronger in the left auditory cortex. Furthermore, edges in the speech envelope phase reset auditory cortex oscillations thereby enhancing their entrainment to speech. This mechanism adapts to the changing physical features of the speech envelope and enables efficient, stimulus-specific speech sampling. Finally, we show that within the auditory cortex, coupling between delta, theta, and gamma oscillations increases following speech edges. Importantly, all couplings (i.e., brain-speech and also within the cortex) attenuate for backward-presented speech, suggesting top-down control. We conclude that segmentation and coding of speech relies on a nested hierarchy of entrained cortical oscillations.

  2. Speech Rhythms and Multiplexed Oscillatory Sensory Coding in the Human Brain

    Science.gov (United States)

    Gross, Joachim; Hoogenboom, Nienke; Thut, Gregor; Schyns, Philippe; Panzeri, Stefano; Belin, Pascal; Garrod, Simon

    2013-01-01

    Cortical oscillations are likely candidates for segmentation and coding of continuous speech. Here, we monitored continuous speech processing with magnetoencephalography (MEG) to unravel the principles of speech segmentation and coding. We demonstrate that speech entrains the phase of low-frequency (delta, theta) and the amplitude of high-frequency (gamma) oscillations in the auditory cortex. Phase entrainment is stronger in the right and amplitude entrainment is stronger in the left auditory cortex. Furthermore, edges in the speech envelope phase reset auditory cortex oscillations thereby enhancing their entrainment to speech. This mechanism adapts to the changing physical features of the speech envelope and enables efficient, stimulus-specific speech sampling. Finally, we show that within the auditory cortex, coupling between delta, theta, and gamma oscillations increases following speech edges. Importantly, all couplings (i.e., brain-speech and also within the cortex) attenuate for backward-presented speech, suggesting top-down control. We conclude that segmentation and coding of speech relies on a nested hierarchy of entrained cortical oscillations. PMID:24391472

  3. Dog-directed speech: why do we use it and do dogs pay attention to it?

    Science.gov (United States)

    Ben-Aderet, Tobey; Gallego-Abenza, Mario; Reby, David; Mathevon, Nicolas

    2017-01-11

    Pet-directed speech is strikingly similar to infant-directed speech, a peculiar speaking pattern with higher pitch and slower tempo known to engage infants' attention and promote language learning. Here, we report the first investigation of potential factors modulating the use of dog-directed speech, as well as its immediate impact on dogs' behaviour. We recorded adult participants speaking in front of pictures of puppies, adult and old dogs, and analysed the quality of their speech. We then performed playback experiments to assess dogs' reaction to dog-directed speech compared with normal speech. We found that human speakers used dog-directed speech with dogs of all ages and that the acoustic structure of dog-directed speech was mostly independent of dog age, except for sound pitch which was relatively higher when communicating with puppies. Playback demonstrated that, in the absence of other non-auditory cues, puppies were highly reactive to dog-directed speech, and that the pitch was a key factor modulating their behaviour, suggesting that this specific speech register has a functional value in young dogs. Conversely, older dogs did not react differentially to dog-directed speech compared with normal speech. The fact that speakers continue to use dog-directed with older dogs therefore suggests that this speech pattern may mainly be a spontaneous attempt to facilitate interactions with non-verbal listeners. © 2017 The Author(s).

  4. [Prosody, speech input and language acquisition].

    Science.gov (United States)

    Jungheim, M; Miller, S; Kühn, D; Ptok, M

    2014-04-01

    In order to acquire language, children require speech input. The prosody of the speech input plays an important role. In most cultures adults modify their code when communicating with children. Compared to normal speech this code differs especially with regard to prosody. For this review a selective literature search in PubMed and Scopus was performed. Prosodic characteristics are a key feature of spoken language. By analysing prosodic features, children gain knowledge about underlying grammatical structures. Child-directed speech (CDS) is modified in a way that meaningful sequences are highlighted acoustically so that important information can be extracted from the continuous speech flow more easily. CDS is said to enhance the representation of linguistic signs. Taking into consideration what has previously been described in the literature regarding the perception of suprasegmentals, CDS seems to be able to support language acquisition due to the correspondence of prosodic and syntactic units. However, no findings have been reported stating that the linguistically reduced CDS could hinder first language acquisition.

  5. Memory for speech and speech for memory.

    Science.gov (United States)

    Locke, J L; Kutz, K J

    1975-03-01

    Thirty kindergarteners, 15 who substituted /w/ for /r/ and 15 with correct articulation, received two perception tests and a memory test that included /w/ and /r/ in minimally contrastive syllables. Although both groups had nearly perfect perception of the experimenter's productions of /w/ and /r/, misarticulating subjects perceived their own tape-recorded w/r productions as /w/. In the memory task these same misarticulating subjects committed significantly more /w/-/r/ confusions in unspoken recall. The discussion considers why people subvocally rehearse; a developmental period in which children do not rehearse; ways subvocalization may aid recall, including motor and acoustic encoding; an echoic store that provides additional recall support if subjects rehearse vocally, and perception of self- and other- produced phonemes by misarticulating children-including its relevance to a motor theory of perception. Evidence is presented that speech for memory can be sufficiently impaired to cause memory disorder. Conceptions that restrict speech disorder to an impairment of communication are challenged.

  6. Precision of working memory for speech sounds.

    Science.gov (United States)

    Joseph, Sabine; Iverson, Paul; Manohar, Sanjay; Fox, Zoe; Scott, Sophie K; Husain, Masud

    2015-01-01

    Memory for speech sounds is a key component of models of verbal working memory (WM). But how good is verbal WM? Most investigations assess this using binary report measures to derive a fixed number of items that can be stored. However, recent findings in visual WM have challenged such "quantized" views by employing measures of recall precision with an analogue response scale. WM for speech sounds might rely on both continuous and categorical storage mechanisms. Using a novel speech matching paradigm, we measured WM recall precision for phonemes. Vowel qualities were sampled from a formant space continuum. A probe vowel had to be adjusted to match the vowel quality of a target on a continuous, analogue response scale. Crucially, this provided an index of the variability of a memory representation around its true value and thus allowed us to estimate how memories were distorted from the original sounds. Memory load affected the quality of speech sound recall in two ways. First, there was a gradual decline in recall precision with increasing number of items, consistent with the view that WM representations of speech sounds become noisier with an increase in the number of items held in memory, just as for vision. Based on multidimensional scaling (MDS), the level of noise appeared to be reflected in distortions of the formant space. Second, as memory load increased, there was evidence of greater clustering of participants' responses around particular vowels. A mixture model captured both continuous and categorical responses, demonstrating a shift from continuous to categorical memory with increasing WM load. This suggests that direct acoustic storage can be used for single items, but when more items must be stored, categorical representations must be used.

  7. About the theory of congested transport streams

    OpenAIRE

    Valeriy GUK

    2009-01-01

    Talked about a theory, based on integrity of continuous motion of a transport stream. Placing of car and its speed is in a stream - second. Principle of application of the generalized methods of design and new descriptions of the states of transport streams opens up. Travelling and transport potentials are set, and also external capacity of the system a «transport stream» is an exergy, that allows to make differential equation and decide the applied tasks of organization of travelling motion....

  8. Predicting speech intelligibility in conditions with nonlinearly processed noisy speech

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2013-01-01

    The speech-based envelope power spectrum model (sEPSM; [1]) was proposed in order to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII). The sEPSM applies the signal-to-noise ratio in the envelope domain (SNRenv), which was demonstrated...... to successfully predict speech intelligibility in conditions with nonlinearly processed noisy speech, such as processing with spectral subtraction. Moreover, a multiresolution version (mr-sEPSM) was demonstrated to account for speech intelligibility in various conditions with stationary and fluctuating...

  9. Music and Speech Perception in Children Using Sung Speech.

    Science.gov (United States)

    Nie, Yingjiu; Galvin, John J; Morikawa, Michael; André, Victoria; Wheeler, Harley; Fu, Qian-Jie

    2018-01-01

    This study examined music and speech perception in normal-hearing children with some or no musical training. Thirty children (mean age = 11.3 years), 15 with and 15 without formal music training participated in the study. Music perception was measured using a melodic contour identification (MCI) task; stimuli were a piano sample or sung speech with a fixed timbre (same word for each note) or a mixed timbre (different words for each note). Speech perception was measured in quiet and in steady noise using a matrix-styled sentence recognition task; stimuli were naturally intonated speech or sung speech with a fixed pitch (same note for each word) or a mixed pitch (different notes for each word). Significant musician advantages were observed for MCI and speech in noise but not for speech in quiet. MCI performance was significantly poorer with the mixed timbre stimuli. Speech performance in noise was significantly poorer with the fixed or mixed pitch stimuli than with spoken speech. Across all subjects, age at testing and MCI performance were significantly correlated with speech performance in noise. MCI and speech performance in quiet was significantly poorer for children than for adults from a related study using the same stimuli and tasks; speech performance in noise was significantly poorer for young than for older children. Long-term music training appeared to benefit melodic pitch perception and speech understanding in noise in these pediatric listeners.

  10. Automatic phoneme category selectivity in the dorsal auditory stream.

    Science.gov (United States)

    Chevillet, Mark A; Jiang, Xiong; Rauschecker, Josef P; Riesenhuber, Maximilian

    2013-03-20

    Debates about motor theories of speech perception have recently been reignited by a burst of reports implicating premotor cortex (PMC) in speech perception. Often, however, these debates conflate perceptual and decision processes. Evidence that PMC activity correlates with task difficulty and subject performance suggests that PMC might be recruited, in certain cases, to facilitate category judgments about speech sounds (rather than speech perception, which involves decoding of sounds). However, it remains unclear whether PMC does, indeed, exhibit neural selectivity that is relevant for speech decisions. Further, it is unknown whether PMC activity in such cases reflects input via the dorsal or ventral auditory pathway, and whether PMC processing of speech is automatic or task-dependent. In a novel modified categorization paradigm, we presented human subjects with paired speech sounds from a phonetic continuum but diverted their attention from phoneme category using a challenging dichotic listening task. Using fMRI rapid adaptation to probe neural selectivity, we observed acoustic-phonetic selectivity in left anterior and left posterior auditory cortical regions. Conversely, we observed phoneme-category selectivity in left PMC that correlated with explicit phoneme-categorization performance measured after scanning, suggesting that PMC recruitment can account for performance on phoneme-categorization tasks. Structural equation modeling revealed connectivity from posterior, but not anterior, auditory cortex to PMC, suggesting a dorsal route for auditory input to PMC. Our results provide evidence for an account of speech processing in which the dorsal stream mediates automatic sensorimotor integration of speech and may be recruited to support speech decision tasks.

  11. Practical speech user interface design

    CERN Document Server

    Lewis, James R

    2010-01-01

    Although speech is the most natural form of communication between humans, most people find using speech to communicate with machines anything but natural. Drawing from psychology, human-computer interaction, linguistics, and communication theory, Practical Speech User Interface Design provides a comprehensive yet concise survey of practical speech user interface (SUI) design. It offers practice-based and research-based guidance on how to design effective, efficient, and pleasant speech applications that people can really use. Focusing on the design of speech user interfaces for IVR application

  12. Shifting stream planform state decreases stream productivity yet increases riparian animal production

    Science.gov (United States)

    Venarsky, Michael P.; Walters, David M.; Hall, Robert O.; Livers, Bridget; Wohl, Ellen

    2018-01-01

    In the Colorado Front Range (USA), disturbance history dictates stream planform. Undisturbed, old-growth streams have multiple channels and large amounts of wood and depositional habitat. Disturbed streams (wildfires and logging tested how these opposing stream states influenced organic matter, benthic macroinvertebrate secondary production, emerging aquatic insect flux, and riparian spider biomass. Organic matter and macroinvertebrate production did not differ among sites per unit area (m−2), but values were 2 ×–21 × higher in undisturbed reaches per unit of stream valley (m−1 valley) because total stream area was higher in undisturbed reaches. Insect emergence was similar among streams at the per unit area and per unit of stream valley. However, rescaling insect emergence to per meter of stream bank showed that the emerging insect biomass reaching the stream bank was lower in undisturbed sites because multi-channel reaches had 3 × more stream bank than single-channel reaches. Riparian spider biomass followed the same pattern as emerging aquatic insects, and we attribute this to bottom-up limitation caused by the multi-channeled undisturbed sites diluting prey quantity (emerging insects) reaching the stream bank (riparian spider habitat). These results show that historic landscape disturbances continue to influence stream and riparian communities in the Colorado Front Range. However, these legacy effects are only weakly influencing habitat-specific function and instead are primarily influencing stream–riparian community productivity by dictating both stream planform (total stream area, total stream bank length) and the proportional distribution of specific habitat types (pools vs riffles).

  13. Under-resourced speech recognition based on the speech manifold

    CSIR Research Space (South Africa)

    Sahraeian, R

    2015-09-01

    Full Text Available Conventional acoustic modeling involves estimating many parameters to effectively model feature distributions. The sparseness of speech and text data, however, degrades the reliability of the estimation process and makes speech recognition a...

  14. Indonesian Automatic Speech Recognition For Command Speech Controller Multimedia Player

    Directory of Open Access Journals (Sweden)

    Vivien Arief Wardhany

    2014-12-01

    Full Text Available The purpose of this multimedia device development is control through voice. Nowadays, voice commands can be recognized only in English. To overcome this issue, recognition here uses an Indonesian language model, acoustic model, and dictionary. The automatic speech recognizer is built using the CMU Sphinx engine, with the English language database modified to an Indonesian one, and XBMC is used as the multimedia player. The experiment used 10 volunteers testing items based on 7 commands. The volunteers were classified by gender: 5 male and 5 female. Ten samples were taken for each command, with each volunteer performing 10 test commands. Each volunteer also had to try all 7 commands provided. Based on the classification percentage table, the word “Kanan” was recognized most often, at 83%, while “pilih” was recognized least. The word most often misclassified was “kembali”, at 67%, while “kanan” was misclassified least. In the recognition rate (RR) results for male voices, several commands, such as “Kembali”, “Utama”, “Atas”, and “Bawah”, had low recognition rates. In particular, “kembali” could not be recognized as a command in female voices, while in male voices it had an RR of 4%; this is because no English word is similar to “kembali”, so the system does not recognize the command. The command “Pilih” had an RR of 80% with female voices but only 4% with male voices. This problem arises mostly from the different voice characteristics of adult males and females: men have lower voice frequencies (85 to 180 Hz) than women (165 to 255 Hz). The results of the experiment showed that each man had a different recognition rate because of differences in tone, pronunciation, and speed of speech. Further work is needed to improve the accuracy of the Indonesian automatic speech recognition system.
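
    The per-command recognition rates quoted in this record can be illustrated with a short sketch. It assumes that the recognition rate is the percentage of utterances of a command that the recognizer returned as that same command; the abstract does not define it. The function name and the trial data below are hypothetical and are not taken from the study.

```python
def recognition_rate(trials):
    """Return {command: percent of utterances recognized as that command}.

    `trials` is a list of (spoken, recognized) pairs. This definition of the
    recognition rate is an assumption, not the study's stated formula.
    """
    totals = {}
    correct = {}
    for spoken, recognized in trials:
        totals[spoken] = totals.get(spoken, 0) + 1
        if recognized == spoken:
            correct[spoken] = correct.get(spoken, 0) + 1
    return {cmd: 100.0 * correct.get(cmd, 0) / n for cmd, n in totals.items()}


# Hypothetical trials, for illustration only (not data from the study).
demo_trials = [
    ("kanan", "kanan"),
    ("kanan", "kanan"),
    ("kanan", "atas"),
    ("kanan", "kanan"),
    ("pilih", "kanan"),
    ("pilih", "pilih"),
]
rates = recognition_rate(demo_trials)
```

    Splitting the trial list by speaker gender before calling the function would give the kind of male and female comparison described in the abstract.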

  15. Speech Alarms Pilot Study

    Science.gov (United States)

    Sandor, A.; Moses, H. R.

    2016-01-01

    Currently on the International Space Station (ISS) and other space vehicles Caution & Warning (C&W) alerts are represented with various auditory tones that correspond to the type of event. This system relies on the crew's ability to remember what each tone represents in a high stress, high workload environment when responding to the alert. Furthermore, crew receive training a year or more in advance of the mission, which makes remembering the semantic meaning of the alerts more difficult. The current system works for missions conducted close to Earth where ground operators can assist as needed. On long duration missions, however, they will need to work off-nominal events autonomously. There is evidence that speech alarms may be easier and faster to recognize, especially during an off-nominal event. The Information Presentation Directed Research Project (FY07-FY09) funded by the Human Research Program included several studies investigating C&W alerts. The studies evaluated tone alerts currently in use with NASA flight deck displays along with candidate speech alerts. A follow-on study used four types of speech alerts to investigate how quickly various types of auditory alerts with and without a speech component - either at the beginning or at the end of the tone - can be identified. Even though crew were familiar with the tone alert from training or direct mission experience, alerts starting with a speech component were identified faster than alerts starting with a tone. The current study replicated the results from the previous study in a more rigorous experimental design to determine if the candidate speech alarms are ready for transition to operations or if more research is needed. Four types of alarms (caution, warning, fire, and depressurization) were presented to participants in both tone and speech formats in laboratory settings and later in the Human Exploration Research Analog (HERA). In the laboratory study, the alerts were presented by software and participants were

  16. Interaction between stream temperature, streamflow, and groundwater exchanges in alpine streams

    Science.gov (United States)

    Constantz, James E.

    1998-01-01

    Four alpine streams were monitored to continuously collect stream temperature and streamflow for periods ranging from a week to a year. In a small stream in the Colorado Rockies, diurnal variations in both stream temperature and streamflow were significantly greater in losing reaches than in gaining reaches, with minimum streamflow losses occurring early in the day and maximum losses occurring early in the evening. Using measured stream temperature changes, diurnal streambed infiltration rates were predicted to increase as much as 35% during the day (based on a heat and water transport groundwater model), while the measured increase in streamflow loss was 40%. For two large streams in the Sierra Nevada Mountains, annual stream temperature variations ranged from 0° to 25°C. In summer months, diurnal stream temperature variations were 30–40% of annual stream temperature variations, owing to reduced streamflows and increased atmospheric heating. Previous reports document that one Sierra stream site generally gains groundwater during low flows, while the second Sierra stream site may lose water during low flows. For August the diurnal streamflow variation was 11% at the gaining stream site and 30% at the losing stream site. On the basis of measured diurnal stream temperature variations, streambed infiltration rates were predicted to vary diurnally as much as 20% at the losing stream site. Analysis of results suggests that evapotranspiration losses determined diurnal streamflow variations in the gaining reaches, while in the losing reaches, evapotranspiration losses were compounded by diurnal variations in streambed infiltration. Diurnal variations in stream temperature were reduced in the gaining reaches as a result of discharging groundwater of relatively constant temperature. For the Sierra sites, comparison of results with those from a small tributary demonstrated that stream temperature patterns were useful in delineating discharges of bank storage following

  17. Intelligibility of speech of children with speech and sound disorders

    OpenAIRE

    Ivetac, Tina

    2014-01-01

    The purpose of this study is to examine speech intelligibility of children with primary speech and sound disorders aged 3 to 6 years in everyday life. The research problem is based on the degree to which parents or guardians, immediate family members (sister, brother, grandparents), extended family members (aunt, uncle, cousin), child's friends, other acquaintances, child's teachers and strangers understand the speech of children with speech sound disorders. We examined whether the level ...

  18. StreamExplorer: A Multi-Stage System for Visually Exploring Events in Social Streams.

    Science.gov (United States)

    Wu, Yingcai; Chen, Zhutian; Sun, Guodao; Xie, Xiao; Cao, Nan; Liu, Shixia; Cui, Weiwei

    2017-10-18

    Analyzing social streams is important for many applications, such as crisis management. However, the considerable diversity, increasing volume, and high dynamics of social streams of large events continue to be significant challenges that must be overcome to ensure effective exploration. We propose a novel framework by which to handle complex social streams on a budget PC. This framework features two components: 1) an online method to detect important time periods (i.e., subevents), and 2) a tailored GPU-assisted Self-Organizing Map (SOM) method, which clusters the tweets of subevents stably and efficiently. Based on the framework, we present StreamExplorer to facilitate the visual analysis, tracking, and comparison of a social stream at three levels. At a macroscopic level, StreamExplorer uses a new glyph-based timeline visualization, which presents a quick multi-faceted overview of the ebb and flow of a social stream. At a mesoscopic level, a map visualization is employed to visually summarize the social stream from either a topical or geographical aspect. At a microscopic level, users can employ interactive lenses to visually examine and explore the social stream from different perspectives. Two case studies and a task-based evaluation are used to demonstrate the effectiveness and usefulness of StreamExplorer.

  19. Neural tuning to low-level features of speech throughout the perisylvian cortex

    NARCIS (Netherlands)

    Berezutskaya, Y.; Freudenburg, Z.V.; Güçlü, U.; Gerven, M.A.J. van; Ramsey, N.F.

    2017-01-01

    Despite a large body of research, we continue to lack a detailed account of how auditory processing of continuous speech unfolds in the human brain. Previous research showed the propagation of low-level acoustic features of speech from posterior superior temporal gyrus towards anterior superior

  20. Neural tuning to low-level features of speech throughout the perisylvian cortex

    NARCIS (Netherlands)

    Berezutskaya, Julia; Freudenburg, Zachary V.; Güçlü, Umut; van Gerven, Marcel A.J.; Ramsey, Nick F.

    2017-01-01

    Despite a large body of research, we continue to lack a detailed account of how auditory processing of continuous speech unfolds in the human brain. Previous research showed the propagation of low-level acoustic features of speech from posterior superior temporal gyrus toward anterior superior

  1. Robust Speech/Non-Speech Classification in Heterogeneous Multimedia Content

    NARCIS (Netherlands)

    Huijbregts, M.A.H.; de Jong, Franciska M.G.

    In this paper we present a speech/non-speech classification method that allows high quality classification without the need to know in advance what kinds of audible non-speech events are present in an audio recording and that does not require a single parameter to be tuned on in-domain data. Because

  2. Tackling the complexity in speech

    DEFF Research Database (Denmark)

    section includes four carefully selected chapters. They deal with facets of speech production, speech acoustics, and/or speech perception or recognition, place them in an integrated phonetic-phonological perspective, and relate them in more or less explicit ways to aspects of speech technology. Therefore......, we hope that this volume can help speech scientists with traditional training in phonetics and phonology to keep up with the latest developments in speech technology. In the opposite direction, speech researchers starting from a technological perspective will hopefully get inspired by reading about...... the questions, phenomena, and communicative functions that are currently addressed in phonetics and phonology. Either way, the future of speech research lies in international, interdisciplinary collaborations, and our volume is meant to reflect and facilitate such collaborations...

  3. Streaming Compression of Hexahedral Meshes

    Energy Technology Data Exchange (ETDEWEB)

    Isenburg, M; Courbet, C

    2010-02-03

    We describe a method for streaming compression of hexahedral meshes. Given an interleaved stream of vertices and hexahedra, our coder incrementally compresses the mesh in the presented order. Our coder is extremely memory efficient when the input stream documents when vertices are referenced for the last time (i.e. when it contains topological finalization tags). Our coder then continuously releases and reuses data structures that no longer contribute to compressing the remainder of the stream. This means in practice that our coder has only a small fraction of the whole mesh in memory at any time. We can therefore compress very large meshes - even meshes that do not fit in memory. Compared to traditional, non-streaming approaches that load the entire mesh and globally reorder it during compression, our algorithm trades a less compact compressed representation for significant gains in speed, memory, and I/O efficiency. For example, on the 456k hexahedra 'blade' mesh, our coder is twice as fast and uses 88 times less memory (only 3.1 MB) with the compressed file increasing about 3% in size. We also present the first scheme for predictive compression of properties associated with hexahedral cells.
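
    The role of finalization tags described in this record can be sketched in a few lines. This is a minimal illustration of the memory-release idea only, not the authors' coder: it performs no prediction or entropy coding, and the record layout, the function names, and the placeholder coordinates are all assumptions made for the example.

```python
def process_stream(records):
    """Consume an interleaved vertex/hexahedron stream.

    Hypothetical record layout (an assumption, not the paper's format):
      ("v", vertex_id, (x, y, z))           a vertex
      ("h", (id0, ..., id7), finalized_ids) a hexahedron plus the ids of
                                            vertices it references for the
                                            last time (finalization tags)

    Returns (hexahedra processed, peak vertices held, vertices left at end).
    """
    active = {}  # vertices still needed by some later hexahedron
    hexahedra = 0
    peak = 0
    for record in records:
        if record[0] == "v":
            _, vertex_id, xyz = record
            active[vertex_id] = xyz
            peak = max(peak, len(active))
        elif record[0] == "h":
            _, ids, finalized_ids = record
            # A real coder would compress the cell here; the sketch only
            # checks that all eight corners are still available.
            corners = [active[i] for i in ids]
            assert len(corners) == 8
            hexahedra += 1
            for vertex_id in finalized_ids:
                del active[vertex_id]  # release memory as soon as allowed
        else:
            raise ValueError(f"unknown record type: {record[0]!r}")
    return hexahedra, peak, len(active)


def demo_stream():
    """Two hexahedra sharing one face; coordinates are placeholders."""
    for vertex_id in range(8):
        yield ("v", vertex_id, (float(vertex_id), 0.0, 0.0))
    yield ("h", (0, 1, 2, 3, 4, 5, 6, 7), (0, 1, 2, 3))
    for vertex_id in range(8, 12):
        yield ("v", vertex_id, (float(vertex_id), 0.0, 0.0))
    yield ("h", (4, 5, 6, 7, 8, 9, 10, 11), (4, 5, 6, 7, 8, 9, 10, 11))


result = process_stream(demo_stream())
```

    In this toy stream the mesh has 12 vertices, but at most 8 are held at any time, because the first hexahedron finalizes the four vertices that the second one does not share.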

  4. Shifting stream planform state decreases stream productivity yet increases riparian animal production

    Science.gov (United States)

    Venarsky, Michael P.; Walters, David M.; Hall, Robert O.; Livers, Bridget; Wohl, Ellen

    2018-01-01

    In the Colorado Front Range (USA), disturbance history dictates stream planform. Undisturbed, old-growth streams have multiple channels and large amounts of wood and depositional habitat. Disturbed streams (wildfires and logging production, emerging aquatic insect flux, and riparian spider biomass. Organic matter and macroinvertebrate production did not differ among sites per unit area (m−2), but values were 2 ×–21 × higher in undisturbed reaches per unit of stream valley (m−1 valley) because total stream area was higher in undisturbed reaches. Insect emergence was similar among streams at the per unit area and per unit of stream valley. However, rescaling insect emergence to per meter of stream bank showed that the emerging insect biomass reaching the stream bank was lower in undisturbed sites because multi-channel reaches had 3 × more stream bank than single-channel reaches. Riparian spider biomass followed the same pattern as emerging aquatic insects, and we attribute this to bottom-up limitation caused by the multi-channeled undisturbed sites diluting prey quantity (emerging insects) reaching the stream bank (riparian spider habitat). These results show that historic landscape disturbances continue to influence stream and riparian communities in the Colorado Front Range. However, these legacy effects are only weakly influencing habitat-specific function and instead are primarily influencing stream–riparian community productivity by dictating both stream planform (total stream area, total stream bank length) and the proportional distribution of specific habitat types (pools vs riffles).

  5. Innovative Speech Reconstructive Surgery

    OpenAIRE

    Hashem Shemshadi

    2003-01-01

    Proper speech functioning in human beings depends on the precise coordination and timing balance of a series of complex neuromuscular movements and actions. The sequence starts with the respiratory system, the prime energy source, expelling air; this air is delivered to trigger the vocal cords; the phonatory episode is swiftly changed into a comprehensible sound through resonance; and finally all head and neck structures are coordinated to elicit final speech in ...

  6. The chairman's speech

    International Nuclear Information System (INIS)

    Allen, A.M.

    1986-01-01

    The paper contains a transcript of a speech by the chairman of the UKAEA, to mark the publication of the 1985/6 annual report. The topics discussed in the speech include: the Chernobyl accident and its effect on public attitudes to nuclear power, management and disposal of radioactive waste, the operation of UKAEA as a trading fund, and the UKAEA development programmes. The development programmes include work on the following: fast reactor technology, thermal reactors, reactor safety, health and safety aspects of water cooled reactors, the Joint European Torus, and underlying research. (U.K.)

  7. Visualizing structures of speech expressiveness

    DEFF Research Database (Denmark)

    Herbelin, Bruno; Jensen, Karl Kristoffer; Graugaard, Lars

    2008-01-01

    Speech is both beautiful and informative. In this work, a conceptual study of the speech, through investigation of the tower of Babel, the archetypal phonemes, and a study of the reasons of uses of language is undertaken in order to create an artistic work investigating the nature of speech. The ....... The artwork is presented at the Re:New festival in May 2008....

  8. Noise-robust cortical tracking of attended speech in real-world acoustic scenes

    DEFF Research Database (Denmark)

    Fuglsang, Søren; Dau, Torsten; Hjortkjær, Jens

    2017-01-01

    Selectively attending to one speaker in a multi-speaker scenario is thought to synchronize low-frequency cortical activity to the attended speech signal. In recent studies, reconstruction of speech from single-trial electroencephalogram (EEG) data has been used to decode which talker a listener is attending to in a two-talker situation. It is currently unclear how this generalizes to more complex sound environments. Behaviorally, speech perception is robust to the acoustic distortions that listeners typically encounter in everyday life, but it is unknown whether this is mirrored by a noise-robust neural tracking of attended speech. Here we used advanced acoustic simulations to recreate real-world acoustic scenes in the laboratory. In virtual acoustic realities with varying amounts of reverberation and number of interfering talkers, listeners selectively attended to the speech stream...

  9. Speech act theory and New Testament exegesis

    Directory of Open Access Journals (Sweden)

    J. Botha

    1991-01-01

    Speech act theory offers New Testament exegesis some additional ways and means of approaching the text of the New Testament. This, the second in a series of two articles that make a plea for the continued utilisation and application of this theory to the text of the New Testament, deals with some of the possibilities and potential this theory holds for reading biblical texts. Advantages are pointed out and a few suggestions for the future proposed.

  10. Workshop: Welcoming speech

    International Nuclear Information System (INIS)

    Lummerzheim, D.

    1994-01-01

    The welcoming speech underlines the fact that any validation process starting with calculation methods and ending with studies on the long-term behaviour of a repository system can only be effected through laboratory, field and natural-analogue studies. The use of natural analogues (NA) is to secure the biosphere and to verify whether this safety really exists. (HP) [de

  11. Hearing speech in music.

    Science.gov (United States)

    Ekström, Seth-Reino; Borg, Erik

    2011-01-01

    The masking effect of a piano composition, played at different speeds and in different octaves, on speech-perception thresholds was investigated in 15 normal-hearing and 14 moderately-hearing-impaired subjects. Running speech (just follow conversation, JFC) testing and use of hearing aids increased the everyday validity of the findings. A comparison was made with standard audiometric noises [International Collegium of Rehabilitative Audiology (ICRA) noise and speech spectrum-filtered noise (SPN)]. All masking sounds, music or noise, were presented at the same equivalent sound level (50 dBA). The results showed a significant effect of piano performance speed and octave (P<.01). Low octave and fast tempo had the largest effect; and high octave and slow tempo, the smallest. Music had a lower masking effect than did ICRA noise with two or six speakers at normal vocal effort (P<.01) and SPN (P<.05). Subjects with hearing loss had higher masked thresholds than the normal-hearing subjects (P<.01), but there were smaller differences between masking conditions (P<.01). It is pointed out that music offers an interesting opportunity for studying masking under realistic conditions, where spectral and temporal features can be varied independently. The results have implications for composing music with vocal parts, designing acoustic environments and creating a balance between speech perception and privacy in social settings.

  12. Hearing speech in music

    Directory of Open Access Journals (Sweden)

    Seth-Reino Ekström

    2011-01-01

    The masking effect of a piano composition, played at different speeds and in different octaves, on speech-perception thresholds was investigated in 15 normal-hearing and 14 moderately-hearing-impaired subjects. Running speech (just follow conversation, JFC) testing and use of hearing aids increased the everyday validity of the findings. A comparison was made with standard audiometric noises [International Collegium of Rehabilitative Audiology (ICRA) noise and speech spectrum-filtered noise (SPN)]. All masking sounds, music or noise, were presented at the same equivalent sound level (50 dBA). The results showed a significant effect of piano performance speed and octave (P<.01). Low octave and fast tempo had the largest effect; and high octave and slow tempo, the smallest. Music had a lower masking effect than did ICRA noise with two or six speakers at normal vocal effort (P<.01) and SPN (P<.05). Subjects with hearing loss had higher masked thresholds than the normal-hearing subjects (P<.01), but there were smaller differences between masking conditions (P<.01). It is pointed out that music offers an interesting opportunity for studying masking under realistic conditions, where spectral and temporal features can be varied independently. The results have implications for composing music with vocal parts, designing acoustic environments and creating a balance between speech perception and privacy in social settings.

  13. Free Speech Yearbook 1979.

    Science.gov (United States)

    Kane, Peter E., Ed.

    The seven articles in this collection deal with theoretical and practical freedom of speech issues. Topics covered are: the United States Supreme Court, motion picture censorship, and the color line; judicial decision making; the established scientific community's suppression of the ideas of Immanuel Velikovsky; the problems of avant-garde jazz,…

  14. StreamCat

    Data.gov (United States)

    U.S. Environmental Protection Agency — The StreamCat Dataset provides summaries of natural and anthropogenic landscape features for ~2.65 million streams, and their associated catchments, within the...

  15. Prioritized Contact Transport Stream

    Science.gov (United States)

    Hunt, Walter Lee, Jr. (Inventor)

    2015-01-01

    A detection process, contact recognition process, classification process, and identification process are applied to raw sensor data to produce an identified contact record set containing one or more identified contact records. A prioritization process is applied to the identified contact record set to assign a contact priority to each contact record in the identified contact record set. Data are removed from the contact records in the identified contact record set based on the contact priorities assigned to those contact records. A first contact stream is produced from the resulting contact records. The first contact stream is streamed in a contact transport stream. The contact transport stream may include and stream additional contact streams. The contact transport stream may be varied dynamically over time based on parameters such as available bandwidth, contact priority, presence/absence of contacts, system state, and configuration parameters.
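    The prioritize, reduce and stream steps described above can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the `Contact` type, the priority table, the field names and the use of a record cap as a stand-in for available bandwidth are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    """Hypothetical identified contact record."""
    contact_id: str
    kind: str            # label assumed to come from the identification step
    data: dict           # sensor-derived fields
    priority: int = 0


# Hypothetical policy tables: priority per contact kind, fields kept per priority.
PRIORITY_BY_KIND = {"aircraft": 2, "vessel": 1, "clutter": 0}
FIELDS_BY_PRIORITY = {
    2: ("position", "velocity", "signature"),
    1: ("position", "velocity"),
    0: ("position",),
}


def prioritize(contacts):
    """Assign a contact priority to every record in the set."""
    for contact in contacts:
        contact.priority = PRIORITY_BY_KIND.get(contact.kind, 0)
    return contacts


def reduce_record(contact):
    """Remove data from a record according to its assigned priority."""
    keep = FIELDS_BY_PRIORITY[contact.priority]
    kept = {name: value for name, value in contact.data.items() if name in keep}
    return Contact(contact.contact_id, contact.kind, kept, contact.priority)


def build_stream(contacts, max_records):
    """Produce one contact stream; max_records is a crude proxy for bandwidth."""
    ranked = sorted(prioritize(contacts), key=lambda c: c.priority, reverse=True)
    return [reduce_record(c) for c in ranked[:max_records]]


full = {"position": (1.0, 2.0), "velocity": (0.1, 0.0), "signature": "abc"}
contacts = [
    Contact("c1", "clutter", dict(full)),
    Contact("c2", "aircraft", dict(full)),
    Contact("c3", "vessel", dict(full)),
]
stream = build_stream(contacts, max_records=2)
print([(c.contact_id, sorted(c.data)) for c in stream])
```

    Lowering `max_records` or changing the policy tables at run time would correspond loosely to varying the transport stream dynamically, as the abstract describes.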

  16. Nobel peace speech

    Directory of Open Access Journals (Sweden)

    Joshua FRYE

    2017-07-01

    The Nobel Peace Prize has long been considered the premier peace prize in the world. According to Geir Lundestad, Secretary of the Nobel Committee, of the 300 some peace prizes awarded worldwide, “none is in any way as well known and as highly respected as the Nobel Peace Prize” (Lundestad, 2001). Nobel peace speech is a unique and significant international site of public discourse committed to articulating the universal grammar of peace. Spanning over 100 years of sociopolitical history on the world stage, Nobel Peace Laureates richly represent an important cross-section of domestic and international issues increasingly germane to many publics. Communication scholars’ interest in this rhetorical genre has increased in the past decade. Yet, the norm has been to analyze a single speech artifact from a prestigious or controversial winner rather than examine the collection of speeches for generic commonalities of import. In this essay, we analyze the discourse of Nobel peace speech inductively and argue that the organizing principle of the Nobel peace speech genre is the repetitive form of normative liberal principles and values that function as rhetorical topoi. These topoi include freedom and justice and appeal to the inviolable, inborn right of human beings to exercise certain political and civil liberties and the expectation of equality of protection from totalitarian and tyrannical abuses. The significance of this essay to contemporary communication theory is to expand our theoretical understanding of rhetoric’s role in the maintenance and development of an international and cross-cultural vocabulary for the grammar of peace.

  17. Metaheuristic applications to speech enhancement

    CERN Document Server

    Kunche, Prajna

    2016-01-01

    This book serves as a basic reference for those interested in the application of metaheuristics to speech enhancement. The major goal of the book is to explain the basic concepts of optimization methods and their use in heuristic optimization in speech enhancement to scientists, practicing engineers, and academic researchers in speech processing. The authors discuss why it has been a challenging problem for researchers to develop new enhancement algorithms that aid in the quality and intelligibility of degraded speech. They present powerful optimization methods to speech enhancement that can help to solve the noise reduction problems. Readers will be able to understand the fundamentals of speech processing as well as the optimization techniques, how the speech enhancement algorithms are implemented by utilizing optimization methods, and will be given the tools to develop new algorithms. The authors also provide a comprehensive literature survey regarding the topic.

  18. Speech Motor Control in Fluent and Dysfluent Speech Production of an Individual with Apraxia of Speech and Broca's Aphasia

    Science.gov (United States)

    van Lieshout, Pascal H. H. M.; Bose, Arpita; Square, Paula A.; Steele, Catriona M.

    2007-01-01

    Apraxia of speech (AOS) is typically described as a motor-speech disorder with clinically well-defined symptoms, but without a clear understanding of the underlying problems in motor control. A number of studies have compared the speech of subjects with AOS to the fluent speech of controls, but only a few have included speech movement data and if…

  19. Predicting automatic speech recognition performance over communication channels from instrumental speech quality and intelligibility scores

    NARCIS (Netherlands)

    Gallardo, L.F.; Möller, S.; Beerends, J.

    2017-01-01

    The performance of automatic speech recognition based on coded-decoded speech heavily depends on the quality of the transmitted signals, determined by channel impairments. This paper examines relationships between speech recognition performance and measurements of speech quality and intelligibility

  20. Associations between speech features and phenotypic severity in Treacher Collins syndrome.

    Science.gov (United States)

    Asten, Pamela; Akre, Harriet; Persson, Christina

    2014-04-28

    children, adolescents and a subgroup of adults with TCS. Only children displayed markedly reduced intelligibility. Speech was significantly correlated with phenotypic severity of TCS and orofacial dysfunction. Follow-up and treatment of speech should still be focused on young patients, but some adults with TCS seem to require continuing speech and language pathology services.

  1. Speech is Golden

    DEFF Research Database (Denmark)

    Juel Henrichsen, Peter

    2014-01-01

    Most of the Danish municipalities are ready to begin to adopt automatic speech recognition, but at the same time remain nervous following a long series of bad business cases in the recent past. Complaints are voiced over costly licences and low service levels, typical effects of a de facto monopoly on the supply side. The present article reports on a new public action strategy which has taken shape in the course of 2013-14. While Denmark is a small language area, our public sector is well organised and has considerable purchasing power. Across this past year, Danish local authorities have organised around the speech technology challenge, they have formulated a number of joint questions and new requirements to be met by suppliers and have deliberately worked towards formulating tendering material which will allow fair competition. Public researchers have contributed to this work, including the author...

  2. Benthic invertebrate fauna, small streams

    Science.gov (United States)

    J. Bruce Wallace; S.L. Eggert

    2009-01-01

    Small streams (first- through third-order streams) make up >98% of the total number of stream segments and >86% of stream length in many drainage networks. Small streams occur over a wide array of climates, geology, and biomes, which influence temperature, hydrologic regimes, water chemistry, light, substrate, stream permanence, a basin's terrestrial plant...

  3. Multilevel Analysis in Analyzing Speech Data

    Science.gov (United States)

    Guddattu, Vasudeva; Krishna, Y.

    2011-01-01

    The speech produced by human vocal tract is a complex acoustic signal, with diverse applications in phonetics, speech synthesis, automatic speech recognition, speaker identification, communication aids, speech pathology, speech perception, machine translation, hearing research, rehabilitation and assessment of communication disorders and many…

  4. Speech-Language Therapy (For Parents)

    Science.gov (United States)

    ... most kids with speech and/or language disorders. Speech Disorders, Language Disorders, and Feeding Disorders A speech ...

  5. [Improving speech comprehension using a new cochlear implant speech processor].

    Science.gov (United States)

    Müller-Deile, J; Kortmann, T; Hoppe, U; Hessel, H; Morsnowski, A

    2009-06-01

    The aim of this multicenter clinical field study was to assess the benefits of the new Freedom 24 sound processor for cochlear implant (CI) users implanted with the Nucleus 24 cochlear implant system. The study included 48 postlingually profoundly deaf experienced CI users who demonstrated speech comprehension performance with their current speech processor on the Oldenburg sentence test (OLSA) in quiet conditions of at least 80% correct scores and who were able to perform adaptive speech threshold testing using the OLSA in noisy conditions. Following baseline measures of speech comprehension performance with their current speech processor, subjects were upgraded to the Freedom 24 speech processor. After a take-home trial period of at least 2 weeks, subject performance was evaluated by measuring the speech reception threshold with the Freiburg multisyllabic word test and speech intelligibility with the Freiburg monosyllabic word test at 50 dB and 70 dB in the sound field. The results demonstrated highly significant benefits for speech comprehension with the new speech processor. Significant benefits for speech comprehension were also demonstrated with the new speech processor when tested in competing background noise. In contrast, use of the Abbreviated Profile of Hearing Aid Benefit (APHAB) did not prove to be a suitably sensitive assessment tool for comparative subjective self-assessment of hearing benefits with each processor. Use of the preprocessing algorithm known as adaptive dynamic range optimization (ADRO) in the Freedom 24 led to additional improvements over the standard upgrade map for speech comprehension in quiet and showed equivalent performance in noise. Through use of the preprocessing beam-forming algorithm BEAM, subjects demonstrated a highly significant improved signal-to-noise ratio for speech comprehension thresholds (i.e., signal-to-noise ratio for 50% speech comprehension scores) when tested with an adaptive procedure using the Oldenburg

  6. Neurophysiology of speech differences in childhood apraxia of speech.

    Science.gov (United States)

    Preston, Jonathan L; Molfese, Peter J; Gumkowski, Nina; Sorcinelli, Andrea; Harwood, Vanessa; Irwin, Julia R; Landi, Nicole

    2014-01-01

    Event-related potentials (ERPs) were recorded during a picture naming task of simple and complex words in children with typical speech and with childhood apraxia of speech (CAS). Results reveal reduced amplitude prior to speaking complex (multisyllabic) words relative to simple (monosyllabic) words for the CAS group over the right hemisphere during a time window thought to reflect phonological encoding of word forms. Group differences were also observed prior to production of spoken tokens regardless of word complexity during a time window just prior to speech onset (thought to reflect motor planning/programming). Results suggest differences in pre-speech neurolinguistic processes.

  7. Human Superior Temporal Gyrus Organization of Spectrotemporal Modulation Tuning Derived from Speech Stimuli.

    Science.gov (United States)

    Hullett, Patrick W; Hamilton, Liberty S; Mesgarani, Nima; Schreiner, Christoph E; Chang, Edward F

    2016-02-10

    The human superior temporal gyrus (STG) is critical for speech perception, yet the organization of spectrotemporal processing of speech within the STG is not well understood. Here, to characterize the spatial organization of spectrotemporal processing of speech across human STG, we use high-density cortical surface field potential recordings while participants listened to natural continuous speech. While synthetic broad-band stimuli did not yield sustained activation of the STG, spectrotemporal receptive fields could be reconstructed from vigorous responses to speech stimuli. We find that the human STG displays a robust anterior-posterior spatial distribution of spectrotemporal tuning in which the posterior STG is tuned for temporally fast varying speech sounds that have relatively constant energy across the frequency axis (low spectral modulation) while the anterior STG is tuned for temporally slow varying speech sounds that have a high degree of spectral variation across the frequency axis (high spectral modulation). This work illustrates organization of spectrotemporal processing in the human STG, and illuminates processing of ethologically relevant speech signals in a region of the brain specialized for speech perception. Considerable evidence has implicated the human superior temporal gyrus (STG) in speech processing. However, the gross organization of spectrotemporal processing of speech within the STG is not well characterized. Here we use natural speech stimuli and advanced receptive field characterization methods to show that spectrotemporal features within speech are well organized along the posterior-to-anterior axis of the human STG. These findings demonstrate robust functional organization based on spectrotemporal modulation content, and illustrate that much of the encoded information in the STG represents the physical acoustic properties of speech stimuli.

  8. Speech endpoint detection with non-language speech sounds for generic speech processing applications

    Science.gov (United States)

    McClain, Matthew; Romanowski, Brian

    2009-05-01

    Non-language speech sounds (NLSS) are sounds produced by humans that do not carry linguistic information. Examples of these sounds are coughs, clicks, breaths, and filled pauses such as "uh" and "um" in English. NLSS are prominent in conversational speech, but can be a significant source of errors in speech processing applications. Traditionally, these sounds are ignored by speech endpoint detection algorithms, where speech regions are identified in the audio signal prior to processing. The ability to filter NLSS as a pre-processing step can significantly enhance the performance of many speech processing applications, such as speaker identification, language identification, and automatic speech recognition. In order to be used in all such applications, NLSS detection must be performed without the use of language models that provide knowledge of the phonology and lexical structure of speech. This is especially relevant to situations where the languages used in the audio are not known a priori. We present the results of preliminary experiments using data from American and British English speakers, in which segments of audio are classified as language speech sounds (LSS) or NLSS using a set of acoustic features designed for language-agnostic NLSS detection and a hidden-Markov model (HMM) to model speech generation. The results of these experiments indicate that the features and model used are capable of detecting certain types of NLSS, such as breaths and clicks, while detection of other types of NLSS such as filled pauses will require future research.

  9. Solar wind stream interfaces

    International Nuclear Information System (INIS)

    Gosling, J.T.; Asbridge, J.R.; Bame, S.J.; Feldman, W.C.

    1978-01-01

    Measurements aboard Imp 6, 7, and 8 reveal that approximately one third of all high-speed solar wind streams observed at 1 AU contain a sharp boundary (of thickness less than approx. 4 × 10^4 km) near their leading edge, called a stream interface, which separates plasma of distinctly different properties and origins. Identified as discontinuities across which the density drops abruptly, the proton temperature increases abruptly, and the speed rises, stream interfaces are remarkably similar in character from one stream to the next. A superposed epoch analysis of plasma data has been performed for 23 discontinuous stream interfaces observed during the interval March 1971 through August 1974. Among the results of this analysis are the following: (1) a stream interface separates what was originally thick (i.e., dense) slow gas from what was originally thin (i.e., rare) fast gas; (2) the interface is the site of a discontinuous shear in the solar wind flow in a frame of reference corotating with the sun; (3) stream interfaces occur at speeds less than 450 km s^-1 and close to or at the maximum of the pressure ridge at the leading edges of high-speed streams; (4) a discontinuous rise by approx. 40% in electron temperature occurs at the interface; and (5) discontinuous changes (usually rises) in alpha particle abundance and flow speed relative to the protons occur at the interface. Stream interfaces do not generally recur on successive solar rotations, even though the streams in which they are embedded often do. At distances beyond several astronomical units, stream interfaces should be bounded by forward-reverse shock pairs; three of four reverse shocks observed at 1 AU during 1971-1974 were preceded within approx. 1 day by stream interfaces. Our observations suggest that many streams close to the sun are bounded on all sides by large radial velocity shears separating rapidly expanding plasma from more slowly expanding plasma.

  10. Phonetic search methods for large speech databases

    CERN Document Server

    Moyal, Ami; Tetariy, Ella; Gishri, Michal

    2013-01-01

    “Phonetic Search Methods for Large Databases” focuses on Keyword Spotting (KWS) within large speech databases. The brief will begin by outlining the challenges associated with Keyword Spotting within large speech databases using dynamic keyword vocabularies. It will then continue by highlighting the various market segments in need of KWS solutions, as well as the specific requirements of each market segment. The work also includes a detailed description of the complexity of the task and the different methods that are used, including the advantages and disadvantages of each method and an in-depth comparison. The main focus will be on the Phonetic Search method and its efficient implementation. This will include a literature review of the various methods used for the efficient implementation of Phonetic Search Keyword Spotting, with an emphasis on the authors’ own research which entails a comparative analysis of the Phonetic Search method which includes algorithmic details. This brief is useful for resea...

  11. VPipe: Virtual Pipelining for Scheduling of DAG Stream Query Plans

    Science.gov (United States)

    Wang, Song; Gupta, Chetan; Mehta, Abhay

    There are data streams all around us that can be harnessed for tremendous business and personal advantage. For an enterprise-level stream processing system such as CHAOS [1] (Continuous, Heterogeneous Analytic Over Streams), handling of complex query plans with resource constraints is challenging. While several scheduling strategies exist for stream processing, efficient scheduling of complex DAG query plans is still largely unsolved. In this paper, we propose a novel execution scheme for scheduling complex directed acyclic graph (DAG) query plans with meta-data enriched stream tuples. Our solution, called Virtual Pipelined Chain (or VPipe Chain for short), effectively extends the "Chain" pipelining scheduling approach to complex DAG query plans.

  12. Abortion and compelled physician speech.

    Science.gov (United States)

    Orentlicher, David

    2015-01-01

    Informed consent mandates for abortion providers may infringe the First Amendment's freedom of speech. On the other hand, they may reinforce the physician's duty to obtain informed consent. Courts can promote both doctrines by ensuring that compelled physician speech pertains to medical facts about abortion rather than abortion ideology and that compelled speech is truthful and not misleading. © 2015 American Society of Law, Medicine & Ethics, Inc.

  13. Speech Recognition on Mobile Devices

    DEFF Research Database (Denmark)

    Tan, Zheng-Hua; Lindberg, Børge

    2010-01-01

    The enthusiasm of deploying automatic speech recognition (ASR) on mobile devices is driven both by remarkable advances in ASR technology and by the demand for efficient user interfaces on such devices as mobile phones and personal digital assistants (PDAs). This chapter presents an overview of ASR in the mobile context covering motivations, challenges, fundamental techniques and applications. Three ASR architectures are introduced: embedded speech recognition, distributed speech recognition and network speech recognition. Their pros and cons and implementation issues are discussed. Applications within...

  14. Distance Learning: Effectiveness of an Interdisciplinary Course in Speech Pathology and Dentistry

    Science.gov (United States)

    Ramos, Janine Santos; da Silva, Letícia Korb; Pinzan, Arnaldo; de Castro Rodrigues, Antonio; Berretin-Felix, Giédre

    2015-01-01

    Objective: Evaluate the effectiveness of distance learning courses for the purpose of interdisciplinary continuing education in Speech Pathology and Dentistry. Methods: The online course was made available on the Moodle platform. A total of 30 undergraduates participated in the study (15 from the Dentistry course and 15 from the Speech Pathology…

  15. Fraser and the Cheerleader: Values and the Boundaries of Student Speech

    Science.gov (United States)

    Ehrensal, Patricia A. L.

    2012-01-01

    Student speech has and continues to be a contested issue in schools. The Supreme Court ruled in "Tinker" that students do not shed their rights at the schoolhouse gate; in the "Kuhlmeier" and "Fraser" decisions, however, the Court gave school officials greater latitude in regulating student speech, especially when it…

  16. Academic Freedom in Classroom Speech: A Heuristic Model for U.S. Catholic Higher Education

    Science.gov (United States)

    Jacobs, Richard M.

    2010-01-01

    As the nation's Catholic universities and colleges continually clarify their identity, this article examines academic freedom in classroom speech, offering a heuristic model for use as board members, academic administrators, and faculty leaders discuss, evaluate, and judge allegations of misconduct in classroom speech. Focusing upon the practice…

  17. The significance of small streams

    Science.gov (United States)

    Wohl, Ellen

    2017-09-01

    Headwaters, defined here as first- and second-order streams, make up 70%‒80% of the total channel length of river networks. These small streams exert a critical influence on downstream portions of the river network by: retaining or transmitting sediment and nutrients; providing habitat and refuge for diverse aquatic and riparian organisms; creating migration corridors; and governing connectivity at the watershed-scale. The upstream-most extent of the channel network and the longitudinal continuity and lateral extent of headwaters can be difficult to delineate, however, and people are less likely to recognize the importance of headwaters relative to other portions of a river network. Consequently, headwaters commonly lack the legal protections accorded to other portions of a river network and are more likely to be significantly altered or completely obliterated by land use.

  18. Current trends in multilingual speech processing

    Indian Academy of Sciences (India)

    2016-08-26

    ; speech-to-speech translation; language identification. ... interest owing to two strong driving forces. Firstly, technical advances in speech recognition and synthesis are posing new challenges and opportunities to researchers.

  19. Stream Response to an Extreme Defoliation Event

    Science.gov (United States)

    Gold, A.; Loffredo, J.; Addy, K.; Bernhardt, E. S.; Berdanier, A. B.; Schroth, A. W.; Inamdar, S. P.; Bowden, W. B.

    2017-12-01

    Extreme climatic events are known to profoundly impact stream flow and stream fluxes. These events can also exert controls on insect outbreaks, which may create marked changes in stream characteristics. The invasive Gypsy Moth (Lymantria dispar dispar) experiences episodic infestations based on extreme climatic conditions within the northeastern U.S. In most years, gypsy moth populations are kept in check by diseases. In 2016 - after successive years of unusually warm, dry spring and summer weather - gypsy moth caterpillars defoliated over half of Rhode Island's 160,000 forested ha. No defoliation of this magnitude had occurred for more than 30 years. We examined one RI headwater stream's response to the defoliation event in 2016 compared with comparable data in 2014 and 2015. Stream temperature and flow were gauged continuously by USGS and dissolved oxygen (DO) was measured with a YSI EXO2 sonde every 30 minutes during a series of deployments in the spring, summer and fall from 2014-2016. We used the single station, open channel method to estimate stream metabolism metrics. We also assessed local climate and stream temperature data from 2009-2016. We observed changes in stream responses during the defoliation event that suggest changes in ET, solar radiation and heat flux. Although the summer of 2016 had more drought stress (PDSI) than previous years, stream flow occurred throughout the summer, in contrast to several years with lower drought stress when stream flow ceased. Air temperature in 2016 was similar to prior years, but stream temperature was substantially higher than the prior seven years, likely due to the loss of canopy shading. DO declined dramatically in 2016 compared to prior years - more than the rising stream temperatures would indicate. Gross Primary Productivity was significantly higher during the year of the defoliation, indicating more total fixation of inorganic carbon from photo-autotrophs. In 2016, Ecosystem Respiration was also higher and Net

  20. Inventory of miscellaneous streams

    International Nuclear Information System (INIS)

    Lueck, K.J.

    1995-09-01

    On December 23, 1991, the US Department of Energy, Richland Operations Office (RL) and the Washington State Department of Ecology (Ecology) agreed to adhere to the provisions of the Department of Ecology Consent Order. The Consent Order lists the regulatory milestones for liquid effluent streams at the Hanford Site to comply with the permitting requirements of Washington Administrative Code. The RL provided the US Congress a Plan and Schedule to discontinue disposal of contaminated liquid effluent into the soil column on the Hanford Site. The plan and schedule document contained a strategy for the implementation of alternative treatment and disposal systems. This strategy included prioritizing the streams into two phases. The Phase 1 streams were considered to be higher priority than the Phase 2 streams. The actions recommended for the Phase 1 and 2 streams in the two reports were incorporated in the Hanford Federal Facility Agreement and Consent Order. Miscellaneous Streams are those liquid effluent streams identified within the Consent Order that are discharged to the ground but are not categorized as Phase 1 or Phase 2 Streams. This document consists of an inventory of the liquid effluent streams being discharged into the Hanford soil column

  1. Hydrography - Streams and Shorelines

    Data.gov (United States)

    California Natural Resource Agency — The hydrography layer consists of flowing waters (rivers and streams), standing waters (lakes and ponds), and wetlands -- both natural and manmade. Two separate...

  2. Adaptation to delayed auditory feedback induces the temporal recalibration effect in both speech perception and production.

    Science.gov (United States)

    Yamamoto, Kosuke; Kawabata, Hideaki

    2014-12-01

    We ordinarily speak fluently, even though our perceptions of our own voices are disrupted by various environmental acoustic properties. The underlying mechanism of speech is supposed to monitor the temporal relationship between speech production and the perception of auditory feedback, as suggested by a reduction in speech fluency when the speaker is exposed to delayed auditory feedback (DAF). While many studies have reported that DAF influences speech motor processing, its relationship to the temporal tuning effect on multimodal integration, or temporal recalibration, remains unclear. We investigated whether the temporal aspects of both speech perception and production change due to adaptation to the delay between the motor sensation and the auditory feedback. This is a well-used method of inducing temporal recalibration. Participants continually read texts with specific DAF times in order to adapt to the delay. Then, they judged the simultaneity between the motor sensation and the vocal feedback. We measured the rates of speech with which participants read the texts in both the exposure and re-exposure phases. We found that exposure to DAF changed both the rate of speech and the simultaneity judgment, that is, participants' speech gained fluency. Although we also found that a delay of 200 ms appeared to be most effective in decreasing the rates of speech and shifting the distribution on the simultaneity judgment, there was no correlation between these measurements. These findings suggest that both speech motor production and multimodal perception are adaptive to temporal lag but are processed in distinct ways.

  3. Multimodal Speech Capture System for Speech Rehabilitation and Learning.

    Science.gov (United States)

    Sebkhi, Nordine; Desai, Dhyey; Islam, Mohammad; Lu, Jun; Wilson, Kimberly; Ghovanloo, Maysam

    2017-11-01

    Speech-language pathologists (SLPs) are trained to correct articulation of people diagnosed with motor speech disorders by analyzing articulators' motion and assessing speech outcome while patients speak. To assist SLPs in this task, we present the multimodal speech capture system (MSCS), which records and displays kinematics of key speech articulators, the tongue and lips, along with voice, using unobtrusive methods. Collected speech modalities (tongue motion, lip gestures, and voice) are visualized not only in real-time to provide patients with instant feedback but also offline to allow SLPs to perform post-analysis of articulators' motion, particularly the tongue, with its prominent but hardly visible role in articulation. We describe the MSCS hardware and software components, and demonstrate its basic visualization capabilities by a healthy individual repeating the words "Hello World." A proof-of-concept prototype has been successfully developed for this purpose, and will be used in future clinical studies to evaluate its potential impact on accelerating speech rehabilitation by enabling patients to speak naturally. Pattern matching algorithms to be applied to the collected data can provide patients with quantitative and objective feedback on their speech performance, unlike current methods that are mostly subjective, and may vary from one SLP to another.

  4. Measurement of speech parameters in casual speech of dementia patients

    NARCIS (Netherlands)

    Ossewaarde, Roelant; Jonkers, Roel; Jalvingh, Fedor; Bastiaanse, Yvonne

    Roelant Adriaan Ossewaarde (1, 2), Roel Jonkers (1), Fedor Jalvingh (1, 3), Roelien Bastiaanse (1). Affiliations: (1) CLCG, University of Groningen (NL); (2) HU University of Applied Sciences Utrecht (NL); (3) St. Marienhospital - Vechta, Geriatric Clinic Vechta

  5. Alternative Speech Communication System for Persons with Severe Speech Disorders

    Science.gov (United States)

    Selouani, Sid-Ahmed; Sidi Yakoub, Mohammed; O'Shaughnessy, Douglas

    2009-12-01

    Assistive speech-enabled systems are proposed to help both French and English speaking persons with various speech disorders. The proposed assistive systems use automatic speech recognition (ASR) and speech synthesis in order to enhance the quality of communication. These systems aim at improving the intelligibility of pathologic speech making it as natural as possible and close to the original voice of the speaker. The resynthesized utterances use new basic units, a new concatenating algorithm and a grafting technique to correct the poorly pronounced phonemes. The ASR responses are uttered by the new speech synthesis system in order to convey an intelligible message to listeners. Experiments involving four American speakers with severe dysarthria and two Acadian French speakers with sound substitution disorders (SSDs) are carried out to demonstrate the efficiency of the proposed methods. An improvement of the Perceptual Evaluation of the Speech Quality (PESQ) value of 5% and more than 20% is achieved by the speech synthesis systems that deal with SSD and dysarthria, respectively.

  6. Speech Perception as a Multimodal Phenomenon

    OpenAIRE

    Rosenblum, Lawrence D.

    2008-01-01

    Speech perception is inherently multimodal. Visual speech (lip-reading) information is used by all perceivers and readily integrates with auditory speech. Imaging research suggests that the brain treats auditory and visual speech similarly. These findings have led some researchers to consider that speech perception works by extracting amodal information that takes the same form across modalities. From this perspective, speech integration is a property of the input information itself. Amodal s...

  7. How early do children understand gesture-speech combinations with iconic gestures?

    Science.gov (United States)

    Stanfield, Carmen; Williamson, Rebecca; Ozçalişkan, Seyda

    2014-03-01

    Children understand gesture+speech combinations in which a deictic gesture adds new information to the accompanying speech by age 1;6 (Morford & Goldin-Meadow, 1992; 'push'+point at ball). This study explores how early children understand gesture+speech combinations in which an iconic gesture conveys additional information not found in the accompanying speech (e.g., 'read'+BOOK gesture). Our analysis of two- to four-year-old children's responses in a gesture+speech comprehension task showed that children grasp the meaning of iconic co-speech gestures by age three and continue to improve their understanding with age. Overall, our study highlights the important role gesture plays in language comprehension as children learn to unpack increasingly complex communications addressed to them at the early ages.

  8. The Effects of Background Noise on the Performance of an Automatic Speech Recogniser

    Science.gov (United States)

    Littlefield, Jason; HashemiSakhtsari, Ahmad

    2002-11-01

    Ambient or environmental noise is a major factor that affects the performance of an automatic speech recognizer. Large vocabulary, speaker-dependent, continuous speech recognizers are commercially available. Speech recognizers perform well in a quiet environment, but poorly in a noisy environment. Speaker-dependent speech recognizers require training before being tested, and the level of background noise in both phases affects the performance of the recognizer. This study aims to determine whether the best performance of a speech recognizer occurs when the levels of background noise during the training and test phases are the same, and how the performance is affected when the levels of background noise during the training and test phases are different. The relationship between the performance of the speech recognizer and upgrading the computer speed and amount of memory as well as software version was also investigated.

  9. Auditory Modeling for Noisy Speech Recognition

    National Research Council Canada - National Science Library

    2000-01-01

    ... digital filtering for noise cancellation which interfaces to speech recognition software. It uses auditory features in speech recognition training, and provides applications to multilingual spoken language translation...

  10. Teaching Speech Acts

    Directory of Open Access Journals (Sweden)

    Teaching Speech Acts

    2007-01-01

    In this paper I argue that pragmatic ability must become part of what we teach in the classroom if we are to realize the goals of communicative competence for our students. I review the research on pragmatics, especially those articles that point to the effectiveness of teaching pragmatics in an explicit manner, and those that posit methods for teaching. I also note two areas of scholarship that address classroom needs—the use of authentic data and appropriate assessment tools. The essay concludes with a summary of my own experience teaching speech acts in an advanced-level Portuguese class.

  11. Controlling the acoustic streaming by pulsed ultrasounds.

    Science.gov (United States)

    Hoyos, Mauricio; Castro, Angélica

    2013-01-01

    We propose a technique based on pulsed ultrasounds for controlling the acoustic streaming, and reducing it to a minimum observable value, in closed ultrasonic standing wave fluidic resonators. By modifying the number of pulses and the repetition time it is possible to reduce the velocity of the acoustic streaming with respect to the velocity generated by the continuous ultrasound mode of operation. The acoustic streaming is observed at the nodal plane where a suspension of 800 nm latex particles was focused by primary radiation force. A mixture of 800 nm and 15 μm latex particles has also been used for showing that the acoustic streaming is hardly reduced while primary and secondary forces continue to operate. The parameter we call the "pulse mode factor", i.e., the time of applied ultrasound divided by the duty cycle, is found to be the adequate parameter that controls the acoustic streaming. We demonstrate that pulsed ultrasound is more efficient for controlling the acoustic streaming than the variation of the amplitude of the standing waves. Copyright © 2012 Elsevier B.V. All rights reserved.

  12. Different neurophysiological mechanisms underlying word and rule extraction from speech.

    Directory of Open Access Journals (Sweden)

    Ruth De Diego Balaguer

    The initial process of identifying words from spoken language and the detection of more subtle regularities underlying their structure are mandatory processes for language acquisition. Little is known about the cognitive mechanisms that allow us to extract these two types of information and their specific time-course of acquisition following initial contact with a new language. We report time-related electrophysiological changes that occurred while participants learned an artificial language. These changes strongly correlated with the discovery of the structural rules embedded in the words. These changes were clearly different from those related to word learning and occurred during the first minutes of exposure. There is a functional distinction in the nature of the electrophysiological signals during acquisition: an increase in negativity (N400) in the central electrodes is related to word-learning and development of a frontal positivity (P2) is related to rule-learning. In addition, the results of an online implicit and a post-learning test indicate that, once the rules of the language have been acquired, new words following the rule are processed as words of the language. By contrast, new words violating the rule induce syntax-related electrophysiological responses when inserted online in the stream (an early frontal negativity followed by a late posterior positivity) and clear lexical effects when presented in isolation (N400 modulation). The present study provides direct evidence suggesting that the mechanisms to extract words and structural dependencies from continuous speech are functionally segregated. When these mechanisms are engaged, the electrophysiological marker associated with rule-learning appears very quickly, during the earliest phases of exposure to a new language.

  13. Speech Training for Inmate Rehabilitation.

    Science.gov (United States)

    Parkinson, Michael G.; Dobkins, David H.

    1982-01-01

    Using a computerized content analysis, the authors demonstrate changes in speech behaviors of prison inmates. They conclude that two to four hours of public speaking training can have only limited effect on students who live in a culture in which "prison speech" is the expected and rewarded form of behavior. (PD)

  14. Separating Underdetermined Convolutive Speech Mixtures

    DEFF Research Database (Denmark)

    Pedersen, Michael Syskind; Wang, DeLiang; Larsen, Jan

    2006-01-01

    a method for underdetermined blind source separation of convolutive mixtures. The proposed framework is applicable for separation of instantaneous as well as convolutive speech mixtures. It is possible to iteratively extract each speech signal from the mixture by combining blind source separation...

  15. Speech recognition from spectral dynamics

    Indian Academy of Sciences (India)

    Some of the history of gradual infusion of the modulation spectrum concept into Automatic recognition of speech (ASR) comes next, pointing to the relationship of modulation spectrum processing to well-accepted ASR techniques such as dynamic speech features or RelAtive SpecTrAl (RASTA) filtering. Next, the frequency ...

  16. Speech Prosody in Cerebellar Ataxia

    Science.gov (United States)

    Casper, Maureen A.; Raphael, Lawrence J.; Harris, Katherine S.; Geibel, Jennifer M.

    2007-01-01

    Persons with cerebellar ataxia exhibit changes in physical coordination and speech and voice production. Previously, these alterations of speech and voice production were described primarily via perceptual coordinates. In this study, the spatial-temporal properties of syllable production were examined in 12 speakers, six of whom were healthy…

  17. Cortical oscillations and entrainment in speech processing during working memory load.

    Science.gov (United States)

    Hjortkjaer, Jens; Märcher-Rørsted, Jonatan; Fuglsang, Søren A; Dau, Torsten

    2018-02-02

    Neuronal oscillations are thought to play an important role in working memory (WM) and speech processing. Listening to speech in real-life situations is often cognitively demanding but it is unknown whether WM load influences how auditory cortical activity synchronizes to speech features. Here, we developed an auditory n-back paradigm to investigate cortical entrainment to speech envelope fluctuations under different degrees of WM load. We measured the electroencephalogram, pupil dilations and behavioural performance from 22 subjects listening to continuous speech with an embedded n-back task. The speech stimuli consisted of long spoken number sequences created to match natural speech in terms of sentence intonation, syllabic rate and phonetic content. To burden different WM functions during speech processing, listeners performed an n-back task on the speech sequences in different levels of background noise. Increasing WM load at higher n-back levels was associated with a decrease in posterior alpha power as well as increased pupil dilations. Frontal theta power increased at the start of the trial and increased additionally with higher n-back level. The observed alpha-theta power changes are consistent with visual n-back paradigms suggesting general oscillatory correlates of WM processing load. Speech entrainment was measured as a linear mapping between the envelope of the speech signal and low-frequency cortical activity (level) decreased cortical speech envelope entrainment. Although entrainment persisted under high load, our results suggest a top-down influence of WM processing on cortical speech entrainment. © 2018 The Authors. European Journal of Neuroscience published by Federation of European Neuroscience Societies and John Wiley & Sons Ltd.

  18. A Network Model of Observation and Imitation of Speech

    Science.gov (United States)

    Mashal, Nira; Solodkin, Ana; Dick, Anthony Steven; Chen, E. Elinor; Small, Steven L.

    2012-01-01

    Much evidence has now accumulated demonstrating and quantifying the extent of shared regional brain activation for observation and execution of speech. However, the nature of the actual networks that implement these functions, i.e., both the brain regions and the connections among them, and the similarities and differences across these networks have not been elucidated. The current study aims to characterize formally a network for observation and imitation of syllables in the healthy adult brain and to compare their structure and effective connectivity. Eleven healthy participants observed or imitated audiovisual syllables spoken by a human actor. We constructed four structural equation models to characterize the networks for observation and imitation in each of the two hemispheres. Our results show that the network models for observation and imitation comprise the same essential structure but differ in important ways from each other (in both hemispheres) based on connectivity. In particular, our results show that the connections from posterior superior temporal gyrus and sulcus to ventral premotor, ventral premotor to dorsal premotor, and dorsal premotor to primary motor cortex in the left hemisphere are stronger during imitation than during observation. The first two connections are implicated in a putative dorsal stream of speech perception, thought to involve translating auditory speech signals into motor representations. Thus, the current results suggest that flow of information during imitation, starting at the posterior superior temporal cortex and ending in the motor cortex, enhances input to the motor cortex in the service of speech execution. PMID:22470360

  19. Comparison of Classification Methods for Detecting Emotion from Mandarin Speech

    Science.gov (United States)

    Pao, Tsang-Long; Chen, Yu-Te; Yeh, Jun-Heng

    It is said that technology comes out from humanity. What is humanity? The very definition of humanity is emotion. Emotion is the basis for all human expression and the underlying theme behind everything that is done, said, thought or imagined. If computers are able to perceive and respond to human emotion, human-computer interaction will be more natural. Several classifiers are adopted for automatically assigning an emotion category, such as anger, happiness or sadness, to a speech utterance. These classifiers were designed independently and tested on various emotional speech corpora, making it difficult to compare and evaluate their performance. In this paper, we first compared several popular classification methods and evaluated their performance by applying them to a Mandarin speech corpus consisting of five basic emotions, including anger, happiness, boredom, sadness and neutral. The extracted feature streams contain MFCC, LPCC, and LPC. The experimental results show that the proposed WD-MKNN classifier achieves an accuracy of 81.4% for the 5-class emotion recognition and outperforms other classification techniques, including KNN, MKNN, DW-KNN, LDA, QDA, GMM, HMM, SVM, and BPNN. Then, to verify the advantage of the proposed method, we compared these classifiers by applying them to another Mandarin expressive speech corpus consisting of two emotions. The experimental results still show that the proposed WD-MKNN outperforms the others.
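
    As a rough illustration of the simplest baseline named in this abstract, the sketch below implements plain k-nearest-neighbour (KNN) voting. It is not the proposed WD-MKNN classifier, whose weighting scheme the abstract does not describe. The function name `knn_predict`, the two-dimensional toy vectors and the labels are all assumptions made for the example; real inputs would be MFCC, LPCC or LPC feature vectors.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (feature_vector, label) pairs. This is plain,
    unweighted KNN, not the WD-MKNN variant from the abstract.
    """
    # Sort training points by Euclidean distance to the query.
    nearest = sorted((math.dist(x, query), label) for x, label in train)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D features standing in for real acoustic feature vectors.
train = [
    ((0.0, 0.0), "neutral"), ((0.1, 0.2), "neutral"), ((0.2, 0.1), "neutral"),
    ((1.0, 1.0), "anger"), ((0.9, 1.1), "anger"), ((1.1, 0.9), "anger"),
]

print(knn_predict(train, (0.15, 0.10)))
print(knn_predict(train, (1.00, 0.95)))
```

    Weighted or modified variants would change only the voting step, for example by letting closer neighbours count for more.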

  20. Dynamic Source Selection to Handle Changes of User’s Interest in Continuous Query

    OpenAIRE

    Ohki, Kosuke; Watanabe, Yousuke; Kitagawa, Hiroyuki

    2010-01-01

    The volume of stream data delivered from information sources has been increasing. The demand for efficient processing of stream data has become more and more important. Stream processing systems [1] can continuously process stream data according to users' requests. A request is usually specified as a continuous query written in an SQL-like language.

  1. Lexical and sublexical units in speech perception.

    Science.gov (United States)

    Giroux, Ibrahima; Rey, Arnaud

    2009-03-01

    Saffran, Newport, and Aslin (1996a) found that human infants are sensitive to statistical regularities corresponding to lexical units when hearing an artificial spoken language. Two sorts of segmentation strategies have been proposed to account for this early word-segmentation ability: bracketing strategies, in which infants are assumed to insert boundaries into continuous speech, and clustering strategies, in which infants are assumed to group certain speech sequences together into units (Swingley, 2005). In the present study, we test the predictions of two computational models instantiating each of these strategies (i.e., Serial Recurrent Networks: Elman, 1990; and Parser: Perruchet & Vinter, 1998) in an experiment where we compare the lexical and sublexical recognition performance of adults after hearing 2 or 10 min of an artificial spoken language. The results are consistent with Parser's predictions and the clustering approach, showing that performance on words is better than performance on part-words only after 10 min. This result suggests that word segmentation abilities are not merely due to stronger associations between sublexical units but to the emergence of stronger lexical representations during the development of speech perception processes. Copyright © 2009, Cognitive Science Society, Inc.
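
    The bracketing idea in this abstract, inserting a boundary where the statistical link between adjacent syllables is weak, can be sketched with forward transitional probabilities. This is an illustrative toy only, not either of the models the study tests (recurrent networks and Parser). The three made-up words, the word order and the 0.9 threshold are assumptions chosen for the example.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate P(next | current) for every adjacent syllable pair."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(syllables, threshold):
    """Insert a word boundary wherever the transitional probability is low.

    Assumes a non-empty syllable list.
    """
    tp = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tp[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words


# Hypothetical artificial language: three trisyllabic words, no pauses.
words = {"A": ["tu", "pi", "ro"], "B": ["go", "la", "bu"], "C": ["bi", "da", "ku"]}
order = "ABCACBABCBAC"
stream = [syllable for w in order for syllable in words[w]]

print(segment(stream, threshold=0.9))
```

    Within a word each syllable always predicts the next, so those transitions have probability 1.0, while transitions across word boundaries are split between several continuations and fall below the threshold.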

  2. On speech recognition during anaesthesia

    DEFF Research Database (Denmark)

    Alapetite, Alexandre

    2007-01-01

    This PhD thesis in human-computer interfaces (informatics) studies the case of the anaesthesia record used during medical operations and the possibility to supplement it with speech recognition facilities. Problems and limitations have been identified with the traditional paper-based anaesthesia...... and inaccuracies in the anaesthesia record. Supplementing the electronic anaesthesia record interface with speech input facilities is proposed as one possible solution to a part of the problem. The testing of the various hypotheses has involved the development of a prototype of an electronic anaesthesia record...... interface with speech input facilities in Danish. The evaluation of the new interface was carried out in a full-scale anaesthesia simulator. This has been complemented by laboratory experiments on several aspects of speech recognition for this type of use, e.g. the effects of noise on speech recognition...

  3. From Gesture to Speech

    Directory of Open Access Journals (Sweden)

    Maurizio Gentilucci

    2012-11-01

    One of the major problems concerning the evolution of human language is to understand how sounds became associated with meaningful gestures. It has been proposed that the circuit controlling gestures and speech evolved from a circuit involved in the control of arm and mouth movements related to ingestion. This circuit contributed to the evolution of spoken language, moving from a system of communication based on arm gestures. The discovery of the mirror neurons has provided strong support for the gestural theory of speech origin because they offer a natural substrate for the embodiment of language and create a direct link between sender and receiver of a message. Behavioural studies indicate that manual gestures are linked to mouth movements used for syllable emission. Grasping with the hand selectively affected movement of inner or outer parts of the mouth according to syllable pronunciation and hand postures, in addition to hand actions, influenced the control of mouth grasp and vocalization. Gestures and words are also related to each other. It was found that when producing communicative gestures (emblems), the intention to interact directly with a conspecific was transferred from gestures to words, inducing modification in voice parameters. Transfer effects of the meaning of representational gestures were found on both vocalizations and meaningful words. It has been concluded that the results of our studies suggest the existence of a system relating gesture to vocalization which was a precursor of a more general system reciprocally relating gesture to word.

  4. LHCb trigger streams optimization

    Science.gov (United States)

    Derkach, D.; Kazeev, N.; Neychev, R.; Panin, A.; Trofimov, I.; Ustyuzhanin, A.; Vesterinen, M.

    2017-10-01

    The LHCb experiment stores around 10^11 collision events per year. A typical physics analysis deals with a final sample of up to 10^7 events. Event preselection algorithms (lines) are used for data reduction. Since the data are stored in a format that requires sequential access, the lines are grouped into several output file streams, in order to increase the efficiency of user analysis jobs that read these data. The scheme efficiency heavily depends on the stream composition. By putting similar lines together and balancing the stream sizes it is possible to reduce the overhead. We present a method for finding an optimal stream composition. The method is applied to a part of the LHCb data (Turbo stream) on the stage where it is prepared for user physics analysis. This results in an expected improvement of 15% in the speed of user analysis jobs, and will be applied on data to be recorded in 2017.

  5. Asteroid/meteorite streams

    Science.gov (United States)

    Drummond, J.

    The independent discovery of the same three streams (named alpha, beta, and gamma) among 139 Earth approaching asteroids and among 89 meteorite producing fireballs presents the possibility of matching specific meteorites to specific asteroids, or at least to asteroids in the same stream and, therefore, presumably of the same composition. Although perhaps of limited practical value, the three meteorites with known orbits are all ordinary chondrites. To identify, in general, the taxonomic type of the parent asteroid, however, would be of great scientific interest since these most abundant meteorite types cannot be unambiguously spectrally matched to an asteroid type. The H5 Pribram meteorite and asteroid 4486 (unclassified) are not part of a stream, but travel in fairly similar orbits. The LL5 Innisfree meteorite is orbitally similar to asteroid 1989DA (unclassified), and both are members of a fourth stream (delta) defined by five meteorite-dropping fireballs and this one asteroid. The H5 Lost City meteorite is orbitally similar to 1980AA (S type), which is a member of stream gamma defined by four asteroids and four fireballs. Another asteroid in this stream is classified as an S type, another is QU, and the fourth is unclassified. This stream suggests that ordinary chondrites should be associated with S (and/or Q) asteroids. Two of the known four V type asteroids belong to another stream, beta, defined by five asteroids and four meteorite-dropping (but unrecovered) fireballs, making it the most probable source of the eucrites. The final stream, alpha, defined by five asteroids and three fireballs is of unknown composition since no meteorites have been recovered and only one asteroid has an ambiguous classification of QRS. If this stream, or any other as yet undiscovered ones, were found to be composed of a more practical material (e.g., water or metal-rich), then recovery of the associated meteorites would provide an opportunity for in-hand analysis of a potential

  6. Audio-Visual Tibetan Speech Recognition Based on a Deep Dynamic Bayesian Network for Natural Human Robot Interaction

    Directory of Open Access Journals (Sweden)

    Yue Zhao

    2012-12-01

    Audio-visual speech recognition is a natural and robust approach to improving human-robot interaction in noisy environments. Although multi-stream Dynamic Bayesian Network and coupled HMM are widely used for audio-visual speech recognition, they fail to learn the shared features between modalities and ignore the dependency of features among the frames within each discrete state. In this paper, we propose a Deep Dynamic Bayesian Network (DDBN) to perform unsupervised extraction of spatial-temporal multimodal features from Tibetan audio-visual speech data and build an accurate audio-visual speech recognition model under a no frame-independency assumption. The experimental results on Tibetan speech data from some real-world environments showed that the proposed DDBN outperforms the state-of-the-art methods in word recognition accuracy.

  7. Secure remote service execution for web media streaming

    OpenAIRE

    Mikityuk, Alexandra

    2017-01-01

    Through continuous advancements in streaming and Web technologies over the past decade, the Web has become a platform for media delivery. Web standards like HTML5 have been designed accordingly, allowing for the delivery of applications, high-quality streaming video, and hooks for interoperable content protection. Efficient video encoding algorithms such as AVC/HEVC and streaming protocols such as MPEG-DASH have served as additional triggers for this evolution. Users now employ...

  8. Simulated rape, orgy, gory killings & hate speech

    DEFF Research Database (Denmark)

    Kierkegaard, Sylvia; Kierkegaard, Patrick

    2011-01-01

    Schwarzenegger v. Entertainment Merchants Association has been identified as one of the most important cases on games before the US Supreme Court and “the single most important challenge gaming has ever faced”. To resolve Schwarzenegger, the Justices will need to decide how much First Amendment....... If it follows established precedent dealing with freedom of speech, the sale of gratuitously violent video games to minors will continue with contents for kids getting gorier, bloodier and grittier – all for fun, of course....

  9. Percent Forest Adjacent to Streams

    Data.gov (United States)

    U.S. Environmental Protection Agency — The type of vegetation along a stream influences the water quality in the stream. Intact buffer strips of natural vegetation along streams tend to intercept...

  10. Percent Agriculture Adjacent to Streams

    Data.gov (United States)

    U.S. Environmental Protection Agency — The type of vegetation along a stream influences the water quality in the stream. Intact buffer strips of natural vegetation along streams tend to intercept...

  11. The potential of speech act theory for New Testament exegesis ...

    African Journals Online (AJOL)

    Speech act theory also offers New Testament exegesis some additional ways and means of approaching the text of the New Testament. This first in a series of two articles making a plea for the continued utilisation and application of this theory to the text of the New Testament offers a brief discussion of the basic ...

  12. Speech act theory and New Testament exegesis | Botha | HTS ...

    African Journals Online (AJOL)

    Speech act theory offers New Testament exegesis some additional ways and means of approaching the text of the New Testament. This, the second in a series of two articles that make a plea for the continued utilisation and application of this theory to the text of the New Testament, deals with some of the possibilities and ...

  13. Speech control interface for Eurocontrol’s LINK2000+ system

    Directory of Open Access Journals (Sweden)

    Dan-Cristian ION

    2012-06-01

    Full Text Available This paper continues recent research of the authors, considering the use of speech recognition in air traffic control. It proposes the use of a voice control interface for Eurocontrol’s LINK2000+ system, offering an alternative means to improve air transport safety and efficiency.

  14. Criteria for Labelling Prosodic Aspects of English Speech.

    Science.gov (United States)

    Bagshaw, Paul C.; Williams, Briony J.

    A study reports a set of labelling criteria which have been developed to label prosodic events in clear, continuous speech, and proposes a scheme whereby this information can be transcribed in a machine readable format. A prosody in a syllabic domain which is synchronized with a phonemic segmentation was annotated. A procedural definition of…

  15. Cross-language differences in cue use for speech segmentation

    NARCIS (Netherlands)

    Tyler, M.D.; Cutler, A.

    2009-01-01

    Two artificial-language learning experiments directly compared English, French, and Dutch listeners' use of suprasegmental cues for continuous-speech segmentation. In both experiments, listeners heard unbroken sequences of consonant-vowel syllables, composed of recurring three- and four-syllable

  16. Speech, Language, and Audiology Services in Public Schools

    Science.gov (United States)

    Sunderland, L.C.

    2004-01-01

    The prevalence of communication disorders (speech, language, and hearing) among school-age children continues to increase, making it imperative that the classroom teacher be able to identify children in need of services. This article provides information that will enable all teachers to recognize when a child is exhibiting signs of a communication…

  17. Stuttering Frequency, Speech Rate, Speech Naturalness, and Speech Effort During the Production of Voluntary Stuttering.

    Science.gov (United States)

    Davidow, Jason H; Grossman, Heather L; Edge, Robin L

    2018-05-01

    Voluntary stuttering techniques involve persons who stutter purposefully interjecting disfluencies into their speech. Little research has been conducted on the impact of these techniques on the speech pattern of persons who stutter. The present study examined whether changes in the frequency of voluntary stuttering accompanied changes in stuttering frequency, articulation rate, speech naturalness, and speech effort. In total, 12 persons who stutter aged 16-34 years participated. Participants read four 300-syllable passages during a control condition, and three voluntary stuttering conditions that involved attempting to produce purposeful, tension-free repetitions of initial sounds or syllables of a word for two or more repetitions (i.e., bouncing). The three voluntary stuttering conditions included bouncing on 5%, 10%, and 15% of syllables read. Friedman tests and follow-up Wilcoxon signed ranks tests were conducted for the statistical analyses. Stuttering frequency, articulation rate, and speech naturalness were significantly different between the voluntary stuttering conditions. Speech effort did not differ between the voluntary stuttering conditions. Stuttering frequency was significantly lower during the three voluntary stuttering conditions compared to the control condition, and speech effort was significantly lower during two of the three voluntary stuttering conditions compared to the control condition. Due to changes in articulation rate across the voluntary stuttering conditions, it is difficult to conclude, as has been suggested previously, that voluntary stuttering is the reason for stuttering reductions found when using voluntary stuttering techniques. Additionally, future investigations should examine different types of voluntary stuttering over an extended period of time to determine their impact on stuttering frequency, speech rate, speech naturalness, and speech effort.

  18. Bioassessment in nonperennial streams: Hydrologic stability influences assessment validity

    Science.gov (United States)

    Mazor, R. D.; Stein, E. D.; Schiff, K.; Ode, P.; Rehn, A.

    2011-12-01

    Nonperennial streams pose a challenge for bioassessment, as assessment tools developed in perennial streams may not work in these systems. For example, indices of biotic integrity (IBIs) developed in perennial streams may give improper indications of impairment in nonperennial streams, or may be unstable. We sampled benthic macroinvertebrates from 12 nonperennial streams in southern California. In addition, we deployed loggers to obtain continuous measures of flow. 3 sites were revisited over 2 years. For each site, we calculated several metrics, IBIs, and O/E scores to determine if assessments were consistent and valid throughout the summer. Hydrology varied widely among the streams, with several streams drying between sampling events. IBIs suggested good ecological health at the beginning of the study, but declined sharply at some sites. Multivariate ordination suggested that, despite differences among sites, changes in community structure were similar, with shifts from Ephemeroptera, Plecoptera, and Trichoptera to Coleoptera and more tolerant organisms. Site revisits revealed a surprising level of variability, as 2 of the 3 revisited sites had perennial or near-perennial flow in the second year of sampling. IBI scores were more consistent in streams with stable hydrographs than in those with strongly intermittent hydrographs. These results suggest that nonperennial streams can be monitored successfully, but they may require short index periods and distinct metrics from those used in perennial streams. In addition, better approaches to mapping nonperennial streams are required.

  19. Visual Speech Fills in Both Discrimination and Identification of Non-Intact Auditory Speech in Children

    Science.gov (United States)

    Jerger, Susan; Damian, Markus F.; McAlpine, Rachel P.; Abdi, Herve

    2018-01-01

    To communicate, children must discriminate and identify speech sounds. Because visual speech plays an important role in this process, we explored how visual speech influences phoneme discrimination and identification by children. Critical items had intact visual speech (e.g. baez) coupled to non-intact (excised onsets) auditory speech (signified…

  20. Speech Silicon: An FPGA Architecture for Real-Time Hidden Markov-Model-Based Speech Recognition

    Directory of Open Access Journals (Sweden)

    Schuster Jeffrey

    2006-01-01

    Full Text Available This paper examines the design of an FPGA-based system-on-a-chip capable of performing continuous speech recognition on medium sized vocabularies in real time. Through the creation of three dedicated pipelines, one for each of the major operations in the system, we were able to maximize the throughput of the system while simultaneously minimizing the number of pipeline stalls in the system. Further, by implementing a token-passing scheme between the later stages of the system, the complexity of the control was greatly reduced and the amount of active data present in the system at any time was minimized. Additionally, through in-depth analysis of the SPHINX 3 large vocabulary continuous speech recognition engine, we were able to design models that could be efficiently benchmarked against a known software platform. These results, combined with the ability to reprogram the system for different recognition tasks, serve to create a system capable of performing real-time speech recognition in a vast array of environments.

  1. Speech Silicon: An FPGA Architecture for Real-Time Hidden Markov-Model-Based Speech Recognition

    Directory of Open Access Journals (Sweden)

    Alex K. Jones

    2006-11-01

    Full Text Available This paper examines the design of an FPGA-based system-on-a-chip capable of performing continuous speech recognition on medium sized vocabularies in real time. Through the creation of three dedicated pipelines, one for each of the major operations in the system, we were able to maximize the throughput of the system while simultaneously minimizing the number of pipeline stalls in the system. Further, by implementing a token-passing scheme between the later stages of the system, the complexity of the control was greatly reduced and the amount of active data present in the system at any time was minimized. Additionally, through in-depth analysis of the SPHINX 3 large vocabulary continuous speech recognition engine, we were able to design models that could be efficiently benchmarked against a known software platform. These results, combined with the ability to reprogram the system for different recognition tasks, serve to create a system capable of performing real-time speech recognition in a vast array of environments.

  2. Speech enhancement using emotion dependent codebooks

    NARCIS (Netherlands)

    Naidu, D.H.R.; Srinivasan, S.

    2012-01-01

    Several speech enhancement approaches utilize trained models of clean speech data, such as codebooks, Gaussian mixtures, and hidden Markov models. These models are typically trained on neutral clean speech data, without any emotion. However, in practical scenarios, emotional speech is a common

  3. Automated Speech Rate Measurement in Dysarthria

    Science.gov (United States)

    Martens, Heidi; Dekens, Tomas; Van Nuffelen, Gwen; Latacz, Lukas; Verhelst, Werner; De Bodt, Marc

    2015-01-01

    Purpose: In this study, a new algorithm for automated determination of speech rate (SR) in dysarthric speech is evaluated. We investigated how reliably the algorithm calculates the SR of dysarthric speech samples when compared with calculation performed by speech-language pathologists. Method: The new algorithm was trained and tested using Dutch…

  4. Is Birdsong More Like Speech or Music?

    Science.gov (United States)

    Shannon, Robert V

    2016-04-01

    Music and speech share many acoustic cues but not all are equally important. For example, harmonic pitch is essential for music but not for speech. When birds communicate is their song more like speech or music? A new study contrasting pitch and spectral patterns shows that birds perceive their song more like humans perceive speech. Copyright © 2016 Elsevier Ltd. All rights reserved.

  5. Freedom of Speech Newsletter, September, 1975.

    Science.gov (United States)

    Allen, Winfred G., Jr., Ed.

    The Freedom of Speech Newsletter is the communication medium for the Freedom of Speech Interest Group of the Western Speech Communication Association. The newsletter contains such features as a statement of concern by the National Ad Hoc Committee Against Censorship; Reticence and Free Speech, an article by James F. Vickrey discussing the subtle…

  6. Wadeable Streams Assessment Data

    Science.gov (United States)

    The Wadeable Streams Assessment (WSA) is a first-ever statistically-valid survey of the biological condition of small streams throughout the U.S. The U.S. Environmental Protection Agency (EPA) worked with the states to conduct the assessment in 2004-2005. Data for each parameter sampled in the Wadeable Streams Assessment (WSA) are available for downloading in a series of files as comma separated values (*.csv). Each *.csv data file has a companion text file (*.txt) that lists a dataset label and individual descriptions for each variable. Users should view the *.txt files first to help guide their understanding and use of the data.

  7. Speech recovery device

    Energy Technology Data Exchange (ETDEWEB)

    Frankle, Christen M.

    2004-04-20

    There is provided an apparatus and method for assisting speech recovery in people with inability to speak due to aphasia, apraxia or another condition with similar effect. A hollow, rigid, thin-walled tube with semi-circular or semi-elliptical cut out shapes at each open end is positioned such that one end mates with the throat/voice box area of the neck of the assistor and the other end mates with the throat/voice box area of the assisted. The speaking person (assistor) makes sounds that produce standing wave vibrations at the same frequency in the vocal cords of the assisted person. Driving the assisted person's vocal cords with the assisted person being able to hear the correct tone enables the assisted person to speak by simply amplifying the vibration of membranes in their throat.

  8. Speech recovery device

    Energy Technology Data Exchange (ETDEWEB)

    Frankle, Christen M.

    2000-10-19

    There is provided an apparatus and method for assisting speech recovery in people with inability to speak due to aphasia, apraxia or another condition with similar effect. A hollow, rigid, thin-walled tube with semi-circular or semi-elliptical cut out shapes at each open end is positioned such that one end mates with the throat/voice box area of the neck of the assistor and the other end mates with the throat/voice box area of the assisted. The speaking person (assistor) makes sounds that produce standing wave vibrations at the same frequency in the vocal cords of the assisted person. Driving the assisted person's vocal cords with the assisted person being able to hear the correct tone enables the assisted person to speak by simply amplifying the vibration of membranes in their throat.

  9. Steganalysis of recorded speech

    Science.gov (United States)

    Johnson, Micah K.; Lyu, Siwei; Farid, Hany

    2005-03-01

    Digital audio provides a suitable cover for high-throughput steganography. At 16 bits per sample and sampled at a rate of 44,100 Hz, digital audio has the bit-rate to support large messages. In addition, audio is often transient and unpredictable, facilitating the hiding of messages. Using an approach similar to our universal image steganalysis, we show that hidden messages alter the underlying statistics of audio signals. Our statistical model begins by building a linear basis that captures certain statistical properties of audio signals. A low-dimensional statistical feature vector is extracted from this basis representation and used by a non-linear support vector machine for classification. We show the efficacy of this approach on LSB embedding and Hide4PGP. While no explicit assumptions about the content of the audio are made, our technique has been developed and tested on high-quality recorded speech.
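
    The bit-rate figure and the LSB embedding named in this abstract can be illustrated with a small sketch. It shows only least-significant-bit embedding in 16-bit samples, not the authors' steganalysis method or Hide4PGP; the function names `embed_lsb` and `extract_lsb` and the sample values are hypothetical.

```python
SAMPLE_RATE_HZ = 44_100   # samples per second, as stated in the abstract
BITS_PER_SAMPLE = 16      # as stated in the abstract

# Raw bit-rate of one audio channel, and the payload rate if one message
# bit is hidden in the least significant bit of every sample (assumption).
raw_bits_per_second = SAMPLE_RATE_HZ * BITS_PER_SAMPLE
lsb_payload_bits_per_second = SAMPLE_RATE_HZ


def embed_lsb(samples, message_bits):
    """Return a copy of samples with message bits written into the LSBs."""
    if len(message_bits) > len(samples):
        raise ValueError("message is longer than the cover signal")
    stego = list(samples)
    for i, bit in enumerate(message_bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_lsb(samples, n_bits):
    """Read the first n_bits least significant bits back out."""
    return [s & 1 for s in samples[:n_bits]]


cover = [1000, -2001, 32767, -32768, 5, 6]  # made-up signed 16-bit samples
bits = [1, 0, 0, 1]
stego = embed_lsb(cover, bits)
recovered = extract_lsb(stego, len(bits))
```

    Each embedded sample differs from its cover value by at most one quantization step, which is why such changes are hard to hear and why a detector has to rely on statistical features instead.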

  10. Phoneme Compression: processing of the speech signal and effects on speech intelligibility in hearing-Impaired listeners

    NARCIS (Netherlands)

    A. Goedegebure (Andre)

    2005-01-01

    Hearing-aid users often continue to have problems with poor speech understanding in difficult acoustical conditions. Another generally accounted problem is that certain sounds become too loud whereas other sounds are still not audible. Dynamic range compression is a signal processing

  11. What Information Is Necessary for Speech Categorization? Harnessing Variability in the Speech Signal by Integrating Cues Computed Relative to Expectations

    Science.gov (United States)

    McMurray, Bob; Jongman, Allard

    2011-01-01

    Most theories of categorization emphasize how continuous perceptual information is mapped to categories. However, equally important are the informational assumptions of a model, the type of information subserving this mapping. This is crucial in speech perception where the signal is variable and context dependent. This study assessed the…

  12. Studies of Speech Disorders in Schizophrenia. History and State-of-the-art

    Directory of Open Access Journals (Sweden)

    Shedovskiy E. F.

    2015-08-01

    Full Text Available The article reviews studies of speech disorders in schizophrenia. The authors trace the historical course of these studies and characterize their main areas: the psychopathological proper (speech disorders as psychopathological symptoms, their description and taxonomy) and the psychological (isolated neurons and pathopsychological perspective). Some modern foreign works covering a variety of approaches to the study of speech disorders in endogenous mental disorders are analyzed separately. Disorders and features of speech are among the most striking manifestations of schizophrenia, along with impaired thinking (Savitskaya A. V., Mikirtumov B. E.). Despite the variety of symptoms, speech disorders in schizophrenia can be classified and organized. The few clinical psychological studies of speech activity in schizophrenia include work on the study of generation and standard speech utterance, features of the verbal associative process, and speed parameters of speech utterances. Special attention is given to integrated research in the mainstream of biological psychiatry and genetic trends. It is shown that, over a history of more than half a century, the distinctive character of speech pathology in schizophrenia has received some coverage in the psychiatric and psychological literature and continues to generate interest in the modern integrated multidisciplinary approach

  13. Speech enhancement theory and practice

    CERN Document Server

    Loizou, Philipos C

    2013-01-01

    With the proliferation of mobile devices and hearing devices, including hearing aids and cochlear implants, there is a growing and pressing need to design algorithms that can improve speech intelligibility without sacrificing quality. Responding to this need, Speech Enhancement: Theory and Practice, Second Edition introduces readers to the basic problems of speech enhancement and the various algorithms proposed to solve these problems. Updated and expanded, this second edition of the bestselling textbook broadens its scope to include evaluation measures and enhancement algorithms aimed at impr

  14. Delivering Instruction via Streaming Media: A Higher Education Perspective.

    Science.gov (United States)

    Mortensen, Mark; Schlieve, Paul; Young, Jon

    2000-01-01

    Describes streaming media, an audio/video presentation that is delivered across a network so that it is viewed while being downloaded onto the user's computer, including a continuous stream of video that can be pre-recorded or live. Discusses its use for nontraditional students in higher education and reports on implementation experiences. (LRW)

  15. The benefit of combining a deep neural network architecture with ideal ratio mask estimation in computational speech segregation to improve speech intelligibility.

    Science.gov (United States)

    Bentsen, Thomas; May, Tobias; Kressner, Abigail A; Dau, Torsten

    2018-01-01

    Computational speech segregation attempts to automatically separate speech from noise. This is challenging in conditions with interfering talkers and low signal-to-noise ratios. Recent approaches have adopted deep neural networks and successfully demonstrated speech intelligibility improvements. A selection of components may be responsible for the success with these state-of-the-art approaches: the system architecture, a time frame concatenation technique and the learning objective. The aim of this study was to explore the roles and the relative contributions of these components by measuring speech intelligibility in normal-hearing listeners. A substantial improvement of 25.4 percentage points in speech intelligibility scores was found going from a subband-based architecture, in which a Gaussian Mixture Model-based classifier predicts the distributions of speech and noise for each frequency channel, to a state-of-the-art deep neural network-based architecture. Another improvement of 13.9 percentage points was obtained by changing the learning objective from the ideal binary mask, in which individual time-frequency units are labeled as either speech- or noise-dominated, to the ideal ratio mask, where the units are assigned a continuous value between zero and one. Therefore, both components play significant roles and by combining them, speech intelligibility improvements were obtained in a six-talker condition at a low signal-to-noise ratio.
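
    The difference between the two learning objectives described in this abstract can be sketched as follows. This is a minimal illustration, assuming the speech and noise power in each time-frequency unit are known separately; the 0 dB local criterion and the plain power ratio are assumptions (some formulations apply an additional compressive exponent to the ratio), and the function names and values are hypothetical.

```python
def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """Label each time-frequency unit 1 (speech-dominated) or 0 (noise-dominated).

    lc_db is the local SNR criterion in dB; 0 dB is an assumed default.
    """
    threshold = 10.0 ** (lc_db / 10.0)  # criterion as a linear power ratio
    return [
        [1 if s > n * threshold else 0 for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_power, noise_power)
    ]


def ideal_ratio_mask(speech_power, noise_power):
    """Assign each unit a continuous value between zero and one."""
    return [
        [s / (s + n) if (s + n) > 0 else 0.0 for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_power, noise_power)
    ]


# Made-up powers for a 2 x 2 grid of time-frequency units.
speech = [[4.0, 1.0], [0.0, 9.0]]
noise = [[1.0, 4.0], [2.0, 1.0]]

ibm = ideal_binary_mask(speech, noise)
irm = ideal_ratio_mask(speech, noise)
```

    The binary mask discards how strongly a unit is dominated by speech, whereas the ratio mask keeps that information as a soft weight.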

  16. INTEGRATING MACHINE TRANSLATION AND SPEECH SYNTHESIS COMPONENT FOR ENGLISH TO DRAVIDIAN LANGUAGE SPEECH TO SPEECH TRANSLATION SYSTEM

    Directory of Open Access Journals (Sweden)

    J. SANGEETHA

    2015-02-01

    Full Text Available This paper provides an interface between the machine translation and speech synthesis systems for converting English speech to Tamil text in an English-to-Tamil speech-to-speech translation system. The speech translation system consists of three modules: automatic speech recognition, machine translation and text-to-speech synthesis. Many procedures for integrating speech recognition and machine translation have been proposed. However, the speech synthesis system has not yet been evaluated. In this paper, we focus on the integration of machine translation and speech synthesis, and report a subjective evaluation to investigate the impact of speech synthesis, machine translation and the integration of the machine translation and speech synthesis components. Here we implement a hybrid machine translation (a combination of rule-based and statistical machine translation) and a concatenative syllable-based speech synthesis technique. In order to retain the naturalness and intelligibility of synthesized speech, Auto Associative Neural Network (AANN) prosody prediction is used in this work. The results of this system investigation demonstrate that the naturalness and intelligibility of the synthesized speech are strongly influenced by the fluency and correctness of the translated text.

  17. Speech Telepractice: Installing a Speech Therapy Upgrade for the 21st Century

    Directory of Open Access Journals (Sweden)

    Michael P. Towey

    2012-12-01

    Full Text Available Much of speech therapy involves the clinician guiding the therapeutic process (e.g., presenting stimuli and eliciting client responses); however, this Brief Communication describes a different approach to speech therapy delivery. Clinicians at Waldo County General Hospital (WCGH) use high definition audio and video to engage clients in telepractice using interactive web-based virtual environments. This technology enables clients and their clinicians to co-create salient treatment activities using authentic materials captured via digital cameras, video and/or curricular materials. Both therapists and clients manipulate the materials and interact online in real-time. The web-based technology engenders highly personalized and engaging activities, such that clients’ interactions with these high interest tasks often continue well beyond the therapy sessions.

  18. REVISED STREAM CODE AND WASP5 BENCHMARK

    International Nuclear Information System (INIS)

    Chen, K

    2005-01-01

    STREAM is an emergency response code that predicts downstream pollutant concentrations for releases from the SRS area to the Savannah River. The STREAM code uses an algebraic equation to approximate the solution of the one dimensional advective transport differential equation. This approach generates spurious oscillations in the concentration profile when modeling long duration releases. To improve the capability of the STREAM code to model long-term releases, its calculation module was replaced by the WASP5 code. WASP5 is a US EPA water quality analysis program that simulates one-dimensional pollutant transport through surface water. Test cases were performed to compare the revised version of STREAM with the existing version. For continuous releases, results predicted by the revised STREAM code agree with physical expectations. The WASP5 code was benchmarked with the US EPA 1990 and 1991 dye tracer studies, in which the transport of the dye was measured from its release at the New Savannah Bluff Lock and Dam downstream to Savannah. The peak concentrations predicted by the WASP5 agreed with the measurements within ±20.0%. The transport times of the dye concentration peak predicted by the WASP5 agreed with the measurements within ±3.6%. These benchmarking results demonstrate that STREAM should be capable of accurately modeling releases from SRS outfalls

  19. Online feature selection with streaming features.

    Science.gov (United States)

    Wu, Xindong; Yu, Kui; Ding, Wei; Wang, Hao; Zhu, Xingquan

    2013-05-01

    We propose a new online feature selection framework for applications with streaming features where the knowledge of the full feature space is unknown in advance. We define streaming features as features that flow in one by one over time whereas the number of training examples remains fixed. This is in contrast with traditional online learning methods that only deal with sequentially added observations, with little attention being paid to streaming features. The critical challenges for Online Streaming Feature Selection (OSFS) include 1) the continuous growth of feature volumes over time, 2) a large feature space, possibly of unknown or infinite size, and 3) the unavailability of the entire feature set before learning starts. In the paper, we present a novel Online Streaming Feature Selection method to select strongly relevant and nonredundant features on the fly. An efficient Fast-OSFS algorithm is proposed to improve feature selection performance. The proposed algorithms are evaluated extensively on high-dimensional datasets and also with a real-world case study on impact crater detection. Experimental results demonstrate that the algorithms achieve better compactness and higher prediction accuracy than existing streaming feature selection algorithms.
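
    The streaming-feature setting described in this abstract, where features arrive one by one while the training examples stay fixed, can be sketched as below. The correlation thresholds are a stand-in chosen for illustration and are not the relevance and redundancy tests used in OSFS or Fast-OSFS; `select_streaming_features`, its parameters, and the data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def select_streaming_features(feature_stream, target, relevance=0.5, redundancy=0.9):
    """Process features one at a time; the full feature set is never needed."""
    selected = {}
    for name, values in feature_stream:
        if abs(pearson(values, target)) < relevance:
            continue  # not relevant enough to the target
        if any(abs(pearson(values, kept)) >= redundancy for kept in selected.values()):
            continue  # redundant with a feature that was already kept
        selected[name] = values
    return list(selected)


target = [0, 0, 1, 1, 0, 1]  # fixed set of six training examples
stream = iter([
    ("f1", [0.1, 0.2, 0.9, 0.8, 0.1, 0.7]),       # relevant
    ("noise", [1, 0, 1, 0, 0, 0]),                # uncorrelated with target
    ("f1_copy", [0.2, 0.4, 1.8, 1.6, 0.2, 1.4]),  # scaled copy of f1
    ("f3", [0, 1, 1, 1, 0, 1]),                   # relevant, not redundant
])
kept = select_streaming_features(stream, target)
```

    Because the stream is consumed through an iterator, the selection never looks ahead, which mirrors the constraint that the entire feature set is unavailable before learning starts.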

  20. Speech of people with autism: Echolalia and echolalic speech

    OpenAIRE

    Błeszyński, Jacek Jarosław

    2013-01-01

    Speech of people with autism is recognised as one of the basic diagnostic, therapeutic and theoretical problems. One of the most common symptoms of autism in children is echolalia, described here as being of different types and severity. This paper presents the results of studies into different levels of echolalia, both in normally developing children and in children diagnosed with autism, discusses the differences between simple echolalia and echolalic speech - which can be considered to b...

  1. Advocate: A Distributed Architecture for Speech-to-Speech Translation

    Science.gov (United States)

    2009-01-01

    tecture, are either wrapped natural-language processing (NLP) components or objects developed from scratch using the architecture’s API. GATE is ... framework, we put together a demonstration Arabic-to-English speech translation system using both internally developed (Arabic speech recognition and MT ... conditions of our Arabic S2S demonstration system described earlier. Once again, the data size was varied and eighty identical requests were

  2. Future Roads Near Streams

    Data.gov (United States)

    U.S. Environmental Protection Agency — Roads are a source of auto related pollutants (e.g. gasoline, oil and other engine fluids). When roads are near streams, rain can wash these pollutants directly into...

  3. Channelized Streams in Iowa

    Data.gov (United States)

    Iowa State University GIS Support and Research Facility — This draft dataset consists of all ditches or channelized pieces of stream that could be identified using three input datasets; namely the 1:24,000 National...

  4. Stochastic ice stream dynamics.

    Science.gov (United States)

    Mantelli, Elisa; Bertagni, Matteo Bernard; Ridolfi, Luca

    2016-08-09

    Ice streams are narrow corridors of fast-flowing ice that constitute the arterial drainage network of ice sheets. Therefore, changes in ice stream flow are key to understanding paleoclimate, sea level changes, and rapid disintegration of ice sheets during deglaciation. The dynamics of ice flow are tightly coupled to the climate system through atmospheric temperature and snow recharge, which are known to exhibit stochastic variability. Here we focus on the interplay between stochastic climate forcing and ice stream temporal dynamics. Our work demonstrates that realistic climate fluctuations are able to (i) induce the coexistence of dynamic behaviors that would be incompatible in a purely deterministic system and (ii) drive ice stream flow away from the regime expected in a steady climate. We conclude that environmental noise appears to be crucial to interpreting the past behavior of ice sheets, as well as to predicting their future evolution.

  5. Roads Near Streams

    Data.gov (United States)

    U.S. Environmental Protection Agency — Roads are a source of auto related pollutants (e.g. gasoline, oil and other engine fluids). When roads are near streams, rain can wash these pollutants directly into...

  6. Streaming tearing mode

    Science.gov (United States)

    Shigeta, M.; Sato, T.; Dasgupta, B.

    1985-01-01

    The magnetohydrodynamic stability of streaming tearing mode is investigated numerically. A bulk plasma flow parallel to the antiparallel magnetic field lines and localized in the neutral sheet excites a streaming tearing mode more strongly than the usual tearing mode, particularly for the wavelength of the order of the neutral sheet width (or smaller), which is stable for the usual tearing mode. Interestingly, examination of the eigenfunctions of the velocity perturbation and the magnetic field perturbation indicates that the streaming tearing mode carries more energy in terms of the kinetic energy rather than the magnetic energy. This suggests that the streaming tearing mode instability can be a more feasible mechanism of plasma acceleration than the usual tearing mode instability.

  7. DNR 24K Streams

    Data.gov (United States)

    Minnesota Department of Natural Resources — 1:24,000 scale streams captured from USGS seven and one-half minute quadrangle maps, with perennial vs. intermittent classification, and connectivity through lakes,...

  8. Trout Stream Special Regulations

    Data.gov (United States)

    Minnesota Department of Natural Resources — This layer shows Minnesota trout streams that have a special regulation as described in the 2006 Minnesota Fishing Regulations. Road crossings were determined using...

  9. Scientific stream pollution analysis

    National Research Council Canada - National Science Library

    Nemerow, Nelson Leonard

    1974-01-01

    A comprehensive description of the analysis of water pollution that presents a careful balance of the biological, hydrological, chemical and mathematical concepts involved in the evaluation of stream...

  10. Collaborative Media Streaming

    OpenAIRE

    Kahmann, Verena

    2008-01-01

    Multimedia services delivered over IP technology, such as IPTV or video-on-demand, are currently a topic in high demand. Technically, such services are classified under the term "streaming". A server continuously sends media data to receivers, which process and display the data immediately. Through a back channel, the customer can influence playback. A further development of these streaming services is the possibility, together with others, ...

  11. The Right Temporoparietal Junction Supports Speech Tracking During Selective Listening: Evidence from Concurrent EEG-fMRI.

    Science.gov (United States)

    Puschmann, Sebastian; Steinkamp, Simon; Gillich, Imke; Mirkovic, Bojana; Debener, Stefan; Thiel, Christiane M

    2017-11-22

    Listening selectively to one out of several competing speakers in a "cocktail party" situation is a highly demanding task. It relies on a widespread cortical network, including auditory sensory, but also frontal and parietal brain regions involved in controlling auditory attention. Previous work has shown that, during selective listening, ongoing neural activity in auditory sensory areas is dominated by the attended speech stream, whereas competing input is suppressed. The relationship between these attentional modulations in the sensory tracking of the attended speech stream and frontoparietal activity during selective listening is, however, not understood. We studied this question in young, healthy human participants (both sexes) using concurrent EEG-fMRI and a sustained selective listening task, in which one out of two competing speech streams had to be attended selectively. An EEG-based speech envelope reconstruction method was applied to assess the strength of the cortical tracking of the to-be-attended and the to-be-ignored stream during selective listening. Our results show that individual speech envelope reconstruction accuracies obtained for the to-be-attended speech stream were positively correlated with the amplitude of sustained BOLD responses in the right temporoparietal junction, a core region of the ventral attention network. This brain region further showed task-related functional connectivity to secondary auditory cortex and regions of the frontoparietal attention network, including the intraparietal sulcus and the inferior frontal gyrus. This suggests that the right temporoparietal junction is involved in controlling attention during selective listening, allowing for a better cortical tracking of the attended speech stream. SIGNIFICANCE STATEMENT Listening selectively to one out of several simultaneously talking speakers in a "cocktail party" situation is a highly demanding task. It activates a widespread network of auditory sensory and

  12. Comprehension of synthetic speech and digitized natural speech by adults with aphasia.

    Science.gov (United States)

    Hux, Karen; Knollman-Porter, Kelly; Brown, Jessica; Wallace, Sarah E

    2017-09-01

    Using text-to-speech technology to provide simultaneous written and auditory content presentation may help compensate for chronic reading challenges if people with aphasia can understand synthetic speech output; however, inherent auditory comprehension challenges experienced by people with aphasia may make understanding synthetic speech difficult. This study's purpose was to compare the preferences and auditory comprehension accuracy of people with aphasia when listening to sentences generated with digitized natural speech, Alex synthetic speech (i.e., Macintosh platform), or David synthetic speech (i.e., Windows platform). The methodology required each of 20 participants with aphasia to select one of four images corresponding in meaning to each of 60 sentences comprising three stimulus sets. Results revealed significantly better accuracy given digitized natural speech than either synthetic speech option; however, individual participant performance analyses revealed three patterns: (a) comparable accuracy regardless of speech condition for 30% of participants, (b) comparable accuracy between digitized natural speech and one, but not both, synthetic speech option for 45% of participants, and (c) greater accuracy with digitized natural speech than with either synthetic speech option for remaining participants. Ranking and Likert-scale rating data revealed a preference for digitized natural speech and David synthetic speech over Alex synthetic speech. Results suggest many individuals with aphasia can comprehend synthetic speech options available on popular operating systems. Further examination of synthetic speech use to support reading comprehension through text-to-speech technology is thus warranted. Copyright © 2017 Elsevier Inc. All rights reserved.

  13. Speech perception at the interface of neurobiology and linguistics.

    Science.gov (United States)

    Poeppel, David; Idsardi, William J; van Wassenhove, Virginie

    2008-03-12

    Speech perception consists of a set of computations that take continuously varying acoustic waveforms as input and generate discrete representations that make contact with the lexical representations stored in long-term memory as output. Because the perceptual objects that are recognized by the speech perception enter into subsequent linguistic computation, the format that is used for lexical representation and processing fundamentally constrains the speech perceptual processes. Consequently, theories of speech perception must, at some level, be tightly linked to theories of lexical representation. Minimally, speech perception must yield representations that smoothly and rapidly interface with stored lexical items. Adopting the perspective of Marr, we argue and provide neurobiological and psychophysical evidence for the following research programme. First, at the implementational level, speech perception is a multi-time resolution process, with perceptual analyses occurring concurrently on at least two time scales (approx. 20-80 ms, approx. 150-300 ms), commensurate with (sub)segmental and syllabic analyses, respectively. Second, at the algorithmic level, we suggest that perception proceeds on the basis of internal forward models, or uses an 'analysis-by-synthesis' approach. Third, at the computational level (in the sense of Marr), the theory of lexical representation that we adopt is principally informed by phonological research and assumes that words are represented in the mental lexicon in terms of sequences of discrete segments composed of distinctive features. One important goal of the research programme is to develop linking hypotheses between putative neurobiological primitives (e.g. temporal primitives) and those primitives derived from linguistic inquiry, to arrive ultimately at a biologically sensible and theoretically satisfying model of representation and computation in speech.

  14. Dilution and volatilization of groundwater contaminant discharges in streams

    DEFF Research Database (Denmark)

    Aisopou, Angeliki; Bjerg, Poul Løgstrup; Sonne, Anne Thobo

    2015-01-01

    An analytical solution to describe dilution and volatilization of a continuous groundwater contaminant plume into streams is developed for risk assessment. The location of groundwater plume discharge into the stream (discharge through the side versus bottom of the stream) and different......measurement. The solution was successfully applied to published field data obtained in a large and a small Danish stream and provided valuable information on the risk posed by the groundwater contaminant plumes. The results provided by the dilution and volatilization model are very different to those obtained...

  15. Emergence of category-level sensitivities in non-native speech sound learning

    Directory of Open Access Journals (Sweden)

    Emily Myers

    2014-08-01

    Over the course of development, speech sounds that are contrastive in one’s native language tend to become perceived categorically: that is, listeners are unaware of variation within phonetic categories while showing excellent sensitivity to speech sounds that span linguistically meaningful phonetic category boundaries. The end stage of this developmental process is that the perceptual systems that handle acoustic-phonetic information show special tuning to native language contrasts, and as such, category-level information appears to be present at even fairly low levels of the neural processing stream. Research on adults acquiring non-native speech categories offers an avenue for investigating the interplay of category-level information and perceptual sensitivities to these sounds as speech categories emerge. In particular, one can observe the neural changes that unfold as listeners learn not only to perceive acoustic distinctions that mark non-native speech sound contrasts, but also to map these distinctions onto category-level representations. An emergent literature on the neural basis of novel and non-native speech sound learning offers new insight into this question. In this review, I will examine this literature in order to answer two key questions. First, where in the neural pathway does sensitivity to category-level phonetic information first emerge over the trajectory of speech sound learning? Second, how do frontal and temporal brain areas work in concert over the course of non-native speech sound learning? Finally, in the context of this literature I will describe a model of speech sound learning in which rapidly-adapting access to categorical information in the frontal lobes modulates the sensitivity of stable, slowly-adapting responses in the temporal lobes.

  16. Speech Mannerisms: Games Clients Play

    Science.gov (United States)

    Morgan, Lewis B.

    1978-01-01

    This article focuses on speech mannerisms often employed by clients in a helping relationship. Eight mannerisms are presented and discussed, as well as possible interpretations. Suggestions are given to help counselors respond to them. (Author)

  17. Speech recognition from spectral dynamics

    Indian Academy of Sciences (India)

    Carrier nature of speech; modulation spectrum; spectral dynamics ... the relationships between phonetic values of sounds and their short-term spectral envelopes .... the number of free parameters that need to be estimated from training data.

  18. Streaming Pool: reuse, combine and create reactive streams with pleasure

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    When connecting together heterogeneous and complex systems, it is not easy to exchange data between components. Streams of data are successfully used in industry in order to overcome this problem, especially in the case of "live" data. Streams are a specialization of the Observer design pattern and they provide asynchronous and non-blocking data flow. The ongoing effort of the ReactiveX initiative is one example that demonstrates how demanding this technology is even for big companies. Bridging the discrepancies of different technologies with common interfaces is already done by the Reactive Streams initiative and, in the JVM world, via reactive-streams-jvm interfaces. Streaming Pool is a framework for providing and discovering reactive streams. Through the mechanism of dependency injection provided by the Spring Framework, Streaming Pool provides a so called Discovery Service. This object can discover and chain streams of data that are technologically agnostic, through the use of Stream IDs. The stream to ...

  19. Designing speech for a recipient

    DEFF Research Database (Denmark)

    Fischer, Kerstin

    This study asks how speakers adjust their speech to their addressees, focusing on the potential roles of cognitive representations such as partner models, automatic processes such as interactive alignment, and social processes such as interactional negotiation. The nature of addressee orientation......, psycholinguistics and conversation analysis, and offers both overviews of child-directed, foreigner-directed and robot-directed speech and in-depth analyses of the processes involved in adjusting to a communication partner....

  20. National features of speech etiquette

    OpenAIRE

    Nacafova S.

    2017-01-01

    The article shows the differences between the speech etiquette of different peoples. The most important thing is to find a common language with a given interlocutor. Knowledge of national etiquette and national character helps one learn the principles of speech of another nation. The article indicates in which cases certain forms of etiquette are considered acceptable. At the same time, the rules of etiquette are emphasized in the conduct of a dialogue in official meetings and for example, in the ex...

  1. Censored: Whistleblowers and impossible speech

    OpenAIRE

    Kenny, Kate

    2017-01-01

    What happens to a person who speaks out about corruption in their organization, and finds themselves excluded from their profession? In this article, I argue that whistleblowers experience exclusions because they have engaged in ‘impossible speech’, that is, a speech act considered to be unacceptable or illegitimate. Drawing on Butler’s theories of recognition and censorship, I show how norms of acceptable speech working through recruitment practices, alongside the actions of colleagues, can ...

  2. Multi-stream LSTM-HMM decoding and histogram equalization for noise robust keyword spotting.

    Science.gov (United States)

    Wöllmer, Martin; Marchi, Erik; Squartini, Stefano; Schuller, Björn

    2011-09-01

    Highly spontaneous, conversational, and potentially emotional and noisy speech is known to be a challenge for today's automatic speech recognition (ASR) systems, which highlights the need for advanced algorithms that improve speech features and models. Histogram Equalization is an efficient method to reduce the mismatch between clean and noisy conditions by normalizing all moments of the probability distribution of the feature vector components. In this article, we propose to combine histogram equalization and multi-condition training for robust keyword detection in noisy speech. To better cope with conversational speaking styles, we show how contextual information can be effectively exploited in a multi-stream ASR framework that dynamically models context-sensitive phoneme estimates generated by a long short-term memory neural network. The proposed techniques are evaluated on the SEMAINE database-a corpus containing emotionally colored conversations with a cognitive system for "Sensitive Artificial Listening".
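
    The histogram equalization step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a standard normal reference distribution and a rank-based empirical CDF, and the function name `histogram_equalize` is hypothetical.

```python
from statistics import NormalDist


def histogram_equalize(values):
    """Map one feature component onto a standard normal reference.

    Assumption: the reference distribution is N(0, 1) and the empirical
    CDF is estimated from ranks; the abstract does not specify either.
    """
    n = len(values)
    # Indices of the samples sorted by value (stable for ties).
    order = sorted(range(n), key=lambda i: values[i])
    reference = NormalDist(mu=0.0, sigma=1.0)
    equalized = [0.0] * n
    for rank, idx in enumerate(order):
        # Empirical CDF in (0, 1); the 0.5 offset avoids exactly 0 or 1.
        cdf = (rank + 0.5) / n
        equalized[idx] = reference.inv_cdf(cdf)
    return equalized


clean = [1.0, 3.0, 2.0, 10.0]
# A monotonic distortion (gain and offset) stands in for a channel mismatch.
noisy = [2.0 * v + 5.0 for v in clean]
print(histogram_equalize(clean) == histogram_equalize(noisy))
```

    Because the mapping depends only on ranks, any monotonic distortion of a feature component yields the same equalized values, which is the sense in which the mismatch between conditions is reduced.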

  3. Speech Function and Speech Role in Carl Fredricksen's Dialogue on Up Movie

    OpenAIRE

    Rehana, Ridha; Silitonga, Sortha

    2013-01-01

    One aim of this article is to show through a concrete example how speech function and speech role are used in a movie. The illustrative example is taken from the dialogue of the movie Up. Central to the analysis are the forms of dialogue in Up that contain speech functions and speech roles, i.e., statement, offer, question, command, giving, and demanding. A total of 269 dialogues were interpreted by actor, and the use of speech function and speech role was found in them.

  4. Psychoacoustic cues to emotion in speech prosody and music.

    Science.gov (United States)

    Coutinho, Eduardo; Dibben, Nicola

    2013-01-01

    There is strong evidence of shared acoustic profiles common to the expression of emotions in music and speech, yet relatively limited understanding of the specific psychoacoustic features involved. This study combined a controlled experiment and computational modelling to investigate the perceptual codes associated with the expression of emotion in the acoustic domain. The empirical stage of the study provided continuous human ratings of emotions perceived in excerpts of film music and natural speech samples. The computational stage created a computer model that retrieves the relevant information from the acoustic stimuli and makes predictions about the emotional expressiveness of speech and music close to the responses of human subjects. We show that a significant part of the listeners' second-by-second reported emotions to music and speech prosody can be predicted from a set of seven psychoacoustic features: loudness, tempo/speech rate, melody/prosody contour, spectral centroid, spectral flux, sharpness, and roughness. The implications of these results are discussed in the context of cross-modal similarities in the communication of emotion in the acoustic domain.
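
    Two of the seven listed features have simple textbook definitions, sketched below. The abstract does not say how the study computed them, so these formulas (magnitude-weighted mean frequency for the centroid, Euclidean distance between successive magnitude spectra for the flux) are assumptions, as are the function names.

```python
def spectral_centroid(freqs_hz, magnitudes):
    """Magnitude-weighted mean frequency of one spectral frame."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # Silent frame: the centroid is undefined, return 0 by convention.
    return sum(f * m for f, m in zip(freqs_hz, magnitudes)) / total


def spectral_flux(prev_magnitudes, magnitudes):
    """Euclidean distance between two successive magnitude spectra."""
    return sum((m - p) ** 2 for p, m in zip(prev_magnitudes, magnitudes)) ** 0.5


print(spectral_centroid([100.0, 200.0, 300.0], [1.0, 1.0, 1.0]))
print(spectral_flux([0.0, 0.0], [3.0, 4.0]))
```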

  5. Multimedia Mapping using Continuous State Space Models

    DEFF Research Database (Denmark)

    Lehn-Schiøler, Tue

    2004-01-01

    In this paper a system that transforms speech waveforms to animated faces is proposed. The system relies on continuous state space models to perform the mapping; this makes it possible to ensure video with no sudden jumps and allows continuous control of the parameters in 'face space'. Simulations...... are performed on recordings of 3-5 sec. video sequences with sentences from the Timit database. The model is able to construct an image sequence from an unknown noisy speech sequence fairly well even though the number of training examples is limited....

  6. Streams and their future inhabitants

    DEFF Research Database (Denmark)

    Sand-Jensen, K.; Friberg, Nikolai

    2006-01-01

    In this final chapter we look ahead and address four questions: How do we improve stream management? What are the likely developments in the biological quality of streams? In which areas is knowledge on stream ecology insufficient? What can streams offer children of today and adults of tomorrow?...

  7. Robust audio-visual speech recognition under noisy audio-video conditions.

    Science.gov (United States)

    Stewart, Darryl; Seymour, Rowan; Pass, Adrian; Ming, Ji

    2014-02-01

    This paper presents the maximum weighted stream posterior (MWSP) model as a robust and efficient stream integration method for audio-visual speech recognition in environments, where the audio or video streams may be subjected to unknown and time-varying corruption. A significant advantage of MWSP is that it does not require any specific measurements of the signal in either stream to calculate appropriate stream weights during recognition, and as such it is modality-independent. This also means that MWSP complements and can be used alongside many of the other approaches that have been proposed in the literature for this problem. For evaluation we used the large XM2VTS database for speaker-independent audio-visual speech recognition. The extensive tests include both clean and corrupted utterances with corruption added in either/both the video and audio streams using a variety of types (e.g., MPEG-4 video compression) and levels of noise. The experiments show that this approach gives excellent performance in comparison to another well-known dynamic stream weighting approach and also compared to any fixed-weighted integration approach in both clean conditions or when noise is added to either stream. Furthermore, our experiments show that the MWSP approach dynamically selects suitable integration weights on a frame-by-frame basis according to the level of noise in the streams and also according to the naturally fluctuating relative reliability of the modalities even in clean conditions. The MWSP approach is shown to maintain robust recognition performance in all tested conditions, while requiring no prior knowledge about the type or level of noise.

  8. STRIP: stream learning of influence probabilities

    DEFF Research Database (Denmark)

    Kutzkov, Konstantin

    2013-01-01

    cascades, and developing applications such as viral marketing. Motivated by modern microblogging platforms, such as Twitter, in this paper we study the problem of learning influence probabilities in a data-stream scenario, in which the network topology is relatively stable and the challenge of a learning...... algorithm is to keep up with a continuous stream of tweets using a small amount of time and memory. Our contribution is a number of randomized approximation algorithms, categorized according to the available space (superlinear, linear, and sublinear in the number of nodes n) and according to different models...

  9. Exploring Australian speech-language pathologists' use and perceptions of non-speech oral motor exercises.

    Science.gov (United States)

    Rumbach, Anna F; Rose, Tanya A; Cheah, Mynn

    2018-01-29

    To explore Australian speech-language pathologists' use of non-speech oral motor exercises, and rationales for using/not using non-speech oral motor exercises in clinical practice. A total of 124 speech-language pathologists practising in Australia, working with paediatric and/or adult clients with speech sound difficulties, completed an online survey. The majority of speech-language pathologists reported that they did not use non-speech oral motor exercises when working with paediatric or adult clients with speech sound difficulties. However, more than half of the speech-language pathologists working with adult clients who have dysarthria reported using non-speech oral motor exercises with this population. The most frequently reported rationale for using non-speech oral motor exercises in speech sound difficulty management was to improve awareness/placement of articulators. The majority of speech-language pathologists agreed there is no clear clinical or research evidence base to support non-speech oral motor exercise use with clients who have speech sound difficulties. This study provides an overview of Australian speech-language pathologists' reported use and perceptions of non-speech oral motor exercises' applicability and efficacy in treating paediatric and adult clients who have speech sound difficulties. The research findings provide speech-language pathologists with insight into how and why non-speech oral motor exercises are currently used, and adds to the knowledge base regarding Australian speech-language pathology practice of non-speech oral motor exercises in the treatment of speech sound difficulties. Implications for Rehabilitation Non-speech oral motor exercises refer to oral motor activities which do not involve speech, but involve the manipulation or stimulation of oral structures including the lips, tongue, jaw, and soft palate. Non-speech oral motor exercises are intended to improve the function (e.g., movement, strength) of oral structures. 
The

  10. Novel Techniques for Dialectal Arabic Speech Recognition

    CERN Document Server

    Elmahdy, Mohamed; Minker, Wolfgang

    2012-01-01

    Novel Techniques for Dialectal Arabic Speech describes approaches to improve automatic speech recognition for dialectal Arabic. Since speech resources for dialectal Arabic speech recognition are very sparse, the authors describe how existing Modern Standard Arabic (MSA) speech data can be applied to dialectal Arabic speech recognition, while assuming that MSA is always a second language for all Arabic speakers. In this book, Egyptian Colloquial Arabic (ECA) has been chosen as a typical Arabic dialect. ECA is the first ranked Arabic dialect in terms of number of speakers, and a high quality ECA speech corpus with accurate phonetic transcription has been collected. MSA acoustic models were trained using news broadcast speech. In order to cross-lingually use MSA in dialectal Arabic speech recognition, the authors have normalized the phoneme sets for MSA and ECA. After this normalization, they have applied state-of-the-art acoustic model adaptation techniques like Maximum Likelihood Linear Regression (MLLR) and M...

  11. The effects of speech controls on performance in advanced helicopters in a double stimulation paradigm

    Science.gov (United States)

    Bortolussi, Michael R.; Vidulich, Michael A.

    1991-01-01

    The potential benefit of speech as a control modality has been investigated with mixed results. Earlier studies suggest that speech controls can reduce the potential of manual control overloads and improve time-sharing performance. However, these benefits were not without costs. Pilots reported higher workload levels associated with the use of speech controls. To further investigate these previous findings, an experiment was conducted in a simulation of an advanced single-pilot, scout/attack helicopter at NASA-Ames' ICAB (interchangeable cab) facility. Objective performance data suggested that speech control modality was effective in reducing interference of discrete, time-shared responses during continuous flight control activity. Subjective ratings, however, indicated that the speech control modality increased workload. Post-flight debriefing indicated that these results were mainly due to the increased effort to speak precisely to a less than perfect voice recognition system.

  12. Experimental comparison between speech transmission index, rapid speech transmission index, and speech intelligibility index.

    Science.gov (United States)

    Larm, Petra; Hongisto, Valtteri

    2006-02-01

    During the acoustical design of, e.g., auditoria or open-plan offices, it is important to know how speech can be perceived in various parts of the room. Different objective methods have been developed to measure and predict speech intelligibility, and these have been extensively used in various spaces. In this study, two such methods were compared, the speech transmission index (STI) and the speech intelligibility index (SII). Also the simplification of the STI, the room acoustics speech transmission index (RASTI), was considered. These quantities are all based on determining an apparent speech-to-noise ratio on selected frequency bands and summing them using a specific weighting. For comparison, some data were needed on the possible differences of these methods resulting from the calculation scheme and also measuring equipment. Their prediction accuracy was also of interest. Measurements were made in a laboratory having adjustable noise level and absorption, and in a real auditorium. It was found that the measurement equipment, especially the selection of the loudspeaker, can greatly affect the accuracy of the results. The prediction accuracy of the RASTI was found acceptable, if the input values for the prediction are accurately known, even though the studied space was not ideally diffuse.
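
    The shared principle mentioned above, an apparent speech-to-noise ratio per frequency band combined with a specific weighting, can be sketched as follows. The clipping range of -15 dB to +15 dB follows the usual STI convention; the band weights in the example are placeholders, not the values of any standard, and the function names are hypothetical.

```python
def transmission_index(snr_db):
    """Map an apparent speech-to-noise ratio (dB) to an index in [0, 1].

    Assumption: the ratio is clipped to the range -15 dB to +15 dB,
    as is conventional for the STI.
    """
    clipped = max(-15.0, min(15.0, snr_db))
    return (clipped + 15.0) / 30.0


def weighted_index(band_snrs_db, band_weights):
    """Weighted mean of the per-band transmission indices."""
    total_weight = sum(band_weights)
    weighted_sum = sum(
        w * transmission_index(snr) for snr, w in zip(band_snrs_db, band_weights)
    )
    return weighted_sum / total_weight


# Placeholder weights; real methods prescribe their own band weightings.
print(weighted_index([15.0, -15.0], [1.0, 1.0]))
```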

  13. On the Importance of Audiovisual Coherence for the Perceived Quality of Synthesized Visual Speech

    Directory of Open Access Journals (Sweden)

    Wesley Mattheyses

    2009-01-01

    Audiovisual text-to-speech systems convert a written text into an audiovisual speech signal. Typically, the visual mode of the synthetic speech is synthesized separately from the audio, the latter being either natural or synthesized speech. However, the perception of mismatches between these two information streams requires experimental exploration since it could degrade the quality of the output. In order to increase the intermodal coherence in synthetic 2D photorealistic speech, we extended the well-known unit selection audio synthesis technique to work with multimodal segments containing original combinations of audio and video. Subjective experiments confirm that the audiovisual signals created by our multimodal synthesis strategy are indeed perceived as being more synchronous than those of systems in which both modes are not intrinsically coherent. Furthermore, it is shown that the degree of coherence between the auditory mode and the visual mode has an influence on the perceived quality of the synthetic visual speech fragment. In addition, the audio quality was found to have only a minor influence on the perceived visual signal's quality.

  14. Neural pathways for visual speech perception

    Directory of Open Access Journals (Sweden)

    Lynne E Bernstein

    2014-12-01

    This paper examines the questions, what levels of speech can be perceived visually, and how is visual speech represented by the brain? Review of the literature leads to the conclusions that every level of psycholinguistic speech structure (i.e., phonetic features, phonemes, syllables, words, and prosody) can be perceived visually, although individuals differ in their abilities to do so; and that there are visual modality-specific representations of speech qua speech in higher-level vision brain areas. That is, the visual system represents the modal patterns of visual speech. The suggestion that the auditory speech pathway receives and represents visual speech is examined in light of neuroimaging evidence on the auditory speech pathways. We outline the generally agreed-upon organization of the visual ventral and dorsal pathways and examine several types of visual processing that might be related to speech through those pathways, specifically, face and body, orthography, and sign language processing. In this context, we examine the visual speech processing literature, which reveals widespread diverse patterns of activity in posterior temporal cortices in response to visual speech stimuli. We outline a model of the visual and auditory speech pathways and make several suggestions: (1) The visual perception of speech relies on visual pathway representations of speech qua speech. (2) A proposed site of these representations, the temporal visual speech area (TVSA), has been demonstrated in posterior temporal cortex, ventral and posterior to multisensory posterior superior temporal sulcus (pSTS). (3) Given that visual speech has dynamic and configural features, its representations in feedforward visual pathways are expected to integrate these features, possibly in TVSA.

  15. Electrophysiological Correlates of Semantic Dissimilarity Reflect the Comprehension of Natural, Narrative Speech.

    Science.gov (United States)

    Broderick, Michael P; Anderson, Andrew J; Di Liberto, Giovanni M; Crosse, Michael J; Lalor, Edmund C

    2018-03-05

    People routinely hear and understand speech at rates of 120-200 words per minute [1, 2]. Thus, speech comprehension must involve rapid, online neural mechanisms that process words' meanings in an approximately time-locked fashion. However, electrophysiological evidence for such time-locked processing has been lacking for continuous speech. Although valuable insights into semantic processing have been provided by the "N400 component" of the event-related potential [3-6], this literature has been dominated by paradigms using incongruous words within specially constructed sentences, with less emphasis on natural, narrative speech comprehension. Building on the discovery that cortical activity "tracks" the dynamics of running speech [7-9] and psycholinguistic work demonstrating [10-12] and modeling [13-15] how context impacts on word processing, we describe a new approach for deriving an electrophysiological correlate of natural speech comprehension. We used a computational model [16] to quantify the meaning carried by words based on how semantically dissimilar they were to their preceding context and then regressed this measure against electroencephalographic (EEG) data recorded from subjects as they listened to narrative speech. This produced a prominent negativity at a time lag of 200-600 ms on centro-parietal EEG channels, characteristics common to the N400. Applying this approach to EEG datasets involving time-reversed speech, cocktail party attention, and audiovisual speech-in-noise demonstrated that this response was very sensitive to whether or not subjects understood the speech they heard. These findings demonstrate that, when successfully comprehending natural speech, the human brain responds to the contextual semantic content of each word in a relatively time-locked fashion. Copyright © 2018 Elsevier Ltd. All rights reserved.
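
    The per-word measure described above can be illustrated with a small sketch. The abstract does not give the formula, so this version assumes that dissimilarity is 1 minus the Pearson correlation between a word's vector and the mean vector of its preceding context; the vectors and function names are hypothetical, and the regression against EEG is not shown.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom else 0.0


def semantic_dissimilarity(word_vec, context_vecs):
    """1 minus the correlation between a word vector and its mean context vector."""
    if not context_vecs:
        raise ValueError("at least one preceding context vector is required")
    dim = len(word_vec)
    mean_context = [
        sum(vec[i] for vec in context_vecs) / len(context_vecs) for i in range(dim)
    ]
    return 1.0 - pearson(word_vec, mean_context)


# Toy 3-dimensional vectors; real word embeddings have hundreds of dimensions.
print(semantic_dissimilarity([1.0, 2.0, 3.0], [[2.0, 4.0, 6.0]]))
print(semantic_dissimilarity([1.0, 2.0, 3.0], [[3.0, 2.0, 1.0]]))
```

    Under this assumed definition the value ranges from 0 (word fully consistent with its context) to 2 (word anticorrelated with its context).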

  16. L2 Learners' Assessments of Accentedness, Fluency, and Comprehensibility of Native and Nonnative German Speech

    Science.gov (United States)

    O'Brien, Mary Grantham

    2014-01-01

    In early stages of classroom language learning, many adult second language (L2) learners communicate primarily with one another, yet we know little about which speech stream characteristics learners tune into or the extent to which they understand this lingua franca communication. In the current study, 25 native English speakers learning German as…

  17. Temporal Segmentation of MPEG Video Streams

    Directory of Open Access Journals (Sweden)

    Janko Calic

    2002-06-01

    Full Text Available Many algorithms for temporal video partitioning rely on the analysis of uncompressed video features. Since the information relevant to the partitioning process can be extracted directly from the MPEG compressed stream, higher efficiency can be achieved utilizing information from the MPEG compressed domain. This paper introduces a real-time algorithm for scene change detection that analyses the statistics of the macroblock features extracted directly from the MPEG stream. A method for extraction of the continuous frame difference that transforms the 3D video stream into a 1D curve is presented. This transform is then further employed to extract temporal units within the analysed video sequence. Results of computer simulations are reported.
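
    The idea of collapsing a video stream into a 1D frame-difference curve and reading scene changes off that curve can be sketched as follows. This is a simplified stand-in: the paper works on macroblock statistics taken directly from the MPEG compressed stream, whereas the sketch assumes decoded frames given as flat lists of intensity values and a fixed threshold. The names `frame_difference_curve` and `detect_cuts`, and the threshold value, are hypothetical.

```python
def frame_difference_curve(frames):
    """Mean absolute difference between each pair of consecutive frames.

    Assumption: every frame is a flat list of numeric values of equal length.
    """
    curve = []
    for prev, cur in zip(frames, frames[1:]):
        total = sum(abs(a - b) for a, b in zip(prev, cur))
        curve.append(total / len(cur))
    return curve


def detect_cuts(curve, threshold):
    """Return indices of frames that start a new scene.

    curve[i] compares frame i with frame i + 1, so a value above the
    threshold is reported as a cut at frame i + 1.
    """
    return [i + 1 for i, diff in enumerate(curve) if diff > threshold]


# Four tiny "frames": the third one differs sharply from the second.
frames = [
    [10, 10, 10, 10],
    [11, 10, 10, 9],
    [200, 198, 201, 199],
    [199, 198, 202, 199],
]
curve = frame_difference_curve(frames)
cuts = detect_cuts(curve, threshold=50.0)
```

    A fixed threshold is the simplest possible detector; the abstract does not say how temporal units are extracted from the curve, so no claim is made that the paper uses one.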

  18. Lower Red River Meadow Stream Restoration Project

    International Nuclear Information System (INIS)

    1996-05-01

    As part of a continuing effort to restore anadromous fish populations in the South Fork Clearwater River basin of Idaho, Bonneville Power Administration (BPA) proposes to fund the Lower Red River Meadow Restoration Project (Project). The Project is a cooperative effort with the Idaho Soil and Water Conservation District, Nez Perce National Forest, Idaho Department of Fish and Game (IDFG), and the Nez Perce Tribe of Idaho. The proposed action would allow the sponsors to perform stream bank stabilization, aquatic and riparian habitat improvement activities on IDFG's Red River Management Area and to secure long-term conservation contracts or agreements for conducting streambank and habitat improvement activities with participating private landowners located in the Idaho County, Idaho, study area. This preliminary Environmental Assessment (EA) examines the potential environmental effects of stabilizing the stream channel, restoring juvenile fish rearing habitat and reestablishing a riparian shrub community along the stream

  19. Effects of nutrient enrichment on the decomposition of wood and associated microbial activity in streams

    Science.gov (United States)

    Vladislav Gulis; Amy D. Rosemond; Keller Suberkropp; Holly S. Weyers; Jonathan P. Benstead

    2004-01-01

    We determined the effects of nutrient enrichment on wood decomposition rates and microbial activity during a 3-year study in two headwater streams at Coweeta Hydrologic Laboratory, NC, U.S.A. After a 1-year pretreatment period, one of the streams was continuously enriched with inorganic nutrients (nitrogen and phosphorus) for 2 years while the other stream served as a...

  20. Perception of the multisensory coherence of fluent audiovisual speech in infancy: its emergence and the role of experience.

    Science.gov (United States)

    Lewkowicz, David J; Minar, Nicholas J; Tift, Amy H; Brandon, Melissa

    2015-02-01

    To investigate the developmental emergence of the perception of the multisensory coherence of native and non-native audiovisual fluent speech, we tested 4-, 8- to 10-, and 12- to 14-month-old English-learning infants. Infants first viewed two identical female faces articulating two different monologues in silence and then in the presence of an audible monologue that matched the visible articulations of one of the faces. Neither the 4-month-old nor 8- to 10-month-old infants exhibited audiovisual matching in that they did not look longer at the matching monologue. In contrast, the 12- to 14-month-old infants exhibited matching and, consistent with the emergence of perceptual expertise for the native language, perceived the multisensory coherence of native-language monologues earlier in the test trials than that of non-native language monologues. Moreover, the matching of native audible and visible speech streams observed in the 12- to 14-month-olds did not depend on audiovisual synchrony, whereas the matching of non-native audible and visible speech streams did depend on synchrony. Overall, the current findings indicate that the perception of the multisensory coherence of fluent audiovisual speech emerges late in infancy, that audiovisual synchrony cues are more important in the perception of the multisensory coherence of non-native speech than that of native audiovisual speech, and that the emergence of this skill most likely is affected by perceptual narrowing. Copyright © 2014 Elsevier Inc. All rights reserved.

  1. Teachers, Students, and Ideological Bias in the College Classroom. Wicked Problems Forum: Freedom of Speech at Colleges and Universities

    Science.gov (United States)

    Mazer, Joseph P.

    2018-01-01

    Discussions surrounding ideology and free speech on college and university campuses continually occur in the popular press. In this forum, Herbeck (see EJ1171161) chronicles several heated clashes over free speech that have recently erupted on campuses across the country, fueling news stories reported through traditional and social media. Issues…

  2. Continuous auditing & continuous monitoring : Continuous value?

    NARCIS (Netherlands)

    van Hillo, Rutger; Weigand, Hans; Espana, S; Ralyte, J; Souveyet, C

    2016-01-01

    Advancements in information technology, new laws and regulations and rapidly changing business conditions have led to a need for more timely and ongoing assurance with effectively working controls. Continuous Auditing (CA) and Continuous Monitoring (CM) technologies have made this possible by

  3. Detection of target phonemes in spontaneous and read speech

    NARCIS (Netherlands)

    Mehta, G.; Cutler, A.

    1988-01-01

    Although spontaneous speech occurs more frequently in most listeners' experience than read speech, laboratory studies of human speech recognition typically use carefully controlled materials read from a script. The phonological and prosodic characteristics of spontaneous and read speech differ

  4. Comparison of speech performance in labial and lingual orthodontic patients: A prospective study

    Science.gov (United States)

    Rai, Ambesh Kumar; Rozario, Joe E.; Ganeshkar, Sanjay V.

    2014-01-01

    Background: The intensity and duration of speech difficulty inherently associated with lingual therapy is a significant issue of concern in orthodontics. This study was designed to evaluate and to compare the duration of changes in speech between labial and lingual orthodontics. Materials and Methods: A prospective longitudinal clinical study was designed to assess speech of 24 patients undergoing labial or lingual orthodontic treatment. An objective spectrographic evaluation of the /s/ sound was done using software PRAAT version 5.0.47, a semiobjective auditive evaluation of articulation was done by four speech pathologists and a subjective assessment of speech was done by four laypersons. The tests were performed before (T1), within 24 h (T2), after 1 week (T3) and after 1 month (T4) of the start of therapy. The Mann-Whitney U-test for independent samples was used to assess the significance of the difference between the labial and lingual appliances. A speech alteration with P appliance systems caused a comparable speech difficulty immediately after bonding (T2). Although the speech recovered within a week in the labial group (T3), the lingual group continued to experience discomfort even after a month (T4). PMID:25540661

  5. Speech recognition technology: an outlook for human-to-machine interaction.

    Science.gov (United States)

    Erdel, T; Crooks, S

    2000-01-01

    Speech recognition, as an enabling technology in healthcare-systems computing, is a topic that has been discussed for quite some time, but is just now coming to fruition. Traditionally, speech-recognition software has been constrained by hardware, but improved processors and increased memory capacities are starting to remove some of these limitations. With these barriers removed, companies that create software for the healthcare setting have the opportunity to write more successful applications. Among the criticisms of speech-recognition applications are the high rates of error and steep training curves. However, even in the face of such negative perceptions, there remain significant opportunities for speech recognition to allow healthcare providers and, more specifically, physicians, to work more efficiently and ultimately spend more time with their patients and less time completing necessary documentation. This article will identify opportunities for inclusion of speech-recognition technology in the healthcare setting and examine major categories of speech-recognition software--continuous speech recognition, command and control, and text-to-speech. We will discuss the advantages and disadvantages of each area, the limitations of the software today, and how future trends might affect them.

  6. Speech activity detection for the automated speaker recognition system of critical use

    Directory of Open Access Journals (Sweden)

    M. M. Bykov

    2017-06-01

    Full Text Available In the article, the authors developed a method for detecting speech activity for an automated speaker recognition system of critical use, with wavelet parameterization of the speech signal and classification into “language”/“pause” intervals using a curvilinear neural network. The proposed wavelet-parameterization method allows the optimal parameters of the wavelet transformation to be chosen in accordance with a user-specified error of representation of the speech signal. The method also allows the loss of information to be estimated depending on the selected parameters of the continuous wavelet transformation (NPP), which made it possible to reduce the number of scalable coefficients of the LVP of the speech signal by an order of magnitude with an allowable degree of distortion of the local spectrum of the LVP. An algorithm for detecting speech activity with a curvilinear neural network classifier is also proposed, which shows high-quality segmentation of speech signals into “language”/“pause” intervals and is resistant to narrowband noise and technogenic noise in the speech signal due to the inherent properties of the curvilinear neural network.

  7. Towards Contactless Silent Speech Recognition Based on Detection of Active and Visible Articulators Using IR-UWB Radar.

    Science.gov (United States)

    Shin, Young Hoon; Seo, Jiwon

    2016-10-29

    People with hearing or speaking disabilities are deprived of the benefits of conventional speech recognition technology because it is based on acoustic signals. Recent research has focused on silent speech recognition systems that are based on the motions of a speaker's vocal tract and articulators. Because most silent speech recognition systems use contact sensors that are very inconvenient to users or optical systems that are susceptible to environmental interference, a contactless and robust solution is hence required. Toward this objective, this paper presents a series of signal processing algorithms for a contactless silent speech recognition system using an impulse radio ultra-wide band (IR-UWB) radar. The IR-UWB radar is used to remotely and wirelessly detect motions of the lips and jaw. In order to extract the necessary features of lip and jaw motions from the received radar signals, we propose a feature extraction algorithm. The proposed algorithm noticeably improved speech recognition performance compared to the existing algorithm during our word recognition test with five speakers. We also propose a speech activity detection algorithm to automatically select speech segments from continuous input signals. Thus, speech recognition processing is performed only when speech segments are detected. Our testbed consists of commercial off-the-shelf radar products, and the proposed algorithms are readily applicable without designing specialized radar hardware for silent speech processing.

  8. Noise-robust speech triage.

    Science.gov (United States)

    Bartos, Anthony L; Cipr, Tomas; Nelson, Douglas J; Schwarz, Petr; Banowetz, John; Jerabek, Ladislav

    2018-04-01

    A method is presented in which conventional speech algorithms are applied, with no modifications, to improve their performance in extremely noisy environments. It has been demonstrated that, for eigen-channel algorithms, pre-training multiple speaker identification (SID) models at a lattice of signal-to-noise-ratio (SNR) levels and then performing SID using the appropriate SNR dependent model was successful in mitigating noise at all SNR levels. In those tests, it was found that SID performance was optimized when the SNR of the testing and training data were close or identical. In this current effort multiple i-vector algorithms were used, greatly improving both processing throughput and equal error rate classification accuracy. Using identical approaches in the same noisy environment, performance of SID, language identification, gender identification, and diarization were significantly improved. A critical factor in this improvement is speech activity detection (SAD) that performs reliably in extremely noisy environments, where the speech itself is barely audible. To optimize SAD operation at all SNR levels, two algorithms were employed. The first maximized detection probability at low levels (-10 dB ≤ SNR < +10 dB) using just the voiced speech envelope, and the second exploited features extracted from the original speech to improve overall accuracy at higher quality levels (SNR ≥ +10 dB).
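
    The two selection rules stated in this abstract, using the model trained closest to the test SNR and switching speech activity detectors at +10 dB, can be sketched as below. The SNR ranges come from the abstract; everything else is an assumption for illustration: the function names, the placeholder model labels, the string identifiers for the two detectors, and the choice to raise an error below -10 dB, which the abstract does not cover.

```python
def nearest_snr_level(trained_levels_db, estimated_snr_db):
    """Pick the training SNR level closest to the estimated test SNR."""
    return min(trained_levels_db,
               key=lambda level: abs(level - estimated_snr_db))


def select_sad_algorithm(snr_db):
    """Choose a speech activity detector from the SNR ranges in the abstract."""
    if -10.0 <= snr_db < 10.0:
        return "voiced_envelope"   # first algorithm: voiced speech envelope
    if snr_db >= 10.0:
        return "speech_features"   # second algorithm: features of the speech
    # Assumption: behaviour below -10 dB is not described, so fail loudly.
    raise ValueError("SNR below the range described in the abstract")


# Hypothetical lattice of pre-trained speaker-identification models.
models_by_snr = {-10: "sid_m10", 0: "sid_0", 10: "sid_p10", 20: "sid_p20"}
level = nearest_snr_level(models_by_snr.keys(), 7.5)
chosen_model = models_by_snr[level]
sad_low = select_sad_algorithm(-3.0)
sad_high = select_sad_algorithm(15.0)
```

    How the SNR itself is estimated is outside the scope of this sketch.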

  9. Language modeling for automatic speech recognition of inflective languages an applications-oriented approach using lexical data

    CERN Document Server

    Donaj, Gregor

    2017-01-01

    This book covers language modeling and automatic speech recognition for inflective languages (e.g. Slavic languages), which represent roughly half of the languages spoken in Europe. These languages do not perform as well as English in speech recognition systems and it is therefore harder to develop an application with sufficient quality for the end user. The authors describe the most important language features for the development of a speech recognition system. This is then presented through the analysis of errors in the system and the development of language models and their inclusion in speech recognition systems, which specifically address the errors that are relevant for targeted applications. The error analysis is done with regard to morphological characteristics of the word in the recognized sentences. The book is oriented towards speech recognition with large vocabularies and continuous and even spontaneous speech. Today such applications work with a rather small number of languages compared to the nu...

  10. Voice Activity Detection. Fundamentals and Speech Recognition System Robustness

    OpenAIRE

    Ramirez, J.; Gorriz, J. M.; Segura, J. C.

    2007-01-01

    This chapter has presented an overview of the main challenges in robust speech detection and a review of the state of the art and applications. VADs are frequently used in a number of applications including speech coding, speech enhancement and speech recognition. A precise VAD extracts a set of discriminative speech features from the noisy speech and formulates the decision in terms of a well-defined rule. The chapter has summarized three robust VAD methods that yield high speech/non-speech discri...

  11. Subcortical processing of speech regularities underlies reading and music aptitude in children

    Science.gov (United States)

    2011-01-01

    Background Neural sensitivity to acoustic regularities supports fundamental human behaviors such as hearing in noise and reading. Although the failure to encode acoustic regularities in ongoing speech has been associated with language and literacy deficits, how auditory expertise, such as the expertise that is associated with musical skill, relates to the brainstem processing of speech regularities is unknown. An association between musical skill and neural sensitivity to acoustic regularities would not be surprising given the importance of repetition and regularity in music. Here, we aimed to define relationships between the subcortical processing of speech regularities, music aptitude, and reading abilities in children with and without reading impairment. We hypothesized that, in combination with auditory cognitive abilities, neural sensitivity to regularities in ongoing speech provides a common biological mechanism underlying the development of music and reading abilities. Methods We assessed auditory working memory and attention, music aptitude, reading ability, and neural sensitivity to acoustic regularities in 42 school-aged children with a wide range of reading ability. Neural sensitivity to acoustic regularities was assessed by recording brainstem responses to the same speech sound presented in predictable and variable speech streams. Results Through correlation analyses and structural equation modeling, we reveal that music aptitude and literacy both relate to the extent of subcortical adaptation to regularities in ongoing speech as well as with auditory working memory and attention. Relationships between music and speech processing are specifically driven by performance on a musical rhythm task, underscoring the importance of rhythmic regularity for both language and music. Conclusions These data indicate common brain mechanisms underlying reading and music abilities that relate to how the nervous system responds to regularities in auditory input

  12. Subcortical processing of speech regularities underlies reading and music aptitude in children

    Directory of Open Access Journals (Sweden)

    Strait Dana L

    2011-10-01

    Full Text Available Abstract Background Neural sensitivity to acoustic regularities supports fundamental human behaviors such as hearing in noise and reading. Although the failure to encode acoustic regularities in ongoing speech has been associated with language and literacy deficits, how auditory expertise, such as the expertise that is associated with musical skill, relates to the brainstem processing of speech regularities is unknown. An association between musical skill and neural sensitivity to acoustic regularities would not be surprising given the importance of repetition and regularity in music. Here, we aimed to define relationships between the subcortical processing of speech regularities, music aptitude, and reading abilities in children with and without reading impairment. We hypothesized that, in combination with auditory cognitive abilities, neural sensitivity to regularities in ongoing speech provides a common biological mechanism underlying the development of music and reading abilities. Methods We assessed auditory working memory and attention, music aptitude, reading ability, and neural sensitivity to acoustic regularities in 42 school-aged children with a wide range of reading ability. Neural sensitivity to acoustic regularities was assessed by recording brainstem responses to the same speech sound presented in predictable and variable speech streams. Results Through correlation analyses and structural equation modeling, we reveal that music aptitude and literacy both relate to the extent of subcortical adaptation to regularities in ongoing speech as well as with auditory working memory and attention. Relationships between music and speech processing are specifically driven by performance on a musical rhythm task, underscoring the importance of rhythmic regularity for both language and music. Conclusions These data indicate common brain mechanisms underlying reading and music abilities that relate to how the nervous system responds to

  13. Subcortical processing of speech regularities underlies reading and music aptitude in children.

    Science.gov (United States)

    Strait, Dana L; Hornickel, Jane; Kraus, Nina

    2011-10-17

    Neural sensitivity to acoustic regularities supports fundamental human behaviors such as hearing in noise and reading. Although the failure to encode acoustic regularities in ongoing speech has been associated with language and literacy deficits, how auditory expertise, such as the expertise that is associated with musical skill, relates to the brainstem processing of speech regularities is unknown. An association between musical skill and neural sensitivity to acoustic regularities would not be surprising given the importance of repetition and regularity in music. Here, we aimed to define relationships between the subcortical processing of speech regularities, music aptitude, and reading abilities in children with and without reading impairment. We hypothesized that, in combination with auditory cognitive abilities, neural sensitivity to regularities in ongoing speech provides a common biological mechanism underlying the development of music and reading abilities. We assessed auditory working memory and attention, music aptitude, reading ability, and neural sensitivity to acoustic regularities in 42 school-aged children with a wide range of reading ability. Neural sensitivity to acoustic regularities was assessed by recording brainstem responses to the same speech sound presented in predictable and variable speech streams. Through correlation analyses and structural equation modeling, we reveal that music aptitude and literacy both relate to the extent of subcortical adaptation to regularities in ongoing speech as well as with auditory working memory and attention. Relationships between music and speech processing are specifically driven by performance on a musical rhythm task, underscoring the importance of rhythmic regularity for both language and music. These data indicate common brain mechanisms underlying reading and music abilities that relate to how the nervous system responds to regularities in auditory input. 
Definition of common biological underpinnings

  14. Influence of musical training on understanding voiced and whispered speech in noise.

    Science.gov (United States)

    Ruggles, Dorea R; Freyman, Richard L; Oxenham, Andrew J

    2014-01-01

    This study tested the hypothesis that the previously reported advantage of musicians over non-musicians in understanding speech in noise arises from more efficient or robust coding of periodic voiced speech, particularly in fluctuating backgrounds. Speech intelligibility was measured in listeners with extensive musical training, and in those with very little musical training or experience, using normal (voiced) or whispered (unvoiced) grammatically correct nonsense sentences in noise that was spectrally shaped to match the long-term spectrum of the speech, and was either continuous or gated with a 16-Hz square wave. Performance was also measured in clinical speech-in-noise tests and in pitch discrimination. Musicians exhibited enhanced pitch discrimination, as expected. However, no systematic or statistically significant advantage for musicians over non-musicians was found in understanding either voiced or whispered sentences in either continuous or gated noise. Musicians also showed no statistically significant advantage in the clinical speech-in-noise tests. Overall, the results provide no evidence for a significant difference between young adult musicians and non-musicians in their ability to understand speech in noise.

  15. Speech Inconsistency in Children with Childhood Apraxia of Speech, Language Impairment, and Speech Delay: Depends on the Stimuli

    Science.gov (United States)

    Iuzzini-Seigel, Jenya; Hogan, Tiffany P.; Green, Jordan R.

    2017-01-01

    Purpose: The current research sought to determine (a) if speech inconsistency is a core feature of childhood apraxia of speech (CAS) or if it is driven by comorbid language impairment that affects a large subset of children with CAS and (b) if speech inconsistency is a sensitive and specific diagnostic marker that can differentiate between CAS and…

  16. Variable Span Filters for Speech Enhancement

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Benesty, Jacob; Christensen, Mads Græsbøll

    2016-01-01

    In this work, we consider enhancement of multichannel speech recordings. Linear filtering and subspace approaches have been considered previously for solving the problem. The current linear filtering methods, although many variants exist, have limited control of noise reduction and speech...

  17. Represented Speech in Qualitative Health Research

    DEFF Research Database (Denmark)

    Musaeus, Peter

    2017-01-01

    Represented speech refers to speech where we reference somebody. Represented speech is an important phenomenon in everyday conversation, health care communication, and qualitative research. This case will draw first from a case study on physicians’ workplace learning and second from a case study...... on nurses’ apprenticeship learning. The aim of the case is to guide the qualitative researcher to use own and others’ voices in the interview and to be sensitive to represented speech in everyday conversation. Moreover, reported speech matters to health professionals who aim to represent the voice...... of their patients. Qualitative researchers and students might learn to encourage interviewees to elaborate different voices or perspectives. Qualitative researchers working with natural speech might pay attention to how people talk and use represented speech. Finally, represented speech might be relevant...

  18. Quick Statistics about Voice, Speech, and Language

    Science.gov (United States)

    Quick Statistics About Voice, Speech, Language ... no 205. Hyattsville, MD: National Center for Health Statistics. 2015. Hoffman HJ, Li C-M, Losonczy K, ...

  19. A NOVEL APPROACH TO STUTTERED SPEECH CORRECTION

    Directory of Open Access Journals (Sweden)

    Alim Sabur Ajibola

    2016-06-01

    Full Text Available Stuttered speech is a dysfluency-rich speech, more prevalent in males than females. It has been associated with insufficient air pressure or poor articulation, even though the root causes are more complex. The primary features include prolonged speech and repetitive speech, while some of its secondary features include anxiety, fear, and shame. This study used LPC analysis and synthesis algorithms to reconstruct the stuttered speech. The results were evaluated using cepstral distance, Itakura-Saito distance, mean square error, and likelihood ratio. These measures implied perfect speech reconstruction quality. ASR was used for further testing, and the results showed that all the reconstructed speech samples were perfectly recognized while only three samples of the original speech were perfectly recognized.

  20. Developmental language and speech disability.

    Science.gov (United States)

    Spiel, G; Brunner, E; Allmayer, B; Pletz, A

    2001-09-01

    Speech disabilities (articulation deficits) and language disorders, both expressive (vocabulary) and receptive (language comprehension), are not uncommon in children. An overview of these along with a global description of the impairment of communication as well as clinical characteristics of language developmental disorders are presented in this article. The diagnostic tables, which are applied in the European and Anglo-American speech areas, ICD-10 and DSM-IV, have been explained and compared. Because of their strengths and weaknesses an alternative classification of language and speech developmental disorders is proposed, which allows a differentiation between expressive and receptive language capabilities with regard to the semantic and the morphological/syntax domains. Prevalence and comorbidity rates, psychosocial influences, biological factors and the biological social interaction have been discussed. The necessity of the use of standardized examinations is emphasised. General logopaedic treatment paradigms, specific therapy concepts and an overview of prognosis have been described.

  1. The Rabbit Stream Cipher

    DEFF Research Database (Denmark)

    Boesgaard, Martin; Vesterager, Mette; Zenner, Erik

    2008-01-01

    The stream cipher Rabbit was first presented at FSE 2003, and no attacks against it have been published until now. With a measured encryption/decryption speed of 3.7 clock cycles per byte on a Pentium III processor, Rabbit does also provide very high performance. This paper gives a concise...... description of the Rabbit design and some of the cryptanalytic results available....

  2. Music Streaming in Denmark

    DEFF Research Database (Denmark)

    Pedersen, Rasmus Rex

    This report analyses how a ’per user’ settlement model differs from the ‘pro rata’ model currently used. The analysis is based on data for all streams by WiMP users in Denmark during August 2013. The analysis has been conducted in collaboration with Christian Schlelein from Koda on the basis of d...

  3. Academic streaming in Europe

    DEFF Research Database (Denmark)

    Falaschi, Alessandro; Mønster, Dan; Doležal, Ivan

    2004-01-01

    The TF-NETCAST task force was active from March 2003 to March 2004, and during this time the members worked on various aspects of streaming media related to the ultimate goal of setting up common services and infrastructures to enable netcasting of high quality content to the academic community...

  4. The pupil response is sensitive to divided attention during speech processing.

    Science.gov (United States)

    Koelewijn, Thomas; Shinn-Cunningham, Barbara G; Zekveld, Adriana A; Kramer, Sophia E

    2014-06-01

    Dividing attention over two streams of speech strongly decreases performance compared to focusing on only one. How divided attention affects cognitive processing load as indexed with pupillometry during speech recognition has so far not been investigated. In 12 young adults the pupil response was recorded while they focused on either one or both of two sentences that were presented dichotically and masked by fluctuating noise across a range of signal-to-noise ratios. In line with previous studies, the performance decreases when processing two target sentences instead of one. Additionally, dividing attention to process two sentences caused larger pupil dilation and later peak pupil latency than processing only one. This suggests an effect of attention on cognitive processing load (pupil dilation) during speech processing in noise. Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.

  5. Motor Speech Phenotypes of Frontotemporal Dementia, Primary Progressive Aphasia, and Progressive Apraxia of Speech

    Science.gov (United States)

    Poole, Matthew L.; Brodtmann, Amy; Darby, David; Vogel, Adam P.

    2017-01-01

    Purpose: Our purpose was to create a comprehensive review of speech impairment in frontotemporal dementia (FTD), primary progressive aphasia (PPA), and progressive apraxia of speech in order to identify the most effective measures for diagnosis and monitoring, and to elucidate associations between speech and neuroimaging. Method: Speech and…

  6. Visual context enhanced. The joint contribution of iconic gestures and visible speech to degraded speech comprehension.

    NARCIS (Netherlands)

    Drijvers, L.; Özyürek, A.

    2017-01-01

    Purpose: This study investigated whether and to what extent iconic co-speech gestures contribute to information from visible speech to enhance degraded speech comprehension at different levels of noise-vocoding. Previous studies of the contributions of these 2 visual articulators to speech

  7. Listeners Experience Linguistic Masking Release in Noise-Vocoded Speech-in-Speech Recognition

    Science.gov (United States)

    Viswanathan, Navin; Kokkinakis, Kostas; Williams, Brittany T.

    2018-01-01

    Purpose: The purpose of this study was to evaluate whether listeners with normal hearing perceiving noise-vocoded speech-in-speech demonstrate better intelligibility of target speech when the background speech was mismatched in language (linguistic release from masking [LRM]) and/or location (spatial release from masking [SRM]) relative to the…

  8. Predicting Speech Intelligibility with a Multiple Speech Subsystems Approach in Children with Cerebral Palsy

    Science.gov (United States)

    Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

    2014-01-01

    Purpose: Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method: Nine acoustic variables reflecting different subsystems, and…

  9. Visual Context Enhanced: The Joint Contribution of Iconic Gestures and Visible Speech to Degraded Speech Comprehension

    Science.gov (United States)

    Drijvers, Linda; Ozyurek, Asli

    2017-01-01

    Purpose: This study investigated whether and to what extent iconic co-speech gestures contribute to information from visible speech to enhance degraded speech comprehension at different levels of noise-vocoding. Previous studies of the contributions of these 2 visual articulators to speech comprehension have only been performed separately. Method:…

  10. An experimental Dutch keyboard-to-speech system for the speech impaired

    NARCIS (Netherlands)

    Deliege, R.J.H.

    1989-01-01

    An experimental Dutch keyboard-to-speech system has been developed to explore the possibilities and limitations of Dutch speech synthesis in a communication aid for the speech impaired. The system uses diphones and a formant synthesizer chip for speech synthesis. Input to the system is in

  11. Perceived Liveliness and Speech Comprehensibility in Aphasia: The Effects of Direct Speech in Auditory Narratives

    Science.gov (United States)

    Groenewold, Rimke; Bastiaanse, Roelien; Nickels, Lyndsey; Huiskes, Mike

    2014-01-01

    Background: Previous studies have shown that in semi-spontaneous speech, individuals with Broca's and anomic aphasia produce relatively many direct speech constructions. It has been claimed that in "healthy" communication direct speech constructions contribute to the liveliness, and indirectly to the comprehensibility, of speech.…

  12. Poor Speech Perception Is Not a Core Deficit of Childhood Apraxia of Speech: Preliminary Findings

    Science.gov (United States)

    Zuk, Jennifer; Iuzzini-Seigel, Jenya; Cabbage, Kathryn; Green, Jordan R.; Hogan, Tiffany P.

    2018-01-01

    Purpose: Childhood apraxia of speech (CAS) is hypothesized to arise from deficits in speech motor planning and programming, but the influence of abnormal speech perception in CAS on these processes is debated. This study examined speech perception abilities among children with CAS with and without language impairment compared to those with…

  13. The treatment of apraxia of speech : Speech and music therapy, an innovative joint effort

    NARCIS (Netherlands)

    Hurkmans, Josephus Johannes Stephanus

    2016-01-01

    Apraxia of Speech (AoS) is a neurogenic speech disorder. A wide variety of behavioural methods have been developed to treat AoS. Various therapy programmes use musical elements to improve speech production. A unique therapy programme combining elements of speech therapy and music therapy is called

  14. Speech Perception and Short-Term Memory Deficits in Persistent Developmental Speech Disorder

    Science.gov (United States)

    Kenney, Mary Kay; Barac-Cikoja, Dragana; Finnegan, Kimberly; Jeffries, Neal; Ludlow, Christy L.

    2006-01-01

    Children with developmental speech disorders may have additional deficits in speech perception and/or short-term memory. To determine whether these are only transient developmental delays that can accompany the disorder in childhood or persist as part of the speech disorder, adults with a persistent familial speech disorder were tested on speech…

  15. Switch of flow direction in an Antarctic ice stream.

    Science.gov (United States)

    Conway, H; Catania, G; Raymond, C F; Gades, A M; Scambos, T A; Engelhardt, H

    2002-10-03

    Fast-flowing ice streams transport ice from the interior of West Antarctica to the ocean, and fluctuations in their activity control the mass balance of the ice sheet. The mass balance of the Ross Sea sector of the West Antarctic ice sheet is now positive--that is, it is growing--mainly because one of the ice streams (ice stream C) slowed down about 150 years ago. Here we present evidence from both surface measurements and remote sensing that demonstrates the highly dynamic nature of the Ross drainage system. We show that the flow in an area that once discharged into ice stream C has changed direction, now draining into the Whillans ice stream (formerly ice stream B). This switch in flow direction is a result of continuing thinning of the Whillans ice stream and recent thickening of ice stream C. Further abrupt reorganization of the activity and configuration of the ice streams over short timescales is to be expected in the future as the surface topography of the ice sheet responds to the combined effects of internal dynamics and long-term climate change. We suggest that caution is needed when using observations of short-term mass changes to draw conclusions about the large-scale mass balance of the ice sheet.

  16. Sound stream segregation: a neuromorphic approach to solve the "cocktail party problem" in real-time.

    Science.gov (United States)

    Thakur, Chetan Singh; Wang, Runchun M; Afshar, Saeed; Hamilton, Tara J; Tapson, Jonathan C; Shamma, Shihab A; van Schaik, André

    2015-01-01

    The human auditory system has the ability to segregate complex auditory scenes into a foreground component and a background, allowing us to listen to specific speech sounds from a mixture of sounds. Selective attention plays a crucial role in this process, colloquially known as the "cocktail party effect." It has not been possible to build a machine that can emulate this human ability in real-time. Here, we have developed a framework for the implementation of a neuromorphic sound segregation algorithm in a Field Programmable Gate Array (FPGA). This algorithm is based on the principles of temporal coherence and uses an attention signal to separate a target sound stream from background noise. Temporal coherence implies that auditory features belonging to the same sound source are coherently modulated and evoke highly correlated neural response patterns. The basis for this form of sound segregation is that responses from pairs of channels that are strongly positively correlated belong to the same stream, while channels that are uncorrelated or anti-correlated belong to different streams. In our framework, we have used a neuromorphic cochlea as a frontend sound analyser to extract spatial information of the sound input, which then passes through band pass filters that extract the sound envelope at various modulation rates. Further stages include feature extraction and mask generation, which is finally used to reconstruct the targeted sound. Using sample tonal and speech mixtures, we show that our FPGA architecture is able to segregate sound sources in real-time. The accuracy of segregation is indicated by the high signal-to-noise ratio (SNR) of the segregated stream (90, 77, and 55 dB for simple tone, complex tone, and speech, respectively) as compared to the SNR of the mixture waveform (0 dB). This system may be easily extended for the segregation of complex speech signals, and may thus find various applications in electronic devices such as for sound segregation and
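
    As a rough illustration of the temporal coherence principle described in this abstract, the Python sketch below groups frequency channels by how strongly their envelopes correlate with an attended channel. The function names (`pearson`, `coherence_mask`), the 0.5 threshold, and the toy envelopes are assumptions made for illustration; this is not the authors' FPGA implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length envelope sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat envelope carries no modulation to compare
    return cov / math.sqrt(var_x * var_y)


def coherence_mask(envelopes, anchor, threshold=0.5):
    """Binary mask: 1 for channels coherent with the attended (anchor) channel.

    The threshold value is a hypothetical choice, not taken from the paper.
    """
    reference = envelopes[anchor]
    return [1 if pearson(env, reference) > threshold else 0 for env in envelopes]


# Toy mixture: channels 0-1 share a 4-cycle modulation, channels 2-3 a 7-cycle one.
n_samples = 200
slow = [1 + math.sin(2 * math.pi * 4 * k / n_samples) for k in range(n_samples)]
fast = [1 + math.sin(2 * math.pi * 7 * k / n_samples) for k in range(n_samples)]
envelopes = [slow, [0.5 * v for v in slow], fast, [2.0 * v for v in fast]]

mask = coherence_mask(envelopes, anchor=0)
print(mask)
```

    Attending to channel 0 keeps the channels that share its modulation and rejects the others; a real system would apply such a mask to the cochlear channel outputs before resynthesis.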

  17. Common neural substrates support speech and non-speech vocal tract gestures

    OpenAIRE

    Chang, Soo-Eun; Kenney, Mary Kay; Loucks, Torrey M.J.; Poletto, Christopher J.; Ludlow, Christy L.

    2009-01-01

    The issue of whether speech is supported by the same neural substrates as non-speech vocal-tract gestures has been contentious. In this fMRI study we tested whether producing non-speech vocal tract gestures in humans shares the same functional neuroanatomy as non-sense speech syllables. Production of non-speech vocal tract gestures, devoid of phonological content but similar to speech in that they had familiar acoustic and somatosensory targets, were compared to the production of speech sylla...

  18. Data Stream Classification Based on the Gamma Classifier

    Directory of Open Access Journals (Sweden)

    Abril Valeria Uriarte-Arcia

    2015-01-01

    The ever-increasing generation of data confronts us with the problem of handling massive amounts of information online. One of the biggest challenges is how to extract valuable information from these massive continuous data streams in a single scan. In a data stream context, data arrive continuously at high speed; therefore the algorithms developed to address this context must be efficient regarding memory and time management and capable of detecting changes over time in the underlying distribution that generated the data. This work describes a novel method for the task of pattern classification over a continuous data stream based on an associative model. The proposed method is based on the Gamma classifier, which is inspired by the Alpha-Beta associative memories, which are both supervised pattern recognition models. The proposed method is capable of handling the space and time constraints inherent to data stream scenarios. The Data Streaming Gamma classifier (DS-Gamma classifier) implements a sliding window approach to provide concept drift detection and a forgetting mechanism. In order to test the classifier, several experiments were performed using different data stream scenarios with real and synthetic data streams. The experimental results show that the method exhibits competitive performance when compared to other state-of-the-art algorithms.
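
    The sliding window idea mentioned in this abstract can be sketched in a few lines of Python. The sketch below deliberately swaps in a plain nearest-centroid rule in place of the Gamma classifier, whose details the abstract does not give; the class name `SlidingWindowClassifier` and the window size are hypothetical. It only shows how a bounded window bounds memory and forgets outdated examples.

```python
from collections import deque


class SlidingWindowClassifier:
    """Nearest-centroid classifier over the most recent labelled samples.

    Stand-in for illustration only; this is NOT the DS-Gamma classifier.
    """

    def __init__(self, window_size):
        # deque(maxlen=...) silently drops the oldest sample when full,
        # which acts as the forgetting mechanism.
        self.window = deque(maxlen=window_size)

    def learn(self, features, label):
        self.window.append((list(features), label))

    def predict(self, features):
        if not self.window:
            return None
        sums, counts = {}, {}
        for sample, label in self.window:
            acc = sums.setdefault(label, [0.0] * len(sample))
            for i, value in enumerate(sample):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        best_label, best_dist = None, float("inf")
        for label, acc in sums.items():
            centroid = [value / counts[label] for value in acc]
            dist = sum((a - b) ** 2 for a, b in zip(features, centroid))
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label


clf = SlidingWindowClassifier(window_size=4)
for x, y in [([0.0], "a"), ([10.0], "b"), ([1.0], "a"), ([9.0], "b")]:
    clf.learn(x, y)
before_drift = clf.predict([0.5])

# Concept drift: the two classes swap regions; old samples fall out of the window.
for x, y in [([10.0], "a"), ([0.0], "b"), ([9.0], "a"), ([1.0], "b")]:
    clf.learn(x, y)
after_drift = clf.predict([0.5])
print(before_drift, after_drift)
```

    Because the window holds only four samples, the prediction for the same input changes once the stream's underlying distribution changes, without any explicit retraining step.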

  19. Multimicrophone Speech Dereverberation: Experimental Validation

    Directory of Open Access Journals (Sweden)

    Marc Moonen

    2007-05-01

    Dereverberation is required in various speech processing applications such as hands-free telephony and voice-controlled systems, especially when the signals are recorded in a moderately or highly reverberant environment. In this paper, we compare a number of classical and more recently developed multimicrophone dereverberation algorithms, and validate the different algorithmic settings by means of two performance indices and a speech recognition system. It is found that some of the classical solutions obtain a moderate signal enhancement. More advanced subspace-based dereverberation techniques, on the other hand, fail to enhance the signals despite their high computational load.

  20. Discriminative learning for speech recognition

    CERN Document Server

    He, Xiadong

    2008-01-01

    In this book, we introduce the background and mainstream methods of probabilistic modeling and discriminative parameter optimization for speech recognition. The specific models treated in depth include the widely used exponential-family distributions and the hidden Markov model. A detailed study is presented on unifying the common objective functions for discriminative learning in speech recognition, namely maximum mutual information (MMI), minimum classification error, and minimum phone/word error. The unification is presented, with rigorous mathematical analysis, in a common rational-functio

  1. Dynamic encoding of speech sequence probability in human temporal cortex.

    Science.gov (United States)

    Leonard, Matthew K; Bouchard, Kristofer E; Tang, Claire; Chang, Edward F

    2015-05-06

    Sensory processing involves identification of stimulus features, but also integration with the surrounding sensory and cognitive context. Previous work in animals and humans has shown fine-scale sensitivity to context in the form of learned knowledge about the statistics of the sensory environment, including relative probabilities of discrete units in a stream of sequential auditory input. These statistics are a defining characteristic of one of the most important sequential signals humans encounter: speech. For speech, extensive exposure to a language tunes listeners to the statistics of sound sequences. To address how speech sequence statistics are neurally encoded, we used high-resolution direct cortical recordings from human lateral superior temporal cortex as subjects listened to words and nonwords with varying transition probabilities between sound segments. In addition to their sensitivity to acoustic features (including contextual features, such as coarticulation), we found that neural responses dynamically encoded the language-level probability of both preceding and upcoming speech sounds. Transition probability first negatively modulated neural responses, followed by positive modulation of neural responses, consistent with coordinated predictive and retrospective recognition processes, respectively. Furthermore, transition probability encoding was different for real English words compared with nonwords, providing evidence for online interactions with high-order linguistic knowledge. These results demonstrate that sensory processing of deeply learned stimuli involves integrating physical stimulus features with their contextual sequential structure. Despite not being consciously aware of phoneme sequence statistics, listeners use this information to process spoken input and to link low-level acoustic representations with linguistic information about word identity and meaning. Copyright © 2015 the authors.
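
    To make the notion of transition probability between sound segments concrete, the following Python sketch estimates forward probabilities P(next | previous) from a tiny, made-up corpus of phoneme sequences. The corpus, the phoneme labels, and the function name `transition_probabilities` are assumptions for illustration; the study's actual probability estimates came from language-level statistics not reproduced here.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate P(next segment | previous segment) from adjacent pairs."""
    pair_counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            pair_counts[prev][nxt] += 1
    probs = {}
    for prev, counter in pair_counts.items():
        total = sum(counter.values())
        probs[prev] = {nxt: count / total for nxt, count in counter.items()}
    return probs


# Hypothetical toy corpus of phoneme sequences (not data from the study).
corpus = [
    ["k", "ae", "t"],
    ["k", "ae", "p"],
    ["k", "ih", "t"],
    ["b", "ae", "t"],
]
tp = transition_probabilities(corpus)
print(tp["k"])
```

    Reversing each sequence before counting would give the backward probability P(previous | next), which is one way to express the distinction the abstract draws between upcoming and preceding sounds.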

  2. Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

    International Nuclear Information System (INIS)

    Holzrichter, J.F.; Ng, L.C.

    1998-01-01

    The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function for each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs

  3. Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

    Science.gov (United States)

    Holzrichter, John F.; Ng, Lawrence C.

    1998-01-01

    The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function for each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching.

  4. Relationship between Speech Intelligibility and Speech Comprehension in Babble Noise

    Science.gov (United States)

    Fontan, Lionel; Tardieu, Julien; Gaillard, Pascal; Woisard, Virginie; Ruiz, Robert

    2015-01-01

    Purpose: The authors investigated the relationship between the intelligibility and comprehension of speech presented in babble noise. Method: Forty participants listened to French imperative sentences (commands for moving objects) in a multitalker babble background for which intensity was experimentally controlled. Participants were instructed to…

  5. Spectral integration in speech and non-speech sounds

    Science.gov (United States)

    Jacewicz, Ewa

    2005-04-01

    Spectral integration (or formant averaging) was proposed in vowel perception research to account for the observation that a reduction of the intensity of one of two closely spaced formants (as in /u/) produced a predictable shift in vowel quality [Delattre et al., Word 8, 195-210 (1952)]. A related observation was reported in psychoacoustics, indicating that when the components of a two-tone periodic complex differ in amplitude and frequency, its perceived pitch is shifted toward that of the more intense tone [Helmholtz, App. XIV (1875/1948)]. Subsequent research in both fields focused on the frequency interval that separates these two spectral components, in an attempt to determine the size of the bandwidth for spectral integration to occur. This talk will review the accumulated evidence for and against spectral integration within the hypothesized limit of 3.5 Bark for static and dynamic signals in speech perception and psychoacoustics. Based on similarities in the processing of speech and non-speech sounds, it is suggested that spectral integration may reflect a general property of the auditory system. A larger frequency bandwidth, possibly close to 3.5 Bark, may be utilized in integrating acoustic information, including speech, complex signals, or sound quality of a violin.

  6. Intelligibility of synthetic speech in the presence of interfering speech

    NARCIS (Netherlands)

    Eggen, J.H.

    1989-01-01

    Standard articulation tests are not always sensitive enough to discriminate between speech samples which are of high intelligibility. One can increase the sensitivity of such tests by presenting the test materials in noise. In this way, small differences in intelligibility can be magnified into

  7. Multimedia with a speech track: searching spontaneous conversational speech

    NARCIS (Netherlands)

    Larson, Martha; Ordelman, Roeland J.F.; de Jong, Franciska M.G.; Kohler, Joachim; Kraaij, Wessel

    After two successful years at SIGIR in 2007 and 2008, the third workshop on Searching Spontaneous Conversational Speech (SSCS 2009) was held in conjunction with ACM Multimedia 2009. The goal of the SSCS series is to serve as a forum that brings together the disciplines that collaborate on spoken

  8. SPEECH ACT ANALYSIS: HOSNI MUBARAK'S SPEECHES IN PRE ...

    African Journals Online (AJOL)

    enerco

    from movements of certain organs with his (man's) throat and mouth…. By means ... In other words, government engages language; and how this affects the ... address the audience in a social gathering in order to have a new dawn. ..... Agbedo, C. U. Speech Act Analysis of Political discourse in the Nigerian Print Media in.

  9. Cognitive Functions in Childhood Apraxia of Speech

    Science.gov (United States)

    Nijland, Lian; Terband, Hayo; Maassen, Ben

    2015-01-01

    Purpose: Childhood apraxia of speech (CAS) is diagnosed on the basis of specific speech characteristics, in the absence of problems in hearing, intelligence, and language comprehension. This does not preclude the possibility that children with this speech disorder might demonstrate additional problems. Method: Cognitive functions were investigated…

  10. Phonetic recalibration of speech by text

    NARCIS (Netherlands)

    Keetels, M.N.; Schakel, L.; de Bonte, M.; Vroomen, J.

    2016-01-01

    Listeners adjust their phonetic categories to cope with variations in the speech signal (phonetic recalibration). Previous studies have shown that lipread speech (and word knowledge) can adjust the perception of ambiguous speech and can induce phonetic adjustments (Bertelson, Vroomen, & de Gelder in

  11. Speech and Debate as Civic Education

    Science.gov (United States)

    Hogan, J. Michael; Kurr, Jeffrey A.; Johnson, Jeremy D.; Bergmaier, Michael J.

    2016-01-01

    In light of the U.S. Senate's designation of March 15, 2016 as "National Speech and Debate Education Day" (S. Res. 398, 2016), it only seems fitting that "Communication Education" devote a special section to the role of speech and debate in civic education. Speech and debate have been at the heart of the communication…

  12. Speech Synthesis Applied to Language Teaching.

    Science.gov (United States)

    Sherwood, Bruce

    1981-01-01

    The experimental addition of speech output to computer-based Esperanto lessons using speech synthesized from text is described. Because of Esperanto's phonetic spelling and simple rhythm, it is particularly easy to describe the mechanisms of Esperanto synthesis. Attention is directed to how the text-to-speech conversion is performed and the ways…

  13. Epoch-based analysis of speech signals

    Indian Academy of Sciences (India)

    on speech production characteristics, but also helps in accurate analysis of speech. .... include time delay estimation, speech enhancement from single and multi- ...... $\log\left( E[k] \Big/ \sum_{l=0}^{K-1} E[l] \right)$, (7) where K is the number of samples in the ...

  14. Normal Aspects of Speech, Hearing, and Language.

    Science.gov (United States)

    Minifie, Fred. D., Ed.; And Others

    This book is written as a guide to the understanding of the processes involved in human speech communication. Ten authorities contributed material to provide an introduction to the physiological aspects of speech production and reception, the acoustical aspects of speech production and transmission, the psychophysics of sound reception, the nature…

  15. Audiovisual Asynchrony Detection in Human Speech

    Science.gov (United States)

    Maier, Joost X.; Di Luca, Massimiliano; Noppeney, Uta

    2011-01-01

    Combining information from the visual and auditory senses can greatly enhance intelligibility of natural speech. Integration of audiovisual speech signals is robust even when temporal offsets are present between the component signals. In the present study, we characterized the temporal integration window for speech and nonspeech stimuli with…

  16. The interpersonal level in English: reported speech

    NARCIS (Netherlands)

    Keizer, E.

    2009-01-01

    The aim of this article is to describe and classify a number of different forms of English reported speech (or thought), and subsequently to analyze and represent them within the theory of FDG. First, the most prototypical forms of reported speech are discussed (direct and indirect speech);

  17. Cognitive functions in Childhood Apraxia of Speech

    NARCIS (Netherlands)

    Nijland, L.; Terband, H.; Maassen, B.

    2015-01-01

    Purpose: Childhood Apraxia of Speech (CAS) is diagnosed on the basis of specific speech characteristics, in the absence of problems in hearing, intelligence, and language comprehension. This does not preclude the possibility that children with this speech disorder might demonstrate additional

  18. Regulation of speech in multicultural societies

    NARCIS (Netherlands)

    Maussen, M.; Grillo, R.

    2015-01-01

    This book focuses on the way in which public debate and legal practice intersect when it comes to the value of free speech and the need to regulate "offensive", "blasphemous" or "hate" speech, especially, though not exclusively where such speech is thought to be offensive to members of ethnic and

  19. Theoretical Value in Teaching Freedom of Speech.

    Science.gov (United States)

    Carney, John J., Jr.

    The exercise of freedom of speech within our nation has deteriorated. A practical value in teaching free speech is the possibility of restoring a commitment to its principles by educators. What must be taught is why freedom of speech is important, why it has been compromised, and the extent to which it has been compromised. Every technological…

  20. Interventions for Speech Sound Disorders in Children

    Science.gov (United States)

    Williams, A. Lynn, Ed.; McLeod, Sharynne, Ed.; McCauley, Rebecca J., Ed.

    2010-01-01

    With detailed discussion and invaluable video footage of 23 treatment interventions for speech sound disorders (SSDs) in children, this textbook and DVD set should be part of every speech-language pathologist's professional preparation. Focusing on children with functional or motor-based speech disorders from early childhood through the early…

  1. Application of wavelets in speech processing

    CERN Document Server

    Farouk, Mohamed Hesham

    2014-01-01

    This book surveys the widespread use of wavelet analysis in different applications of speech processing. The author examines development and research in different applications of speech processing. The book also summarizes the state-of-the-art research on wavelets in speech processing.

  2. Development and Disorders of Speech in Childhood.

    Science.gov (United States)

    Karlin, Isaac W.; And Others

    The growth, development, and abnormalities of speech in childhood are described in this text designed for pediatricians, psychologists, educators, medical students, therapists, pathologists, and parents. The normal development of speech and language is discussed, including theories on the origin of speech in man and factors influencing the normal…

  3. Boundary-making in the public sphere: Contestations of free speech

    OpenAIRE

    Midtbøen, Arnfinn Haagensen; Steen-Johnsen, Kari; Thorbjørnsrud, Kjersti

    2017-01-01

    Freedom of speech is a fundamental human right and considered a core value in liberal democracies. However, it is also one of our time’s most contested issues, constantly claimed either to be too wide-ranging, allowing continuous repression of minority groups, or too limited – restricting dissent and democratic deliberation. In this book we depart from conventional approaches to free speech, which tend to focus on whether specific types of public talk should be considered legally allowed or n...

  4. Re-meandering of lowland streams: will disobeying the laws of geomorphology have ecological consequences?

    Science.gov (United States)

    Pedersen, Morten Lauge; Kristensen, Klaus Kevin; Friberg, Nikolai

    2014-01-01

    We evaluated the restoration of physical habitats and its influence on macroinvertebrate community structure in 18 Danish lowland streams comprising six restored streams, six streams with little physical alteration and six channelized streams. We hypothesized that physical habitats and macroinvertebrate communities of restored streams would resemble those of natural streams, while those of the channelized streams would differ from both restored and near-natural streams. Physical habitats were surveyed for substrate composition, depth, width and current velocity. Macroinvertebrates were sampled along 100 m reaches in each stream, in edge habitats and in riffle/run habitats located in the center of the stream. Restoration significantly altered the physical conditions and affected the interactions between stream habitat heterogeneity and macroinvertebrate diversity. The substrate in the restored streams was dominated by pebble, whereas the substrate in the channelized and natural streams was dominated by sand. In the natural streams a relationship was identified between slope and pebble/gravel coverage, indicating a coupling of energy and substrate characteristics. Such a relationship did not occur in the channelized or in the restored streams where placement of large amounts of pebble/gravel distorted the natural relationship. The analyses revealed a direct link between substrate heterogeneity and macroinvertebrate diversity in the natural streams. A similar relationship was not found in either the channelized or the restored streams, which we attribute to a de-coupling of the natural relationship between benthic community diversity and physical habitat diversity. Our study results suggest that restoration schemes should aim at restoring the natural physical structural complexity in the streams and at the same time enhance the possibility of re-generating the natural geomorphological processes sustaining the habitats in streams and rivers. Documentation of

  5. Analysis of hydraulic characteristics for stream diversion in small stream

    Energy Technology Data Exchange (ETDEWEB)

    Ahn, Sang-Jin; Jun, Kye-Won [Chungbuk National University, Cheongju(Korea)

    2001-10-31

    This study analyzes the hydraulic characteristics of a stream diversion reach using numerical model tests. The results provide basic data for flood conditions and for understanding stream flow characteristics. The hydraulic characteristics of Seoknam stream were analyzed using the computer models HEC-RAS (a one-dimensional model) and RMA2 (a two-dimensional finite element model). We found that RMA2, which simulates the left side, main channel, and right side of the stream, is more effective than HEC-RAS for analyzing flow where channel bends, steep slopes, and complex bed forms affect stream flow characteristics. (author). 13 refs., 3 tabs., 5 figs.

  6. A wireless brain-machine interface for real-time speech synthesis.

    Directory of Open Access Journals (Sweden)

    Frank H Guenther

    2009-12-01

    Brain-machine interfaces (BMIs) involving electrodes implanted into the human cerebral cortex have recently been developed in an attempt to restore function to profoundly paralyzed individuals. Current BMIs for restoring communication can provide important capabilities via a typing process, but unfortunately they are only capable of slow communication rates. In the current study we use a novel approach to speech restoration in which we decode continuous auditory parameters for a real-time speech synthesizer from neuronal activity in motor cortex during attempted speech. Neural signals recorded by a Neurotrophic Electrode implanted in a speech-related region of the left precentral gyrus of a human volunteer suffering from locked-in syndrome, characterized by near-total paralysis with spared cognition, were transmitted wirelessly across the scalp and used to drive a speech synthesizer. A Kalman filter-based decoder translated the neural signals generated during attempted speech into continuous parameters for controlling a synthesizer that provided immediate (within 50 ms) auditory feedback of the decoded sound. Accuracy of the volunteer's vowel productions with the synthesizer improved quickly with practice, with a 25% improvement in average hit rate (from 45% to 70%) and 46% decrease in average endpoint error from the first to the last block of a three-vowel task. Our results support the feasibility of neural prostheses that may have the potential to provide near-conversational synthetic speech output for individuals with severely impaired speech motor control. They also provide an initial glimpse into the functional properties of neurons in speech motor cortical areas.
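
    For readers unfamiliar with Kalman filtering, the Python sketch below shows the predict/update cycle on a single scalar state. It is a deliberately simplified, assumed model: the state stands in for one continuous synthesizer parameter and the observation for one noisy neural measurement, and the parameter values (`a`, `h`, `q`, `r`) are hypothetical. The decoder in the study handled multiple neural units and parameters, which this sketch does not attempt to reproduce.

```python
def kalman_step(x, p, z, a=1.0, h=1.0, q=1e-3, r=0.1):
    """One predict/update cycle of a scalar Kalman filter.

    x, p : previous state estimate and its variance
    z    : new observation
    a, h : state-transition and observation gains (assumed values)
    q, r : process and observation noise variances (assumed values)
    """
    # Predict: propagate the estimate and its uncertainty one step forward.
    x_pred = a * x
    p_pred = a * p * a + q
    # Update: blend prediction and observation using the Kalman gain.
    gain = p_pred * h / (h * p_pred * h + r)
    x_new = x_pred + gain * (z - h * x_pred)
    p_new = (1.0 - gain * h) * p_pred
    return x_new, p_new


# Toy observations scattered around a true value of 1.0 (made-up numbers).
observations = [1.2, 0.8, 1.1, 0.9, 1.0] * 10

estimate, variance = 0.0, 1.0  # vague initial guess
for z in observations:
    estimate, variance = kalman_step(estimate, variance, z)
print(round(estimate, 3), round(variance, 4))
```

    Each step costs only a handful of arithmetic operations, which is one reason filters of this family suit low-latency feedback loops such as the within-50-ms auditory feedback described above.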

  7. Lifelong Augmentation of Multimodal Streaming Autobiographical Memories

    OpenAIRE

    Petit, Maxime; Fischer, Tobias; Demiris, Yiannis

    2016-01-01

    Robot systems that interact with humans over extended periods of time will benefit from storing and recalling large amounts of accumulated sensorimotor and interaction data. We provide a principled framework for the cumulative organisation of streaming autobiographical data so that data can be continuously processed and augmented as the processing and reasoning abilities of the agent develop and further interactions with humans take place. As an example, we show how a kinematic structure lear...

  8. Real time speech formant analyzer and display

    Science.gov (United States)

    Holland, George E.; Struve, Walter S.; Homer, John F.

    1987-01-01

    A speech analyzer for interpretation of sound includes a sound input which converts the sound into a signal representing the sound. The signal is passed through a plurality of frequency pass filters to derive a plurality of frequency formants. These formants are converted to voltage signals by frequency-to-voltage converters and then are prepared for visual display in continuous real time. Parameters from the inputted sound are also derived and displayed. The display may then be interpreted by the user. The preferred embodiment includes a microprocessor which is interfaced with a television set for displaying of the sound formants. The microprocessor software enables the sound analyzer to present a variety of display modes for interpretive and therapeutic use by the user.

  9. Neural Tuning to Low-Level Features of Speech throughout the Perisylvian Cortex.

    Science.gov (United States)

    Berezutskaya, Julia; Freudenburg, Zachary V; Güçlü, Umut; van Gerven, Marcel A J; Ramsey, Nick F

    2017-08-16

    Despite a large body of research, we continue to lack a detailed account of how auditory processing of continuous speech unfolds in the human brain. Previous research showed the propagation of low-level acoustic features of speech from posterior superior temporal gyrus toward anterior superior temporal gyrus in the human brain (Hullett et al., 2016). In this study, we investigate what happens to these neural representations past the superior temporal gyrus and how they engage higher-level language processing areas such as inferior frontal gyrus. We used low-level sound features to model neural responses to speech outside of the primary auditory cortex. Two complementary imaging techniques were used with human participants (both males and females): electrocorticography (ECoG) and fMRI. Both imaging techniques showed tuning of the perisylvian cortex to low-level speech features. With ECoG, we found evidence of propagation of the temporal features of speech sounds along the ventral pathway of language processing in the brain toward inferior frontal gyrus. Increasingly coarse temporal features of speech spreading from posterior superior temporal cortex toward inferior frontal gyrus were associated with linguistic features such as voice onset time, duration of the formant transitions, and phoneme, syllable, and word boundaries. The present findings provide the groundwork for a comprehensive bottom-up account of speech comprehension in the human brain. SIGNIFICANCE STATEMENT We know that, during natural speech comprehension, a broad network of perisylvian cortical regions is involved in sound and language processing. Here, we investigated the tuning to low-level sound features within these regions using neural responses to a short feature film. We also looked at whether the tuning organization along these brain regions showed any parallel to the hierarchy of language structures in continuous speech. 
Our results show that low-level speech features propagate throughout the

  10. The Students Experiences With Live Video-Streamed Teaching Classes

    DEFF Research Database (Denmark)

    Jelsbak, Vibe Alopaeus; Ørngreen, Rikke; Buus, Lillian

    2017-01-01

    The Bachelor's Degree Programme of Biomedical Laboratory Science at VIA Faculty of Health Sciences offers a combination of live video-streamed and traditional teaching. It is the student’s individual choice whether to attend classes on-site or to attend classes from home via live video-stream. Our...... previous studies revealed that the live-streamed sessions compared to on-site teaching reduced interaction and dialogue between attendants, and that the main reasons were technological issues and the teacher’s choice of teaching methods. One of our goals therefore became to develop methods and implement...... transparency in the live video-streamed teaching sessions during a 5-year period of continuous development of technological and pedagogical solutions for live-streamed teaching. Data describing student’s experiences were gathered in a longitudinal study of four sessions from 2012 to 2017 using a qualitative...

  11. Fast Monaural Separation of Speech

    DEFF Research Database (Denmark)

    Pontoppidan, Niels Henrik; Dyrholm, Mads

    2003-01-01

    a Factorial Hidden Markov Model, with non-stationary assumptions on the source autocorrelations modelled through the Factorial Hidden Markov Model, leads to separation in the monaural case. By extending Hansen's work we find that Roweis' assumptions are necessary for monaural speech separation. Furthermore we...

  12. Why Go to Speech Therapy?

    Science.gov (United States)

    ... for stuttering to change over time or for emotions and attitudes about your speech to change as you have new experiences. It is important for you to have a clear idea about your motivation for going to therapy because your reasons for ...

  13. Paraconsistent semantics of speech acts

    NARCIS (Netherlands)

    Dunin-Kȩplicz, Barbara; Strachocka, Alina; Szałas, Andrzej; Verbrugge, Rineke

    2015-01-01

    This paper discusses an implementation of four speech acts: assert, concede, request and challenge in a paraconsistent framework. A natural four-valued model of interaction yields multiple new cognitive situations. They are analyzed in the context of communicative relations, which partially replace

  14. The DNA of prophetic speech

    African Journals Online (AJOL)

    2014-03-04

    Mar 4, 2014 ... In reflecting on possible responses to this ... Through the actions of a prophet, as Philip Wogamen (1998:4) reasons, people are supposed to have a ... The main argument in this article is that the person called to prophetic speech needs to become ..... were like dumb bricks and blocks to be forcefully moved.

  15. Prosodic Contrasts in Ironic Speech

    Science.gov (United States)

    Bryant, Gregory A.

    2010-01-01

    Prosodic features in spontaneous speech help disambiguate implied meaning not explicit in linguistic surface structure, but little research has examined how these signals manifest themselves in real conversations. Spontaneously produced verbal irony utterances generated between familiar speakers in conversational dyads were acoustically analyzed…

  16. The DNA of prophetic speech

    African Journals Online (AJOL)

    2014-03-04

    Mar 4, 2014 ... It is expected that people will be drawn into the reality of God by authentic prophetic speech, .... strands of the DNA molecule show themselves to be arranged ... explains, chemical patterns act like the letters of a code, .... viewing the self-reflection regarding the ministry of renewal from the .... Irresistible force.

  17. Continuous Problem of Function Continuity

    Science.gov (United States)

    Jayakody, Gaya; Zazkis, Rina

    2015-01-01

    We examine different definitions presented in textbooks and other mathematical sources for "continuity of a function at a point" and "continuous function" in the context of introductory level Calculus. We then identify problematic issues related to definitions of continuity and discontinuity: inconsistency and absence of…

  18. Audiovisual Speech Synchrony Measure: Application to Biometrics

    Directory of Open Access Journals (Sweden)

    Gérard Chollet

    2007-01-01

    Speech is a means of communication which is intrinsically bimodal: the audio signal originates from the dynamics of the articulators. This paper reviews recent works in the field of audiovisual speech, and more specifically techniques developed to measure the level of correspondence between audio and visual speech. It overviews the most common audio and visual speech front-end processing, transformations performed on audio, visual, or joint audiovisual feature spaces, and the actual measure of correspondence between audio and visual speech. Finally, the use of a synchrony measure for biometric identity verification based on talking faces is evaluated on the BANCA database.

  19. The motor theory of speech perception revisited.

    Science.gov (United States)

    Massaro, Dominic W; Chen, Trevor H

    2008-04-01

    Galantucci, Fowler, and Turvey (2006) have claimed that perceiving speech is perceiving gestures and that the motor system is recruited for perceiving speech. We make the counter argument that perceiving speech is not perceiving gestures, that the motor system is not recruited for perceiving speech, and that speech perception can be adequately described by a prototypical pattern recognition model, the fuzzy logical model of perception (FLMP). Empirical evidence taken as support for gesture and motor theory is reconsidered in more detail and in the framework of the FLMP. Additional theoretical and logical arguments are made to challenge gesture and motor theory.

  20. Perceived Speech Quality Estimation Using DTW Algorithm

    Directory of Open Access Journals (Sweden)

    S. Arsenovski

    2009-06-01

    In this paper a method for speech quality estimation is evaluated by simulating the transfer of speech over packet switched and mobile networks. The proposed system uses the Dynamic Time Warping algorithm to compare the test speech with the received speech. Several tests have been made on a test speech sample of a single speaker with simulated packet (frame) loss effects on the perceived speech. The achieved results have been compared with measured PESQ values on the used transmission channel and their correlation has been observed.
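
    As a rough illustration of how Dynamic Time Warping compares a reference sequence with a received one, the following minimal sketch computes the classic DTW alignment cost between two sequences of scalar features. The function name `dtw_distance`, the absolute-difference local cost, and the toy sequences are assumptions made for illustration; the paper's actual features and scoring are not given in the abstract.

```python
def dtw_distance(reference, received):
    """Classic dynamic-programming DTW cost between two scalar sequences.

    Local cost is the absolute difference (an illustrative assumption);
    real speech systems typically compare per-frame feature vectors.
    """
    n, m = len(reference), len(received)
    inf = float("inf")
    # cost[i][j] = best cost of aligning reference[:i] with received[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(reference[i - 1] - received[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # reference frame has no new match (e.g. lost frame)
                cost[i][j - 1],      # received frame repeats / stretches
                cost[i - 1][j - 1],  # frames advance together
            )
    return cost[n][m]


if __name__ == "__main__":
    sent = [1, 2, 3, 4]
    print(dtw_distance(sent, sent))       # identical sequences
    print(dtw_distance(sent, [1, 2, 4]))  # one frame dropped
```

    A larger alignment cost indicates a received signal that departs further from the reference, which is the kind of quantity that could then be correlated with a perceptual score such as PESQ.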

  1. Music training for the development of speech segmentation.

    Science.gov (United States)

    François, Clément; Chobert, Julie; Besson, Mireille; Schön, Daniele

    2013-09-01

    The role of music training in fostering brain plasticity and developing high cognitive skills, notably linguistic abilities, is of great interest from both a scientific and a societal perspective. Here, we report results of a longitudinal study over 2 years using both behavioral and electrophysiological measures and a test-training-retest procedure to examine the influence of music training on speech segmentation in 8-year-old children. Children were pseudo-randomly assigned to either music or painting training and were tested on their ability to extract meaningless words from a continuous flow of nonsense syllables. While no between-group differences were found before training, both behavioral and electrophysiological measures showed improved speech segmentation skills across testing sessions for the music group only. These results show that music training directly causes facilitation in speech segmentation, thereby pointing to the importance of music for speech perception and more generally for children's language development. Finally these results have strong implications for promoting the development of music-based remediation strategies for children with language-based learning impairments.

  2. High-performance speech recognition using consistency modeling

    Science.gov (United States)

    Digalakis, Vassilios; Murveit, Hy; Monaco, Peter; Neumeyer, Leo; Sankar, Ananth

    1994-12-01

    The goal of SRI's consistency modeling project is to improve the raw acoustic modeling component of SRI's DECIPHER speech recognition system and develop consistency modeling technology. Consistency modeling aims to reduce the number of improper independence assumptions used in traditional speech recognition algorithms so that the resulting speech recognition hypotheses are more self-consistent and, therefore, more accurate. At the initial stages of this effort, SRI focused on developing the appropriate base technologies for consistency modeling. We first developed the Progressive Search technology that allowed us to perform large-vocabulary continuous speech recognition (LVCSR) experiments. Since its conception and development at SRI, this technique has been adopted by most laboratories, including other ARPA contracting sites, doing research on LVCSR. Another goal of the consistency modeling project is to attack difficult modeling problems in which there is a mismatch between the training and testing phases. Such mismatches may include outlier speakers, different microphones and additive noise. We were able to either develop new, or transfer and evaluate existing, technologies that adapted our baseline genonic HMM recognizer to such difficult conditions.

  3. Streaming gravity mode instability

    International Nuclear Information System (INIS)

    Wang Shui.

    1989-05-01

    In this paper, we study the stability of a current sheet with a sheared flow in a gravitational field which is perpendicular to the magnetic field and plasma flow. This mixing mode caused by a combined role of the sheared flow and gravity is named the streaming gravity mode instability. The conditions of this mode instability are discussed for an ideal four-layer model in the incompressible limit. (author). 5 refs

  4. Autonomous Byte Stream Randomizer

    Science.gov (United States)

    Paloulian, George K.; Woo, Simon S.; Chow, Edward T.

    2013-01-01

    Net-centric networking environments are often faced with limited resources and must utilize bandwidth as efficiently as possible. In networking environments that span wide areas, the data transmission has to be efficient without any redundant or exuberant metadata. The Autonomous Byte Stream Randomizer software provides an extra level of security on top of existing data encryption methods. Randomizing the data's byte stream adds an extra layer to existing data protection methods, thus making it harder for an attacker to decrypt protected data. Based on a generated cryptographically secure random seed, a random sequence of numbers is used to intelligently and efficiently swap the organization of bytes in data using the unbiased and memory-efficient in-place Fisher-Yates shuffle method. Swapping bytes and reorganizing the crucial structure of the byte data renders the data file unreadable and leaves the data in a deconstructed state. This deconstruction adds an extra level of security requiring the byte stream to be reconstructed with the random seed in order to be readable. Once the data byte stream has been randomized, the software enables the data to be distributed to N nodes in an environment. Each piece of the data in randomized and distributed form is a separate entity unreadable in its own right, but when combined with all N pieces, it can be reconstructed back into one. Reconstruction requires possession of the key used for randomizing the bytes, leading to the generation of the same cryptographically secure random sequence of numbers used to randomize the data. This software is a cornerstone capability possessing the ability to generate the same cryptographically secure sequence on different machines and time intervals, thus allowing this software to be used more heavily in net-centric environments where data transfer bandwidth is limited.
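
    The seeded, reversible byte shuffle described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual software: the names `randomize`, `reconstruct`, and `_swap_schedule` are hypothetical, and Python's `random.Random` is used only as a reproducible stand-in because it is not cryptographically secure. For brevity the sketch also stores the swap schedule in a list, so it is not strictly memory-efficient.

```python
import random


def _swap_schedule(n, seed):
    """Fisher-Yates swap pairs (i, j) with j drawn uniformly from [0, i].

    random.Random is a stand-in (assumption): it is deterministic for a
    given seed but NOT cryptographically secure.
    """
    rng = random.Random(seed)
    return [(i, rng.randint(0, i)) for i in range(n - 1, 0, -1)]


def randomize(data: bytes, seed: int) -> bytes:
    """Shuffle the byte order using the seed-derived swap schedule."""
    buf = bytearray(data)
    for i, j in _swap_schedule(len(buf), seed):
        buf[i], buf[j] = buf[j], buf[i]
    return bytes(buf)


def reconstruct(data: bytes, seed: int) -> bytes:
    """Undo randomize() by replaying the same swaps in reverse order."""
    buf = bytearray(data)
    for i, j in reversed(_swap_schedule(len(buf), seed)):
        buf[i], buf[j] = buf[j], buf[i]
    return bytes(buf)


if __name__ == "__main__":
    original = b"continuous byte stream"
    scrambled = randomize(original, seed=2013)
    assert reconstruct(scrambled, seed=2013) == original
    print(scrambled)
```

    Because each swap is its own inverse, regenerating the same sequence from the seed and applying the swaps in reverse order restores the original byte order, which mirrors the abstract's point that reconstruction requires the key used for randomization.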

  5. The LHCb Turbo stream

    Energy Technology Data Exchange (ETDEWEB)

    Puig, A., E-mail: albert.puig@cern.ch

    2016-07-11

    The LHCb experiment will record an unprecedented dataset of beauty and charm hadron decays during Run II of the LHC, set to take place between 2015 and 2018. A key computing challenge is to store and process this data, which limits the maximum output rate of the LHCb trigger. So far, LHCb has written out a few kHz of events containing the full raw sub-detector data, which are passed through a full offline event reconstruction before being considered for physics analysis. Charm physics in particular is limited by trigger output rate constraints. A new streaming strategy includes the possibility to perform the physics analysis with candidates reconstructed in the trigger, thus bypassing the offline reconstruction. In the Turbo stream the trigger will write out a compact summary of physics objects containing all information necessary for analyses. This will allow an increased output rate and thus higher average efficiencies and smaller selection biases. This idea will be commissioned and developed during 2015 with a selection of physics analyses. It is anticipated that the turbo stream will be adopted by an increasing number of analyses during the remainder of LHC Run II (2015–2018) and ultimately in Run III (starting in 2020) with the upgraded LHCb detector.

  6. Re-Meandering of Lowland Streams

    DEFF Research Database (Denmark)

    Pedersen, Morten Lauge; Kristensen, Klaus Kevin; Friberg, Nikolai

    2014-01-01

    We evaluated the restoration of physical habitats and its influence on macroinvertebrate community structure in 18 Danish lowland streams comprising six restored streams, six streams with little physical alteration and six channelized streams. We hypothesized that physical habitats and macroinver...

  7. Stream processing health card application.

    Science.gov (United States)

    Polat, Seda; Gündem, Taflan Imre

    2012-10-01

    In this paper, we propose a data stream management system embedded in a smart card for handling and storing user specific summaries of streaming data coming from medical sensor measurements and/or other medical measurements. The data stream management system that we propose for a health card can handle the stream data rates of commonly known medical devices and sensors. It incorporates a type of context awareness feature that acts according to user specific information. The proposed system is cheap and provides security for private data by enhancing the capabilities of smart health cards. The stream data management system is tested on a real smart card using both synthetic and real data.

  8. Masking effects of speech and music: does the masker's hierarchical structure matter?

    Science.gov (United States)

    Shi, Lu-Feng; Law, Yvonne

    2010-04-01

    Speech and music are time-varying signals organized by parallel hierarchical rules. Through a series of four experiments, this study compared the masking effects of single-talker speech and instrumental music on speech perception while manipulating the complexity of hierarchical and temporal structures of the maskers. Listeners' word recognition was found to be similar between hierarchically intact and disrupted speech or classical music maskers (Experiment 1). When sentences served as the signal, significantly greater masking effects were observed with disrupted than intact speech or classical music maskers (Experiment 2), although not with jazz or serial music maskers, which differed from the classical music masker in their hierarchical structures (Experiment 3). Removing the classical music masker's temporal dynamics or partially restoring it affected listeners' sentence recognition; yet, differences in performance between intact and disrupted maskers remained robust (Experiment 4). Hence, the effect of structural expectancy was largely present across maskers when comparing them before and after their hierarchical structure was purposefully disrupted. This effect seemed to lend support to the auditory stream segregation theory.

  9. Effect of attentional load on audiovisual speech perception: Evidence from ERPs

    Directory of Open Access Journals (Sweden)

    Agnès Alsius

    2014-07-01

    Seeing articulatory movements influences perception of auditory speech. This is often reflected in a shortened latency of auditory event-related potentials (ERPs) generated in the auditory cortex. The present study addressed whether this early neural correlate of audiovisual interaction is modulated by attention. We recorded ERPs in 15 subjects while they were presented with auditory, visual and audiovisual spoken syllables. Audiovisual stimuli consisted of incongruent auditory and visual components known to elicit a McGurk effect, i.e. a visually driven alteration in the auditory speech percept. In a Dual task condition, participants were asked to identify spoken syllables whilst monitoring a rapid visual stream of pictures for targets, i.e., they had to divide their attention. In a Single task condition, participants identified the syllables without any other tasks, i.e., they were asked to ignore the pictures and focus their attention fully on the spoken syllables. The McGurk effect was weaker in the Dual task than in the Single task condition, indicating an effect of attentional load on audiovisual speech perception. Early auditory ERP components, N1 and P2, peaked earlier to audiovisual stimuli than to auditory stimuli when attention was fully focused on syllables, indicating neurophysiological audiovisual interaction. This latency decrement was reduced when attention was loaded, suggesting that attention influences early neural processing of audiovisual speech. We conclude that reduced attention weakens the interaction between vision and audition in speech.

  10. Effect of attentional load on audiovisual speech perception: evidence from ERPs.

    Science.gov (United States)

    Alsius, Agnès; Möttönen, Riikka; Sams, Mikko E; Soto-Faraco, Salvador; Tiippana, Kaisa

    2014-01-01

    Seeing articulatory movements influences perception of auditory speech. This is often reflected in a shortened latency of auditory event-related potentials (ERPs) generated in the auditory cortex. The present study addressed whether this early neural correlate of audiovisual interaction is modulated by attention. We recorded ERPs in 15 subjects while they were presented with auditory, visual, and audiovisual spoken syllables. Audiovisual stimuli consisted of incongruent auditory and visual components known to elicit a McGurk effect, i.e., a visually driven alteration in the auditory speech percept. In a Dual task condition, participants were asked to identify spoken syllables whilst monitoring a rapid visual stream of pictures for targets, i.e., they had to divide their attention. In a Single task condition, participants identified the syllables without any other tasks, i.e., they were asked to ignore the pictures and focus their attention fully on the spoken syllables. The McGurk effect was weaker in the Dual task than in the Single task condition, indicating an effect of attentional load on audiovisual speech perception. Early auditory ERP components, N1 and P2, peaked earlier to audiovisual stimuli than to auditory stimuli when attention was fully focused on syllables, indicating neurophysiological audiovisual interaction. This latency decrement was reduced when attention was loaded, suggesting that attention influences early neural processing of audiovisual speech. We conclude that reduced attention weakens the interaction between vision and audition in speech.

  11. Perception of synthetic speech produced automatically by rule: Intelligibility of eight text-to-speech systems.

    Science.gov (United States)

    Greene, Beth G; Logan, John S; Pisoni, David B

    1986-03-01

    We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered.

  12. Perception of synthetic speech produced automatically by rule: Intelligibility of eight text-to-speech systems

    Science.gov (United States)

    GREENE, BETH G.; LOGAN, JOHN S.; PISONI, DAVID B.

    2012-01-01

    We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered. PMID:23225916

  13. Speech Entrainment Compensates for Broca's Area Damage

    Science.gov (United States)

    Fridriksson, Julius; Basilakos, Alexandra; Hickok, Gregory; Bonilha, Leonardo; Rorden, Chris

    2015-01-01

    Speech entrainment (SE), the online mimicking of an audiovisual speech model, has been shown to increase speech fluency in patients with Broca's aphasia. However, not all individuals with aphasia benefit from SE. The purpose of this study was to identify patterns of cortical damage that predict a positive response to SE's fluency-inducing effects. Forty-four chronic patients with left hemisphere stroke (15 female) were included in this study. Participants completed two tasks: 1) spontaneous speech production, and 2) audiovisual SE. Number of different words per minute was calculated as a speech output measure for each task, with the difference between SE and spontaneous speech conditions yielding a measure of fluency improvement. Voxel-wise lesion-symptom mapping (VLSM) was used to relate the number of different words per minute for spontaneous speech, SE, and SE-related improvement to patterns of brain damage in order to predict lesion locations associated with the fluency-inducing response to speech entrainment. Individuals with Broca's aphasia demonstrated a significant increase in different words per minute during speech entrainment versus spontaneous speech. A similar pattern of improvement was not seen in patients with other types of aphasia. VLSM analysis revealed damage to the inferior frontal gyrus predicted this response. Results suggest that SE exerts its fluency-inducing effects by providing a surrogate target for speech production via internal monitoring processes. Clinically, these results add further support for the use of speech entrainment to improve speech production and may help select patients for speech entrainment treatment. PMID:25989443

  14. Commencement Speech as a Hybrid Polydiscursive Practice

    Directory of Open Access Journals (Sweden)

    Светлана Викторовна Иванова

    2017-12-01

    Discourse and media communication researchers note that popular discursive and communicative practices tend toward hybridization and convergence. Discourse, understood as language in use, is flexible; consequently, one and the same text can represent several types of discourse. A vivid example of this tendency is the American commencement speech (also called commencement address or graduation speech). A commencement speech is a speech addressed to university graduates which, in keeping with the modern trend, is delivered by outstanding media personalities (politicians, athletes, actors, etc.). The objective of this study is to define the specificity of the realization of polydiscursive practices within the commencement speech. The research involves discursive, contextual, stylistic and definitional analyses. Methodologically, the study is based on discourse analysis theory; in particular, the notion of a discursive practice as a verbalized social practice forms the conceptual basis of the research. The research draws upon a hundred commencement speeches delivered by prominent representatives of American society from the 1980s to the present. In brief, the commencement speech belongs to institutional discourse, which public speech embodies. Its institutional parameters are well represented in speeches delivered by people in power, such as American and university presidents. Nevertheless, as the results of the research indicate, the institutional character of the commencement speech is not its only feature. Conceptual information analysis makes it possible to assign the commencement speech to didactic discourse, as it is aimed at teaching university graduates how to deal with the challenges life is rich in. Discursive practices of personal discourse are also actively integrated into commencement speech discourse. Moreover, existential discursive practices also find their way into the discourse under study. Commencement

  15. Low-Frequency Cortical Entrainment to Speech Reflects Phoneme-Level Processing.

    Science.gov (United States)

    Di Liberto, Giovanni M; O'Sullivan, James A; Lalor, Edmund C

    2015-10-05

    The human ability to understand speech is underpinned by a hierarchical auditory system whose successive stages process increasingly complex attributes of the acoustic input. It has been suggested that to produce categorical speech perception, this system must elicit consistent neural responses to speech tokens (e.g., phonemes) despite variations in their acoustics. Here, using electroencephalography (EEG), we provide evidence for this categorical phoneme-level speech processing by showing that the relationship between continuous speech and neural activity is best described when that speech is represented using both low-level spectrotemporal information and categorical labeling of phonetic features. Furthermore, the mapping between phonemes and EEG becomes more discriminative for phonetic features at longer latencies, in line with what one might expect from a hierarchical system. Importantly, these effects are not seen for time-reversed speech. These findings may form the basis for future research on natural language processing in specific cohorts of interest and for broader insights into how brains transform acoustic input into meaning. Copyright © 2015 Elsevier Ltd. All rights reserved.

  16. Role of working memory and lexical knowledge in perceptual restoration of interrupted speech.

    Science.gov (United States)

    Nagaraj, Naveen K; Magimairaj, Beula M

    2017-12-01

    The role of working memory (WM) capacity and lexical knowledge in perceptual restoration (PR) of missing speech was investigated using the interrupted speech perception paradigm. Speech identification ability, which indexed PR, was measured using low-context sentences periodically interrupted at 1.5 Hz. PR was measured for silent gated, low-frequency speech noise filled, and low-frequency fine-structure and envelope filled interrupted conditions. WM capacity was measured using verbal and visuospatial span tasks. Lexical knowledge was assessed using both receptive vocabulary and meaning from context tests. Results showed that PR was better for speech noise filled condition than other conditions tested. Both receptive vocabulary and verbal WM capacity explained unique variance in PR for the speech noise filled condition, but were unrelated to performance in the silent gated condition. It was only receptive vocabulary that uniquely predicted PR for fine-structure and envelope filled conditions. These findings suggest that the contribution of lexical knowledge and verbal WM during PR depends crucially on the information content that replaced the silent intervals. When perceptual continuity was partially restored by filler speech noise, both lexical knowledge and verbal WM capacity facilitated PR. Importantly, for fine-structure and envelope filled interrupted conditions, lexical knowledge was crucial for PR.

  17. SPEECH STRATEGIES AND TACTICS IN THE ANNUAL ADDRESSES OF ANGELA MERKEL

    Directory of Open Access Journals (Sweden)

    Eysfeld Evgeniy Aleksandrovich

    2015-06-01

    The article studies speech strategies and tactics using the annual addresses of Angela Merkel as material. The author carries out a systematic linguistic analysis of her speeches in the period from 2005 to 2015 and reveals how the speaker's complex of strategies and tactics developed over time. The peculiarities of the speech strategies and tactics used by Angela Merkel are investigated by the methods of continuous sampling and contextual analysis. As the research shows, the main speech strategies used by Angela Merkel in the analyzed texts are the following: the self-presentation strategy, the interpretation strategy, the argumentation strategy, the strategy of forming the addressee's emotional state, and the agitation strategy. The implementation of these strategies through a set of speech tactics lets the speaker fulfil certain communicative objectives. In any one of her annual speeches Angela Merkel aims to inform the audience, to interpret some facts or data, to take confidence-building measures, to consolidate the people, to determine common tasks, to make the audience believe in the correctness of their political choice, to discredit political competitors, to stimulate recipients to take some actions, etc. Moreover, combining strategies and tactics promotes optimal achievement of communicative targets. The conclusions of this article may lead to further academic research: a comparative analysis of speech strategies and tactics in Russian and German political discourse is a promising extension of this study.

  18. Foundations for Streaming Model Transformations by Complex Event Processing.

    Science.gov (United States)

    Dávid, István; Ráth, István; Varró, Dániel

    2018-01-01

    Streaming model transformations represent a novel class of transformations to manipulate models whose elements are continuously produced or modified in high volume and with a rapid rate of change. Executing streaming transformations requires efficient techniques to recognize activated transformation rules over a live model and a potentially infinite stream of events. In this paper, we propose foundations of streaming model transformations by innovatively integrating incremental model query, complex event processing (CEP) and reactive (event-driven) transformation techniques. Complex event processing makes it possible to identify relevant patterns and sequences of events over an event stream. Our approach enables event streams to include model change events which are automatically and continuously populated by incremental model queries. Furthermore, a reactive rule engine carries out transformations on identified complex event patterns. We provide an integrated domain-specific language with precise semantics for capturing complex event patterns and streaming transformations together with an execution engine, all of which are now part of the Viatra reactive transformation framework. We demonstrate the feasibility of our approach with two case studies: one in an advanced model engineering workflow; and one in the context of on-the-fly gesture recognition.

  19. Urbanization and stream ecology: Diverse mechanisms of change

    Science.gov (United States)

    Roy, Allison; Capps, Krista A.; El-Sabaawi, Rana W.; Jones, Krista L.; Parr, Thomas B.; Ramirez, Alonso; Smith, Robert F.; Walsh, Christopher J.; Wenger, Seth J.

    2016-01-01

    The field of urban stream ecology has evolved rapidly in the last 3 decades, and it now includes natural scientists from numerous disciplines working with social scientists, landscape planners and designers, and land and water managers to address complex, socioecological problems that have manifested in urban landscapes. Over the last decade, stream ecologists have met 3 times at the Symposium on Urbanization and Stream Ecology (SUSE) to discuss current research, identify knowledge gaps, and promote future research collaborations. The papers in this special series on urbanization and stream ecology include both primary research studies and conceptual synthesis papers spurred from discussions at SUSE in May 2014. The themes of the meeting are reflected in the papers in this series emphasizing global differences in mechanisms and responses of stream ecosystems to urbanization and management solutions in diverse urban streams. Our hope is that this series will encourage continued interdisciplinary and collaborative research to increase the global understanding of urban stream ecology toward stream protection and restoration in urban landscapes.

  20. Event Streams Clustering Using Machine Learning Techniques

    Directory of Open Access Journals (Sweden)

    Hanen Bouali

    2015-10-01

    Data streams are usually of unbounded length, which pushes users to consider only recent observations by focusing on a time window and to ignore past data. However, in many real-world applications, past data must be taken into consideration to guarantee efficient, well-performing decision making and to handle the evolution of data streams over time. To build a selective history that tracks changes in the underlying event streams, we opt for a sliding window whose time window grows based on changes in the historical data. To provide access to historical data without requiring significant storage or multiple passes over the data, we propose a new algorithm for clustering multiple data streams using an incremental support vector machine and a data representative points technique. The algorithm uses a sliding window model for the most recent clustering results and data representative points to model the old data clustering results. Our experimental results on electromyography signals show better clustering than other approaches in the literature.
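
    The idea of keeping recent observations in a sliding window while summarizing older data compactly can be sketched as follows. This is only a minimal illustration, not the algorithm from the paper: a single running mean stands in for the "data representative points", no support vector machine or clustering is involved, and the class name, window size, and sample values are all hypothetical.

```python
from collections import deque


class WindowWithSummary:
    """Keep the newest `size` values; fold evicted values into a running mean."""

    def __init__(self, size):
        self.size = size
        self.recent = deque()   # most recent observations, kept verbatim
        self.old_count = 0      # number of observations already evicted
        self.old_mean = 0.0     # stand-in "representative point" for old data

    def add(self, value):
        self.recent.append(value)
        if len(self.recent) > self.size:
            evicted = self.recent.popleft()
            self.old_count += 1
            # Incremental mean update: evicted values need not be stored.
            self.old_mean += (evicted - self.old_mean) / self.old_count


# Hypothetical stream, for illustration only.
stream = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
window = WindowWithSummary(size=3)
for x in stream:
    window.add(x)

print(list(window.recent), window.old_count, window.old_mean)
```

    The summary is updated in a single pass and uses constant memory, which is the property the abstract asks of its history mechanism; a real implementation would presumably keep several representative points per cluster rather than one mean.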

  1. Perception of the Multisensory Coherence of Fluent Audiovisual Speech in Infancy: Its Emergence & the Role of Experience

    Science.gov (United States)

    Lewkowicz, David J.; Minar, Nicholas J.; Tift, Amy H.; Brandon, Melissa

    2014-01-01

    To investigate the developmental emergence of the ability to perceive the multisensory coherence of native and non-native audiovisual fluent speech, we tested 4-, 8–10, and 12–14 month-old English-learning infants. Infants first viewed two identical female faces articulating two different monologues in silence and then in the presence of an audible monologue that matched the visible articulations of one of the faces. Neither the 4-month-old nor the 8–10 month-old infants exhibited audio-visual matching in that neither group exhibited greater looking at the matching monologue. In contrast, the 12–14 month-old infants exhibited matching and, consistent with the emergence of perceptual expertise for the native language, they perceived the multisensory coherence of native-language monologues earlier in the test trials than of non-native language monologues. Moreover, the matching of native audible and visible speech streams observed in the 12–14 month olds did not depend on audio-visual synchrony whereas the matching of non-native audible and visible speech streams did depend on synchrony. Overall, the current findings indicate that the perception of the multisensory coherence of fluent audiovisual speech emerges late in infancy, that audio-visual synchrony cues are more important in the perception of the multisensory coherence of non-native than native audiovisual speech, and that the emergence of this skill most likely is affected by perceptual narrowing. PMID:25462038

  2. Enhancement of speech signals - with a focus on voiced speech models

    DEFF Research Database (Denmark)

    Nørholm, Sidsel Marie

    This thesis deals with speech enhancement, i.e., noise reduction in speech signals. This has applications in, e.g., hearing aids and teleconference systems. We consider a signal-driven approach to speech enhancement where a model of the speech is assumed and filters are generated based...... on this model. The basic model used in this thesis is the harmonic model which is a commonly used model for describing the voiced part of the speech signal. We show that it can be beneficial to extend the model to take inharmonicities or the non-stationarity of speech into account. Extending the model...

  3. An analysis of the masking of speech by competing speech using self-report data.

    Science.gov (United States)

    Agus, Trevor R; Akeroyd, Michael A; Noble, William; Bhullar, Navjot

    2009-01-01

    Many of the items in the "Speech, Spatial, and Qualities of Hearing" scale questionnaire [S. Gatehouse and W. Noble, Int. J. Audiol. 43, 85-99 (2004)] are concerned with speech understanding in a variety of backgrounds, both speech and nonspeech. To study if this self-report data reflected informational masking, previously collected data on 414 people were analyzed. The lowest scores (greatest difficulties) were found for the two items in which there were two speech targets, with successively higher scores for competing speech (six items), energetic masking (one item), and no masking (three items). The results suggest significant masking by competing speech in everyday listening situations.

  4. Speech-in-speech perception and executive function involvement.

    Directory of Open Access Journals (Sweden)

    Marcela Perrone-Bertolotti

    This study investigated the link between speech-in-speech perception capacities and four executive function components: response suppression, inhibitory control, switching and working memory. We constructed a cross-modal semantic priming paradigm using a written target word and a spoken prime word, implemented in one of two concurrent auditory sentences (cocktail party situation). The prime and target were semantically related or unrelated. Participants had to perform a lexical decision task on visual target words and simultaneously listen to only one of two pronounced sentences. The attention of the participant was manipulated: The prime was in the pronounced sentence listened to by the participant or in the ignored one. In addition, we evaluated the executive function abilities of participants (switching cost, inhibitory-control cost and response-suppression cost) and their working memory span. Correlation analyses were performed between the executive and priming measurements. Our results showed a significant interaction effect between attention and semantic priming. We observed a significant priming effect in the attended but not in the ignored condition. Only priming effects obtained in the ignored condition were significantly correlated with some of the executive measurements. However, no correlation between priming effects and working memory capacity was found. Overall, these results confirm, first, the role of attention for semantic priming effect and, second, the implication of executive functions in speech-in-noise understanding capacities.

  5. StreamSqueeze: a dynamic stream visualization for monitoring of event data

    Science.gov (United States)

    Mansmann, Florian; Krstajic, Milos; Fischer, Fabian; Bertini, Enrico

    2012-01-01

    While in clear-cut situations automated analytical solutions for data streams are already in place, only a few visual approaches have been proposed in the literature for exploratory analysis tasks on dynamic information. However, due to the competitive or security-related advantages that real-time information gives in domains such as finance, business or networking, we are convinced that there is a need for exploratory visualization tools for data streams. Under the conditions that new events have higher relevance and that smooth transitions enable traceability of items, we propose a novel dynamic stream visualization called StreamSqueeze. In this technique the degree of interest of recent items is expressed through an increase in size, and thus recent events can be shown in more detail. The technique has two main benefits: First, the layout algorithm arranges items in several lists of various sizes and optimizes the positions within each list so that the transition of an item from one list to the other triggers the fewest visual changes. Second, the animation scheme ensures that for 50 percent of the time an item has a static screen position, where reading is most effective, and then continuously shrinks and moves to its next static position in the subsequent list. To demonstrate the capability of our technique, we apply it to large and high-frequency news and syslog streams and show how it maintains optimal stability of the layout under the conditions given above.

  6. Stream Tracker: Crowd sourcing and remote sensing to monitor stream flow intermittence

    Science.gov (United States)

    Puntenney, K.; Kampf, S. K.; Newman, G.; Lefsky, M. A.; Weber, R.; Gerlich, J.

    2017-12-01

    Streams that do not flow continuously in time and space support diverse aquatic life and can be critical contributors to downstream water supply. However, these intermittent streams are rarely monitored and poorly mapped. Stream Tracker is a community powered stream monitoring project that pairs citizen contributed observations of streamflow presence or absence with a network of streamflow sensors and remotely sensed data from satellites to track when and where water is flowing in intermittent stream channels. Citizens can visit sites on roads and trails to track flow and contribute their observations to the project site hosted by CitSci.org. Data can be entered using either a mobile application with offline capabilities or an online data entry portal. The sensor network provides a consistent record of streamflow and flow presence/absence across a range of elevations and drainage areas. Capacitance, resistance, and laser sensors have been deployed to determine the most reliable, low cost sensor that could be mass distributed to track streamflow intermittence over a larger number of sites. Streamflow presence or absence observations from the citizen and sensor networks are then compared to satellite imagery to improve flow detection algorithms using remotely sensed data from Landsat. In the first two months of this project, 1,287 observations have been made at 241 sites by 24 project members across northern and western Colorado.

  7. Audiomotor Perceptual Training Enhances Speech Intelligibility in Background Noise.

    Science.gov (United States)

    Whitton, Jonathon P; Hancock, Kenneth E; Shannon, Jeffrey M; Polley, Daniel B

    2017-11-06

    Sensory and motor skills can be improved with training, but learning is often restricted to practice stimuli. As an exception, training on closed-loop (CL) sensorimotor interfaces, such as action video games and musical instruments, can impart a broad spectrum of perceptual benefits. Here we ask whether computerized CL auditory training can enhance speech understanding in levels of background noise that approximate a crowded restaurant. Elderly hearing-impaired subjects trained for 8 weeks on a CL game that, like a musical instrument, challenged them to monitor subtle deviations between predicted and actual auditory feedback as they moved their fingertip through a virtual soundscape. We performed our study as a randomized, double-blind, placebo-controlled trial by training other subjects in an auditory working-memory (WM) task. Subjects in both groups improved at their respective auditory tasks and reported comparable expectations for improved speech processing, thereby controlling for placebo effects. Whereas speech intelligibility was unchanged after WM training, subjects in the CL training group could correctly identify 25% more words in spoken sentences or digit sequences presented in high levels of background noise. Numerically, CL audiomotor training provided more than three times the benefit of our subjects' hearing aids for speech processing in noisy listening conditions. Gains in speech intelligibility could be predicted from gameplay accuracy and baseline inhibitory control. However, benefits did not persist in the absence of continuing practice. These studies employ stringent clinical standards to demonstrate that perceptual learning on a computerized audio game can transfer to "real-world" communication challenges. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. Individual differences in degraded speech perception

    Science.gov (United States)

    Carbonell, Kathy M.

    One of the lasting concerns in audiology is the unexplained individual differences in speech perception performance even for individuals with similar audiograms. One proposal is that there are cognitive/perceptual individual differences underlying this vulnerability and that these differences are present in normal hearing (NH) individuals but do not reveal themselves in studies that use clear speech produced in quiet (because of a ceiling effect). However, previous studies have failed to uncover cognitive/perceptual variables that explain much of the variance in NH performance on more challenging degraded speech tasks. This lack of strong correlations may be due either to examining the wrong measures (e.g., working memory capacity) or to there being no reliable differences in degraded speech performance in NH listeners (i.e., variability in performance is due to measurement noise). The proposed project has 3 aims. The first is to establish whether there are reliable individual differences in degraded speech performance for NH listeners that are sustained both across degradation types (speech in noise, compressed speech, noise-vocoded speech) and across multiple testing sessions. The second is to establish whether there are reliable differences in NH listeners' ability to adapt their phonetic categories based on short-term statistics both across tasks and across sessions. The third is to determine whether performance on degraded speech perception tasks is correlated with performance on phonetic adaptability tasks, thus establishing a possible explanatory variable for individual differences in speech perception for NH and hearing impaired listeners.

  9. Sensorimotor influences on speech perception in infancy.

    Science.gov (United States)

    Bruderer, Alison G; Danielson, D Kyle; Kandhadai, Padmapriya; Werker, Janet F

    2015-11-03

    The influence of speech production on speech perception is well established in adults. However, because adults have a long history of both perceiving and producing speech, the extent to which the perception-production linkage is due to experience is unknown. We addressed this issue by asking whether articulatory configurations can influence infants' speech perception performance. To eliminate influences from specific linguistic experience, we studied preverbal, 6-mo-old infants and tested the discrimination of a nonnative, and hence never-before-experienced, speech sound distinction. In three experimental studies, we used teething toys to control the position and movement of the tongue tip while the infants listened to the speech sounds. Using ultrasound imaging technology, we verified that the teething toys consistently and effectively constrained the movement and positioning of infants' tongues. With a looking-time procedure, we found that temporarily restraining infants' articulators impeded their discrimination of a nonnative consonant contrast but only when the relevant articulator was selectively restrained to prevent the movements associated with producing those sounds. Our results provide striking evidence that even before infants speak their first words and without specific listening experience, sensorimotor information from the articulators influences speech perception. These results transform theories of speech perception by suggesting that even at the initial stages of development, oral-motor movements influence speech sound discrimination. Moreover, an experimentally induced "impairment" in articulator movement can compromise speech perception performance, raising the question of whether long-term oral-motor impairments may impact perceptual development.

  10. Nitrogen saturation in stream ecosystems.

    Science.gov (United States)

    Earl, Stevan R; Valett, H Maurice; Webster, Jackson R

    2006-12-01

    The concept of nitrogen (N) saturation has organized the assessment of N loading in terrestrial ecosystems. Here we extend the concept to lotic ecosystems by coupling Michaelis-Menten kinetics and nutrient spiraling. We propose a series of saturation response types, which may be used to characterize the proximity of streams to N saturation. We conducted a series of short-term N releases using a tracer (15NO3-N) to measure uptake. Experiments were conducted in streams spanning a gradient of background N concentration. Uptake increased in four of six streams as NO3-N was incrementally elevated, indicating that these streams were not saturated. Uptake generally corresponded to Michaelis-Menten kinetics but deviated from the model in two streams where some other growth-critical factor may have been limiting. Proximity to saturation was correlated to background N concentration but was better predicted by the ratio of dissolved inorganic N (DIN) to soluble reactive phosphorus (SRP), suggesting phosphorus limitation in several high-N streams. Uptake velocity, a reflection of uptake efficiency, declined nonlinearly with increasing N amendment in all streams. At the same time, uptake velocity was highest in the low-N streams. Our conceptual model of N transport, uptake, and uptake efficiency suggests that, while streams may be active sites of N uptake on the landscape, N saturation contributes to nonlinear changes in stream N dynamics that correspond to decreased uptake efficiency.
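
    As a rough illustration of the saturation behaviour described above, the following Python sketch evaluates Michaelis-Menten uptake, U = U_max * C / (K_m + C), and the corresponding uptake velocity v_f = U / C as defined in nutrient spiraling. The parameter values and the concentration series are hypothetical (arbitrary units) and are not taken from the study.

```python
def uptake(conc, u_max, k_m):
    """Michaelis-Menten uptake U = u_max * C / (k_m + C)."""
    return u_max * conc / (k_m + conc)


def uptake_velocity(conc, u_max, k_m):
    """Uptake velocity v_f = U / C, which simplifies to u_max / (k_m + C)."""
    return uptake(conc, u_max, k_m) / conc


# Hypothetical parameters in arbitrary units, for illustration only.
U_MAX = 100.0   # maximum (saturated) uptake
K_M = 50.0      # half-saturation concentration

concentrations = [10.0, 50.0, 200.0, 1000.0]
uptakes = [uptake(c, U_MAX, K_M) for c in concentrations]
velocities = [uptake_velocity(c, U_MAX, K_M) for c in concentrations]

for c, u, v in zip(concentrations, uptakes, velocities):
    print(f"C={c:7.1f}  U={u:6.2f}  v_f={v:6.3f}")
```

    Under these assumptions uptake keeps rising toward U_MAX as concentration increases, while uptake velocity falls nonlinearly, which matches the qualitative pattern the abstract reports for uptake efficiency.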

  11. Statistical analysis of acoustic characteristics of Tibetan Lhasa dialect speech emotion

    Directory of Open Access Journals (Sweden)

    Guo Dandan

    2016-01-01

    The paper makes a quantitative analysis and comparison of pitch, energy and duration in the continuous emotional speech of Lhasa Tibetan across four basic emotional patterns (happy, surprise, sad, neutral), using experimental phonetics and linear statistical research methods. It finds a positive correlation between Lhasa Tibetan emotional speech and pitch, energy and duration. The pitch, energy and duration parameters of negative emotions are larger than those of positive emotions; on this basis, the acoustic feature patterns of Lhasa Tibetan speech emotion are drawn. Although Chinese and Tibetan both have tonal prosodic features, they differ significantly in the acoustic characteristics of speech emotion.

  12. Biological impact of preschool music classes on processing speech in noise.

    Science.gov (United States)

    Strait, Dana L; Parbery-Clark, Alexandra; O'Connell, Samantha; Kraus, Nina

    2013-10-01

    Musicians have increased resilience to the effects of noise on speech perception and its neural underpinnings. We do not know, however, how early in life these enhancements arise. We compared auditory brainstem responses to speech in noise in 32 preschool children, half of whom were engaged in music training. Thirteen children returned for testing one year later, permitting the first longitudinal assessment of subcortical auditory function with music training. Results indicate emerging neural enhancements in musically trained preschoolers for processing speech in noise. Longitudinal outcomes reveal that children enrolled in music classes experience further increased neural resilience to background noise following one year of continued training compared to nonmusician peers. Together, these data reveal enhanced development of neural mechanisms undergirding speech-in-noise perception in preschoolers undergoing music training and may indicate a biological impact of music training on auditory function during early childhood. Copyright © 2013 Elsevier Ltd. All rights reserved.

  13. Phonological analysis of substitution errors of patients with apraxia of speech

    Directory of Open Access Journals (Sweden)

    Maysa Luchesi Cera

    The literature on apraxia of speech describes the types and characteristics of phonological errors in this disorder. In general, phonemes affected by errors are described, but the distinctive features involved have not yet been investigated. Objective: To analyze the features involved in substitution errors produced by Brazilian-Portuguese speakers with apraxia of speech. Methods: 20 adults with apraxia of speech were assessed. Phonological analysis of the distinctive features involved in substitution type errors was carried out using the protocol for the evaluation of verbal and non-verbal apraxia. Results: The most affected features were: voiced, continuant, high, anterior, coronal, posterior. Moreover, the mean of the substitutions of marked to markedness features was statistically greater than the markedness to marked features. Conclusions: This study contributes toward a better characterization of the phonological errors found in apraxia of speech, thereby helping to diagnose communication disorders and the selection criteria of phonemes for rehabilitation in these patients.

  14. STREAM2016: Streaming Requirements, Experience, Applications and Middleware Workshop

    Energy Technology Data Exchange (ETDEWEB)

    Fox, Geoffrey [Indiana Univ., Bloomington, IN (United States)]; Jha, Shantenu [Rutgers Univ., New Brunswick, NJ (United States)]; Ramakrishnan, Lavanya [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)]

    2016-10-01

    The Department of Energy (DOE) Office of Science (SC) facilities, including accelerators, light sources, neutron sources, and sensors that study the environment and the atmosphere, are producing streaming data that needs to be analyzed for next-generation scientific discoveries. There has been an explosion of new research and technologies for stream analytics arising from the academic and private sectors. However, there has been no corresponding effort in either documenting the critical research opportunities or building a community that can create and foster productive collaborations. The two-part workshop series, STREAM: Streaming Requirements, Experience, Applications and Middleware Workshop (STREAM2015 and STREAM2016), was conducted to bring the community together and identify gaps and future efforts needed by both NSF and DOE. This report describes the discussions, outcomes and conclusions from STREAM2016: Streaming Requirements, Experience, Applications and Middleware Workshop, the second of these workshops, held on March 22-23, 2016 in Tysons, VA. STREAM2016 focused on the Department of Energy (DOE) applications, computational and experimental facilities, as well as software systems. Thus, the role of “streaming and steering” as a critical mode of connecting the experimental and computing facilities was pervasive through the workshop. Given the overlap in interests and challenges with industry, the workshop had significant presence from several innovative companies and major contributors. The requirements that drive the proposed research directions, identified in this report, show an important opportunity for building a competitive research and development program around streaming data. These findings and recommendations are consistent with the vision outlined in NRC Frontiers of Data and the National Strategic Computing Initiative (NSCI) [1, 2]. The discussions from the workshop are captured as topic areas covered in this report's sections. The report

  15. A causal test of the motor theory of speech perception: a case of impaired speech production and spared speech perception.

    Science.gov (United States)

    Stasenko, Alena; Bonn, Cory; Teghipco, Alex; Garcea, Frank E; Sweet, Catherine; Dombovy, Mary; McDonough, Joyce; Mahon, Bradford Z

    2015-01-01

    The debate about the causal role of the motor system in speech perception has been reignited by demonstrations that motor processes are engaged during the processing of speech sounds. Here, we evaluate which aspects of auditory speech processing are affected, and which are not, in a stroke patient with dysfunction of the speech motor system. We found that the patient showed a normal phonemic categorical boundary when discriminating two non-words that differ by a minimal pair (e.g., ADA-AGA). However, using the same stimuli, the patient was unable to identify or label the non-word stimuli (using a button-press response). A control task showed that he could identify speech sounds by speaker gender, ruling out a general labelling impairment. These data suggest that while the motor system is not causally involved in perception of the speech signal, it may be used when other cues (e.g., meaning, context) are not available.

  16. Galaxies with jet streams

    International Nuclear Information System (INIS)

    Breuer, R.

    1981-01-01

    Describes recent research work on supersonic gas flow. Notable examples have been observed in cosmic radio sources, where jet streams of galactic dimensions sometimes occur, apparently as the result of interaction between neighbouring galaxies. The current theory of jet behaviour has been convincingly demonstrated using computer simulation. The surprisingly long-term stability is related to the supersonic velocity, and is analogous to the way in which an Apollo spacecraft re-entering the atmosphere supersonically is protected by the gas from the burning shield. (G.F.F.)

  17. Oscillating acoustic streaming jet

    International Nuclear Information System (INIS)

    Moudjed, Brahim; Botton, Valery; Henry, Daniel; Millet, Severine; Ben Hadid, Hamda; Garandet, Jean-Paul

    2014-01-01

    The present paper provides the first experimental investigation of an oscillating acoustic streaming jet. The observations are performed in the far field of a 2 MHz circular plane ultrasound transducer introduced in a rectangular cavity filled with water. Measurements are made by Particle Image Velocimetry (PIV) in horizontal and vertical planes near the end of the cavity. Oscillations of the jet appear in this zone, for a sufficiently high Reynolds number, as an intermittent phenomenon on an otherwise straight jet fluctuating in intensity. The observed perturbation pattern is similar to that of former theoretical studies. This intermittently oscillatory behavior is the first step to the transition to turbulence. (authors)

  18. THE BASIS FOR SPEECH PREVENTION

    Directory of Open Access Journals (Sweden)

    Jordan JORDANOVSKI

    1997-06-01

    Speech is a tool for accurate communication of ideas. When we talk about speech prevention as a practical realization of the language, we are referring to the fact that it should comprise the elements of the criterion as viewed from the perspective of the standards. This criterion, in the broad sense of the word, presupposes an exact realization of the thought expressed between the speaker and the recipient. The absence of this criterion catches the eye through the practical realization of the language and brings forth consequences, often hidden very deeply in the human psyche. Their outer manifestation already represents a delayed reaction of the social environment. The foundation for overcoming and standardizing this phenomenon must be the anatomical and physiological patterns of the body, accomplished through methods in concordance with the nature of the body.

  19. Aerosol emission during human speech

    Science.gov (United States)

    Asadi, Sima; Wexler, Anthony S.; Cappa, Christopher D.; Bouvier, Nicole M.; Barreda-Castanon, Santiago; Ristenpart, William D.

    2017-11-01

    We show that the rate of aerosol particle emission during healthy human speech is strongly correlated with the loudness (amplitude) of vocalization. Emission rates range from approximately 1 to 50 particles per second for quiet to loud amplitudes, regardless of language spoken (English, Spanish, Mandarin, or Arabic). Intriguingly, a small fraction of individuals behave as “super emitters,” consistently emitting an order of magnitude more aerosol particles than their peers. We interpret the results in terms of the egressive flowrate during vocalization, which is known to vary significantly for different types of vocalization and for different individuals. The results suggest that individual speech patterns could affect the probability of airborne disease transmission. The results also provide a possible explanation for the existence of “super spreaders” who transmit pathogens much more readily than average and who play a key role in the spread of epidemics.

  20. Pragmatic Study of Directive Speech Acts in Stories in Alquran

    Directory of Open Access Journals (Sweden)

    Rochmat Budi Santosa

    2016-10-01

    principles of the movement of human history. Those principles later we call the laws of God. He continues to invite people to reflect His guidance in life. Keywords: directive speech act, verses of stories, quran

  1. Business continuity

    International Nuclear Information System (INIS)

    Breunhoelder, Gert

    2002-01-01

    This presentation deals with the following keypoints: Information Technology (IT) Business Continuity and Recovery essential for any business; lessons learned after Sept. 11 event; Detailed planning, redundancy and testing being the key elements for probability estimation of disasters

  2. Stream periphyton responses to mesocosm treatments of ...

    Science.gov (United States)

    A stream mesocosm experiment was designed to compare biotic responses among streams exposed to an equal excess specific conductivity target of 850 µS/cm relative to a control that was set for 200 µS/cm and three treatments comprised of different major ion contents. Each treatment and the control was replicated 4 times at the mesocosm scale (16 mesocosms total). The treatments were based on dosing the background mesocosm water, a continuous flow-through mixture of natural river water and reverse osmosis treated water, with stock salt solutions prepared from 1) a mixture of sodium chloride and calcium chloride (Na/Cl chloride), 2) sodium bicarbonate, and 3) magnesium sulfate. The realized average specific conductance over the first 28d of continuous dosing was 827, 829, and 847 µS/cm, for the chloride, bicarbonate, and sulfate based treatments, respectively, and did not differ significantly. The controls averaged 183 µS/cm. Here we focus on comparing stream periphyton communities across treatments based on measurements obtained from a Pulse-Amplitude Modulated (PAM) fluorometer. The fluorometer is used in situ and with built-in algorithms distributes the total areal algal biomass (µg/cm2) of the periphyton among cyanobacteria, diatoms, and green algae. A measurement is recorded in a matter of seconds and, therefore, many different locations can be measured within each mesocosm at a high return frequency. Eight locations within each of the 1 m2 (0.3 m W x 3

  3. The metaphors we stream by: Making sense of music streaming

    OpenAIRE

    Hagen, Anja Nylund

    2016-01-01

    In Norway music-streaming services have become mainstream in everyday music listening. This paper examines how 12 heavy streaming users make sense of their experiences with Spotify and WiMP Music (now Tidal). The analysis relies on a mixed-method qualitative study, combining music-diary self-reports, online observation of streaming accounts, Facebook and last.fm scrobble-logs, and in-depth interviews. By drawing on existing metaphors of Internet experiences we demonstrate that music-streaming...

  4. Prediction and imitation in speech

    Directory of Open Access Journals (Sweden)

    Chiara Gambi

    2013-06-01

    It has been suggested that intra- and inter-speaker variability in speech are correlated. Interlocutors have been shown to converge on various phonetic dimensions. In addition, speakers imitate the phonetic properties of voices they are exposed to in shadowing, repetition, and even passive listening tasks. We review three theoretical accounts of speech imitation and convergence phenomena: (i) the Episodic Theory (ET) of speech perception and production (Goldinger, 1998); (ii) the Motor Theory (MT) of speech perception (Liberman and Whalen, 2000; Galantucci et al., 2006); (iii) Communication Accommodation Theory (CAT; Giles et al., 1991; Giles and Coupland, 1991). We argue that no account is able to explain all the available evidence. In particular, there is a need to integrate low-level, mechanistic accounts (like ET and MT) and higher-level accounts (like CAT). We propose that this is possible within the framework of an integrated theory of production and comprehension (Pickering & Garrod, in press). Similarly to both ET and MT, this theory assumes parity between production and perception. Uniquely, however, it posits that listeners simulate speakers’ utterances by computing forward-model predictions at many different levels, which are then compared to the incoming phonetic input. In our account phonetic imitation can be achieved via the same mechanism that is responsible for sensorimotor adaptation; i.e. the correction of prediction errors. In addition, the model assumes that the degree to which sensory prediction errors lead to motor adjustments is context-dependent. The notion of context subsumes both the preceding linguistic input and non-linguistic attributes of the situation (e.g., the speaker’s and listener’s social identities, their conversational roles, the listener’s intention to imitate).

  5. Identifying Deceptive Speech Across Cultures

    Science.gov (United States)

    2016-06-25

    enough from the truth. Subjects were then interviewed individually in a sound booth to obtain “norming” speech data, pre- interview. We also...e.g. pitch, intensity, speaking rate, voice quality), gender, ethnicity and personality information, our machine learning experiments can classify...Have you ever been in trouble with the police?” vs. open-ended (e.g. “What is the last movie you saw that you really hated ?”) DISTRIBUTION A

  6. Continuous tokamaks

    International Nuclear Information System (INIS)

    Peng, Y.K.M.

    1978-04-01

    A tokamak configuration is proposed that permits the rapid replacement of a plasma discharge in a “burn” chamber by another one in a time scale much shorter than the elementary thermal time constant of the chamber first wall. With respect to the chamber, the effective duty cycle factor can thus be made arbitrarily close to unity minimizing the cyclic thermal stress in the first wall. At least one plasma discharge always exists in the new tokamak configuration, hence, a continuous tokamak. By incorporating adiabatic toroidal compression, configurations of continuous tokamak compressors are introduced. To operate continuous tokamaks, it is necessary to introduce the concept of mixed poloidal field coils, which spatially groups all the poloidal field coils into three sets, all contributing simultaneously to inducing the plasma current and maintaining the proper plasma shape and position. Preliminary numerical calculations of axisymmetric MHD equilibria in continuous tokamaks indicate the feasibility of their continued plasma operation. Advanced concepts of continuous tokamaks to reduce the topological complexity and to allow the burn plasma aspect ratio to decrease for increased beta are then suggested

  7. Design and realisation of an audiovisual speech activity detector

    NARCIS (Netherlands)

    Van Bree, K.C.

    2006-01-01

    For many speech telecommunication technologies a robust speech activity detector is important. An audio-only speech detector will give false positives when the interfering signal is speech or has speech characteristics. The modality video is suitable to solve this problem. In this report the approach

  8. Extensions to the Speech Disorders Classification System (SDCS)

    Science.gov (United States)

    Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.

    2010-01-01

    This report describes three extensions to a classification system for paediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). Part I describes a classification extension to the SDCS to differentiate motor speech disorders from speech delay and to differentiate among three sub-types of motor speech disorders.…

  9. Speech parts as Poisson processes.

    Science.gov (United States)

    Badalamenti, A F

    2001-09-01

    This paper presents evidence that six of the seven parts of speech occur in written text as Poisson processes, simple or recurring. The six major parts are nouns, verbs, adjectives, adverbs, prepositions, and conjunctions, with the interjection occurring too infrequently to support a model. The data consist of more than the first 5000 words of works by four major authors coded to label the parts of speech, as well as periods (sentence terminators). Sentence length is measured via the period and found to be normally distributed with no stochastic model identified for its occurrence. The models for all six speech parts but the noun significantly distinguish some pairs of authors and likewise for the joint use of all word types. Any one author is significantly distinguished from any other by at least one word type and sentence length very significantly distinguishes each from all others. The variety of word type use, measured by Shannon entropy, builds to about 90% of its maximum possible value. The rate constants for nouns are close to the fractions of maximum entropy achieved. This finding together with the stochastic models and the relations among them suggest that the noun may be a primitive organizer of written text.
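
    The entropy measure mentioned in this abstract can be illustrated with a minimal sketch. It assumes the "fraction of maximum entropy" is the Shannon entropy of the part-of-speech distribution divided by log2 of the number of categories; the abstract does not give the exact normalization, and the function name `entropy_fraction`, the `n_categories` parameter, and the toy tag sequence are hypothetical.

```python
import math
from collections import Counter

def entropy_fraction(tags, n_categories=None):
    """Shannon entropy of the tag distribution as a fraction of its maximum.

    The maximum is log2(k), reached when all k categories are equally likely.
    k defaults to the number of distinct tags observed (an assumption; the
    abstract does not say which normalization the paper uses).
    """
    counts = Counter(tags)
    total = sum(counts.values())
    k = n_categories if n_categories is not None else len(counts)
    if total == 0 or k < 2:
        return 0.0  # entropy is zero (or undefined) with fewer than two categories
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return h / math.log2(k)

# Hypothetical toy sequence of part-of-speech labels, not data from the paper.
toy_tags = ["noun", "verb", "noun", "adjective", "noun", "preposition", "noun", "verb"]
print(round(entropy_fraction(toy_tags), 3))
```

    Passing `n_categories=7` would normalize against all seven parts of speech instead of only those observed, which may be closer to what the abstract calls the maximum possible value.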

  10. Quadcopter Control Using Speech Recognition

    Science.gov (United States)

    Malik, H.; Darma, S.; Soekirno, S.

    2018-04-01

    This research compared the success rates of speech recognition systems built on two types of databases, an existing database and a new one, and implemented them as motion control for a quadcopter. The speech recognition system used the Mel frequency cepstral coefficient (MFCC) method for feature extraction, and the extracted features were used to train a recursive neural network (RNN). MFCC is one of the most widely used feature extraction methods for speech recognition, with a success rate of 80% - 95%. The existing database was used to measure the success rate of the RNN method. The new database was created in the Indonesian language, and its success rate was compared with the results from the existing database. Sound input from the microphone was processed on a DSP module with the MFCC method to obtain the characteristic values. These characteristic values were then fed to the RNN, which produced a command. The command became a control input to the single board computer (SBC), which in turn produced the movement of the quadcopter. On the SBC, we used the robot operating system (ROS) as the kernel (operating system).
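
    As a rough illustration of the MFCC step named in this abstract, the sketch below shows only the mel-scale mapping that places the filterbank centres. It assumes the commonly used formula mel = 2595 * log10(1 + f / 700); the abstract does not say which variant, filter count, or frequency range the authors used, so the function names and the numbers in the usage line are hypothetical.

```python
import math

def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (common 2595 / 700 variant, assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_center_frequencies(n_filters, f_min_hz, f_max_hz):
    """Centre frequencies (Hz) of n_filters filters spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]

# Hypothetical settings: 10 filters between 0 Hz and 4000 Hz.
for f in mel_center_frequencies(10, 0.0, 4000.0):
    print(round(f, 1))
```

    A full MFCC pipeline would additionally frame the signal, take a power spectrum, apply triangular filters at these centres, take logarithms, and apply a discrete cosine transform; those steps are omitted here.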

  11. Tracking Gendered Streams

    Directory of Open Access Journals (Sweden)

    Maria Eriksson

    2017-10-01

    One of the most prominent features of digital music services is the provision of personalized music recommendations that come about through the profiling of users and audiences. Based on a range of "bot experiments," this article investigates if, and how, gendered patterns in music recommendations are provided by the streaming service Spotify. While our experiments did not give any strong indications that Spotify assigns different taste profiles to male and female users, the study showed that male artists were highly overrepresented in Spotify's music recommendations; an issue which we argue prompts users to cite hegemonic masculine norms within the music industries. Although the results should be approached as historically and contextually contingent, we argue that they point to how gender and gendered tastes may be constituted through the interplay between users and algorithmic knowledge-making processes, and how digital content delivery may maintain and challenge gender relations and gendered power differentials within the music industries. Seen through the lens of critical research on software, music and gender performativity, the experiments thus provide insights into how gender is shaped and attributed meaning as it materializes in contemporary music streams.

  12. The LHCb Turbo stream

    CERN Document Server

    AUTHOR|(CDS)2070171

    2016-01-01

    The LHCb experiment will record an unprecedented dataset of beauty and charm hadron decays during Run II of the LHC, set to take place between 2015 and 2018. A key computing challenge is to store and process this data, which limits the maximum output rate of the LHCb trigger. So far, LHCb has written out a few kHz of events containing the full raw sub-detector data, which are passed through a full offline event reconstruction before being considered for physics analysis. Charm physics in particular is limited by trigger output rate constraints. A new streaming strategy includes the possibility to perform the physics analysis with candidates reconstructed in the trigger, thus bypassing the offline reconstruction. In the Turbo stream the trigger will write out a compact summary of physics objects containing all information necessary for analyses. This will allow an increased output rate and thus higher average efficiencies and smaller selection biases. This idea will be commissioned and developed during 2015 wi...

  13. A Diagnostic Marker to Discriminate Childhood Apraxia of Speech from Speech Delay: Introduction

    Science.gov (United States)

    Shriberg, Lawrence D.; Strand, Edythe A.; Fourakis, Marios; Jakielski, Kathy J.; Hall, Sheryl D.; Karlsson, Heather B.; Mabie, Heather L.; McSweeny, Jane L.; Tilkens, Christie M.; Wilson, David L.

    2017-01-01

    Purpose: The goal of this article is to introduce the pause marker (PM), a single-sign diagnostic marker proposed to discriminate early or persistent childhood apraxia of speech (CAS) from speech delay.

  14. Effect of speech rate variation on acoustic phone stability in Afrikaans speech recognition

    CSIR Research Space (South Africa)

    Badenhorst, JAC

    2007-11-01

    The authors analyse the effect of speech rate variation on Afrikaans phone stability from an acoustic perspective. Specifically they introduce two techniques for the acoustic analysis of speech rate variation, apply these techniques to an Afrikaans...

  15. VALUE STREAM MAPPING IN THE ROMANIAN FOOTWEAR INDUSTRY

    Directory of Open Access Journals (Sweden)

    Sorin BRICIU

    2015-04-01

    Cost reduction, productivity increase and creating value for the client are just a few of the arguments that managers use when they adopt Lean philosophy. Businesses’ concern is to create products that have value in the eyes of the client, continuously analyzing the existing value stream in order to improve it. Value stream mapping (VSM) is a technique used to visually present the chain of processes, within the company, necessary to obtain the product. Owing to its many advantages and the ease of use experienced by Toyota since the 1980s, the use of VSM has steadily increased as managers discovered this activity improvement technique. The article presents a case study of the application of VSM in the footwear industry.

  16. SST: Single-Stream Temporal Action Proposals

    KAUST Repository

    Buch, Shyamal; Escorcia, Victor; Shen, Chuanqi; Ghanem, Bernard; Niebles, Juan Carlos

    2017-01-01

    Our paper presents a new approach for temporal detection of human actions in long, untrimmed video sequences. We introduce Single-Stream Temporal Action Proposals (SST), a new effective and efficient deep architecture for the generation of temporal action proposals. Our network can run continuously in a single stream over very long input video sequences, without the need to divide input into short overlapping clips or temporal windows for batch processing. We demonstrate empirically that our model outperforms the state-of-the-art on the task of temporal action proposal generation, while achieving some of the fastest processing speeds in the literature. Finally, we demonstrate that using SST proposals in conjunction with existing action classifiers results in improved state-of-the-art temporal action detection performance.

  17. SST: Single-Stream Temporal Action Proposals

    KAUST Repository

    Buch, Shyamal

    2017-11-09

    Our paper presents a new approach for temporal detection of human actions in long, untrimmed video sequences. We introduce Single-Stream Temporal Action Proposals (SST), a new effective and efficient deep architecture for the generation of temporal action proposals. Our network can run continuously in a single stream over very long input video sequences, without the need to divide input into short overlapping clips or temporal windows for batch processing. We demonstrate empirically that our model outperforms the state-of-the-art on the task of temporal action proposal generation, while achieving some of the fastest processing speeds in the literature. Finally, we demonstrate that using SST proposals in conjunction with existing action classifiers results in improved state-of-the-art temporal action detection performance.

  18. Speech Intelligibility Evaluation for Mobile Phones

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Cubick, Jens; Dau, Torsten

    2015-01-01

    In the development process of modern telecommunication systems, such as mobile phones, it is common practice to use computer models to objectively evaluate the transmission quality of the system, instead of time-consuming perceptual listening tests. Such models have typically focused on the quality...... of the transmitted speech, while little or no attention has been provided to speech intelligibility. The present study investigated to what extent three state-of-the-art speech intelligibility models could predict the intelligibility of noisy speech transmitted through mobile phones. Sentences from the Danish...... Dantale II speech material were mixed with three different kinds of background noise, transmitted through three different mobile phones, and recorded at the receiver via a local network simulator. The speech intelligibility of the transmitted sentences was assessed by six normal-hearing listeners...

  19. Primary progressive aphasia and apraxia of speech.

    Science.gov (United States)

    Jung, Youngsin; Duffy, Joseph R; Josephs, Keith A

    2013-09-01

    Primary progressive aphasia is a neurodegenerative syndrome characterized by progressive language dysfunction. The majority of primary progressive aphasia cases can be classified into three subtypes: nonfluent/agrammatic, semantic, and logopenic variants. Each variant presents with unique clinical features, and is associated with distinctive underlying pathology and neuroimaging findings. Unlike primary progressive aphasia, apraxia of speech is a disorder that involves inaccurate production of sounds secondary to impaired planning or programming of speech movements. Primary progressive apraxia of speech is a neurodegenerative form of apraxia of speech, and it should be distinguished from primary progressive aphasia given its discrete clinicopathological presentation. Recently, there have been substantial advances in our understanding of these speech and language disorders. The clinical, neuroimaging, and histopathological features of primary progressive aphasia and apraxia of speech are reviewed in this article. The distinctions among these disorders for accurate diagnosis are increasingly important from a prognostic and therapeutic standpoint.

  20. Recent advances in nonlinear speech processing

    CERN Document Server

    Faundez-Zanuy, Marcos; Esposito, Antonietta; Cordasco, Gennaro; Drugman, Thomas; Solé-Casals, Jordi; Morabito, Francesco

    2016-01-01

    This book presents recent advances in nonlinear speech processing beyond nonlinear techniques. It shows that it exploits heuristic and psychological models of human interaction in order to succeed in the implementations of socially believable VUIs and applications for human health and psychological support. The book takes into account the multifunctional role of speech and what is “outside of the box” (see Björn Schuller’s foreword). To this aim, the book is organized in 6 sections, each collecting a small number of short chapters reporting advances “inside” and “outside” themes related to nonlinear speech research. The themes emphasize theoretical and practical issues for modelling socially believable speech interfaces, ranging from efforts to capture the nature of sound changes in linguistic contexts and the timing nature of speech; labors to identify and detect speech features that help in the diagnosis of psychological and neuronal disease, attempts to improve the effectiveness and performa...