WorldWideScience

Sample records for machine coded speech

  1. Speech coding

    Energy Technology Data Exchange (ETDEWEB)

    Ravishankar, C., Hughes Network Systems, Germantown, MD

    1998-05-08

    Speech is the predominant means of communication between human beings, and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage of the speech signal getting corrupted by noise, cross-talk and distortion. Long-haul transmissions, which use repeaters to compensate for the loss in signal strength on transmission links, also increase the associated noise and distortion. Digital transmission, on the other hand, is relatively immune to noise, cross-talk and distortion, primarily because the digital signal can be faithfully regenerated at each repeater purely on the basis of a binary decision. Hence the end-to-end performance of the digital link becomes essentially independent of the length and operating frequency bands of the link, and from a transmission point of view digital transmission has been the preferred approach due to its higher immunity to noise. The need to carry digital speech became extremely important from a service provision point of view as well. Modern requirements have introduced the need for robust, flexible and secure services that can carry a multitude of signal types (such as voice, data and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term speech coding usually refers to techniques that represent or code speech signals either directly as a waveform or as a set of parameters obtained by analyzing the speech signal. In either case, the codes are transmitted to the distant end, where speech is reconstructed or synthesized using the received set of codes. A more generic term for these techniques, often used interchangeably with speech coding, is voice coding. This term is more generic in the sense that the
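
    The waveform branch of speech coding mentioned in this record can be sketched with the mu-law companding curve standardized in ITU-T G.711; the helper names below are illustrative, not from the record:

    ```python
    import math

    MU = 255  # mu-law constant from ITU-T G.711 (North American standard)

    def mulaw_encode(x: float) -> float:
        """Compress a sample in [-1, 1] with the mu-law characteristic."""
        return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

    def mulaw_decode(y: float) -> float:
        """Invert the mu-law compression."""
        return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

    # Small amplitudes are expanded before uniform quantization, which is
    # what gives companded waveform coders finer resolution near zero.
    coded = mulaw_encode(0.5)
    decoded = mulaw_decode(coded)
    ```

    In a real G.711 codec the companded value is then quantized to 8 bits; the round-trip above omits that step to keep the sketch short.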

  2. Principles of speech coding

    CERN Document Server

    Ogunfunmi, Tokunbo

    2010-01-01

    It is becoming increasingly apparent that all forms of communication-including voice-will be transmitted through packet-switched networks based on the Internet Protocol (IP). Therefore, the design of modern devices that rely on speech interfaces, such as cell phones and PDAs, requires a complete and up-to-date understanding of the basics of speech coding. Outlines key signal processing algorithms used to mitigate impairments to speech quality in VoIP networksOffering a detailed yet easily accessible introduction to the field, Principles of Speech Coding provides an in-depth examination of the

  3. Speech and audio processing for coding, enhancement and recognition

    CERN Document Server

    Togneri, Roberto; Narasimha, Madihally

    2015-01-01

    This book describes the basic principles underlying the generation, coding, transmission and enhancement of speech and audio signals, including advanced statistical and machine learning techniques for speech and speaker recognition, with an overview of the key innovations in these areas. Key research undertaken in speech coding, speech enhancement, speech recognition, emotion recognition and speaker diarization is also presented, along with recent advances and new paradigms in these areas. Offers readers a single-source reference on the significant applications of speech and audio processing to speech coding, speech enhancement and speech/speaker recognition; enables readers involved in algorithm development and implementation issues for speech coding to understand the historical development and future challenges in speech coding research; discusses speech coding methods yielding bit-streams that are multi-rate and scalable for Voice-over-IP (VoIP) networks; ...

  4. Speech coding code- excited linear prediction

    CERN Document Server

    Bäckström, Tom

    2017-01-01

    This book provides a scientific understanding of the most central techniques used in speech coding, both for advanced students and for professionals with a background in speech, audio, and/or digital signal processing. It provides a clear connection between the whys, hows, and whats, thus enabling a clear view of the necessity, purpose, and solutions provided by various tools, as well as their strengths and weaknesses in each respect. Equivalently, this book sheds light on the following perspectives for each technology presented. Objective: What do we want to achieve, and especially, why is this goal important? Resource (Information): What information is available, and how can it be useful? Resource (Platform): What kind of platforms are we working with, and what are their capabilities and restrictions? This includes computational, memory, and acoustic properties and the transmission capacity of devices used. The book goes on to address Solutions: Which solutions have been proposed, and how can they be used to reach the stated goals? ...

  5. Reusable State Machine Code Generator

    Science.gov (United States)

    Hoffstadt, A. A.; Reyes, C.; Sommer, H.; Andolfato, L.

    2010-12-01

    The State Machine model is frequently used to represent the behaviour of a system, allowing one to express and execute this behaviour in a deterministic way. A graphical representation such as a UML State Chart diagram tames the complexity of the system, thus facilitating changes to the model and communication between developers and domain experts. We present a reusable state machine code generator, developed by the Universidad Técnica Federico Santa María and the European Southern Observatory. The generator itself is based on the open-source project architecture and uses UML State Chart models as input. This allows for a modular design and a clean separation between the generator and the generated code. The generated state machine code has well-defined interfaces that are independent of implementation artefacts such as the middleware. This allows the generator to be used in the substantially different observatory software of the Atacama Large Millimeter Array and the ESO Very Large Telescope. A project-specific mapping layer for event and transition notification connects the state machine code to its environment, which can be the Common Software of these projects, or any other project. This approach even makes it possible to automatically create tests for a generated state machine, using techniques from software testing such as path coverage.
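
    As a rough illustration of the separation the record describes between generic dispatch logic and generated model data, a generated state machine might reduce to something like the following sketch; the class, state, and event names are hypothetical, not the ESO generator's actual output:

    ```python
    class StateMachine:
        """Generic dispatch engine; the state/transition *data* plays the
        role of the generated part, mirroring the generator/generated
        separation described in the record."""

        def __init__(self, initial, transitions):
            self.state = initial
            self.transitions = transitions  # {(state, event): next_state}

        def dispatch(self, event):
            # Unhandled (state, event) pairs leave the state unchanged.
            self.state = self.transitions.get((self.state, event), self.state)
            return self.state

    sm = StateMachine("IDLE", {("IDLE", "start"): "RUNNING",
                               ("RUNNING", "stop"): "IDLE"})
    sm.dispatch("start")   # IDLE -> RUNNING
    sm.dispatch("stop")    # RUNNING -> IDLE
    ```

    Because transitions are plain data, path-coverage tests of the kind the record mentions can be derived by enumerating the transition table.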

  6. Liberalism, Speech Codes, and Related Problems.

    Science.gov (United States)

    Sunstein, Cass R.

    1993-01-01

    It is argued that universities are pervasively and necessarily engaged in regulation of speech, which complicates many existing claims about hate speech codes on campus. The ultimate test is whether the restriction on speech is a legitimate part of the institution's mission, commitment to liberal education. (MSE)

  8. Machine speech and speaking about machines

    Energy Technology Data Exchange (ETDEWEB)

    Nye, A. [Univ. of Wisconsin, Whitewater, WI (United States)]

    1996-12-31

    Current philosophy of language prides itself on scientific status. It boasts of being no longer contaminated with queer mental entities or idealist essences. It theorizes language as programmable variants of formal semantic systems, reimaginable either as the properly epiphenomenal machine functions of computer science or the properly material neural networks of physiology. Whether or not such models properly capture the physical workings of a living human brain is a question that scientists will have to answer. I, as a philosopher, come at the problem from another direction. Does contemporary philosophical semantics, in its dominant truth-theoretic and related versions, capture actual living human thought as it is experienced, or does it instead reflect, regardless of (perhaps dubious) scientific credentials, a pathology of thought, a pathology with a disturbing social history?

  9. Ultra low bit-rate speech coding

    CERN Document Server

    Ramasubramanian, V

    2015-01-01

    "Ultra Low Bit-Rate Speech Coding" focuses on the specialized topic of speech coding at very low bit-rates of 1 Kbits/sec and less, particularly at the lower ends of this range, down to 100 bps. The authors set forth the fundamental results and trends that form the basis for such ultra low bit-rates to be viable and provide a comprehensive overview of various techniques and systems in literature to date, with particular attention to their work in the paradigm of unit-selection based segment quantization. The book is for research students, academic faculty and researchers, and industry practitioners in the areas of speech processing and speech coding.

  10. INTEGRATING MACHINE TRANSLATION AND SPEECH SYNTHESIS COMPONENT FOR ENGLISH TO DRAVIDIAN LANGUAGE SPEECH TO SPEECH TRANSLATION SYSTEM

    Directory of Open Access Journals (Sweden)

    J. SANGEETHA

    2015-02-01

    This paper describes an interface between the machine translation and speech synthesis components of an English-to-Tamil speech-to-speech translation system. The speech translation system consists of three modules: automatic speech recognition, machine translation, and text-to-speech synthesis. Many procedures for integrating speech recognition and machine translation have been proposed, but the speech synthesis component has not yet received the same attention. In this paper, we focus on the integration of machine translation and speech synthesis, and report a subjective evaluation investigating the impact of the speech synthesis component, the machine translation component, and their integration. We implement a hybrid machine translation system (a combination of rule-based and statistical machine translation) and a concatenative syllable-based speech synthesis technique. In order to retain the naturalness and intelligibility of the synthesized speech, Auto-Associative Neural Network (AANN) prosody prediction is used in this work. The results of this investigation demonstrate that the naturalness and intelligibility of the synthesized speech are strongly influenced by the fluency and correctness of the translated text.

  11. Sparsity in Linear Predictive Coding of Speech

    DEFF Research Database (Denmark)

    Giacobello, Daniele

    ...modern speech coders. In the first part of the thesis, we provide an overview of Sparse Linear Prediction, a set of speech processing tools created by introducing sparsity constraints into the LP framework. This approach defines predictors that look for a sparse residual rather than a minimum-variance one ... of high-order sparse predictors. These predictors, by efficiently modeling the spectral envelope and the harmonic components with very few coefficients, have direct applications in speech processing, engendering a joint estimation of short-term and long-term predictors. We also give preliminary results ... sensing formulation. Furthermore, we define a novel re-estimation procedure to adapt the predictor coefficients to the given sparse excitation, balancing the two representations in the context of speech coding. Finally, the advantages of the compact parametric representation of a segment of speech, given...
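
    The sparse residual the thesis pursues can be illustrated with a minimal linear-prediction sketch: when the predictor models the signal well, the residual has only a few nonzero samples. The function name and test signal are illustrative, not from the thesis:

    ```python
    def lp_residual(signal, coeffs):
        """Prediction residual e[n] = x[n] - sum_k a[k] * x[n-1-k]."""
        order = len(coeffs)
        residual = []
        for n in range(len(signal)):
            pred = sum(coeffs[k] * signal[n - 1 - k]
                       for k in range(order) if n - 1 - k >= 0)
            residual.append(signal[n] - pred)
        return residual

    # A first-order predictor with a = [1.0] perfectly models a constant
    # signal after its onset, so the residual is sparse: one nonzero sample.
    e = lp_residual([1.0, 1.0, 1.0, 1.0], [1.0])
    ```

    Classical minimum-variance LP instead minimizes the residual's energy; the sparse formulation trades that criterion for one that concentrates the residual in few samples, which suits code-excited coding.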

  12. Only Speech Codes Should Be Censored

    Science.gov (United States)

    Pavela, Gary

    2006-01-01

    In this article, the author discusses the enforcement of "hate speech" codes and confirms research that considers why U.S. colleges and universities continue to promulgate student disciplinary rules prohibiting expression that "subordinates" others or is "demeaning, offensive, or hateful." Such continued adherence to…

  13. Machine structure oriented control code logic

    NARCIS (Netherlands)

    Bergstra, J.A.; Middelburg, C.A.

    2009-01-01

    Control code is a concept that is closely related to a frequently occurring practitioner’s view on what is a program: code that is capable of controlling the behaviour of some machine. We present a logical approach to explain issues concerning control codes that are independent of the details of the

  14. Machines are benchmarked by code, not algorithms

    NARCIS (Netherlands)

    Poss, R.

    2013-01-01

    This article highlights how small modifications to either the source code of a benchmark program or the compilation options may impact its behavior on a specific machine. It argues that, for evaluating machines, benchmark providers and users must be careful to ensure reproducibility of results based on th

  15. Auditory inspired machine learning techniques can improve speech intelligibility and quality for hearing-impaired listeners.

    Science.gov (United States)

    Monaghan, Jessica J M; Goehring, Tobias; Yang, Xin; Bolner, Federico; Wang, Shangqiguo; Wright, Matthew C M; Bleeck, Stefan

    2017-03-01

    Machine-learning based approaches to speech enhancement have recently shown great promise for improving speech intelligibility for hearing-impaired listeners. Here, the performance of three machine-learning algorithms and one classical algorithm, Wiener filtering, was compared. Two algorithms based on neural networks were examined, one using a previously reported feature set and one using a feature set derived from an auditory model. The third machine-learning approach was a dictionary-based sparse-coding algorithm. Speech intelligibility and quality scores were obtained for participants with mild-to-moderate hearing impairments listening to sentences in speech-shaped noise and multi-talker babble following processing with the algorithms. Intelligibility and quality scores were significantly improved by each of the three machine-learning approaches, but not by the classical approach. The largest improvements for both speech intelligibility and quality were found by implementing a neural network using the feature set based on auditory modeling. Furthermore, neural network based techniques appeared more promising than dictionary-based, sparse coding in terms of performance and ease of implementation.

  16. University Hate Speech Codes: Toward an Approach Restricting Verbal Attack.

    Science.gov (United States)

    Hanson, Jim

    This paper reviews events leading to the University of Michigan speech codes, identifies the state of the law following the Doe v. the University of Michigan decision, points out problems in suggested alternatives to the code, and outlines an approach that protects students from hate speech while maintaining first amendment rights. The paper first…

  17. Efficient Derivation and Approximations of Cepstral Coefficients for Speech Coding

    Science.gov (United States)

    1992-12-01

    A new formulation is presented for the calculation of cepstral coefficients directly from measured sine-wave amplitudes and frequencies of speech waveforms. Approximations to these cepstral coefficients are shown to be suitable for operation in a real-time speech coding environment. The algorithms were implemented in the C programming language and then evaluated through experiments conducted on the McAulay-Quatieri Sinusoidal Transform Coder (STC). Keywords: speech coding, cepstral processing.

  18. Spotlight on Speech Codes 2012: The State of Free Speech on Our Nation's Campuses

    Science.gov (United States)

    Foundation for Individual Rights in Education (NJ1), 2012

    2012-01-01

    The U.S. Supreme Court has called America's colleges and universities "vital centers for the Nation's intellectual life," but the reality today is that many of these institutions severely restrict free speech and open debate. Speech codes--policies prohibiting student and faculty speech that would, outside the bounds of campus, be…

  19. Spotlight on Speech Codes 2007: The State of Free Speech on Our Nation's Campuses

    Science.gov (United States)

    Foundation for Individual Rights in Education (NJ1), 2007

    2007-01-01

    Last year, the Foundation for Individual Rights in Education (FIRE) conducted its first-ever comprehensive study of restrictions on speech at America's colleges and universities, "Spotlight on Speech Codes 2006: The State of Free Speech on our Nation's Campuses." In light of the essentiality of free expression to a truly liberal…

  20. Reversible machine code and its abstract processor architecture

    DEFF Research Database (Denmark)

    Axelsen, Holger Bock; Glück, Robert; Yokoyama, Tetsuo

    2007-01-01

    A reversible abstract machine architecture and its reversible machine code are presented and formalized. For machine code to be reversible, both the underlying control logic and each instruction must be reversible. A general class of machine instruction sets was proven to be reversible, building...

  1. Optimal interference code based on machine learning

    Science.gov (United States)

    Qian, Ye; Chen, Qian; Hu, Xiaobo; Cao, Ercong; Qian, Weixian; Gu, Guohua

    2016-10-01

    In this paper, we analyze the characteristics of pseudo-random codes, taking the m-sequence as a case study. Drawing on coding theory, we introduce jamming methods, and we simulate the interference effect and a probability model in MATLAB. Based on the decoding time the adversary requires, we derive an optimal formula and optimal coefficients using machine learning, and thus obtain a new optimal interference code. First, in the recognition phase, the study judges the effect of interference by simulating the time needed to decode the laser seeker's code. Next, laser active deception jamming is used to simulate the interference process in the tracking phase. To improve interference performance, the model is simulated in MATLAB to find the least number of pulse intervals that must be received, from which the precise interval number of the laser pointer's m-sequence encoding can be determined. The shortest interval is found with the greatest-common-divisor method. Then, combining this with the coding regularity found earlier, the pulse intervals of the received pseudo-random code are restored. Finally, the time period of the laser interference can be controlled, the optimal interference code obtained, and the probability of successful interference increased.
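
    The m-sequence and greatest-common-divisor steps in this record can be sketched as follows, assuming a small Fibonacci LFSR with a primitive feedback polynomial; the tap positions and interval values are illustrative, not taken from the paper:

    ```python
    from functools import reduce
    from math import gcd

    def m_sequence(taps, state, length):
        """Fibonacci LFSR output stream. With taps [0, 3] on a 4-bit
        register the feedback polynomial is primitive, so the output is a
        maximal-length sequence with period 2**4 - 1 = 15."""
        out = []
        for _ in range(length):
            out.append(state[-1])        # emit the last register bit
            fb = 0
            for t in taps:               # XOR of the tapped bits
                fb ^= state[t]
            state = [fb] + state[:-1]    # shift, feeding back fb
        return out

    seq = m_sequence(taps=[0, 3], state=[1, 0, 0, 0], length=30)

    # Recovering the elementary pulse spacing from observed intervals via
    # the greatest-common-divisor step the abstract mentions:
    intervals = [30, 45, 75]
    base = reduce(gcd, intervals)
    ```

    A maximal-length sequence of period 15 contains exactly eight ones, a standard balance property of m-sequences that can be used as a sanity check.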

  2. Lattice Vector Quantization Applied to Speech and Audio Coding

    Institute of Scientific and Technical Information of China (English)

    Minjie Xie

    2012-01-01

    Lattice vector quantization (LVQ) has been used for real-time speech and audio coding systems. Compared with conventional vector quantization, LVQ has two main advantages: It has a simple and fast encoding process, and it significantly reduces the amount of memory required. Therefore, LVQ is suitable for use in low-complexity speech and audio coding. In this paper, we describe the basic concepts of LVQ and its advantages over conventional vector quantization. We also describe some LVQ techniques that have been used in speech and audio coding standards of international standards developing organizations (SDOs).
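
    As a concrete instance of the fast encoding this record attributes to LVQ, the classic Conway–Sloane rounding rule finds the nearest point of the D_n lattice (integer vectors with even coordinate sum) in linear time. This sketch is illustrative and not taken from the article:

    ```python
    def quantize_Dn(x):
        """Nearest point of the D_n lattice, via the Conway–Sloane rule:
        round every coordinate, and if the coordinate sum comes out odd,
        re-round the worst coordinate the other way."""
        f = [round(v) for v in x]
        if sum(f) % 2 != 0:
            # coordinate with the largest rounding error
            i = max(range(len(x)), key=lambda k: abs(x[k] - f[k]))
            f[i] += 1 if x[i] > f[i] else -1
        return f

    nearest = quantize_Dn([0.6, 0.1, 0.1, -0.2])  # plain rounding is odd here
    ```

    No codebook search is needed, which is the source of LVQ's speed and memory advantage over trained vector quantizers.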

  3. Mistaking minds and machines: How speech affects dehumanization and anthropomorphism.

    Science.gov (United States)

    Schroeder, Juliana; Epley, Nicholas

    2016-11-01

    Treating a human mind like a machine is an essential component of dehumanization, whereas attributing a humanlike mind to a machine is an essential component of anthropomorphism. Here we tested how a cue closely connected to a person's actual mental experience, a humanlike voice, affects the likelihood of mistaking a person for a machine, or a machine for a person. We predicted that paralinguistic cues in speech are particularly likely to convey the presence of a humanlike mind, such that removing voice from communication (leaving only text) would increase the likelihood of mistaking the text's creator for a machine. Conversely, adding voice to a computer-generated script (resulting in speech) would increase the likelihood of mistaking the text's creator for a human. Four experiments confirmed these hypotheses, demonstrating that people are more likely to infer a human (vs. computer) creator when they hear a voice expressing thoughts than when they read the same thoughts in text. Adding human visual cues to text (i.e., seeing a person perform a script in a subtitled video clip) did not increase the likelihood of inferring a human creator compared with only reading text, suggesting that defining features of personhood may be conveyed more clearly in speech (Experiments 1 and 2). Removing the naturalistic paralinguistic cues that convey humanlike capacity for thinking and feeling, such as varied pace and intonation, eliminates the humanizing effect of speech (Experiment 4). We discuss implications for dehumanizing others through text-based media, and for anthropomorphizing machines through speech-based media.

  4. Reversible machine code and its abstract processor architecture

    DEFF Research Database (Denmark)

    Axelsen, Holger Bock; Glück, Robert; Yokoyama, Tetsuo

    2007-01-01

    A reversible abstract machine architecture and its reversible machine code are presented and formalized. For machine code to be reversible, both the underlying control logic and each instruction must be reversible. A general class of machine instruction sets was proven to be reversible, building on our concept of reversible updates. The presentation is abstract and can serve as a guideline for a family of reversible processor designs. By example, we illustrate programming principles for the abstract machine architecture formalized in this paper.

  5. The Efficient Coding of Speech: Cross-Linguistic Differences.

    Science.gov (United States)

    Guevara Erra, Ramon; Gervain, Judit

    2016-01-01

    Neural coding in the auditory system has been shown to obey the principle of efficient neural coding. The statistical properties of speech appear to be particularly well matched to the auditory neural code. However, only English has so far been analyzed from an efficient coding perspective. It thus remains unknown whether such an approach is able to capture differences between the sound patterns of different languages. Here, we use independent component analysis to derive information theoretically optimal, non-redundant codes (filter populations) for seven typologically distinct languages (Dutch, English, Japanese, Marathi, Polish, Spanish and Turkish) and relate the statistical properties of these filter populations to documented differences in the speech rhythms (Analysis 1) and consonant inventories (Analysis 2) of these languages. We show that consonant class membership plays a particularly important role in shaping the statistical structure of speech in different languages, suggesting that acoustic transience, a property that discriminates consonant classes from one another, is highly relevant for efficient coding.

  6. Compressed Domain Packet Loss Concealment of Sinusoidally Coded Speech

    DEFF Research Database (Denmark)

    Rødbro, Christoffer A.; Christensen, Mads Græsbøll; Andersen, Søren Vang

    2003-01-01

    We consider the problem of packet loss concealment for voice over IP (VoIP). The speech signal is compressed at the transmitter using a sinusoidal coding scheme working at 8 kbit/s. At the receiver, packet loss concealment is carried out working directly on the quantized sinusoidal parameters, based on time-scaling of the packets surrounding the missing ones. Subjective listening tests show promising results, indicating the potential of sinusoidal speech coding for VoIP.

  7. Emotional State Categorization from Speech: Machine vs. Human

    CERN Document Server

    Shaukat, Arslan

    2010-01-01

    This paper presents our investigations on emotional state categorization from speech signals with a psychologically inspired computational model against human performance under the same experimental setup. Based on psychological studies, we propose a multistage categorization strategy which allows establishing an automatic categorization model flexibly for a given emotional speech categorization task. We apply the strategy to the Serbian Emotional Speech Corpus (GEES) and the Danish Emotional Speech Corpus (DES), where human performance was reported in previous psychological studies. Our work is the first attempt to apply machine learning to the GEES corpus where the human recognition rates were only available prior to our study. Unlike the previous work on the DES corpus, our work focuses on a comparison to human performance under the same experimental settings. Our studies suggest that psychology-inspired systems yield behaviours that, to a great extent, resemble what humans perceived and their performance ...

  8. Mandarin Digits Speech Recognition Using Support Vector Machines

    Institute of Scientific and Technical Information of China (English)

    XIE Xiang; KUANG Jing-ming

    2005-01-01

    A method of applying support vector machines (SVMs) to speech recognition is proposed, and a speech recognition system for Mandarin digits was built with SVMs. In the system, vectors were linearly extracted from the speech feature sequence to make up time-aligned input patterns for the SVM, and the decisions of several 2-class SVM classifiers were combined to construct an N-class classifier. Four kinds of SVM kernel functions were compared in speaker-independent recognition experiments on Mandarin digits. The radial basis function kernel achieved the highest accuracy, 99.33%, better than that of the baseline system based on hidden Markov models (HMMs) (97.08%). The experiments also show that SVMs can outperform HMMs especially when the samples available for learning are very limited.

  9. Brain-machine interfaces for real-time speech synthesis.

    Science.gov (United States)

    Guenther, Frank H; Brumberg, Jonathan S

    2011-01-01

    This paper reports on studies involving brain-machine interfaces (BMIs) that provide near-instantaneous audio feedback from a speech synthesizer to the BMI user. In one study, neural signals recorded by an intracranial electrode implanted in a speech-related region of the left precentral gyrus of a human volunteer suffering from locked-in syndrome were transmitted wirelessly across the scalp and used to drive a formant synthesizer, allowing the user to produce vowels. In a second, pilot study, a neurologically normal user was able to drive the formant synthesizer with imagined movements detected using electroencephalography. Our results support the feasibility of neural prostheses that have the potential to provide near-conversational synthetic speech for individuals with severely impaired speech output.

  10. Understanding and Writing G & M Code for CNC Machines

    Science.gov (United States)

    Loveland, Thomas

    2012-01-01

    In modern CAD and CAM manufacturing companies, engineers design parts for machines and consumable goods. Many of these parts are cut on CNC machines. Whether using a CNC lathe, milling machine, or router, the ideas and designs of engineers must be translated into a machine-readable form called G & M Code that can be used to cut parts to precise…
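
    To make the translation step concrete, a CAM post-processor ultimately emits plain-text blocks of G & M codes. The toy generator below is hypothetical, though the codes it uses (G21, G0/G1, M3/M5, M30) are standard:

    ```python
    def square_program(side, feed):
        """Emit a toy part program tracing a square (units: millimetres)."""
        lines = [
            "G21          ; metric units",
            "M3 S1000     ; spindle on, 1000 rpm",
            "G0 X0 Y0     ; rapid move to the origin",
        ]
        for x, y in [(side, 0), (side, side), (0, side), (0, 0)]:
            lines.append(f"G1 X{x} Y{y} F{feed}")  # linear cut at feed rate
        lines += ["M5           ; spindle off",
                  "M30          ; end of program"]
        return "\n".join(lines)

    program = square_program(10, 200)
    ```

    A real post-processor would also handle tool changes, cutter compensation, and machine-specific dialect differences, which is why hand-written G-code remains a useful skill.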

  12. Quantitative information measurement and application for machine component classification codes

    Institute of Scientific and Technical Information of China (English)

    LI Ling-Feng; TAN Jian-rong; LIU Bo

    2005-01-01

    Information embodied in machine component classification codes has an internal relation with the probability distribution of the code symbols. This paper presents a model, based on Shannon's information theory, that treats codes as an information source. Using information entropy, it preserves the mathematical form and quantitatively measures the information amount of a symbol and of a bit in a machine component classification coding system. It also derives the maximum information amount, and the corresponding coding scheme, when the category of symbols is fixed. Samples are given to show how to evaluate the information amount of component codes and how to optimize a coding system.
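
    The entropy measurement this record describes can be sketched in a few lines; the component codes below are hypothetical, and the maximum H = log2(alphabet size) is attained exactly when the symbols are equiprobable, matching the paper's maximum-information condition:

    ```python
    from collections import Counter
    from math import log2

    def symbol_entropy(codes):
        """H = -sum(p_i * log2(p_i)) over the symbol frequencies observed
        across a set of component classification codes."""
        symbols = [s for code in codes for s in code]
        n = len(symbols)
        return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())

    # Hypothetical 4-digit component codes; every symbol ('0'..'3') occurs
    # equally often, so the entropy reaches its maximum, log2(4) = 2 bits.
    codes = ["1203", "1301", "2203"]
    h = symbol_entropy(codes)
    ```

    Comparing H against the log2 of the symbol alphabet size shows how far a coding scheme is from making full use of each code position.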

  13. Machine function based control code algebras

    NARCIS (Netherlands)

    Bergstra, J.A.

    2008-01-01

    Machine functions have been introduced by Earley and Sturgis in [6] in order to provide a mathematical foundation of the use of the T-diagrams proposed by Bratman in [5]. Machine functions describe the operation of a machine at a very abstract level. A theory of hardware and software based on machin

  14. Techniques of Very Low Bit-Rate Speech Coding

    Institute of Scientific and Technical Information of China (English)

    CUI Huijuan; TANG Kun; ZHAO Ming; ZHANG Xin

    2004-01-01

    Techniques of very low bit-rate speech coding, such as below 800 bps, are presented in this paper. The techniques of multi-frame, multi-sub-band, multi-model, and vector quantization, etc., are effective in decreasing the bit-rate of vocoders based on a linear prediction model. These techniques give the vocoder not only high quality of the reconstructed speech, but also robustness. Vocoders that apply these techniques can synthesize clear and intelligible speech with some naturalness. The mean DRT (Diagnostic rhyme test) score is 89.2% for an 800 bps vocoder and 86.3% for a 600 bps vocoder.

  15. Pants and Hats: Dress Codes and Expressive Conduct as Speech.

    Science.gov (United States)

    DeMitchell, Todd A.

    1999-01-01

    It has been 30 years since the U.S. Supreme Court, in the "Tinker" case, upheld three students' right to wear black armbands protesting the Vietnam War to school. Recent cases involving sagging pants and an African headwrap (dress code violations) did not meet allowable "free-speech" requirements. (MLH)

  16. The Efficient Coding of Speech: Cross-Linguistic Differences.

    Directory of Open Access Journals (Sweden)

    Ramon Guevara Erra

    Neural coding in the auditory system has been shown to obey the principle of efficient neural coding. The statistical properties of speech appear to be particularly well matched to the auditory neural code. However, only English has so far been analyzed from an efficient coding perspective. It thus remains unknown whether such an approach is able to capture differences between the sound patterns of different languages. Here, we use independent component analysis to derive information theoretically optimal, non-redundant codes (filter populations) for seven typologically distinct languages (Dutch, English, Japanese, Marathi, Polish, Spanish and Turkish) and relate the statistical properties of these filter populations to documented differences in the speech rhythms (Analysis 1) and consonant inventories (Analysis 2) of these languages. We show that consonant class membership plays a particularly important role in shaping the statistical structure of speech in different languages, suggesting that acoustic transience, a property that discriminates consonant classes from one another, is highly relevant for efficient coding.

  17. Continuous speech recognition with sparse coding

    CSIR Research Space (South Africa)

    Smit, WJ

    2009-04-01

    Full Text Available. This algorithm finds a sparse code in reasonable time if the input is limited to a fairly coarse spectral resolution. At this resolution, our system achieves a word error rate of 19%, whereas a system based on Hidden Markov Models achieves a word error rate of 15...
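
Finding a sparse code for a spectral frame can be sketched with greedy matching pursuit. This is a generic illustration of sparse coding, not the authors' algorithm; the dictionary and signal are random stand-ins:

```python
import numpy as np

def matching_pursuit(x, D, n_atoms=5):
    """Greedy sparse coding: repeatedly pick the dictionary atom
    (column of D, unit norm) most correlated with the residual."""
    r, code = x.copy(), np.zeros(D.shape[1])
    for _ in range(n_atoms):
        j = np.abs(D.T @ r).argmax()   # most correlated atom
        c = D[:, j] @ r                # its coefficient
        code[j] += c
        r -= c * D[:, j]               # subtract its contribution
    return code, r

rng = np.random.default_rng(0)
D = rng.normal(size=(32, 64))
D /= np.linalg.norm(D, axis=0)          # unit-norm atoms
x = 2.0 * D[:, 3] - 1.5 * D[:, 40]      # signal built from two atoms
code, resid = matching_pursuit(x, D, n_atoms=5)
print(np.nonzero(np.round(code, 6))[0])  # indices of the atoms used
```

The resulting code is mostly zeros, which is what makes such representations attractive as a front end for recognition.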

  18. Shared acoustic codes underlie emotional communication in music and speech-Evidence from deep transfer learning.

    Science.gov (United States)

    Coutinho, Eduardo; Schuller, Björn

    2017-01-01

    Music and speech exhibit striking similarities in the communication of emotions in the acoustic domain, in such a way that the communication of specific emotions is achieved, at least to a certain extent, by means of shared acoustic patterns. From an Affective Sciences point of view, determining the degree of overlap between both domains is fundamental to understanding the shared mechanisms underlying this phenomenon. From a machine learning perspective, the overlap between acoustic codes for emotional expression in music and speech opens new possibilities to enlarge the amount of data available to develop music and speech emotion recognition systems. In this article, we investigate time-continuous predictions of emotion (Arousal and Valence) in music and speech, and the Transfer Learning between these domains. We establish a comparative framework including intra-domain (i.e., models trained and tested on the same modality, either music or speech) and cross-domain experiments (i.e., models trained in one modality and tested on the other). In the cross-domain context, we evaluated two strategies: the direct transfer between domains, and the contribution of Transfer Learning techniques (feature-representation-transfer based on Denoising Auto Encoders) for reducing the gap in the feature space distributions. Our results demonstrate an excellent cross-domain generalisation performance with and without feature representation transfer in both directions. In the case of music, cross-domain approaches outperformed intra-domain models for Valence estimation, whereas for speech intra-domain models achieve the best performance. This is the first demonstration of shared acoustic codes for emotional expression in music and speech in the time-continuous domain.

  19. Listening for the norm: adaptive coding in speech categorization

    Directory of Open Access Journals (Sweden)

    Jingyuan Huang

    2012-02-01

    Full Text Available Perceptual aftereffects have been referred to as the psychologist's microelectrode because they can expose dimensions of representation through the residual effect of a context stimulus upon perception of a subsequent target. The present study uses such context dependence to examine the dimensions of representation involved in a classic demonstration of talker normalization in speech perception. Whereas most accounts of talker normalization have emphasized the significance of talker-, speech-, or articulatory-specific dimensions, the present work tests an alternative hypothesis: that the long-term average spectrum (LTAS) of the speech context is responsible for patterns of context-dependent perception considered to be evidence for talker normalization. In support of this hypothesis, listeners' vowel categorization was equivalently influenced by speech contexts manipulated to sound as though they were spoken by different talkers and by nonspeech analogs matched in LTAS to the speech contexts. Since the nonspeech contexts did not possess talker, speech or articulatory information, general perceptual mechanisms are implicated. Results are described in terms of adaptive perceptual coding.

  20. Tools for signal compression applications to speech and audio coding

    CERN Document Server

    Moreau, Nicolas

    2013-01-01

    This book presents tools and algorithms required to compress/uncompress signals such as speech and music. These algorithms are largely used in mobile phones, DVD players, HDTV sets, etc. In a first rather theoretical part, this book presents the standard tools used in compression systems: scalar and vector quantization, predictive quantization, transform quantization, entropy coding. In particular we show the consistency between these different tools. The second part explains how these tools are used in the latest speech and audio coders. The third part gives Matlab programs simulating t

  1. A portable virtual machine target for proof-carrying code

    DEFF Research Database (Denmark)

    Franz, Michael; Chandra, Deepak; Gal, Andreas

    2005-01-01

    Virtual Machines (VMs) and Proof-Carrying Code (PCC) are two techniques that have been used independently to provide safety for (mobile) code. Existing virtual machines, such as the Java VM, have several drawbacks: First, the effort required for safety verification is considerable. Second and more subtly, the need to provide such verification by the code consumer inhibits the amount of optimization that can be performed by the code producer. This in turn makes just-in-time compilation surprisingly expensive. Proof-Carrying Code, on the other hand, has its own set of limitations, among which... simultaneously providing efficient just-in-time compilation and target-machine independence. In particular, our approach reduces the complexity of the required proofs, resulting in fewer proof obligations that need to be discharged at the target machine.

  2. Medium-rate speech coding simulator for mobile satellite systems

    Science.gov (United States)

    Copperi, Maurizio; Perosino, F.; Rusina, F.; Albertengo, G.; Biglieri, E.

    1986-01-01

    Channel modeling and error protection schemes for speech coding are described. A residual excited linear predictive (RELP) coder for bit rates of 4.8, 7.2, and 9.6 kbit/sec is outlined. The coder at 9.6 kbit/sec incorporates a number of channel error protection techniques, such as bit interleaving, error correction codes, and parameter repetition. Results of formal subjective experiments (DRT and DAM tests) under various channel conditions reveal that the proposed coder outperforms conventional LPC-10 vocoders by 2 subjective categories, thus confirming the suitability of the RELP coder at 9.6 kbit/sec for good-quality speech transmission in mobile satellite systems.
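
The bit-interleaving technique mentioned above can be shown in a few lines: bits are written row-wise into a matrix and read column-wise, so a burst of channel errors lands on widely separated bits after deinterleaving. The frame length and interleaver depth are illustrative, not those of the cited RELP coder:

```python
import numpy as np

def interleave(bits, depth):
    """Block interleaver: write row-wise into a depth x width matrix,
    read column-wise."""
    width = len(bits) // depth
    return np.asarray(bits).reshape(depth, width).T.reshape(-1)

def deinterleave(bits, depth):
    """Inverse of interleave()."""
    width = len(bits) // depth
    return np.asarray(bits).reshape(width, depth).T.reshape(-1)

frame = np.arange(24)                    # stand-in for 24 coded bits
tx = interleave(frame, depth=4)
assert (deinterleave(tx, depth=4) == frame).all()

# A 4-bit channel burst is spread into isolated single-bit errors,
# which the FEC code can then correct:
rx = tx.copy()
rx[8:12] = -1                            # burst hits positions 8..11
print(np.where(deinterleave(rx, depth=4) == -1)[0])  # spread positions
```

After deinterleaving, the burst appears as errors at positions 2, 8, 14 and 20, i.e. one error per codeword-sized neighborhood instead of four in a row.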

  3. "Perception of the speech code" revisited: Speech is alphabetic after all.

    Science.gov (United States)

    Fowler, Carol A; Shankweiler, Donald; Studdert-Kennedy, Michael

    2016-03-01

    We revisit an article, "Perception of the Speech Code" (PSC), published in this journal 50 years ago (Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967) and address one of its legacies concerning the status of phonetic segments, which persists in theories of speech today. In the perspective of PSC, segments both exist (in language as known) and do not exist (in articulation or the acoustic speech signal). Findings interpreted as showing that speech is not a sound alphabet, but, rather, that phonemes are encoded in the signal, coupled with findings that listeners perceive articulation, led to the motor theory of speech perception, a highly controversial legacy of PSC. However, a second legacy, the paradoxical perspective on segments, has been mostly unquestioned. We remove the paradox by offering an alternative supported by converging evidence that segments exist in language both as known and as used. We support the existence of segments in both language knowledge and in production by showing that phonetic segments are articulatory and dynamic and that coarticulation does not eliminate them. We show that segments leave an acoustic signature that listeners can track. This suggests that speech is well-adapted to public communication in facilitating, not creating a barrier to, exchange of language forms. (c) 2016 APA, all rights reserved.

  4. Relating dynamic brain states to dynamic machine states: Human and machine solutions to the speech recognition problem.

    Science.gov (United States)

    Wingfield, Cai; Su, Li; Liu, Xunying; Zhang, Chao; Woodland, Phil; Thwaites, Andrew; Fonteneau, Elisabeth; Marslen-Wilson, William D

    2017-09-01

    There is widespread interest in the relationship between the neurobiological systems supporting human cognition and emerging computational systems capable of emulating these capacities. Human speech comprehension, poorly understood as a neurobiological process, is an important case in point. Automatic Speech Recognition (ASR) systems with near-human levels of performance are now available, which provide a computationally explicit solution for the recognition of words in continuous speech. This research aims to bridge the gap between speech recognition processes in humans and machines, using novel multivariate techniques to compare incremental 'machine states', generated as the ASR analysis progresses over time, to the incremental 'brain states', measured using combined electro- and magneto-encephalography (EMEG), generated as the same inputs are heard by human listeners. This direct comparison of dynamic human and machine internal states, as they respond to the same incrementally delivered sensory input, revealed a significant correspondence between neural response patterns in human superior temporal cortex and the structural properties of ASR-derived phonetic models. Spatially coherent patches in human temporal cortex responded selectively to individual phonetic features defined on the basis of machine-extracted regularities in the speech to lexicon mapping process. These results demonstrate the feasibility of relating human and ASR solutions to the problem of speech recognition, and suggest the potential for further studies relating complex neural computations in human speech comprehension to the rapidly evolving ASR systems that address the same problem domain.

  5. Anthropomorphic Coding of Speech and Audio: A Model Inversion Approach

    Directory of Open Access Journals (Sweden)

    W. Bastiaan Kleijn

    2005-06-01

    Full Text Available Auditory modeling is a well-established methodology that provides insight into human perception and that facilitates the extraction of signal features that are most relevant to the listener. The aim of this paper is to provide a tutorial on perceptual speech and audio coding using an invertible auditory model. In this approach, the audio signal is converted into an auditory representation using an invertible auditory model. The auditory representation is quantized and coded. Upon decoding, it is then transformed back into the acoustic domain. This transformation converts a complex distortion criterion into a simple one, thus facilitating quantization with low complexity. We briefly review past work on auditory models and describe in more detail the components of our invertible model and its inversion procedure, that is, the method to reconstruct the signal from the output of the auditory model. We summarize attempts to use the auditory representation for low-bit-rate coding. Our approach also allows the exploitation of the inherent redundancy of the human auditory system for the purpose of multiple description (joint source-channel) coding.

  6. Neural Coding of Formant-Exaggerated Speech in the Infant Brain

    Science.gov (United States)

    Zhang, Yang; Koerner, Tess; Miller, Sharon; Grice-Patil, Zach; Svec, Adam; Akbari, David; Tusler, Liz; Carney, Edward

    2011-01-01

    Speech scientists have long proposed that formant exaggeration in infant-directed speech plays an important role in language acquisition. This event-related potential (ERP) study investigated neural coding of formant-exaggerated speech in 6-12-month-old infants. Two synthetic /i/ vowels were presented in alternating blocks to test the effects of…

  8. On Cascade Source Coding with A Side Information "Vending Machine"

    CERN Document Server

    Ahmadi, Behzad; Choudhuri, Chiranjib; Mitra, Urbashi

    2012-01-01

    The model of a side information "vending machine" accounts for scenarios in which acquiring side information is costly and thus should be done efficiently. In this paper, the three-node cascade source coding problem is studied under the assumption that a side information vending machine is available either at the intermediate or at the end node. In both cases, a single-letter characterization of the available trade-offs among the rate, the distortions in the reconstructions at the intermediate and at the end node, and the cost in acquiring the side information is derived under given conditions.

  9. A Support Vector Machine-Based Dynamic Network for Visual Speech Recognition Applications

    Directory of Open Access Journals (Sweden)

    Mihaela Gordan

    2002-11-01

    Full Text Available Visual speech recognition is an emerging research field. In this paper, we examine the suitability of support vector machines for visual speech recognition. Each word is modeled as a temporal sequence of visemes corresponding to the different phones realized. One support vector machine is trained to recognize each viseme and its output is converted to a posterior probability through a sigmoidal mapping. To model the temporal character of speech, the support vector machines are integrated as nodes into a Viterbi lattice. We test the performance of the proposed approach on a small visual speech recognition task, namely the recognition of the first four digits in English. The word recognition rate obtained is at the level of the previous best reported rates.
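
The per-viseme pipeline described above (SVM decision value → sigmoidal posterior → Viterbi lattice) can be sketched as follows. The SVM training itself is omitted; the decision scores, sigmoid parameters, and transition matrix below are made-up illustrations:

```python
import numpy as np

def sigmoid_posterior(score, a=-1.5, b=0.0):
    """Platt-style sigmoidal mapping from an SVM decision value to a
    posterior probability; a and b would normally be fit on held-out data."""
    return 1.0 / (1.0 + np.exp(a * score + b))

def viterbi(log_emit, log_trans, log_init):
    """Standard Viterbi decode over a lattice of per-frame class
    posteriors (all in the log domain)."""
    T, N = log_emit.shape
    delta = log_init + log_emit[0]
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        cand = delta[:, None] + log_trans        # (from_state, to_state)
        back[t] = cand.argmax(0)
        delta = cand.max(0) + log_emit[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

scores = np.array([[2.0, -1.0], [1.5, -0.5], [-2.0, 2.5]])  # fake SVM outputs
post = sigmoid_posterior(scores)
log_emit = np.log(post / post.sum(1, keepdims=True))
log_trans = np.log(np.array([[0.8, 0.2], [0.2, 0.8]]))      # sticky visemes
best = viterbi(log_emit, log_trans, np.log([0.5, 0.5]))
print(best)  # → [0, 0, 1]
```

The sticky transition matrix smooths frame-by-frame SVM decisions into a temporally coherent viseme sequence, which is the role the Viterbi lattice plays in the paper's word models.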

  10. Speech rhythms and multiplexed oscillatory sensory coding in the human brain.

    Directory of Open Access Journals (Sweden)

    Joachim Gross

    2013-12-01

    Full Text Available Cortical oscillations are likely candidates for segmentation and coding of continuous speech. Here, we monitored continuous speech processing with magnetoencephalography (MEG) to unravel the principles of speech segmentation and coding. We demonstrate that speech entrains the phase of low-frequency (delta, theta) and the amplitude of high-frequency (gamma) oscillations in the auditory cortex. Phase entrainment is stronger in the right and amplitude entrainment is stronger in the left auditory cortex. Furthermore, edges in the speech envelope phase-reset auditory cortex oscillations, thereby enhancing their entrainment to speech. This mechanism adapts to the changing physical features of the speech envelope and enables efficient, stimulus-specific speech sampling. Finally, we show that within the auditory cortex, coupling between delta, theta, and gamma oscillations increases following speech edges. Importantly, all couplings (i.e., brain-speech and also within the cortex) attenuate for backward-presented speech, suggesting top-down control. We conclude that segmentation and coding of speech rely on a nested hierarchy of entrained cortical oscillations.

  11. Machine learning based sample extraction for automatic speech recognition using dialectal Assamese speech.

    Science.gov (United States)

    Agarwalla, Swapna; Sarma, Kandarpa Kumar

    2016-06-01

    Automatic Speech Recognition (ASR) and related issues are continuously evolving as inseparable elements of Human Computer Interaction (HCI). With the assimilation of emerging concepts like big data and the Internet of Things (IoT) as extended elements of HCI, ASR techniques are passing through a paradigm shift. Of late, learning-based techniques have started to receive greater attention from research communities related to ASR, owing to the fact that the former possess a natural ability to mimic biological behavior and thereby aid ASR modeling and processing. The current learning-based ASR techniques are evolving further with the incorporation of big data and IoT-like concepts. Here, in this paper, we report certain approaches based on machine learning (ML) used for extraction of relevant samples from a big data space and apply them for ASR using certain soft computing techniques for Assamese speech with dialectal variations. A class of ML techniques comprising the basic Artificial Neural Network (ANN) in feedforward (FF) and Deep Neural Network (DNN) forms, using raw speech, extracted features and frequency-domain forms, is considered. The Multi Layer Perceptron (MLP) is configured with inputs in several forms to learn class information obtained using clustering and manual labeling. DNNs are also used to extract specific sentence types. Initially, from a large storage, relevant samples are selected and assimilated. Next, a few conventional methods are used for feature extraction of a few selected types. The features comprise both spectral and prosodic types. These are applied to Recurrent Neural Network (RNN) and Fully Focused Time Delay Neural Network (FFTDNN) structures to evaluate their performance in recognizing mood, dialect, speaker and gender variations in dialectal Assamese speech. The system is tested under several background noise conditions by considering the recognition rates (obtained using confusion matrices and manually) and computation time.

  12. Editing of EIA coded, numerically controlled, machine tool tapes

    Science.gov (United States)

    Weiner, J. M.

    1975-01-01

    Editing of numerically controlled (N/C) machine tool tapes (8-level paper tape) using an interactive graphic display processor is described. A rapid technique required for correcting production errors in N/C tapes was developed using the interactive text editor on the IMLAC PDS-ID graphic display system and two special programs resident on disk. The correction technique and special programs for processing N/C tapes coded to EIA specifications are discussed.

  13. Analytical Study of High Pitch Delay Resolution Technique for Tonal Speech Coding

    Directory of Open Access Journals (Sweden)

    Suphattharachai Chomphan

    2012-01-01

    Full Text Available Problem statement: In tonal-language speech, since tone plays an important role in both the naturalness and the intelligibility of the speech, it must be treated appropriately in a speech coder algorithm. Approach: This study proposes an analytical study of the technique of High Pitch Delay Resolution (HPDR) applied to the adaptive codebook of the core coder of a Multi-Pulse based Code Excited Linear Predictive (MP-CELP) coder. Results: The experimental results show that the speech quality of the MP-CELP speech coder with the HPDR technique is improved above the speech quality of the conventional coder. An optimum resolution of pitch delay is also presented. Conclusion: From the analytical study, it has been found that the proposed technique can improve the speech coding quality.
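
Why a fractional pitch delay helps can be shown with a minimal sketch: reading the adaptive codebook (past excitation) at a non-integer lag via interpolation predicts a voiced signal whose true period falls between integer sample lags. Linear interpolation and the period of 40.5 samples are assumptions for illustration; real MP-CELP coders use longer interpolation filters:

```python
import numpy as np

def adaptive_codebook_vector(excitation, delay, length):
    """Fetch `length` samples of past excitation at a (possibly
    fractional) pitch `delay` using linear interpolation."""
    t = np.arange(length) + len(excitation) - delay   # read positions
    i = np.floor(t).astype(int)
    frac = t - i
    nxt = np.minimum(i + 1, len(excitation) - 1)
    return (1 - frac) * excitation[i] + frac * excitation[nxt]

# Periodic "voiced" excitation with a true pitch period of 40.5 samples.
n = np.arange(400)
x = np.sin(2 * np.pi * n / 40.5)
hist, target = x[:360], x[360:]

err_int  = np.linalg.norm(target - adaptive_codebook_vector(hist, 40,   40))
err_frac = np.linalg.norm(target - adaptive_codebook_vector(hist, 40.5, 40))
print(err_frac < err_int)  # → True
```

The half-sample delay matches the true period, so the adaptive-codebook prediction error drops sharply; for tonal languages, whose pitch contours carry lexical information, this finer resolution directly improves coded speech quality.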

  14. Speech/Music Classification Enhancement for 3GPP2 SMV Codec Based on Support Vector Machine

    Science.gov (United States)

    Kim, Sang-Kyun; Chang, Joon-Hyuk

    In this letter, we propose a novel approach to speech/music classification based on the support vector machine (SVM) to improve the performance of the 3GPP2 selectable mode vocoder (SMV) codec. We first analyze the features and the classification method used in the real-time speech/music classification algorithm of the SMV, and then apply the SVM for enhanced speech/music classification. For evaluation of performance, we compare the proposed algorithm with the traditional algorithm of the SMV. The proposed system is evaluated under various environments and shows better performance than the original method in the SMV.

  15. Compiler design handbook optimizations and machine code generation

    CERN Document Server

    Srikant, YN

    2003-01-01

    The widespread use of object-oriented languages and Internet security concerns are just the beginning. Add embedded systems, multiple memory banks, highly pipelined units operating in parallel, and a host of other advances and it becomes clear that current and future computer architectures pose immense challenges to compiler designers-challenges that already exceed the capabilities of traditional compilation techniques. The Compiler Design Handbook: Optimizations and Machine Code Generation is designed to help you meet those challenges. Written by top researchers and designers from around the

  16. Insights and Implications of Campus Hate Speech Codes.

    Science.gov (United States)

    Downey, John P.; Jennings, Peggy

    Student affairs personnel must both ensure the safety and basic civil rights of students and also find ways to expose students to the consequences of their actions and speech. These obligations involve tensions between students' rights to free speech and their rights to equal protection under the Constitution, thus to education free from…

  17. BLIND SPEECH SEPARATION FOR ROBOTS WITH INTELLIGENT HUMAN-MACHINE INTERACTION

    Institute of Scientific and Technical Information of China (English)

    Huang Yulei; Ding Zhizhong; Dai Lirong; Chen Xiaoping

    2012-01-01

    Speech recognition rates deteriorate greatly in human-machine interaction when the speaker's speech mixes with a bystander's voice. This paper proposes a time-frequency approach to Blind Source Separation (BSS) for intelligent Human-Machine Interaction (HMI). The main idea of the algorithm is to simultaneously diagonalize the correlation matrices of the pre-whitened signals at different time delays for every frequency bin in the time-frequency domain. The proposed method has two merits: (1) fast convergence speed; (2) high signal-to-interference ratio of the separated signals. Numerical evaluations are used to compare the performance of the proposed algorithm with two other deconvolution algorithms. An efficient algorithm to resolve permutation ambiguity is also proposed in this paper. The proposed algorithm saves more than 10% of computational time with properly selected parameters and achieves good performance for both simulated convolutive mixtures and real room-recorded speech.
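
The core idea of whitening followed by diagonalizing time-lagged correlation matrices can be illustrated with a simplified, single-lag, time-domain variant (AMUSE-style); the paper's method jointly diagonalizes many lags per frequency bin, which this sketch does not attempt:

```python
import numpy as np

def amuse(X, lag=1):
    """AMUSE-style separation: center and whiten, then diagonalize one
    symmetrized time-lagged correlation matrix of the whitened signals."""
    X = X - X.mean(1, keepdims=True)
    d, E = np.linalg.eigh(X @ X.T / X.shape[1])
    W = np.diag(d ** -0.5) @ E.T            # whitening matrix
    Z = W @ X
    R = Z[:, :-lag] @ Z[:, lag:].T / (Z.shape[1] - lag)
    R = (R + R.T) / 2                       # symmetrize
    _, U = np.linalg.eigh(R)                # diagonalizing rotation
    return U.T @ W                          # full unmixing matrix

# Two synthetic sources with distinct temporal structure, instantaneously mixed.
t = np.arange(8000)
S = np.vstack([np.sin(2 * np.pi * 0.01 * t),
               np.sign(np.sin(2 * np.pi * 0.037 * t))])
A = np.array([[1.0, 0.7], [0.5, 1.0]])
Y = amuse(A @ S) @ (A @ S)                  # recovered (scaled/permuted)
C = np.abs(np.corrcoef(Y, S)[:2, 2:])
print(np.round(C, 2))
```

Because the two sources have different lag-1 autocorrelations, a single lagged correlation matrix suffices here; convolutive room mixtures are what force the per-frequency, multi-lag joint diagonalization (and the permutation-resolution step) described in the abstract.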

  18. High Pitch Delay Resolution Technique for Tonal Language Speech Coding Based on Multi-Pulse Based Code Excited Linear Prediction Algorithm

    Directory of Open Access Journals (Sweden)

    Suphattharachai Chomphan

    2011-01-01

    Full Text Available Problem statement: In spontaneous speech communication, speech coding is an important process that should be taken into account, since the quality of coded speech depends on the efficiency of the speech coding algorithm. As for tonal languages, in which tone plays an important role in both the naturalness and the intelligibility of the speech, tone must be treated appropriately. Approach: This study proposes a modification of a flexible Multi-Pulse based Code Excited Linear Predictive (MP-CELP) coder with multiple bitrates and bitrate scalabilities for tonal-language speech in multimedia applications. The coder consists of a core coder and bitrate scalable tools. High Pitch Delay Resolutions (HPDR) are applied to the adaptive codebook of the core coder for tonal-language speech quality improvement. The bitrate scalable tool employs multi-stage excitation coding based on an embedded-coding approach. The multi-pulse excitation codebook at each stage is adaptively produced depending on the selected excitation signal at the previous stage. Results: The experimental results show that the speech quality of the proposed coder is improved above the speech quality of the conventional coder without pitch-resolution adaptation. Conclusion: From the study, there is strong evidence to further apply the proposed technique in speech coding systems or other speech processing technologies.

  19. An articulatorily constrained, maximum entropy approach to speech recognition and speech coding

    Energy Technology Data Exchange (ETDEWEB)

    Hogden, J.

    1996-12-31

    Hidden Markov models (HMMs) are among the most popular tools for performing computer speech recognition. One of the primary reasons that HMMs typically outperform other speech recognition techniques is that the parameters used for recognition are determined by the data, not by preconceived notions of what the parameters should be. This makes HMMs better able to deal with intra- and inter-speaker variability despite the limited knowledge of how speech signals vary and despite the often limited ability to correctly formulate rules describing variability and invariance in speech. In fact, it is often the case that when HMM parameter values are constrained using the limited knowledge of speech, recognition performance decreases. However, the structure of an HMM has little in common with the mechanisms underlying speech production. Here, the author argues that by using probabilistic models that more accurately embody the process of speech production, he can create models that have all the advantages of HMMs, but that should more accurately capture the statistical properties of real speech samples, presumably leading to more accurate speech recognition. The model he will discuss uses the fact that speech articulators move smoothly and continuously. Before discussing how to use articulatory constraints, he will give a brief description of HMMs. This will allow him to highlight the similarities and differences between HMMs and the proposed technique.

  20. Distinguishing protein-coding from non-coding RNAs through support vector machines.

    Directory of Open Access Journals (Sweden)

    Jinfeng Liu

    2006-04-01

    Full Text Available RIKEN's FANTOM project has revealed many previously unknown coding sequences, as well as an unexpected degree of variation in transcripts resulting from alternative promoter usage and splicing. Ever more transcripts that do not code for proteins have been identified by transcriptome studies, in general. Increasing evidence points to the important cellular roles of such non-coding RNAs (ncRNAs). The distinction of protein-coding RNA transcripts from ncRNA transcripts is therefore an important problem in understanding the transcriptome and carrying out its annotation. Very few in silico methods have specifically addressed this problem. Here, we introduce CONC (for "coding or non-coding"), a novel method based on support vector machines that classifies transcripts according to features they would have if they were coding for proteins. These features include peptide length, amino acid composition, predicted secondary structure content, predicted percentage of exposed residues, compositional entropy, number of homologs from database searches, and alignment entropy. Nucleotide frequencies are also incorporated into the method. Confirmed coding cDNAs for eukaryotic proteins from the Swiss-Prot database constituted the set of true positives, ncRNAs from RNAdb and NONCODE the true negatives. Ten-fold cross-validation suggested that CONC distinguished coding RNAs from ncRNAs at about 97% specificity and 98% sensitivity. Applied to 102,801 mouse cDNAs from the FANTOM3 dataset, our method reliably identified over 14,000 ncRNAs and estimated the total number of ncRNAs to be about 28,000.

  1. CONVERGING TOWARDS A COMMON SPEECH CODE: IMITATIVE AND PERCEPTUO-MOTOR RECALIBRATION PROCESSES IN SPEECH PRODUCTION

    Directory of Open Access Journals (Sweden)

    Marc Sato

    2013-07-01

    Full Text Available Auditory and somatosensory systems play a key role in speech motor control. In the act of speaking, segmental speech movements are programmed to reach phonemic sensory goals, which in turn are used to estimate actual sensory feedback in order to further control production. The adult's tendency to automatically imitate a number of acoustic-phonetic characteristics in another speaker's speech, however, suggests that speech production relies not only on the intended phonemic sensory goals and actual sensory feedback but also on the processing of external speech inputs. These online adaptive changes in speech production, or phonetic convergence effects, are thought to facilitate conversational exchange by contributing to setting a common perceptuo-motor ground between the speaker and the listener. In line with previous studies on phonetic convergence, we here demonstrate, in a non-interactive situation of communication, online unintentional and voluntary imitative changes in relevant acoustic features of acoustic vowel targets (fundamental and first formant frequencies) during speech production and imitation. In addition, perceptuo-motor recalibration processes, or after-effects, occurred not only after vowel production and imitation but also after auditory categorization of the acoustic vowel targets. Altogether, these findings demonstrate adaptive plasticity of phonemic sensory-motor goals and suggest that, apart from sensory-motor knowledge, speech production continuously draws on perceptual learning from the external speech environment.

  2. ISOLATED SPEECH RECOGNITION SYSTEM FOR TAMIL LANGUAGE USING STATISTICAL PATTERN MATCHING AND MACHINE LEARNING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    VIMALA C.

    2015-05-01

    Full Text Available In recent years, speech technology has become a vital part of our daily lives. Various techniques have been proposed for developing Automatic Speech Recognition (ASR) systems and have achieved great success in many applications. Among them, Template Matching techniques like Dynamic Time Warping (DTW), Statistical Pattern Matching techniques such as Hidden Markov Models (HMM) and Gaussian Mixture Models (GMM), and Machine Learning techniques such as Neural Networks (NN), Support Vector Machines (SVM), and Decision Trees (DT) are most popular. The main objective of this paper is to design and develop a speaker-independent isolated speech recognition system for the Tamil language using the above speech recognition techniques. The background of ASR systems, the steps involved in ASR, the merits and demerits of the conventional and machine learning algorithms, and the observations made based on the experiments are presented in this paper. For the developed system, the highest word recognition accuracy is achieved with the HMM technique. It offered 100% accuracy during the training process and 97.92% during the testing process.
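
The DTW template-matching technique named above can be written in a few lines. The one-dimensional "feature" sequences below are synthetic stand-ins for real acoustic feature frames:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping distance between two feature
    sequences (frames x dims), filled via the standard DP recurrence."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# A time-stretched copy of a template matches it better than a different word,
# even though the sequences have different lengths.
template  = np.sin(np.linspace(0, 3, 30))[:, None]
stretched = np.sin(np.linspace(0, 3, 45))[:, None]   # same "word", slower
other     = np.cos(np.linspace(0, 3, 45))[:, None]   # different "word"
print(dtw_distance(template, stretched) < dtw_distance(template, other))  # → True
```

This tolerance to nonlinear timing differences is what makes DTW a natural baseline for isolated-word recognition before statistical models such as HMMs are introduced.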

  3. Spotlight on Speech Codes 2009: The State of Free Speech on Our Nation's Campuses

    Science.gov (United States)

    Foundation for Individual Rights in Education (NJ1), 2009

    2009-01-01

    Each year, the Foundation for Individual Rights in Education (FIRE) conducts a wide, detailed survey of restrictions on speech at America's colleges and universities. The survey and resulting report explore the extent to which schools are meeting their obligations to uphold students' and faculty members' rights to freedom of speech, freedom of…

  4. Spotlight on Speech Codes 2010: The State of Free Speech on Our Nation's Campuses

    Science.gov (United States)

    Foundation for Individual Rights in Education (NJ1), 2010

    2010-01-01

    Each year, the Foundation for Individual Rights in Education (FIRE) conducts a rigorous survey of restrictions on speech at America's colleges and universities. The survey and resulting report explore the extent to which schools are meeting their legal and moral obligations to uphold students' and faculty members' rights to freedom of speech,…

  5. Spotlight on Speech Codes 2011: The State of Free Speech on Our Nation's Campuses

    Science.gov (United States)

    Foundation for Individual Rights in Education (NJ1), 2011

    2011-01-01

    Each year, the Foundation for Individual Rights in Education (FIRE) conducts a rigorous survey of restrictions on speech at America's colleges and universities. The survey and accompanying report explore the extent to which schools are meeting their legal and moral obligations to uphold students' and faculty members' rights to freedom of speech,…

  6. A wireless brain-machine interface for real-time speech synthesis.

    Directory of Open Access Journals (Sweden)

    Frank H Guenther

    Full Text Available BACKGROUND: Brain-machine interfaces (BMIs) involving electrodes implanted into the human cerebral cortex have recently been developed in an attempt to restore function to profoundly paralyzed individuals. Current BMIs for restoring communication can provide important capabilities via a typing process, but unfortunately they are only capable of slow communication rates. In the current study we use a novel approach to speech restoration in which we decode continuous auditory parameters for a real-time speech synthesizer from neuronal activity in motor cortex during attempted speech. METHODOLOGY/PRINCIPAL FINDINGS: Neural signals recorded by a Neurotrophic Electrode implanted in a speech-related region of the left precentral gyrus of a human volunteer suffering from locked-in syndrome, characterized by near-total paralysis with spared cognition, were transmitted wirelessly across the scalp and used to drive a speech synthesizer. A Kalman filter-based decoder translated the neural signals generated during attempted speech into continuous parameters for controlling a synthesizer that provided immediate (within 50 ms) auditory feedback of the decoded sound. Accuracy of the volunteer's vowel productions with the synthesizer improved quickly with practice, with a 25% improvement in average hit rate (from 45% to 70%) and a 46% decrease in average endpoint error from the first to the last block of a three-vowel task. CONCLUSIONS/SIGNIFICANCE: Our results support the feasibility of neural prostheses that may have the potential to provide near-conversational synthetic speech output for individuals with severely impaired speech motor control. They also provide an initial glimpse into the functional properties of neurons in speech motor cortical areas.
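
    The Kalman-filter decoding step can be illustrated with the scalar textbook case: a hidden parameter (say, a formant-frequency track) is estimated from noisy observations by alternating a predict and an update step. This is a generic one-dimensional filter with arbitrary noise parameters, not the study's actual multivariate decoder.

```python
def kalman_1d(observations, a=1.0, q=0.01, h=1.0, r=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x_t = a*x_{t-1} + w,  z_t = h*x_t + v,
    with process noise variance q and observation noise variance r.
    Returns the filtered state estimates."""
    x, p = x0, p0
    estimates = []
    for z in observations:
        # Predict: propagate the state and its uncertainty
        x = a * x
        p = a * p * a + q
        # Update: blend the prediction with the new observation
        k = p * h / (h * p * h + r)      # Kalman gain
        x = x + k * (z - h * x)
        p = (1 - k * h) * p
        estimates.append(x)
    return estimates
```

    Fed a steady stream of observations near one value, the estimate converges to it while smoothing out the observation noise, which is what makes such a decoder usable for driving a synthesizer in real time.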

  7. Speech Coding Development for Audio fixing Using Spectrum Analysis

    Directory of Open Access Journals (Sweden)

    Mr. S. Nageswara Rao, Dr. C.D. Naidu, Dr. K. Jaya Sankar

    2012-12-01

    Full Text Available A new method for the enhancement of speech signals contaminated by speech-correlated noise, such as that in the output of a speech coder, is presented. The module is based on numerical speech processing algorithms that model the affected ear and generate the stimulus signals for the cilia cells (brain). The method is based on constrained optimization of a criterion. The interface uses a gammachirp filter bank consisting of 16 band-pass IIR filters. The implemented method operates on a block-by-block basis and uses two constraints. The first constraint ensures that the signal power is preserved. A modification constraint ensures that the power of the difference between the enhanced and unenhanced signals is less than a fraction of the power of the unenhanced signal. The method then increases the periodicity of the speech signal; sounds that are not nearly periodic are perceptually unaffected by the optimization because of the modification constraint. The results demonstrated a degree of discrimination and interference between different sounds, especially in a multi-speaker environment.

  8. PhpHMM Tool for Generating Speech Recogniser Source Codes Using Web Technologies

    Directory of Open Access Journals (Sweden)

    R. Krejčí

    2011-01-01

    Full Text Available This paper deals with the “phpHMM” software tool, which facilitates the development and optimisation of speech recognition algorithms. This tool is being developed in the Speech Processing Group at the Department of Circuit Theory, CTU in Prague, and it is used to generate the source code of a speech recogniser by means of the PHP scripting language and the MySQL database. The input of the system is a model of speech in a standard HTK format and a list of words to be recognised. The output consists of the source codes and data structures in C programming language, which are then compiled into an executable program. This tool is operated via a web interface.

  9. Influence of GSM speech coding algorithms on the performance of text-independent speaker indentification

    OpenAIRE

    Grassi, Sara; Besacier, Laurent; DUFAUX, Alain; Ansorge, Michael; Pellandini, Fausto

    2006-01-01

    This paper investigates the influence, on the performance of a text-independent speaker identification system, of the three speech coding algorithms standardized for use in the GSM wireless communication network. The speaker identification system is based on Gaussian Mixture Model (GMM) classifiers. Only the influence of the speech coding algorithms was taken into account. This was done by passing the whole TIMIT database through each coding/decoding algorithm, obtaining three transcoded databases. ...
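
    A GMM-based speaker identification system of this kind scores an utterance by its average log-likelihood under each speaker's model and picks the best-scoring speaker. The sketch below uses diagonal covariances and invented single-component toy models; the paper's TIMIT setup is of course far larger.

```python
import numpy as np

def gmm_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.
    frames: (T, D); weights: (K,); means, variances: (K, D)."""
    diff = frames[:, None, :] - means[None, :, :]                  # (T, K, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1))
    # Log of the weighted mixture, computed stably with logaddexp
    log_mix = np.logaddexp.reduce(np.log(weights) + log_comp, axis=1)
    return log_mix.mean()

def identify(frames, speaker_models):
    """Return the speaker whose GMM scores the utterance highest."""
    return max(speaker_models,
               key=lambda s: gmm_loglik(frames, *speaker_models[s]))

# Toy single-component models for two speakers (1-D features).
speakers = {"A": (np.array([1.0]), np.array([[0.0]]), np.array([[1.0]])),
            "B": (np.array([1.0]), np.array([[5.0]]), np.array([[1.0]]))}
```

    Transcoding the test speech, as the paper does, shifts the feature distributions and so degrades these likelihood scores, which is precisely the effect being measured.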

  10. Model-Based Speech Signal Coding Using Optimized Temporal Decomposition for Storage and Broadcasting Applications

    Directory of Open Access Journals (Sweden)

    Chandranath R. N. Athaudage

    2003-09-01

    Full Text Available A dynamic programming-based optimization strategy for a temporal decomposition (TD) model of speech and its application to low-rate speech coding in storage and broadcasting is presented. In previous work with the spectral stability-based event localizing (SBEL) TD algorithm, the event localization was performed based on a spectral stability criterion. Although this approach gave reasonably good results, there was no assurance of the optimality of the event locations. In the present work, we have optimized the event localizing task using a dynamic programming-based optimization strategy. Simulation results show that an improved TD model accuracy can be achieved. A methodology for incorporating the optimized TD algorithm within the standard MELP speech coder for the efficient compression of speech spectral information is also presented. The performance evaluation results revealed that the proposed speech coding scheme achieves 50%–60% compression of speech spectral information with negligible degradation in the decoded speech quality.

  11. A continuous speech recognition approach for the design of a dictation machine

    OpenAIRE

    Smaïli, Kamel; Charpillet, François; Pierrel, Jean-Mari; Haton, Jean-Paul

    1991-01-01

    International audience; The oral entry of texts (dictation machine) remains an important potential field of application for automatic speech recognition. The RFIA group of CRIN/INRIA has been investigating this research area for the French language for the past ten years. We propose in this paper a general presentation of the present state of our MAUD system, which is based upon four major interacting components: an acoustic-phonetic decoder, a lexical component, a linguistic model and a us...

  12. Simulation of Conjugate Structure Algebraic Code Excited Linear Prediction Speech Coder

    Directory of Open Access Journals (Sweden)

    Ritisha Virulkar

    2014-03-01

    Full Text Available The CS-ACELP coder is a speech coder based on the linear prediction coding technique. It reduces the bit rate to as low as 8 kbps while at the same time reducing the computational complexity of the speech search described in ITU Recommendation G.729. This codec is used for compression of the speech signal. The idea behind the algorithm is to predict the coming samples by means of linear prediction; for this it uses a fixed codebook and an adaptive codebook. The quality of speech delivered by this coder is equivalent to 32 kbps ADPCM. The processes responsible for achieving the reduction in bit rate are: sending fewer bits when no voice is detected, and carrying out a conditional search in the fixed codebook.
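
    The "prediction of coming samples" that CELP-family coders rely on is ordinary linear prediction: coefficients are fitted so each sample is approximated from its predecessors, and only the small residual plus codebook indices need be transmitted. Below is a minimal autocorrelation-method (Levinson-Durbin) sketch of that first step; it is a generic illustration, not any part of a G.729 implementation.

```python
import numpy as np

def lpc(x, order):
    """LPC coefficients via the autocorrelation method (Levinson-Durbin).
    Returns a such that x[n] is approximated by sum_k a[k] * x[n-1-k]."""
    r = np.array([x[: len(x) - k] @ x[k:] for k in range(order + 1)])
    a = np.zeros(order)
    err = r[0]                       # prediction error energy
    for i in range(order):
        # Reflection coefficient for the next-order predictor
        k = (r[i + 1] - a[:i] @ r[1:i + 1][::-1]) / err
        a[:i] = a[:i] - k * a[:i][::-1]
        a[i] = k
        err *= (1 - k * k)
    return a
```

    For a signal that decays geometrically by a factor of 0.9 per sample, a first-order predictor recovers that factor almost exactly, so the residual (and hence the bits needed to code it) is tiny.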

  13. Do North Carolina Students Have Freedom of Speech? A Review of Campus Speech Codes

    Science.gov (United States)

    Robinson, Jenna Ashley

    2010-01-01

    America's colleges and universities are supposed to be strongholds of classically liberal ideals, including the protection of individual rights and openness to debate and inquiry. Too often, this is not the case. Across the country, universities deny students and faculty their fundamental rights to freedom of speech and expression. The report…

  14. Spotlight on Speech Codes 2006: The State of Free Speech on Our Nation's Campuses

    Science.gov (United States)

    Foundation for Individual Rights in Education (NJ1), 2006

    2006-01-01

    This year, the Foundation for Individual Rights in Education (FIRE) conducted an expansive study of just how pervasive and how onerous restrictions on speech are at America's colleges and universities. Between September 2005 and September 2006, FIRE surveyed over 330 schools and found that an overwhelming majority of them explicitly prohibit…

  15. Look at the Gato! Code-Switching in Speech to Toddlers

    Science.gov (United States)

    Bail, Amelie; Morini, Giovanna; Newman, Rochelle S.

    2015-01-01

    We examined code-switching (CS) in the speech of twenty-four bilingual caregivers when speaking with their 18- to 24-month-old children. All parents code-switched at least once in a short play session, and some did so quite often (in over 1/3 of utterances). This CS included both inter-sentential and intra-sentential switches, suggesting that at least…

  16. Hate-Speech Code at U. of Wisconsin Voided by Court.

    Science.gov (United States)

    Collison, Michele N-K

    1991-01-01

    A federal district judge has struck down a two-year-old University of Wisconsin code barring slurs or epithets based on an individual's race, sex, religion, sexual orientation, disability, or ethnic origin, ruling that it violates students' First-Amendment rights to freedom of speech. The decision affects a number of other colleges. (MSE)

  17. Detecting Abnormal Word Utterances in Children With Autism Spectrum Disorders: Machine-Learning-Based Voice Analysis Versus Speech Therapists.

    Science.gov (United States)

    Nakai, Yasushi; Takiguchi, Tetsuya; Matsui, Gakuyo; Yamaoka, Noriko; Takada, Satoshi

    2017-10-01

    Abnormal prosody is often evident in the voice intonations of individuals with autism spectrum disorders. We compared a machine-learning-based voice analysis with human hearing judgments made by 10 speech therapists for classifying children with autism spectrum disorders (n = 30) and typical development (n = 51). Using stimuli limited to single-word utterances, machine-learning-based voice analysis was superior to speech therapist judgments. There was a significantly higher true-positive than false-negative rate for machine-learning-based voice analysis but not for speech therapists. Results are discussed in terms of some artificiality of clinician judgments based on single-word utterances, and the objectivity machine-learning-based voice analysis adds to judging abnormal prosody.

  18. Improving Language Models in Speech-Based Human-Machine Interaction

    Directory of Open Access Journals (Sweden)

    Raquel Justo

    2013-02-01

    Full Text Available This work focuses on speech-based human-machine interaction. Specifically, a Spoken Dialogue System (SDS) that could be integrated into a robot is considered. Since Automatic Speech Recognition is one of the most sensitive tasks that must be confronted in such systems, the goal of this work is to improve the results obtained by this specific module. In order to do so, a hierarchical Language Model (LM) is considered. Different series of experiments were carried out using the proposed models over different corpora and tasks. The results obtained show that these models provide greater accuracy in the recognition task. Additionally, the influence of the Acoustic Modelling (AM) on the improvement percentage of the Language Models has also been explored. Finally, the use of hierarchical Language Models in a language understanding task has been successfully employed, as shown in an additional series of experiments.
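
    To see what any language model contributes to recognition, consider the simplest baseline: an add-one-smoothed bigram model that scores candidate word sequences, letting the recogniser prefer plausible word orders. This generic sketch (with an invented toy corpus) is a flat baseline, not the hierarchical LM the paper proposes.

```python
import math
from collections import Counter

class BigramLM:
    """Bigram language model with add-one (Laplace) smoothing."""
    def __init__(self, sentences):
        self.bi, self.uni, self.vocab = Counter(), Counter(), set()
        for s in sentences:
            toks = ["<s>"] + s + ["</s>"]
            self.vocab.update(toks)
            for a, b in zip(toks, toks[1:]):
                self.bi[(a, b)] += 1
                self.uni[a] += 1

    def logprob(self, sentence):
        """Smoothed log-probability of a full sentence."""
        toks = ["<s>"] + sentence + ["</s>"]
        v = len(self.vocab)
        return sum(math.log((self.bi[(a, b)] + 1) / (self.uni[a] + v))
                   for a, b in zip(toks, toks[1:]))
```

    Given two acoustically confusable hypotheses, the recogniser can rescore them with such a model and keep the one with the higher log-probability.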

  19. Code-expanded radio access protocol for machine-to-machine communications

    DEFF Research Database (Denmark)

    Thomsen, Henning; Kiilerich Pratas, Nuno; Stefanovic, Cedomir

    2013-01-01

    The random access methods used for support of machine-to-machine, also referred to as Machine-Type Communications, in current cellular standards are derivatives of traditional framed slotted ALOHA and therefore do not support high user loads efficiently. We propose an approach that is motivated...... subframes and orthogonal preambles, the amount of available contention resources is drastically increased, enabling the massive support of Machine-Type Communication users that is beyond the reach of current systems....

  20. Code-expanded radio access protocol for machine-to-machine communications

    DEFF Research Database (Denmark)

    Thomsen, Henning; Kiilerich Pratas, Nuno; Stefanovic, Cedomir

    2013-01-01

    The random access methods used for support of machine-to-machine, also referred to as Machine-Type Communications, in current cellular standards are derivatives of traditional framed slotted ALOHA and therefore do not support high user loads efficiently. We propose an approach that is motivated b...

  1. Encoding emotions in speech with the size code. A perceptual investigation.

    Science.gov (United States)

    Chuenwattanapranithi, Suthathip; Xu, Yi; Thipakorn, Bundit; Maneewongvatana, Songrit

    2008-01-01

    Our current understanding of how emotions are expressed in speech is still very limited. Part of the difficulty has been the lack of understanding of the underlying mechanisms. Here we report the findings of a somewhat unconventional investigation of emotional speech. Instead of looking for direct acoustic correlates of multiple emotions, we tested a specific theory, the size code hypothesis of emotional speech, about two emotions--anger and happiness. According to the hypothesis, anger and happiness are conveyed in speech by exaggerating or understating the body size of the speaker. In two studies consisting of six experiments, we synthesized vowels with a three-dimensional articulatory synthesizer with parameter manipulations derived from the size code hypothesis, and asked Thai listeners to judge the body size and emotion of the speaker. Vowels synthesized with a longer vocal tract and lower F(0) were mostly heard as from a larger person if the length and F(0) differences were stationary, but from an angry person if the vocal tract was dynamically lengthened and F(0) was dynamically lowered. The opposite was true for the perception of small body size and happiness. These results provide preliminary support for the size code hypothesis. They also point to potential benefits of theory-driven investigations in emotion research. 2008 S. Karger AG, Basel.

  2. Integrating Automatic Speech Recognition and Machine Translation for Better Translation Outputs

    DEFF Research Database (Denmark)

    Liyanapathirana, Jeevanthi

    than typing, making the translation process faster. The spoken translation is analyzed and combined with the machine translation output of the same sentence using different methods. We study a number of different translation models in the context of n-best list rescoring methods. As an alternative...... to the n-best list rescoring, we also use word graphs with the expectation of arriving at a tighter integration of ASR and MT models. Integration methods include constraining ASR models using language and translation models of MT, and vice versa. We currently develop and experiment different methods...... on the Danish – English language pair, with the use of a speech corpora and parallel text. The methods are investigated to check ways that the accuracy of the spoken translation of the translator can be increased with the use of machine translation outputs, which would be useful for potential computer...

  3. Vector Sum Excited Linear Prediction (VSELP) speech coding at 4.8 kbps

    Science.gov (United States)

    Gerson, Ira A.; Jasiuk, Mark A.

    1990-01-01

    Code Excited Linear Prediction (CELP) speech coders exhibit good performance at data rates as low as 4800 bps. The major drawback to CELP type coders is their larger computational requirements. The Vector Sum Excited Linear Prediction (VSELP) speech coder utilizes a codebook with a structure which allows for a very efficient search procedure. Other advantages of the VSELP codebook structure is discussed and a detailed description of a 4.8 kbps VSELP coder is given. This coder is an improved version of the VSELP algorithm, which finished first in the NSA's evaluation of the 4.8 kbps speech coders. The coder uses a subsample resolution single tap long term predictor, a single VSELP excitation codebook, a novel gain quantizer which is robust to channel errors, and a new adaptive pre/postfilter arrangement.

  4. A Machine Learning Perspective on Predictive Coding with PAQ

    CERN Document Server

    Knoll, Byron

    2011-01-01

    PAQ8 is an open source lossless data compression algorithm that currently achieves the best compression rates on many benchmarks. This report presents a detailed description of PAQ8 from a statistical machine learning perspective. It shows that it is possible to understand some of the modules of PAQ8 and use this understanding to improve the method. However, intuitive statistical explanations of the behavior of other modules remain elusive. We hope the description in this report will be a starting point for discussions that will increase our understanding, lead to improvements to PAQ8, and facilitate a transfer of knowledge from PAQ8 to other machine learning methods, such as recurrent neural networks and stochastic memoizers. Finally, the report presents a broad range of new applications of PAQ to machine learning tasks including language modeling and adaptive text prediction, adaptive game playing, classification, and compression using features from the field of deep learning.

  5. Machine-Learning Algorithms to Code Public Health Spending Accounts.

    Science.gov (United States)

    Brady, Eoghan S; Leider, Jonathon P; Resnick, Beth A; Alfonso, Y Natalia; Bishai, David

    Government public health expenditure data sets require time- and labor-intensive manipulation to summarize results that public health policy makers can use. Our objective was to compare the performances of machine-learning algorithms with manual classification of public health expenditures to determine if machines could provide a faster, cheaper alternative to manual classification. We used machine-learning algorithms to replicate the process of manually classifying state public health expenditures, using the standardized public health spending categories from the Foundational Public Health Services model and a large data set from the US Census Bureau. We obtained a data set of 1.9 million individual expenditure items from 2000 to 2013. We collapsed these data into 147 280 summary expenditure records, and we followed a standardized method of manually classifying each expenditure record as public health, maybe public health, or not public health. We then trained 9 machine-learning algorithms to replicate the manual process. We calculated recall, precision, and coverage rates to measure the performance of individual and ensembled algorithms. Compared with manual classification, the machine-learning random forests algorithm produced 84% recall and 91% precision. With algorithm ensembling, we achieved our target criterion of 90% recall by using a consensus ensemble of ≥6 algorithms while still retaining 93% coverage, leaving only 7% of the summary expenditure records unclassified. Machine learning can be a time- and cost-saving tool for estimating public health spending in the United States. It can be used with standardized public health spending categories based on the Foundational Public Health Services model to help parse public health expenditure information from other types of health-related spending, provide data that are more comparable across public health organizations, and evaluate the impact of evidence-based public health resource allocation.
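
    The consensus-ensemble rule described above (emit a label only when at least 6 of the 9 algorithms agree, leaving the remainder unclassified) can be sketched as follows. The labels and the threshold come from the abstract; the function names and the toy example are ours.

```python
def consensus_label(votes, threshold=6):
    """votes: one binary vote per algorithm (1 = public health).
    Emit a label only when >= threshold algorithms agree; otherwise
    return None, which lowers coverage but protects precision."""
    yes = sum(votes)
    if yes >= threshold:
        return "public health"
    if len(votes) - yes >= threshold:
        return "not public health"
    return None

def recall_precision_coverage(pred, truth):
    """Recall and precision over classified records only (pred entries
    may be None); coverage is the fraction of records classified."""
    tp = sum(p == "public health" and t == 1 for p, t in zip(pred, truth))
    fp = sum(p == "public health" and t == 0 for p, t in zip(pred, truth))
    fn = sum(p == "not public health" and t == 1 for p, t in zip(pred, truth))
    covered = sum(p is not None for p in pred)
    return tp / (tp + fn), tp / (tp + fp), covered / len(pred)
```

    Raising the threshold trades coverage for precision, which is exactly the trade-off the study reports (90% recall at 93% coverage).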

  6. Towards a universal code formatter through machine learning

    NARCIS (Netherlands)

    Parr, T. (Terence); J.J. Vinju (Jurgen)

    2016-01-01

    textabstractThere are many declarative frameworks that allow us to implement code formatters relatively easily for any specific language, but constructing them is cumbersome. The first problem is that "everybody" wants to format their code differently, leading to either many formatter variants or a

  7. Coding Methods for the NMF Approach to Speech Recognition and Vocabulary Acquisition

    Directory of Open Access Journals (Sweden)

    Meng Sun

    2012-12-01

    Full Text Available This paper aims at improving the accuracy of the non-negative matrix factorization approach to word learning and recognition of spoken utterances. We propose and compare three coding methods to alleviate quantization errors involved in the vector quantization (VQ) of speech spectra: multi-codebooks, soft VQ and adaptive VQ. We evaluate on the task of spotting a vocabulary of 50 keywords in continuous speech. The error rates of multi-codebooks decreased with increasing number of codebooks, but the accuracy leveled off around 5 to 10 codebooks. Soft VQ and adaptive VQ made a better trade-off between the required memory and the accuracy. The best of the proposed methods reduce the error rate to 1.2% from the 1.9% obtained with a single codebook. The coding methods and the model framework may also prove useful for applications such as topic discovery/detection and mining of sequential patterns.
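
    Of the three coding methods compared, soft VQ is the easiest to sketch: instead of activating only the nearest codeword, a frame's activation is spread over its few nearest codewords, which softens quantization error. The weighting below (a softmax over negative squared distances, with parameters `beta` and `top_n` of our choosing) is one plausible scheme; the paper's exact weighting may differ.

```python
import numpy as np

def soft_vq(frame, codebook, beta=1.0, top_n=3):
    """Soft vector quantization of one frame against a codebook (K, D).
    Returns a sparse nonnegative activation vector summing to 1,
    suitable as input to a non-negative matrix factorization model."""
    d2 = np.sum((codebook - frame) ** 2, axis=1)   # squared distances
    nearest = np.argsort(d2)[:top_n]               # top_n closest codewords
    w = np.exp(-beta * d2[nearest])                # softmax weighting
    w /= w.sum()
    code = np.zeros(len(codebook))
    code[nearest] = w
    return code
```

    Hard VQ is the special case `top_n=1`; increasing `top_n` trades sparsity for robustness to frames that fall between codewords.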

  8. Code-Expanded Random Access for Machine-Type Communications

    DEFF Research Database (Denmark)

    Kiilerich Pratas, Nuno; Thomsen, Henning; Stefanovic, Cedomir

    2012-01-01

    Abstract—The random access methods used for support of machine-type communications (MTC) in current cellular standards are derivatives of traditional framed slotted ALOHA and therefore do not support high user loads efficiently. Motivated by the random access method employed in LTE, we propose...

  9. Tonal Language Speech Compression Based on a Bitrate Scalable Multi-Pulse Based Code Excited Linear Prediction Coder

    Directory of Open Access Journals (Sweden)

    Suphattharachai Chomphan

    2011-01-01

    Full Text Available Problem statement: Speech compression is an important issue in modern digital speech communication, and bitrate scalability also plays a significant role, since the capacity of the communication system varies over time. In tonal languages such as Thai, tone plays an important role in the naturalness and intelligibility of speech, so it must be treated appropriately. These issues are taken into account in this study. Approach: This study proposes a modification of the flexible Multi-Pulse based Code Excited Linear Predictive (MP-CELP) coder with bitrate scalability for tonal-language speech in multimedia applications. The coder consists of a core coder and bitrate-scalable tools. High pitch delay resolutions are applied to the adaptive codebook of the core coder to improve tonal-language speech quality. The bitrate-scalable tool employs multi-stage excitation coding based on an embedded-coding approach. The multi-pulse excitation codebook at each stage is adaptively produced depending on the excitation signal selected at the previous stage. Results: The experimental results show that the speech quality of the proposed coder improves on that of the conventional coder without pitch-resolution adaptation. Conclusion: The proposed approach improves speech compression quality for tonal languages and also provides bitrate scalability.

  10. A Machine Learning Perspective on Predictive Coding with PAQ

    OpenAIRE

    Knoll, Byron; de Freitas, Nando

    2011-01-01

    PAQ8 is an open source lossless data compression algorithm that currently achieves the best compression rates on many benchmarks. This report presents a detailed description of PAQ8 from a statistical machine learning perspective. It shows that it is possible to understand some of the modules of PAQ8 and use this understanding to improve the method. However, intuitive statistical explanations of the behavior of other modules remain elusive. We hope the description in this report will be a sta...

  11. Source Coding With a Side Information "Vending Machine"

    OpenAIRE

    Weissman, Tsachy; Permuter, Haim H.

    2011-01-01

    We study source coding in the presence of side information, when the system can take actions that affect the availability, quality, or nature of the side information. We begin by extending the Wyner-Ziv problem of source coding with decoder side information to the case where the decoder is allowed to choose actions affecting the side information. We then consider the setting where actions are taken by the encoder, based on its observation of the source. Actions may have costs that are commens...

  12. A Quaternary Decision Diagram Machine: Optimization of Its Code

    Science.gov (United States)

    2010-08-01

    IEICE Trans. Inf. & Syst., Vol. E93-D, No. 8, August 2010 (invited paper, Special Section on Multiple-Valued Logic and VLSI Computing). The paper presents a 4-address quaternary decision diagram (QDD) machine and its branch instruction; since instructions are evaluated sequentially, a straightforward method to increase the speed is to increase the clock frequency.

  13. Improving Language Models in Speech-Based Human-Machine Interaction

    Directory of Open Access Journals (Sweden)

    Raquel Justo

    2013-02-01

    Full Text Available This work focuses on speech-based human-machine interaction. Specifically, a Spoken Dialogue System (SDS) that could be integrated into a robot is considered. Since Automatic Speech Recognition is one of the most sensitive tasks that must be confronted in such systems, the goal of this work is to improve the results obtained by this specific module. In order to do so, a hierarchical Language Model (LM) is considered. Different series of experiments were carried out using the proposed models over different corpora and tasks. The results obtained show that these models provide greater accuracy in the recognition task. Additionally, the influence of the Acoustic Modelling (AM) on the improvement percentage of the Language Models has also been explored. Finally, the use of hierarchical Language Models in a language understanding task has been successfully employed, as shown in an additional series of experiments.

  14. Accuracy comparison among different machine learning techniques for detecting malicious codes

    Science.gov (United States)

    Narang, Komal

    2016-03-01

    In this paper, a machine learning based model for malware detection is proposed. It can detect newly released malware, i.e. zero-day attacks, by analyzing operation codes on the Android operating system. The accuracies of Naïve Bayes, Support Vector Machine (SVM) and Neural Network classifiers for detecting malicious code have been compared for the proposed model. In the experiment, 400 benign files, 100 system files and 500 malicious files were used to construct the model. The model yields its best accuracy, 88.9%, when a neural network is used as the classifier, achieving 95% sensitivity and 82.8% specificity.
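
    A minimal version of such an opcode-based detector: represent each file by its opcode sequence and train a multinomial Naive Bayes classifier with Laplace smoothing (one of the three classifier families compared above). The opcode names, labels, and counts below are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def train_nb(samples):
    """samples: list of (opcode_list, label). Returns the model:
    per-label opcode counts, per-label document counts, and the vocabulary."""
    counts = defaultdict(Counter)
    docs = Counter()
    vocab = set()
    for ops, y in samples:
        counts[y].update(ops)
        docs[y] += 1
        vocab.update(ops)
    return counts, docs, vocab

def predict_nb(ops, model):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""
    counts, docs, vocab = model
    total_docs = sum(docs.values())
    best, best_lp = None, -math.inf
    for y in docs:
        lp = math.log(docs[y] / total_docs)          # log prior
        denom = sum(counts[y].values()) + len(vocab)
        for op in ops:
            lp += math.log((counts[y][op] + 1) / denom)
        if lp > best_lp:
            best, best_lp = y, lp
    return best
```

    Smoothing matters here because a previously unseen opcode (a "zero-day" pattern) would otherwise zero out every class likelihood.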

  15. Multi-Pulse Based Code Excited Linear Predictive Speech Coder with Fine Granularity Scalability for Tonal Language

    Directory of Open Access Journals (Sweden)

    Suphattharachai Chomphan

    2010-01-01

    Full Text Available Problem statement: Flexible bit-rate speech coders play an important role in modern speech communication. The MP-CELP speech coder, a candidate for the MPEG-4 natural speech coder, supports a flexible and wide bit-rate range, but fine scalability had not been included; supporting finer scalability of the coding rate is therefore studied here. Approach: In this study, based on MP-CELP speech coding with the HPDR technique, Fine Granularity Scalability (FGS) was introduced by adjusting the amount of transmitted fixed-excitation information. The FGS feature aims at changing the bit rate of the conventional coding more finely and more smoothly. Results: Through performance analysis and computer simulation, the scalability of the MP-CELP coding was shown to improve on that of the conventional scalable MP-CELP. The HPDR technique is also applied to the MP-CELP for use with tonal languages; the coder supports core coding rates of 4.2, 5.5 and 7.5 kbps as well as additional scaled bit rates. Conclusion: The core coder with the high-pitch-delay-resolution technique and adaptive codebook improves tonal speech quality, and the FGS brings about further efficient scalability.

  16. Contributions of speech science to the technology of man-machine voice interactions

    Science.gov (United States)

    Lea, Wayne A.

    1977-01-01

    Research in speech understanding was reviewed. Plans which include prosodics research, phonological rules for speech understanding systems, and continued interdisciplinary phonetics research are discussed. Improved acoustic phonetic analysis capabilities in speech recognizers are suggested.

  17. CNC LATHE MACHINE PRODUCING NC CODE BY USING DIALOG METHOD

    Directory of Open Access Journals (Sweden)

    Yakup TURGUT

    2004-03-01

    Full Text Available In this study, an NC code generation program utilising the Dialog Method was developed for turning centres. Initially, CNC lathe turning methods and tool path development techniques were reviewed briefly. By using geometric definition methods, a tool path was generated and a CNC part program was developed for a FANUC control unit. The developed program makes CNC part program generation easy. The program was developed in the BASIC 6.0 programming language, while the material and cutting-tool databases were supported with the help of ACCESS 7.0.

  18. Code-Expanded Random Access for Machine-Type Communications

    DEFF Research Database (Denmark)

    Kiilerich Pratas, Nuno; Thomsen, Henning; Stefanovic, Cedomir

    2012-01-01

    a novel approach that is able to sustain a wide random access load range, while preserving the physical layer unchanged and incurring minor changes in the medium access control layer. The proposed scheme increases the amount of available contention resources, without resorting to the increase of system...... of an increased number of MTC users. We present the framework and analysis of the proposed code-expanded random access method and show that our approach supports load regions that are beyond the reach of current systems....

  19. Using machine-coded event data for the micro-level study of political violence

    Directory of Open Access Journals (Sweden)

    Jesse Hammond

    2014-07-01

    Full Text Available Machine-coded datasets likely represent the future of event data analysis. We assess the use of one of these datasets—Global Database of Events, Language and Tone (GDELT)—for the micro-level study of political violence by comparing it to two hand-coded conflict event datasets. Our findings indicate that GDELT should be used with caution for geo-spatial analyses at the subnational level: its overall correlation with hand-coded data is mediocre, and at the local level major issues of geographic bias exist in how events are reported. Overall, our findings suggest that due to these issues, researchers studying local conflict processes may want to wait for a more reliable geocoding method before relying too heavily on this set of machine-coded data.
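The dataset comparison the authors describe reduces, at its core, to correlating machine-coded and hand-coded event counts over matching spatial units. A minimal sketch of that computation (all district counts are invented for illustration, not drawn from GDELT or the paper's data):

```python
# Sketch: Pearson correlation between machine-coded and hand-coded event
# counts aggregated to the same spatial units. Counts are invented.
from math import sqrt

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# events per district: machine-coded (e.g., GDELT-style) vs. hand-coded
machine_counts = [12, 40, 7, 55, 3]
hand_counts    = [10, 35, 15, 60, 2]
r = pearson(machine_counts, hand_counts)
```

An r near 1 would indicate close agreement between the two codings; the "mediocre" correlation the paper reports corresponds to substantially lower values at the subnational level.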

  20. A Bit Stream Scalable Speech/Audio Coder Combining Enhanced Regular Pulse Excitation and Parametric Coding

    Directory of Open Access Journals (Sweden)

    Albertus C. den Brinker

    2007-01-01

    Full Text Available This paper introduces a new audio and speech broadband coding technique based on the combination of a pulse excitation coder and a standardized parametric coder, namely, the MPEG-4 high-quality parametric coder. After presenting a series of enhancements to regular pulse excitation (RPE) to make it suitable for the modeling of broadband signals, it is shown how pulse and parametric codings complement each other and how they can be merged to yield a layered bit-stream-scalable coder able to operate at different points in the quality versus bit-rate plane. The performance of the proposed coder is evaluated in a listening test. The major result is that the extra functionality of the bit stream scalability does not come at the price of reduced performance, since the coder is competitive with standardized coders (MP3, AAC, SSC).

  1. Improving on hidden Markov models: An articulatorily constrained, maximum likelihood approach to speech recognition and speech coding

    Energy Technology Data Exchange (ETDEWEB)

    Hogden, J.

    1996-11-05

    The goal of the proposed research is to test a statistical model of speech recognition that incorporates the knowledge that speech is produced by relatively slow motions of the tongue, lips, and other speech articulators. This model is called Maximum Likelihood Continuity Mapping (Malcom). Many speech researchers believe that by using constraints imposed by articulator motions, we can improve or replace the current hidden Markov model based speech recognition algorithms. Unfortunately, previous efforts to incorporate information about articulation into speech recognition algorithms have suffered because (1) slight inaccuracies in our knowledge or the formulation of our knowledge about articulation may decrease recognition performance, (2) small changes in the assumptions underlying models of speech production can lead to large changes in the speech derived from the models, and (3) collecting measurements of human articulator positions in sufficient quantity for training a speech recognition algorithm is still impractical. The most interesting (and in fact, unique) quality of Malcom is that, even though Malcom makes use of a mapping between acoustics and articulation, Malcom can be trained to recognize speech using only acoustic data. By learning the mapping between acoustics and articulation using only acoustic data, Malcom avoids the difficulties involved in collecting articulator position measurements and does not require an articulatory synthesizer model to estimate the mapping between vocal tract shapes and speech acoustics. Preliminary experiments that demonstrate that Malcom can learn the mapping between acoustics and articulation are discussed. Potential applications of Malcom aside from speech recognition are also discussed. Finally, specific deliverables resulting from the proposed research are described.

  2. Effects of Voice Coding and Speech Rate on a Synthetic Speech Display in a Telephone Information System

    Science.gov (United States)

    1988-05-01

    short-term memory (STM), a function critical in synthetic speech perception. Research efforts of Atkinson and Shiffrin (1968) suggest STM acts not... Institute (1973). Psychoacoustic Terminology. New York: Author. Atkinson, R.C. and Shiffrin, R.M. (1968). Human memory: a proposed system and its control... far more computer memory to store speech information than does speech synthesized by rule. For example, analysis-synthesis using a common analog-to

  3. Speech Analysis and Synthesis and Man-Machine Speech Communications for Air Operations. (Synthese et Analyse de la Parole et Liaisons Vocales Homme- Machine dans les Operations Aeriennes)

    Science.gov (United States)

    1990-05-01

    noticeable: it produces either nonsense or a different word; for instance, bâton (stick) would turn into bateau (boat) in standard French if the... information-content words, while real speech contains more short, common words. Many of these long words act as qualifiers piled up before 'solutions' to... a similar piling up of subsidiary information that seems to be unacceptable in speech but common in writing occurs when a main clause is preceded by

  4. Adaptive Long-Term Coding of LSF Parameters Trajectories for Large-Delay/Very- to Ultra-Low Bit-Rate Speech Coding

    Directory of Open Access Journals (Sweden)

    Laurent Girin

    2010-01-01

    Full Text Available This paper presents a model-based method for coding the LSF parameters of LPC speech coders on a “long-term” basis, that is, beyond the usual 20–30 ms frame duration. The objective is to provide efficient LSF quantization for a speech coder with large delay but very- to ultra-low bit-rate (i.e., below 1 kb/s). To do this, speech is first segmented into voiced/unvoiced segments. A Discrete Cosine model of the time trajectory of the LSF vectors is then applied to each segment to capture the LSF interframe correlation over the whole segment. Bi-directional transformation from the model coefficients to a reduced set of LSF vectors enables both efficient “sparse” coding (using here multistage vector quantizers) and the generation of interpolated LSF vectors at the decoder. The proposed method provides up to 50% gain in bit-rate over frame-by-frame quantization while preserving signal quality and competes favorably with 2D-transform coding for the lower range of tested bit rates. Moreover, the implicit time-interpolation nature of the long-term coding process gives this technique a high potential for use in speech synthesis systems.
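The long-term Discrete Cosine modeling step can be illustrated with a toy, pure-Python sketch (the trajectory values, segment length, and retained-coefficient count are invented, and the multistage vector quantization of the coefficients is omitted):

```python
# Sketch: model the time trajectory of one LSF parameter over a segment
# with a truncated DCT-II, then regenerate (interpolated) frame values.
import math

def dct_coeffs(traj, K):
    """First K DCT-II coefficients (scaled by 2/N) of a length-N trajectory."""
    N = len(traj)
    return [sum(traj[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * N))
                for n in range(N)) * (2.0 / N) for k in range(K)]

def dct_reconstruct(coeffs, N):
    """Evaluate the truncated cosine model at N frame positions."""
    return [coeffs[0] / 2 + sum(c * math.cos(math.pi * k * (2 * n + 1) / (2 * N))
                                for k, c in enumerate(coeffs) if k > 0)
            for n in range(N)]

traj = [0.30, 0.32, 0.35, 0.37, 0.38, 0.38, 0.36, 0.33]  # one LSF over 8 frames
model = dct_coeffs(traj, 3)            # keep only 3 coefficients
approx = dct_reconstruct(model, 8)     # smooth, interpolated trajectory
```

Keeping only the first few coefficients captures the slow interframe evolution of the parameter; evaluating the model at the decoder yields the interpolated LSF vectors mentioned in the abstract.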

  5. Predicting and Classifying User Identification Code System Based on Support Vector Machines

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    In digital fingerprinting, preventing the piracy of images by colluders is an important and tedious issue. Each image is embedded with a unique User IDentification (UID) code that is the fingerprint for tracking the authorized user. The proposed hiding scheme makes use of a random number generator to scramble two copies of a UID, which are then hidden in the randomly selected medium-frequency coefficients of the host image. A linear support vector machine (SVM) is used to train classifiers by calculating the normalized correlation (NC) for the 2-class UID codes. The trained classifiers are then the models used for identifying unreadable UID codes. Experimental results showed that the success of predicting unreadable UID codes can be increased by applying SVM. The proposed scheme can be used to protect the intellectual property rights of digital images and to keep track of users to prevent collaborative piracy.
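The normalized-correlation feature can be sketched as follows (an assumed formulation over ±1 bits; the UID values are invented and the SVM training step is not shown):

```python
# Sketch: normalized correlation between an extracted UID bit sequence
# (in {-1, +1}) and a reference UID -- the kind of feature the paper
# feeds to its SVM classifier. Values are invented.
from math import sqrt

def normalized_correlation(a, b):
    num = sum(x * y for x, y in zip(a, b))
    return num / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

ref       = [1, -1, 1, 1, -1, -1, 1, -1]
extracted = [1, -1, 1, -1, -1, -1, 1, -1]   # one bit flipped by attack/noise
nc = normalized_correlation(ref, extracted)  # 6/8 = 0.75 here
```

A threshold (or a trained classifier) on NC then decides whether the extracted code matches a registered user.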

  6. Joint Machine Learning and Game Theory for Rate Control in High Efficiency Video Coding.

    Science.gov (United States)

    Gao, Wei; Kwong, Sam; Jia, Yuheng

    2017-08-25

    In this paper, a joint machine learning and game theory modeling (MLGT) framework is proposed for inter-frame coding tree unit (CTU) level bit allocation and rate control (RC) optimization in High Efficiency Video Coding (HEVC). First, a support vector machine (SVM) based multi-classification scheme is proposed to improve the prediction accuracy of the CTU-level rate-distortion (R-D) model; the legacy "chicken-and-egg" dilemma in video coding is overcome by this learning-based R-D model. Second, a mixed R-D model based cooperative bargaining game theory is proposed for bit allocation optimization, where the convexity of the mixed R-D model based utility function is proved, and the Nash bargaining solution (NBS) is achieved by the proposed iterative solution search method. The minimum utility is adjusted by the reference coding distortion and the frame-level quantization parameter (QP) change. Lastly, the intra-frame QP and the inter-frame adaptive bit ratios are adjusted to give inter frames more bit resources, maintaining smooth quality and bit consumption in the bargaining game optimization. Experimental results demonstrate that the proposed MLGT-based RC method achieves much better R-D performance, quality smoothness, bit rate accuracy, buffer control results and subjective visual quality than other state-of-the-art one-pass RC methods, and the achieved R-D performances are very close to the performance limits of the FixedQP method.

  7. Speech quality evaluation of a sparse coding shrinkage noise reduction algorithm with normal hearing and hearing impaired listeners.

    Science.gov (United States)

    Sang, Jinqiu; Hu, Hongmei; Zheng, Chengshi; Li, Guoping; Lutman, Mark E; Bleeck, Stefan

    2015-09-01

    Although there are numerous papers describing single-channel noise reduction strategies to improve speech perception in a noisy environment, few studies have comprehensively evaluated the effects of noise reduction algorithms on speech quality for hearing-impaired (HI) listeners. A model-based sparse coding shrinkage (SCS) algorithm has been developed and was previously shown (Sang et al., 2014) to be competitive with a state-of-the-art Wiener filter approach in speech intelligibility. Here, the analysis is extended to include subjective quality ratings, and a method called Interpolated Paired Comparison Rating (IPCR) is adopted to quantitatively link the benefits in speech intelligibility and speech quality. The subjective quality tests are performed through IPCR to efficiently quantify noise reduction effects on speech quality. Objective measures including frequency-weighted segmental signal-to-noise ratio (fwsegSNR), perceptual evaluation of speech quality (PESQ) and the hearing aid speech quality index (HASQI) are adopted to predict the noise reduction effects. Results show little difference in speech quality between the SCS and Wiener filter algorithms but a difference in quality ratings between the HI and NH listeners. HI listeners generally gave better quality ratings to noise reduction algorithms than NH listeners. However, SCS reduced the noise more efficiently at the cost of higher distortions, which were detected by NH but not by HI listeners. SCS is a promising candidate noise reduction algorithm for HI listeners. In general, care needs to be taken when adapting algorithms originally developed for NH participants to hearing aid applications. An algorithm that is evaluated negatively with NH listeners might still bring benefits for HI participants.
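Of the objective measures listed, the segmental SNR family is the simplest to sketch. Below is a plain (unweighted) segmental SNR in pure Python, a simplified stand-in for fwsegSNR; the frame length and test signals are invented:

```python
# Sketch: mean per-frame SNR in dB between a clean and a processed signal.
# fwsegSNR additionally weights frequency bands; this omits that step.
import math

def seg_snr(clean, processed, frame_len=160):
    """Average over frames of 10*log10(signal energy / error energy)."""
    snrs = []
    for i in range(0, len(clean) - frame_len + 1, frame_len):
        sig = sum(s * s for s in clean[i:i + frame_len])
        err = sum((s - p) ** 2 for s, p in
                  zip(clean[i:i + frame_len], processed[i:i + frame_len]))
        if sig > 0 and err > 0:
            snrs.append(10.0 * math.log10(sig / err))
    return sum(snrs) / len(snrs)

clean = [math.sin(0.05 * n) for n in range(800)]
noisy = [s + 0.01 * ((n * 37) % 11 - 5) / 5.0 for n, s in enumerate(clean)]
snr_db = seg_snr(clean, noisy)   # high value: the added noise is weak
```

PESQ and HASQI, by contrast, are full perceptual models standardized elsewhere and cannot be reduced to a few lines.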

  8. Adaptive coding of orofacial and speech actions in motor and somatosensory spaces with and without overt motor behavior.

    Science.gov (United States)

    Sato, Marc; Vilain, Coriandre; Lamalle, Laurent; Grabski, Krystyna

    2015-02-01

    Studies of speech motor control suggest that articulatory and phonemic goals are defined in multidimensional motor, somatosensory, and auditory spaces. To test whether motor simulation might rely on sensory-motor coding common to that used for motor execution, we used a repetition suppression (RS) paradigm while measuring neural activity with sparse-sampling fMRI during repeated overt and covert orofacial and speech actions. RS refers to the phenomenon that repeated stimuli or motor acts lead to decreased activity in specific neural populations and is associated with enhanced adaptive learning related to the repeated stimulus attributes. Common suppressed neural responses were observed in motor and posterior parietal regions during both repeated overt and covert orofacial and speech actions, including the left premotor cortex and inferior frontal gyrus, the superior parietal cortex and adjacent intraparietal sulcus, and the left IC and the SMA. Interestingly, reduced activity of the auditory cortex was observed during overt but not covert speech production, a finding likely reflecting a motor rather than an auditory imagery strategy by the participants. By providing evidence for adaptive changes in premotor and associative somatosensory brain areas, the observed RS suggests online state coding of both orofacial and speech actions in somatosensory and motor spaces with and without motor behavior and sensory feedback.

  9. Support vector machine and mel frequency Cepstral coefficient based algorithm for hand gestures and bidirectional speech to text device

    Science.gov (United States)

    Balbin, Jessie R.; Padilla, Dionis A.; Fausto, Janette C.; Vergara, Ernesto M.; Garcia, Ramon G.; Delos Angeles, Bethsedea Joy S.; Dizon, Neil John A.; Mardo, Mark Kevin N.

    2017-02-01

    This research is about translating a series of hand gestures to form a word and produce its equivalent sound, as it is read and said with a Filipino accent, using Support Vector Machine and Mel Frequency Cepstral Coefficient analysis. The concept is to detect Filipino speech input and translate the spoken words into their text form in Filipino. This study aims to help the Filipino deaf community impart their thoughts through the use of hand gestures and communicate with people who do not know how to read hand gestures. It also helps literate deaf people simply read the spoken words relayed to them using the Filipino speech-to-text system.
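The mel-frequency warping underlying MFCC analysis uses a standard textbook formula; the sketch below shows that mapping and a toy filterbank layout (the filter count and band edges are invented, not taken from this paper's front end):

```python
# Sketch: the standard mel-scale mapping at the heart of MFCC analysis,
# plus equally spaced mel filter centers for a toy 26-filter setup.
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

# filter centers equally spaced on the mel scale between 0 Hz and 8 kHz
lo, hi, n_filters = hz_to_mel(0.0), hz_to_mel(8000.0), 26
centers_hz = [mel_to_hz(lo + i * (hi - lo) / (n_filters + 1))
              for i in range(1, n_filters + 1)]
```

The centers crowd together at low frequencies and spread out at high ones, mimicking the ear's frequency resolution; cepstral coefficients are then computed from the log energies of these filters.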

  10. The Automation System Censor Speech for the Indonesian Rude Swear Words Based on Support Vector Machine and Pitch Analysis

    Science.gov (United States)

    Endah, S. N.; Nugraheni, D. M. K.; Adhy, S.; Sutikno

    2017-04-01

    According to Law No. 32 of 2002 and Indonesian Broadcasting Commission Regulations No. 02/P/KPI/12/2009 and No. 03/P/KPI/12/2009, broadcast programs should not scold with harsh words, nor harass, insult or demean minorities and marginalized groups. However, there are no suitable tools to censor such words automatically, so research into intelligent software to censor them automatically is needed. To perform censoring, the system must be able to recognize the words in question. This research proposes classifying speech into two classes using a Support Vector Machine (SVM): the first class is the set of rude words and the second is the set of proper words. The pitch values of the speech serve as the SVM input in the system developed for Indonesian rude swear words. The experimental results show that SVM performs well for this system.
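The pitch values fed to such an SVM can be obtained in many ways; a common baseline is an autocorrelation pitch estimator, sketched here on a synthetic tone (an illustrative stand-in, not the authors' extraction method):

```python
# Sketch: estimate pitch by picking the autocorrelation peak over a
# plausible voice lag range; the input is a synthetic 200 Hz tone
# at a 16 kHz sampling rate.
import math

def pitch_autocorr(x, fs, f_lo=60.0, f_hi=400.0):
    """Return the frequency whose lag maximizes the autocorrelation."""
    lag_min, lag_max = int(fs / f_hi), int(fs / f_lo)
    best_lag = max(range(lag_min, lag_max + 1),
                   key=lambda L: sum(x[n] * x[n + L] for n in range(len(x) - L)))
    return fs / best_lag

fs = 16000
x = [math.sin(2 * math.pi * 200.0 * n / fs) for n in range(1024)]
f0 = pitch_autocorr(x, fs)   # close to 200 Hz
```

Per-frame estimates like this, tracked over an utterance, form a pitch contour that can serve as the classifier's input feature.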

  11. Temporal Fine-Structure Coding and Lateralized Speech Perception in Normal-Hearing and Hearing-Impaired Listeners

    DEFF Research Database (Denmark)

    Locsei, Gusztav; Pedersen, Julie Hefting; Laugesen, Søren;

    2016-01-01

    hearing loss above 1.5 kHz participated in the study. Speech reception thresholds (SRTs) were estimated in the presence of either speech-shaped noise, two-, four-, or eight-talker babble played reversed, or a nonreversed two-talker masker. Target audibility was ensured by applying individualized linear...... understanding in spatially complex environments, these limitations were unrelated to TFS coding abilities and were only weakly associated with a reduction in binaural-unmasking benefit for spatially separated competing sources....

  12. Mapping the speech code: Cortical responses linking the perception and production of vowels

    NARCIS (Netherlands)

    Schuerman, W.L.; Meyer, A.S.; McQueen, J.M.

    2017-01-01

    The acoustic realization of speech is constrained by the physical mechanisms by which it is produced. Yet for speech perception, the degree to which listeners utilize experience derived from speech production has long been debated. In the present study, we examined how sensorimotor adaptation during

  13. Speech Coding in the Brain: Representation of Vowel Formants by Midbrain Neurons Tuned to Sound Fluctuations(1,2,3).

    Science.gov (United States)

    Carney, Laurel H; Li, Tianhao; McDonough, Joyce M

    2015-01-01

    Current models for neural coding of vowels are typically based on linear descriptions of the auditory periphery, and fail at high sound levels and in background noise. These models rely on either auditory nerve discharge rates or phase locking to temporal fine structure. However, both discharge rates and phase locking saturate at moderate to high sound levels, and phase locking is degraded in the CNS at middle to high frequencies. The fact that speech intelligibility is robust over a wide range of sound levels is problematic for codes that deteriorate as the sound level increases. Additionally, a successful neural code must function for speech in background noise at levels that are tolerated by listeners. The model presented here resolves these problems, and incorporates several key response properties of the nonlinear auditory periphery, including saturation, synchrony capture, and phase locking to both fine structure and envelope temporal features. The model also includes the properties of the auditory midbrain, where discharge rates are tuned to amplitude fluctuation rates. The nonlinear peripheral response features create contrasts in the amplitudes of low-frequency neural rate fluctuations across the population. These patterns of fluctuations result in a response profile in the midbrain that encodes vowel formants over a wide range of levels and in background noise. The hypothesized code is supported by electrophysiological recordings from the inferior colliculus of awake rabbits. This model provides information for understanding the structure of cross-linguistic vowel spaces, and suggests strategies for automatic formant detection and speech enhancement for listeners with hearing loss.

  14. Teaching the computer to code frames in news: comparing two supervised machine learning approaches to frame analysis

    NARCIS (Netherlands)

    Burscher, B.; Odijk, D.; Vliegenthart, R.; de Rijke, M.; de Vreese, C.H.

    2014-01-01

    We explore the application of supervised machine learning (SML) to frame coding. By automating the coding of frames in news, SML facilitates the incorporation of large-scale content analysis into framing research, even if financial resources are scarce. This furthers a more integrated investigation


  16. Brain cells in the avian 'prefrontal cortex' code for features of slot-machine-like gambling.

    Directory of Open Access Journals (Sweden)

    Damian Scarf

    Full Text Available Slot machines are the most common and addictive form of gambling. In the current study, we recorded from single neurons in the 'prefrontal cortex' of pigeons while they played a slot-machine-like task. We identified four categories of neurons that coded for different aspects of our slot-machine-like task. Reward-Proximity neurons showed a linear increase in activity as the opportunity for a reward drew near. I-Won neurons fired only when the fourth stimulus of a winning (four-of-a-kind) combination was displayed. I-Lost neurons changed their firing rate at the presentation of the first nonidentical stimulus, that is, when it was apparent that no reward was forthcoming. Finally, Near-Miss neurons also changed their activity the moment it was recognized that a reward was no longer available, but more importantly, the activity level was related to whether the trial contained one, two, or three identical stimuli prior to the display of the nonidentical stimulus. These findings not only add to recent neurophysiological research employing simulated gambling paradigms, but also add to research addressing the functional correspondence between the avian NCL and primate PFC.

  17. Channel Efficiency with Security Enhancement for Remote Condition Monitoring of Multi Machine System Using Hybrid Huffman Coding

    Science.gov (United States)

    Datta, Jinia; Chowdhuri, Sumana; Bera, Jitendranath

    2016-12-01

    This paper presents a novel scheme for remote condition monitoring of a multi-machine system, in which secured and coded data on induction machines with different parameters is communicated between state-of-the-art dedicated hardware units (DHUs) installed at the machine terminals and centralized PC-based machine data management (MDM) software. The DHUs are built to acquire different parameters from their respective machines, and are therefore placed at nearby panels so as to acquire those parameters cost-effectively while the machines are running. The MDM software collects these data through a communication channel in which all the DHUs are networked using the RS485 protocol. Before transmission, the parameter-related data is modified with differential pulse-coded modulation (DPCM) and Huffman coding, and is further encrypted with a private key, a different key being used for each DHU. In this way a data-security scheme is adopted for the data's passage through the communication channel, in order to avoid any third-party attack on the channel. The hybrid combination of DPCM and Huffman coding is chosen to reduce the data packet length. A MATLAB-based simulation and a practical implementation using DHUs at three machine terminals (one healthy three-phase, one healthy single-phase and one faulty three-phase machine) prove its efficacy and usefulness for condition-based maintenance of a multi-machine system. The data at the central control room are decrypted and decoded using the MDM software. In this work it is observed that channel efficiency with respect to the different parameter measurements has increased considerably.
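The DPCM-plus-Huffman stage can be sketched as follows (a minimal illustration, not the authors' DHU firmware; the sensor readings are invented and the encryption step is omitted):

```python
# Sketch: DPCM turns a slowly varying parameter stream into small
# differences, and a Huffman code then spends fewer bits on the
# frequent small values.
import heapq
from collections import Counter

def dpcm_encode(samples):
    """Replace each sample by its difference from the previous one."""
    prev, out = 0, []
    for s in samples:
        out.append(s - prev)
        prev = s
    return out

def huffman_code(symbols):
    """Map each symbol to a bit string; frequent symbols get shorter strings."""
    freq = Counter(symbols)
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    i = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (n1 + n2, i, merged))
        i += 1
    return heap[0][2]

readings = [500, 501, 501, 502, 501, 501, 500, 500, 520, 501]  # invented
diffs = dpcm_encode(readings)
code = huffman_code(diffs)
bits = "".join(code[d] for d in diffs)
```

DPCM concentrates the symbol distribution around a few small differences, which is exactly where Huffman coding saves bits relative to a fixed-length code.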

  18. Evaluation of pitch coding alternatives for vibrotactile stimulation in speech training of the deaf

    Energy Technology Data Exchange (ETDEWEB)

    Barbacena, I L; Barros, A T [CEFET/PB, Joao Pessoa - PB (Brazil); Freire, R C S [DEE, UFCG, Campina Grande-PB (Brazil); Vieira, E C A [CEFET/PB, Joao Pessoa - PB (Brazil)

    2007-11-15

    Use of vibrotactile feedback stimulation as an aid for speech vocalization by the hearing impaired or deaf is reviewed. Architecture of a vibrotactile based speech therapy system is proposed. Different formulations for encoding the fundamental frequency of the vocalized speech into the pulsed stimulation frequency are proposed and investigated. Simulation results are also presented to obtain a comparative evaluation of the effectiveness of the different formulated transformations. Results of the perception sensitivity to the vibrotactile stimulus frequency to verify effectiveness of the above transformations are included.
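One candidate transformation of the kind the paper compares can be sketched as a logarithmic mapping from the vocal fundamental-frequency range to a vibrotactile stimulation range (all constants here are invented for illustration, not taken from the paper):

```python
# Sketch: compress a voice F0 range (~80-400 Hz, assumed) logarithmically
# into a vibrotactile stimulation range (~50-250 Hz, assumed).
import math

def f0_to_tactile(f0, f_lo=80.0, f_hi=400.0, t_lo=50.0, t_hi=250.0):
    f0 = min(max(f0, f_lo), f_hi)  # clamp to the expected voice range
    frac = math.log(f0 / f_lo) / math.log(f_hi / f_lo)
    return t_lo + frac * (t_hi - t_lo)
```

A logarithmic (rather than linear) mapping keeps equal musical intervals of the voice mapped to equal steps of stimulation frequency, one of the design choices such transformations can be compared on.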

  19. Speech Coding in the Brain: Representation of Vowel Formants by Midbrain Neurons Tuned to Sound Fluctuations1,2,3

    Science.gov (United States)

    Li, Tianhao; McDonough, Joyce M.

    2015-01-01

    Abstract Current models for neural coding of vowels are typically based on linear descriptions of the auditory periphery, and fail at high sound levels and in background noise. These models rely on either auditory nerve discharge rates or phase locking to temporal fine structure. However, both discharge rates and phase locking saturate at moderate to high sound levels, and phase locking is degraded in the CNS at middle to high frequencies. The fact that speech intelligibility is robust over a wide range of sound levels is problematic for codes that deteriorate as the sound level increases. Additionally, a successful neural code must function for speech in background noise at levels that are tolerated by listeners. The model presented here resolves these problems, and incorporates several key response properties of the nonlinear auditory periphery, including saturation, synchrony capture, and phase locking to both fine structure and envelope temporal features. The model also includes the properties of the auditory midbrain, where discharge rates are tuned to amplitude fluctuation rates. The nonlinear peripheral response features create contrasts in the amplitudes of low-frequency neural rate fluctuations across the population. These patterns of fluctuations result in a response profile in the midbrain that encodes vowel formants over a wide range of levels and in background noise. The hypothesized code is supported by electrophysiological recordings from the inferior colliculus of awake rabbits. This model provides information for understanding the structure of cross-linguistic vowel spaces, and suggests strategies for automatic formant detection and speech enhancement for listeners with hearing loss. PMID:26464993

  20. Distributed and Cascade Lossy Source Coding with a Side Information "Vending Machine"

    CERN Document Server

    Ahmadi, Behzad

    2011-01-01

    Source coding with a side information "vending machine" is a recently proposed framework in which the statistical relationship between the side information and the source, instead of being given and fixed as in the classical Wyner-Ziv problem, can be controlled by the decoder. This control action is selected by the decoder based on the message encoded by the source node. Unlike conventional settings, the message can thus carry not only information about the source to be reproduced at the decoder, but also control information aimed at improving the quality of the side information. In this paper, the single-letter characterization of the trade-offs between rate, distortion and cost associated with the control actions is extended from the previously studied point-to-point set-up to two basic multiterminal models. First, a distributed source coding model is studied, in which an arbitrary number of encoders communicate over rate-limited links to a decoder, whose side information can be controlled. The control acti...

  1. Implementing Scientific Simulation Codes Highly Tailored for Vector Architectures Using Custom Configurable Computing Machines

    Science.gov (United States)

    Rutishauser, David

    2006-01-01

    The motivation for this work comes from an observation that amidst the push for Massively Parallel (MP) solutions to high-end computing problems such as numerical physical simulations, large amounts of legacy code exist that are highly optimized for vector supercomputers. Because re-hosting legacy code often requires a complete re-write of the original code, which can be a very long and expensive effort, this work examines the potential to exploit reconfigurable computing machines in place of a vector supercomputer to implement an essentially unmodified legacy source code. Custom and reconfigurable computing resources could be used to emulate an original application's target platform to the extent required to achieve high performance. To arrive at an architecture that delivers the desired performance subject to limited resources involves solving a multi-variable optimization problem with constraints. Prior research in the area of reconfigurable computing has demonstrated that designing an optimum hardware implementation of a given application under hardware resource constraints is an NP-complete problem. The premise of the approach is that the general issue of applying reconfigurable computing resources to the implementation of an application, maximizing the performance of the computation subject to physical resource constraints, can be made a tractable problem by assuming a computational paradigm, such as vector processing. This research contributes a formulation of the problem and a methodology to design a reconfigurable vector processing implementation of a given application that satisfies a performance metric. A generic, parametric, architectural framework for vector processing implemented in reconfigurable logic is developed as a target for a scheduling/mapping algorithm that maps an input computation to a given instance of the architecture. This algorithm is integrated with an optimization framework to arrive at a specification of the architecture parameters.

  2. Mapping the Speech Code: Cortical Responses Linking the Perception and Production of Vowels.

    Science.gov (United States)

    Schuerman, William L; Meyer, Antje S; McQueen, James M

    2017-01-01

    The acoustic realization of speech is constrained by the physical mechanisms by which it is produced. Yet for speech perception, the degree to which listeners utilize experience derived from speech production has long been debated. In the present study, we examined how sensorimotor adaptation during production may affect perception, and how this relationship may be reflected in early vs. late electrophysiological responses. Participants first performed a baseline speech production task, followed by a vowel categorization task during which EEG responses were recorded. In a subsequent speech production task, half the participants received shifted auditory feedback, leading most to alter their articulations. This was followed by a second, post-training vowel categorization task. We compared changes in vowel production to both behavioral and electrophysiological changes in vowel perception. No differences in phonetic categorization were observed between groups receiving altered or unaltered feedback. However, exploratory analyses revealed correlations between vocal motor behavior and phonetic categorization. EEG analyses revealed correlations between vocal motor behavior and cortical responses in both early and late time windows. These results suggest that participants' recent production behavior influenced subsequent vowel perception. We suggest that the change in perception can be best characterized as a mapping of acoustics onto articulation.

  3. Amharic Speech Recognition for Speech Translation

    OpenAIRE

    Melese, Michael; Besacier, Laurent; Meshesha, Million

    2016-01-01

    The state-of-the-art speech translation can be seen as a cascade of Automatic Speech Recognition, Statistical Machine Translation and Text-To-Speech synthesis. In this study an attempt is made to experiment with Amharic speech recognition for Amharic-English speech translation in the tourism domain. Since there was no Amharic speech corpus, we developed a read-speech corpus of 7.43 hr in the tourism domain. The Amharic speech corpus has been recorded after translating standard Bas...

  4. A new method for species identification via protein-coding and non-coding DNA barcodes by combining machine learning with bioinformatic methods.

    Directory of Open Access Journals (Sweden)

    Ai-bing Zhang

    Full Text Available Species identification via DNA barcodes is contributing greatly to current bioinventory efforts. The initial, and widely accepted, proposal was to use the protein-coding cytochrome c oxidase subunit I (COI) region as the standard barcode for animals, but recently non-coding internal transcribed spacer (ITS) genes have been proposed as candidate barcodes for both animals and plants. However, achieving a robust alignment for non-coding regions can be problematic. Here we propose two new methods (DV-RBF and FJ-RBF) to address this issue for species assignment by both coding and non-coding sequences that take advantage of the power of machine learning and bioinformatics. We demonstrate the value of the new methods with four empirical datasets, two representing typical protein-coding COI barcode datasets (neotropical bats and marine fish) and two representing non-coding ITS barcodes (rust fungi and brown algae). Using two random sub-sampling approaches, we demonstrate that the new methods significantly outperformed existing Neighbor-joining (NJ) and Maximum likelihood (ML) methods for both coding and non-coding barcodes when there was complete species coverage in the reference dataset. The new methods also outperformed NJ and ML methods for non-coding sequences in circumstances of potentially incomplete species coverage, although then the NJ and ML methods performed slightly better than the new methods for protein-coding barcodes. A 100% success rate of species identification was achieved with the two new methods for 4,122 bat queries and 5,134 fish queries using COI barcodes, with 95% confidence intervals (CI) of 99.75-100%. The new methods also obtained a 96.29% success rate (95% CI: 91.62-98.40%) for 484 rust fungi queries and a 98.50% success rate (95% CI: 96.60-99.37%) for 1094 brown algae queries, both using ITS barcodes.

  5. Sparse coding of the modulation spectrum for noise-robust automatic speech recognition

    NARCIS (Netherlands)

    Ahmadi, S.; Ahadi, S.M.; Cranen, B.; Boves, L.W.J.

    2014-01-01

    The full modulation spectrum is a high-dimensional representation of one-dimensional audio signals. Most previous research in automatic speech recognition converted this very rich representation into the equivalent of a sequence of short-time power spectra, mainly to simplify the computation of the

  6. Hate Speech, the First Amendment, and Professional Codes of Conduct: Where to Draw the Line?

    Science.gov (United States)

    Mello, Jeffrey A.

    2008-01-01

    This article presents a teaching case that involves the presentation of an actual incident in which a state commission on judicial performance had to balance a judge's First Amendment rights to protected free speech against his public statements about a societal class/group that were deemed to be derogatory and inflammatory and, hence, cast…

  7. Translocation Properties of Primitive Molecular Machines and Their Relevance to the Structure of the Genetic Code

    CERN Document Server

    Aldana, M; Larralde, H; Martínez-Mekler, G; Aldana, Maximino; Cocho, Germinal; Larralde, Hernan; Martinez-Mekler, Gustavo

    2002-01-01

    We address the question, related to the origin of the genetic code, of why there are three bases per codon in the translation-to-protein process. As a follow-up to our previous work, we approach this problem by considering the translocation properties of primitive molecular machines, which capture basic features of ribosomal/messenger RNA interactions while operating under prebiotic conditions. Our model consists of a short one-dimensional chain of charged particles (rRNA antecedent) interacting with a polymer (mRNA antecedent) via electrostatic forces. The chain is subject to external forcing that causes it to move along the polymer, which is fixed in a quasi one-dimensional geometry. Our numerical and analytic studies of statistical properties of random chain/polymer potentials suggest that, under very general conditions, a dynamics is attained in which the chain moves along the polymer in steps of three monomers. By adjusting the model in order to consider present-day genetic sequences, we show that the ab...

  8. Press ethics and perceptions of journalism in Turkey: An analysis of journalists' ethical challenges with special regard to codes of conduct and hate speech

    OpenAIRE

    Stav, Ragnhild

    2013-01-01

    This master thesis analyzes the ethical challenges journalists face in their work, with special regard to codes of conduct and hate speech. On the issue of hate speech, the thesis focuses on hate speech directed at minorities in Turkey. The media market in Turkey is highly regulated by laws and regulations, and as a result several newspapers have been in trouble with the law. This in turn leads to self-censorship in the business. Two media groups o...

  9. Breaking the language barrier: machine assisted diagnosis using the medical speech translator.

    Science.gov (United States)

    Starlander, Marianne; Bouillon, Pierrette; Rayner, Manny; Chatzichrisafis, Nikos; Hockey, Beth Ann; Isahara, Hitoshi; Kanzaki, Kyoko; Nakao, Yukie; Santaholma, Marianne

    2005-01-01

    In this paper, we describe and evaluate an Open Source medical speech translation system (MedSLT) intended for safety-critical applications. The aim of this system is to eliminate language barriers in emergency situations. It translates spoken questions from English into French, Japanese and Finnish in three medical subdomains (headache, chest pain and abdominal pain), using a vocabulary of about 250-400 words per subdomain. The architecture is a compromise between fixed-phrase translation on the one hand and complex linguistically-based systems on the other. Recognition is guided by a Context Free Grammar Language Model compiled from a general unification grammar, automatically specialised for the domain. We present an evaluation of this initial prototype that shows the advantages of this grammar-based approach for this particular translation task in terms of both reliability and use.

  10. An Adaptive Approach to a 2.4 kb/s LPC Speech Coding System.

    Science.gov (United States)

    1985-07-01

    laryngeal cancer). Spectral estimation is at the foundation of speech analysis for all these goals, and accurate AR model estimation in noise is...

  11. A sparse neural code for some speech sounds but not for others.

    Directory of Open Access Journals (Sweden)

    Mathias Scharinger

    Full Text Available The precise neural mechanisms underlying speech sound representations are still a matter of debate. Proponents of 'sparse representations' assume that on the level of speech sounds, only contrastive or otherwise not predictable information is stored in long-term memory. Here, in a passive oddball paradigm, we challenge the neural foundations of such a 'sparse' representation; we use words that differ only in their penultimate consonant ("coronal" [t] vs. "dorsal" [k] place of articulation) and, for example, distinguish between the German nouns Latz ([lats]; bib) and Lachs ([laks]; salmon). Changes from standard [t] to deviant [k] and vice versa elicited a discernible Mismatch Negativity (MMN) response. Crucially, however, the MMN for the deviant [lats] was stronger than the MMN for the deviant [laks]. Source localization showed this difference to be due to enhanced brain activity in right superior temporal cortex. These findings reflect a difference in phonological 'sparsity': coronal [t] segments, but not dorsal [k] segments, are based on more sparse representations and elicit less specific neural predictions; sensory deviations from this prediction are more readily 'tolerated' and accordingly trigger weaker MMNs. The results foster the neurocomputational reality of 'representationally sparse' models of speech perception that are compatible with more general predictive mechanisms in auditory perception.

  12. Addressing Hate Speech and Hate Behaviors in Codes of Conduct: A Model for Public Institutions.

    Science.gov (United States)

    Neiger, Jan Alan; Palmer, Carolyn; Penney, Sophie; Gehring, Donald D.

    1998-01-01

    As part of a larger study, researchers collected campus codes prohibiting hate crimes, which were then reviewed to determine whether the codes presented constitutional problems. Based on this review, the authors develop and present a model policy that is content neutral and does not use language that could be viewed as unconstitutionally vague or…

  13. The use of machine learning with signal- and NLP processing of source code to detect and classify vulnerabilities and weaknesses with MARFCAT

    CERN Document Server

    Mokhov, Serguei A

    2010-01-01

    We present a machine learning approach to static code analysis for security-related and other weaknesses, using the open-source MARF framework, and its application to the NIST SATE 2010 static analysis tool exhibition workshop.

  14. Superwideband Bandwidth Extension Using Normalized MDCT Coefficients for Scalable Speech and Audio Coding

    Directory of Open Access Journals (Sweden)

    Young Han Lee

    2013-01-01

    Full Text Available A bandwidth extension (BWE) algorithm from wideband to superwideband (SWB) is proposed for a scalable speech/audio codec that uses modified discrete cosine transform (MDCT) coefficients as spectral parameters. In the proposed BWE algorithm, the superwideband is first split into several subbands that are represented as gain parameters and normalized MDCT coefficients. We then estimate the normalized MDCT coefficients of the wideband to be fetched for the superwideband and quantize the fetch indices. After that, we quantize the gain parameters using relative ratios between adjacent subbands. The proposed BWE algorithm is embedded into a standard superwideband codec, the SWB extension of G.729.1 Annex E, and its bitrate and quality are compared with those of the BWE algorithm already employed in the standard superwideband codec. The comparison shows that the proposed BWE algorithm reduces the bitrate by around 19% with better quality, compared to the BWE algorithm in the SWB extension of G.729.1 Annex E.
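The gain-parameter step described in the abstract, coding each subband gain relative to its neighbor, can be illustrated with a toy differential quantizer. This is a minimal sketch under assumed values: the step size and the gain vector are illustrative choices, not parameters of the SWB extension of G.729.1 Annex E.

```python
import numpy as np

def encode_gains(gains_db, step=1.5):
    """Differentially quantize subband gains: the first gain is coded
    absolutely, the rest as quantized dB differences (relative ratios
    in the linear domain) to the previous subband."""
    deltas = np.diff(gains_db, prepend=0.0)      # first entry = absolute gain
    return np.round(deltas / step).astype(int)   # uniform quantizer indices

def decode_gains(indices, step=1.5):
    """Undo the differential coding by accumulating the dequantized deltas."""
    return np.cumsum(indices * step)

g = np.array([12.0, 10.5, 7.0, 8.2])   # hypothetical subband gains in dB
idx = encode_gains(g)
print(decode_gains(idx))               # close to the original gains
```

Coding differences rather than absolute gains exploits the correlation between adjacent subbands, so the indices stay small and cheap to entropy-code; the trade-off is that quantization error accumulates along the subbands.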

  15. On Coding the States of Sequential Machines with the Use of Partition Pairs

    DEFF Research Database (Denmark)

    Zahle, Torben U.

    1966-01-01

    This article introduces a new technique of making state assignments for sequential machines. The technique is in line with the approach used by Hartmanis [1], Stearns and Hartmanis [3], and Curtis [4]. It parallels the work of Dolotta and McCluskey [7], although it was developed independently...

  16. Vector Quantization of Harmonic Magnitudes in Speech Coding Applications—A Survey and New Technique

    Directory of Open Access Journals (Sweden)

    Wai C. Chu

    2004-12-01

    Full Text Available A harmonic coder extracts the harmonic components of a signal and represents them efficiently using a few parameters. The principles of harmonic coding have proven quite successful, and several standardized speech and audio coders are based on them. One of the key issues in harmonic coder design is the quantization of harmonic magnitudes, for which many propositions have appeared in the literature. The objective of this paper is to provide a survey of the various techniques that have appeared in the literature for vector quantization of harmonic magnitudes, with emphasis on those adopted by the major speech coding standards; these include constant magnitude approximation, partial quantization, dimension conversion, and variable-dimension vector quantization (VDVQ). In addition, a refined VDVQ technique is proposed, with experimental data provided to demonstrate its effectiveness.
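The core operation surveyed above, mapping a vector of harmonic magnitudes to the nearest entry of a trained codebook, can be sketched as follows. This is a fixed-dimension nearest-neighbor sketch only: the codebook here is random for illustration (in practice it would be trained, e.g. with k-means), and VDVQ additionally handles the pitch-dependent, variable number of harmonics.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy codebook: 16 codewords, each an 8-dimensional magnitude vector.
codebook = rng.standard_normal((16, 8))

def vq_encode(x, codebook):
    """Return the index of the codeword nearest to x (Euclidean distance)."""
    return int(np.argmin(np.sum((codebook - x) ** 2, axis=1)))

def vq_decode(i, codebook):
    """Reconstruction is simply a codebook lookup."""
    return codebook[i]

# A vector lying near codeword 5 should be encoded to index 5.
x = codebook[5] + 0.01 * rng.standard_normal(8)
i = vq_encode(x, codebook)
print(i, np.linalg.norm(x - vq_decode(i, codebook)))
```

Only the index is transmitted, so the bit cost is log2 of the codebook size per vector; reconstruction quality is bounded by how densely the codebook covers the space of magnitude vectors.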

  17. Man-machine interaction in the 21st century--new paradigms through dynamic scene analysis and synthesis (Keynote Speech)

    Science.gov (United States)

    Huang, Thomas S.; Orchard, Michael T.

    1992-11-01

    The past twenty years have witnessed a revolution in the use of computers in virtually every facet of society. While this revolution has been largely fueled by dramatic technological advances, the efficient application of this technology has been made possible through advances in the paradigms defining the way users interact with computers. Today's massive computational power would probably have limited sociological impact if users still communicated with computers via the binary machine language codes used in the 1950's. Instead, this primitive paradigm was replaced by keyboards and ASCII character displays in the 1970's, and the 'mouse' and multiple-window bit-mapped displays in the 1980's. As continuing technological advances make even larger computational power available in the future, advanced paradigms for man-machine interaction will be required to allow this power to be used efficiently in a wide range of applications. Looking ahead into the 21st century, we see paradigms supporting radically new ways of interacting with computers. Ideally, we would like these interactions to mimic the ways we interact with objects and people in the physical world, and, to achieve this goal, we believe that it is essential to consider the exchange of video data into and out of the computer. Paradigms based on visual interactions represent a radical departure from existing paradigms, because they allow the computer to actively seek out information from the user via dynamic scene analysis. For example, the computer might enlarge the display when it detects that the user is squinting, or it might reorient a three-dimensional object on the screen in response to detected hand motions. This contrasts with current paradigms in which the computer relies on passive switching devices (keyboard, mouse, buttons, etc.) to receive information. Feedback will be provided to the user via dynamic scene synthesis, employing stereoscopic three-dimensional display systems. To exploit the

  18. High Performance Computing of Three-Dimensional Finite Element Codes on a 64-bit Machine

    Directory of Open Access Journals (Sweden)

    M.P Raju

    2012-01-01

    Full Text Available Three-dimensional Navier-Stokes finite element formulations require huge computational power in terms of memory and CPU time. Recent developments in sparse direct solvers have significantly reduced the memory and computational time of direct solution methods. The objective of this study is twofold. The first is to evaluate the performance of various state-of-the-art sequential sparse direct solvers in the context of finite element formulations of fluid flow problems. The second is to examine the merit of upgrading from a 32-bit machine to a 64-bit machine with larger RAM capacity, in terms of its ability to solve larger problems. The choice of a direct solver depends on its computational time and its in-core memory requirements. Here four different solvers, UMFPACK, MUMPS, HSL_MA78 and PARDISO, are compared. The performance of these solvers with respect to computational time and memory requirements is evaluated on a 64-bit Windows server machine with 16 GB RAM.

  19. Monte Carlo simulation of a multi-leaf collimator design for telecobalt machine using BEAMnrc code

    Directory of Open Access Journals (Sweden)

    Ayyangar Komanduri

    2010-01-01

    Full Text Available This investigation aims to design a practical multi-leaf collimator (MLC) system for the cobalt teletherapy machine and check its radiation properties using the Monte Carlo (MC) method. The cobalt machine was modeled using the BEAMnrc Omega-Beam MC system, which can be freely downloaded from the website of the National Research Council (NRC), Canada. Comparison with standard depth dose data tables and the theoretically modeled beam showed good agreement within 2%. An MLC design with low melting point alloy (LMPA) was tested for leakage properties of the leaves. The LMPA leaves, 7 mm wide and 6 cm high, with a tongue and groove 2 mm wide by 4 cm high, produced only 4% extra leakage compared to 10-cm-high tungsten leaves. With a finite 60Co source size, the interleaf leakage was insignificant. This analysis helped to design a prototype MLC as an accessory mount on a cobalt machine. The complete details of the simulation process and analysis of results are discussed.

  20. Speech enhancement for listeners with hearing loss based on a model for vowel coding in the auditory midbrain.

    Science.gov (United States)

    Rao, Akshay; Carney, Laurel H

    2014-07-01

    A novel signal-processing strategy is proposed to enhance speech for listeners with hearing loss. The strategy focuses on improving vowel perception based on a recent hypothesis for vowel coding in the auditory system. Traditionally, studies of neural vowel encoding have focused on the representation of formants (peaks in vowel spectra) in the discharge patterns of the population of auditory-nerve (AN) fibers. A recent hypothesis focuses instead on vowel encoding in the auditory midbrain, and suggests a robust representation of formants. AN fiber discharge rates are characterized by pitch-related fluctuations having frequency-dependent modulation depths. Fibers tuned to frequencies near formants exhibit weaker pitch-related fluctuations than those tuned to frequencies between formants. Many auditory midbrain neurons show tuning to amplitude modulation frequency in addition to audio frequency. According to the auditory midbrain vowel encoding hypothesis, the response map of a population of midbrain neurons tuned to modulations near voice pitch exhibits minima near formant frequencies, due to the lack of strong pitch-related fluctuations at their inputs. This representation is robust over the range of noise conditions in which speech intelligibility is also robust for normal-hearing listeners. Based on this hypothesis, a vowel-enhancement strategy has been proposed that aims to restore vowel encoding at the level of the auditory midbrain. The signal processing consists of pitch tracking, formant tracking, and formant enhancement. The novel formant-tracking method proposed here estimates the first two formant frequencies by modeling characteristics of the auditory periphery, such as saturated discharge rates of AN fibers and modulation tuning properties of auditory midbrain neurons. 
The formant enhancement stage aims to restore the representation of formants at the level of the midbrain by increasing the dominance of a single harmonic near each formant and saturating

  1. New speech coding scheme based on compressed sensing

    Institute of Scientific and Technical Information of China (English)

    许佳佳

    2016-01-01

    Exploiting the sparsity of speech signals, compressed sensing theory is applied to speech signal processing and a new speech coding scheme is proposed. At the encoder, the speech signal is observed with a random Gaussian matrix to obtain a small number of measurements, which are then further compressed by vector quantization coding. At the decoder, the measurements are recovered by vector-quantization decoding and, exploiting the sparsity of speech in the discrete cosine domain, the speech signal is reconstructed with the orthogonal matching pursuit algorithm. The algorithm reduces computational complexity and delay while preserving the quality of the reconstructed speech. Experimental results show that a monaural speech signal sampled at 44100 Hz with 16-bit quantization (705.6 kbps) can be compressed to around 100 kbps while retaining good speech quality, with low algorithmic delay.
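The encode/decode pipeline described above (random Gaussian measurement at the encoder, orthogonal matching pursuit reconstruction in the DCT domain at the decoder) can be sketched in NumPy. This is a minimal illustration on a synthetic DCT-sparse signal, not the paper's implementation: the signal length, measurement count and sparsity are assumed values, and the vector-quantization stage is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 128, 64, 5  # signal length, number of measurements, sparsity

# Orthonormal DCT-II basis: columns are basis vectors.
t = np.arange(n)
Psi = np.cos(np.pi * (t[:, None] + 0.5) * t[None, :] / n) * np.sqrt(2.0 / n)
Psi[:, 0] /= np.sqrt(2.0)

# Synthetic signal that is exactly k-sparse in the DCT domain.
coeffs = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
coeffs[support] = rng.standard_normal(k)
x = Psi @ coeffs

# Encoder: random Gaussian measurement matrix, m < n observations.
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
y = Phi @ x
A = Phi @ Psi  # effective sensing dictionary seen by the decoder

def omp(A, y, k):
    """Orthogonal matching pursuit: greedily pick the column most
    correlated with the residual, then re-fit by least squares."""
    residual, sel = y.copy(), []
    for _ in range(k):
        sel.append(int(np.argmax(np.abs(A.T @ residual))))
        sol, *_ = np.linalg.lstsq(A[:, sel], y, rcond=None)
        residual = y - A[:, sel] @ sol
    out = np.zeros(A.shape[1])
    out[sel] = sol
    return out

# Decoder: recover DCT coefficients, then map back to the time domain.
x_hat = Psi @ omp(A, y, k)
print(np.linalg.norm(x - x_hat) / np.linalg.norm(x))  # small relative error
```

With far fewer measurements than samples (here 64 vs. 128), recovery succeeds because the signal is sparse in the DCT basis and the Gaussian measurements are incoherent with it, which is exactly the premise of the scheme in the abstract.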

  2. Recent Progress in a Beam-Beam Simulation Code for Circular Hadron Machines

    Energy Technology Data Exchange (ETDEWEB)

    Kabel, Andreas; /SLAC; Fischer, Wolfram; /Brookhaven; Sen, Tanaji; /Fermilab

    2007-09-10

    While conventional tracking codes can readily provide higher-order optical quantities and give an estimate of dynamic apertures, they are unable to directly provide measurable quantities such as lifetimes and loss rates. The particle tracking framework Plibb aims at modeling a storage ring with sufficient accuracy, over a sufficiently high number of turns, and in the presence of beam-beam interactions, to allow for an estimate of these quantities. We provide a description of new features of the code; we also describe a novel method of treating chromaticity in ring sections in a symplectic fashion.

  3. Machine-vision-based bar code scanning for long-range applications

    Science.gov (United States)

    Banta, Larry E.; Pertl, Franz A.; Rosenecker, Charles; Rosenberry-Friend, Kimberly A.

    1998-10-01

    Bar code labeling of products has become almost universal in most industries. However, in the steel industry, problems with high temperatures, harsh physical environments and the large sizes of the products and material handling equipment have slowed implementation of bar code based systems in the hot end of the mill. Typical laser-based bar code scanners have maximum scan distances of only 15 feet or so. Longer-distance models have been developed which require the use of retroreflective paper labels, but the labels must be very large, are expensive, and cannot stand the heat and physical abuse of the steel mill environment. Furthermore, it is often difficult to accurately point a hand-held scanner at targets in bright sunlight or at long distances. An automated product tag reading system based on CCD cameras and computer image processing has been developed by West Virginia University, and demonstrated at the Weirton Steel Corporation. The system performs both the pointing and reading functions. A video camera is mounted on a pan/tilt head, and connected to a personal computer through a frame grabber board. The computer analyzes the images, and can identify product ID tags in a wide-angle scene. It controls the camera to point at each tag and zoom for a closeup picture. The closeups are analyzed, and the program reads both a barcode and the corresponding alphanumeric code on the tag. This paper describes the camera pointing and bar-code reading functions of the algorithm. A companion paper describes the OCR functions.

  4. Speech Compression for Noise-Corrupted Thai Expressive Speech

    Directory of Open Access Journals (Sweden)

    Suphattharachai Chomphan

    2011-01-01

    Full Text Available Problem statement: In speech communication, speech coding aims at preserving speech quality at a lower coding bitrate. When considering the communication environment, various types of noise deteriorate the speech quality. Expressive speech with different speaking styles may yield different speech quality under the same coding method. Approach: This research proposed a study of speech compression for noise-corrupted Thai expressive speech using two coding methods, CS-ACELP and MP-CELP. The speech material included a hundred male speech utterances and a hundred female speech utterances. Four speaking styles were included: enjoyable, sad, angry and reading styles. Five sentences of Thai speech were chosen. Three types of noise were included (train, car and air conditioner). Five levels of each type of noise were varied from 0-20 dB. The subjective test of mean opinion score was exploited in the evaluation process. Results: The experimental results showed that CS-ACELP gave better speech quality than MP-CELP at all three bitrates of 6000, 8600 and 12600 bps. When considering the levels of noise, the 20-dB noise gave the best speech quality, while 0-dB noise gave the worst speech quality. When considering speech gender, female speech gave better results than male speech. When considering the types of noise, the air-conditioner noise gave the best speech quality, while the train noise gave the worst speech quality. Conclusion: From the study, it can be seen that coding method, type of noise, level of noise and speech gender all influence the coded speech quality.

  5. A speech coding algorithm based on compressed sensing

    Institute of Scientific and Technical Information of China (English)

    王茂林; 黄文明; 王菊娇

    2012-01-01

    A new speech coding algorithm based on compressed sensing is presented, using the discrete cosine transform (DCT) for sparse representation of the speech signal. The algorithm measures the speech waveform with a Gaussian random matrix, and each measurement is quantized independently by a uniform quantizer. Saturated measurements are simply discarded in the decoder, and the speech signal is then reconstructed from the remaining measurements with the Lasso algorithm. Experimental results show that the new algorithm performs well.

  6. Multilevel Analysis in Analyzing Speech Data

    Science.gov (United States)

    Guddattu, Vasudeva; Krishna, Y.

    2011-01-01

    The speech produced by human vocal tract is a complex acoustic signal, with diverse applications in phonetics, speech synthesis, automatic speech recognition, speaker identification, communication aids, speech pathology, speech perception, machine translation, hearing research, rehabilitation and assessment of communication disorders and many…

  7. Analysis of image content recognition algorithm based on sparse coding and machine learning

    Science.gov (United States)

    Xiao, Yu

    2017-03-01

    This paper presents an image classification algorithm based on a spatial sparse coding model and random forests. First, SIFT features are extracted from the image; sparse coding is then used to generate a visual vocabulary from the SIFT features and to encode each SIFT feature as a sparse vector. By combining regional pooling with the spatial sparse vectors, a fixed-dimension sparse vector is obtained to represent the image. Finally, a random forest classifier is trained and tested on the image sparse vectors, using the standard benchmark datasets Caltech-101 and Scene-15. The experimental results show that the proposed algorithm can effectively represent image features and improve classification accuracy. The paper further proposes an image recognition algorithm based on image segmentation, sparse coding and multi-instance learning: the image is treated as a multi-instance bag, SIFT features transformed by sparse coding serve as the instances, the visual vocabulary generated by the sparse coding model defines the feature space, the image is mapped to this feature space by statistics over the number of instances in the bag, and a 1-norm SVM is then used to classify images and generate sample weights for selecting important image features.

  8. lncRScan-SVM: A Tool for Predicting Long Non-Coding RNAs Using Support Vector Machine.

    Science.gov (United States)

    Sun, Lei; Liu, Hui; Zhang, Lin; Meng, Jia

    2015-01-01

    Functional long non-coding RNAs (lncRNAs) have been bringing novel insight into biological study; however, it is still not trivial to accurately distinguish lncRNA transcripts (LNCTs) from protein-coding ones (PCTs). As a wealth of information and data about lncRNAs has been accumulated by previous studies, it is appealing to develop novel methods to identify lncRNAs more accurately. Our method, lncRScan-SVM, aims at classifying PCTs and LNCTs using a support vector machine (SVM). The gold-standard datasets for lncRScan-SVM model training, lncRNA prediction and method comparison were constructed according to the GENCODE gene annotations of human and mouse, respectively. By integrating features derived from gene structure, transcript sequence, potential codon sequence and conservation, lncRScan-SVM outperforms other approaches, as evaluated by several criteria such as sensitivity, specificity, accuracy, Matthews correlation coefficient (MCC) and area under curve (AUC). In addition, several known human lncRNA datasets were assessed using lncRScan-SVM. LncRScan-SVM is an efficient tool for predicting lncRNAs, and it is quite useful for current lncRNA study.

  9. Automatic Multi-GPU Code Generation applied to Simulation of Electrical Machines

    CERN Document Server

    Rodrigues, Antonio Wendell De Oliveira; Dekeyser, Jean-Luc; Menach, Yvonnick Le

    2011-01-01

    Electrical and electronic engineering has used parallel programming to solve its large-scale complex problems for performance reasons. However, as parallel programming requires a non-trivial distribution of tasks and data, developers find it hard to implement their applications effectively. Thus, in order to reduce design complexity, we propose an approach to generate code for hybrid architectures (e.g. CPU + GPU) using OpenCL, an open standard for parallel programming of heterogeneous systems. This approach is based on Model Driven Engineering (MDE) and the MARTE profile, a standard proposed by the Object Management Group (OMG). The aim is to provide resources to non-specialists in parallel programming to implement their applications. Moreover, thanks to model reuse capacity, we can add/change functionalities or the target architecture. Consequently, this approach helps industries to achieve their time-to-market constraints and confirms, by experimental tests, performance improvements using multi-GPU environmen...

  10. Speech enhancement

    CERN Document Server

    Benesty, Jacob; Chen, Jingdong

    2006-01-01

    We live in a noisy world! In all applications (telecommunications, hands-free communications, recording, human-machine interfaces, etc.) that require at least one microphone, the signal of interest is usually contaminated by noise and reverberation. As a result, the microphone signal has to be "cleaned" with digital signal processing tools before it is played out, transmitted, or stored. This book is about speech enhancement. Different well-known and state-of-the-art methods for noise reduction, with one or multiple microphones, are discussed. By speech enhancement, we mean not only noise red

  11. Intelligibility in speech maskers with a binaural cochlear implant sound coding strategy inspired by the contralateral medial olivocochlear reflex.

    Science.gov (United States)

    Lopez-Poveda, Enrique A; Eustaquio-Martín, Almudena; Stohl, Joshua S; Wolford, Robert D; Schatzer, Reinhold; Gorospe, José M; Ruiz, Santiago Santa Cruz; Benito, Fernando; Wilson, Blake S

    2017-05-01

    We have recently proposed a binaural cochlear implant (CI) sound processing strategy inspired by the contralateral medial olivocochlear reflex (the MOC strategy) and shown that it improves intelligibility in steady-state noise (Lopez-Poveda et al., 2016, Ear Hear 37:e138-e148). The aim here was to evaluate possible speech-reception benefits of the MOC strategy for speech maskers, a more natural type of interferer. Speech reception thresholds (SRTs) were measured in six bilateral and two single-sided deaf CI users with the MOC strategy and with a standard (STD) strategy. SRTs were measured in unilateral and bilateral listening conditions, and for target and masker stimuli located at azimuthal angles of (0°, 0°), (-15°, +15°), and (-90°, +90°). Mean SRTs were 2-5 dB better with the MOC than with the STD strategy for spatially separated target and masker sources. For bilateral CI users, the MOC strategy (1) facilitated the intelligibility of speech in competition with spatially separated speech maskers in both unilateral and bilateral listening conditions; and (2) led to an overall improvement in spatial release from masking in the two listening conditions. Insofar as speech is a more natural type of interferer than steady-state noise, the present results suggest that the MOC strategy holds potential for promising outcomes for CI users. Copyright © 2017. Published by Elsevier B.V.

  12. A Coding System with Independent Annotations of Gesture Forms and Functions during Verbal Communication: Development of a Database of Speech and GEsture (DoSaGE).

    Science.gov (United States)

    Kong, Anthony Pak-Hin; Law, Sam-Po; Kwan, Connie Ching-Yin; Lai, Christy; Lam, Vivian

    2015-03-01

    Gestures are commonly used together with spoken language in human communication. One major limitation of gesture investigations in the existing literature lies in the fact that the coding of forms and functions of gestures has not been clearly differentiated. This paper first described a recently developed Database of Speech and GEsture (DoSaGE) based on independent annotation of gesture forms and functions among 119 neurologically unimpaired right-handed native speakers of Cantonese (divided into three age and two education levels), and presented findings of an investigation examining how gesture use was related to age and linguistic performance. Consideration of these two factors, for which normative data are currently very limited or lacking in the literature, is relevant and necessary when one evaluates gesture employment among individuals with and without language impairment. Three speech tasks, including monologue of a personally important event, sequential description, and story-telling, were used for elicitation. The EUDICO Linguistic ANnotator (ELAN) software was used to independently annotate each participant's linguistic information of the transcript, forms of gestures used, and the function for each gesture. About one-third of the subjects did not use any co-verbal gestures. While the majority of gestures were non-content-carrying, which functioned mainly for reinforcing speech intonation or controlling speech flow, the content-carrying ones were used to enhance speech content. Furthermore, individuals who are younger or linguistically more proficient tended to use fewer gestures, suggesting that normal speakers gesture differently as a function of age and linguistic performance.

  13. Speech processing using maximum likelihood continuity mapping

    Energy Technology Data Exchange (ETDEWEB)

    Hogden, John E. (Santa Fe, NM)

    2000-01-01

    Speech processing is obtained that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator position is described. The method for learning the mapping between static speech sounds and pseudo-articulator position uses a set of training data composed only of speech sounds. The said speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.

  14. Speech processing using maximum likelihood continuity mapping

    Energy Technology Data Exchange (ETDEWEB)

    Hogden, J.E.

    2000-04-18

    Speech processing is obtained that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator position is described. The method for learning the mapping between static speech sounds and pseudo-articulator position uses a set of training data composed only of speech sounds. The said speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.

  15. Two-Dimensional Code Recognition Based on Machine Vision

    Institute of Scientific and Technical Information of China (English)

    常晓玮

    2014-01-01

    For component detection in the automotive parts supply chain, a recognition method for DataMatrix, PDF417, and QR codes based on machine vision is proposed. The method first acquires an image, then applies denoising techniques to the captured 2D-code image, and finally selects different processing methods according to the code type, allowing all three 2D codes to be identified effectively. Experimental results show that the proposed approach is feasible and effective and can recognize DataMatrix, PDF417, and QR codes in real time.

  16. Speech dynamics are coded in the left motor cortex in fluent speakers but not in adults who stutter.

    Science.gov (United States)

    Neef, Nicole E; Hoang, T N Linh; Neef, Andreas; Paulus, Walter; Sommer, Martin

    2015-03-01

    The precise excitability regulation of neuronal circuits in the primary motor cortex is central to the successful and fluent production of speech. Our question was whether the involuntary execution of undesirable movements, e.g. stuttering, is linked to an insufficient excitability tuning of neural populations in the orofacial region of the primary motor cortex. We determined the speech-related time course of excitability modulation in the left and right primary motor tongue representation. Thirteen fluent speakers (four females, nine males; aged 23-44) and 13 adults who stutter (four females, nine males; aged 21-55) were asked to build verbs with the verbal prefix 'auf'. Single-pulse transcranial magnetic stimulation was applied over the primary motor cortex during the transition phase between a fixed labiodental articulatory configuration and immediately following articulatory configurations, at different latencies after transition onset. Bilateral electromyography was recorded from self-adhesive electrodes placed on the surface of the tongue. Off-line, we extracted the motor evoked potential amplitudes and normalized these amplitudes to the individual baseline excitability during the fixed configuration. Fluent speakers demonstrated a prominent left hemisphere increase of motor cortex excitability in the transition phase (P = 0.009). In contrast, the excitability of the right primary motor tongue representation was unchanged. Interestingly, adults afflicted with stuttering revealed a lack of left-hemisphere facilitation. Moreover, the magnitude of facilitation was negatively correlated with stuttering frequency. Although orofacial midline muscles are bilaterally innervated from corticobulbar projections of both hemispheres, our results indicate that speech motor plans are controlled primarily in the left primary speech motor cortex. This speech motor planning-related asymmetry towards the left orofacial motor cortex is missing in stuttering. Moreover, a negative

  17. Machines a Comprendre la Parole: Methodologie et Bilan de Recherche (Automatic Speech Recognition: Methodology and the State of the Research)

    Science.gov (United States)

    Haton, Jean-Pierre

    1974-01-01

    No decisive result has yet been achieved in the automatic machine recognition of sentences of a natural language. Current research concentrates on developing algorithms for syntactic and semantic analysis. It is obvious that clues from all levels of perception have to be taken into account if a long-term solution is ever to be found. (Author/MSE)

  18. An improved method for identification of small non-coding RNAs in bacteria using support vector machine

    Science.gov (United States)

    Barman, Ranjan Kumar; Mukhopadhyay, Anirban; Das, Santasabuj

    2017-04-01

    Bacterial small non-coding RNAs (sRNAs) are not translated into proteins, but act as functional RNAs. They are involved in diverse biological processes like virulence, stress response and quorum sensing. Several high-throughput techniques have enabled identification of sRNAs in bacteria, but experimental detection remains a challenge and grossly incomplete for most species. Thus, there is a need to develop computational tools to predict bacterial sRNAs. Here, we propose a computational method to identify sRNAs in bacteria using a support vector machine (SVM) classifier. The primary sequence and secondary structure features of experimentally-validated sRNAs of Salmonella Typhimurium LT2 (SLT2) were used to build the optimal SVM model. We found that a tri-nucleotide composition feature of sRNAs achieved an accuracy of 88.35% for SLT2. We also validated the SVM model on the experimentally-detected sRNAs of E. coli and Salmonella Typhi. The proposed model robustly attained accuracies of 81.25% and 88.82% for E. coli K-12 and S. Typhi Ty2, respectively. We confirmed that this method significantly improved the identification of sRNAs in bacteria. Furthermore, we used a sliding window-based method and identified sRNAs from the complete genomes of SLT2, S. Typhi Ty2 and E. coli K-12 with sensitivities of 89.09%, 83.33% and 67.39%, respectively.
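    The tri-nucleotide composition feature mentioned above maps each sequence to a 64-dimensional vector of overlapping 3-mer frequencies before SVM training. A minimal sketch of that feature extraction (the function name and example sequence are illustrative; the SVM training, feature selection, and real sRNA data are omitted):

```python
from itertools import product

# All 64 possible tri-nucleotides over the RNA alphabet, in a fixed order
TRIMERS = ["".join(p) for p in product("ACGU", repeat=3)]

def trinucleotide_composition(seq):
    """Map an RNA sequence to a 64-dim vector of overlapping 3-mer frequencies."""
    seq = seq.upper().replace("T", "U")
    counts = {t: 0 for t in TRIMERS}
    for i in range(len(seq) - 2):
        w = seq[i:i + 3]
        if w in counts:          # skip windows containing ambiguous bases
            counts[w] += 1
    total = sum(counts.values()) or 1
    return [counts[t] / total for t in TRIMERS]

# Example: feature vector for a short, made-up sRNA fragment
vec = trinucleotide_composition("AUGCUUAGGCUAAUGC")
```

    Vectors of this form would then be fed, together with class labels, to any standard SVM implementation.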

  19. Speech processing in mobile environments

    CERN Document Server

    Rao, K Sreenivasa

    2014-01-01

    This book focuses on speech processing in the presence of low-bit-rate coding and varying background environments. The methods presented in the book exploit speech events which are robust in noisy environments. Accurate estimation of these crucial events will be useful for carrying out various speech tasks such as speech recognition, speaker recognition and speech rate modification in mobile environments. The authors provide insights into designing and developing robust methods to process speech in mobile environments, covering temporal and spectral enhancement methods to minimize the effect of noise, and examining methods and models for speech and speaker recognition applications in mobile environments.

  20. Practical speech user interface design

    CERN Document Server

    Lewis, James R

    2010-01-01

    Although speech is the most natural form of communication between humans, most people find using speech to communicate with machines anything but natural. Drawing from psychology, human-computer interaction, linguistics, and communication theory, Practical Speech User Interface Design provides a comprehensive yet concise survey of practical speech user interface (SUI) design. It offers practice-based and research-based guidance on how to design effective, efficient, and pleasant speech applications that people can really use. Focusing on the design of speech user interfaces for IVR application

  1. Features and machine learning classification of connected speech samples from patients with autopsy proven Alzheimer's disease with and without additional vascular pathology.

    Science.gov (United States)

    Rentoumi, Vassiliki; Raoufian, Ladan; Ahmed, Samrah; de Jager, Celeste A; Garrard, Peter

    2014-01-01

    Mixed vascular and Alzheimer-type dementia and pure Alzheimer's disease are both associated with changes in spoken language. These changes have, however, seldom been subjected to systematic comparison. In the present study, we analyzed language samples obtained during the course of a longitudinal clinical study from patients in whom one or other pathology was verified at post mortem. The aims of the study were twofold: first, to confirm the presence of differences in language produced by members of the two groups using quantitative methods of evaluation; and secondly to ascertain the most informative sources of variation between the groups. We adopted a computational approach to evaluate digitized transcripts of connected speech along a range of language-related dimensions. We then used machine learning text classification to assign the samples to one of the two pathological groups on the basis of these features. The classifiers' accuracies were tested using simple lexical features, syntactic features, and more complex statistical and information theory characteristics. Maximum accuracy was achieved when word occurrences and frequencies alone were used. Features based on syntactic and lexical complexity yielded lower discrimination scores, but all combinations of features showed significantly better performance than a baseline condition in which every transcript was assigned randomly to one of the two classes. The classification results illustrate the word content specific differences in the spoken language of the two groups. In addition, those with mixed pathology were found to exhibit a marked reduction in lexical variation and complexity compared to their pure AD counterparts.
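    The finding that word occurrences and frequencies alone gave the best discrimination can be illustrated with a minimal bag-of-words classifier. The sketch below uses a multinomial naive Bayes over word counts; the study's actual classifiers are not specified here, and the toy transcripts and labels are invented purely for illustration:

```python
import math
from collections import Counter

def train_nb(texts, labels):
    """Multinomial naive Bayes over word counts, with add-one smoothing."""
    vocab = {w for t in texts for w in t.split()}
    counts = {c: Counter() for c in set(labels)}
    priors = Counter(labels)
    for t, c in zip(texts, labels):
        counts[c].update(t.split())
    model = {}
    for c in counts:
        total = sum(counts[c].values())
        model[c] = {
            "prior": math.log(priors[c] / len(labels)),
            "logp": {w: math.log((counts[c][w] + 1) / (total + len(vocab)))
                     for w in vocab},
            "unseen": math.log(1 / (total + len(vocab))),
        }
    return model

def classify_nb(model, text):
    """Pick the class with the highest log-posterior for the given text."""
    def score(c):
        m = model[c]
        return m["prior"] + sum(m["logp"].get(w, m["unseen"]) for w in text.split())
    return max(model, key=score)

# Invented toy "transcripts" standing in for the two pathology groups
train = ["the the thing thing there", "went shop bought bread tea",
         "thing there the the", "shop market bought fish"]
labels = ["mixed", "AD", "mixed", "AD"]
model = train_nb(train, labels)
pred = classify_nb(model, "the thing there")
```

    With real transcripts, per-word frequencies in the two training classes play the role these toy counts play here.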

  2. Unequal Error Protection of MELP Encoded Speech Based on LDPC Codes

    Institute of Scientific and Technical Information of China (English)

    张世杰; 陈利军

    2011-01-01

    A scheme of unequal error protection (UEP) is proposed for robust transmission of mixed excitation linear prediction (MELP) compressed speech, based on low-density parity-check (LDPC) codes combined through a Plotkin-type construction. Its performance is evaluated over an additive white Gaussian noise (AWGN) channel. Simulation results show that, compared with an equal error protection (EEP) scheme under the same channel conditions, and especially under severe channel conditions, both the subjective and objective assessments of the transmitted speech quality are greatly improved.
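    The Plotkin-type construction underlying such schemes combines two equal-length codes into one code of twice the length, with minimum distance min(2·d1, d2), so the two input streams receive unequal protection. A toy generator-matrix sketch over GF(2) (the paper's actual LDPC component codes are not reproduced; the tiny codes here are illustrative):

```python
import numpy as np

def plotkin_generator(G1, G2):
    """(u | u+v) construction: codewords are (u, u XOR v) for u in C1, v in C2."""
    k1, n = G1.shape
    k2, n2 = G2.shape
    assert n == n2, "component codes must have equal length"
    top = np.hstack([G1, G1])                       # u appears in both halves
    bot = np.hstack([np.zeros((k2, n), int), G2])   # v only affects the second half
    return np.vstack([top, bot]) % 2

def min_distance(G):
    """Minimum Hamming weight over all nonzero codewords (brute force)."""
    k, n = G.shape
    best = n
    for m in range(1, 2 ** k):
        u = np.array([(m >> i) & 1 for i in range(k)])
        best = min(best, int((u @ G % 2).sum()))
    return best

G1 = np.eye(2, dtype=int)       # trivial [2,2,1] code (weakly protected stream)
G2 = np.array([[1, 1]])         # [2,1,2] repetition code (strongly protected stream)
G = plotkin_generator(G1, G2)   # [4,3] code with d = min(2*1, 2) = 2
```

    In a UEP scheme, the perceptually important MELP bits would be routed to the better-protected component.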

  3. Feature extraction and models for speech: An overview

    Science.gov (United States)

    Schroeder, Manfred

    2002-11-01

    Modeling of speech has a long history, beginning with Count von Kempelen's 1770 mechanical speaking machine. Even then human vowel production was seen as resulting from a source (the vocal cords) driving a physically separate resonator (the vocal tract). Homer Dudley's 1928 frequency-channel vocoder and many of its descendants are based on the same successful source-filter paradigm. For linguistic studies as well as practical applications in speech recognition, compression, and synthesis (see M. R. Schroeder, Computer Speech), the extant models require the (often difficult) extraction of numerous parameters such as the fundamental and formant frequencies and various linguistic distinctive features. Some of these difficulties were obviated by the introduction of linear predictive coding (LPC) in 1967, in which the filter part is an all-pole filter, reflecting the fact that for non-nasalized vowels the vocal tract is well approximated by an all-pole transfer function. In the now ubiquitous code-excited linear prediction (CELP), the source part is replaced by a code book which (together with a perceptual error criterion) permits speech compression to very low bit rates at high speech quality for the Internet and cell phones.
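    The LP analysis underlying LPC and CELP can be sketched in a few lines: the autocorrelation of a speech frame followed by the Levinson-Durbin recursion yields the coefficients of the all-pole filter. A minimal sketch on a synthetic signal (real coders add windowing, pre-emphasis, and coefficient quantization, all omitted here):

```python
import numpy as np

def lpc(x, order):
    """LPC coefficients a (with a[0] = 1) via autocorrelation + Levinson-Durbin."""
    n = len(x)
    r = np.array([x[:n - k] @ x[k:] for k in range(order + 1)])  # autocorrelation
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + a[1:i] @ r[1:i][::-1]
        k = -acc / err                      # reflection coefficient
        prev = a.copy()
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        err *= 1.0 - k * k                  # residual prediction error
    return a, err

# Demo on a synthetic AR(2) "vocal tract": x[n] = 0.9 x[n-1] - 0.5 x[n-2] + e[n]
rng = np.random.default_rng(0)
e = rng.standard_normal(20000)
x = np.zeros_like(e)
for n in range(2, len(x)):
    x[n] = 0.9 * x[n - 1] - 0.5 * x[n - 2] + e[n]
a, _ = lpc(x, 2)    # expect a close to [1, -0.9, 0.5]
```

    The residual `e` plays the role of the excitation that CELP replaces with codebook entries.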

  4. Human phoneme recognition depending on speech-intrinsic variability.

    Science.gov (United States)

    Meyer, Bernd T; Jürgens, Tim; Wesker, Thorsten; Brand, Thomas; Kollmeier, Birger

    2010-11-01

    The influence of different sources of speech-intrinsic variation (speaking rate, effort, style and dialect or accent) on human speech perception was investigated. In listening experiments with 16 listeners, confusions of consonant-vowel-consonant (CVC) and vowel-consonant-vowel (VCV) sounds in speech-weighted noise were analyzed. Experiments were based on the OLLO logatome speech database, which was designed for a man-machine comparison. It contains utterances spoken by 50 speakers from five dialect/accent regions and covers several intrinsic variations. By comparing results depending on intrinsic and extrinsic variations (i.e., different levels of masking noise), the degradation induced by variabilities can be expressed in terms of the SNR. The spectral level distance between the respective speech segment and the long-term spectrum of the masking noise was found to be a good predictor for recognition rates, while phoneme confusions were influenced by the distance to spectrally close phonemes. An analysis based on transmitted information of articulatory features showed that voicing and manner of articulation are comparatively robust cues in the presence of intrinsic variations, whereas the coding of place is more degraded. The database and detailed results have been made available for comparisons between human speech recognition (HSR) and automatic speech recognizers (ASR).

  5. NICT/ATR Chinese-Japanese-English Speech-to-Speech Translation System

    Institute of Scientific and Technical Information of China (English)

    Tohru Shimizu; Yutaka Ashikari; Eiichiro Sumita; ZHANG Jinsong; Satoshi Nakamura

    2008-01-01

    This paper describes the latest version of the Chinese-Japanese-English handheld speech-to-speech translation system developed by NICT/ATR, which is now ready to be deployed for travelers. With the entire speech-to-speech translation function implemented in one terminal, it realizes real-time, location-free speech-to-speech translation. A new noise-suppression technique notably improves the speech recognition performance. Corpus-based approaches to speech recognition, machine translation, and speech synthesis enable coverage of a wide variety of topics and portability to other languages. Test results show that the character accuracy of speech recognition is 82%-94% for Chinese speech, and the bilingual evaluation understudy (BLEU) score of machine translation is 0.55-0.74 for Chinese-Japanese and Chinese-English.

  6. Algorithms and Software for Predictive and Perceptual Modeling of Speech

    CERN Document Server

    Atti, Venkatraman

    2010-01-01

    From the early pulse code modulation-based coders to some of the recent multi-rate wideband speech coding standards, the area of speech coding made several significant strides with an objective to attain high quality of speech at the lowest possible bit rate. This book presents some of the recent advances in linear prediction (LP)-based speech analysis that employ perceptual models for narrow- and wide-band speech coding. The LP analysis-synthesis framework has been successful for speech coding because it fits well the source-system paradigm for speech synthesis. Limitations associated with th

  7. Recognizing GSM Digital Speech

    OpenAIRE

    Gallardo-Antolín, Ascensión; Peláez-Moreno, Carmen; Díaz-de-María, Fernando

    2005-01-01

    The Global System for Mobile (GSM) environment encompasses three main problems for automatic speech recognition (ASR) systems: noisy scenarios, source coding distortion, and transmission errors. The first one has already received much attention; however, source coding distortion and transmission errors must be explicitly addressed. In this paper, we propose an alternative front-end for speech recognition over GSM networks. This front-end is specially conceived to be effective against source c...

  9. Speaker Independent Speech Recognition of Isolated Words in Room Environment

    Directory of Open Access Journals (Sweden)

    M. Tabassum

    2017-04-01

    In this paper, the process of recognizing some important words from a large vocabulary is demonstrated based on a combination of dynamic and instantaneous features of the speech spectrum. There are many procedures for recognizing a word by its vowel, but this paper presents highly effective speaker-independent speech recognition under typical room-environment noise conditions. To distinguish several isolated words by the sounds of their different vowels, two important features, pitch and formants, are extracted from speech signals collected from a number of random male and female speakers. The extracted features are then analysed for the particular utterances to train the system. The specific objectives of this work are to implement an isolated-word, automatic speech recognizer capable of recognizing and responding to speech, and an audio interfacing system between human and machine for effective human-machine interaction. The whole system has been tested using computer codes, and the result was satisfactory in almost 90% of cases. However, the system might sometimes be confused by similar vowel sounds.
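    Of the two features, pitch is the simpler to extract: the fundamental period appears as the strongest autocorrelation peak within the plausible lag range. A minimal sketch (formant extraction, which typically requires LPC root-finding, is omitted; all parameter values are illustrative):

```python
import numpy as np

def estimate_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Pitch (Hz) from the autocorrelation peak inside the [fmin, fmax] range."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo = int(fs / fmax)                  # shortest plausible period, in samples
    hi = int(fs / fmin)                  # longest plausible period
    lag = lo + int(np.argmax(ac[lo:hi]))
    return fs / lag

# Demo: a 200 Hz tone sampled at 8 kHz should yield a pitch estimate near 200 Hz
fs = 8000
t = np.arange(int(0.03 * fs)) / fs       # one 30 ms analysis frame
pitch = estimate_pitch(np.sin(2 * np.pi * 200 * t), fs)
```

    On real voiced speech frames the same peak-picking works, though practical systems add voicing decisions and octave-error checks.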

  10. Potential role of monkey inferior parietal neurons coding action semantic equivalences as precursors of parts of speech

    Science.gov (United States)

    Yamazaki, Yumiko; Yokochi, Hiroko; Tanaka, Michio; Okanoya, Kazuo; Iriki, Atsushi

    2010-01-01

    The anterior portion of the inferior parietal cortex possesses comprehensive representations of actions embedded in behavioural contexts. Mirror neurons, which respond to both self-executed and observed actions, exist in this brain region in addition to those originally found in the premotor cortex. We found that parietal mirror neurons responded differentially to identical actions embedded in different contexts. Another type of parietal mirror neuron represents an inverse and complementary property of responding equally to dissimilar actions made by itself and others for an identical purpose. Here, we propose a hypothesis that these sets of inferior parietal neurons constitute a neural basis for encoding the semantic equivalence of various actions across different agents and contexts. The neurons have mirror neuron properties, and they encoded generalization of agents, differentiation of outcomes, and categorization of actions that led to common functions. By integrating the activities of these mirror neurons with various codings, we further suggest that in the ancestral primates' brains, these various representations of meaningful action led to the gradual establishment of equivalence relations among the different types of actions, by sharing common action semantics. Such differential codings of the components of actions might represent precursors to the parts of protolanguage, such as gestural communication, which are shared among various members of a society. Finally, we suggest that the inferior parietal cortex serves as an interface between this action semantics system and other higher semantic systems, through common structures of action representation that mimic language syntax. PMID:20119879

  11. Hate Speech on Campus: A Practical Approach.

    Science.gov (United States)

    Hogan, Patrick

    1997-01-01

    Looks at arguments concerning hate speech and speech codes on college campuses, arguing that speech codes are likely to be of limited value in achieving civil rights objectives, and that there are alternatives less harmful to civil liberties and more successful in promoting civil rights. Identifies specific goals, and considers how restriction of…

  12. An Approach to Hide Secret Speech Information

    Institute of Scientific and Technical Information of China (English)

    WU Zhi-jun; DUAN Hai-xin; LI Xing

    2006-01-01

    This paper presents an approach to hiding secret speech information in a code-excited linear prediction (CELP)-based speech coding scheme by adopting an analysis-by-synthesis (ABS)-based algorithm for speech information hiding and extraction, for the purpose of secure speech communication. The secret speech is coded at 2.4 Kb/s with mixed excitation linear prediction (MELP) and embedded in CELP-type public speech. The ABS algorithm adopts the speech synthesizer in the speech coder, so speech embedding and coding are synchronous, i.e. a fusion of the public and secret speech information data. An experiment embedding 2.4 Kb/s MELP secret speech in G.728-coded public speech transmitted via the public switched telephone network (PSTN) shows that the proposed approach satisfies the requirements of information hiding, meets the speech quality constraints of secure communication, and achieves a high hiding capacity of 3.2 Kb/s on average with excellent speech quality, while complicating speaker recognition.

  13. When speech enhances Spatial Musical Association of Response Codes: Joint spatial associations of pitch and timbre in nonmusicians.

    Science.gov (United States)

    Weis, Tina; Estner, Barbara; Lachmann, Thomas

    2016-01-01

    Previous studies have shown that the effect of the Spatial Musical Association of Response Codes (SMARC) depends on various features, such as task conditions (whether pitch height is implicit or explicit), response dimension (horizontal vs. vertical), presence or absence of a reference tone, and former musical training of the participants. In the present study, we investigated the effects of pitch range and timbre: in particular, how timbre (piano vs. vocal) contributes to the horizontal and vertical SMARC effect in nonmusicians under varied pitch range conditions. Nonmusicians performed a timbre judgement task in which the pitch range was either small (6 or 8 semitone steps) or large (9 or 12 semitone steps) in a horizontal and a vertical response setting. For piano sounds, SMARC effects were observed in all conditions. For the vocal sounds, in contrast, SMARC effects depended on pitch range. We concluded that the occurrence of the SMARC effect, especially in horizontal response settings, depends on the interaction of the timbre (vocal and piano) and pitch range if vocal and instrumental sounds are combined in one experiment: the human voice enhances the attention, both to the vocal and the instrumental sounds.

  14. Dynamic parameters’ identification for the feeding system of computer numerical control machine tools stimulated by G-code

    Directory of Open Access Journals (Sweden)

    Guangsheng Chen

    2015-08-01

    This study proposed a dynamic parameter identification method for the feeding system of computer numerical control machine tools based on internal sensors. A simplified control model and a linear identification model of the feeding system were established, in which the input and output signals come from sensors embedded in the machine tools, and the dynamic parameters of the feeding system, including the equivalent inertia, equivalent damping, worktable damping, and the overall stiffness of the mechanical system, were solved by the least-squares method. Using a high-order Taylor expansion, the nonlinear Stribeck friction model was linearized and its parameters were obtained in the same way. To verify the validity and effectiveness of the identification method, identification experiments, circular motion testing, and simulations were conducted. The results obtained were stable and suggested that the inertia and damping identification experiments converged quickly. Stiffness identification experiments showed some deviation from simulation due to the influence of geometric error and stiffness nonlinearity. However, the identification results are still of reference significance, and the method is convenient, effective, and suited to industrial conditions.
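    The least-squares step described above can be sketched directly: with torque, velocity, and acceleration logged from internal sensors, a model such as τ = J·ω̇ + B·ω + C·sign(ω) is linear in the unknowns (J, B, C) and solves in one call. The data below are synthetic, and the simple Coulomb friction term stands in for the paper's linearized Stribeck model:

```python
import numpy as np

# Synthetic "sensor logs" with known ground-truth parameters to recover
J_true, B_true, C_true = 0.012, 0.35, 0.8   # inertia, viscous damping, Coulomb friction
rng = np.random.default_rng(1)
t = np.linspace(0, 2, 2000)
omega = 5 * np.sin(2 * np.pi * 1.5 * t)     # velocity profile from the drive log
domega = np.gradient(omega, t)              # acceleration derived from the same log
tau = J_true * domega + B_true * omega + C_true * np.sign(omega)
tau += 0.01 * rng.standard_normal(t.size)   # measurement noise

# Regression matrix: tau = [domega, omega, sign(omega)] @ [J, B, C]
A = np.column_stack([domega, omega, np.sign(omega)])
params, *_ = np.linalg.lstsq(A, tau, rcond=None)
J_est, B_est, C_est = params
```

    In the study's setting, the excitation comes from G-code motion profiles rather than the sinusoid assumed here, but the normal-equation structure is the same.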

  15. Public Speech.

    Science.gov (United States)

    Green, Thomas F.

    1994-01-01

    Discusses the importance of public speech in society, noting the power of public speech to create a world and a public. The paper offers a theory of public speech, identifies types of public speech, and types of public speech fallacies. Two ways of speaking of the public and of public life are distinguished. (SM)

  16. Processing Code and Maintenance and Safety Operation for CNC Machining Center

    Institute of Scientific and Technical Information of China (English)

    罗昊

    2012-01-01

    This paper introduces the CNC machining center processing code, the CNC machining center maintenance procedures, and the CNC machining center safe technical operation rules, in the hope that they will be of some help to CNC machining personnel.

  17. Application of a Distance-Coded Reference Measuring System on the Rotary Swivel Drive of a 3D Laser Cutting Machine

    Institute of Scientific and Technical Information of China (English)

    翟东升; 钟昇; 洪超

    2013-01-01

    This paper introduces the method of applying a measuring system with distance-coded reference marks to the rotary swivel drive of a 3D laser cutting machine. The method and conclusions provide a reference for the application of distance-coded reference measuring systems on other machine tools.

  18. Speech Problems

    Science.gov (United States)

    ... of your treatment plan may include seeing a speech therapist , a person who is trained to treat speech disorders. How often you have to see the speech therapist will vary — you'll probably start out seeing ...

  19. Speaking Code

    DEFF Research Database (Denmark)

    Cox, Geoff

    …alternatives to mainstream development, from performances of the live-coding scene to the organizational forms of commons-based peer production; the democratic promise of social media and their paradoxical role in suppressing political expression; and the market's emptying out of possibilities for free… development, Speaking Code unfolds an argument to undermine the distinctions between criticism and practice, and to emphasize the aesthetic and political aspects of software studies. Not reducible to its functional aspects, program code mirrors the instability inherent in the relationship of speech… expression in the public realm. The book's line of argument defends language against its invasion by economics, arguing that speech continues to underscore the human condition, however paradoxical this may seem in an era of pervasive computing.

  20. Machine medical ethics

    CERN Document Server

    Pontier, Matthijs

    2015-01-01

    The essays in this book, written by researchers from both humanities and sciences, describe various theoretical and experimental approaches to adding medical ethics to a machine in medical settings. Medical machines are in close proximity with human beings, and getting closer: with patients who are in vulnerable states of health, who have disabilities of various kinds, with the very young or very old, and with medical professionals. In such contexts, machines are undertaking important medical tasks that require emotional sensitivity, knowledge of medical codes, human dignity, and privacy. As machine technology advances, ethical concerns become more urgent: should medical machines be programmed to follow a code of medical ethics? What theory or theories should constrain medical machine conduct? What design features are required? Should machines share responsibility with humans for the ethical consequences of medical actions? How ought clinical relationships involving machines to be modeled? Is a capacity for e...

  1. Are "Hate Speech" Codes Unconstitutional?

    Science.gov (United States)

    Schimmel, David

    1993-01-01

    In a case concerning a teenager charged with cross burning, the Supreme Court, in a 9-0 decision, ruled that a St. Paul, Minnesota, ordinance was unconstitutional. Summarizes Justice Scalia's opinion and three concurring opinions that reflect bitter disagreement among the justices. Discusses the meaning of this decision and its implications for…

  2. Hate Speech and the First Amendment.

    Science.gov (United States)

    Rainey, Susan J.; Kinsler, Waren S.; Kannarr, Tina L.; Reaves, Asa E.

    This document is comprised of California state statutes, federal legislation, and court litigation pertaining to hate speech and the First Amendment. The document provides an overview of California education code sections relating to the regulation of speech; basic principles of the First Amendment; government efforts to regulate hate speech,…

  3. Debugging the virtual machine

    Energy Technology Data Exchange (ETDEWEB)

    Miller, P.; Pizzi, R.

    1994-09-02

    A computer program is really nothing more than a virtual machine built to perform a task. The program's source code expresses abstract constructs using low-level language features. When a virtual machine breaks, it can be very difficult to debug because typical debuggers provide only low-level machine implementation information to the software engineer. We believe that the debugging task can be simplified by introducing aspects of the abstract design into the source code. We introduce OODIE, an object-oriented language extension that allows programmers to specify a virtual debugging environment which includes the design and abstract data types of the virtual machine.

  4. Speech Matters

    DEFF Research Database (Denmark)

    Hasse Jørgensen, Stina

    2011-01-01

    About Speech Matters, the Greek curator Katarina Gregos's exhibition at the Danish Pavilion, Venice Biennale 2011.

  5. Speech Development

    Science.gov (United States)

    ... The speech-language pathologist should consistently assess your child’s speech and language development, as well as screen for hearing problems (with ... and caregivers play a vital role in a child’s speech and language development. It is important that you talk to your ...

  6. Intelligibility Enhancement of Speech in Noise

    OpenAIRE

    Valentini-Botinhao, Cassia; Yamagishi, Junichi; King, Simon

    2014-01-01

    Speech technology can facilitate human-machine interaction and create new communication interfaces. Text-to-Speech (TTS) systems provide speech output for dialogue, notification and reading applications, as well as personalized voices for people who have lost the use of their own. TTS systems are built to produce synthetic voices that should sound as natural, expressive and intelligible as possible and, if necessary, be similar to a particular speaker. Although naturalness is an important requirement…

  7. NC-code post-processor for a 4-axis machining center with a FANUC system, based on NX

    Institute of Scientific and Technical Information of China (English)

    谭大庆

    2013-01-01

    A customized NC-code post-processor, built on the universal template of the NX post-processor builder, is designed to meet the requirements of a 4-axis machining center equipped with a FANUC CNC system.

  8. How should a speech recognizer work?

    NARCIS (Netherlands)

    Scharenborg, O.E.; Norris, D.G.; Bosch, L.F.M. ten; McQueen, J.M.

    2005-01-01

    Although researchers studying human speech recognition (HSR) and automatic speech recognition (ASR) share a common interest in how information processing systems (human or machine) recognize spoken language, there is little communication between the two disciplines. We suggest that this lack of communication…

  9. New low-bit-rate speech coding scheme based on compressed sensing

    Institute of Scientific and Technical Information of China (English)

    叶蕾; 杨震; 孙林慧

    2011-01-01

    Utilizing the sparsity of the high-frequency wavelet transform coefficients of the speech signal and the theory of compressed sensing, a new low-bit-rate speech coding scheme based on compressed sensing is proposed. The compressed sensing reconstruction of the high-frequency wavelet coefficients is carried out by l1-norm optimization and by codebook-prediction reconstruction, respectively. The l1 approach reconstructs large-amplitude samples well and suits not only speech but also music signals, an advantage that traditional linear predictive coding cannot match. The codebook-prediction approach estimates the locations of the sparse coefficients well and, because it does not need the basis pursuit or matching pursuit algorithms commonly used in compressed sensing reconstruction, reduces the amount of computation. Using the two methods jointly exploits the advantages of each and further improves the quality of the reconstructed speech.
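The l1-based route can be sketched with a generic compressed-sensing recovery of a sparse vector (standing in for the high-frequency wavelet coefficients); the problem sizes, random sensing matrix, and ISTA solver below are illustrative choices, not the paper's configuration:

```python
import numpy as np

# Recover a k-sparse vector x from m < n random measurements y = A @ x
# by l1-regularised least squares, solved with ISTA (illustrative sketch).
rng = np.random.default_rng(0)
n, m, k = 256, 96, 8                      # length, measurements, sparsity
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.normal(0, 1, k)

A = rng.normal(0, 1.0 / np.sqrt(m), (m, n))   # random sensing matrix
y = A @ x_true

# ISTA iteration: x <- soft_threshold(x + t * A^T (y - A x), t * lam)
t = 1.0 / np.linalg.norm(A, 2) ** 2           # step size <= 1 / ||A||^2
lam = 0.01
x = np.zeros(n)
for _ in range(1000):
    x = x + t * A.T @ (y - A @ x)
    x = np.sign(x) * np.maximum(np.abs(x) - t * lam, 0.0)

print(np.linalg.norm(x - x_true) / np.linalg.norm(x_true))  # relative error
```

With enough measurements relative to the sparsity level, the soft-thresholded iterations converge to an accurate estimate of the sparse coefficient vector.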

  10. A New Scheme of Speech Coding Based on Compressed Sensing and a Sinusoidal Dictionary

    Institute of Scientific and Technical Information of China (English)

    李尚靖; 朱琦; 朱俊华

    2015-01-01

    A novel speech coding method based on compressed sensing is proposed in this paper. Under the compressed sensing framework, the measurements produced by a row-echelon projection matrix retain part of the time-domain features of the speech; a sinusoidal dictionary and matching pursuit are then used to model the measurement sequence. The model parameters of each frame are encoded with methods suited to their respective characteristics. At the decoder, the basis pursuit algorithm reconstructs synthesized speech from the decoded measurements, and a low-pass post-filter improves the auditory quality. Simulation results show that at low bit rates (2.8~5.7 kbps) the synthesized speech achieves average MOS scores between 2.81 and 3.23, a good coding result within the compressed sensing framework.
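The modelling step, matching pursuit over a sinusoidal dictionary, can be sketched as follows (the frame length, frequency grid, and atom count are illustrative, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 160                                   # frame length (20 ms at 8 kHz)
n = np.arange(N)
freqs = np.arange(1, 80) * np.pi / 80     # candidate digital frequencies
# dictionary of unit-norm cosine and sine atoms
D = np.concatenate([np.cos(np.outer(freqs, n)), np.sin(np.outer(freqs, n))])
D /= np.linalg.norm(D, axis=1, keepdims=True)

# toy "measurement sequence": two dictionary sinusoids plus a little noise
target = np.cos(freqs[10] * n) + 0.5 * np.sin(freqs[30] * n)
target += 0.05 * rng.normal(size=N)

residual = target.copy()
model = np.zeros(N)
for _ in range(4):                        # greedily pick the best atom
    corr = D @ residual
    best = int(np.argmax(np.abs(corr)))
    model += corr[best] * D[best]
    residual -= corr[best] * D[best]

print(np.linalg.norm(residual) / np.linalg.norm(target))  # residual ratio
```

A few greedy atom selections capture the sinusoidal structure, leaving only the noise floor in the residual; the selected frequencies and amplitudes are what a coder would quantize and transmit.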

  11. Feature selection for speech emotion recognition in Spanish and Basque: on the use of machine learning to improve human-computer interaction.

    Directory of Open Access Journals (Sweden)

    Andoni Arruti

    Study of emotions in human-computer interaction is a growing research area. This paper shows an attempt to select the most significant features for emotion recognition in spoken Basque and Spanish using different methods for feature selection. The RekEmozio database was used as the experimental data set. Several machine learning paradigms were used for the emotion classification task. Experiments were executed in three phases, using different sets of features as classification variables in each phase. Moreover, feature subset selection was applied at each phase in order to search for the most relevant feature subset. The three-phase approach was selected to check the validity of the proposed approach. The results show that an instance-based learning algorithm using feature subset selection techniques based on evolutionary algorithms is the best machine learning paradigm in automatic emotion recognition, with all the different feature sets, obtaining a mean emotion recognition rate of 80.05% in Basque and 74.82% in Spanish. In order to check the goodness of the proposed process, a greedy search approach (FSS-Forward) has been applied and a comparison between them is provided. Based on the achieved results, a set of the most relevant non-speaker-dependent features is proposed for both languages and new perspectives are suggested.

  12. Feature selection for speech emotion recognition in Spanish and Basque: on the use of machine learning to improve human-computer interaction.

    Science.gov (United States)

    Arruti, Andoni; Cearreta, Idoia; Alvarez, Aitor; Lazkano, Elena; Sierra, Basilio

    2014-01-01

    Study of emotions in human-computer interaction is a growing research area. This paper shows an attempt to select the most significant features for emotion recognition in spoken Basque and Spanish using different methods for feature selection. The RekEmozio database was used as the experimental data set. Several machine learning paradigms were used for the emotion classification task. Experiments were executed in three phases, using different sets of features as classification variables in each phase. Moreover, feature subset selection was applied at each phase in order to search for the most relevant feature subset. The three-phase approach was selected to check the validity of the proposed approach. The results show that an instance-based learning algorithm using feature subset selection techniques based on evolutionary algorithms is the best machine learning paradigm in automatic emotion recognition, with all the different feature sets, obtaining a mean emotion recognition rate of 80.05% in Basque and 74.82% in Spanish. In order to check the goodness of the proposed process, a greedy search approach (FSS-Forward) has been applied and a comparison between them is provided. Based on the achieved results, a set of the most relevant non-speaker-dependent features is proposed for both languages and new perspectives are suggested.
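The greedy FSS-Forward baseline can be sketched generically; here synthetic data and a nearest-centroid classifier stand in for the RekEmozio data and the instance-based learner used in the paper:

```python
import numpy as np

# Forward feature subset selection: repeatedly add the single feature that
# most improves classification accuracy, stopping when nothing improves.
rng = np.random.default_rng(0)
n, d = 200, 6
y = rng.integers(0, 2, n)
X = rng.normal(size=(n, d))
X[:, 0] += 2.0 * y            # only features 0 and 1 carry class information
X[:, 1] -= 1.5 * y

def accuracy(cols):
    Xs = X[:, cols]
    c0, c1 = Xs[y == 0].mean(0), Xs[y == 1].mean(0)
    pred = np.linalg.norm(Xs - c1, axis=1) < np.linalg.norm(Xs - c0, axis=1)
    return np.mean(pred == y)

selected, remaining, best_acc = [], list(range(d)), 0.0
while remaining:
    acc, f = max((accuracy(selected + [f]), f) for f in remaining)
    if acc <= best_acc:
        break                 # no remaining feature improves accuracy
    selected.append(f)
    remaining.remove(f)
    best_acc = acc

print(sorted(selected))       # the informative features should be among these
```

The same wrapper structure applies regardless of the underlying classifier; swapping in a stronger learner or a cross-validated accuracy estimate only changes the `accuracy` function.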

  13. Experimental study on phase perception in speech

    Institute of Scientific and Technical Information of China (English)

    BU Fanliang; CHEN Yanpu

    2003-01-01

    As the human ear is dull to phase in speech, little attention has been paid to phase information in speech coding. In fact, the perceptual quality of speech may be degraded if the phase distortion is very large. The perceptual effect of the STFT (short-time Fourier transform) phase spectrum is studied by subjective auditory hearing tests. Three main conclusions are: (1) if the phase information is neglected completely, the subjective quality of the reconstructed speech may be very poor; (2) whether the neglected phase is in the low-frequency band or the high-frequency band, the difference from the original speech can be perceived by ear; (3) it is very difficult for the human ear to perceive a difference in quality between the original speech and the reconstructed speech when the phase quantization step size is smaller than π/7.
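Conclusion (3) can be probed numerically with a sketch like the following; a single non-overlapping DFT frame stands in for a real STFT analysis of speech, and the frame length is an arbitrary choice:

```python
import numpy as np

# Quantise the phase spectrum of one frame with step pi/7 and measure the
# reconstruction SNR (illustrative numerical probe, not the paper's test).
rng = np.random.default_rng(0)
frame = rng.normal(size=512)               # stand-in for one signal frame

spec = np.fft.rfft(frame)
mag, phase = np.abs(spec), np.angle(spec)

step = np.pi / 7                           # quantisation step from the paper
phase_q = np.round(phase / step) * step
recon = np.fft.irfft(mag * np.exp(1j * phase_q), n=512)

snr_db = 10 * np.log10(np.sum(frame**2) / np.sum((frame - recon)**2))
print(round(snr_db, 1))                    # reconstruction SNR in dB
```

A uniform phase error of at most π/14 per bin leaves the reconstruction close to the original, consistent with the finding that such quantization is hard to hear.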

  14. Joint source channel coding using arithmetic codes

    CERN Document Server

    Bi, Dongsheng

    2009-01-01

    Based on the encoding process, arithmetic codes can be viewed as tree codes, and current proposals for decoding arithmetic codes with forbidden symbols belong to sequential decoding algorithms and their variants. In this monograph, we propose a new way of looking at arithmetic codes with forbidden symbols. If a limit is imposed on the maximum value of a key parameter in the encoder, this modified arithmetic encoder can also be modeled as a finite state machine and the code generated can be treated as a variable-length trellis code. The number of states used can be reduced, and techniques used for…
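The forbidden-symbol idea behind such codes can be sketched as follows: a slice of probability space is reserved and never used by the encoder, so a decoder whose value lands in that slice knows the stream was corrupted. This toy float-based coder is only illustrative; a practical coder uses finite-precision integer arithmetic (and the FSM/trellis view the monograph develops):

```python
EPS = 0.05                                  # probability reserved for the gap
P = {"a": 0.6 * (1 - EPS), "b": 0.4 * (1 - EPS)}   # scaled source probabilities

def intervals():
    cum, out = 0.0, {}
    for s, p in P.items():
        out[s] = (cum, cum + p)
        cum += p
    return out                               # [cum, 1.0) is the forbidden gap

def encode(symbols):
    lo, hi = 0.0, 1.0
    iv = intervals()
    for s in symbols:
        a, b = iv[s]
        lo, hi = lo + (hi - lo) * a, lo + (hi - lo) * b
    return (lo + hi) / 2                     # any value in the final interval

def decode(value, n):
    lo, hi = 0.0, 1.0
    iv = intervals()
    out = []
    for _ in range(n):
        x = (value - lo) / (hi - lo)
        for s, (a, b) in iv.items():
            if a <= x < b:
                out.append(s)
                lo, hi = lo + (hi - lo) * a, lo + (hi - lo) * b
                break
        else:
            return out, True                 # value fell in the gap: error
    return out, False

msg = list("abbaab")
code = encode(msg)
print(decode(code, len(msg)))                # recovers msg with no error flag
```

Shrinking the interval by the symbol probabilities and skipping the gap costs a small amount of rate (here about `-log2(1 - EPS)` bits per symbol) in exchange for continuous error detection.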

  15. An Analysis of Chinese College Students’ Reasons for and Attitudes towards Chinese-English Code-Switching (CS) in Their Daily Speech

    Institute of Scientific and Technical Information of China (English)

    唐慧莹

    2015-01-01

    Since English became a compulsory subject for Chinese college students in 1986, English learning and speaking have been popular in Chinese universities. Against this background, Chinese college students tend to code-switch between Chinese and English, that is, to mix English into their Chinese speech. In the past decades, many researchers have studied code-switching (CS) from different perspectives. This paper reviews what researchers have said about CS in previous studies.

  16. An Analysis of Chinese College Students’ Reasons for and Attitudes towards Chinese-English Code-Switching (CS) in Their Daily Speech

    Institute of Scientific and Technical Information of China (English)

    2015-01-01

    Since English became a compulsory subject for Chinese college students in 1986, English learning and speaking have been popular in Chinese universities. Against this background, Chinese college students tend to code-switch between Chinese and English, that is, to mix English into their Chinese speech. In the past decades, many researchers have studied code-switching (CS) from different perspectives. This paper reviews what researchers have said about CS in previous studies.

  17. Compiling scheme using abstract state machines

    OpenAIRE

    2003-01-01

    The project investigates the use of Abstract State Machines in the process of computer program compilation. Compilation produces machine code from a source program written in a high-level language; a compiler is a program written for that purpose. Machine code is the computer-readable representation of sequences of computer instructions. An Abstract State Machine (ASM) is a notional computing machine, developed by Yuri Gurevich, for accurately and easily representing the semantics of...

  18. Speech Compression of Thai Dialects with Low-Bit-Rate Speech Coders

    Directory of Open Access Journals (Sweden)

    Suphattharachai Chomphan

    2012-01-01

    Problem statement: In modern speech communication at low bit rates, speech coding significantly deteriorates the characteristics of the coded speech. Considering the dialects of Thai, the coding quality of the four main dialects spoken by Thai people residing in the four core regions (central, north, northeast and south) has not been studied. Approach: This study presents a comparative study of the coding quality of the four main Thai dialects using different low-bit-rate speech coders, namely the Conjugate Structure Algebraic Code-Excited Linear Predictive (CS-ACELP) coder and the Multi-Pulse based Code-Excited Linear Predictive (MP-CELP) coder. Objective and subjective tests were conducted to evaluate the coding quality of the four main dialects. Results: In both tests, the coding quality of the North dialect is highest, while that of the Northeast dialect is lowest. Moreover, the coding quality of male speech is mostly higher than that of female speech. Conclusion: The study clearly shows that the coding qualities of the Thai dialects differ.

  19. Hateful Help--A Practical Look at the Issue of Hate Speech.

    Science.gov (United States)

    Shelton, Michael W.

    Many college and university administrators have responded to the recent increase in hateful incidents on campus by putting hate speech codes into place. The establishment of speech codes has sparked a heated debate over the impact that such codes have upon free speech and First Amendment values. Some commentators have suggested that viewing hate…

  20. Soft computing in machine learning

    CERN Document Server

    Park, Jooyoung; Inoue, Atsushi

    2014-01-01

    As users and consumers now demand smarter devices, intelligent systems are being revolutionized by machine learning. Machine learning, as part of intelligent systems, is already one of the most critical components in everyday tools, ranging from search engines and credit card fraud detection to stock market analysis. Machines can be trained to perform certain tasks so that they automatically detect, diagnose, and solve a variety of problems. Intelligent systems have made rapid progress in advancing the state of the art in machine learning based on smart and deep perception. Using machine learning, intelligent systems find wide application in automatic speech recognition, natural language processing, medical diagnosis, bioinformatics, and robot locomotion. This book aims to introduce how to handle substantial amounts of data, teach machines, and improve decision-making models, and it specializes in the development of advanced intelligent systems through machine learning. It...

  1. Autocoding State Machine in Erlang

    DEFF Research Database (Denmark)

    Guo, Yu; Hoffman, Torben; Gunder, Nicholas

    2008-01-01

    This paper presents an autocoding tool suite, which supports development of state machines in a model-driven fashion, where models are central to all phases of the development process. The tool suite, which is built on the Eclipse platform, provides facilities for the graphical specification of a state machine model. Once the state machine is specified, it is used as input to a code generation engine that generates source code in Erlang.
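The generation step can be sketched generically: a declarative state-machine model is walked and Erlang source is emitted. The model format and the emitted skeleton here are invented for illustration; the actual tool suite targets richer Erlang behaviours:

```python
# Minimal model-to-code sketch: emit Erlang-style state functions from a
# declarative transition table (illustrative format, not the tool's own).
model = {
    "name": "door",
    "transitions": [            # (state, event, next_state)
        ("closed", "open", "opened"),
        ("opened", "close", "closed"),
    ],
}

def generate_erlang(m):
    lines = [f"-module({m['name']})."]
    for state, event, nxt in m["transitions"]:
        # one function clause per (state, event) pair, gen_fsm style
        lines.append(f"{state}({event}, Data) -> {{next_state, {nxt}, Data}}.")
    return "\n".join(lines)

print(generate_erlang(model))
```

Because the model is the single source of truth, regenerating the code after a model change keeps diagram and implementation in sync, which is the point of the model-driven workflow the paper describes.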

  2. Speech recognition from spectral dynamics

    Indian Academy of Sciences (India)

    Hynek Hermansky

    2011-10-01

    Information is carried in changes of a signal. The paper starts by revisiting Dudley’s concept of the carrier nature of speech. It points to its close connection to modulation spectra of speech and argues against short-term spectral envelopes as dominant carriers of the linguistic information in speech. The history of spectral representations of speech is briefly discussed. Some of the history of the gradual infusion of the modulation spectrum concept into automatic speech recognition (ASR) comes next, pointing to the relationship of modulation spectrum processing to well-accepted ASR techniques such as dynamic speech features or RelAtive SpecTrAl (RASTA) filtering. Next, the frequency-domain perceptual linear prediction technique for deriving autoregressive models of temporal trajectories of spectral power in individual frequency bands is reviewed. Finally, posterior-based features, which allow for straightforward application of modulation frequency domain information, are described. The paper is tutorial in nature, aims at a historical global overview of attempts to use spectral dynamics in machine recognition of speech, and does not always provide enough detail of the described techniques. However, extensive references to earlier work are provided to compensate for the lack of detail in the paper.
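The "dynamic speech features" mentioned above are typically delta coefficients, a regression slope computed over a few neighbouring frames. A minimal sketch (the window size and edge-padding policy are common but illustrative choices):

```python
import numpy as np

# Delta features: least-squares slope of each coefficient trajectory over a
# +/-w frame window, d_t = sum_n n*(c_{t+n} - c_{t-n}) / (2 * sum_n n^2).
def deltas(C, w=2):
    """C: (frames, coeffs) feature matrix -> same-shape delta features."""
    pad = np.pad(C, ((w, w), (0, 0)), mode="edge")
    num = sum(n * (pad[w + n:len(C) + w + n] - pad[w - n:len(C) + w - n])
              for n in range(1, w + 1))
    return num / (2 * sum(n * n for n in range(1, w + 1)))

# a linear ramp in every coefficient has a slope of 1 per frame
C = np.outer(np.arange(10, dtype=float), np.ones(3))
print(deltas(C)[5])                          # interior frames: slope of 1.0
```

Appending deltas (and deltas of deltas) to static spectral features is one of the simplest ways ASR front ends inject exactly the spectral dynamics the paper argues for.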

  3. Speech Indexing

    NARCIS (Netherlands)

    Ordelman, R.J.F.; Jong, de F.M.G.; Leeuwen, van D.A.; Blanken, H.M.; de Vries, A.P.; Blok, H.E.; Feng, L.

    2007-01-01

    This chapter will focus on the automatic extraction of information from the speech in multimedia documents. This approach is often referred to as speech indexing, and it can be regarded as a subfield of audio indexing that also incorporates, for example, the analysis of music and sounds. If the objective…

  4. Plowing Speech

    OpenAIRE

    Zla ba sgrol ma

    2009-01-01

    This file contains a plowing speech and a discussion about the speech. This collection presents forty-nine audio files including: several folk song genres; folktales; and local history from the Sman shad Valley of Sde dge county. World Oral Literature Project.

  5. Speech recognition with amplitude and frequency modulations

    Science.gov (United States)

    Zeng, Fan-Gang; Nie, Kaibao; Stickney, Ginger S.; Kong, Ying-Yee; Vongphoe, Michael; Bhargave, Ashish; Wei, Chaogang; Cao, Keli

    2005-02-01

    Amplitude modulation (AM) and frequency modulation (FM) are commonly used in communication, but their relative contributions to speech recognition have not been fully explored. To bridge this gap, we derived slowly varying AM and FM from speech sounds and conducted listening tests using stimuli with different modulations in normal-hearing and cochlear-implant subjects. We found that although AM from a limited number of spectral bands may be sufficient for speech recognition in quiet, FM significantly enhances speech recognition in noise, as well as speaker and tone recognition. Additional speech reception threshold measures revealed that FM is particularly critical for speech recognition with a competing voice and is independent of spectral resolution and similarity. These results suggest that AM and FM provide independent yet complementary contributions to support robust speech recognition under realistic listening situations. Encoding FM may improve auditory scene analysis, cochlear-implant, and audio-coding performance. Keywords: auditory analysis | cochlear implant | neural code | phase | scene analysis
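The slowly varying AM and FM the authors derive can be illustrated on a single synthetic narrow band; this sketch uses an FFT-based analytic signal rather than the paper's filter-bank front end, and all parameters are illustrative:

```python
import numpy as np

# Decompose an amplitude-modulated carrier into its envelope (AM) and
# instantaneous frequency (FM) via the analytic signal.
fs = 8000.0
t = np.arange(2048) / fs
x = (1.0 + 0.5 * np.sin(2 * np.pi * 3 * t)) * np.cos(2 * np.pi * 500 * t)

# analytic signal: zero out negative frequencies, double positive ones
X = np.fft.fft(x)
X[len(X) // 2 + 1:] = 0.0
X[1:len(X) // 2] *= 2.0
xa = np.fft.ifft(X)

am = np.abs(xa)                                    # slowly varying envelope
phase = np.unwrap(np.angle(xa))
fm = np.diff(phase) * fs / (2 * np.pi)             # instantaneous frequency, Hz

print(round(float(np.median(fm))))                 # close to the 500 Hz carrier
```

Applying this per band of a filter bank yields the multi-band AM and FM streams that the listening experiments recombine into stimuli.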

  6. Two-microphone Separation of Speech Mixtures

    DEFF Research Database (Denmark)

    2006-01-01

    Matlab source code for underdetermined separation of instantaneous speech mixtures. The algorithm is described in [1] Michael Syskind Pedersen, DeLiang Wang, Jan Larsen and Ulrik Kjems: ''Two-microphone Separation of Speech Mixtures,'' 2006, submitted for journal publication. See also, [2] Michael...

  7. Hate Speech: A Call to Principles.

    Science.gov (United States)

    Klepper, William M.; Bakken, Timothy

    1997-01-01

    Reviews the history of First Amendment rulings as they relate to speech codes and of other regulations directed at the content of speech. A case study, based on an experience at Trenton State College, details the legal constraints, principles, and practices that Student Affairs administrators should be aware of regarding such situations.…

  8. When Machines Design Machines!

    DEFF Research Database (Denmark)

    2011-01-01

    Until recently we were the sole designers, alone in the driving seat making all the decisions. But we have created a world of complexity way beyond human ability to understand, control, and govern. Machines now do more trades than humans on stock markets; they control our power, water, gas and food supplies, manage our elevators, microclimates, automobiles and transport systems, and manufacture almost everything. It should come as no surprise that machines are now designing machines. The chips that power our computers and mobile phones, the robots and commercial processing plants on which we depend, all are now largely designed by machines. So what of us: will we be totally usurped, or are we looking at a new symbiosis, with human and artificial intelligences combined to realise the best outcomes possible? In most respects we have no choice! Human abilities alone cannot solve any of the major...

  9. Incorporating Speech Recognition into a Natural User Interface

    Science.gov (United States)

    Chapa, Nicholas

    2017-01-01

    The Augmented/Virtual Reality (AVR) Lab has been working to study the applicability of recent virtual and augmented reality hardware and software to KSC operations. This includes the Oculus Rift, HTC Vive, Microsoft HoloLens, and the Unity game engine. My project in this lab is to integrate voice recognition and voice commands into an easy-to-modify system that can be added to an existing portion of a Natural User Interface (NUI). A NUI is an intuitive and simple-to-use interface incorporating visual, touch, and speech recognition. The inclusion of speech recognition capability will allow users to perform actions or make inquiries using only their voice. The simplicity of needing only to speak to control an on-screen object or enact some digital action means that any user can quickly become accustomed to using this system. Multiple programs were tested for use in a speech command and recognition system. Sphinx4 translates speech to text using a Hidden Markov Model (HMM) based language model, an acoustic model, and a word dictionary, running on Java. PocketSphinx had similar functionality to Sphinx4 but ran on C. However, neither of these programs was ideal, as building a Java or C wrapper slowed performance. The most suitable speech recognition system tested was the Unity engine's Grammar Recognizer. A Context-Free Grammar (CFG) structure is written in an XML file to specify the structure of phrases and words that will be recognized by the Unity Grammar Recognizer. Using Speech Recognition Grammar Specification (SRGS) 1.0 makes modifying the recognized combinations of words and phrases very simple and quick. With SRGS 1.0, semantic information can also be added to the XML file, which allows for even more control over how spoken words and phrases are interpreted by Unity. Additionally, using a CFG with SRGS 1.0 produces Finite State Machine (FSM) functionality, limiting the potential for incorrectly heard words or phrases. The purpose of my project was to...
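As a hypothetical illustration of the grammar format described above, an SRGS 1.0 grammar for a two-part voice command might look like this (the phrases and rule name are invented, not taken from the project):

```xml
<grammar version="1.0" xml:lang="en-US" root="command"
         xmlns="http://www.w3.org/2001/06/grammar">
  <rule id="command" scope="public">
    <one-of>
      <item>open</item>
      <item>close</item>
    </one-of>
    <one-of>
      <item>the hatch</item>
      <item>the panel</item>
    </one-of>
  </rule>
</grammar>
```

Because each rule expands to a finite set of token sequences, the recognizer can compile the grammar into a finite state machine, which is what constrains recognition to the listed phrases and limits mis-heard commands.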

  10. Industrial Applications of Automatic Speech Recognition Systems

    Directory of Open Access Journals (Sweden)

    Dr. Jayashri Vajpai

    2016-03-01

    Current trends in developing technologies form important bridges to the future, fortified by the early and productive use of technology for enriching human life. Speech signal processing, which includes automatic speech recognition, synthetic speech, and natural language processing, is beginning to have a significant impact on business, industry and the ease of operation of personal computers. Beyond this, it facilitates a deeper understanding of the complex mechanisms of the human brain. Advances in speech recognition technology over the past five decades have enabled a wide range of industrial applications. Yet today's applications provide only a small preview of a rich future for speech and voice interface technology that will eventually replace keyboards with microphones in human-machine interfaces, providing easy access to increasingly intelligent machines. The paper also shows how the capabilities of speech recognition systems in industrial applications are evolving to usher in the next generation of voice-enabled services. This paper aims to present an effective survey of the speech recognition technology described in the available literature and to integrate the insights gained from the study of individual research and developments. Current real-world and industrial applications of speech recognition are also outlined, with special reference to the areas of medicine, industrial robotics, forensics, defence and aviation.

  11. Shared acoustic codes underlie emotional communication in music and speech—Evidence from deep transfer learning

    Science.gov (United States)

    Schuller, Björn

    2017-01-01

    Music and speech exhibit striking similarities in the communication of emotions in the acoustic domain, in such a way that the communication of specific emotions is achieved, at least to a certain extent, by means of shared acoustic patterns. From an Affective Sciences point of view, determining the degree of overlap between both domains is fundamental to understanding the shared mechanisms underlying this phenomenon. From a machine learning perspective, the overlap between acoustic codes for emotional expression in music and speech opens new possibilities to enlarge the amount of data available to develop music and speech emotion recognition systems. In this article, we investigate time-continuous predictions of emotion (Arousal and Valence) in music and speech, and the Transfer Learning between these domains. We establish a comparative framework including intra-domain (i.e., models trained and tested on the same modality, either music or speech) and cross-domain experiments (i.e., models trained in one modality and tested on the other). In the cross-domain context, we evaluated two strategies: the direct transfer between domains, and the contribution of Transfer Learning techniques (feature-representation-transfer based on Denoising Auto Encoders) for reducing the gap in the feature space distributions. Our results demonstrate excellent cross-domain generalisation performance with and without feature representation transfer in both directions. In the case of music, cross-domain approaches outperformed intra-domain models for Valence estimation, whereas for speech, intra-domain models achieved the best performance. This is the first demonstration of shared acoustic codes for emotional expression in music and speech in the time-continuous domain. PMID:28658285

  12. High Performance Speech Compression System

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    Since Pulse Code Modulation emerged in 1937, digitized speech has experienced rapid development due to its outstanding voice quality, reliability, robustness and security in communication. But how to reduce channel width without loss of speech quality remains a crucial problem in speech coding theory. A new full-duplex digital speech communication system based on the AMBE-1000 vocoder and the ATMEL 89C51 microcontroller is introduced. It shows higher voice quality than the current mobile phone system while needing only a quarter of the latter's channel width. Prospective application areas of the system include satellite communication, IP phone, virtual meetings and, most importantly, the defence industry.

  13. A New Speech Codec Based on ANN with Low Delay

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The author designs a new speech codec in this paper, which is based on an ANN that carries out nonlinear prediction. The new codec synthesizes speech of better quality than conventional waveform or hybrid codecs do at the same bit rate. Moreover, its most important characteristic is low coding delay, which will benefit speech communication QoS when speech signals are transmitted over IP or ATM networks.

  14. Speaking of Race, Speaking of Sex: Hate Speech, Civil Rights, and Civil Liberties.

    Science.gov (United States)

    Gates, Henry Louis, Jr.; And Others

    The essays of this collection explore the restriction of speech and the hate speech codes that attempt to restrict bigoted or offensive speech and punish those who engage in it. These essays generally argue that speech restrictions are dangerous and counterproductive, but they acknowledge that it is very difficult to distinguish between…

  15. A Survey on Statistical Based Single Channel Speech Enhancement Techniques

    Directory of Open Access Journals (Sweden)

    Sunnydayal. V

    2014-11-01

    Speech enhancement is a long-standing problem with various applications such as hearing aids and the automatic recognition and coding of speech signals. Single-channel speech enhancement techniques are used to enhance speech degraded by additive background noise. Background noise can have an adverse impact on our ability to converse without hindrance in very noisy environments, such as busy streets, in a car or the cockpit of an airplane; such noise can affect the quality and intelligibility of speech. This is a survey paper whose objective is to provide an overview of speech enhancement algorithms that enhance a noisy speech signal corrupted by additive noise. The algorithms are mainly based on statistical approaches, and different estimators are compared. Challenges and opportunities of speech enhancement are also discussed. This paper helps in choosing the best statistical technique for speech enhancement.
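As a minimal illustration of the single-channel setting the survey covers, the following sketch applies magnitude spectral subtraction with an oracle noise estimate; a real system estimates the noise statistics online, and the signal, noise level, and spectral floor are illustrative:

```python
import numpy as np

# Magnitude spectral subtraction on one frame: subtract a noise-magnitude
# estimate, keep the noisy phase, apply a spectral floor to limit distortion.
rng = np.random.default_rng(0)
fs, n = 8000, 2048
t = np.arange(n) / fs
clean = np.sin(2 * np.pi * 437.5 * t)         # bin-centred tone as "speech"
noise = 0.3 * rng.normal(size=n)
noisy = clean + noise

S = np.fft.rfft(noisy)
noise_mag = np.abs(np.fft.rfft(noise))        # oracle noise spectrum estimate
mag = np.maximum(np.abs(S) - noise_mag, 0.05 * np.abs(S))   # spectral floor
enhanced = np.fft.irfft(mag * np.exp(1j * np.angle(S)), n=n)

def snr(ref, sig):
    return 10 * np.log10(np.sum(ref**2) / np.sum((ref - sig)**2))

print(snr(clean, noisy), snr(clean, enhanced))   # enhancement raises the SNR
```

The statistical estimators the survey compares (Wiener, MMSE and relatives) differ mainly in how the per-bin gain is derived from the estimated signal and noise statistics; the subtract-and-floor gain above is the simplest member of that family.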

  16. Personality in speech assessment and automatic classification

    CERN Document Server

    Polzehl, Tim

    2015-01-01

    This work combines interdisciplinary knowledge and experience from research fields of psychology, linguistics, audio-processing, machine learning, and computer science. The work systematically explores a novel research topic devoted to automated modeling of personality expression from speech. For this aim, it introduces a novel personality assessment questionnaire and presents the results of extensive labeling sessions to annotate the speech data with personality assessments. It provides estimates of the Big 5 personality traits, i.e. openness, conscientiousness, extroversion, agreeableness, and neuroticism. Based on a database built on the questionnaire, the book presents models to tell apart different personality types or classes from speech automatically.

  17. Hate speech

    Directory of Open Access Journals (Sweden)

    Anne Birgitta Nilsen

    2014-12-01

    Full Text Available The manifesto of the Norwegian terrorist Anders Behring Breivik is based on the “Eurabia” conspiracy theory. This theory is a key starting point for hate speech amongst many right-wing extremists in Europe, but also has ramifications beyond these environments. In brief, proponents of the Eurabia theory claim that Muslims are occupying Europe and destroying Western culture, with the assistance of the EU and European governments. By contrast, members of Al-Qaeda and other extreme Islamists promote the conspiracy theory “the Crusade” in their hate speech directed against the West. Proponents of the latter theory argue that the West is leading a crusade to eradicate Islam and Muslims, a crusade that is similarly facilitated by their governments. This article presents analyses of texts written by right-wing extremists and Muslim extremists in an effort to shed light on how hate speech promulgates conspiracy theories in order to spread hatred and intolerance. The aim of the article is to contribute to a more thorough understanding of hate speech’s nature by applying rhetorical analysis. Rhetorical analysis is chosen because it offers a means of understanding the persuasive power of speech. It is thus a suitable tool to describe how hate speech works to convince and persuade. The concepts from rhetorical theory used in this article are ethos, logos and pathos. The concept of ethos is used to pinpoint factors that contributed to Osama bin Laden's impact, namely factors that lent credibility to his promotion of the conspiracy theory of the Crusade. In particular, Bin Laden projected common sense, good morals and good will towards his audience. He seemed to have coherent and relevant arguments; he appeared to possess moral credibility; and his use of language demonstrated that he wanted the best for his audience. The concept of pathos is used to define hate speech, since hate speech targets its audience's emotions. In hate speech it is the

  18. The Unsupervised Acquisition of a Lexicon from Continuous Speech.

    Science.gov (United States)

    1995-11-01

    COGNITIVE SCIENCES A.I. Memo No. 1558, November 1996; C.B.C.L. Memo No. 129. The Unsupervised Acquisition of a Lexicon from Continuous Speech, Carl de ...tion of the input. Thus, it has diverse application in speech recognition, lexicography, text and speech compression, machine translation, and the ... the Cognitive Science Society, pages 28-36, 1993. [5] Michael R. Brent, Andrew Lundberg, and Sreerama Murthy. Discovering morphemic suffixes: A case

  19. Microphone Array Design Measures for Hands-Free Speech Recognition

    OpenAIRE

    井上 雅晶; 山田 武志; 中村, 哲; 鹿野 清宏

    1998-01-01

    One of the key technologies for natural man-machine interface is hands-free speech recognition. The performance of hands-free distant- talking speech recognition will be seriously degraded by noise and reverberation in real environments. A microphone array is applied to solve the problem. When applying a microphone array to speech recognition, parameters such as number of microphone elements and their spacing interval affect the performance. In order to optimize these parameters, a measure wh...

  20. Vocal Tract Representation in the Recognition of Cerebral Palsied Speech

    Science.gov (United States)

    Rudzicz, Frank; Hirst, Graeme; van Lieshout, Pascal

    2012-01-01

    Purpose: In this study, the authors explored articulatory information as a means of improving the recognition of dysarthric speech by machine. Method: Data were derived chiefly from the TORGO database of dysarthric articulation (Rudzicz, Namasivayam, & Wolff, 2011) in which motions of various points in the vocal tract are measured during speech.…

  1. A Software Agent for Speech Abiding Systems

    Directory of Open Access Journals (Sweden)

    R. Manoharan

    2009-01-01

    Full Text Available Problem statement: In order to bring speech into the mainstream of business processes, an efficient digital signal processor is necessary. The Fast Fourier Transform (FFT) and the symmetry of its butterfly structure make hardware implementation easier. The proposed DSP and software together establish a system, named here the “Speech Abiding System (SAS)”, a software agent which involves the digital representation of speech signals and the use of digital processors to analyze, synthesize, or modify such signals. The proposed SAS addresses the issues in two parts. Part I: capturing speaker- and language-independent, error-free speech content for speech applications processing; Part II: delivering the speech content as an input to the Speech User Applications/Interface (SUI). Approach: The Discrete Fourier Transform (DFT) of the speech signal is the essential ingredient of the SAS, and the Discrete-Time Fourier Transform (DTFT) links the discrete-time domain to the continuous-frequency domain. The direct computation of the DFT is prohibitively expensive in terms of the required computer operations. Fortunately, a number of “fast” transforms have been developed that are mathematically equivalent to the DFT but require significantly fewer computer operations. Results: From Part I, the SAS is able to capture error-free speech content, making speech a good input in the mainstream of business processing. Part II provides an environment for implementing speech user applications at a primitive level. Conclusion/Recommendations: With the SAS agent and the required hardware architecture, a Finite State Automata (FSA) machine can be created to develop globally oriented, domain-specific speech user applications easily. It will have a major impact on interoperability and disintermediation in the Information Technology Cycle (ITC) for computer program generation.
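The cost argument in this record — a direct DFT needs on the order of N² operations, while the FFT computes the mathematically identical transform in O(N log N) — can be checked with a small sketch (illustrative code, not the SAS implementation):

```python
import numpy as np

def dft_direct(x):
    """Direct DFT, O(N^2): X[k] = sum_n x[n] * exp(-2j*pi*k*n/N)."""
    N = len(x)
    n = np.arange(N)
    return np.array([np.sum(x * np.exp(-2j * np.pi * k * n / N))
                     for k in range(N)])

x = np.random.default_rng(1).standard_normal(64)
X_slow = dft_direct(x)   # 64 * 64 complex multiply-adds
X_fast = np.fft.fft(x)   # the same numbers via an O(N log N) FFT
```

Both paths produce the same spectrum to floating-point precision, which is exactly the "mathematically equivalent but far cheaper" point the abstract makes.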

  2. Speech Intelligibility

    Science.gov (United States)

    Brand, Thomas

    Speech intelligibility (SI) is important for different fields of research, engineering and diagnostics in order to quantify very different phenomena, such as the quality of recordings, communication and playback devices, the reverberation of auditoria, characteristics of hearing impairment, the benefit of using hearing aids, or combinations of these things.

  3. Speech dynamics

    NARCIS (Netherlands)

    Pols, L.C.W.

    2011-01-01

    In order for speech to be informative and communicative, segmental and suprasegmental variation is mandatory. Only this leads to meaningful words and sentences. The building blocks are not stable entities put next to each other (like beads on a string or like printed text); rather, there are gradual tran

  4. Text and position coding of human-machine display interface

    Institute of Scientific and Technical Information of China (English)

    张磊; 庄达民

    2011-01-01

    In the process of operating an aircraft, pilots need to use large amounts of information, so reasonable coding of that information can improve flight safety. According to the research requirements, a task model was developed for an ergonomics experiment. After subjects completed the tasks, their correct rate and reaction time were measured. Combining the measured results with eye-movement data, the impact of text and position coding on information identification was analyzed, providing a scientific basis for the ergonomic design of information interfaces. The experimental results show that subjects' identification of text information is affected by position coding, and that position coding relates to the field of vision and the attention-allocation strategy. Identification efficiency at the center is better than at the periphery, and at the left position better than at the right. Identification efficiency for Chinese information is better than for English; the influence of the mother tongue should be considered in practical applications.

  5. Reference-free automatic quality assessment of tracheoesophageal speech.

    Science.gov (United States)

    Huang, Andy; Falk, Tiago H; Chan, Wai-Yip; Parsa, Vijay; Doyle, Philip

    2009-01-01

    Evaluation of the quality of tracheoesophageal (TE) speech using machines instead of human experts can enhance the voice rehabilitation process for patients who have undergone total laryngectomy and voice restoration. Towards the goal of devising a reference-free TE speech quality estimation algorithm, we investigate the efficacy of speech signal features that are used in standard telephone-speech quality assessment algorithms, in conjunction with a recently introduced speech modulation spectrum measure. Tests performed on two TE speech databases demonstrate that the modulation spectral measure and a subset of features in the standard ITU-T P.563 algorithm estimate TE speech quality with better correlation (up to 0.9) than previously proposed features.

  6. The Research and Application of a Multi-classification Algorithm Based on Error-Correcting Codes and Support Vector Machines

    Institute of Scientific and Technical Information of China (English)

    祖文超; 苑津莎; 王峰; 刘磊

    2012-01-01

    In order to improve the accuracy of transformer fault diagnosis, a multiclass classification algorithm based on error-correcting codes combined with SVM is proposed. A mathematical model of transformer fault diagnosis is set up according to the theory of support vector machines. First, the error-correcting code matrix is used to construct several uncorrelated binary support vector machines, so that the accuracy of the classification model can be enhanced. Finally, the gases dissolved in the transformer oil (DGA) are taken as the training and testing samples of the error-correcting-code SVM to realize transformer fault diagnosis, and the algorithm is checked using UCI data. The multiclass classification algorithm was verified through VS2008 combined with Libsvm, and the results show that the method has high classification accuracy.
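A minimal sketch of the error-correcting-output-code scheme described above, with a perceptron standing in for each binary SVM and Gaussian blobs standing in for DGA fault data (both assumptions made for self-containment; the paper uses SVMs trained on dissolved-gas features):

```python
import numpy as np

# Illustrative 4-class output code (not the paper's matrix): rows are
# class codewords, columns define binary subproblems; the minimum
# Hamming distance is 3, so one wrong binary decision is still corrected.
CODE = np.array([[ 1,  1,  1,  1,  1],
                 [ 1, -1, -1,  1, -1],
                 [-1,  1, -1, -1,  1],
                 [-1, -1,  1, -1, -1]])

def train_perceptron(X, y, epochs=100):
    """Stand-in binary learner (the paper trains an SVM per column)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append a bias input
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(Xb, y):
            if yi * (w @ xi) <= 0:
                w += yi * xi
    return w

def ecoc_fit(X, labels):
    # One binary classifier per code column, targets taken from CODE.
    return [train_perceptron(X, CODE[labels, j]) for j in range(CODE.shape[1])]

def ecoc_predict(models, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    bits = np.sign(np.column_stack([Xb @ w for w in models]))
    # Decode to the class whose codeword is nearest in Hamming distance.
    dists = (bits[:, None, :] != CODE[None, :, :]).sum(axis=2)
    return dists.argmin(axis=1)

# Toy stand-in for fault data: four well-separated Gaussian blobs.
rng = np.random.default_rng(2)
centers = np.array([[0, 0], [5, 0], [0, 5], [5, 5]])
X = np.vstack([c + 0.2 * rng.standard_normal((20, 2)) for c in centers])
y = np.repeat(np.arange(4), 20)
acc = (ecoc_predict(ecoc_fit(X, y), X) == y).mean()
```

The Hamming-distance decoding is what makes the binary classifiers "uncorrelated" errors tolerable: a single flipped bit still lands nearest the correct codeword.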

  7. Speech communications in noise

    Science.gov (United States)

    1984-07-01

    The physical characteristics of speech, the methods of speech masking measurement, and the effects of noise on speech communication are investigated. Topics include the speech signal and intelligibility, the effects of noise on intelligibility, the articulation index, and various devices for evaluating speech systems.

  8. Machine Translation

    Institute of Scientific and Technical Information of China (English)

    张严心

    2015-01-01

    As a kind of ancillary translation tool, Machine Translation has received increasing attention and has long been studied by a great number of researchers and scholars. Knowing the definition of Machine Translation and analysing its benefits and problems is significant for translators in order to make good use of Machine Translation, and is helpful for developing and perfecting Machine Translation systems in the future.

  9. Steganalysis of recorded speech

    Science.gov (United States)

    Johnson, Micah K.; Lyu, Siwei; Farid, Hany

    2005-03-01

    Digital audio provides a suitable cover for high-throughput steganography. At 16 bits per sample and sampled at a rate of 44,100 Hz, digital audio has the bit-rate to support large messages. In addition, audio is often transient and unpredictable, facilitating the hiding of messages. Using an approach similar to our universal image steganalysis, we show that hidden messages alter the underlying statistics of audio signals. Our statistical model begins by building a linear basis that captures certain statistical properties of audio signals. A low-dimensional statistical feature vector is extracted from this basis representation and used by a non-linear support vector machine for classification. We show the efficacy of this approach on LSB embedding and Hide4PGP. While no explicit assumptions about the content of the audio are made, our technique has been developed and tested on high-quality recorded speech.
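As a toy illustration of the LSB embedding this record analyzes (not the authors' steganalysis code), hiding message bits in the least-significant bit of 16-bit samples — the channel that gives 44,100 bits/s of capacity at CD rates — can be sketched as:

```python
import numpy as np

def lsb_embed(cover, bits):
    """Overwrite the LSB of each leading sample with one message bit."""
    stego = cover.copy()
    stego[:len(bits)] = (stego[:len(bits)] & ~1) | bits
    return stego

def lsb_extract(stego, n_bits):
    """Read the message back out of the least-significant bits."""
    return stego[:n_bits] & 1

# Fake 16-bit PCM cover signal and a random 128-bit message.
rng = np.random.default_rng(3)
cover = rng.integers(-2**15, 2**15, size=1000, dtype=np.int16)
bits = rng.integers(0, 2, size=128, dtype=np.int16)
stego = lsb_embed(cover, bits)
recovered = lsb_extract(stego, 128)
```

Each sample changes by at most one quantization step, which is why the embedding is inaudible yet, as the record shows, still statistically detectable.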

  10. Fishing for meaningful units in connected speech

    DEFF Research Database (Denmark)

    Henrichsen, Peter Juel; Christiansen, Thomas Ulrich

    2009-01-01

    In many branches of spoken language analysis including ASR, the set of smallest meaningful units of speech is taken to coincide with the set of phones or phonemes. However, fishing for phones is difficult, error-prone, and computationally expensive. We present an experiment, based on machine...

  11. Two-microphone Separation of Speech Mixtures

    DEFF Research Database (Denmark)

    2006-01-01

    In this demonstration we show the separation of 3-7 mixed speech sources based on information from two microphones. Separation with background noise is demonstrated too. The algorithms are described in [1] Michael Syskind Pedersen, DeLiang Wang, Jan Larsen and Ulrik Kjems: "Two-microphone Separation of Speech Mixtures," 2006, submitted for journal publication. See also [2] Michael Syskind Pedersen, DeLiang Wang, Jan Larsen and Ulrik Kjems: "Overcomplete Blind Source Separation by Combining ICA and Binary Time-Frequency Masking," in proceedings of IEEE International Workshop on Machine Learning...

  12. Acquiring a Lexicon from Unsegmented Speech

    CERN Document Server

    De Marcken, C

    1995-01-01

    We present work-in-progress on the machine acquisition of a lexicon from sentences that are each an unsegmented phone sequence paired with a primitive representation of meaning. A simple exploratory algorithm is described, along with the direction of current work and a discussion of the relevance of the problem for child language acquisition and computer speech recognition.

  13. Sustainable machining

    CERN Document Server

    2017-01-01

    This book provides an overview on current sustainable machining. Its chapters cover the concept in economic, social and environmental dimensions. It provides the reader with proper ways to handle several pollutants produced during the machining process. The book is useful on both undergraduate and postgraduate levels and it is of interest to all those working with manufacturing and machining technology.

  14. The cognitive approach to conscious machines

    CERN Document Server

    Haikonen, Pentti O

    2003-01-01

    Could a machine have an immaterial mind? The author argues that true conscious machines can be built, but rejects artificial intelligence and classical neural networks in favour of the emulation of the cognitive processes of the brain-the flow of inner speech, inner imagery and emotions. This results in a non-numeric meaning-processing machine with distributed information representation and system reactions. It is argued that this machine would be conscious; it would be aware of its own existence and its mental content and perceive this as immaterial. Novel views on consciousness and the mind-

  15. Scientific Bases of Human-Machine Communication by Voice

    Science.gov (United States)

    Schafer, Ronald W.

    1995-10-01

    The scientific bases for human-machine communication by voice are in the fields of psychology, linguistics, acoustics, signal processing, computer science, and integrated circuit technology. The purpose of this paper is to highlight the basic scientific and technological issues in human-machine communication by voice and to point out areas of future research opportunity. The discussion is organized around the following major issues in implementing human-machine voice communication systems: (i) hardware/software implementation of the system, (ii) speech synthesis for voice output, (iii) speech recognition and understanding for voice input, and (iv) usability factors related to how humans interact with machines.

  16. Noise Estimation and Noise Removal Techniques for Speech Recognition in Adverse Environment

    OpenAIRE

    Shrawankar, Urmila; Thakare, Vilas

    2010-01-01

    International audience; Noise is ubiquitous in almost all acoustic environments. The speech signal, that is recorded by a microphone is generally infected by noise originating from various sources. Such contamination can change the characteristics of the speech signals and degrade the speech quality and intelligibility, thereby causing significant harm to human-to-machine communication systems. Noise detection and reduction for speech applications is often formulated as a digital filtering pr...

  17. Ensemble Feature Extraction Modules for Improved Hindi Speech Recognition System

    Directory of Open Access Journals (Sweden)

    Malay Kumar

    2012-05-01

    Full Text Available Speech is the most natural way of communication between human beings. The field of speech recognition evokes the intrigue of man-machine conversation, and owing to its versatile applications, automatic speech recognition systems have been designed. In this paper we present a novel approach to Hindi speech recognition that ensembles the feature extraction modules of ASR systems; their outputs are combined using the voting technique ROVER. Experimental results show that the proposed system produces better results than traditional ASR systems.
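ROVER combines recognizer outputs by alignment and voting. A heavily simplified sketch, assuming the hypotheses are already aligned word-for-word (real ROVER first builds a word transition network and can weight votes by confidence scores):

```python
from collections import Counter

def rover_vote(hypotheses):
    """Word-by-word majority vote over pre-aligned ASR hypotheses.

    Equal-length alignment is assumed here for brevity; ROVER proper
    aligns the outputs (with insertions/deletions) before voting.
    """
    return [Counter(words).most_common(1)[0][0]
            for words in zip(*hypotheses)]

# Hypothetical outputs of three Hindi recognizers for one utterance.
hyps = ["mera naam raam hai".split(),
        "mera naam raam hey".split(),
        "tera naam raam hai".split()]
consensus = rover_vote(hyps)
```

At each position the majority word wins, so a single recognizer's error ("tera", "hey") is outvoted by the other two.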

  18. Coded Random Access

    DEFF Research Database (Denmark)

    Paolini, Enrico; Stefanovic, Cedomir; Liva, Gianluigi

    2015-01-01

    The rise of machine-to-machine communications has rekindled the interest in random access protocols as a support for a massive number of uncoordinatedly transmitting devices. The legacy ALOHA approach is developed under a collision model, where slots containing collided packets are considered as waste. However, if the common receiver (e.g., base station) is capable of storing the collision slots and using them in a transmission recovery process based on successive interference cancellation, the design space for access protocols is radically expanded. We present the paradigm of coded random access, in which the structure of the access protocol can be mapped to the structure of an erasure-correcting code defined on a graph. This opens the possibility of using coding theory and tools for designing efficient random access protocols, offering markedly better performance than ALOHA. Several instances of coded...
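The mapping from access protocol to erasure-correcting code can be illustrated with a toy frame simulation (all parameters are made up): each user transmits its packet in two random slots, and the receiver iteratively decodes singleton slots and cancels their replicas — the same peeling process used to decode codes on graphs:

```python
import random

def sic_decode(transmissions, n_slots):
    """Peel one frame: decode any singleton slot, cancel that user's
    replicas everywhere, and repeat until no singleton slot remains."""
    slots = [set() for _ in range(n_slots)]
    for user, chosen in enumerate(transmissions):
        for s in chosen:
            slots[s].add(user)
    decoded, progress = set(), True
    while progress:
        progress = False
        for s in range(n_slots):
            if len(slots[s]) == 1:
                user = slots[s].pop()
                decoded.add(user)
                for s2 in transmissions[user]:
                    slots[s2].discard(user)
                progress = True
    return decoded

# Toy frame: 40 users, 100 slots, two replicas per packet.
random.seed(4)
n_users, n_slots = 40, 100
tx = [set(random.sample(range(n_slots), 2)) for _ in range(n_users)]

# Plain slotted ALOHA only credits slots that start out collision-free.
initial = [set() for _ in range(n_slots)]
for user, chosen in enumerate(tx):
    for s in chosen:
        initial[s].add(user)
aloha = {next(iter(s)) for s in initial if len(s) == 1}
sic = sic_decode(tx, n_slots)
```

Every user ALOHA resolves is also resolved by the peeling decoder, and cancellation typically frees additional slots, which is the performance gap the record describes.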

  19. Code Mixing in a Young Bilingual Child.

    Science.gov (United States)

    Anderson, Raquel; Brice, Alejandro

    1999-01-01

    Spontaneous speech samples of a bilingual Spanish-English speaking child were collected during a period of 17 months (ages 6-8). Data revealed percentages and rank ordering of syntactic elements switched in the longitudinal language samples obtained. Specific recommendations for using code mixing in therapy for speech-language pathologists are…

  20. Spoken Language Understanding Systems for Extracting Semantic Information from Speech

    CERN Document Server

    Tur, Gokhan

    2011-01-01

    Spoken language understanding (SLU) is an emerging field in between speech and language processing, investigating human/ machine and human/ human communication by leveraging technologies from signal processing, pattern recognition, machine learning and artificial intelligence. SLU systems are designed to extract the meaning from speech utterances and its applications are vast, from voice search in mobile devices to meeting summarization, attracting interest from both commercial and academic sectors. Both human/machine and human/human communications can benefit from the application of SLU, usin

  1. Going to a Speech Therapist

    Science.gov (United States)

    ... Going to a Speech Therapist ... speech therapists (also called speech-language pathologists). What Do Speech Therapists Help With? Speech therapists help people of all ...

  2. Speech research

    Science.gov (United States)

    1992-06-01

    Phonology is traditionally seen as the discipline that concerns itself with the building blocks of linguistic messages. It is the study of the structure of sound inventories of languages and of the participation of sounds in rules or processes. Phonetics, in contrast, concerns speech sounds as produced and perceived. Two extreme positions on the relationship between phonological messages and phonetic realizations are represented in the literature. One holds that the primary home for linguistic symbols, including phonological ones, is the human mind, itself housed in the human brain. The second holds that their primary home is the human vocal tract.

  3. A Search Complexity Improvement of Vector Quantization to Immittance Spectral Frequency Coefficients in AMR-WB Speech Codec

    OpenAIRE

    Bing-Jhih Yao; Cheng-Yu Yeh; Shaw-Hwa Hwang

    2016-01-01

    The adaptive multi-rate wideband (AMR-WB) codec is a speech codec developed on the basis of an algebraic code-excited linear-prediction (ACELP) coding technique, and has the double advantage of low bit rates and high speech quality. This coding technique is widely used in modern mobile communication systems to deliver high speech quality on handheld devices. However, a major disadvantage is that the vector quantization (VQ) of immittance spectral frequency (ISF) coefficients occupies a significant compu...
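The VQ stage whose search complexity this record targets amounts to a nearest-codeword search. A plain exhaustive version — the baseline that complexity-reduction methods try to beat — can be sketched as follows (the codebook size and dimension here are illustrative; AMR-WB actually uses split/multi-stage VQ of the 16 ISF coefficients):

```python
import numpy as np

def vq_search(codebook, x):
    """Exhaustive nearest-codeword search: the O(K*d) per-vector cost
    that AMR-WB search-complexity reductions aim to cut."""
    return int(np.argmin(np.sum((codebook - x) ** 2, axis=1)))

# Made-up codebook: 256 codewords of dimension 16.
rng = np.random.default_rng(5)
codebook = rng.standard_normal((256, 16))
x = codebook[37] + 0.01 * rng.standard_normal(16)  # a slightly perturbed entry
idx = vq_search(codebook, x)
```

The encoder pays this distance computation for every codeword of every sub-codebook on every frame, which is why even constant-factor search reductions matter on handheld hardware.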

  4. Simple machines

    CERN Document Server

    Graybill, George

    2007-01-01

    Just how simple are simple machines? With our ready-to-use resource, they are simple to teach and easy to learn! Chock-full of information and activities, we begin with a look at force, motion and work, and examples of simple machines in daily life are given. With this background, we move on to different kinds of simple machines including: Levers, Inclined Planes, Wedges, Screws, Pulleys, and Wheels and Axles. An exploration of some compound machines follows, such as the can opener. Our resource is a real time-saver as all the reading passages and student activities are provided. Presented in s

  5. Quantum Virtual Machine (QVM)

    Energy Technology Data Exchange (ETDEWEB)

    2016-11-18

    There is a lack of state-of-the-art HPC simulation tools for simulating general quantum computing. Furthermore, there are no real software tools that integrate current quantum computers into existing classical HPC workflows. This product, the Quantum Virtual Machine (QVM), solves this problem by providing an extensible framework for pluggable virtual, or physical, quantum processing units (QPUs). It enables the execution of low level quantum assembly codes and returns the results of such executions.

  6. Speech production, Psychology of

    NARCIS (Netherlands)

    Schriefers, H.J.; Vigliocco, G.

    2015-01-01

    Research on speech production investigates the cognitive processes involved in transforming thoughts into speech. This article starts with a discussion of the methodological issues inherent to research in speech production that illustrates how empirical approaches to speech production must differ fr

  7. Status Report on Speech Research, July 1994-December 1995.

    Science.gov (United States)

    Fowler, Carol A., Ed.

    This publication (one of a series) contains 19 articles which report the status and progress of studies on the nature of speech, instruments for its investigation, and practical applications. Articles are: "Speech Perception Deficits in Poor Readers: Auditory Processing or Phonological Coding?" (Maria Mody and others); "Auditory…

  8. 78 FR 49717 - Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...

    Science.gov (United States)

    2013-08-15

    ... reasons that STS has not been more widely utilized. Are people with speech disabilities not connected to... COMMISSION 47 CFR Part 64 Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services; Telecommunications Relay Services and Speech-to-Speech Services for Individuals With...

  9. Speech Articulator and User Gesture Measurements Using Micropower, Interferometric EM-Sensors

    Energy Technology Data Exchange (ETDEWEB)

    Holzrichter, J F; Ng, L C

    2001-02-06

    Very low power, GHz-frequency, "radar-like" sensors can measure a variety of motions produced by a human user of machine interface devices. These data can be obtained "at a distance" and can measure "hidden" structures. Measurements range from acoustically induced, 10-micron-amplitude vibrations of vocal tract tissues, to few-centimeter human speech articulator motions, to meter-class motions of the head, hands, or entire body. These EM sensors measure "fringe motions" as reflected EM waves are mixed with a local (homodyne) reference wave. These data, when processed using models of the system being measured, provide real-time states of interface positions or other targets vs. time. An example is speech articulator positions vs. time in the user's body. This information appears to be useful for a surprisingly wide range of applications, ranging from speech coding, synthesis and recognition, speaker or object identification, noise cancellation, and hand or head motions for cursor direction, to other applications.

  10. Speech Articulator and User Gesture Measurements Using Micropower, Interferometric EM-Sensors

    Energy Technology Data Exchange (ETDEWEB)

    Holzrichter, J.F.

    2000-09-15

    Very low power, GHz-frequency, "radar-like" sensors can measure a variety of motions produced by a human user of machine interface devices. These data can be obtained "at a distance" and can measure "hidden" structures. Measurements range from acoustically induced 10-micron-amplitude vibrations of vocal tract tissues, to few-centimeter human speech articulator motions, to meter-class motions of the head, hands, or entire body. These EM sensors measure "fringe motions" as reflected EM waves are mixed with a local (homodyne) reference wave. These data, when processed using models of the system being measured, provide real-time states of interface positions vs. time. An example is speech articulator positions vs. time in the user's body. This information appears to be useful for a surprisingly wide range of applications, ranging from speech coding and recognition, speaker or object identification, noise cancellation, and hand or head motions for cursor direction, to other applications.

  11. Martin Luther King's "I Have a Dream": The Speech Event as Metaphor.

    Science.gov (United States)

    Alvarez, Alexandra

    1988-01-01

    Martin Luther King's speech is examined as a sermon in the Black Baptist tradition. The speech, which is a dialog between speaker and audience, has, in addition to the "message" contained in the code, a broader ethnographic meaning. The speech event itself is metaphorical in nature, signaling political protest. (Author/BJV)

  12. Free Speech versus Civil Discourse: Where Do We Go from Here?

    Science.gov (United States)

    McMasters, Paul

    1994-01-01

    Problems associated with the establishment of speech codes on college campuses, in response to hate speech, are examined. An inventory of speech regulations already in effect at 384 colleges and universities, by Arati R. Korwar, is also presented. The summary organizes behavior and related policy into 13 categories. (MSE)

  13. Electric machine

    Science.gov (United States)

    El-Refaie, Ayman Mohamed Fawzi [Niskayuna, NY; Reddy, Patel Bhageerath [Madison, WI

    2012-07-17

    An interior permanent magnet electric machine is disclosed. The interior permanent magnet electric machine comprises a rotor comprising a plurality of radially placed magnets each having a proximal end and a distal end, wherein each magnet comprises a plurality of magnetic segments and at least one magnetic segment towards the distal end comprises a high resistivity magnetic material.

  14. Speech Enhancement

    DEFF Research Database (Denmark)

    Benesty, Jacob; Jensen, Jesper Rindom; Christensen, Mads Græsbøll

    Speech enhancement is a classical problem in signal processing, yet still largely unsolved. Two of the conventional approaches for solving this problem are linear filtering, like the classical Wiener filter, and subspace methods. These approaches have traditionally been treated as different classes of methods and have been introduced in somewhat different contexts. Linear filtering methods originate in stochastic processes, while subspace methods have largely been based on developments in numerical linear algebra and matrix approximation theory. This book bridges the gap between these two classes of methods by showing how the ideas behind subspace methods can be incorporated into traditional linear filtering. In the context of subspace methods, the enhancement problem can then be seen as a classical linear filter design problem. This means that various solutions can more easily be compared...

  15. Speech therapy with obturator.

    Science.gov (United States)

    Shyammohan, A; Sreenivasulu, D

    2010-12-01

    Rehabilitation of speech is tantamount to closure of defect in cases with velopharyngeal insufficiency. Often the importance of speech therapy is sidelined during the fabrication of obturators. Usually the speech part is taken up only at a later stage and is relegated entirely to a speech therapist without the active involvement of the prosthodontist. The article suggests a protocol for speech therapy in such cases to be done in unison with a prosthodontist.

  16. Two-microphone Separation of Speech Mixtures

    DEFF Research Database (Denmark)

    2006-01-01

    In this demonstration we show the separation of 3-7 mixed speech sources based on information from two microphones. Separation with background noise is demonstrated too. The algorithms are described in [1] Michael Syskind Pedersen, DeLiang Wang, Jan Larsen and Ulrik Kjems: "Two-microphone Separation of Speech Mixtures," 2006, submitted for journal publication. See also [2] Michael Syskind Pedersen, DeLiang Wang, Jan Larsen and Ulrik Kjems: "Overcomplete Blind Source Separation by Combining ICA and Binary Time-Frequency Masking," in proceedings of IEEE International Workshop on Machine Learning...

  17. Mechanisms of enhancing visual-speech recognition by prior auditory information.

    Science.gov (United States)

    Blank, Helen; von Kriegstein, Katharina

    2013-01-15

    Speech recognition from visual-only faces is difficult, but can be improved by prior information about what is said. Here, we investigated how the human brain uses prior information from auditory speech to improve visual-speech recognition. In a functional magnetic resonance imaging study, participants performed a visual-speech recognition task, indicating whether the word spoken in visual-only videos matched the preceding auditory-only speech, and a control task (face-identity recognition) containing exactly the same stimuli. We localized a visual-speech processing network by contrasting activity during visual-speech recognition with the control task. Within this network, the left posterior superior temporal sulcus (STS) showed increased activity and interacted with auditory-speech areas if prior information from auditory speech did not match the visual speech. This mismatch-related activity and the functional connectivity to auditory-speech areas were specific for speech, i.e., they were not present in the control task. The mismatch-related activity correlated positively with performance, indicating that posterior STS was behaviorally relevant for visual-speech recognition. In line with predictive coding frameworks, these findings suggest that prediction error signals are produced if visually presented speech does not match the prediction from preceding auditory speech, and that this mechanism plays a role in optimizing visual-speech recognition by prior information. Copyright © 2012 Elsevier Inc. All rights reserved.

  18. Neural Decoder for Topological Codes

    Science.gov (United States)

    Torlai, Giacomo; Melko, Roger G.

    2017-07-01

    We present an algorithm for error correction in topological codes that exploits modern machine learning techniques. Our decoder is constructed from a stochastic neural network called a Boltzmann machine, of the type extensively used in deep learning. We provide a general prescription for the training of the network and a decoding strategy that is applicable to a wide variety of stabilizer codes with very little specialization. We demonstrate the neural decoder numerically on the well-known two-dimensional toric code with phase-flip errors.

  19. Language Recognition via Sparse Coding

    Science.gov (United States)

    2016-09-08

    target language lT) from the two pipelines. We can also think of more sophisticated fusion schemes on logistic regression and neural networks. Table...vol. 16, no. 5, pp. 980–988, July 2008. [6] G. Sivaram, S. K. Nemala, M. Elhilali, T. D. Tran, and H. Hermansky, "Sparse Coding for Speech

  20. The Machine within the Machine

    CERN Multimedia

    Katarina Anthony

    2014-01-01

    Although Virtual Machines are widespread across CERN, you probably won't have heard of them unless you work for an experiment. Virtual machines - known as VMs - allow you to create a separate machine within your own, allowing you to run Linux on your Mac, or Windows on your Linux - whatever combination you need.   Using a CERN Virtual Machine, Linux analysis software runs on a MacBook. When it comes to LHC data, one of the primary issues collaborations face is the diversity of computing environments among collaborators spread across the world. What if an institute cannot run the analysis software because they use different operating systems? "That's where the CernVM project comes in," says Gerardo Ganis, PH-SFT staff member and leader of the CernVM project. "We were able to respond to experimentalists' concerns by providing a virtual machine package that could be used to run experiment software. This way, no matter what hardware they have ...

  1. Speech Acquisition and Automatic Speech Recognition for Integrated Spacesuit Audio Systems

    Science.gov (United States)

    Huang, Yiteng; Chen, Jingdong; Chen, Shaoyan

    2010-01-01

    A voice-command human-machine interface system has been developed for spacesuit extravehicular activity (EVA) missions. A multichannel acoustic signal processing method has been created for distant speech acquisition in noisy and reverberant environments. This technology reduces noise by exploiting differences in the statistical nature of signal (i.e., speech) and noise that exists in the spatial and temporal domains. As a result, the automatic speech recognition (ASR) accuracy can be improved to the level at which crewmembers would find the speech interface useful. The developed speech human/machine interface will improve both crewmember usability and operational efficiency. It offers a fast rate of data/text entry, a small overall size, and low weight. In addition, this design will free the hands and eyes of a suited crewmember. The system components and steps include beam forming/multi-channel noise reduction, single-channel noise reduction, speech feature extraction, feature transformation and normalization, feature compression, model adaptation, ASR HMM (Hidden Markov Model) training, and ASR decoding. A state-of-the-art phoneme recognizer can obtain an accuracy rate of 65 percent when the training and testing data are free of noise. When it is used in spacesuits, the rate drops to about 33 percent. With the developed microphone array speech-processing technologies, the performance is improved and the phoneme recognition accuracy rate rises to 44 percent. The recognizer can be further improved by combining the microphone array and HMM model adaptation techniques and using speech samples collected from inside spacesuits. In addition, arithmetic complexity models for the major HMM-based ASR components were developed. They can help real-time ASR system designers select proper tasks when facing constraints in computational resources.
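    The first stage listed above, beam forming / multi-channel noise reduction, can be illustrated with a minimal delay-and-sum sketch. This is a generic textbook technique, not the system's actual algorithm; the channel delays and the test signal below are invented for the example:

    ```python
    import numpy as np

    def delay_and_sum(signals, delays_samples):
        """Delay-and-sum beamformer: align each channel by its integer
        sample delay, then average to reinforce the speech source."""
        n_ch, n = signals.shape
        out = np.zeros(n)
        for ch in range(n_ch):
            # Advance each channel by its known delay so the copies line up.
            out += np.roll(signals[ch], -delays_samples[ch])
        return out / n_ch

    # Two channels carrying the same "speech" with a 3-sample offset.
    sig = np.sin(2 * np.pi * 0.05 * np.arange(64))
    mics = np.vstack([sig, np.roll(sig, 3)])
    enhanced = delay_and_sum(mics, [0, 3])
    ```

    With the delays compensated, uncorrelated noise on the channels averages down while the aligned speech adds coherently.
    
    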

  2. Smarter machines

    Science.gov (United States)

    Fairfield, Jessamyn

    2017-03-01

    Although today's computers can perform superhuman feats, even the best are no match for human brains at tasks like processing speech. But as Jessamyn Fairfield explains, a new generation of computational devices is being developed to mimic the networks of neurons inside our heads.

  3. Quantum Neural Network Based Machine Translator for Hindi to English

    OpenAIRE

    Ravi Narayan; V. P. Singh; S. Chakraverty

    2014-01-01

    This paper presents the machine learning based machine translation system for Hindi to English, which learns the semantically correct corpus. The quantum neural based pattern recognizer is used to recognize and learn the pattern of corpus, using the information of part of speech of individual word in the corpus, like a human. The system performs the machine translation using its knowledge gained during the learning by inputting the pair of sentences of Devnagri-Hindi and English. To analyze t...

  4. Quantum Neural Network Based Machine Translator for Hindi to English

    OpenAIRE

    Ravi Narayan; Singh, V. P.; S. Chakraverty

    2014-01-01

    This paper presents the machine learning based machine translation system for Hindi to English, which learns the semantically correct corpus. The quantum neural based pattern recognizer is used to recognize and learn the pattern of corpus, using the information of part of speech of individual word in the corpus, like a human. The system performs the machine translation using its knowledge gained during the learning by inputting the pair of sentences of Devnagri-Hindi and English. To analyze t...

  5. Existence detection and embedding rate estimation of blended speech in covert speech communications.

    Science.gov (United States)

    Li, Lijuan; Gao, Yong

    2016-01-01

    Covert speech communications may be used by terrorists to commit crimes through the Internet. Steganalysis aims to detect secret information in covert communications to prevent crimes. Herein, based on the average zero crossing rate of the odd-even difference (AZCR-OED), a steganalysis algorithm for blended speech is proposed; it can detect the existence and estimate the embedding rate of blended speech. First, the odd-even difference (OED) of the speech signal is calculated and divided into frames. The average zero crossing rate (ZCR) is calculated for each OED frame, and the minimum average ZCR and AZCR-OED of the entire speech signal are extracted as features. Then, a support vector machine classifier is used to determine whether the speech signal is blended. Finally, a voice activity detection algorithm is applied to determine the hidden location of the secret speech and estimate the embedding rate. The results demonstrate that without attack, the detection accuracy can reach 80% or more when the embedding rate is greater than 10%, and the estimated embedding rate is close to the real value. Even when some attacks occur, relatively high detection accuracy can still be achieved. The algorithm has high performance in terms of accuracy, effectiveness and robustness.
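    The feature-extraction step can be sketched as follows. The frame length and the exact OED definition (difference between odd- and even-indexed samples) are assumptions made for illustration, not taken from the paper:

    ```python
    import numpy as np

    def zero_crossing_rate(frame):
        # Fraction of adjacent samples whose signs differ.
        signs = np.sign(frame)
        return float(np.mean(signs[:-1] != signs[1:]))

    def azcr_oed(speech, frame_len=160):
        """Average and minimum zero-crossing rate of the odd-even
        difference (OED), framed; assumed feature definitions."""
        oed = speech[1::2] - speech[0::2]          # odd-even difference
        n_frames = len(oed) // frame_len
        zcrs = [zero_crossing_rate(oed[i * frame_len:(i + 1) * frame_len])
                for i in range(n_frames)]
        return float(np.mean(zcrs)), float(np.min(zcrs))

    rng = np.random.default_rng(0)
    avg_zcr, min_zcr = azcr_oed(rng.standard_normal(3200))
    ```

    The two scalars would then be fed to an SVM classifier trained on clean versus blended speech.
    
    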

  6. [Speech evoked auditory brainstem response and cognitive disorders].

    Science.gov (United States)

    Zhou, M; Wang, N Y

    2016-12-07

    Speech evoked auditory brainstem response (s-ABR) is evoked by compound syllables; these stimuli are similar to everyday language, conveying both semantic and non-semantic information. Speech coding can take place at the brainstem, and as a new method, s-ABR may reveal the mystery of the speech coding process. Many tests have shown that s-ABR is related to cognitive ability. We mainly illustrate the possibility of grading cognitive ability using s-ABR, the abnormal test results from those with cognitive disorders, and the family factors that contribute to cognitive disorder.

  7. Cultural and biological evolution of phonemic speech

    NARCIS (Netherlands)

    de Boer, B.; Freitas, A.A.; Capcarrere, M.S.; Bentley, Peter J.; Johnson, Colin G.; Timmis, Jon

    2005-01-01

    This paper investigates the interaction between cultural evolution and biological evolution in the emergence of phonemic coding in speech. It is observed that our nearest relatives, the primates, use holistic utterances, whereas humans use phonemic utterances. It can therefore be argued that our las

  8. Freedom of Speech Wins in Wisconsin

    Science.gov (United States)

    Downs, Donald Alexander

    2006-01-01

    One might derive, from the eradication of a particularly heinous speech code, some encouragement that all is not lost in the culture wars. A core of dedicated scholars, working from within, made it obvious, to all but the most radical left, that imposing social justice by restricting thought and expression was a recipe for tyranny. Donald…

  9. OLIVE: Speech-Based Video Retrieval

    NARCIS (Netherlands)

    Jong, de Franciska; Gauvain, Jean-Luc; Hartog, den Jurgen; Netter, Klaus

    1999-01-01

    This paper describes the Olive project which aims to support automated indexing of video material by use of human language technologies. Olive is making use of speech recognition to automatically derive transcriptions of the sound tracks, generating time-coded linguistic elements which serve as the

  10. Cultural and biological evolution of phonemic speech

    NARCIS (Netherlands)

    de Boer, B.; Freitas, A.A.; Capcarrere, M.S.; Bentley, Peter J.; Johnson, Colin G.; Timmis, Jon

    2005-01-01

    This paper investigates the interaction between cultural evolution and biological evolution in the emergence of phonemic coding in speech. It is observed that our nearest relatives, the primates, use holistic utterances, whereas humans use phonemic utterances. It can therefore be argued that our

  11. OLIVE: Speech-Based Video Retrieval

    NARCIS (Netherlands)

    de Jong, Franciska M.G.; Gauvain, Jean-Luc; den Hartog, Jurgen; den Hartog, Jeremy; Netter, Klaus

    1999-01-01

    This paper describes the Olive project which aims to support automated indexing of video material by use of human language technologies. Olive is making use of speech recognition to automatically derive transcriptions of the sound tracks, generating time-coded linguistic elements which serve as the

  12. Freedom of Speech Wins in Wisconsin

    Science.gov (United States)

    Downs, Donald Alexander

    2006-01-01

    One might derive, from the eradication of a particularly heinous speech code, some encouragement that all is not lost in the culture wars. A core of dedicated scholars, working from within, made it obvious, to all but the most radical left, that imposing social justice by restricting thought and expression was a recipe for tyranny. Donald…

  13. Machine Learning

    CERN Document Server

    CERN. Geneva

    2017-01-01

    Machine learning, which builds on ideas in computer science, statistics, and optimization, focuses on developing algorithms to identify patterns and regularities in data, and using these learned patterns to make predictions on new observations. Boosted by its industrial and commercial applications, the field of machine learning is quickly evolving and expanding. Recent advances have seen great success in the realms of computer vision, natural language processing, and broadly in data science. Many of these techniques have already been applied in particle physics, for instance for particle identification, detector monitoring, and the optimization of computer resources. Modern machine learning approaches, such as deep learning, are only just beginning to be applied to the analysis of High Energy Physics data to approach more and more complex problems. These classes will review the framework behind machine learning and discuss recent developments in the field.

  14. Emotion identification using extremely low frequency components of speech feature contours.

    Science.gov (United States)

    Lin, Chang-Hong; Liao, Wei-Kai; Hsieh, Wen-Chi; Liao, Wei-Jiun; Wang, Jia-Ching

    2014-01-01

    Investigations of emotional speech identification can be divided into two main parts: features and classifiers. In this paper, how to extract an effective speech feature set for emotional speech identification is addressed. In our speech feature set, we use not only statistical analysis of frame-based acoustical features, but also approximated speech feature contours, which are obtained by extracting extremely low frequency components from the speech feature contours. Furthermore, principal component analysis (PCA) is applied to the approximated speech feature contours so that an efficient representation of approximated contours can be derived. The proposed speech feature set is fed into support vector machines (SVMs) to perform multiclass emotion identification. The experimental results demonstrate the performance of the proposed system with an 82.26% identification rate.
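    The contour-approximation and PCA steps can be sketched as below. The FFT-truncation smoothing, the number of retained components, and the random contours are illustrative assumptions, not the paper's exact procedure:

    ```python
    import numpy as np

    def approximate_contour(contour, keep=4):
        """Keep only the lowest `keep` frequency components of a feature
        contour (e.g. a frame-wise pitch track) via FFT truncation."""
        spec = np.fft.rfft(contour)
        spec[keep:] = 0
        return np.fft.irfft(spec, n=len(contour))

    def pca_reduce(X, n_comp=2):
        # Project rows of X onto the top principal components via SVD.
        Xc = X - X.mean(axis=0)
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        return Xc @ Vt[:n_comp].T

    rng = np.random.default_rng(1)
    contours = rng.standard_normal((10, 64))           # 10 toy utterances
    smooth = np.array([approximate_contour(c) for c in contours])
    features = pca_reduce(smooth, n_comp=2)            # compact representation
    ```

    The resulting low-dimensional features would be concatenated with the frame-based statistics and passed to the SVMs.
    
    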

  15. Delayed Speech or Language Development

    Science.gov (United States)

    ... child is right on schedule. Normal Speech & Language Development: It's important to discuss early speech and language ...

  16. Objects Control through Speech Recognition Using LabVIEW

    Directory of Open Access Journals (Sweden)

    Ankush Sharma

    2013-01-01

    Full Text Available Speech is the natural form of human communication, and speech processing is one of the most stimulating areas of signal processing. Speech recognition technology has made it possible for computers to follow human voice commands and understand human languages. This paper designs control of objects (LED, toggle switch, etc.) through human speech by combining virtual instrumentation technology and speech recognition techniques; password authentication is also provided. This is done with the help of LabVIEW programming concepts. A microphone is used to take voice commands from the human. The microphone signal interfaces with the LabVIEW code, which generates the appropriate control signals to control the objects. The entire work is done on the LabVIEW platform.

  17. Using the TED Talks to Evaluate Spoken Post-editing of Machine Translation

    DEFF Research Database (Denmark)

    Liyanapathirana, Jeevanthi; Popescu-Belis, Andrei

    2016-01-01

    This paper presents a solution to evaluate spoken post-editing of imperfect machine translation output by a human translator. We compare two approaches to the combination of machine translation (MT) and automatic speech recognition (ASR): a heuristic algorithm and a machine learning method...

  18. Modeling words with subword units in an articulatorily constrained speech recognition algorithm

    Energy Technology Data Exchange (ETDEWEB)

    Hogden, J.

    1997-11-20

    The goal of speech recognition is to find the most probable word given the acoustic evidence, i.e. a string of VQ codes or acoustic features. Speech recognition algorithms typically take advantage of the fact that the probability of a word, given a sequence of VQ codes, can be calculated.
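    That calculation can be illustrated with a toy Bayes decision over a unigram code model; the vocabulary, code probabilities and priors below are invented for the example, and real recognizers use HMMs rather than this independence assumption:

    ```python
    import math

    # Hypothetical per-word likelihoods of each VQ code, assuming codes are
    # independent given the word (a crude unigram model for illustration).
    code_probs = {
        "yes": {0: 0.6, 1: 0.3, 2: 0.1},
        "no":  {0: 0.1, 1: 0.2, 2: 0.7},
    }
    priors = {"yes": 0.5, "no": 0.5}

    def most_probable_word(codes):
        """argmax_w P(w) * prod_t P(code_t | w), computed in log space."""
        def log_score(word):
            return math.log(priors[word]) + sum(
                math.log(code_probs[word][c]) for c in codes)
        return max(code_probs, key=log_score)

    word = most_probable_word([0, 0, 1])
    ```

    Working in log space avoids underflow when the code string gets long.
    
    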

  19. Post-editing through Speech Recognition

    DEFF Research Database (Denmark)

    Mesa-Lao, Bartolomé

    In the past couple of years automatic speech recognition (ASR) software has quietly created a niche for itself in many areas of our lives. Nowadays it can be found at the other end of customer-support hotlines, it is built into operating systems and it is offered as an alternative text-input method for smartphones. On another front, given the significant improvements in Machine Translation (MT) quality and the increasing demand for translations, post-editing of MT is becoming a popular practice in the translation industry, since it has been shown to allow for larger volumes of translations... to be produced saving time and costs. The translation industry is at a deeply transformative point in its evolution and the coming years herald an era of convergence where speech technology could make a difference. As post-editing services are becoming a common practice among language service providers and speech...

  20. Bimodal Emotion Recognition from Speech and Text

    Directory of Open Access Journals (Sweden)

    Weilin Ye

    2014-01-01

    Full Text Available This paper presents an approach to emotion recognition from speech signals and textual content. In the analysis of speech signals, thirty-seven acoustic features are extracted from the speech input. Two different classifiers Support Vector Machines (SVMs and BP neural network are adopted to classify the emotional states. In text analysis, we use the two-step classification method to recognize the emotional states. The final emotional state is determined based on the emotion outputs from the acoustic and textual analyses. In this paper we have two parallel classifiers for acoustic information and two serial classifiers for textual information, and a final decision is made by combing these classifiers in decision level fusion. Experimental results show that the emotion recognition accuracy of the integrated system is better than that of either of the two individual approaches.
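    The decision-level fusion step can be sketched as a weighted average of the two modalities' posterior estimates; the weight and the emotion classes below are assumptions for illustration, not the paper's configuration:

    ```python
    import numpy as np

    def fuse_decisions(p_acoustic, p_text, w_acoustic=0.6):
        """Decision-level fusion: weighted average of per-class posterior
        estimates from the acoustic and textual classifiers, then argmax."""
        p = (w_acoustic * np.asarray(p_acoustic)
             + (1 - w_acoustic) * np.asarray(p_text))
        return int(np.argmax(p))

    emotions = ["neutral", "happy", "angry", "sad"]   # hypothetical classes
    p_a = [0.1, 0.6, 0.2, 0.1]   # acoustic classifier output
    p_t = [0.2, 0.3, 0.4, 0.1]   # text classifier output
    label = emotions[fuse_decisions(p_a, p_t)]
    ```

    The fusion weight would normally be tuned on held-out data to reflect the relative reliability of each modality.
    
    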

  1. 78 FR 49693 - Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...

    Science.gov (United States)

    2013-08-15

    ... COMMISSION 47 CFR Part 64 Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services; Telecommunications Relay Services and Speech-to-Speech Services for Individuals With... this document, the Commission amends telecommunications relay services (TRS) mandatory...

  2. Speech Recognition Supported by Lip Analysis

    OpenAIRE

    Butt, Waqqas ur Rehman

    2016-01-01

    Computers have become more pervasive than ever with a wide range of devices and multiple ways of interaction. Traditional ways of human computer interaction using keyboards, mice and display monitors are being replaced by more natural modes such as speech, touch, and gesture. The continuous progress of technology brings to an irreversible change of paradigms of interaction between human and machine. They are now used in daily life in many devices that have revolutionized the way users interac...

  3. Fifty years of progress in speech understanding systems

    Science.gov (United States)

    Zue, Victor

    2004-10-01

    Researchers working on human-machine interfaces realized nearly 50 years ago that automatic speech recognition (ASR) alone is not sufficient; one needs to impart linguistic knowledge to the system such that the signal could ultimately be understood. A speech understanding system combines speech recognition (i.e., the speech to symbols conversion) with natural language processing (i.e., the symbol to meaning transformation) to achieve understanding. Speech understanding research dates back to the DARPA Speech Understanding Project in the early 1970s. However, large-scale efforts only began in earnest in the late 1980s, with government research programs in the U.S. and Europe providing the impetus. This has resulted in many innovations, including novel approaches to natural language understanding (NLU) for speech input, and integration techniques for ASR and NLU. In the past decade, speech understanding systems have become major building blocks of conversational interfaces that enable users to access and manage information using spoken dialogue, incorporating language generation, discourse modeling, dialogue management, and speech synthesis. Today, we are at the threshold of developing multimodal interfaces, augmenting sound with sight and touch. This talk will highlight past work and speculate on the future. [Work supported by an industrial consortium of the MIT Oxygen Project.]

  4. Exploring Recurrence Properties of Vowels for Analysis of Emotions in Speech

    National Research Council Canada - National Science Library

    Angela Lombardi; Pietro Guccione; Cataldo Guaragnella

    2016-01-01

      Speech Emotion Recognition (SER) is a recent field of research that aims at identifying the emotional state of a speaker through a collection of machine learning and pattern recognition techniques...

  5. Speech and Language Impairments

    Science.gov (United States)

    ... impairment. Many children are identified as having a speech or language impairment after they enter the public school system. A teacher may notice difficulties in a child’s speech or communication skills and refer the child for ...

  6. Word pair classification during imagined speech using direct brain recordings

    Science.gov (United States)

    Martin, Stephanie; Brunner, Peter; Iturrate, Iñaki; Millán, José Del R.; Schalk, Gerwin; Knight, Robert T.; Pasley, Brian N.

    2016-05-01

    People who cannot communicate due to neurological disorders would benefit from an internal speech decoder. Here, we showed the ability to classify individual words during imagined speech from electrocorticographic signals. In a word imagery task, we used high gamma (70–150 Hz) time features with a support vector machine model to classify individual words from a pair of words. To account for temporal irregularities during speech production, we introduced a non-linear time alignment into the SVM kernel. Classification accuracy reached 88% in a two-class classification framework (50% chance level), and average classification accuracy across fifteen word-pairs was significant across five subjects (mean = 58%, p ...) ... perception and production. These data represent a proof of concept study for basic decoding of speech imagery, and delineate a number of key challenges to usage of speech imagery neural representations for clinical applications.
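    The non-linear time alignment can be illustrated with a classic dynamic-time-warping distance folded into a Gaussian-style similarity. This is a generic sketch, not the authors' exact kernel, and the simple construction below is not guaranteed to be positive definite:

    ```python
    import numpy as np

    def dtw_distance(a, b):
        """Dynamic time warping distance between two 1-D feature sequences,
        tolerating the temporal irregularities of speech production."""
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = abs(a[i - 1] - b[j - 1])
                # Best of insertion, deletion, or match on the warping path.
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return float(D[n, m])

    def alignment_kernel(a, b, gamma=0.1):
        # Turn the alignment distance into a similarity usable as an
        # (approximate) SVM kernel value.
        return float(np.exp(-gamma * dtw_distance(a, b)))
    ```

    A sequence compared with a time-stretched copy of itself gets distance zero, which is exactly the tolerance to speech-rate variation the paper needs.
    
    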

  7. FY05 LDRD Final Report Investigation of AAA+ protein machines that participate in DNA replication, recombination, and in response to DNA damage LDRD Project Tracking Code: 04-LW-049

    Energy Technology Data Exchange (ETDEWEB)

    Sawicka, D; de Carvalho-Kavanagh, M S; Barsky, D; Venclovas, C

    2006-12-04

    The AAA+ proteins are remarkable macromolecules that are able to self-assemble into nanoscale machines. These protein machines play critical roles in many cellular processes, including the processes that manage a cell's genetic material, but the mechanism at the molecular level has remained elusive. We applied computational molecular modeling, combined with advanced sequence analysis and available biochemical and genetic data, to structurally characterize eukaryotic AAA+ proteins and the protein machines they form. With these models we have examined intermolecular interactions in three-dimensions (3D), including both interactions between the components of the AAA+ complexes and the interactions of these protein machines with their partners. These computational studies have provided new insights into the molecular structure and the mechanism of action for AAA+ protein machines, thereby facilitating a deeper understanding of processes involved in DNA metabolism.

  8. Speech 7 through 12.

    Science.gov (United States)

    Nederland Independent School District, TX.

    GRADES OR AGES: Grades 7 through 12. SUBJECT MATTER: Speech. ORGANIZATION AND PHYSICAL APPEARANCE: Following the foreword, philosophy and objectives, this guide presents a speech curriculum. The curriculum covers junior high and Speech I, II, III (senior high). Thirteen units of study are presented for junior high; each unit is divided into…

  9. [Prosody, speech input and language acquisition].

    Science.gov (United States)

    Jungheim, M; Miller, S; Kühn, D; Ptok, M

    2014-04-01

    In order to acquire language, children require speech input. The prosody of the speech input plays an important role. In most cultures adults modify their code when communicating with children. Compared to normal speech, this code differs especially with regard to prosody. For this review, a selective literature search in PubMed and Scopus was performed. Prosodic characteristics are a key feature of spoken language. By analysing prosodic features, children gain knowledge about underlying grammatical structures. Child-directed speech (CDS) is modified in a way that meaningful sequences are highlighted acoustically so that important information can be extracted from the continuous speech flow more easily. CDS is said to enhance the representation of linguistic signs. Taking into consideration what has previously been described in the literature regarding the perception of suprasegmentals, CDS seems to be able to support language acquisition due to the correspondence of prosodic and syntactic units. However, no findings have been reported stating that linguistically reduced CDS could hinder first language acquisition.

  10. Laser Marked Codes For Paperless Tracking Applications

    Science.gov (United States)

    Crater, David

    1987-01-01

    The application of laser markers for marking machine readable codes is described. Use of such codes for automatic tracking and considerations for marker performance and features are discussed. Available laser marker types are reviewed. Compatibility of laser/material combinations and material/code/reader systems are reviewed.

  11. Machine Learning

    Energy Technology Data Exchange (ETDEWEB)

    Chikkagoudar, Satish; Chatterjee, Samrat; Thomas, Dennis G.; Carroll, Thomas E.; Muller, George

    2017-04-21

    The absence of a robust and unified theory of cyber dynamics presents challenges and opportunities for using machine learning based data-driven approaches to further the understanding of the behavior of such complex systems. Analysts can also use machine learning approaches to gain operational insights. In order to be operationally beneficial, cybersecurity machine learning based models need to have the ability to: (1) represent a real-world system, (2) infer system properties, and (3) learn and adapt based on expert knowledge and observations. Probabilistic models and probabilistic graphical models provide these necessary properties and are further explored in this chapter. Bayesian Networks and Hidden Markov Models are introduced as examples of widely used data-driven classification/modeling strategies.
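    As a concrete example of the Hidden Markov Models mentioned above, here is a minimal forward-algorithm sketch that scores an observation sequence under an invented two-state model (all parameters are illustrative):

    ```python
    import numpy as np

    def forward_likelihood(pi, A, B, obs):
        """HMM forward algorithm: P(observation sequence | model).
        pi: initial state distribution; A: state transitions;
        B[s, o]: probability that state s emits symbol o."""
        alpha = pi * B[:, obs[0]]
        for o in obs[1:]:
            alpha = (alpha @ A) * B[:, o]   # propagate and re-weight
        return float(alpha.sum())

    # Toy 2-state model whose states emit one of two symbols.
    pi = np.array([0.5, 0.5])
    A = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
    B = np.array([[0.8, 0.2],
                  [0.3, 0.7]])
    p = forward_likelihood(pi, A, B, [0, 0, 1])
    ```

    Summing the likelihoods over all possible observation sequences of a fixed length recovers 1, a handy sanity check on the recursion.
    
    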

  12. Wideband Speech Recovery Using Psychoacoustic Criteria

    Directory of Open Access Journals (Sweden)

    Visar Berisha

    2007-08-01

    Full Text Available Many modern speech bandwidth extension techniques predict the high-frequency band based on features extracted from the lower band. While this method works for certain types of speech, problems arise when the correlation between the low and the high bands is not sufficient for adequate prediction. These situations require that additional high-band information is sent to the decoder. This overhead information, however, can be cleverly quantized using human auditory system models. In this paper, we propose a novel speech compression method that relies on bandwidth extension. The novelty of the technique lies in an elaborate perceptual model that determines a quantization scheme for wideband recovery and synthesis. Furthermore, a source/filter bandwidth extension algorithm based on spectral spline fitting is proposed. Results reveal that the proposed system improves the quality of narrowband speech while performing at a lower bitrate. When compared to other wideband speech coding schemes, the proposed algorithms provide comparable speech quality at a lower bitrate.
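    The spline-fitting idea can be illustrated with a crude stand-in: fit a cubic spline to a low-band log-spectral envelope and extrapolate it over the missing high band. The bin counts and envelope values are invented, and the paper's actual source/filter scheme is considerably more elaborate:

    ```python
    import numpy as np
    from scipy.interpolate import CubicSpline

    def extend_envelope(low_band_db, n_high):
        """Fit a cubic spline to the low-band log-spectral envelope and
        extrapolate it over the missing high-band bins (a crude stand-in
        for a spline-based bandwidth-extension scheme)."""
        x = np.arange(len(low_band_db))
        spline = CubicSpline(x, low_band_db)
        x_high = np.arange(len(low_band_db), len(low_band_db) + n_high)
        return spline(x_high)

    # Sloping toy envelope: extrapolation continues the downward trend,
    # a typical assumption for speech spectra above the telephone band.
    low = -0.5 * np.arange(16)          # dB values over 16 low-band bins
    high = extend_envelope(low, 8)
    ```

    In a real coder this extrapolated envelope would shape an excitation signal for the high band, with the perceptually quantized side information correcting the prediction.
    
    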

  13. Wideband Speech Recovery Using Psychoacoustic Criteria

    Directory of Open Access Journals (Sweden)

    Berisha Visar

    2007-01-01

    Full Text Available Many modern speech bandwidth extension techniques predict the high-frequency band based on features extracted from the lower band. While this method works for certain types of speech, problems arise when the correlation between the low and the high bands is not sufficient for adequate prediction. These situations require that additional high-band information is sent to the decoder. This overhead information, however, can be cleverly quantized using human auditory system models. In this paper, we propose a novel speech compression method that relies on bandwidth extension. The novelty of the technique lies in an elaborate perceptual model that determines a quantization scheme for wideband recovery and synthesis. Furthermore, a source/filter bandwidth extension algorithm based on spectral spline fitting is proposed. Results reveal that the proposed system improves the quality of narrowband speech while performing at a lower bitrate. When compared to other wideband speech coding schemes, the proposed algorithms provide comparable speech quality at a lower bitrate.

  14. Objective voice and speech analysis of persons with chronic hoarseness by prosodic analysis of speech samples.

    Science.gov (United States)

    Haderlein, Tino; Döllinger, Michael; Matoušek, Václav; Nöth, Elmar

    2016-10-01

    Automatic voice assessment is often performed using sustained vowels. In contrast, speech analysis of read-out texts can be applied to voice and speech assessment. Automatic speech recognition and prosodic analysis were used to find regression formulae between automatic and perceptual assessment of four voice and four speech criteria. The regression was trained with 21 men and 62 women (average age 49.2 years) and tested with another set of 24 men and 49 women (48.3 years), all suffering from chronic hoarseness. They read the text 'Der Nordwind und die Sonne' ('The North Wind and the Sun'). Five voice and speech therapists evaluated the data on 5-point Likert scales. Ten prosodic and recognition accuracy measures (features) were identified which describe all the examined criteria. Inter-rater correlation within the expert group was between r = 0.63 for the criterion 'match of breath and sense units' and r = 0.87 for the overall voice quality. Human-machine correlation was between r = 0.40 for the match of breath and sense units and r = 0.82 for intelligibility. The perceptual ratings of different criteria were highly correlated with each other. Likewise, the feature sets modeling the criteria were very similar. The automatic method is suitable for assessing chronic hoarseness in general and for subgroups of functional and organic dysphonia. In its current version, it is almost as reliable as a randomly picked rater from a group of voice and speech therapists.
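    The human-machine agreement figures quoted above are Pearson correlation coefficients; a minimal sketch, with invented machine scores and mean expert ratings:

    ```python
    import numpy as np

    def pearson_r(x, y):
        """Pearson correlation between automatic scores and the mean of
        the experts' Likert ratings (the usual human-machine agreement
        statistic)."""
        x, y = np.asarray(x, float), np.asarray(y, float)
        xc, yc = x - x.mean(), y - y.mean()
        return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

    machine = [2.1, 3.4, 1.0, 4.2, 3.0]   # hypothetical automatic scores
    experts = [2.0, 3.0, 1.5, 4.5, 2.8]   # hypothetical mean ratings
    r = pearson_r(machine, experts)
    ```

    The same statistic computed between pairs of raters gives the inter-rater correlations the abstract compares against.
    
    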

  15. Speech in spinocerebellar ataxia.

    Science.gov (United States)

    Schalling, Ellika; Hartelius, Lena

    2013-12-01

    Spinocerebellar ataxias (SCAs) are a heterogeneous group of autosomal dominant cerebellar ataxias clinically characterized by progressive ataxia, dysarthria and a range of other concomitant neurological symptoms. Only a few studies include detailed characterization of speech symptoms in SCA. Speech symptoms in SCA resemble ataxic dysarthria but symptoms related to phonation may be more prominent. One study to date has shown an association between differences in speech and voice symptoms related to genotype. More studies of speech and voice phenotypes are motivated, to possibly aid in clinical diagnosis. In addition, instrumental speech analysis has been demonstrated to be a reliable measure that may be used to monitor disease progression or therapy outcomes in possible future pharmacological treatments. Intervention by speech and language pathologists should go beyond assessment. Clinical guidelines for management of speech, communication and swallowing need to be developed for individuals with progressive cerebellar ataxia.

  16. Speech parts as Poisson processes.

    Science.gov (United States)

    Badalamenti, A F

    2001-09-01

    This paper presents evidence that six of the seven parts of speech occur in written text as Poisson processes, simple or recurring. The six major parts are nouns, verbs, adjectives, adverbs, prepositions, and conjunctions, with the interjection occurring too infrequently to support a model. The data consist of more than the first 5000 words of works by four major authors, coded to label the parts of speech, as well as periods (sentence terminators). Sentence length is measured via the period and found to be normally distributed, with no stochastic model identified for its occurrence. The models for all six speech parts but the noun significantly distinguish some pairs of authors, as does the joint use of all word types. Any one author is significantly distinguished from any other by at least one word type, and sentence length very significantly distinguishes each from all others. The variety of word-type use, measured by Shannon entropy, builds to about 90% of its maximum possible value. The rate constants for nouns are close to the fractions of maximum entropy achieved. This finding, together with the stochastic models and the relations among them, suggests that the noun may be a primitive organizer of written text.
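    The paper's two headline measurements can be imitated on synthetic data: a Poisson-style dispersion check on part-of-speech counts, and the Shannon-entropy fraction of word-type use. The tag probabilities below are invented for illustration, not taken from the author corpora:

```python
import math
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for a part-of-speech-coded text: 5000 tokens over six
# word types with invented probabilities (the real data are author corpora).
tags = ["noun", "verb", "adj", "adv", "prep", "conj"]
probs = [0.30, 0.22, 0.15, 0.13, 0.10, 0.10]
text = rng.choice(tags, size=5000, p=probs)

# Poisson-style check: for an infrequent part of speech, counts in disjoint
# blocks should have variance close to their mean (dispersion index near 1).
counts = np.array([np.sum(text[i:i + 100] == "conj")
                   for i in range(0, len(text), 100)])
dispersion = counts.var() / counts.mean()

# Shannon entropy of word-type use, as a fraction of the log2(6) maximum.
freqs = np.array([np.mean(text == t) for t in tags])
entropy = -np.sum(freqs * np.log2(freqs))
fraction = entropy / math.log2(len(tags))
print(round(dispersion, 2), round(fraction, 2))
```
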

  17. Digital speech processing using Matlab

    CERN Document Server

    Gopi, E S

    2014-01-01

    Digital Speech Processing Using Matlab deals with digital speech pattern recognition, speech production model, speech feature extraction, and speech compression. The book is written in a manner that is suitable for beginners pursuing basic research in digital speech processing. Matlab illustrations are provided for most topics to enable better understanding of concepts. This book also deals with the basic pattern recognition techniques (illustrated with speech signals using Matlab) such as PCA, LDA, ICA, SVM, HMM, GMM, BPN, and KSOM.

  18. Machine testing

    DEFF Research Database (Denmark)

    De Chiffre, Leonardo

    This document is used in connection with a laboratory exercise of 3 hours duration as a part of the course GEOMETRICAL METROLOGY AND MACHINE TESTING. The exercise includes a series of tests carried out by the student on a conventional and a numerically controlled lathe, respectively. This document...

  19. Representational Machines

    DEFF Research Database (Denmark)

    Petersson, Dag; Dahlgren, Anna; Vestberg, Nina Lager

    to the enterprises of the medium. This is the subject of Representational Machines: How photography enlists the workings of institutional technologies in search of establishing new iconic and social spaces. Together, the contributions to this edited volume span historical epochs, social environments, technological...

  20. Commercial applications of speech interface technology: an industry at the threshold.

    Science.gov (United States)

    Oberteuffer, J A

    1995-10-24

    Speech interface technology, which includes automatic speech recognition, synthetic speech, and natural language processing, is beginning to have a significant impact on business and personal computer use. Today, powerful and inexpensive microprocessors and improved algorithms are driving commercial applications in computer command, consumer, data entry, speech-to-text, telephone, and voice verification. Robust speaker-independent recognition systems for command and navigation in personal computers are now available; telephone-based transaction and database inquiry systems using both speech synthesis and recognition are coming into use. Large-vocabulary speech interface systems for document creation and read-aloud proofing are expanding beyond niche markets. Today's applications represent a small preview of a rich future for speech interface technology that will eventually replace keyboards with microphones and loud-speakers to give easy accessibility to increasingly intelligent machines.

  1. Towards Artificial Speech Therapy: A Neural System for Impaired Speech Segmentation.

    Science.gov (United States)

    Iliya, Sunday; Neri, Ferrante

    2016-09-01

    This paper presents a neural system-based technique for segmenting short impaired speech utterances into silent, unvoiced, and voiced sections. Moreover, the proposed technique identifies those points of the (voiced) speech where the spectrum becomes steady. The resulting technique thus aims at detecting that limited section of the speech which contains the information about the potential impairment of the speech. This section is of interest to the speech therapist as it corresponds to the possibly incorrect movements of speech organs (lower lip and tongue with respect to the vocal tract). Two segmentation models to detect and identify the various sections of the disordered (impaired) speech signals have been developed and compared. The first makes use of a combination of four artificial neural networks. The second is based on a support vector machine (SVM). The SVM has been trained by means of an ad hoc nested algorithm whose outer layer is a metaheuristic while the inner layer is a convex optimization algorithm. Several metaheuristics have been tested and compared, leading to the conclusion that some variants of the compact differential evolution (CDE) algorithm appear to be well-suited to address this problem. Numerical results show that the SVM model with a radial basis function is capable of effective detection of the portion of speech that is of interest to a therapist. The best performance has been achieved when the system is trained by the nested algorithm whose outer layer is hybrid-population-based/CDE. A population-based approach displays the best performance for the isolation of silence/noise sections and the detection of unvoiced sections. On the other hand, a compact approach appears to be clearly well-suited to detect the beginning of the steady state of the voiced signal. Both of the proposed segmentation models outperformed two modern segmentation techniques based on Gaussian mixture models and deep learning.
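    A minimal sketch of the silent/unvoiced/voiced segmentation task, using short-time energy and zero-crossing rate with hand-set thresholds rather than the paper's ANN/SVM models. The synthetic signals and thresholds are assumptions chosen only to make the three classes visible:

```python
import numpy as np

rng = np.random.default_rng(2)
fs = 8000

# Synthetic utterance: silence, then unvoiced (noise-like), then voiced
# (periodic) audio, 0.25 s each. A stand-in for a short impaired utterance.
silence = 0.001 * rng.normal(size=fs // 4)
unvoiced = 0.1 * rng.normal(size=fs // 4)
t = np.arange(fs // 4) / fs
voiced = 0.5 * np.sin(2 * np.pi * 150 * t)
signal = np.concatenate([silence, unvoiced, voiced])

def label_frames(x, frame=200):
    """Three-way segmentation from short-time energy and zero-crossing rate.

    A simplified threshold baseline, not the paper's ANN/SVM system:
    low energy -> silent; high ZCR -> unvoiced; otherwise voiced.
    """
    labels = []
    for i in range(0, len(x) - frame + 1, frame):
        f = x[i:i + frame]
        energy = np.mean(f ** 2)
        zcr = np.mean(np.abs(np.diff(np.sign(f))) > 0)
        if energy < 1e-4:
            labels.append("silent")
        elif zcr > 0.25:
            labels.append("unvoiced")
        else:
            labels.append("voiced")
    return labels

labels = label_frames(signal)
print(labels[0], labels[12], labels[25])  # silent unvoiced voiced
```
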

  2. Adding machine and calculating machine

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    In 1642 the French mathematician Blaise Pascal (1623-1662) invented a machine that could add and subtract. It had wheels that each had 1 to 10 marked off along its circumference. When the wheel at the right, representing units, made one complete circle, it engaged the wheel to its left, representing tens, and moved it forward one notch.
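    The carry mechanism described above can be sketched in a few lines; the class and method names are illustrative, not from any historical source:

```python
# Minimal sketch of Pascal-style counting wheels: each wheel holds one
# decimal digit, and a full revolution advances the wheel to its left
# (the ripple carry).
class AddingMachine:
    def __init__(self, wheels=6):
        self.digits = [0] * wheels  # rightmost entry is the units wheel

    def turn(self, wheel, notches):
        """Advance one wheel by some notches, rippling carries leftward."""
        total = self.digits[-1 - wheel] + notches
        self.digits[-1 - wheel] = total % 10
        if total >= 10 and wheel + 1 < len(self.digits):
            self.turn(wheel + 1, total // 10)

    def add(self, number):
        for wheel, digit in enumerate(reversed(str(number))):
            self.turn(wheel, int(digit))

    def value(self):
        return int("".join(map(str, self.digits)))

m = AddingMachine()
m.add(789)
m.add(345)
print(m.value())  # 1134
```
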

  3. Speech-to-Speech Relay Service

    Science.gov (United States)

    ... to make an STS call. You are then connected to an STS CA who will repeat your spoken words, making the spoken words clear to the other party. Persons with speech disabilities may also receive STS calls. The calling ...

  4. Exploration of Speech Planning and Producing by Speech Error Analysis

    Institute of Scientific and Technical Information of China (English)

    冷卉

    2012-01-01

    Speech error analysis is an indirect way to discover speech planning and producing processes. From some speech errors made by people in their daily life, linguists and learners can reveal the planning and producing processes more easily and clearly.

  5. Genesis machines

    CERN Document Server

    Amos, Martyn

    2014-01-01

    Silicon chips are out. Today's scientists are using real, wet, squishy, living biology to build the next generation of computers. Cells, gels and DNA strands are the 'wetware' of the twenty-first century. Much smaller and more intelligent, these organic computers open up revolutionary possibilities. Tracing the history of computing and revealing a brave new world to come, Genesis Machines describes how this new technology will change the way we think not just about computers - but about life itself.

  6. The Implications of Virtual Machine Introspection for Digital Forensics on Nonquiescent Virtual Machines

    Science.gov (United States)

    2011-06-01

    Report documentation page residue. Recoverable subject terms: Virtual Machine Introspection, VMI, Virtual Machine, Forensics, Xen; 61 pages. The listed figures and tables refer to the time taken for the VMI memory dump utility to run with different loads, and to kernel compile times with and without VMI (Tables 1 and 2).

  7. Indirect Speech Acts

    Institute of Scientific and Technical Information of China (English)

    李威

    2001-01-01

    Indirect speech acts are frequently used in verbal communication, and their interpretation is of great importance in meeting the demands of developing students' communicative competence. This paper therefore presents Searle's account of indirect speech acts and explores how indirect speech acts are interpreted in accordance with two influential theories. It consists of four parts. Part one gives a general introduction to the notion of speech act theory. Part two elaborates on the conception of indirect speech act theory proposed by Searle and his supplement and development of illocutionary acts. Part three deals with the interpretation of indirect speech acts. Part four draws implications from the previous study and serves as the conclusion of the dissertation.

  8. Untyped Memory in the Java Virtual Machine

    DEFF Research Database (Denmark)

    Gal, Andreas; Probst, Christian; Franz, Michael

    2005-01-01

    We have implemented a virtual execution environment that executes legacy binary code on top of the type-safe Java Virtual Machine by recompiling native code instructions to type-safe bytecode. As it is essentially impossible to infer static typing into untyped machine code, our system emulates...... untyped memory on top of Java’s type system. While this approach allows to execute native code on any off-the-shelf JVM, the resulting runtime performance is poor. We propose a set of virtual machine extensions that add type-unsafe memory objects to JVM. We contend that these JVM extensions do not relax...... Java’s type system as the same functionality can be achieved in pure Java, albeit much less efficiently....

  9. Neural mechanisms underlying auditory feedback control of speech.

    Science.gov (United States)

    Tourville, Jason A; Reilly, Kevin J; Guenther, Frank H

    2008-02-01

    The neural substrates underlying auditory feedback control of speech were investigated using a combination of functional magnetic resonance imaging (fMRI) and computational modeling. Neural responses were measured while subjects spoke monosyllabic words under two conditions: (i) normal auditory feedback of their speech and (ii) auditory feedback in which the first formant frequency of their speech was unexpectedly shifted in real time. Acoustic measurements showed compensation to the shift within approximately 136 ms of onset. Neuroimaging revealed increased activity in bilateral superior temporal cortex during shifted feedback, indicative of neurons coding mismatches between expected and actual auditory signals, as well as right prefrontal and Rolandic cortical activity. Structural equation modeling revealed increased influence of bilateral auditory cortical areas on right frontal areas during shifted speech, indicating that projections from auditory error cells in posterior superior temporal cortex to motor correction cells in right frontal cortex mediate auditory feedback control of speech.

  10. Human Emotion Recognition From Speech

    Directory of Open Access Journals (Sweden)

    Miss. Aparna P. Wanare

    2014-07-01

    Full Text Available Speech Emotion Recognition is a recent research topic in the Human Computer Interaction (HCI) field. The need has arisen for a more natural communication interface between humans and computers, as computers have become an integral part of our lives. A lot of work is currently going on to improve the interaction between humans and computers. To achieve this goal, a computer would have to be able to distinguish its present situation and respond differently depending on that observation. Part of this process involves understanding a user's emotional state. To make human-computer interaction more natural, the objective is that the computer should be able to recognize emotional states in the same way a human does. The efficiency of an emotion recognition system depends on the type of features extracted and the classifier used for detection of emotions. The proposed system aims at identification of basic emotional states such as anger, joy, neutral and sadness from human speech. While classifying different emotions, features like MFCC (Mel Frequency Cepstral Coefficients) and energy are used. In this paper, a standard emotional database, i.e. an English database, is used, which gives more satisfactory detection of emotions than recorded samples of emotions. This methodology describes and compares the performances of Learning Vector Quantization Neural Networks (LVQ NN), Multiclass Support Vector Machines (SVM) and their combination for emotion recognition.
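    A toy version of the MFCC-plus-energy front end named above (framing, Hamming window, power spectrum, mel filterbank, log, DCT) can be sketched as follows. All sizes, the test tone, and the filter count are assumptions, and the LVQ/SVM classifiers are omitted:

```python
import numpy as np

def mfcc(signal, fs=16000, n_fft=512, hop=160, n_mels=20, n_ceps=13):
    """Toy MFCC extractor (a sketch, not a tuned production front end):
    framing -> Hamming window -> power spectrum -> mel filterbank ->
    log -> DCT-II. Returns (frames, n_ceps) cepstra plus log frame energy."""
    frames = []
    for i in range(0, len(signal) - n_fft + 1, hop):
        frames.append(signal[i:i + n_fft] * np.hamming(n_fft))
    frames = np.array(frames)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2

    # Triangular mel filterbank between 0 Hz and fs/2.
    def hz_to_mel(f):
        return 2595 * np.log10(1 + f / 700)
    def mel_to_hz(m):
        return 700 * (10 ** (m / 2595) - 1)
    mel_pts = np.linspace(0, hz_to_mel(fs / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / fs).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    logmel = np.log(power @ fbank.T + 1e-10)

    # DCT-II matrix for the cepstral transform.
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_mels))
    ceps = logmel @ dct.T
    energy = np.log(np.mean(frames ** 2, axis=1) + 1e-10)
    return ceps, energy

t = np.arange(16000) / 16000            # 1 s synthetic test tone
ceps, energy = mfcc(np.sin(2 * np.pi * 440 * t))
print(ceps.shape, energy.shape)
```
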

  11. Esophageal speeches modified by the Speech Enhancer Program®

    OpenAIRE

    Manochiopinig, Sriwimon; Boonpramuk, Panuthat

    2014-01-01

    Esophageal speech appears to be the first choice of speech treatment after laryngectomy. However, many laryngectomized people are unable to speak well. The aim of this study was to evaluate the post-modified speech quality of Thai esophageal speakers using the Speech Enhancer Program®. The method adopted was to approach five speech–language pathologists to assess the speech accuracy and intelligibility of the words and continuing speech of the seven laryngectomized people. A comparison study was conduc...

  12. Coding Partitions

    Directory of Open Access Journals (Sweden)

    Fabio Burderi

    2007-05-01

    Full Text Available Motivated by the study of decipherability conditions for codes weaker than Unique Decipherability (UD), we introduce the notion of coding partition. Such a notion generalizes that of a UD code and, for codes that are not UD, allows one to recover "unique decipherability" at the level of the classes of the partition. By taking into account the natural order between the partitions, we define the characteristic partition of a code X as the finest coding partition of X. This leads us to introduce the canonical decomposition of a code into at most one unambiguous component and other (if any) totally ambiguous components. In the case where the code is finite, we give an algorithm for computing its canonical partition. This, in particular, allows one to decide whether a given partition of a finite code X is a coding partition. This last problem is then approached in the case where the code is a rational set. We prove its decidability under the hypothesis that the partition contains a finite number of classes and each class is a rational set. Moreover, we conjecture that the canonical partition satisfies such a hypothesis. Finally, we also consider some relationships between coding partitions and varieties of codes.
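    For context on the baseline notion being weakened, unique decipherability of a finite code can be checked with the classical Sardinas-Patterson procedure (a standard algorithm, not one from this paper): iterate sets of dangling suffixes, and the code is UD exactly when the empty word never appears.

```python
# Sardinas-Patterson test for Unique Decipherability (UD): a code is UD
# iff no set of dangling suffixes ever contains the empty word.
def is_uniquely_decipherable(code):
    code = set(code)
    # Initial dangling suffixes: remainders of one codeword after another.
    suffixes = {x[len(y):] for x in code for y in code
                if x != y and x.startswith(y)}
    seen = set()
    while suffixes and "" not in suffixes and not suffixes <= seen:
        seen |= suffixes
        nxt = set()
        for w in suffixes:
            for c in code:
                if w.startswith(c):       # includes w == c, yielding ""
                    nxt.add(w[len(c):])
                if c.startswith(w) and c != w:
                    nxt.add(c[len(w):])
        suffixes = nxt
    return "" not in suffixes

print(is_uniquely_decipherable({"0", "10", "110"}))  # True (a prefix code)
print(is_uniquely_decipherable({"1", "011", "01110", "1110", "10011"}))  # False
```

The second example is the classic non-UD code: the string 011101110011 admits two distinct factorizations.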

  13. Noise Feedback Coding Revisited: Refurbished Legacy Codecs and New Coding Models

    Institute of Scientific and Technical Information of China (English)

    Stephane Ragot; Balazs Kovesi; Alain Le Guyader

    2012-01-01

    Noise feedback coding (NFC) has attracted renewed interest with the recent standardization of backward-compatible enhancements for ITU-T G.711 and G.722. It has also been revisited with the emergence of proprietary speech codecs, such as BV16, BV32, and SILK, that have structures different from CELP coding. In this article, we review NFC and describe a novel coding technique that optimally shapes coding noise in embedded pulse-code modulation (PCM) and embedded adaptive differential PCM (ADPCM). We describe how this new technique was incorporated into the recent ITU-T G.711.1, G.711 App. III, and G.722 Annex B (G.722B) speech-coding standards.
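    The principle behind noise feedback coding (feeding the past quantization error back into the quantizer input so that the reconstruction error is spectrally shaped) can be sketched with a first-order loop. This illustrates the idea only, not the G.711.1 or G.722B algorithms; the quantizer step and feedback coefficient are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(3)

def quantize(u, step=0.1):
    return step * np.round(u / step)

def nfc_encode(x, b=1.0, step=0.1):
    """First-order noise feedback loop (a sketch of the principle, not a
    standardized codec): the previous quantization error is fed back, so
    the reconstruction error y - x equals e[n] - b*e[n-1], i.e. the
    quantization noise is shaped by (1 - b z^-1)."""
    y = np.zeros_like(x)
    e_prev = 0.0
    for n in range(len(x)):
        u = x[n] - b * e_prev      # subtract the fed-back past error
        y[n] = quantize(u, step)
        e_prev = y[n] - u          # instantaneous quantizer error
    return y

x = rng.uniform(-1, 1, 4096)
y = nfc_encode(x)
err = np.abs(np.fft.rfft(y - x)) ** 2
low, high = err[: len(err) // 4].sum(), err[-(len(err) // 4):].sum()
print(high > low)  # coding noise is pushed toward high frequencies
```

With b = 1 the error spectrum follows |1 - e^{-j..}|^2, so the noise energy concentrates at high frequencies, where hearing is less sensitive.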

  14. Ear, Hearing and Speech

    DEFF Research Database (Denmark)

    Poulsen, Torben

    2000-01-01

    An introduction is given to the anatomy and the function of the ear, basic psychoacoustic matters (hearing threshold, loudness, masking), the speech signal and speech intelligibility. The lecture note is written for the course: Fundamentals of Acoustics and Noise Control (51001)...

  15. Advances in Speech Recognition

    CERN Document Server

    Neustein, Amy

    2010-01-01

    This volume is comprised of contributions from eminent leaders in the speech industry, and presents a comprehensive and in-depth analysis of the progress of speech technology in the topical areas of mobile settings, healthcare and call centers. The material addresses the technical aspects of voice technology within the framework of societal needs, such as the use of speech recognition software to produce up-to-date electronic health records, notwithstanding patients making changes to health plans and physicians. Included will be discussion of speech engineering, linguistics, human factors ana

  16. Simulating Turing machines on Maurer machines

    NARCIS (Netherlands)

    Bergstra, J.A.; Middelburg, C.A.

    2008-01-01

    In a previous paper, we used Maurer machines to model and analyse micro-architectures. In the current paper, we investigate the connections between Turing machines and Maurer machines with the purpose of gaining insight into computability issues relating to Maurer machines. We introduce ways to

  17. Marital conflict and adjustment: speech nonfluencies in intimate disclosure.

    Science.gov (United States)

    Paul, E L; White, K M; Speisman, J C; Costos, D

    1988-06-01

    Speech nonfluency in response to questions about the marital relationship was used to assess anxiety. Subjects were 31 husbands and 31 wives, all white, college educated, from middle- to lower-middle-class families, and ranging from 20 to 30 years of age. Three types of nonfluencies were coded: filled pauses, unfilled pauses, and repetitions. Speech-disturbance ratios were computed by dividing the sum of speech nonfluencies by the total words spoken. The results support the notion that some issues within marriage are more sensitive and/or problematic than others, and that, in an interview situation, gender interacts with question content in the production of nonfluencies.
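    The speech-disturbance ratio described above is straightforward to compute. The pause and filler markers below are illustrative transcription conventions, not the study's actual coding scheme:

```python
# Sketch of the speech-disturbance ratio: (filled pauses + unfilled pauses
# + repetitions) / total words spoken. Marker conventions are assumed.
FILLED_PAUSES = {"uh", "um", "er"}

def disturbance_ratio(tokens):
    """Sum the three coded nonfluency types and divide by words spoken.
    Unfilled pauses are marked '(pause)' and do not count as words."""
    filled = sum(t in FILLED_PAUSES for t in tokens)
    unfilled = sum(t == "(pause)" for t in tokens)
    repeats = sum(a == b for a, b in zip(tokens, tokens[1:])
                  if a not in FILLED_PAUSES and a != "(pause)")
    words = [t for t in tokens if t != "(pause)"]
    return (filled + unfilled + repeats) / len(words)

sample = "well um we we never (pause) argue about money".split()
print(round(disturbance_ratio(sample), 3))  # 0.375 = 3 nonfluencies / 8 words
```
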

  18. Environmentally Friendly Machining

    CERN Document Server

    Dixit, U S; Davim, J Paulo

    2012-01-01

    Environment-Friendly Machining provides an in-depth overview of environmentally-friendly machining processes, covering numerous different types of machining in order to identify which practice is the most environmentally sustainable. The book discusses three systems at length: machining with minimal cutting fluid, air-cooled machining and dry machining. Also covered is a way to conserve energy during machining processes, along with useful data and detailed descriptions for developing and utilizing the most efficient modern machining tools. Researchers and engineers looking for sustainable machining solutions will find Environment-Friendly Machining to be a useful volume.

  19. Speech-Language Therapy (For Parents)

    Science.gov (United States)

    ... with speech and/or language disorders. Speech Disorders, Language Disorders, and Feeding Disorders: A speech disorder refers ...

  20. Machine Transliteration

    CERN Document Server

    Knight, K; Knight, Kevin; Graehl, Jonathan

    1997-01-01

    It is challenging to translate names and technical terms across languages with different alphabets and sound inventories. These items are commonly transliterated, i.e., replaced with approximate phonetic equivalents. For example, "computer" in English comes out as "konpyuutaa" in Japanese. Translating such items from Japanese back to English is even more challenging, and of practical interest, as transliterated items make up the bulk of text phrases not found in bilingual dictionaries. We describe and evaluate a method for performing backwards transliterations by machine. This method uses a generative model, incorporating several distinct stages in the transliteration process.

  1. Automatic Teller Machine for disable users and its security issues

    OpenAIRE

    Ali, Liaqat; Jahankhani, Hamid; Jahankhani, Hossein

    2007-01-01

    Automatic Teller Machines are highly beneficial in the banking industry. The banking industry forcefully promotes the use of ATM cards. In spite of the success and extensive use of Automatic Teller Machines, a large percentage of bank clients cannot use them, and it has been noted that no importance has been given to the feature of accessibility at all. A significant number of disabled users in the banking industry have experienced many difficulties in their interaction with these ATMs. Speech technolo...

  2. Plug into 'the modernizing machine'!

    DEFF Research Database (Denmark)

    Krejsler, John B.

    2013-01-01

    ‘The modernizing machine’ codes individual bodies, things and symbols with images from New Public Management, neoliberal and Knowledge Economy discourses. Drawing on Deleuze & Guattari’s concept of machines, this article explores how ‘the modernizing machine’ produces neo-liberal modernization...... of the public sector. Taking its point of departure in Danish university reform, the article explores how the university is transformed by this desiring-producing machine. ‘The modernizing machine’ wrestles with the so-called ‘democratic-Humboldtian machine’. The University Act of 2003 and the host of reforms...... bodies and minds simultaneously produce academic subjectivities by plugging into these transformative machinic forces and are produced as they are traversed by them. What is experienced as stressful closures vis-à-vis new opportunities depends to a great extent upon how these producing...

  3. Unvoiced Speech Recognition Using Tissue-Conductive Acoustic Sensor

    Directory of Open Access Journals (Sweden)

    Heracleous Panikos

    2007-01-01

    Full Text Available We present the use of stethoscope and silicon NAM (nonaudible murmur) microphones in automatic speech recognition. NAM microphones are special acoustic sensors, which are attached behind the talker's ear and can capture not only normal (audible) speech, but also very quietly uttered speech (nonaudible murmur). As a result, NAM microphones can be applied in automatic speech recognition systems when privacy is desired in human-machine communication. Moreover, NAM microphones show robustness against noise and they might be used in special systems (speech recognition, speech transformation, etc.) for sound-impaired people. Using adaptation techniques and a small amount of training data, we achieved a word accuracy for nonaudible murmur recognition in a clean environment on a 20 k dictation task. In this paper, we also investigate nonaudible murmur recognition in noisy environments and the effect of the Lombard reflex on nonaudible murmur recognition. We also propose three methods to integrate audible speech and nonaudible murmur recognition using a stethoscope NAM microphone, with very promising results.

  4. Tracking Speech Sound Acquisition

    Science.gov (United States)

    Powell, Thomas W.

    2011-01-01

    This article describes a procedure to aid in the clinical appraisal of child speech. The approach, based on the work by Dinnsen, Chin, Elbert, and Powell (1990; Some constraints on functionally disordered phonologies: Phonetic inventories and phonotactics. "Journal of Speech and Hearing Research", 33, 28-37), uses a railway idiom to track gains in…

  5. Preschool Connected Speech Inventory.

    Science.gov (United States)

    DiJohnson, Albert; And Others

    This speech inventory developed for a study of aurally handicapped preschool children (see TM 001 129) provides information on intonation patterns in connected speech. The inventory consists of a list of phrases and simple sentences accompanied by pictorial clues. The test is individually administered by a teacher-examiner who presents the spoken…

  6. Illustrated Speech Anatomy.

    Science.gov (United States)

    Shearer, William M.

    Written for students in the fields of speech correction and audiology, the text deals with the following: structures involved in respiration; the skeleton and the processes of inhalation and exhalation; phonation and pitch, the larynx, and esophageal speech; muscles involved in articulation; muscles involved in resonance; and the anatomy of the…

  7. Private Speech in Ballet

    Science.gov (United States)

    Johnston, Dale

    2006-01-01

    Authoritarian teaching practices in ballet inhibit the use of private speech. This paper highlights the critical importance of private speech in the cognitive development of young ballet students, within what is largely a non-verbal art form. It draws upon research by Russian psychologist Lev Vygotsky and contemporary socioculturalists, to…

  10. Free Speech Yearbook 1976.

    Science.gov (United States)

    Phifer, Gregg, Ed.

    The articles collected in this annual address several aspects of First Amendment Law. The following titles are included: "Freedom of Speech As an Academic Discipline" (Franklyn S. Haiman), "Free Speech and Foreign-Policy Decision Making" (Douglas N. Freeman), "The Supreme Court and the First Amendment: 1975-1976"…

  12. Advertising and Free Speech.

    Science.gov (United States)

    Hyman, Allen, Ed.; Johnson, M. Bruce, Ed.

    The articles collected in this book originated at a conference at which legal and economic scholars discussed the issue of First Amendment protection for commercial speech. The first article, in arguing for freedom for commercial speech, finds inconsistent and untenable the arguments of those who advocate freedom from regulation for political…

  13. Free Speech. No. 38.

    Science.gov (United States)

    Kane, Peter E., Ed.

    This issue of "Free Speech" contains the following articles: "Daniel Schorr Relieved of Reporting Duties" by Laurence Stern, "The Sellout at CBS" by Michael Harrington, "Defending Dan Schorr" by Tom Wicker, "Speech to the Washington Press Club, February 25, 1976" by Daniel Schorr, "Funds Voted For Schorr Inquiry" by Richard Lyons, "Erosion of the…

  14. Machine Protection

    CERN Document Server

    Schmidt, R

    2014-01-01

    The protection of accelerator equipment is as old as accelerator technology and was for many years related to high-power equipment. Examples are the protection of powering equipment from overheating (magnets, power converters, high-current cables), of superconducting magnets from damage after a quench and of klystrons. The protection of equipment from beam accidents is more recent. It is related to the increasing beam power of high-power proton accelerators such as ISIS, SNS, ESS and the PSI cyclotron, to the emission of synchrotron light by electron–positron accelerators and FELs, and to the increase of energy stored in the beam (in particular for hadron colliders such as LHC). Designing a machine protection system requires an excellent understanding of accelerator physics and operation to anticipate possible failures that could lead to damage. Machine protection includes beam and equipment monitoring, a system to safely stop beam operation (e.g. dumping the beam or stopping the beam at low energy) and an ...

  15. Diamond Measuring Machine

    Energy Technology Data Exchange (ETDEWEB)

    Krstulic, J.F.

    2000-01-27

    The fundamental goal of this project was to develop additional capabilities to the diamond measuring prototype, work out technical difficulties associated with the original device, and perform automated measurements which are accurate and repeatable. For this project, FM and T was responsible for the overall system design, edge extraction, and defect extraction and identification. AccuGem provided a lab and computer equipment in Lawrence, 3D modeling, industry expertise, and sets of diamonds for testing. The system executive software which controls stone positioning, lighting, focusing, report generation, and data acquisition was written in Microsoft Visual Basic 6, while data analysis and modeling were compiled in C/C++ DLLs. All scanning parameters and extracted data are stored in a central database and available for automated analysis and reporting. The Phase 1 study showed that data can be extracted and measured from diamond scans, but most of the information had to be manually extracted. In this Phase 2 project, all data required for geometric modeling and defect identification were automatically extracted and passed to a 3D modeling module for analysis. Algorithms were developed which automatically adjusted both light levels and stone focus positioning for each diamond-under-test. After a diamond is analyzed and measurements are completed, a report is printed for the customer which shows carat weight, summarizes stone geometry information, lists defects and their size, displays a picture of the diamond, and shows a plot of defects on a top view drawing of the stone. Initial emphasis of defect extraction was on identification of feathers, pinpoints, and crystals. Defects were plotted color-coded by industry standards for inclusions (red), blemishes (green), and unknown defects (blue). Diamonds with a wide variety of cut quality, size, and number of defects were tested in the machine. Edge extraction, defect extraction, and modeling code were tested for

  16. Perception of words and pitch patterns in song and speech

    Directory of Open Access Journals (Sweden)

    Julia Merrill

    2012-03-01

    Full Text Available This fMRI study examines shared and distinct cortical areas involved in the auditory perception of song and speech at the level of their underlying constituents: words, pitch and rhythm. Univariate and multivariate analyses were performed on the brain activity patterns of six conditions, arranged in a subtractive hierarchy: sung sentences including words, pitch and rhythm; hummed speech prosody and song melody containing only pitch patterns and rhythm; as well as the pure musical or speech rhythm. Systematic contrasts between these balanced conditions following their hierarchical organization showed a great overlap between song and speech at all levels in the bilateral temporal lobe, but suggested a differential role of the inferior frontal gyrus (IFG) and intraparietal sulcus (IPS) in processing song and speech. The left IFG was involved in word- and pitch-related processing in speech, the right IFG in processing pitch in song. Furthermore, the IPS showed sensitivity to discrete pitch relations in song as opposed to the gliding pitch in speech. Finally, the superior temporal gyrus and premotor cortex coded for general differences between words and pitch patterns, irrespective of whether they were sung or spoken. Thus, song and speech share many features which are reflected in a fundamental similarity of brain areas involved in their perception. However, fine-grained acoustic differences on word and pitch level are reflected in the activity of IFG and IPS.

  17. Evaluating a topographical mapping from speech acoustics to tongue positions

    Energy Technology Data Exchange (ETDEWEB)

    Hogden, J.; Heard, M. [Los Alamos Natl. Lab., MS B256, Los Alamos, NM 87545 (United States)

    1995-05-01

    The continuity mapping algorithm---a procedure for learning to recover the relative positions of the articulators from speech signals---is evaluated using human speech data. The advantage of continuity mapping is that it is an unsupervised algorithm; that is, it can potentially be trained to make a mapping from speech acoustics to speech articulation without articulator measurements. The procedure starts by vector quantizing short windows of a speech signal so that each window is represented (encoded) by a single number. Next, multidimensional scaling is used to map quantization codes that were temporally close in the encoded speech to nearby points in a continuity map. Since speech sounds produced sufficiently close together in time must have been produced by similar articulator configurations, and speech sounds produced close together in time are close to each other in the continuity map, sounds produced by similar articulator positions should be mapped to similar positions in the continuity map. The data set used for evaluating the continuity mapping algorithm is comprised of simultaneously collected articulator and acoustic measurements made using an electromagnetic midsagittal articulometer on a human subject. Comparisons between measured articulator positions and those recovered using continuity mapping will be presented.
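
    The procedure described above can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the authors' code: `continuity_map` is a hypothetical helper that vector-quantizes feature frames with a crude k-means, counts which codes occur in temporally adjacent windows, and embeds the codes with classical MDS so that frequent temporal neighbours land close together.

```python
import numpy as np

def continuity_map(frames, n_codes=8, n_dims=2, seed=0):
    """Sketch of continuity mapping (hypothetical helper, not the paper's
    implementation): VQ + temporal co-occurrence + classical MDS."""
    rng = np.random.default_rng(seed)
    # 1. Crude k-means vector quantization of the frames.
    centers = frames[rng.choice(len(frames), n_codes, replace=False)]
    for _ in range(20):
        d = np.linalg.norm(frames[:, None, :] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for k in range(n_codes):
            if np.any(labels == k):
                centers[k] = frames[labels == k].mean(axis=0)
    # 2. Count code co-occurrence in temporally adjacent windows.
    co = np.ones((n_codes, n_codes))      # +1 smoothing
    for a, b in zip(labels[:-1], labels[1:]):
        co[a, b] += 1
        co[b, a] += 1
    diss = 1.0 / co                       # frequent neighbours -> small distance
    np.fill_diagonal(diss, 0.0)
    # 3. Classical MDS: double-center squared distances, eigendecompose.
    j = np.eye(n_codes) - np.ones((n_codes, n_codes)) / n_codes
    b_mat = -0.5 * j @ (diss ** 2) @ j
    w, v = np.linalg.eigh(b_mat)
    top = np.argsort(w)[::-1][:n_dims]
    coords = v[:, top] * np.sqrt(np.maximum(w[top], 0))
    return labels, coords
```

    The claim under evaluation is then that the recovered `coords` track relative articulator positions up to rotation and scale, which is why the study compares them against articulometer measurements.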

  18. Speech emotion recognition with unsupervised feature learning

    Institute of Scientific and Technical Information of China (English)

    Zheng-wei HUANG; Wen-tao XUE; Qi-rong MAO

    2015-01-01

    Emotion-based features are critical for achieving high performance in a speech emotion recognition (SER) system. In general, it is difficult to develop these features due to the ambiguity of the ground-truth. In this paper, we apply several unsupervised feature learning algorithms (including K-means clustering, the sparse auto-encoder, and sparse restricted Boltzmann machines), which have promise for learning task-related features by using unlabeled data, to speech emotion recognition. We then evaluate the performance of the proposed approach and present a detailed analysis of the effect of two important factors in the model setup, the content window size and the number of hidden layer nodes. Experimental results show that larger content windows and more hidden nodes contribute to higher performance. We also show that the two-layer network does not clearly improve performance compared to a single-layer network.
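
    Of the three feature learners mentioned, the K-means variant is the simplest to sketch. The following is illustrative NumPy under my own assumptions, not the authors' implementation; the "triangle" activation follows the standard Coates-and-Ng recipe, and all names are made up here. Centroids are learned from unlabeled frames, and each utterance is then summarized by its mean activation over frames.

```python
import numpy as np

def learn_dictionary(unlabeled, k=16, iters=20, seed=0):
    """K-means dictionary learning on unlabeled feature frames (sketch)."""
    rng = np.random.default_rng(seed)
    centers = unlabeled[rng.choice(len(unlabeled), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(unlabeled[:, None] - centers[None], axis=2)
        lab = d.argmin(1)
        for j in range(k):
            if np.any(lab == j):
                centers[j] = unlabeled[lab == j].mean(0)
    return centers

def encode(frames, centers):
    """'Triangle' encoding: activation is how much closer a frame is to a
    centroid than its mean centroid distance; utterance feature vector is
    the mean activation over frames (one value per centroid)."""
    d = np.linalg.norm(frames[:, None] - centers[None], axis=2)
    act = np.maximum(0.0, d.mean(axis=1, keepdims=True) - d)
    return act.mean(axis=0)
```

    The resulting fixed-length vectors would then feed any supervised emotion classifier, which is the division of labour the paper exploits: the dictionary needs no emotion labels.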

  19. Charisma in business speeches

    DEFF Research Database (Denmark)

    Niebuhr, Oliver; Brem, Alexander; Novák-Tót, Eszter

    2016-01-01

    Charisma is a key component of spoken language interaction; and it is probably for this reason that charismatic speech has been the subject of intensive research for centuries. However, what is still largely missing is a quantitative and objective line of research that, firstly, involves analyses of the acoustic-prosodic signal, secondly, focuses on business speeches like product presentations, and, thirdly, in doing so, advances the still fairly fragmentary evidence on the prosodic correlates of charismatic speech. We show that the prosodic features of charisma in political speeches also apply to business speeches. Consistent with the public opinion, our findings are indicative of Steve Jobs being a more charismatic speaker than Mark Zuckerberg. Beyond previous studies, our data suggest that rhythm and emphatic accentuation are also involved in conveying charisma. Furthermore, the differences...

  20. Analysis of machining and machine tools

    CERN Document Server

    Liang, Steven Y

    2016-01-01

    This book delivers the fundamental science and mechanics of machining and machine tools by presenting systematic and quantitative knowledge in the form of process mechanics and physics. It gives readers a solid command of machining science and engineering, and familiarizes them with the geometry and functionality requirements of creating parts and components in today’s markets. The authors address traditional machining topics such as single- and multiple-point cutting processes, grinding, component accuracy and metrology, shear stress in cutting, cutting temperature and analysis, and chatter. They also address non-traditional machining, such as electrical discharge machining, electrochemical machining, and laser and electron beam machining. A chapter on biomedical machining is also included. This book is appropriate for advanced undergraduate and graduate mechanical engineering students, manufacturing engineers, and researchers. Each chapter contains examples, exercises and their solutions, and homework problems that re...

  1. Predicting speech intelligibility in conditions with nonlinearly processed noisy speech

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2013-01-01

    The speech-based envelope power spectrum model (sEPSM; [1]) was proposed in order to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII). The sEPSM applies the signal-to-noise ratio in the envelope domain (SNRenv), which was demonstrated to successfully predict speech intelligibility in conditions with nonlinearly processed noisy speech, such as processing with spectral subtraction. Moreover, a multiresolution version (mr-sEPSM) was demonstrated to account for speech intelligibility in various conditions with stationary and fluctuating... from computational auditory scene analysis and further support the hypothesis that the SNRenv is a powerful metric for speech intelligibility prediction.

  2. Is talking to an automated teller machine natural and fun?

    Science.gov (United States)

    Chan, F Y; Khalid, H M

    Usability and affective issues of using automatic speech recognition technology to interact with an automated teller machine (ATM) are investigated in two experiments. The first uncovered the dialogue patterns of ATM users for the purpose of designing the user interface for a simulated speech ATM system. Applying the Wizard-of-Oz methodology, multiple mapping and word spotting techniques, the speech-driven ATM accommodates bilingual users of Bahasa Melayu and English. The second experiment evaluates the usability of a hybrid speech ATM, comparing it with a simulated manual ATM. The aim is to investigate how natural and fun talking to a speech ATM can be for these first-time users. Subjects performed the withdrawal and balance enquiry tasks. An ANOVA was performed on the usability and affective data. The results showed significant differences between systems in the ability to complete the tasks as well as in transaction errors. Performance was measured by the time taken by subjects to complete the task and the number of speech recognition errors that occurred. On the basis of user emotions, it can be said that the hybrid speech system enabled pleasurable interaction. Despite the limitations of speech recognition technology, users are set to talk to the ATM when it becomes available for public use.

  3. Python for probability, statistics, and machine learning

    CERN Document Server

    Unpingco, José

    2016-01-01

    This book covers the key ideas that link probability, statistics, and machine learning illustrated using Python modules in these areas. The entire text, including all the figures and numerical results, is reproducible using the Python codes and their associated Jupyter/IPython notebooks, which are provided as supplementary downloads. The author develops key intuitions in machine learning by working meaningful examples using multiple analytical methods and Python codes, thereby connecting theoretical concepts to concrete implementations. Modern Python modules like Pandas, Sympy, and Scikit-learn are applied to simulate and visualize important machine learning concepts like the bias/variance trade-off, cross-validation, and regularization. Many abstract mathematical ideas, such as convergence in probability theory, are developed and illustrated with numerical examples. This book is suitable for anyone with an undergraduate-level exposure to probability, statistics, or machine learning and with rudimentary knowl...
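
    As a flavour of the concepts the book illustrates, cross-validation can be sketched in a few lines of NumPy. This is a generic example under my own assumptions, not code taken from the book:

```python
import numpy as np

def cv_mse(x, y, degree, n_folds=5):
    """K-fold cross-validated mean-squared error of a polynomial fit."""
    idx = np.arange(len(x))
    errs = []
    for fold in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, fold)              # hold one fold out
        coef = np.polyfit(x[train], y[train], degree)
        pred = np.polyval(coef, x[fold])
        errs.append(np.mean((pred - y[fold]) ** 2))
    return float(np.mean(errs))
```

    Comparing `cv_mse` across degrees exposes the bias/variance trade-off the book discusses: a degree that is too low underfits, a degree that is too high overfits, and cross-validation penalizes both.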

  4. Holographic codes

    CERN Document Server

    Latorre, Jose I

    2015-01-01

    There exists a remarkable four-qutrit state that carries absolute maximal entanglement in all its partitions. Employing this state, we construct a tensor network that delivers a holographic many body state, the H-code, where the physical properties of the boundary determine those of the bulk. This H-code is made of an even superposition of states whose relative Hamming distances are exponentially large with the size of the boundary. This property makes H-codes natural states for a quantum memory. H-codes exist on tori of definite sizes and get classified in three different sectors characterized by the sum of their qutrits on cycles wrapped through the boundaries of the system. We construct a parent Hamiltonian for the H-code which is highly non local and finally we compute the topological entanglement entropy of the H-code.

  5. Sharing code

    OpenAIRE

    Kubilius, Jonas

    2014-01-01

    Sharing code is becoming increasingly important in the wake of Open Science. In this review I describe and compare two popular code-sharing utilities, GitHub and Open Science Framework (OSF). GitHub is a mature, industry-standard tool but lacks focus towards researchers. In comparison, OSF offers a one-stop solution for researchers but a lot of functionality is still under development. I conclude by listing alternative lesser-known tools for code and materials sharing.

  6. Sperry Univac speech communications technology

    Science.gov (United States)

    Medress, Mark F.

    1977-01-01

    Technology and systems for effective verbal communication with computers were developed. A continuous speech recognition system for verbal input, a word spotting system to locate key words in conversational speech, prosodic tools to aid speech analysis, and a prerecorded voice response system for speech output are described.

  7. Voice and Speech after Laryngectomy

    Science.gov (United States)

    Stajner-Katusic, Smiljka; Horga, Damir; Musura, Maja; Globlek, Dubravka

    2006-01-01

    The aim of the investigation is to compare voice and speech quality in alaryngeal patients using esophageal speech (ESOP, eight subjects), electroacoustical speech aid (EACA, six subjects) and tracheoesophageal voice prosthesis (TEVP, three subjects). The subjects reading a short story were recorded in the sound-proof booth and the speech samples…

  8. Speech Correction in the Schools.

    Science.gov (United States)

    Eisenson, Jon; Ogilvie, Mardel

    An introduction to the problems and therapeutic needs of school age children whose speech requires remedial attention, the text is intended for both the classroom teacher and the speech correctionist. General considerations include classification and incidence of speech defects, speech correction services, the teacher as a speaker, the mechanism…

  9. Environmental Contamination of Normal Speech.

    Science.gov (United States)

    Harley, Trevor A.

    1990-01-01

    Environmentally contaminated speech errors (irrelevant words or phrases derived from the speaker's environment and erroneously incorporated into speech) are hypothesized to occur at a high level of speech processing, but with a relatively late insertion point. The data indicate that speech production processes are not independent of other…

  10. Subspace-Based Noise Reduction for Speech Signals via Diagonal and Triangular Matrix Decompositions

    DEFF Research Database (Denmark)

    Hansen, Per Christian; Jensen, Søren Holdt

    2007-01-01

    We survey the definitions and use of rank-revealing matrix decompositions in single-channel noise reduction algorithms for speech signals. Our algorithms are based on the rank-reduction paradigm and, in particular, signal subspace techniques. The focus is on practical working algorithms, using bo...... with working Matlab code and applications in speech processing....
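
    The rank-reduction paradigm the survey builds on can be illustrated with a plain truncated SVD, the simplest rank-revealing decomposition. This is a minimal sketch under my own assumptions, not the survey's Matlab code: embed the noisy signal in a Hankel matrix, keep the leading singular directions, and average the anti-diagonals back into a signal.

```python
import numpy as np

def svd_denoise(x, frame=32, rank=4):
    """Signal-subspace noise reduction sketch: truncated SVD of a Hankel
    matrix built from the noisy signal, then anti-diagonal averaging."""
    n = len(x) - frame + 1
    H = np.column_stack([x[i:i + frame] for i in range(n)])  # frame x n Hankel
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    Hr = (U[:, :rank] * s[:rank]) @ Vt[:rank]    # rank-reduced reconstruction
    out = np.zeros(len(x))
    cnt = np.zeros(len(x))
    for j in range(n):                           # average the anti-diagonals
        out[j:j + frame] += Hr[:, j]
        cnt[j:j + frame] += 1
    return out / cnt
```

    Choosing `rank` is exactly where the rank-revealing decompositions surveyed in the paper come in: they expose the numerical rank of the signal-plus-noise matrix so the split between signal and noise subspaces can be made reliably.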

  12. Family Worlds: Couple Satisfaction, Parenting Style, and Mothers' and Fathers' Speech to Young Children.

    Science.gov (United States)

    Pratt, Michael W.; And Others

    1992-01-01

    Investigated relations between certain family context variables and the conversational behavior of 36 parents who were playing with their 3 year olds. Transcripts were coded for types of conversational functions and structure of parent speech. Marital satisfaction was associated with aspects of parent speech. (LB)

  13. Modelling the Architecture of Phonetic Plans: Evidence from Apraxia of Speech

    Science.gov (United States)

    Ziegler, Wolfram

    2009-01-01

    In theories of spoken language production, the gestural code prescribing the movements of the speech organs is usually viewed as a linear string of holistic, encapsulated, hard-wired, phonetic plans, e.g., of the size of phonemes or syllables. Interactions between phonetic units on the surface of overt speech are commonly attributed to either the…

  14. Perception of the Auditory-Visual Illusion in Speech Perception by Children with Phonological Disorders

    Science.gov (United States)

    Dodd, Barbara; McIntosh, Beth; Erdener, Dogu; Burnham, Denis

    2008-01-01

    An example of the auditory-visual illusion in speech perception, first described by McGurk and MacDonald, is the perception of [ta] when listeners hear [pa] in synchrony with the lip movements for [ka]. One account of the illusion is that lip-read and heard speech are combined in an articulatory code since people who mispronounce words respond…

  16. Speech intelligibility measure for vocal control of an automaton

    Science.gov (United States)

    Naranjo, Michel; Tsirigotis, Georgios

    1998-07-01

    Rapid progress in speech recognition suggests that, in the near future, vocal control systems will be widely established in production units. Communication between a human and a machine requires technical devices that emit, or are subject to, significant noise perturbations. The vocal interface thus introduces a new control problem: driving a deterministic automaton with uncertain information. The purpose is to place the automaton exactly in a final state, ordered by voice, from an unknown initial state. The whole speech processing procedure presented in this paper takes as input the temporal speech signal of a word and produces as output a recognised word labelled with an intelligibility index given by the recognition quality. In the first part, we present the essential psychoacoustic concepts for the automatic calculation of the loudness of a speech signal. The architecture of a Time Delay Neural Network is presented in the second part, where we also give the recognition results. The theory of fuzzy subsets, in the third part, allows a recognised word and its intelligibility index to be extracted at the same time. In the fourth part, an anticipatory system models the control of a sequential machine. A prediction phase and an updating phase involve data coming from the information system. A Bayesian decision strategy is used, and the criterion is a weighted sum of criteria defined from information, minimum path functions and the speech intelligibility measure.

  17. A Fortran 90 code for magnetohydrodynamics

    Energy Technology Data Exchange (ETDEWEB)

    Walker, D.W.

    1992-03-01

    This report describes progress in developing a Fortran 90 version of the KITE code for studying plasma instabilities in Tokamaks. In particular, the evaluation of convolution terms appearing in the numerical solution is discussed, and timing results are presented for runs performed on an 8k processor Connection Machine (CM-2). Estimates of the performance on a full-size 64k CM-2 are given, and range between 100 and 200 Mflops. The advantages of having a Fortran 90 version of the KITE code are stressed, and the future use of such a code on the newly announced CM5 and Paragon computers, from Thinking Machines Corporation and Intel, is considered.
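
    Setting the Fortran 90 specifics aside, the standard pseudo-spectral way to evaluate convolution terms like those mentioned above is the transform-multiply-transform trick, sketched here in Python with zero padding for de-aliasing. This is an illustrative recipe, not the KITE code:

```python
import numpy as np

def spectral_convolution(a_hat, b_hat):
    """Linear convolution of two spectral coefficient vectors, evaluated by
    going to grid space, multiplying pointwise, and transforming back.
    Zero padding to twice the length suppresses aliasing."""
    n = len(a_hat)
    a = np.fft.ifft(a_hat, n=2 * n)   # ifft zero-pads the coefficients
    b = np.fft.ifft(b_hat, n=2 * n)
    return np.fft.fft(a * b)[:n] * (2 * n)
```

    The payoff is cost: O(n log n) per convolution instead of O(n^2) for the direct sum, which is what makes such terms tractable on highly parallel machines like the CM-2.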

  18. Two-Level Semantics and Code Generation

    DEFF Research Database (Denmark)

    Nielson, Flemming; Nielson, Hanne Riis

    1988-01-01

    not absolutely necessary for describing the input-output semantics of programming languages, it is necessary when issues such as data flow analysis and code generation are considered. For an example stack-machine, the authors show how to generate code for the run-time computations and still perform the compile-time computations. Using an example, it is argued that compiler tricks such as the use of activation records suggest how to cope with certain syntactic restrictions in the metalanguage. The correctness of the code generation is proved using Kripke-like relations and using a modified machine that can be made to loop...
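
    The flavour of separating compile-time from run-time work for a stack machine can be conveyed with a toy example (illustrative only, unrelated to the authors' two-level metalanguage): code generation walks the expression once at compile time, and the run-time machine merely executes the resulting instruction list.

```python
# Toy stack-machine code generator. Expressions are tuples:
# ('num', n) or ('add'|'mul', left, right).

def compile_expr(e):
    """Compile-time: emit postfix instructions for the expression tree."""
    tag = e[0]
    if tag == 'num':
        return [('push', e[1])]
    return compile_expr(e[1]) + compile_expr(e[2]) + [(tag,)]

def run(code):
    """Run-time: execute the instruction list on an operand stack."""
    stack = []
    for ins in code:
        if ins[0] == 'push':
            stack.append(ins[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if ins[0] == 'add' else a * b)
    return stack[-1]
```

    For instance, `('mul', ('add', ('num', 2), ('num', 3)), ('num', 4))` compiles to push/push/add/push/mul, and `run` evaluates it to 20; all tree traversal happened at compile time.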

  19. Extracting Information from Spoken User Input. A Machine Learning Approach

    NARCIS (Netherlands)

    Lendvai, P.K.

    2004-01-01

    We propose a module that performs automatic analysis of user input in spoken dialogue systems using machine learning algorithms. The input to the module is material received from the speech recogniser and the dialogue manager of the spoken dialogue system, the output is a four-level

  20. MARTI: man-machine animation real-time interface

    Science.gov (United States)

    Jones, Christian M.; Dlay, Satnam S.

    1997-05-01

    The research introduces MARTI (man-machine animation real-time interface) for the realization of natural human-machine interfacing. The system uses simple vocal sound-tracks of human speakers to provide lip synchronization of computer graphical facial models. We present novel research in a number of engineering disciplines, which include speech recognition, facial modeling, and computer animation. This interdisciplinary research utilizes the latest hybrid connectionist/hidden Markov model speech recognition system to provide very accurate phone recognition and timing for speaker-independent continuous speech, and expands on knowledge from the animation industry in the development of accurate facial models and automated animation. The research has many real-world applications, which include the provision of a highly accurate and 'natural' man-machine interface to assist user interactions with computer systems and communication with one another using human idiosyncrasies; a complete special effects and animation toolbox providing automatic lip synchronization without the normal constraints of head-sets, joysticks, and skilled animators; compression of video data to well below standard telecommunication channel bandwidth for video communications and multi-media systems; assisting speech training and aids for the handicapped; and facilitating player interaction for 'video gaming' and 'virtual worlds.' MARTI has introduced a new level of realism to man-machine interfacing and special effect animation which has been previously unseen.

  1. QCD on the connection machine: beyond LISP

    Science.gov (United States)

    Brickner, Ralph G.; Baillie, Clive F.; Johnsson, S. Lennart

    1991-04-01

    We report on the status of code development for a simulation of quantum chromodynamics (QCD) with dynamical Wilson fermions on the Connection Machine model CM-2. Our original code, written in Lisp, gave performance in the near-GFLOPS range. We have rewritten the most time-consuming parts of the code in the low-level programming systems CMIS, including the matrix multiply and the communication. Current versions of the code run at approximately 3.6 GFLOPS for the fermion matrix inversion, and we expect the next version to reach or exceed 5 GFLOPS.

  2. PROSODIC FEATURE BASED TEXT DEPENDENT SPEAKER RECOGNITION USING MACHINE LEARNING ALGORITHMS

    Directory of Open Access Journals (Sweden)

    Sunil Agrawal

    2010-10-01

    Full Text Available Most of us are aware of the fact that voices of different individuals do not sound alike. The ability to recognize a person solely from his voice is known as speaker recognition. Speaker recognition can not only assist in building better access control systems and security apparatus, it can be a useful tool in many other areas such as forensic speech analysis. The choice of features plays an important role in the performance of an ML algorithm. Here we propose prosodic-feature-based text-dependent speaker recognition, where the prosodic features are extracted through linear predictive coding. Formants are efficient parameters to characterize a speaker’s voice. Formants are combined with their corresponding amplitudes, fundamental frequency, duration of the speech utterance and energy of the windowed section. This feature vector is input to machine learning (ML) algorithms for recognition. We investigate the performance of four ML algorithms, namely MLP, RBFN, C4.5 decision tree, and BayesNet. Of these ML algorithms, the C4.5 decision tree gives the most consistent performance, MLP performs better for gender recognition, and experimental results show that RBFN gives better performance for increased population size.
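
    The LPC-and-formant front end described above can be sketched with the textbook autocorrelation/Levinson-Durbin recipe. This is not the authors' code, and the sampling rate is an assumed example value: LPC coefficients are fit to a frame, and formant candidates are read off the angles of the LPC polynomial's complex roots.

```python
import numpy as np

def lpc(frame, order=8):
    """LPC coefficients by the autocorrelation method (Levinson-Durbin)."""
    r = np.correlate(frame, frame, 'full')[len(frame) - 1:]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                 # reflection coefficient
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1:0:-1]
        a[i] = k
        err *= (1 - k * k)
    return a

def formants(a, fs=8000.0):
    """Formant frequency estimates from the roots of the LPC polynomial."""
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]  # keep one root of each conjugate pair
    return np.sort(np.angle(roots) * fs / (2 * np.pi))
```

    The paper's full feature vector would then append the corresponding amplitudes, fundamental frequency, utterance duration, and windowed-frame energy before handing the vector to the ML classifiers.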

  3. The self-advantage in visual speech processing enhances audiovisual speech recognition in noise.

    Science.gov (United States)

    Tye-Murray, Nancy; Spehar, Brent P; Myerson, Joel; Hale, Sandra; Sommers, Mitchell S

    2015-08-01

    Individuals lip read themselves more accurately than they lip read others when only the visual speech signal is available (Tye-Murray et al., Psychonomic Bulletin & Review, 20, 115-119, 2013). This self-advantage for vision-only speech recognition is consistent with the common-coding hypothesis (Prinz, European Journal of Cognitive Psychology, 9, 129-154, 1997), which posits (1) that observing an action activates the same motor plan representation as actually performing that action and (2) that observing one's own actions activates motor plan representations more than the others' actions because of greater congruity between percepts and corresponding motor plans. The present study extends this line of research to audiovisual speech recognition by examining whether there is a self-advantage when the visual signal is added to the auditory signal under poor listening conditions. Participants were assigned to sub-groups for round-robin testing in which each participant was paired with every member of their subgroup, including themselves, serving as both talker and listener/observer. On average, the benefit participants obtained from the visual signal when they were the talker was greater than when the talker was someone else and also was greater than the benefit others obtained from observing as well as listening to them. Moreover, the self-advantage in audiovisual speech recognition was significant after statistically controlling for individual differences in both participants' ability to benefit from a visual speech signal and the extent to which their own visual speech signal benefited others. These findings are consistent with our previous finding of a self-advantage in lip reading and with the hypothesis of a common code for action perception and motor plan representation.

  4. Global Freedom of Speech

    DEFF Research Database (Denmark)

    Binderup, Lars Grassme

    2007-01-01

    , as opposed to a legal norm, that curbs exercises of the right to free speech that offend the feelings or beliefs of members from other cultural groups. The paper rejects the suggestion that acceptance of such a norm is in line with liberal egalitarian thinking. Following a review of the classical liberal egalitarian reasons for free speech - reasons from overall welfare, from autonomy and from respect for the equality of citizens - it is argued that these reasons outweigh the proposed reasons for curbing culturally offensive speech. Currently controversial cases such as that of the Danish Cartoon Controversy...

  5. Speaking Code

    DEFF Research Database (Denmark)

    Cox, Geoff

    Speaking Code begins by invoking the “Hello World” convention used by programmers when learning a new language, helping to establish the interplay of text and code that runs through the book. Interweaving the voice of critical writing from the humanities with the tradition of computing and software...

  6. Polar Codes

    Science.gov (United States)

    2014-12-01

    Polar codes were introduced by E. Arikan in [1]. This report describes the results of the project "More reliable wireless..." and covers forward error correction (FEC), the capacity of the binary symmetric channel (BSC) and of the additive white Gaussian noise (AWGN) channel, and QPSK Gaussian channels.

  7. The Rhetoric in English Speech

    Institute of Scientific and Technical Information of China (English)

    马鑫

    2014-01-01

    English speech has a very long history and has always been highly valued. People give speeches in economic activities, political forums and academic reports to express their opinions and to inform or persuade others. English speech plays a rather important role in English literature, and the distinct theme of a speech owes much to its rhetoric. This paper discusses parallelism, repetition and rhetorical questions in English speeches, aiming to help people better appreciate their charm.

  8. Automation of printing machine

    OpenAIRE

    Sušil, David

    2016-01-01

    Bachelor thesis is focused on the automation of the printing machine and comparing the two types of printing machines. The first chapter deals with the history of printing, typesettings, printing techniques and various kinds of bookbinding. The second chapter describes the difference between sheet-fed printing machines and offset printing machines, the difference between two representatives of rotary machines, technological process of the products on these machines, the description of the mac...

  9. Noise estimation Algorithms for Speech Enhancement in highly non-stationary Environments

    Directory of Open Access Journals (Sweden)

    Anuradha R Fukane

    2011-03-01

    Full Text Available A noise estimation algorithm plays an important role in speech enhancement. Speech enhancement for automatic speaker recognition systems, man-machine communication, voice recognition systems, speech coders, hearing aids, video conferencing and many other applications is related to speech processing. All these are real-world systems whose only available input is the noisy speech signal; the noise component must be removed so that the enhanced speech signal can be applied to these systems. Most speech enhancement algorithms assume that an estimate of the noise spectrum is available. This noise estimate is a critical component: if it is too low, annoying residual noise remains, and if it is too high, speech is distorted and loses intelligibility. This paper focuses on different approaches to noise estimation. Section I is the introduction, Section II explains the simple voice activity detector (VAD) approach to noise estimation, Section III explains different classes of noise estimation algorithms, Section IV explains the performance evaluation of noise estimation algorithms, and Section V concludes.
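
    The simple VAD approach of Section II can be sketched in a few lines of NumPy. This is illustrative only; the threshold and smoothing constant are made-up example values, not from the paper. The noise spectrum is updated recursively only in frames whose energy is low enough to be judged speech-free.

```python
import numpy as np

def estimate_noise(frames, alpha=0.9, energy_factor=1.5):
    """Energy-threshold VAD noise tracker (sketch). `frames` holds one
    magnitude spectrum per row; the running noise spectrum is updated only
    when a frame's total energy stays below a multiple of the current
    noise energy, i.e. when the frame is judged noise-only."""
    noise = frames[0].copy()            # assume the signal starts with noise
    for f in frames[1:]:
        if f.sum() < energy_factor * noise.sum():
            noise = alpha * noise + (1 - alpha) * f
    return noise
```

    The hyperparameters expose exactly the trade-off stated above: a threshold that is too permissive lets speech leak into the estimate (over-estimation, distortion), while one that is too strict freezes the estimate (under-estimation, residual noise).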

  10. Error analysis to improve the speech recognition accuracy on Telugu language

    Indian Academy of Sciences (India)

    N Usha Rani; P N Girija

    2012-12-01

    Speech is one of the most important communication channels among people, and speech recognition occupies a prominent place in communication between humans and machines. Several factors affect the accuracy of a speech recognition system. Much effort has gone into increasing this accuracy, yet current speech recognition systems still generate erroneous output. Telugu is one of the most widely spoken south Indian languages. In the proposed Telugu speech recognition system, errors obtained from the decoder are analysed to improve the performance of the system. The static pronunciation dictionary plays a key role in speech recognition accuracy, so modifications are performed on the dictionary used in the decoder. These modifications reduce the number of confusion pairs, which improves the performance of the speech recognition system; language model scores also vary with the modification. The hit rate increases considerably, and the false-alarm rate changes as the pronunciation dictionary is modified. Variations are observed in different error measures such as F-measure, error rate and Word Error Rate (WER) on application of the proposed method.
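
    Word Error Rate, the headline measure above, is the word-level Levenshtein (edit) distance between the reference transcript and the decoder output, normalized by the reference length. A minimal implementation of that standard definition:

```python
def wer(ref, hyp):
    """Word error rate: (substitutions + insertions + deletions) / #ref words,
    computed by dynamic programming over word sequences."""
    r, h = ref.split(), hyp.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i                     # deleting all reference words
    for j in range(len(h) + 1):
        d[0][j] = j                     # inserting all hypothesis words
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / len(r)
```

    For example, `wer("the cat sat", "the cat sat on")` counts one insertion against three reference words, giving 1/3; note WER can exceed 1 when the hypothesis contains many insertions.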

  11. Auditory-neurophysiological responses to speech during early childhood: Effects of background noise.

    Science.gov (United States)

    White-Schwoch, Travis; Davies, Evan C; Thompson, Elaine C; Woodruff Carr, Kali; Nicol, Trent; Bradlow, Ann R; Kraus, Nina

    2015-10-01

    Early childhood is a critical period of auditory learning, during which children are constantly mapping sounds to meaning. But this auditory learning rarely occurs in ideal listening conditions-children are forced to listen against a relentless din. This background noise degrades the neural coding of these critical sounds, in turn interfering with auditory learning. Despite the importance of robust and reliable auditory processing during early childhood, little is known about the neurophysiology underlying speech processing in children so young. To better understand the physiological constraints these adverse listening scenarios impose on speech sound coding during early childhood, auditory-neurophysiological responses were elicited to a consonant-vowel syllable in quiet and background noise in a cohort of typically-developing preschoolers (ages 3-5 yr). Overall, responses were degraded in noise: they were smaller, less stable across trials, slower, and there was poorer coding of spectral content and the temporal envelope. These effects were exacerbated in response to the consonant transition relative to the vowel, suggesting that the neural coding of spectrotemporally-dynamic speech features is more tenuous in noise than the coding of static features-even in children this young. Neural coding of speech temporal fine structure, however, was more resilient to the addition of background noise than coding of temporal envelope information. Taken together, these results demonstrate that noise places a neurophysiological constraint on speech processing during early childhood by causing a breakdown in neural processing of speech acoustics. These results may explain why some listeners have inordinate difficulties understanding speech in noise. Speech-elicited auditory-neurophysiological responses offer objective insight into listening skills during early childhood by reflecting the integrity of neural coding in quiet and noise; this paper documents typical response

  12. Visual speech form influences the speed of auditory speech processing.

    Science.gov (United States)

    Paris, Tim; Kim, Jeesun; Davis, Chris

    2013-09-01

    An important property of visual speech (movements of the lips and mouth) is that it generally begins before auditory speech. Research using brain-based paradigms has demonstrated that seeing visual speech speeds up the activation of the listener's auditory cortex but it is not clear whether these observed neural processes link to behaviour. It was hypothesized that the very early portion of visual speech (occurring before auditory speech) will allow listeners to predict the following auditory event and so facilitate the speed of speech perception. This was tested in the current behavioural experiments. Further, we tested whether the salience of the visual speech played a role in this speech facilitation effect (Experiment 1). We also determined the relative contributions that visual form (what) and temporal (when) cues made (Experiment 2). The results showed that visual speech cues facilitated response times and that this was based on form rather than temporal cues. Copyright © 2013 Elsevier Inc. All rights reserved.

  13. Machine musicianship

    Science.gov (United States)

    Rowe, Robert

    2002-05-01

    The training of musicians begins by teaching basic musical concepts, a collection of knowledge commonly known as musicianship. Computer programs designed to implement musical skills (e.g., to make sense of what they hear, perform music expressively, or compose convincing pieces) can similarly benefit from access to a fundamental level of musicianship. Recent research in music cognition, artificial intelligence, and music theory has produced a repertoire of techniques that can make the behavior of computer programs more musical. Many of these were presented in a recently published book/CD-ROM entitled Machine Musicianship. For use in interactive music systems, we are interested in those which are fast enough to run in real time and that need only make reference to the material as it appears in sequence. This talk will review several applications that are able to identify the tonal center of musical material during performance. Beyond this specific task, the design of real-time algorithmic listening through the concurrent operation of several connected analyzers is examined. The presentation includes discussion of a library of C++ objects that can be combined to perform interactive listening and a demonstration of their capability.

  14. Anxiety and ritualized speech

    Science.gov (United States)

    Lalljee, Mansur; Cook, Mark

    1975-01-01

    The experiment examines the effects of anxiety on a number of words that seem irrelevant to semantic communication. The Units of Ritualized Speech (URSs) considered are: 'I mean', 'in fact', 'really', 'sort of', 'well' and 'you know'. (Editor)

  16. HATE SPEECH AS COMMUNICATION

    National Research Council Canada - National Science Library

    Gladilin Aleksey Vladimirovich

    2012-01-01

    The purpose of the paper is a theoretical comprehension of hate speech from a communication point of view, on the one hand, and from the point of view of prejudice, stereotypes and discrimination on the other...

  17. Speech intelligibility in hospitals.

    Science.gov (United States)

    Ryherd, Erica E; Moeller, Michael; Hsu, Timothy

    2013-07-01

    Effective communication between staff members is key to patient safety in hospitals. A variety of patient care activities including admittance, evaluation, and treatment rely on oral communication. Surprisingly, published information on speech intelligibility in hospitals is extremely limited. In this study, speech intelligibility measurements and occupant evaluations were conducted in 20 units of five different U.S. hospitals. A variety of unit types and locations were studied. Results show that overall, no unit had "good" intelligibility based on the speech intelligibility index (SII > 0.75), and several locations were found to have "poor" intelligibility (SII < 0.45). This study documents speech intelligibility across a variety of hospitals and unit types, offers some evidence of the positive impact of absorption on intelligibility, and identifies areas for future research.

  18. Speech disorders - children

    Science.gov (United States)

    ... this page: //medlineplus.gov/ency/article/001430.htm Speech disorders - children To use the sharing features on ... 2017, A.D.A.M., Inc. Duplication for commercial use must be authorized in writing by ADAM ...

  19. Speech impairment (adult)

    Science.gov (United States)

    ... this page: //medlineplus.gov/ency/article/003204.htm Speech impairment (adult) To use the sharing features on ... 2017, A.D.A.M., Inc. Duplication for commercial use must be authorized in writing by ADAM ...

  20. A language for easy and efficient modeling of Turing machines

    Institute of Scientific and Technical Information of China (English)

    Pinaki Chakraborty

    2007-01-01

    A Turing Machine Description Language (TMDL) is developed for easy and efficient modeling of Turing machines. TMDL supports formal symbolic representation of Turing machines. The grammar for the language is also provided. Then a fast single-pass compiler is developed for TMDL. The scope of code optimization in the compiler is examined. An interpreter is used to simulate the exact behavior of the compiled Turing machines. A dynamically allocated and resizable array is used to simulate the infinite tape of a Turing machine. The procedure for simulating composite Turing machines is also explained. In this paper, two sample Turing machines have been designed in TMDL and their simulations are discussed. The TMDL can be extended to model the different variations of the standard Turing machine.
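
    The record above describes simulating the infinite tape with a dynamically allocated, resizable array. TMDL itself is not reproduced in the abstract, so the rule format below is an assumption; this is a minimal Python sketch of such a simulator, using a dictionary as a sparse tape that grows on demand in both directions.

```python
from collections import defaultdict

def run_turing_machine(rules, tape_input, start="q0", accept="qa",
                       blank="_", max_steps=10000):
    """Simulate a single-tape Turing machine.

    rules: {(state, symbol): (new_state, write_symbol, move)}, move in {"L", "R"}.
    The tape is a defaultdict, so it grows on demand in both directions,
    mimicking the dynamically resizable tape described in the abstract.
    """
    tape = defaultdict(lambda: blank, enumerate(tape_input))
    state, head = start, 0
    for _ in range(max_steps):
        if state == accept:
            break
        key = (state, tape[head])
        if key not in rules:
            break  # halt: no applicable rule
        state, tape[head], move = rules[key]
        head += 1 if move == "R" else -1
    return state, "".join(tape[i] for i in sorted(tape)).strip(blank)

# Example: a machine that flips every bit, then accepts at the first blank.
flip = {
    ("q0", "0"): ("q0", "1", "R"),
    ("q0", "1"): ("q0", "0", "R"),
    ("q0", "_"): ("qa", "_", "R"),
}
print(run_turing_machine(flip, "1011"))  # ('qa', '0100')
```

    A composite machine, as mentioned in the abstract, could be simulated by feeding one machine's final tape to the next machine in the chain.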

  1. Speech Compression and Synthesis

    Science.gov (United States)

    1980-10-01

    phonological rules combined with diphone synthesis improved the algorithms used by the phonetic synthesis program for gain normalization and time... phonetic vocoder, spectral template. This report describes our work for the past two years on speech compression and synthesis. Since there was an... from Block 19: speech recognition, phoneme recognition. ...initial design for a phonetic recognition program. We also recorded and partially labeled a...

  2. Degraded neural and behavioral processing of speech sounds in a rat model of Rett syndrome.

    Science.gov (United States)

    Engineer, Crystal T; Rahebi, Kimiya C; Borland, Michael S; Buell, Elizabeth P; Centanni, Tracy M; Fink, Melyssa K; Im, Kwok W; Wilson, Linda G; Kilgard, Michael P

    2015-11-01

    Individuals with Rett syndrome have greatly impaired speech and language abilities. Auditory brainstem responses to sounds are normal, but cortical responses are highly abnormal. In this study, we used the novel rat Mecp2 knockout model of Rett syndrome to document the neural and behavioral processing of speech sounds. We hypothesized that both speech discrimination ability and the neural response to speech sounds would be impaired in Mecp2 rats. We expected that extensive speech training would improve speech discrimination ability and the cortical response to speech sounds. Our results reveal that speech responses across all four auditory cortex fields of Mecp2 rats were hyperexcitable, responded slower, and were less able to follow rapidly presented sounds. While Mecp2 rats could accurately perform consonant and vowel discrimination tasks in quiet, they were significantly impaired at speech sound discrimination in background noise. Extensive speech training improved discrimination ability. Training shifted cortical responses in both Mecp2 and control rats to favor the onset of speech sounds. While training increased the response to low frequency sounds in control rats, the opposite occurred in Mecp2 rats. Although neural coding and plasticity are abnormal in the rat model of Rett syndrome, extensive therapy appears to be effective. These findings may help to explain some aspects of communication deficits in Rett syndrome and suggest that extensive rehabilitation therapy might prove beneficial.

  3. Representation Learning Based Speech Assistive System for Persons With Dysarthria.

    Science.gov (United States)

    Chandrakala, S; Rajeswari, Natarajan

    2017-09-01

    An assistive system for persons with vocal impairment due to dysarthria converts dysarthric speech to normal speech or text. Because of the articulatory deficits, dysarthric speech recognition needs a robust learning technique. Representation learning is significant for complex tasks such as dysarthric speech recognition. We focus on robust representation for dysarthric speech recognition that involves recognizing sequential patterns of varying length utterances. We propose a hybrid framework that uses a generative learning based data representation with a discriminative learning based classifier. In this hybrid framework, we propose to use Example Specific Hidden Markov Models (ESHMMs) to obtain log-likelihood scores for a dysarthric speech utterance to form a fixed dimensional score vector representation. This representation is used as an input to a discriminative classifier such as a support vector machine. The performance of the proposed approach is evaluated using the UA-Speech database. The recognition accuracy is much better than the conventional hidden Markov model based approach and the Deep Neural Network-Hidden Markov Model (DNN-HMM). The efficiency of the discriminative nature of the score vector representation is proved for "very low" intelligibility words.
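
    The core idea, mapping a variable-length utterance to a fixed-dimensional vector of per-model log-likelihood scores, can be sketched as follows. The paper's Example Specific HMMs are not described in detail in the abstract, so simple Gaussian scorers stand in for them here, and all parameter values are invented for illustration.

```python
import math

def gaussian_loglik(xs, mean, var):
    """Log-likelihood of a 1-D sample sequence under a Gaussian model."""
    return sum(-0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
               for x in xs)

def score_vector(utterance, models):
    """Map a variable-length utterance to a fixed-dimensional vector of
    log-likelihood scores, one per example-specific model. In the paper these
    models are ESHMMs; plain Gaussians stand in here for brevity."""
    return [gaussian_loglik(utterance, mean, var) for (mean, var) in models]

models = [(0.0, 1.0), (3.0, 1.0)]          # two hypothetical example models
short = score_vector([0.1, -0.2], models)
long_ = score_vector([2.9, 3.1, 3.0, 2.8], models)
print(len(short), len(long_))  # 2 2 -> same dimension regardless of length
```

    Because every utterance yields a vector of the same length, the scores can be fed directly to a discriminative classifier such as an SVM.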

  4. Reiteration: At the Intersection of Code-Switching and Translation

    Science.gov (United States)

    Harjunpää, Katariina; Mäkilähde, Aleksi

    2016-01-01

    One of the most studied forms of multilingual language use is "code-switching," the use of more than one language within a speech exchange. Some forms of code-switching may also be regarded as instances of "translation," but the relation between these notions in studies of multilingual discourse remains underspecified. The…

  5. Understanding the effect of noise on electrical stimulation sequences in cochlear implants and its impact on speech intelligibility.

    Science.gov (United States)

    Qazi, Obaid Ur Rehman; van Dijk, Bas; Moonen, Marc; Wouters, Jan

    2013-05-01

    The present study investigates the most important factors that limit the intelligibility of cochlear implant (CI) processed speech in noisy environments. The electrical stimulation sequences provided in CIs are affected by noise in three ways. First, the natural gaps in the speech are filled, which distorts the low-frequency ON/OFF modulations of the speech signal. Second, speech envelopes are distorted to include modulations of both speech and noise. Third, N-of-M speech coding strategies may select noise-dominated channels instead of the dominant speech channels at low signal-to-noise ratios (SNRs). Different stimulation sequences are tested with CI subjects to study how these three noise effects individually limit the intelligibility of CI processed speech. Tests are also conducted with normal hearing (NH) subjects using vocoded speech to identify any significant differences in the noise reduction requirements and speech distortion limitations between the two subject groups. Results indicate that compared to NH subjects, CI subjects can tolerate significantly lower levels of steady state speech shaped noise in the speech gaps but at the same time can tolerate comparable levels of distortion in the speech segments. Furthermore, modulations in the stimulus current level have no effect on speech intelligibility as long as the channel selection remains ideal. Finally, wrong maxima selection together with the introduction of noise in the speech gaps significantly degrades intelligibility. At low SNRs, wrong maxima selection introduces interruptions in the speech and makes it difficult to fuse noisy and interrupted speech signals into a coherent speech stream. Copyright © 2013 Elsevier B.V. All rights reserved.
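
    The wrong-maxima effect described above can be illustrated with a toy N-of-M selection: at low SNR, a noise-dominated channel can displace a speech-dominant one. The per-channel energies below are invented for illustration; real strategies operate on filterbank envelopes frame by frame.

```python
def select_maxima(channel_energies, n):
    """N-of-M strategy: for one stimulation frame, keep the indices of the
    n highest-energy channels out of the m filterbank channels."""
    ranked = sorted(range(len(channel_energies)),
                    key=lambda i: channel_energies[i], reverse=True)
    return sorted(ranked[:n])

# Hypothetical per-channel energies for a single frame (not from the paper).
speech = [0.9, 0.1, 0.8, 0.05, 0.7, 0.1]
noise  = [0.0, 0.85, 0.0, 0.6, 0.0, 0.6]
noisy  = [s + w for s, w in zip(speech, noise)]

print(select_maxima(speech, 3))  # [0, 2, 4]: the speech-dominant channels
print(select_maxima(noisy, 3))   # [0, 1, 2]: noise channel 1 displaces speech channel 4
```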

  6. Electrical machines mathematical fundamentals of machine topologies

    CERN Document Server

    Gerling, Dieter

    2015-01-01

    Electrical Machines and Drives play a powerful role in industry with an ever increasing importance. This fact requires the understanding of machine and drive principles by engineers of many different disciplines. Therefore, this book is intended to give a comprehensive deduction of these principles. Special attention is given to the precise mathematical derivation of the necessary formulae to calculate machines and drives and to the discussion of simplifications (if applied) with the associated limits. The book shows how the different machine topologies can be deduced from general fundamentals, and how they are linked together. This book addresses graduate students, researchers, and developers of Electrical Machines and Drives, who are interested in getting knowledge about the principles of machine and drive operation and in detecting the mathematical and engineering specialties of the different machine and drive topologies together with their mutual links. The detailed - but nevertheless compact - mat...

  7. Interactive Systems for Designing Machine Elements and Assemblies

    Directory of Open Access Journals (Sweden)

    Kacalak Wojciech

    2015-09-01

    Full Text Available The article describes the development of the fundamentals of automating the design of machine elements and assemblies using artificial intelligence and natural-language descriptions of structural features. In the proposed interactive automated design systems, computational artificial intelligence methods allow communication by speech and natural language, enabling analyses of the design engineer's messages, analyses of constructions, encoding and assessment of constructions, CAD system control, and visualization. The system is equipped with several adaptive intelligent layers for human biometric identification, recognition of speech and handwriting, recognition of words, and analysis and recognition of messages, enabling interpretation of messages and assessment of human reactions. The article proposes a concept of intelligent processing for the analysis of natural-language descriptions of machine elements' structural features. It also presents the developed methodology for similarity analysis between the structural features of designed machine elements and corresponding antipatterns, allowing normalization of the parameters of the analysed structural solutions.

  8. Human speech articulator measurements using low power, 2GHz Homodyne sensors

    Energy Technology Data Exchange (ETDEWEB)

    Barnes, T; Burnett, G C; Holzrichter, J F

    1999-06-29

    Very low power, short-range microwave ''radar-like'' sensors can measure the motions and vibrations of internal human speech articulators as speech is produced. In these animate (and also in inanimate acoustic systems) microwave sensors can measure vibration information associated with excitation sources and other interfaces. These data, together with the corresponding acoustic data, enable the calculation of system transfer functions. This information appears to be useful for a surprisingly wide range of applications such as speech coding and recognition, speaker or object identification, speech and musical instrument synthesis, noise cancellation, and other applications.

  9. Variable Frame Rate and Length Analysis for Data Compression in Distributed Speech Recognition

    DEFF Research Database (Denmark)

    Kraljevski, Ivan; Tan, Zheng-Hua

    2014-01-01

    This paper addresses the issue of data compression in distributed speech recognition on the basis of a variable frame rate and length analysis method. The method first conducts frame selection by using a posteriori signal-to-noise ratio weighted energy distance to find the right time resolution...... length for steady regions. The method is applied to scalable source coding in distributed speech recognition where the target bitrate is met by adjusting the frame rate. Speech recognition results show that the proposed approach outperforms other compression methods in terms of recognition accuracy...... for noisy speech while achieving higher compression rates....
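
    A minimal sketch of the frame-selection idea above: accumulate a frame-to-frame distance and emit a frame only when it crosses a threshold, so steady regions contribute fewer frames. The paper weights the distance by a posteriori SNR; plain energy distance stands in here, and the threshold and energies are invented for illustration.

```python
def select_frames(frame_energies, threshold):
    """Variable frame rate analysis: emit a frame only once the accumulated
    energy distance since the last emitted frame exceeds a threshold.
    Lowering the threshold raises the frame rate (and thus the bitrate)."""
    kept, acc = [0], 0.0
    for i in range(1, len(frame_energies)):
        acc += abs(frame_energies[i] - frame_energies[i - 1])
        if acc >= threshold:
            kept.append(i)
            acc = 0.0
    return kept

energies = [1.0, 1.0, 1.0, 4.0, 8.0, 8.1, 8.0, 2.0]  # steady, transient, steady
print(select_frames(energies, 3.0))  # [0, 3, 4, 7]: steady regions yield few frames
```

    Meeting a target bitrate, as in the paper, would then amount to tuning the threshold until the kept-frame rate matches the budget.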

  10. Laser machining of advanced materials

    CERN Document Server

    Dahotre, Narendra B

    2011-01-01

    Advanced materials: Introduction; Applications; Structural ceramics; Biomaterials; Composites; Intermetallics. Machining of advanced materials: Introduction; Fabrication techniques; Mechanical machining; Chemical machining (CM); Electrical machining; Radiation machining; Hybrid machining. Laser machining: Introduction; Absorption of laser energy and multiple reflections; Thermal effects; Laser machining of structural ceramics; Introdu...

  11. Machine-to-machine communications architectures, technology, standards, and applications

    CERN Document Server

    Misic, Vojislav B

    2014-01-01

    With the number of machine-to-machine (M2M)-enabled devices projected to reach 20 to 50 billion by 2020, there is a critical need to understand the demands imposed by such systems. Machine-to-Machine Communications: Architectures, Technology, Standards, and Applications offers rigorous treatment of the many facets of M2M communication, including its integration with current technology.Presenting the work of a different group of international experts in each chapter, the book begins by supplying an overview of M2M technology. It considers proposed standards, cutting-edge applications, architectures, and traffic modeling and includes case studies that highlight the differences between traditional and M2M communications technology.Details a practical scheme for the forward error correction code designInvestigates the effectiveness of the IEEE 802.15.4 low data rate wireless personal area network standard for use in M2M communicationsIdentifies algorithms that will ensure functionality, performance, reliability, ...

  12. Computer-based speech therapy for childhood speech sound disorders.

    Science.gov (United States)

    Furlong, Lisa; Erickson, Shane; Morris, Meg E

    2017-07-01

    With the current worldwide workforce shortage of Speech-Language Pathologists, new and innovative ways of delivering therapy to children with speech sound disorders are needed. Computer-based speech therapy may be an effective and viable means of addressing service access issues for children with speech sound disorders. To evaluate the efficacy of computer-based speech therapy programs for children with speech sound disorders. Studies reporting the efficacy of computer-based speech therapy programs were identified via a systematic, computerised database search. Key study characteristics, results, main findings and details of computer-based speech therapy programs were extracted. The methodological quality was evaluated using a structured critical appraisal tool. 14 studies were identified and a total of 11 computer-based speech therapy programs were evaluated. The results showed that computer-based speech therapy is associated with positive clinical changes for some children with speech sound disorders. There is a need for collaborative research between computer engineers and clinicians, particularly during the design and development of computer-based speech therapy programs. Evaluation using rigorous experimental designs is required to understand the benefits of computer-based speech therapy. The reader will be able to 1) discuss how computer-based speech therapy has the potential to improve service access for children with speech sound disorders, 2) explain the ways in which computer-based speech therapy programs may enhance traditional tabletop therapy and 3) compare the features of computer-based speech therapy programs designed for different client populations. Copyright © 2017 Elsevier Inc. All rights reserved.

  13. SPEECH DISORDERS ENCOUNTERED DURING SPEECH THERAPY AND THERAPY TECHNIQUES

    Directory of Open Access Journals (Sweden)

    İlhan ERDEM

    2013-06-01

    Full Text Available Speech is a physical and mental process in which agreed-upon signs and sounds convey a message from one mind to another. To identify the sounds of speech, it is essential to know the structure and function of the various organs that make speaking possible. Because speech is both a physical and a mental process, many factors can lead to speech disorders. A speech disorder may stem from language acquisition as well as from a range of medical and psychological causes. Speaking is the collective work of many organs, like an orchestra, and since speech is such a complex skill with a mental dimension, it must be determined which of these obstacles inhibits speaking. A speech disorder is a defect in the flow, rhythm, pitch, stress, composition, or vocalization of speech. This study surveys speech disorders such as articulation disorders, stuttering, aphasia, dysarthria, local dialect speech, tongue- and lip-laziness, and overly rapid speech, investigates their causes, and presents suggestions for their remedy.

  14. Voice Activity Detection Using Fuzzy Entropy and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    R. Johny Elton

    2016-08-01

    Full Text Available This paper proposes support vector machine (SVM based voice activity detection using FuzzyEn to improve detection performance under noisy conditions. The proposed voice activity detection (VAD uses fuzzy entropy (FuzzyEn as a feature extracted from noise-reduced speech signals to train an SVM model for speech/non-speech classification. The proposed VAD method was tested by conducting various experiments by adding real background noises of different signal-to-noise ratios (SNR ranging from −10 dB to 10 dB to actual speech signals collected from the TIMIT database. The analysis proves that FuzzyEn feature shows better results in discriminating noise and corrupted noisy speech. The efficacy of the SVM classifier was validated using 10-fold cross validation. Furthermore, the results obtained by the proposed method was compared with those of previous standardized VAD algorithms as well as recently developed methods. Performance comparison suggests that the proposed method is proven to be more efficient in detecting speech under various noisy environments with an accuracy of 93.29%, and the FuzzyEn feature detects speech efficiently even at low SNR levels.
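
    Fuzzy entropy replaces sample entropy's hard similarity threshold with a smooth exponential membership function. A compact sketch of the feature follows the common Chen-style formulation; the parameter choices and test signals are illustrative, not those of the paper.

```python
import math

def fuzzy_entropy(signal, m=2, r=0.2, n=2):
    """Fuzzy entropy (FuzzyEn): like sample entropy, but vector similarity is
    graded by the membership function exp(-(d/r)**n) instead of a hard
    threshold, where d is the Chebyshev distance between mean-removed
    embedding vectors of length m (and m + 1)."""
    def phi(dim):
        # Mean-removed embedding vectors of length `dim`.
        vecs = []
        for i in range(len(signal) - dim + 1):
            w = signal[i:i + dim]
            mu = sum(w) / dim
            vecs.append([x - mu for x in w])
        total, count = 0.0, 0
        for i in range(len(vecs)):
            for j in range(i + 1, len(vecs)):
                d = max(abs(a - b) for a, b in zip(vecs[i], vecs[j]))
                total += math.exp(-((d / r) ** n))
                count += 1
        return total / count
    return -math.log(phi(m + 1) / phi(m))

flat = [1.0] * 10                                   # perfectly regular
noise_like = [0.3, -0.5, 0.8, -0.2, 0.6, -0.7, 0.1, 0.9, -0.4, 0.2]
print(fuzzy_entropy(flat) == 0.0)  # True: a constant signal has zero irregularity
print(round(fuzzy_entropy(noise_like), 3))
```

    In the paper's pipeline, FuzzyEn values computed per frame of the noise-reduced signal become the feature that the SVM uses for speech/non-speech classification.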

  15. The deleuzian abstract machines

    DEFF Research Database (Denmark)

    Werner Petersen, Erik

    2005-01-01

    production. In Kafka: Toward a Minor Literature, Deleuze and Guattari gave the most comprehensive explanation of the abstract machine in the work of art. Like the war-machines of Virilio, the Kafka-machine operates in three gears or speeds. Furthermore, the machine is connected to spatial diagrams...

  16. Teaching Speech Communication with a Foreign Accent: A Pilot Study.

    Science.gov (United States)

    Chen, Guo-Ming; Chung, Jensen

    A pilot study examined problems encountered by foreign instructors teaching in American colleges. Fourteen Chinese-born instructors teaching in Speech Communication answered a questionnaire containing 12 open-ended questions. Recurring themes were coded from the answers, and then organized into three categories: cultural differences; linguistic…

  17. Speech-Language Therapy (For Parents)

    Science.gov (United States)

    ... Speech-language pathologists (SLPs), often informally known as speech therapists, are professionals educated in the study of human ... Palate Hearing Evaluation in Children Going to a Speech Therapist Stuttering Hearing Impairment Speech Problems Cleft Lip and ...

  18. A Transducer/Equipment System for Capturing Speech Information for Subsequent Processing by Computer Systems

    Science.gov (United States)

    1994-01-07

    have shown interest in a speech capture system that would operate in a noisy lobby, casino, airport, or shopping mall floor for access to the Automated... control or selection. Vending machines, shopping dispenser kiosks, and entertainment virtual reality games of the future will all be voice activated... EVALUATION RESEARCH INC, TR-3150-178: high speech recognition accuracy for commercial applications; automated drive-thru fast food ordering...

  19. Transitioning from analog to digital audio recording in childhood speech sound disorders

    Science.gov (United States)

    Shriberg, Lawrence D.; McSweeny, Jane L.; Anderson, Bruce E.; Campbell, Thomas F.; Chial, Michael R.; Green, Jordan R.; Hauner, Katherina K.; Moore, Christopher A.; Rusiewicz, Heather L.; Wilson, David L.

    2014-01-01

    Few empirical findings or technical guidelines are available on the current transition from analog to digital audio recording in childhood speech sound disorders. Of particular concern in the present context was whether a transition from analog- to digital-based transcription and coding of prosody and voice features might require re-standardizing a reference database for research in childhood speech sound disorders. Two research transcribers with different levels of experience glossed, transcribed, and prosody-voice coded conversational speech samples from eight children with mild to severe speech disorders of unknown origin. The samples were recorded, stored, and played back using representative analog and digital audio systems. Effect sizes calculated for an array of analog versus digital comparisons ranged from negligible to medium, with a trend for participants’ speech competency scores to be slightly lower for samples obtained and transcribed using the digital system. We discuss the implications of these and other findings for research and clinical practice. PMID:16019779

  20. Managing the reaction effects of speech disorders on speech ...

    African Journals Online (AJOL)

    Speech disorders are responsible for defective speaking. It is usually ... They occur as a result of the persistent frustrations which speech defectives usually encounter for speaking defectively. This paper ...

  1. Under-resourced speech recognition based on the speech manifold

    CSIR Research Space (South Africa)

    Sahraeian, R

    2015-09-01

    Full Text Available Conventional acoustic modeling involves estimating many parameters to effectively model feature distributions. The sparseness of speech and text data, however, degrades the reliability of the estimation process and makes speech recognition a...

  2. Emotive codes in Dostoevsky's "The Gambler"

    Directory of Open Access Journals (Sweden)

    Iskhakova Z.Z.

    2016-10-01

    Full Text Available This article is devoted to encoding and decoding the emotional image of F. M. Dostoyevsky through the variant emotive deictic field and its emotive codes. The object of study is the emotional speech of the characters in the works of F. M. Dostoyevsky; the subject is the emotive signs in that speech. The aim of the article is to identify the most frequent emotive signs in Dostoyevsky's "The Gambler" that reflect the emotional state of the characters and through which the Russian writer expresses himself.

  3. Speech production deficits in early readers: predictors of risk.

    Science.gov (United States)

    Foy, Judith G; Mann, Virginia A

    2012-04-01

    Speech problems and reading disorders are linked, suggesting that speech problems may potentially be an early marker of later difficulty in associating graphemes with phonemes. Current norms suggest that complete mastery of the production of the consonant phonemes in English occurs in most children at around 6-7 years. Many children enter formal schooling (kindergarten) around 5 years of age with near-adult levels of speech production. Given that previous research has shown that speech production abilities and phonological awareness skills are linked in preschool children, we set out to examine whether this pattern also holds for children just beginning to learn to read, as suggested by the critical age hypothesis. In the present study, using a diverse sample, we explored whether expressive phonological skills in 92 5-year-old children at the beginning and end of kindergarten were associated with early reading skills. Speech errors were coded according to whether they were developmentally appropriate, position within the syllable, manner of production of the target sounds, and whether the error involved a substitution, omission, or addition of a speech sound. At the beginning of the school year, children with significant early reading deficits on a predictively normed test (DIBELS) made more speech errors than children who were at grade level. Most of these errors were typical of kindergarten children (e.g., substitutions involving fricatives), but reading-delayed children made more of these errors than children who entered kindergarten with grade level skills. The reading-delayed children also made more atypical errors, consistent with our previous findings about preschoolers. Children who made no speech errors at the beginning of kindergarten had superior early reading abilities, and improvements in speech errors over the course of the year were significantly correlated with year-end reading skills. 
The roles of expressive vocabulary and working memory were also

  4. EMOTIONAL SPEECH RECOGNITION BASED ON SVM WITH GMM SUPERVECTOR

    Institute of Scientific and Technical Information of China (English)

    Chen Yanxiang; Xie Jian

    2012-01-01

    Emotion recognition from speech is an important field of research in human computer interaction. In this letter the framework of Support Vector Machines (SVM) with a Gaussian Mixture Model (GMM) supervector is introduced for emotional speech recognition. Because of the importance of variance in reflecting the distribution of speech, normalized mean vectors, which have the potential to exploit information from the variance, are adopted to form the GMM supervector. Comparative experiments from five aspects are conducted to study their corresponding effect on system performance. The experiment results, which indicate that the influence of the number of mixtures is strong while the influence of duration is weak, provide a basis for the training set selection of the Universal Background Model (UBM).
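
    The GMM-supervector idea is to stack the per-component mean vectors into one fixed-length vector that an SVM can consume. In this sketch, a commonly used sqrt(weight)/sigma scaling stands in for the letter's exact variance-based normalization, and all parameter values are invented.

```python
import math

def gmm_supervector(means, variances, weights):
    """Stack the mean vector of every mixture component into one fixed-length
    supervector, normalizing each dimension by sqrt(weight) / sigma (a common
    KL-divergence-motivated scaling; not necessarily the letter's exact one)."""
    sv = []
    for mu, var, w in zip(means, variances, weights):
        sv.extend(math.sqrt(w) * m / math.sqrt(v) for m, v in zip(mu, var))
    return sv

# Two mixture components over 3-D features (hypothetical per-utterance GMM).
means = [[1.0, 2.0, 0.5], [-1.0, 0.0, 3.0]]
variances = [[1.0, 4.0, 1.0], [1.0, 1.0, 9.0]]
weights = [0.25, 0.75]
sv = gmm_supervector(means, variances, weights)
print(len(sv))  # 6: components x feature dimension, independent of utterance length
```

    Because the supervector length depends only on the number of mixtures and the feature dimension, utterances of any duration map to the same SVM input space.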

  5. SPEECH/MUSIC CLASSIFICATION USING WAVELET BASED FEATURE EXTRACTION TECHNIQUES

    Directory of Open Access Journals (Sweden)

    Thiruvengatanadhan Ramalingam

    2014-01-01

    Full Text Available Audio classification is the fundamental step in coping with the rapid growth in audio data volume. Due to the increasing size of multimedia sources, speech and music classification is one of the most important issues for multimedia information retrieval. In this work a speech/music discrimination system is developed which utilizes the Discrete Wavelet Transform (DWT) as the acoustic feature. Multiresolution analysis is the most significant statistical way to extract features from the input signal, and in this study a method is deployed to model the extracted wavelet feature. Support Vector Machines (SVM) are based on the principle of structural risk minimization. SVM is applied to classify audio into its classes, namely speech and music, by learning from training data. The proposed method then extends the application of Gaussian Mixture Models (GMM) to estimate the probability density function using maximum likelihood decision methods. The system shows significant results with an accuracy of 94.5%.
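
    A minimal illustration of a DWT-based feature: per-level detail-coefficient energies from a Haar transform. The abstract does not specify the wavelet or the feature post-processing, so the Haar choice, the level count, and the toy frame below are all assumptions.

```python
def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform: pairwise sums
    (approximation) and differences (detail), each scaled by 1/sqrt(2)."""
    s = 2 ** 0.5
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def subband_energies(signal, levels=3):
    """Multiresolution feature: the energy of the detail coefficients at each
    DWT level, a crude stand-in for the wavelet features fed to the SVM/GMM."""
    feats = []
    for _ in range(levels):
        signal, detail = haar_dwt(signal)
        feats.append(sum(d * d for d in detail))
    return feats

frame = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]  # toy high-frequency frame
print(subband_energies(frame))
```

    Frames of speech and music concentrate energy in different subbands, which is what lets a classifier separate the two from vectors like this one.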

  6. Intelligibility of speech of children with speech and sound disorders

    OpenAIRE

    Ivetac, Tina

    2014-01-01

    The purpose of this study is to examine speech intelligibility of children with primary speech and sound disorders aged 3 to 6 years in everyday life. The research problem is based on the degree to which parents or guardians, immediate family members (sister, brother, grandparents), extended family members (aunt, uncle, cousin), child's friends, other acquaintances, child's teachers and strangers understand the speech of children with speech sound disorders. We examined whether the level ...

  7. Automatic speech recognition An evaluation of Google Speech

    OpenAIRE

    Stenman, Magnus

    2015-01-01

    The use of speech recognition is increasing rapidly and it is now available in smart TVs, desktop computers, every new smartphone, etc., allowing us to talk to computers naturally. With its use in home appliances, education and even surgical procedures, accuracy and speed become very important. This thesis aims to give an introduction to speech recognition and discuss its use in robotics. An evaluation of Google Speech, using Google’s speech API, with regard to word error rate and translation ...

  8. Differential Diagnosis of Severe Speech Disorders Using Speech Gestures

    Science.gov (United States)

    Bahr, Ruth Huntley

    2005-01-01

    The differentiation of childhood apraxia of speech from severe phonological disorder is a common clinical problem. This article reports on an attempt to describe speech errors in children with childhood apraxia of speech on the basis of gesture use and acoustic analyses of articulatory gestures. The focus was on the movement of articulators and…

  9. Machine learning a probabilistic perspective

    CERN Document Server

    Murphy, Kevin P

    2012-01-01

    Today's Web-enabled deluge of electronic data calls for automated methods of data analysis. Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach. The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic method...

  10. Automated analysis of free speech predicts psychosis onset in high-risk youths.

    Science.gov (United States)

    Bedi, Gillinder; Carrillo, Facundo; Cecchi, Guillermo A; Slezak, Diego Fernández; Sigman, Mariano; Mota, Natália B; Ribeiro, Sidarta; Javitt, Daniel C; Copelli, Mauro; Corcoran, Cheryl M

    2015-01-01

    Psychiatry lacks the objective clinical tests routinely used in other specializations. Novel computerized methods to characterize complex behaviors such as speech could be used to identify and predict psychiatric illness in individuals. In this proof-of-principle study, our aim was to test automated speech analyses combined with Machine Learning to predict later psychosis onset in youths at clinical high-risk (CHR) for psychosis. Thirty-four CHR youths (11 females) had baseline interviews and were assessed quarterly for up to 2.5 years; five transitioned to psychosis. Using automated analysis, transcripts of interviews were evaluated for semantic and syntactic features predicting later psychosis onset. Speech features were fed into a convex hull classification algorithm with leave-one-subject-out cross-validation to assess their predictive value for psychosis outcome. The canonical correlation between the speech features and prodromal symptom ratings was computed. Derived speech features included a Latent Semantic Analysis measure of semantic coherence and two syntactic markers of speech complexity: maximum phrase length and use of determiners (e.g., which). These speech features predicted later psychosis development with 100% accuracy, outperforming classification from clinical interviews. Speech features were significantly correlated with prodromal symptoms. Findings support the utility of automated speech analysis to measure subtle, clinically relevant mental state changes in emergent psychosis. Recent developments in computer science, including natural language processing, could provide the foundation for future development of objective clinical tests for psychiatry.
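
    The semantic-coherence measure mentioned above can be illustrated with a small LSA-style sketch: build a word-by-sentence count matrix, project the sentences into a truncated-SVD space, and average the cosine similarity of consecutive sentences. The toy sentences and dimensionality are invented; this is not the study's pipeline:

```python
import numpy as np

def lsa_coherence(sentences, k=2):
    """Mean cosine similarity between consecutive sentences in a
    k-dimensional latent semantic space built from raw word counts."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(vocab), len(sentences)))       # word-by-sentence counts
    for j, s in enumerate(sentences):
        for w in s.lower().split():
            X[index[w], j] += 1
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    Z = (S[:k, None] * Vt[:k]).T                     # sentence vectors, (n, k)
    sims = []
    for a, b in zip(Z[:-1], Z[1:]):
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        sims.append(float(a @ b / denom) if denom > 1e-9 else 0.0)
    return float(np.mean(sims))

# a topically consistent triple vs. one with an off-topic middle sentence
coherent = ["the cat sat on the mat",
            "the cat likes the mat",
            "a cat sleeps on a mat"]
scattered = ["the cat sat on the mat",
             "stock prices fell",
             "a cat sleeps on a mat"]
c_high = lsa_coherence(coherent)
c_low = lsa_coherence(scattered)
```

    Low first-order coherence of this kind is the sort of signal the study fed, with syntactic markers, into its classifier.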

  11. Tackling the complexity in speech

    DEFF Research Database (Denmark)

    section includes four carefully selected chapters. They deal with facets of speech production, speech acoustics, and/or speech perception or recognition, place them in an integrated phonetic-phonological perspective, and relate them in more or less explicit ways to aspects of speech technology. Therefore......, we hope that this volume can help speech scientists with traditional training in phonetics and phonology to keep up with the latest developments in speech technology. In the opposite direction, speech researchers starting from a technological perspective will hopefully get inspired by reading about...... the questions, phenomena, and communicative functions that are currently addressed in phonetics and phonology. Either way, the future of speech research lies in international, interdisciplinary collaborations, and our volume is meant to reflect and facilitate such collaborations...

  12. Denial Denied: Freedom of Speech

    Directory of Open Access Journals (Sweden)

    Glen Newey

    2009-12-01

    Full Text Available Free speech is a widely held principle. This is in some ways surprising, since formal and informal censorship of speech is widespread, and rather different issues seem to arise depending on whether the censorship concerns who speaks, what content is spoken or how it is spoken. I argue that despite these facts, free speech can indeed be seen as a unitary principle. On my analysis, the core of the free speech principle is the denial of the denial of speech, whether to a speaker, to a proposition, or to a mode of expression. Underlying free speech is the principle of freedom of association, according to which speech is both a precondition of future association (e.g. as a medium for negotiation) and a mode of association in its own right. I conclude by applying this account briefly to two contentious issues: hate speech and pornography.

  14. Current trends in small vocabulary speech recognition for equipment control

    Science.gov (United States)

    Doukas, Nikolaos; Bardis, Nikolaos G.

    2017-09-01

    Speech recognition systems allow human-machine communication to acquire an intuitive nature that approaches the simplicity of inter-human communication. Small vocabulary speech recognition is a subset of the overall speech recognition problem, where only a small number of words need to be recognized. Speaker independent small vocabulary recognition can find significant applications in field equipment used by military personnel. Such equipment may typically be controlled by a small number of commands that need to be given quickly and accurately, under conditions where delicate manual operations are difficult to achieve. This type of application could hence significantly benefit from the use of robust voice operated control components, as they would facilitate the interaction with their users and render it much more reliable in times of crisis. This paper presents current challenges involved in attaining efficient and robust small vocabulary speech recognition. These challenges concern feature selection, classification techniques, speaker diversity and noise effects. A state machine approach is presented that facilitates the voice guidance of different equipment in a variety of situations.
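
    The state-machine approach mentioned above can be sketched as a small transition table driven by recognized command words, with out-of-vocabulary or currently invalid commands rejected. The states and vocabulary below are invented for illustration, not taken from the paper:

```python
# Minimal state machine for small-vocabulary voice control of a device.
# States and command words are illustrative only.
TRANSITIONS = {
    ("idle",    "power"): "ready",
    ("ready",   "start"): "running",
    ("running", "stop"):  "ready",
    ("ready",   "power"): "idle",
}

class VoiceController:
    def __init__(self):
        self.state = "idle"

    def handle(self, word):
        """Apply one recognized command word; reject anything that is
        out of vocabulary or invalid in the current state."""
        nxt = TRANSITIONS.get((self.state, word))
        if nxt is not None:
            self.state = nxt
            return True
        return False
```

    Constraining which words are legal in each state also shrinks the recognizer's active vocabulary, which helps robustness in noise.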

  15. Proof-Carrying Code with Correct Compilers

    Science.gov (United States)

    Appel, Andrew W.

    2009-01-01

    In the late 1990s, proof-carrying code was able to produce machine-checkable safety proofs for machine-language programs even though (1) it was impractical to prove correctness properties of source programs and (2) it was impractical to prove correctness of compilers. But now it is practical to prove some correctness properties of source programs, and it is practical to prove correctness of optimizing compilers. We can produce more expressive proof-carrying code, that can guarantee correctness properties for machine code and not just safety. We will construct program logics for source languages, prove them sound w.r.t. the operational semantics of the input language for a proved-correct compiler, and then use these logics as a basis for proving the soundness of static analyses.

  16. Single-Channel Speech Separation using Sparse Non-Negative Matrix Factorization

    DEFF Research Database (Denmark)

    Schmidt, Mikkel N.; Olsson, Rasmus Kongsgaard

    2007-01-01

    We apply machine learning techniques to the problem of separating multiple speech sources from a single microphone recording. The method of choice is a sparse non-negative matrix factorization algorithm, which in an unsupervised manner can learn sparse representations of the data. This is applied...... to the learning of personalized dictionaries from a speech corpus, which in turn are used to separate the audio stream into its components. We show that computational savings can be achieved by segmenting the training data on a phoneme level. To split the data, a conventional speech recognizer is used...
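
    A minimal sketch of the core factorization, assuming standard multiplicative updates for the Euclidean cost with an L1 sparsity penalty on the activations (the authors' exact algorithm and the phoneme-level dictionary learning are not reproduced here):

```python
import numpy as np

def sparse_nmf(V, rank, sparsity=0.1, iters=200, seed=0):
    """Factor V ~= W @ H (all nonnegative) by multiplicative updates for
    the Euclidean cost with an L1 penalty on the activations H.
    A bare-bones sketch, not the paper's exact algorithm."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + 0.1
    H = rng.random((rank, m)) + 0.1
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + 1e-12)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-12)
    return W, H

# exact low-rank nonnegative data is recovered to low residual
rng = np.random.default_rng(0)
V = rng.random((20, 4)) @ rng.random((4, 30))
W, H = sparse_nmf(V, rank=4)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

    In the separation setting, the columns of W play the role of the personalized dictionaries and H the activations of each source.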

  17. National Working Conference on Organization Code Held in Jinan

    Institute of Scientific and Technical Information of China (English)

    2004-01-01

    On March 22nd, 2004, the National Working Conference on Organization Code was held in Jinan. Li Zhonghai, director of the Standardization Administration of China, put forward in his speech: "We shall position and develop the code work of 2004 taking advantage of modern informatization management, to re-establish the undertaking and make the code work stand on the front line of the construction of state informatization".

  18. Speech spectrogram expert

    Energy Technology Data Exchange (ETDEWEB)

    Johannsen, J.; Macallister, J.; Michalek, T.; Ross, S.

    1983-01-01

    Various authors have pointed out that humans can become quite adept at deriving phonetic transcriptions from speech spectrograms (as good as 90% accuracy at the phoneme level). The authors describe an expert system which attempts to simulate this performance. The speech spectrogram expert (SPEX) is actually a society made up of three experts: a 2-dimensional vision expert, an acoustic-phonetic expert, and a phonetics expert. The visual reasoning expert finds important visual features of the spectrogram. The acoustic-phonetic expert reasons about how visual features relate to phonemes, and about how phonemes change visually in different contexts. The phonetics expert reasons about allowable phoneme sequences and transformations, and deduces an English spelling for phoneme strings. The speech spectrogram expert is highly interactive, allowing users to investigate hypotheses and edit rules. 10 references.

  19. RECOGNISING SPEECH ACTS

    Directory of Open Access Journals (Sweden)

    Phyllis Kaburise

    2012-09-01

    Full Text Available Speech Act Theory (SAT), a theory in pragmatics, is an attempt to describe what happens during linguistic interactions. Inherent within SAT is the idea that language forms and intentions are relatively formulaic and that there is a direct correspondence between sentence forms (for example, in terms of structure and lexicon) and the function or meaning of an utterance. The contention offered in this paper is that when such a correspondence does not exist, as in indirect speech utterances, this creates challenges for English second language speakers and may result in miscommunication. This arises because indirect speech acts allow speakers to employ various pragmatic devices such as inference, implicature, presuppositions and context clues to transmit their messages. Such devices, operating within the non-literal level of language competence, may pose challenges for ESL learners.

  20. Protection limits on free speech

    Institute of Scientific and Technical Information of China (English)

    李敏

    2014-01-01

    Freedom of speech is one of the basic rights of citizens and should receive broad protection. But in the real context of China, which kinds of speech can be protected and which restricted, and how to draw the limit between state power and free speech, are questions worth considering. People tend to ignore freedom of speech and its function, so that some arguments cannot be demonstrated in open debates.

  1. Accurate discrimination of conserved coding and non-coding regions through multiple indicators of evolutionary dynamics

    Directory of Open Access Journals (Sweden)

    Pesole Graziano

    2009-09-01

    Full Text Available Abstract Background The conservation of sequences between related genomes has long been recognised as an indication of functional significance, and recognition of sequence homology is one of the principal approaches used in the annotation of newly sequenced genomes. In the context of recent findings that the number of non-coding transcripts in higher organisms is likely to be much higher than previously imagined, discrimination between conserved coding and non-coding sequences is a topic of considerable interest. Additionally, it is desirable to discriminate between coding and non-coding conserved sequences without recourse to sequence similarity searches of protein databases, as such approaches exclude the identification of novel conserved proteins without characterized homologs and may be influenced by the presence in databases of sequences which are erroneously annotated as coding. Results Here we present a machine learning-based approach for the discrimination of conserved coding sequences. Our method calculates various statistics related to the evolutionary dynamics of two aligned sequences. These features are considered by a Support Vector Machine which designates the alignment as coding or non-coding with an associated probability score. Conclusion We show that our approach is both sensitive and accurate with respect to comparable methods and illustrate several situations in which it may be applied, including the identification of conserved coding regions in genome sequences and the discrimination of coding from non-coding cDNA sequences.

  2. Management of Statically Modifiable Prolog Code

    Institute of Scientific and Technical Information of China (English)

    张晨曦; 慈云桂

    1989-01-01

    The Warren Abstract Machine is an efficient execution model for Prolog, which has become the basis of many high-performance Prolog systems. However, little support for the implementation of the non-logical components of Prolog is provided in the WAM. The original Warren code is not modifiable. In this paper, we show how static modifications of Warren code can be achieved by adding a few instructions and a little extra information to the code. The implementation of the code manager is discussed. Algorithms for some basic operations are given.

  3. The University and Free Speech

    OpenAIRE

    Grcic, Joseph

    2014-01-01

    Free speech is a necessary condition for the growth of knowledge and the implementation of real and rational democracy. Educational institutions play a central role in socializing individuals to function within their society. Academic freedom is the right to free speech in the context of the university and tenure, properly interpreted, is a necessary component of protecting academic freedom and free speech.

  4. Designing speech for a recipient

    DEFF Research Database (Denmark)

    Fischer, Kerstin

    is investigated on three candidates for so-called ‘simplified registers’: speech to children (also called motherese or baby talk), speech to foreigners (also called foreigner talk) and speech to robots. The volume integrates research from various disciplines, such as psychology, sociolinguistics...

  5. ADMINISTRATIVE GUIDE IN SPEECH CORRECTION.

    Science.gov (United States)

    HEALEY, WILLIAM C.

    WRITTEN PRIMARILY FOR SCHOOL SUPERINTENDENTS, PRINCIPALS, SPEECH CLINICIANS, AND SUPERVISORS, THIS GUIDE OUTLINES THE MECHANICS OF ORGANIZING AND CONDUCTING SPEECH CORRECTION ACTIVITIES IN THE PUBLIC SCHOOLS. IT INCLUDES THE REQUIREMENTS FOR CERTIFICATION OF A SPEECH CLINICIAN IN MISSOURI AND DESCRIBES ESSENTIAL STEPS FOR THE DEVELOPMENT OF A…

  6. The Aster code; Code Aster

    Energy Technology Data Exchange (ETDEWEB)

    Delbecq, J.M

    1999-07-01

    The Aster code is a 2D or 3D finite-element calculation code for structures developed by the R and D direction of Electricite de France (EdF). This dossier presents a complete overview of the characteristics and uses of the Aster code: introduction of version 4; the context of Aster (organisation of the code development, versions, systems and interfaces, development tools, quality assurance, independent validation); static mechanics (linear thermo-elasticity, Euler buckling, cables, Zarka-Casier method); non-linear mechanics (materials behaviour, big deformations, specific loads, unloading and loss of load proportionality indicators, global algorithm, contact and friction); rupture mechanics (G energy restitution level, restitution level in thermo-elasto-plasticity, 3D local energy restitution level, KI and KII stress intensity factors, calculation of limit loads for structures); specific treatments (fatigue, rupture, wear, error estimation); meshes and models (mesh generation, modeling, loads and boundary conditions, links between different modeling processes, resolution of linear systems, display of results, etc.); vibration mechanics (modal and harmonic analysis, dynamics with shocks, direct transient dynamics, seismic analysis and aleatory dynamics, non-linear dynamics, dynamical sub-structuring); fluid-structure interactions (internal acoustics, mass, rigidity and damping); linear and non-linear thermal analysis; steels and metal industry (structure transformations); coupled problems (internal chaining, internal thermo-hydro-mechanical coupling, chaining with other codes); products and services. (J.S.)

  7. SPEECH DISORDERS ENCOUNTERED DURING SPEECH THERAPY AND THERAPY TECHNIQUES

    OpenAIRE

    2013-01-01

    Speech, which is both a physical and a mental process, uses agreed signs and sounds to carry a message from one mind to another. To identify the sounds of speech, it is essential to know the structure and function of the various organs that make conversation possible. Because speech is a physical and mental process, many factors can lead to speech disorders. A speech disorder can concern language acquisition, and it can also be caused by many medical and psychological factors. Disordered sp...

  8. Freedom of Speech and Hate Speech: an analysis of possible limits for freedom of speech

    National Research Council Canada - National Science Library

    Riva Sobrado de Freitas; Matheus Felipe de Castro

    2013-01-01

      With a view to determining the outlines of the Freedom of Speech and specifying its contents, we face hate speech as an offensive and repulsive manifestation, particularly directed at minority groups...

  9. Machine tool structures

    CERN Document Server

    Koenigsberger, F

    1970-01-01

    Machine Tool Structures, Volume 1 deals with fundamental theories and calculation methods for machine tool structures. Experimental investigations into stiffness are discussed, along with the application of the results to the design of machine tool structures. Topics covered range from static and dynamic stiffness to chatter in metal cutting, stability in machine tools, and deformations of machine tool structures. This volume is divided into three sections and opens with a discussion on stiffness specifications and the effect of stiffness on the behavior of the machine under forced vibration c

  10. Optimal codes as Tanner codes with cyclic component codes

    DEFF Research Database (Denmark)

    Høholdt, Tom; Pinero, Fernando; Zeng, Peng

    2014-01-01

    In this article we study a class of graph codes with cyclic code component codes as affine variety codes. Within this class of Tanner codes we find some optimal binary codes. We use a particular subgraph of the point-line incidence plane of A(2,q) as the Tanner graph, and we are able to describe...... the codes succinctly using Gröbner bases....

  11. Speech transmission index from running speech: A neural network approach

    Science.gov (United States)

    Li, F. F.; Cox, T. J.

    2003-04-01

    Speech transmission index (STI) is an important objective parameter concerning speech intelligibility for sound transmission channels. It is normally measured with specific test signals to ensure high accuracy and good repeatability. Measurement with running speech was previously proposed, but accuracy is compromised and hence applications limited. A new approach that uses artificial neural networks to accurately extract the STI from received running speech is developed in this paper. Neural networks are trained on a large set of transmitted speech examples with prior knowledge of the transmission channels' STIs. The networks perform complicated nonlinear function mappings and spectral feature memorization to enable accurate objective parameter extraction from transmitted speech. Validations via simulations demonstrate the feasibility of this new method on a one-net-one-speech extract basis. In this case, accuracy is comparable with normal measurement methods. This provides an alternative to standard measurement techniques, and it is intended that the neural network method can facilitate occupied room acoustic measurements.
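
    The idea of learning a nonlinear mapping from speech-derived features to the STI can be sketched with a toy one-hidden-layer network on synthetic data. The features, the target function and the architecture below are all invented stand-ins, not the paper's trained networks:

```python
import numpy as np

# Synthetic stand-in: 3 "spectral features" per sample, one smooth
# nonlinear target in (0, 1) playing the role of the STI.
rng = np.random.default_rng(0)
X = rng.random((200, 3))
y = 1.0 / (1.0 + np.exp(-(X @ np.array([1.5, -2.0, 0.5]) - 0.2)))

# one-hidden-layer network trained by plain gradient descent on squared error
W1 = rng.normal(0.0, 0.5, (3, 8)); b1 = np.zeros(8)
W2 = rng.normal(0.0, 0.5, (8, 1)); b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)
    return h, (h @ W2 + b2).ravel()

_, pred = forward(X)
loss_before = float(np.mean((pred - y) ** 2))

lr = 0.05
for _ in range(500):
    h, pred = forward(X)
    g = 2.0 * (pred - y)[:, None] / len(X)      # d(mean sq. error)/d(output)
    gW2 = h.T @ g; gb2 = g.sum(axis=0)
    dh = g @ W2.T * (1.0 - h ** 2)              # backprop through tanh
    gW1 = X.T @ dh; gb1 = dh.sum(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

_, pred = forward(X)
loss_after = float(np.mean((pred - y) ** 2))
```

    The paper's networks differ in scale and in their spectral input features, but the regression principle is the same.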

  12. Speech-language pathologists' views on attrition from the profession.

    Science.gov (United States)

    McLaughlin, Emma; Lincoln, Michelle; Adamson, Barbara

    2008-01-01

    The aim of this study was to identify common themes in speech-language pathologists' perceptions of factors that increase and decrease their experiences of job stress, their satisfaction with their jobs and the profession, and their opinions about why people chose to leave the speech-language pathology profession. The participants' perceptions about the relationships between job stress, work satisfaction and job and profession retention were also explored. Sixty members of Speech Pathology Australia from a range of geographical and professional contexts were asked to participate in telephone interviews. Eighteen speech-language pathologists agreed to participate (30% response rate), and took part in semi-structured telephone interviews. Two researchers independently coded transcripts of the interviews for themes. Eight major themes were identified. These were positive aspects of the profession, workload, non-work obligations, effectiveness, recognition, support, learning and autonomy. The themes that emerged from analysis of these interviews provide new evidence about the positive and negative aspects of working as a speech-language pathologist, and provide preliminary insights into potential reasons as to why speech-language pathologists choose to remain in or leave the profession.

  13. Brain-inspired speech segmentation for automatic speech recognition using the speech envelope as a temporal reference

    OpenAIRE

    Byeongwook Lee; Kwang-Hyun Cho

    2016-01-01

    Speech segmentation is a crucial step in automatic speech recognition because additional speech analyses are performed for each framed speech segment. Conventional segmentation techniques primarily segment speech using a fixed frame size for computational simplicity. However, this approach is insufficient for capturing the quasi-regular structure of speech, which causes substantial recognition failure in noisy environments. How does the brain handle quasi-regular structured speech and maintai...
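
    The envelope's role as a temporal reference can be illustrated with a crude amplitude-envelope segmenter: smooth the rectified signal with a moving average, then keep the regions above a relative threshold. The paper's actual brain-inspired method is more elaborate; this is only a sketch of the underlying idea:

```python
import numpy as np

def envelope_segments(x, fs, win_ms=20.0, thresh=0.1):
    """Envelope-guided segmentation: moving-average envelope of |x|,
    then (start, end) sample indices of regions above a relative threshold."""
    w = max(1, int(fs * win_ms / 1000.0))
    env = np.convolve(np.abs(x), np.ones(w) / w, mode="same")
    active = env > thresh * env.max()
    segs, start = [], None
    for i, a in enumerate(active):          # group consecutive active samples
        if a and start is None:
            start = i
        elif not a and start is not None:
            segs.append((start, i)); start = None
    if start is not None:
        segs.append((start, len(x)))
    return segs

# two tone bursts separated by silence should yield two segments
fs = 8000
t = np.arange(fs) / fs
sig = np.where((t < 0.2) | ((t > 0.5) & (t < 0.7)),
               np.sin(2 * np.pi * 440 * t), 0.0)
segs = envelope_segments(sig, fs)
```

    Segment boundaries found this way track the quasi-regular envelope structure rather than a fixed frame grid.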

  14. Automatic Speech Acquisition and Recognition for Spacesuit Audio Systems

    Science.gov (United States)

    Ye, Sherry

    2015-01-01

    NASA has a widely recognized but unmet need for novel human-machine interface technologies that can facilitate communication during astronaut extravehicular activities (EVAs), when loud noises and strong reverberations inside spacesuits make communication challenging. WeVoice, Inc., has developed a multichannel signal-processing method for speech acquisition in noisy and reverberant environments that enables automatic speech recognition (ASR) technology inside spacesuits. The technology reduces noise by exploiting differences between the statistical nature of signals (i.e., speech) and noise that exists in the spatial and temporal domains. As a result, ASR accuracy can be improved to the level at which crewmembers will find the speech interface useful. System components and features include beam forming/multichannel noise reduction, single-channel noise reduction, speech feature extraction, feature transformation and normalization, feature compression, and ASR decoding. Arithmetic complexity models were developed and will help designers of real-time ASR systems select proper tasks when confronted with constraints in computational resources. In Phase I of the project, WeVoice validated the technology. The company further refined the technology in Phase II and developed a prototype for testing and use by suited astronauts.
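
    The beamforming/multichannel noise-reduction component can be illustrated with a minimal delay-and-sum sketch on synthetic signals, assuming the steering delays are already known (a real system must estimate them and does considerably more):

```python
import numpy as np

def delay_and_sum(channels, delays):
    """Delay-and-sum beamforming with known integer steering delays:
    align each microphone channel, then average. Coherent speech adds up,
    while uncorrelated noise partially cancels."""
    out = np.zeros(channels.shape[1])
    for ch, d in zip(channels, delays):
        out += np.roll(ch, -d)                  # undo the arrival delay
    return out / len(channels)

# synthetic 4-microphone scene: delayed copies of a clean tone plus noise
rng = np.random.default_rng(1)
n = 2048
clean = np.sin(2 * np.pi * 0.01 * np.arange(n))
delays = [0, 3, 5, 8]                           # assumed known, in samples
channels = np.stack([np.roll(clean, d) + 0.5 * rng.standard_normal(n)
                     for d in delays])
enhanced = delay_and_sum(channels, delays)
```

    Averaging C aligned channels reduces independent noise power by roughly a factor of C, which is the kind of gain that makes downstream ASR usable.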

  15. Plug Into "The Modernizing Machine"! Danish University Reform and Its Transformable Academic Subjectivities

    Science.gov (United States)

    Krejsler, John Benedicto

    2013-01-01

    "The modernizing machine" codes individual bodies, things, and symbols with images from New Public Management, neo-liberal, and Knowledge Economy discourses. Drawing on Deleuze and Guattari's concept of machines, this article explores how "the modernizing machine" produces neo-liberal modernization of the public sector. Taking…

  16. Global Freedom of Speech

    DEFF Research Database (Denmark)

    Binderup, Lars Grassme

    2007-01-01

    , as opposed to a legal norm, that curbs exercises of the right to free speech that offend the feelings or beliefs of members from other cultural groups. The paper rejects the suggestion that acceptance of such a norm is in line with liberal egalitarian thinking. Following a review of the classical liberal...

  17. Speech and Hearing Therapy.

    Science.gov (United States)

    Sakata, Reiko; Sakata, Robert

    1978-01-01

    In the public school, the speech and hearing therapist attempts to foster child growth and development through the provision of services basic to awareness of self and others, management of personal and social interactions, and development of strategies for coping with the handicap. (MM)

  18. Perceptual learning in speech

    NARCIS (Netherlands)

    Norris, D.; McQueen, J.M.; Cutler, A.

    2003-01-01

    This study demonstrates that listeners use lexical knowledge in perceptual learning of speech sounds. Dutch listeners first made lexical decisions on Dutch words and nonwords. The final fricative of 20 critical words had been replaced by an ambiguous sound, between [f] and [s]. One group of listener

  19. Speech and Language Delay

    Science.gov (United States)

    ... home affect my child’s language and speech?The brain has to work harder to interpret and use 2 languages, so it may take longer for children to start using either one or both of the languages they’re learning. It’s not unusual for a bilingual child to ...

  20. Mandarin Visual Speech Information

    Science.gov (United States)

    Chen, Trevor H.

    2010-01-01

    While the auditory-only aspects of Mandarin speech are heavily-researched and well-known in the field, this dissertation addresses its lesser-known aspects: The visual and audio-visual perception of Mandarin segmental information and lexical-tone information. Chapter II of this dissertation focuses on the audiovisual perception of Mandarin…

  1. Speech After Banquet

    Science.gov (United States)

    Yang, Chen Ning

    2013-05-01

    I am usually not so short of words, but the previous speeches have rendered me really speechless. I have known and admired the eloquence of Freeman Dyson, but I did not know that there is a hidden eloquence in my colleague George Sterman...

  2. Speech disfluency in centenarians.

    Science.gov (United States)

    Searl, Jeffrey P; Gabel, Rodney M; Fulks, J Steven

    2002-01-01

    Other than a single case presentation of a 105-year-old female, no other studies have addressed the speech fluency characteristics of centenarians. The purpose of this study was to provide descriptive information on the fluency characteristics of speakers between the ages of 100-103 years. Conversational speech samples from seven speakers were evaluated for the frequency and types of disfluencies and speech rate. The centenarian speakers had a disfluency rate similar to that reported for 70-, 80-, and early 90-year-olds. The types of disfluencies observed also were similar to those reported for younger elderly speakers (primarily whole word/phrase, or formulative fluency breaks). Finally, the speech rate data for the current group of speakers supports prior literature reports of a slower rate with advancing age, but extends the finding to centenarians. As a result of this activity, participants will be able to: (1) describe the frequency of disfluency breaks and the types of disfluencies exhibited by centenarian speakers, (2) describe the mean and range of speaking rates in centenarians, and (3) compare the present findings for centenarians to the fluency and speaking rate characteristics reported in the literature.

  4. The Commercial Speech Doctrine.

    Science.gov (United States)

    Luebke, Barbara F.

    In its 1942 ruling in the "Valentine vs. Christensen" case, the Supreme Court established the doctrine that commercial speech is not protected by the First Amendment. In 1975, in the "Bigelow vs. Virginia" case, the Supreme Court took a decisive step toward abrogating that doctrine, by ruling that advertising is not stripped of…

  5. Lateralized speech perception in normal-hearing and hearing-impaired listeners and its relationship to temporal processing

    DEFF Research Database (Denmark)

    Locsei, Gusztav; Pedersen, Julie Hefting; Laugesen, Søren;

    2016-01-01

    This study investigated the role of temporal fine structure (TFS) coding in spatially complex, lateralized listening tasks. Speech reception thresholds (SRTs) were measured in young normal-hearing (NH) and two groups of elderly hearing-impaired (HI) listeners in the presence of speech-shaped noise...

  6. Design of Demining Machines

    CERN Document Server

    Mikulic, Dinko

    2013-01-01

    In a constant effort to eliminate the danger of mines, the international mine action community has been improving the safety, efficiency and cost-effectiveness of clearance methods. Demining machines have become necessary in humanitarian demining, where mechanization provides greater safety and productivity. Design of Demining Machines describes the development and testing of modern demining machines in humanitarian demining. Relevant data for the design of demining machines are included to explain the machinery implemented, along with some innovative and inspiring development solutions. Development technologies, companies and projects are discussed to provide a comprehensive estimate of the effects of various design factors and to support the proper selection of optimal parameters for designing demining machines. Covering the dynamic processes occurring in machine assemblies and their components to give a broader understanding of the demining machine as a whole, Design of Demining Machines is primarily tailored as a tex...

  7. Applied machining technology

    CERN Document Server

    Tschätsch, Heinz

    2010-01-01

    Machining and cutting technologies are still crucial for many manufacturing processes. This reference presents all important machining processes in a comprehensive and coherent way. It includes many examples of concrete calculations, problems and solutions.

  8. Machining with abrasives

    CERN Document Server

    Jackson, Mark J

    2011-01-01

    Abrasive machining is key to obtaining the desired geometry and surface quality in manufacturing. This book discusses the fundamentals and advances in the abrasive machining processes. It provides a complete overview of developing areas in the field.

  9. Women, Men, and Machines.

    Science.gov (United States)

    Form, William; McMillen, David Byron

    1983-01-01

    Data from the first national study of technological change show that proportionately more women than men operate machines, are more exposed to machines that have alienating effects, and suffer more from the negative effects of technological change. (Author/SSH)

  10. Brain versus Machine Control.

    Directory of Open Access Journals (Sweden)

    Jose M Carmena

    2004-12-01

    Full Text Available Dr. Octopus, the villain of the movie "Spiderman 2", is a fusion of man and machine. Neuroscientist Jose Carmena examines the facts behind this fictional account of a brain-machine interface.

  11. Fingerprinting Communication and Computation on HPC Machines

    Energy Technology Data Exchange (ETDEWEB)

    Peisert, Sean

    2010-06-02

    How do we identify what is actually running on high-performance computing systems? Names of binaries, dynamic libraries loaded, or other elements in a submission to a batch queue can give clues, but binary names can be changed, and libraries provide limited insight and resolution on the code being run. In this paper, we present a method for "fingerprinting" code running on HPC machines using elements of communication and computation. We then discuss how that fingerprint can be used to determine if the code is consistent with certain other types of codes, what a user usually runs, or what the user requested an allocation to do. In some cases, our techniques enable us to fingerprint HPC codes using runtime MPI data with a high degree of accuracy.

  12. Conversation, speech acts, and memory.

    Science.gov (United States)

    Holtgraves, Thomas

    2008-03-01

    Speakers frequently have specific intentions that they want others to recognize (Grice, 1957). These specific intentions can be viewed as speech acts (Searle, 1969), and I argue that they play a role in long-term memory for conversation utterances. Five experiments were conducted to examine this idea. Participants in all experiments read scenarios ending with either a target utterance that performed a specific speech act (brag, beg, etc.) or a carefully matched control. Participants were more likely to falsely recall and recognize speech act verbs after having read the speech act version than after having read the control version, and the speech act verbs served as better recall cues for the speech act utterances than for the controls. Experiment 5 documented individual differences in the encoding of speech act verbs. The results suggest that people recognize and retain the actions that people perform with their utterances and that this is one of the organizing principles of conversation memory.

  13. Metaheuristic applications to speech enhancement

    CERN Document Server

    Kunche, Prajna

    2016-01-01

    This book serves as a basic reference for those interested in the application of metaheuristics to speech enhancement. The major goal of the book is to explain the basic concepts of optimization methods and their use in heuristic optimization in speech enhancement to scientists, practicing engineers, and academic researchers in speech processing. The authors discuss why it has been a challenging problem for researchers to develop new enhancement algorithms that aid the quality and intelligibility of degraded speech. They present powerful optimization methods for speech enhancement that can help solve noise reduction problems. Readers will be able to understand the fundamentals of speech processing as well as the optimization techniques, and how speech enhancement algorithms are implemented by utilizing optimization methods, and will be given the tools to develop new algorithms. The authors also provide a comprehensive literature survey regarding the topic.

  14. Improved Statistical Speech Segmentation Using Connectionist Approach

    Directory of Open Access Journals (Sweden)

    M. S. Salam

    2009-01-01

    Full Text Available Problem statement: Speech segmentation is an important part of speech recognition, synthesis and coding. Statistical approaches detect segmentation points by computing the spectral distortion of the signal without prior knowledge of the acoustic information; they have proved able to give good matches and few omissions, but many insertions, and these insertion points lower segmentation accuracy. Approach: This study proposed a fusion of statistical and connectionist approaches, namely the divergence algorithm and a Multi-Layer Perceptron (MLP) with adaptive learning, for segmentation of Malay connected digits, with the aim of improving the statistical approach through detection of insertion points. The neural network was optimized via trial and error to find suitable parameters and speech time-normalization methods. The best neural network classifier was then fused with the divergence algorithm to perform segmentation. Results: The experiments showed that the best neural network classifier used a learning rate of 1.0 and a momentum rate of 0.9, with data normalization based on zero-padding. Segmentation using the fusion of statistical and connectionist approaches reduced insertion points by up to 10.4% while maintaining match points above 99% and omission points below 0.7% within a time tolerance of 0.09 second. Conclusion: The segmentation results of the proposed fusion method indicate the potential of the connectionist approach for improving continuous segmentation by the statistical approach.

  15. Relationship between speech motor control and speech intelligibility in children with speech sound disorders.

    Science.gov (United States)

    Namasivayam, Aravind Kumar; Pukonen, Margit; Goshulak, Debra; Yu, Vickie Y; Kadis, Darren S; Kroll, Robert; Pang, Elizabeth W; De Nil, Luc F

    2013-01-01

    The current study was undertaken to investigate the impact of speech motor issues on the speech intelligibility of children with moderate to severe speech sound disorders (SSD) within the context of the PROMPT intervention approach. The word-level Children's Speech Intelligibility Measure (CSIM), the sentence-level Beginner's Intelligibility Test (BIT) and tests of speech motor control and articulation proficiency were administered to 12 children (3:11 to 6:7 years) before and after PROMPT therapy. PROMPT treatment was provided for 45 min twice a week for 8 weeks. Twenty-four naïve adult listeners aged 22-46 years judged the intelligibility of the words and sentences. For CSIM, each time a recorded word was played to the listeners they were asked to look at a list of 12 words (multiple-choice format) and circle the word while for BIT sentences, the listeners were asked to write down everything they heard. Words correctly circled (CSIM) or transcribed (BIT) were averaged across three naïve judges to calculate percentage speech intelligibility. Speech intelligibility at both the word and sentence level was significantly correlated with speech motor control, but not articulatory proficiency. Further, the severity of speech motor planning and sequencing issues may potentially be a limiting factor in connected speech intelligibility and highlights the need to target these issues early and directly in treatment. The reader will be able to: (1) outline the advantages and disadvantages of using word- and sentence-level speech intelligibility tests; (2) describe the impact of speech motor control and articulatory proficiency on speech intelligibility; and (3) describe how speech motor control and speech intelligibility data may provide critical information to aid treatment planning. Copyright © 2013 Elsevier Inc. All rights reserved.

  16. A Mobile Phone based Speech Therapist

    OpenAIRE

    Pandey, Vinod K.; Pande, Arun; Kopparapu, Sunil Kumar

    2016-01-01

    Patients with articulatory disorders often have difficulty in speaking. These patients need several speech therapy sessions to enable them speak normally. These therapy sessions are conducted by a specialized speech therapist. The goal of speech therapy is to develop good speech habits as well as to teach how to articulate sounds the right way. Speech therapy is critical for continuous improvement to regain normal speech. Speech therapy sessions require a patient to travel to a hospital or a ...

  17. Doubly Fed Induction Machine Control For Wind Energy Conversion System

    Science.gov (United States)

    2009-06-01

    this_block) % Revision History: % % 18-Dec-2008 (15:15 hours): % Original code was machine generated by Xilinx's System Generator % after...this_block.setTopLevelLanguage('VHDL'); this_block.setEntityName('code'); % System Generator has to assume that your entity has a combinational % feed through

  18. An object-oriented extension for debugging the virtual machine

    Energy Technology Data Exchange (ETDEWEB)

    Pizzi, R.G. Jr. [California Univ., Davis, CA (United States)

    1994-12-01

    A computer is nothing more than a virtual machine programmed by source code to perform a task. The program's source code expresses abstract constructs which are compiled into some lower-level target language. When a virtual machine breaks, it can be very difficult to debug because typical debuggers provide only low-level target implementation information to the software engineer. We believe that the debugging task can be simplified by introducing aspects of the abstract design and data into the source code. We introduce OODIE, an object-oriented extension to programming languages that allows programmers to specify a virtual environment by describing the meaning of the design and data of a virtual machine. This specification is translated into symbolic information such that an augmented debugger can present engineers with a programmable debugging environment specifically tailored for the virtual machine that is to be debugged.

  19. A Universal Reactive Machine

    DEFF Research Database (Denmark)

    Andersen, Henrik Reif; Mørk, Simon; Sørensen, Morten U.

    1997-01-01

    Turing showed the existence of a model universal for the set of Turing machines, in the sense that given an encoding of any Turing machine as input, the universal Turing machine simulates it. We introduce the concept of universality for reactive systems and construct a CCS process universal...

  20. Quantum Neural Network Based Machine Translator for Hindi to English

    Directory of Open Access Journals (Sweden)

    Ravi Narayan

    2014-01-01

    Full Text Available This paper presents the machine learning based machine translation system for Hindi to English, which learns the semantically correct corpus. The quantum neural based pattern recognizer is used to recognize and learn the pattern of corpus, using the information of part of speech of individual word in the corpus, like a human. The system performs the machine translation using its knowledge gained during the learning by inputting the pair of sentences of Devnagri-Hindi and English. To analyze the effectiveness of the proposed approach, 2600 sentences have been evaluated during simulation and evaluation. The accuracy achieved on BLEU score is 0.7502, on NIST score is 6.5773, on ROUGE-L score is 0.9233, and on METEOR score is 0.5456, which is significantly higher in comparison with Google Translation and Bing Translation for Hindi to English Machine Translation.

  1. Two General Architectures for Intelligent Machine Performance Degradation Assessment

    Directory of Open Access Journals (Sweden)

    Yanwei Xu

    2015-01-01

    Full Text Available The Markov model is well suited to inferring random events whose likelihood depends on previous events. Building on this theory, the hidden Markov model extends the Markov model by inferring events from observations rather than from states. Owing to its successful applications in speech recognition, it has attracted much attention in machine fault diagnosis. This paper presents two architectures for machine performance degradation assessment, which can be used to minimize machine downtime, reduce economic loss, and improve productivity. The major difference between the two architectures is whether historical data are available to build hidden Markov models. In the case studies, bearing data with available historical data are used to demonstrate the effectiveness of the first architecture. Then, whole-life gearbox data without historical data are employed to demonstrate the effectiveness of the second architecture. The results obtained from the two case studies show that the presented architectures are well suited for machine performance degradation assessment.
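
A hidden Markov model of the kind described can be scored with the forward algorithm, which computes how likely an observation sequence is under the model. The two-state "healthy/degraded" model below, with its transition and emission probabilities, is an invented toy for illustration, not one of the architectures from the paper.

```python
import numpy as np

# Toy two-state HMM: state 0 = "healthy", state 1 = "degraded";
# observations are two quantized condition-indicator levels (0, 1).
A = np.array([[0.9, 0.1],   # transition probabilities between states
              [0.2, 0.8]])
B = np.array([[0.8, 0.2],   # emission probabilities: P(obs | state)
              [0.3, 0.7]])
pi = np.array([0.5, 0.5])   # initial state distribution

def forward_likelihood(obs):
    """Forward algorithm: P(observation sequence | model)."""
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return float(alpha.sum())
```

Comparing the likelihoods of recent observations under a "healthy" model and a "degraded" model is one simple way such architectures can flag performance degradation.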

  2. Quantum neural network based machine translator for Hindi to English.

    Science.gov (United States)

    Narayan, Ravi; Singh, V P; Chakraverty, S

    2014-01-01

    This paper presents the machine learning based machine translation system for Hindi to English, which learns the semantically correct corpus. The quantum neural based pattern recognizer is used to recognize and learn the pattern of corpus, using the information of part of speech of individual word in the corpus, like a human. The system performs the machine translation using its knowledge gained during the learning by inputting the pair of sentences of Devnagri-Hindi and English. To analyze the effectiveness of the proposed approach, 2600 sentences have been evaluated during simulation and evaluation. The accuracy achieved on BLEU score is 0.7502, on NIST score is 6.5773, on ROUGE-L score is 0.9233, and on METEOR score is 0.5456, which is significantly higher in comparison with Google Translation and Bing Translation for Hindi to English Machine Translation.
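
The BLEU score reported above follows directly from its definition: modified n-gram precision combined with a brevity penalty. A minimal sentence-level sketch with uniform weights and no smoothing (corpus-level BLEU, as used in MT evaluation, aggregates the counts over all sentences):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(candidate, reference, max_n=4):
    """BLEU with uniform weights and brevity penalty; returns 0.0 outright
    when any n-gram precision is zero (no smoothing, for clarity)."""
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum((cand & ref).values())   # clipped n-gram matches
        if overlap == 0:
            return 0.0
        precisions.append(overlap / sum(cand.values()))
    brevity = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return brevity * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

A perfect match scores 1.0; a hypothesis sharing no 4-grams with the reference scores 0.0 under this unsmoothed variant.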

  3. Asymmetric dynamic attunement of speech and gestures in the construction of children’s understanding

    Directory of Open Access Journals (Sweden)

    Lisette eDe Jonge-Hoekstra

    2016-03-01

    Full Text Available As children learn they use their speech to express words and their hands to gesture. This study investigates the interplay between real-time gestures and speech as children construct cognitive understanding during a hands-on science task. 12 children (M = 6, F = 6) from Kindergarten (n = 5) and first grade (n = 7) participated in this study. Each verbal utterance and gesture during the task was coded on a complexity scale derived from dynamic skill theory. To explore the interplay between speech and gestures, we applied a cross recurrence quantification analysis (CRQA) to the two coupled time series of the skill levels of verbalizations and gestures. The analysis focused on (1) the temporal relation between gestures and speech, (2) the relative strength and direction of the interaction between gestures and speech, (3) the relative strength and direction between gestures and speech for different levels of understanding, and (4) relations between CRQA measures and other child characteristics. The results show that older and younger children differ in the (temporal) asymmetry in the gestures-speech interaction. For younger children, the balance leans more towards gestures leading speech in time, while the balance leans more towards speech leading gestures for older children. Secondly, at the group level, speech attracts gestures in a more dynamically stable fashion than vice versa, and this asymmetry in gestures and speech extends to lower and higher understanding levels. Yet, for older children, the mutual coupling between gestures and speech is more dynamically stable regarding the higher understanding levels. Gestures and speech are more synchronized in time as children are older. A higher score on schools’ language tests is related to speech attracting gestures more rigidly and more asymmetry between gestures and speech, only for the less difficult understanding levels. A higher score on math or past science tasks is related to less asymmetry between
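
The central object of a cross recurrence quantification analysis is the cross-recurrence matrix of the two coded time series; summary measures such as the recurrence rate are then read off this matrix. A minimal sketch for categorical skill-level codes (the two series below are invented, and full CRQA adds diagonal-line measures that this sketch omits):

```python
import numpy as np

def cross_recurrence(x, y):
    """Cross-recurrence matrix: R[i, j] = 1 when series x at time i
    takes the same categorical level as series y at time j."""
    x, y = np.asarray(x), np.asarray(y)
    return (x[:, None] == y[None, :]).astype(int)

def recurrence_rate(R):
    """Fraction of recurrent points, a basic CRQA measure."""
    return float(R.mean())

gestures = [1, 1, 2, 2, 3]   # invented gesture skill levels over time
speech = [1, 2, 2, 3, 3]     # invented speech skill levels over time
R = cross_recurrence(gestures, speech)
```

Asymmetry of recurrences above versus below the main diagonal is what distinguishes "gestures leading speech" from "speech leading gestures" in such analyses.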

  4. Speech Motor Control in Fluent and Dysfluent Speech Production of an Individual with Apraxia of Speech and Broca's Aphasia

    Science.gov (United States)

    van Lieshout, Pascal H. H. M.; Bose, Arpita; Square, Paula A.; Steele, Catriona M.

    2007-01-01

    Apraxia of speech (AOS) is typically described as a motor-speech disorder with clinically well-defined symptoms, but without a clear understanding of the underlying problems in motor control. A number of studies have compared the speech of subjects with AOS to the fluent speech of controls, but only a few have included speech movement data and if…

  5. POLYSHIFT Communications Software for the Connection Machine System CM-200

    Directory of Open Access Journals (Sweden)

    William George

    1994-01-01

    Full Text Available We describe the use and implementation of a polyshift function PSHIFT for circular shifts and end-off shifts. Polyshift is useful in many scientific codes using regular grids, such as finite difference codes in several dimensions, multigrid codes, molecular dynamics computations, and lattice gauge physics computations, such as quantum chromodynamics (QCD) calculations. Our implementation of the PSHIFT function on the Connection Machine systems CM-2 and CM-200 offers a speedup of up to a factor of 3–4 compared with CSHIFT when the local data motion within a node is small. The PSHIFT routine is included in the Connection Machine Scientific Software Library (CMSSL).
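
The circular (end-around) shift that CSHIFT and PSHIFT provide can be illustrated with NumPy's roll. The periodic finite-difference stencil below is a generic example of the grid codes the abstract mentions, not code from the CMSSL:

```python
import numpy as np

def circular_shift(a, shift, axis=0):
    """End-around shift: data leaving one edge of the periodic grid
    re-enters at the opposite edge (the operation CSHIFT/PSHIFT provide)."""
    return np.roll(a, shift, axis=axis)

def periodic_laplacian(u, h=1.0):
    """1-D finite-difference Laplacian on a periodic grid, built entirely
    from circular shifts of the whole array."""
    return (circular_shift(u, -1) - 2 * u + circular_shift(u, 1)) / h**2
```

Expressing stencils through whole-array shifts is exactly what makes such routines performance-critical: each time step of a finite difference code performs several of them.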

  6. Towards Preserving Model Coverage and Structural Code Coverage

    Directory of Open Access Journals (Sweden)

    Raimund Kirner

    2009-01-01

    Full Text Available Embedded systems are often used in safety-critical environments, so thorough testing of them is mandatory. To achieve a required structural code-coverage criterion it is beneficial to derive the test data at a higher program-representation level than machine code. Higher program-representation levels include, besides the source-code level, the languages of domain-specific modeling environments with automatic code generation. For a testing framework with automatic generation of test data this enables high retargetability of the framework. In this article we address the challenge of ensuring that the structural code coverage achieved at a higher program-representation level is preserved during the code generations and code transformations down to machine code. We define the formal properties that have to be fulfilled by a code transformation to guarantee preservation of structural code coverage. Based on these properties we discuss how to preserve code coverage achieved at the source-code level. Additionally, we discuss how structural code coverage at the model level could be preserved. The results presented in this article are aimed toward the integration of support for preserving structural code coverage into compilers and code generators.

  7. Speech-To-Text Conversion STT System Using Hidden Markov Model HMM

    Directory of Open Access Journals (Sweden)

    Su Myat Mon

    2015-06-01

    Full Text Available Speech is the easiest way to communicate with each other. Speech processing is widely used in many applications like security devices, household appliances, cellular phones, ATM machines and computers. Human-computer interfaces have been developed to let people suffering from some kind of disability communicate or interact conveniently. Speech-to-Text Conversion (STT) systems have a lot of benefits for the deaf or dumb people and find their applications in our daily lives. In the same way, the aim of the system is to convert the input speech signals into text output for deaf or dumb students in the educational field. This paper presents an approach to extract features by using Mel Frequency Cepstral Coefficients (MFCC) from the speech signals of isolated spoken words, and the Hidden Markov Model (HMM) method is applied to train and test the audio files to get the recognized spoken word. The speech database is created by using MATLAB. Then the original speech signals are preprocessed, and these speech samples are converted into feature vectors which are used as the observation sequences of the Hidden Markov Model (HMM) recognizer. The feature vectors are analyzed in the HMM depending on the number of states.
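
The MFCC front end described can be sketched directly from its definition: windowed frames, power spectrum, triangular mel filterbank, log, and a DCT. The parameter values below (8 kHz sampling, 20 mel bands, 12 coefficients) are assumptions for illustration, and refinements such as pre-emphasis and liftering are omitted:

```python
import numpy as np

def mfcc(signal, sr=8000, n_fft=256, hop=128, n_mels=20, n_ceps=12):
    """Minimal MFCC front end (a sketch; real systems add pre-emphasis,
    liftering and delta features). Returns one coefficient row per frame."""
    signal = np.asarray(signal, dtype=float)
    window = np.hamming(n_fft)
    # 1. Overlapping, Hamming-windowed frames.
    frames = np.array([signal[i:i + n_fft] * window
                       for i in range(0, len(signal) - n_fft + 1, hop)])
    # 2. Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # 3. Triangular mel filterbank.
    hz_to_mel = lambda f: 2595 * np.log10(1 + f / 700)
    mel_to_hz = lambda m: 700 * (10 ** (m / 2595) - 1)
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        lo, c, hi = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, lo:c] = (np.arange(lo, c) - lo) / max(c - lo, 1)
        fbank[m - 1, c:hi] = (hi - np.arange(c, hi)) / max(hi - c, 1)
    # 4. Log mel energies, decorrelated with a DCT-II.
    logmel = np.log(power @ fbank.T + 1e-10)
    k, m = np.arange(n_ceps), np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(k, 2 * m + 1) / (2 * n_mels))
    return logmel @ dct.T
```

The rows of the returned matrix are the feature vectors that an HMM recognizer would consume as its observation sequence.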

  8. Asynchronized synchronous machines

    CERN Document Server

    Botvinnik, M M

    1964-01-01

    Asynchronized Synchronous Machines focuses on the theoretical research on asynchronized synchronous (AS) machines, which are "hybrids" of synchronous and induction machines that can operate with slip. Topics covered in this book include the initial equations; vector diagram of an AS machine; regulation in cases of deviation from the law of full compensation; parameters of the excitation system; and schematic diagram of an excitation regulator. The possible applications of AS machines and its calculations in certain cases are also discussed. This publication is beneficial for students and indiv

  9. Quantum machine learning.

    Science.gov (United States)

    Biamonte, Jacob; Wittek, Peter; Pancotti, Nicola; Rebentrost, Patrick; Wiebe, Nathan; Lloyd, Seth

    2017-09-13

    Fuelled by increasing computer power and algorithmic advances, machine learning techniques have become powerful tools for finding patterns in data. Quantum systems produce atypical patterns that classical systems are thought not to produce efficiently, so it is reasonable to postulate that quantum computers may outperform classical computers on machine learning tasks. The field of quantum machine learning explores how to devise and implement quantum software that could enable machine learning that is faster than that of classical computers. Recent work has produced quantum algorithms that could act as the building blocks of machine learning programs, but the hardware and software challenges are still considerable.

  10. Precision machine design

    CERN Document Server

    Slocum, Alexander H

    1992-01-01

    This book is a comprehensive engineering exploration of all the aspects of precision machine design - both component and system design considerations for precision machines. It addresses both theoretical analysis and practical implementation, providing many real-world design case studies as well as numerous examples of existing components and their characteristics. Fast becoming a classic, this book includes examples of analysis techniques, along with the philosophy of the solution method. It explores the physics of errors in machines and how such knowledge can be used to build an error budget for a machine, and how error budgets can be used to design more accurate machines.

  11. Coding of intonational meanings beyond F0

    DEFF Research Database (Denmark)

    Niebuhr, Oliver

    2008-01-01

    An acoustic analysis of a German read-speech corpus showed that utterance-final /t/ aspirations differ systematically depending on the accompanying nuclear accent contour. Two contours were included: terminal-falling early and late F0 peaks in terms of the Kiel Intonation Model. They correspond to…, so far solely associated with intonation contours. Hence, the traditionally separated segmental and suprasegmental coding levels seem to be more intertwined than previously thought.

  12. Quantification and Systematic Characterization of Stuttering-Like Disfluencies in Acquired Apraxia of Speech.

    Science.gov (United States)

    Bailey, Dallin J; Blomgren, Michael; DeLong, Catharine; Berggren, Kiera; Wambaugh, Julie L

    2017-06-22

    The purpose of this article is to quantify and describe stuttering-like disfluencies in speakers with acquired apraxia of speech (AOS), utilizing the Lidcombe Behavioural Data Language (LBDL). Additional purposes include measuring test-retest reliability and examining the effect of speech sample type on disfluency rates. Two types of speech samples were elicited from 20 persons with AOS and aphasia: repetition of mono- and multisyllabic words from a protocol for assessing AOS (Duffy, 2013), and connected speech tasks (Nicholas & Brookshire, 1993). Sampling was repeated at 1 and 4 weeks following initial sampling. Stuttering-like disfluencies were coded using the LBDL, which is a taxonomy that focuses on motoric aspects of stuttering. Disfluency rates ranged from 0% to 13.1% for the connected speech task and from 0% to 17% for the word repetition task. There was no significant effect of speech sampling time on disfluency rate in the connected speech task, but there was a significant effect of time for the word repetition task. There was no significant effect of speech sample type. Speakers demonstrated both major types of stuttering-like disfluencies as categorized by the LBDL (fixed postures and repeated movements). Connected speech samples yielded more reliable tallies over repeated measurements. Suggestions are made for modifying the LBDL for use in AOS in order to further add to systematic descriptions of motoric disfluencies in this disorder.

  13. Effect of Assembling and Machining Errors on Wavefront Coding Imaging Performance of Cubic Phase Mask

    Institute of Scientific and Technical Information of China (English)

    张效栋; 张林; 刘现磊; 姜丽丽

    2015-01-01

    In wavefront coding, a cubic phase mask extends the depth of field of an imaging system through optical modulation and subsequent image processing. The modulation can be described by a generalized pupil function, which captures how the phase of the light wave changes after passing through the mask. As the key optical element responsible for providing the designated spatially varying path length, the cubic phase mask's machining and assembling errors (MAEs) affect imaging performance directly by deviating the modulation. In this paper, by deriving the generalized pupil function under various errors, the influence of assembling and machining errors of the cubic phase mask on the point spread function (PSF) and the modulation transfer function (MTF) was obtained. Assessing these rules yields the influence of MAEs on imaging performance and provides basic guidelines for the machining and assembling processes. Among the errors analyzed, the assembling error around the Z axis and the sinusoidal shape error induced by vibration during machining have the largest influence on the MTF; both should therefore be avoided, and the PV value of the sinusoidal shape error should be kept within 0.5 μm.

  14. Sensorimotor Interactions in Speech Learning

    Directory of Open Access Journals (Sweden)

    Douglas M Shiller

    2011-10-01

    Full Text Available Auditory input is essential for normal speech development and plays a key role in speech production throughout the life span. In traditional models, auditory input plays two critical roles: (1) establishing the acoustic correlates of speech sounds that serve, in part, as the targets of speech production, and (2) serving as a source of feedback about a talker's own speech outcomes. This talk will focus on both of these roles, describing a series of studies that examine the capacity of children and adults to adapt to real-time manipulations of auditory feedback during speech production. In one study, we examined sensory and motor adaptation to a manipulation of auditory feedback during production of the fricative “s”. In contrast to prior accounts, adaptive changes were observed not only in speech motor output but also in subjects' perception of the sound. In a second study, speech adaptation was examined following a period of auditory-perceptual training targeting the perception of vowels. The perceptual training was found to systematically improve subjects' motor adaptation response to altered auditory feedback during speech production. The results of both studies support the idea that perceptual and motor processes are tightly coupled in speech production learning, and that the degree and nature of this coupling may change with development.

  15. A window into the intoxicated mind? Speech as an index of psychoactive drug effects.

    Science.gov (United States)

    Bedi, Gillinder; Cecchi, Guillermo A; Slezak, Diego F; Carrillo, Facundo; Sigman, Mariano; de Wit, Harriet

    2014-09-01

    Abused drugs can profoundly alter mental states in ways that may motivate drug use. These effects are usually assessed with self-report, an approach that is vulnerable to biases. Analyzing speech during intoxication may present a more direct, objective measure, offering a unique 'window' into the mind. Here, we employed computational analyses of speech semantic and topological structure after ±3,4-methylenedioxymethamphetamine (MDMA; 'ecstasy') and methamphetamine in 13 ecstasy users. In 4 sessions, participants completed a 10-min speech task after MDMA (0.75 and 1.5 mg/kg), methamphetamine (20 mg), or placebo. Latent Semantic Analyses identified the semantic proximity between speech content and concepts relevant to drug effects. Graph-based analyses identified topological speech characteristics. Group-level drug effects on semantic distances and topology were assessed. Machine-learning analyses (with leave-one-out cross-validation) assessed whether speech characteristics could predict drug condition in the individual subject. Speech after MDMA (1.5 mg/kg) had greater semantic proximity than placebo to the concepts friend, support, intimacy, and rapport. Speech on MDMA (0.75 mg/kg) had greater proximity to empathy than placebo. Conversely, speech on methamphetamine was further from compassion than placebo. Classifiers discriminated between MDMA (1.5 mg/kg) and placebo with 88% accuracy, and MDMA (1.5 mg/kg) and methamphetamine with 84% accuracy. For the two MDMA doses, the classifier performed at chance. These data suggest that automated semantic speech analyses can capture subtle alterations in mental state, accurately discriminating between drugs. The findings also illustrate the potential for automated speech-based approaches to characterize clinically relevant alterations to mental state, including those occurring in psychiatric illness.
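
Semantic proximity in a latent semantic space reduces to cosine similarity between vectors. The three-dimensional vectors below are invented for illustration; real LSA derives high-dimensional word vectors from an SVD of a large word-document co-occurrence matrix:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Invented 3-d "semantic" vectors, stand-ins for LSA-derived ones.
vectors = {
    "friend": np.array([0.9, 0.1, 0.0]),
    "support": np.array([0.8, 0.3, 0.1]),
    "alone": np.array([0.0, 0.2, 0.9]),
}

def semantic_proximity(transcript_words, concept):
    """Mean cosine similarity between known transcript words and a concept."""
    sims = [cosine(vectors[w], vectors[concept])
            for w in transcript_words if w in vectors]
    return sum(sims) / len(sims)
```

Comparing such proximities across drug conditions, and feeding them to a classifier, is the shape of the analysis the abstract describes.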

  16. Syntactic Reordering for Arabic-English Phrase-Based Machine Translation

    Science.gov (United States)

    Hatem, Arwa; Omar, Nazlia

    Machine Translation (MT) refers to the use of a machine to perform a translation task, converting text or speech in one natural language (the Source Language, SL) into another (the Target Language, TL). Translation from Arabic to English is difficult because Arabic is highly inflectional, morphologically rich, and has relatively free word order. Word ordering plays an important part in the translation process. This paper proposes a transfer-based approach to Arabic-English MT to handle the word ordering problem. Preliminary tests indicate that our system, AE-TBMT, is competitive when compared against other approaches from the literature.
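
A transfer rule of the kind such systems apply can be sketched as follows. This is a hypothetical toy rule, not the paper's actual rule set: Arabic clauses are often verb-initial (VSO), while English is SVO, so a POS-tagged clause is reordered before phrase-based translation.

```python
# Toy syntactic-reordering rule: move a clause-initial verb to after
# the first following noun, turning VSO order into SVO order.
def reorder_vso_to_svo(tagged):
    """tagged: list of (token, pos) pairs with simplified tags V/N/D."""
    if len(tagged) >= 2 and tagged[0][1] == "V" and tagged[1][1] == "N":
        return [tagged[1], tagged[0]] + tagged[2:]
    return tagged

# 'kataba al-waladu ad-darsa' (wrote the-boy the-lesson): verb-initial
# Arabic order becomes subject-verb-object order for English.
clause = [("kataba", "V"), ("al-waladu", "N"), ("ad-darsa", "N")]
print(reorder_vso_to_svo(clause))
```

Real transfer-based systems apply many such rules over full parse trees; the single pattern above only illustrates the mechanism.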

  17. NOVEL BIPHASE CODE - INTEGRATED SIDELOBE SUPPRESSION CODE

    Institute of Scientific and Technical Information of China (English)

    Wang Feixue; Ou Gang; Zhuang Zhaowen

    2004-01-01

    A novel kind of binary phase code, named the sidelobe suppression code, is proposed in this paper. It is defined as the code whose corresponding optimal sidelobe suppression filter outputs the minimum sidelobes. It is shown that there exist sidelobe suppression codes better than the conventional optimal codes, the Barker codes. For example, the sidelobe suppression code of length 11 with a filter of length 39 achieves a sidelobe level up to 17 dB better than that of the Barker code with the same code length and filter length.
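
The Barker-code baseline the proposed codes are measured against is easy to reproduce. The sketch below computes the aperiodic autocorrelation of the length-13 Barker code with its matched filter; Barker codes are "optimal" in the sense that every sidelobe has magnitude at most 1, and the sidelobe suppression codes above improve on this once a longer mismatched filter is allowed.

```python
# Aperiodic autocorrelation (matched-filter output) of the length-13
# Barker code: mainlobe peak 13, all sidelobes of magnitude <= 1.
def autocorr(code):
    n = len(code)
    return [sum(code[i] * code[i + k] for i in range(n - k))
            for k in range(n)]

barker13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]
acf = autocorr(barker13)
print(acf[0])                        # mainlobe peak: 13
print(max(abs(v) for v in acf[1:]))  # peak sidelobe magnitude: 1
```

The peak-to-sidelobe ratio is thus 13:1; designing the code jointly with a longer suppression filter, as in the paper, pushes the filtered sidelobes lower still.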

  18. Hate Speech: Power in the Marketplace.

    Science.gov (United States)

    Harrison, Jack B.

    1994-01-01

    A discussion of hate speech and freedom of speech on college campuses examines what distinguishes hate speech from normal, objectionable interpersonal comments and looks at Supreme Court decisions on the limits of student free speech. Two cases specifically concerning regulation of hate speech on campus are considered: Chaplinsky v. New…

  19. From concatenated codes to graph codes

    DEFF Research Database (Denmark)

    Justesen, Jørn; Høholdt, Tom

    2004-01-01

    We consider codes based on simple bipartite expander graphs. These codes may be seen as the first step leading from product type concatenated codes to more complex graph codes. We emphasize constructions of specific codes of realistic lengths, and study the details of decoding by message passing...

  20. An overview of the SPHINX speech recognition system

    Science.gov (United States)

    Lee, Kai-Fu; Hon, Hsiao-Wuen; Reddy, Raj

    1990-01-01

    A description is given of SPHINX, a system that demonstrates the feasibility of accurate, large-vocabulary, speaker-independent, continuous speech recognition. SPHINX is based on discrete hidden Markov models (HMMs) with linear-predictive-coding derived parameters. To provide speaker independence, knowledge was added to these HMMs in several ways: multiple codebooks of fixed-width parameters, and an enhanced recognizer with carefully designed models and word-duration modeling. To deal with coarticulation in continuous speech, yet still adequately represent a large vocabulary, two new subword speech units are introduced: function-word-dependent phone models and generalized triphone models. With grammars of perplexity 997, 60, and 20, SPHINX attained word accuracies of 71, 94, and 96 percent, respectively, on a 997-word task.
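
The model family underlying SPHINX can be illustrated with a minimal Viterbi decoder for a discrete HMM. This is a generic textbook sketch, not SPHINX code: SPHINX layers LPC-derived codebooks, triphone models, and word-duration modeling on top of exactly this kind of decoding.

```python
# Minimal Viterbi decoding for a discrete hidden Markov model:
# find the most likely state sequence for a sequence of discrete
# observation symbols.
def viterbi(obs, pi, A, B):
    """obs: observation indices; pi: initial state probs;
    A[i][j]: transition prob i->j; B[i][k]: emission prob of symbol k."""
    n = len(pi)
    delta = [pi[s] * B[s][obs[0]] for s in range(n)]
    back = []
    for o in obs[1:]:
        prev, delta, ptr = delta, [], []
        for j in range(n):
            best = max(range(n), key=lambda i: prev[i] * A[i][j])
            delta.append(prev[best] * A[best][j] * B[j][o])
            ptr.append(best)
        back.append(ptr)
    state = max(range(n), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(back):   # trace back through the pointers
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Two-state toy model: state 0 favors symbol 0, state 1 favors symbol 1,
# and states are "sticky" (self-transition prob 0.9).
A = [[0.9, 0.1], [0.1, 0.9]]
B = [[0.8, 0.2], [0.2, 0.8]]
print(viterbi([0, 0, 1, 1], [0.5, 0.5], A, B))  # -> [0, 0, 1, 1]
```

In a recognizer the states correspond to subword units (such as the generalized triphones above) and the observations to vector-quantized acoustic frames.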

  1. Variation and Synthetic Speech

    CERN Document Server

    Miller, C; Massey, N; Miller, Corey; Karaali, Orhan; Massey, Noel

    1997-01-01

    We describe the approach to linguistic variation taken by the Motorola speech synthesizer. A pan-dialectal pronunciation dictionary is described, which serves as the training data for a neural network based letter-to-sound converter. Subsequent to dictionary retrieval or letter-to-sound generation, pronunciations are submitted to a neural network based postlexical module. The postlexical module has been trained on aligned dictionary pronunciations and hand-labeled narrow phonetic transcriptions. This architecture permits the learning of individual postlexical variation, and can be retrained for each speaker whose voice is being modeled for synthesis. Learning variation in this way can result in greater naturalness for the synthetic speech that is produced by the system.

  2. Speech is Golden

    DEFF Research Database (Denmark)

    Juel Henrichsen, Peter

    2014-01-01

    Most of the Danish municipalities are ready to begin to adopt automatic speech recognition, but at the same time remain nervous following a long series of bad business cases in the recent past. Complaints are voiced over costly licences and low service levels, typical effects of a de facto monopoly...... on the supply side. The present article reports on a new public action strategy which has taken shape in the course of 2013-14. While Denmark is a small language area, our public sector is well organised and has considerable purchasing power. Across this past year, Danish local authorities have organised around...... of the present article, in the role of economically neutral advisers. The aim of the initiative is to pave the way for the first profitable contract in the field - which we hope to see in 2014 - an event which would precisely break the present deadlock and open up a billion EUR market for speech technology...

  3. Mutual Disambiguation of Eye Gaze and Speech for Sight Translation and Reading

    DEFF Research Database (Denmark)

    Kulkarni, Rucha; Jain, Kritika; Bansal, Himanshu

    Researchers are proposing interactive machine translation as a potential method to make language translation process more efficient and usable. Introduction of different modalities like eye gaze and speech are being explored to add to the interactivity of language translation system. Unfortunately...

  4. Mutual Disambiguation of Eye Gaze and Speech for Sight Translation and Reading

    DEFF Research Database (Denmark)

    Kulkarni, Rucha; Jain, Kritika; Bansal, Himanshu

    2013-01-01

    Researchers are proposing interactive machine translation as a potential method to make language translation process more efficient and usable. Introduction of different modalities like eye gaze and speech are being explored to add to the interactivity of language translation system. Unfortunately...

  5. [Improving speech comprehension using a new cochlear implant speech processor].

    Science.gov (United States)

    Müller-Deile, J; Kortmann, T; Hoppe, U; Hessel, H; Morsnowski, A

    2009-06-01

    The aim of this multicenter clinical field study was to assess the benefits of the new Freedom 24 sound processor for cochlear implant (CI) users implanted with the Nucleus 24 cochlear implant system. The study included 48 postlingually profoundly deaf experienced CI users who demonstrated speech comprehension performance with their current speech processor on the Oldenburg sentence test (OLSA) in quiet conditions of at least 80% correct scores and who were able to perform adaptive speech threshold testing using the OLSA in noisy conditions. Following baseline measures of speech comprehension performance with their current speech processor, subjects were upgraded to the Freedom 24 speech processor. After a take-home trial period of at least 2 weeks, subject performance was evaluated by measuring the speech reception threshold with the Freiburg multisyllabic word test and speech intelligibility with the Freiburg monosyllabic word test at 50 dB and 70 dB in the sound field. The results demonstrated highly significant benefits for speech comprehension with the new speech processor. Significant benefits for speech comprehension were also demonstrated with the new speech processor when tested in competing background noise. In contrast, use of the Abbreviated Profile of Hearing Aid Benefit (APHAB) did not prove to be a suitably sensitive assessment tool for comparative subjective self-assessment of hearing benefits with each processor. Use of the preprocessing algorithm known as adaptive dynamic range optimization (ADRO) in the Freedom 24 led to additional improvements over the standard upgrade map for speech comprehension in quiet and showed equivalent performance in noise. Through use of the preprocessing beam-forming algorithm BEAM, subjects demonstrated a highly significant improvement in signal-to-noise ratio for speech comprehension thresholds (i.e., signal-to-noise ratio for 50% speech comprehension scores) when tested with an adaptive procedure using the Oldenburg

  6. Neurophysiology of speech differences in childhood apraxia of speech.

    Science.gov (United States)

    Preston, Jonathan L; Molfese, Peter J; Gumkowski, Nina; Sorcinelli, Andrea; Harwood, Vanessa; Irwin, Julia R; Landi, Nicole

    2014-01-01

    Event-related potentials (ERPs) were recorded during a picture naming task of simple and complex words in children with typical speech and with childhood apraxia of speech (CAS). Results reveal reduced amplitude prior to speaking complex (multisyllabic) words relative to simple (monosyllabic) words for the CAS group over the right hemisphere during a time window thought to reflect phonological encoding of word forms. Group differences were also observed prior to production of spoken tokens regardless of word complexity during a time window just prior to speech onset (thought to reflect motor planning/programming). Results suggest differences in pre-speech neurolinguistic processes.

  7. Hiding Information under Speech

    Science.gov (United States)

    2005-12-12

    …as it arrives in real time, and it disappears as fast as it arrives. Furthermore, our cognitive process for translating audio sounds to the meaning… steganography, whose goal is to make the embedded data completely undetectable. In addition, we must dismiss the idea of hiding data by using any… therefore, an image has more room to hide data; and (2) speech steganography has not led to many money-making commercial businesses. For these two…

  8. Speech Quality Measurement

    Science.gov (United States)

    1977-06-10

    …noise test, t=2 for the low-pass filter test, and t=3 for the ADPCM coding test; s is the subject number… a separate speech quality laboratory controlled by the NOVA 830 computer. Each of the stations has a CRT, 15 response buttons, and a…

  9. Binary Masking & Speech Intelligibility

    DEFF Research Database (Denmark)

    Boldt, Jesper

    The purpose of this thesis is to examine how binary masking can be used to increase intelligibility in situations where hearing impaired listeners have difficulties understanding what is being said. The major part of the experiments carried out in this thesis can be categorized as either experime...... mask using a directional system and a method for correcting errors in the target binary mask. The last part of the thesis, proposes a new method for objective evaluation of speech intelligibility....
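
The target binary mask the thesis builds on can be sketched directly. This is a generic formulation rather than the thesis's own code: a time-frequency unit is retained when its local target-to-noise ratio exceeds a local criterion (LC, in dB), and the power grids below stand in for squared STFT magnitudes.

```python
# Ideal binary mask: keep a time-frequency unit (mask = 1) when the
# local SNR in that unit exceeds the local criterion LC (in dB).
from math import log10

def ideal_binary_mask(target_pow, noise_pow, lc_db=0.0):
    """target_pow, noise_pow: 2-D lists [frame][bin] of power values."""
    mask = []
    for t_frame, n_frame in zip(target_pow, noise_pow):
        row = []
        for t, n in zip(t_frame, n_frame):
            if t > 0 and n > 0:
                snr_db = 10 * log10(t / n)
            else:
                snr_db = 300.0 if n == 0 else -300.0  # degenerate units
            row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask

# 2 frames x 2 bins: only the unit where target power clearly
# dominates noise power (4.0 vs 1.0, i.e. +6 dB) is kept.
target = [[4.0, 0.1], [2.0, 0.5]]
noise = [[1.0, 1.0], [4.0, 0.5]]
print(ideal_binary_mask(target, noise))  # -> [[1, 0], [0, 0]]
```

Applying such a mask to the noisy mixture's spectrogram and resynthesizing is the standard route to the intelligibility gains the thesis studies; estimating the mask without access to the clean target is the hard part.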

  10. Tracing the emergence of categorical speech perception in the human auditory system.

    Science.gov (United States)

    Bidelman, Gavin M; Moreno, Sylvain; Alain, Claude

    2013-10-01

    Speech perception requires the effortless mapping from smooth, seemingly continuous changes in sound features into discrete perceptual units, a conversion exemplified in the phenomenon of categorical perception. Explaining how/when the human brain performs this acoustic-phonetic transformation remains an elusive problem in current models and theories of speech perception. In previous attempts to decipher the neural basis of speech perception, it is often unclear whether the alleged brain correlates reflect an underlying percept or merely changes in neural activity that covary with parameters of the stimulus. Here, we recorded neuroelectric activity generated at both cortical and subcortical levels of the auditory pathway elicited by a speech vowel continuum whose percept varied categorically from /u/ to /a/. This integrative approach allows us to characterize how various auditory structures code, transform, and ultimately render the perception of speech material as well as dissociate brain responses reflecting changes in stimulus acoustics from those that index true internalized percepts. We find that activity from the brainstem mirrors properties of the speech waveform with remarkable fidelity, reflecting progressive changes in speech acoustics but not the discrete phonetic classes reported behaviorally. In comparison, patterns of late cortical evoked activity contain information reflecting distinct perceptual categories and predict the abstract phonetic speech boundaries heard by listeners. Our findings demonstrate a critical transformation in neural speech representations between brainstem and early auditory cortex analogous to an acoustic-phonetic mapping necessary to generate categorical speech percepts. Analytic modeling demonstrates that a simple nonlinearity accounts for the transformation between early (subcortical) brain activity and subsequent cortical/behavioral responses to speech (>150-200 ms) thereby describing a plausible mechanism by which the

  11. Communication Studies of DMP and SMP Machines

    Science.gov (United States)

    Sohn, Andrew; Biswas, Rupak; Chancellor, Marisa K. (Technical Monitor)

    1997-01-01

    Understanding the interplay between machines and problems is key to obtaining high performance on parallel machines. This paper investigates the interplay between programming paradigms and communication capabilities of parallel machines. In particular, we explicate the communication capabilities of the IBM SP-2 distributed-memory multiprocessor and the SGI PowerCHALLENGEarray symmetric multiprocessor. Two benchmark problems, bitonic sorting and the Fast Fourier Transform, are selected for experiments. Communication-efficient algorithms are developed to exploit the overlapping capabilities of the machines. Programs are written in the Message-Passing Interface for portability, and identical codes are used for both machines. Various data sizes and message sizes are used to test the machines' communication capabilities. Experimental results indicate that the communication performance of the multiprocessors is consistent with the size of messages. The SP-2 is sensitive to message size but yields much higher communication overlap because of its communication co-processor. The PowerCHALLENGEarray is not highly sensitive to message size and yields low communication overlap. Bitonic sorting yields lower performance than FFT due to a smaller computation-to-communication ratio.

  12. Perspex machine: VII. The universal perspex machine

    Science.gov (United States)

    Anderson, James A. D. W.

    2006-01-01

    The perspex machine arose from the unification of projective geometry with the Turing machine. It uses a total arithmetic, called transreal arithmetic, that contains real arithmetic and allows division by zero. Transreal arithmetic is redefined here. The new arithmetic has both a positive and a negative infinity which lie at the extremes of the number line, and a number nullity that lies off the number line. We prove that nullity, 0/0, is a number. Hence a number may have one of four signs: negative, zero, positive, or nullity. It is, therefore, impossible to encode the sign of a number in one bit, as floating-point arithmetic attempts to do, resulting in the difficulty of having both positive and negative zeros and NaNs. Transrational arithmetic is consistent with Cantor arithmetic. In an extension to real arithmetic, the product of zero, an infinity, or nullity with its reciprocal is nullity, not unity. This avoids the usual contradictions that follow from allowing division by zero. Transreal arithmetic has a fixed algebraic structure and does not admit options as IEEE, floating-point arithmetic does. Most significantly, nullity has a simple semantics that is related to zero. Zero means "no value" and nullity means "no information." We argue that nullity is as useful to a manufactured computer as zero is to a human computer. The perspex machine is intended to offer one solution to the mind-body problem by showing how the computable aspects of mind and, perhaps, the whole of mind relates to the geometrical aspects of body and, perhaps, the whole of body. We review some of Turing's writings and show that he held the view that his machine has spatial properties. In particular, that it has the property of being a 7D lattice of compact spaces. Thus, we read Turing as believing that his machine relates computation to geometrical bodies. We simplify the perspex machine by substituting an augmented Euclidean geometry for projective geometry. This leads to a general
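
The transreal division rules described above can be modeled in a few lines. This is an illustrative approximation, not a faithful implementation: Python's IEEE floats are reused for the infinities and `nan` stands in for nullity, even though, as the abstract notes, IEEE NaN semantics (e.g. signed NaNs, comparison behavior) differ from transreal nullity's.

```python
# Toy model of transreal division: a/0 is +infinity for a > 0,
# -infinity for a < 0, and nullity ("no information") for a = 0.
# IEEE nan is used only as a stand-in for nullity.
INF = float("inf")
NULLITY = float("nan")

def transreal_div(a, b):
    if b != 0:
        return a / b          # ordinary real division
    if a > 0:
        return INF            # 1/0 = +infinity
    if a < 0:
        return -INF           # -1/0 = -infinity
    return NULLITY            # 0/0 = nullity

print(transreal_div(1, 0), transreal_div(-1, 0), transreal_div(0, 0))
```

Note that with this stand-in, nullity inherits NaN's property of comparing unequal to itself, whereas in transreal arithmetic nullity is a single well-defined number equal to itself.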

  13. Compressive Sensing in Speech from LPC using Gradient Projection for Sparse Reconstruction

    Directory of Open Access Journals (Sweden)

    Viral Modha

    2015-02-01

    Full Text Available This paper presents a compressive sensing technique for speech reconstruction using linear predictive coding, since speech is sparser in the LPC domain. The DCT of the speech is taken, and DCT points of the sparse signal are discarded at random by multiplying the DCT-domain representation with mask functions that set selected points to zero. From the incomplete points in the DCT domain, the original speech is reconstructed using compressive sensing, with Gradient Projection for Sparse Reconstruction (GPSR) as the solver. The result is compared subjectively with direct IDCT reconstruction; the experiments show that compressive sensing performs better than the direct IDCT approach.
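
The measurement model in this record can be sketched end to end; the GPSR solver itself is not reproduced here. The sketch below, with an invented toy frame in place of real speech, implements an orthonormal DCT-II and its inverse, zeroes randomly chosen coefficients (the masking step), and evaluates the direct-IDCT baseline the paper compares GPSR against.

```python
# Sketch: orthonormal DCT-II / DCT-III pair, random coefficient masking,
# and the direct-IDCT baseline reconstruction (GPSR solver omitted).
from math import cos, pi, sqrt
import random

def dct(x):
    n = len(x)
    return [(sqrt(1 / n) if k == 0 else sqrt(2 / n)) *
            sum(x[i] * cos(pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]

def idct(X):
    n = len(X)
    return [sum((sqrt(1 / n) if k == 0 else sqrt(2 / n)) *
                X[k] * cos(pi * (i + 0.5) * k / n) for k in range(n))
            for i in range(n)]

random.seed(0)
frame = [cos(2 * pi * 3 * i / 32) for i in range(32)]  # toy 'speech' frame
coeffs = dct(frame)
keep = set(random.sample(range(32), 24))               # drop 25% of points
masked = [c if k in keep else 0.0 for k, c in enumerate(coeffs)]
baseline = idct(masked)                                # direct IDCT estimate
rmse = sqrt(sum((a - b) ** 2 for a, b in zip(frame, baseline)) / 32)
print(round(rmse, 4))
```

A sparse solver such as GPSR would instead seek the sparsest coefficient vector consistent with the surviving measurements, which is why it can beat this zero-filled baseline.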

  14. Speech Therapy Prevention in Kindergarten

    Directory of Open Access Journals (Sweden)

    Vašíková Jana

    2017-08-01

    Full Text Available Introduction: This contribution presents the results of research focused on speech therapy in kindergartens, realized in the Zlín Region. It explains how speech therapy prevention is realized in kindergartens, determines the educational qualifications of teachers for this activity and verifies the quality of the applied methodologies in the daily program of kindergartens. Methods: The empirical part of the study was conducted through qualitative research. For data collection, we used participant observation. We analyzed the research data and presented them verbally, using frequency tables and graphs, which were subsequently interpreted. Results: In this research, 71% of the teachers completed a course in speech therapy prevention, 28% of the teachers received pedagogical training and just 1% of the teachers are clinical speech pathologists. In spite of this, the research data show that, in most kindergartens, speech therapy prevention is aimed at correcting deficiencies in speech and voice, and its content is implemented in this direction. Discussion: Teachers' and parents' awareness regarding speech therapy prevention in kindergartens. Limitations: This research was implemented in autumn of 2016 in the Zlín Region. The research data cannot be generalized to the entire population. We have the ambition to expand this research to other regions next year. Conclusions: The results show that both forms of speech therapy prevention, individual and group, are used, and often a combination of both. The aim of the individual form is, in most cases, to prepare a child for cooperation during voice correction. The research also confirmed that most teachers do not have sufficient education in speech therapy; most of them completed a course in speech therapy as primary prevention educators. The results also show that teachers spend a lot of time on speech therapy prevention in

  15. Abortion and compelled physician speech.

    Science.gov (United States)

    Orentlicher, David

    2015-01-01

    Informed consent mandates for abortion providers may infringe the First Amendment's freedom of speech. On the other hand, they may reinforce the physician's duty to obtain informed consent. Courts can promote both doctrines by ensuring that compelled physician speech pertains to medical facts about abortion rather than abortion ideology and that compelled speech is truthful and not misleading. © 2015 American Society of Law, Medicine & Ethics, Inc.

  16. Noise Reduction in Car Speech

    OpenAIRE

    V. Bolom

    2009-01-01

    This paper presents properties of chosen multichannel algorithms for speech enhancement in a noisy environment. These methods are suitable for hands-free communication in a car cabin. Criteria for evaluation of these systems are also presented. The criteria consider both the level of noise suppression and the level of speech distortion. The performance of multichannel algorithms is investigated for a mixed model of speech signals and car noise and for real signals recorded in a car. 

  17. Speech recognition in university classrooms

    OpenAIRE

    Wald, Mike; Bain, Keith; Basson, Sara H

    2002-01-01

    The LIBERATED LEARNING PROJECT (LLP) is an applied research project studying two core questions: 1) Can speech recognition (SR) technology successfully digitize lectures to display spoken words as text in university classrooms? 2) Can speech recognition technology be used successfully as an alternative to traditional classroom notetaking for persons with disabilities? This paper addresses these intriguing questions and explores the underlying complex relationship between speech recognition te...

  18. Noise Reduction in Car Speech

    Directory of Open Access Journals (Sweden)

    V. Bolom

    2009-01-01

    Full Text Available This paper presents properties of chosen multichannel algorithms for speech enhancement in a noisy environment. These methods are suitable for hands-free communication in a car cabin. Criteria for evaluation of these systems are also presented. The criteria consider both the level of noise suppression and the level of speech distortion. The performance of multichannel algorithms is investigated for a mixed model of speech signals and car noise and for real signals recorded in a car. 

  19. Good Codes From Generalised Algebraic Geometry Codes

    CERN Document Server

    Jibril, Mubarak; Ahmed, Mohammed Zaki; Tjhai, Cen

    2010-01-01

    Algebraic geometry codes or Goppa codes are defined with places of degree one. In constructing generalised algebraic geometry codes places of higher degree are used. In this paper we present 41 new codes over GF(16) which improve on the best known codes of the same length and rate. The construction method uses places of small degree with a technique originally published over 10 years ago for the construction of generalised algebraic geometry codes.

  20. Physics codes on parallel computers

    Energy Technology Data Exchange (ETDEWEB)

    Eltgroth, P.G.

    1985-12-04

    An effort is under way to develop physics codes which realize the potential of parallel machines. A new explicit algorithm for the computation of hydrodynamics has been developed which avoids global synchronization entirely. The approach, called the Independent Time Step Method (ITSM), allows each zone to advance at its own pace, determined by local information. The method, coded in FORTRAN, has demonstrated parallelism of greater than 20 on the Denelcor HEP machine. ITSM can also be used to replace current implicit treatments of problems involving diffusion and heat conduction. Four different approaches toward work distribution have been investigated and implemented for the one-dimensional code on the Denelcor HEP. They are "self-scheduled", an ASKFOR monitor, a "queue of queues" monitor, and a distributed ASKFOR monitor. The self-scheduled approach shows the lowest overhead but the poorest speedup. The distributed ASKFOR monitor shows the best speedup and the lowest execution times on the tested problems. 2 refs., 3 figs.
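
The scheduling idea behind ITSM can be sketched abstractly. This is a hypothetical illustration, not the FORTRAN code described above: each zone carries its own simulation time and locally chosen time step, work always proceeds on the least-advanced zone, and the physics update is replaced by a step counter.

```python
# Sketch of ITSM scheduling: no global time step; each zone advances
# at its own pace, and the least-advanced zone is processed next.
import heapq

def itsm_run(num_zones, t_end, local_dt):
    """local_dt(zone) returns that zone's locally determined time step."""
    heap = [(0.0, z) for z in range(num_zones)]  # (zone time, zone id)
    heapq.heapify(heap)
    steps = [0] * num_zones
    while heap:
        t, z = heapq.heappop(heap)
        if t >= t_end:
            continue                  # zone finished; drop it
        steps[z] += 1                 # placeholder for the zone update
        heapq.heappush(heap, (t + local_dt(z), z))
    return steps

# Zone 0 uses dt = 0.125, zone 1 uses dt = 0.5: zone 0 takes four
# times as many steps over the same interval.
print(itsm_run(2, 1.0, lambda z: 0.125 if z == 0 else 0.5))  # -> [8, 2]
```

In the real method a zone's dt comes from local stability conditions and neighbor data, and the parallelism arises because many zones can be advanced concurrently; the single-threaded heap above only shows the asynchronous bookkeeping.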