WorldWideScience

Sample records for voice recognition technology

  1. Voice Recognition Technology: Has It Come of Age?

    Directory of Open Access Journals (Sweden)

    Joseph R. Zumalt

    2005-12-01

    Voice recognition software allows computer users to bypass their keyboards and use their voices to enter text. While the library literature is somewhat silent about voice recognition technology, the medical and legal communities have reported some success using it. Voice recognition software was tested for dictation accuracy and usability within an agriculture library at the University of Illinois. Dragon NaturallySpeaking 8.0 was found to be more accurate than speech recognition within Microsoft Office 2003. Helpful Web sites and a short history regarding this breakthrough technology are included.

  2. Voice Recognition: A New Assessment Tool?

    Science.gov (United States)

    Jones, Darla

    2005-01-01

    This article presents the results of a study conducted in Anchorage, Alaska, that evaluated the accuracy and efficiency of using voice recognition (VR) technology to collect oral reading fluency data for classroom-based assessments. The primary research question was as follows: Is voice recognition technology a valid and reliable alternative to…

  3. Neural mechanisms for voice recognition

    NARCIS (Netherlands)

    Andics, A.V.; McQueen, J.M.; Petersson, K.M.; Gal, V.; Rudas, G.; Vidnyanszky, Z.

    2010-01-01

    We investigated neural mechanisms that support voice recognition in a training paradigm with fMRI. The same listeners were trained on different weeks to categorize the mid-regions of voice-morph continua as an individual's voice. Stimuli implicitly defined a voice-acoustics space, and training expli

  4. Literature Review of Voice Recognition and Generation Technology for Army Helicopter Applications.

    Science.gov (United States)

    1984-08-01

    support up this conclusion (Jay, 1981; Coler, 1983). Based upon the research presented, the following statements can be made: a. When flight control...dB) must be overcome by the voice recognizer (Coler, 1983). The effects of noise on voice recognition were the topic of a study performed at...noise when the subject was also required to perform a tracking task and enter data (Coler, 1983). Performance was evaluated for three different

  5. FILTWAM and Voice Emotion Recognition

    NARCIS (Netherlands)

    Bahreini, Kiavash; Nadolski, Rob; Westera, Wim

    2014-01-01

    This paper introduces the voice emotion recognition part of our framework for improving learning through webcams and microphones (FILTWAM). This framework enables multimodal emotion recognition of learners during game-based learning. The main goal of this study is to validate the use of microphone d

  7. Voice congruency facilitates word recognition.

    Directory of Open Access Journals (Sweden)

    Sandra Campeanu

    Behavioral studies of spoken word memory have shown that context congruency facilitates both word and source recognition, though the level at which context exerts its influence remains equivocal. We measured event-related potentials (ERPs) while participants performed both types of recognition task with words spoken in four voices. Two voice parameters (i.e., gender and accent) varied between speakers, with the possibility that none, one or two of these parameters was congruent between study and test. Results indicated that reinstating the study voice at test facilitated both word and source recognition, compared to similar or no context congruency at test. Behavioral effects were paralleled by two ERP modulations. First, in the word recognition test, the left parietal old/new effect showed a positive deflection reflective of context congruency between study and test words. Namely, the same speaker condition provided the most positive deflection of all correctly identified old words. In the source recognition test, a right frontal positivity was found for the same speaker condition compared to the different speaker conditions, regardless of response success. Taken together, the results of this study suggest that the benefit of context congruency is reflected behaviorally and in ERP modulations traditionally associated with recognition memory.

  8. Practical applications of interactive voice technologies: Some accomplishments and prospects

    Science.gov (United States)

    Grady, Michael W.; Hicklin, M. B.; Porter, J. E.

    1977-01-01

    A technology assessment of the application of computers and electronics to complex systems is presented. Three existing systems which utilize voice technology (speech recognition and speech generation) are described. Future directions in voice technology are also described.

  9. WHEEL CHAIR USING VOICE RECOGNITION

    OpenAIRE

    Manish Kumar Yadav*; Rajat Kumar; Santosh Yadav; Ravindra Prajapati; Prof. Kshirsagar

    2016-01-01

    The widespread prevalence of lost limbs and sensing system is of major concern in present day due to wars, accidents, age and health problems. This omni-directional wheelchair was designed for the less able elderly to move more flexibly in narrow spaces, such as elevators or small aisles. The wheelchair is developed to help disabled patients by using a speech recognition system to control the movement of the wheelchair in different directions by using voice commands and also the simple movement of t...

  10. Human voice recognition depends on language ability.

    Science.gov (United States)

    Perrachione, Tyler K; Del Tufo, Stephanie N; Gabrieli, John D E

    2011-07-29

    The ability to recognize people by their voice is an important social behavior. Individuals differ in how they pronounce words, and listeners may take advantage of language-specific knowledge of speech phonology to facilitate recognizing voices. Impaired phonological processing is characteristic of dyslexia and thought to be a basis for difficulty in learning to read. We tested voice-recognition abilities of dyslexic and control listeners for voices speaking listeners' native language or an unfamiliar language. Individuals with dyslexia exhibited impaired voice-recognition abilities compared with controls only for voices speaking their native language. These results demonstrate the importance of linguistic representations for voice recognition. Humans appear to identify voices by making comparisons between talkers' pronunciations of words and listeners' stored abstract representations of the sounds in those words.

  11. Voice Recognition in Face-Blind Patients.

    Science.gov (United States)

    Liu, Ran R; Pancaroglu, Raika; Hills, Charlotte S; Duchaine, Brad; Barton, Jason J S

    2016-04-01

    Right or bilateral anterior temporal damage can impair face recognition, but whether this is an associative variant of prosopagnosia or part of a multimodal disorder of person recognition is an unsettled question, with implications for cognitive and neuroanatomic models of person recognition. We assessed voice perception and short-term recognition of recently heard voices in 10 subjects with impaired face recognition acquired after cerebral lesions. All 4 subjects with apperceptive prosopagnosia due to lesions limited to fusiform cortex had intact voice discrimination and recognition. One subject with bilateral fusiform and anterior temporal lesions had a combined apperceptive prosopagnosia and apperceptive phonagnosia, the first such described case. Deficits indicating a multimodal syndrome of person recognition were found only in 2 subjects with bilateral anterior temporal lesions. All 3 subjects with right anterior temporal lesions had normal voice perception and recognition, 2 of whom performed normally on perceptual discrimination of faces. This confirms that such lesions can cause a modality-specific associative prosopagnosia.

  12. Electrolarynx Voice Recognition Utilizing Pulse Coupled Neural Network

    Directory of Open Access Journals (Sweden)

    Fatchul Arifin

    2010-08-01

    Laryngectomy patients cannot speak normally because their vocal cords have been removed. The easiest option for such patients to speak again is electrolarynx speech. This tool is placed on the lower chin. Vibration of the neck while speaking is used to produce sound. Meanwhile, "voice recognition" technology has been growing very rapidly. It is expected that this technology can also be used by laryngectomy patients who use an electrolarynx. This paper describes a system for electrolarynx speech recognition. The two main parts of the system are feature extraction and pattern recognition. A Pulse Coupled Neural Network (PCNN) is used to extract the features and characteristics of electrolarynx speech. The PCNN parameter β was also varied. A multilayer perceptron is used to recognize the sound patterns. Two kinds of recognition are conducted in this paper: speech recognition and speaker recognition. Speech recognition recognizes specific speech from any speaker, while speaker recognition recognizes specific speech from a specific person. The system ran well. The electrolarynx speech recognition was tested by recognizing "A" and "not A" voices. The results showed that the system had 94.4% validation. The electrolarynx speaker recognition was tested by recognizing the word "saya" spoken by several different speakers. The results showed that the system had 92.2% validation. The best β parameter of the PCNN for electrolarynx recognition is 3.

  13. Evaluation of an Intelligent Assistive Technology for Voice Navigation of Spreadsheets

    CERN Document Server

    Flood, Derek; Caffery, Fergal Mc; Bishop, Brian

    2008-01-01

    An integral part of spreadsheet auditing is navigation. For sufferers of Repetitive Strain Injury who need to use voice recognition technology this navigation can be highly problematic. To counter this the authors have developed an intelligent voice navigation system, iVoice, which replicates common spreadsheet auditing behaviours through simple voice commands. This paper outlines the iVoice system and summarizes the results of a study to evaluate iVoice when compared to a leading voice recognition technology.

  14. Voice recognition software for clinical use.

    Science.gov (United States)

    Korn, K

    1998-11-01

    The current generation voice recognition products truly offer the promise of voice recognition systems, that are financially and operationally acceptable for use in a health care facility. Although the initial capital outlay for the purchase of such equipment may be substantial, the long-term benefit is felt to outweigh the expense. The ability to utilize computer equipment for educational purposes and information management alone helps to rationalize the cost. In addition, it is important to remember that the Internet has become a substantial source of information which provides another functional use for this equipment. Although one can readily see the implication for such a program in clinical practice, other uses for the program should not be overlooked. Uses far beyond the writing of clinic notes and correspondence can be easily envisioned. Utilization of voice recognition software offers clinical practices the ability to produce quality printed records in a timely and cost-effective manner. After learning procedures for the selected product and appropriately formatting word processing software and printers, printed progress notes should be able to be produced in less time than traditional dictation and transcription methods. Although certain procedures and practices may need to be altered, or may preclude optimal utilization of this type of system, many advantages are apparent. It is recommended that facilities consider utilization of Voice Recognition products such as Dragon Systems Naturally Speaking Software, or at least consider a trial of this method with one of the limited-feature products, if current dictation practices are unsatisfactory or excessively costly. Free downloadable trial software or single user software can provide a reduced-cost method for trial evaluation of such products if a major commitment is not felt to be desired. A list of voice recognition software manufacturer web sites may be accessed through the following: http

  15. Implicit multisensory associations influence voice recognition.

    Directory of Open Access Journals (Sweden)

    Katharina von Kriegstein

    2006-10-01

    Natural objects provide partially redundant information to the brain through different sensory modalities. For example, voices and faces both give information about the speech content, age, and gender of a person. Thanks to this redundancy, multimodal recognition is fast, robust, and automatic. In unimodal perception, however, only part of the information about an object is available. Here, we addressed whether, even under conditions of unimodal sensory input, crossmodal neural circuits that have been shaped by previous associative learning become activated and underpin a performance benefit. We measured brain activity with functional magnetic resonance imaging before, while, and after participants learned to associate either sensory redundant stimuli, i.e. voices and faces, or arbitrary multimodal combinations, i.e. voices and written names, ring tones, and cell phones or brand names of these cell phones. After learning, participants were better at recognizing unimodal auditory voices that had been paired with faces than those paired with written names, and association of voices with faces resulted in an increased functional coupling between voice and face areas. No such effects were observed for ring tones that had been paired with cell phones or names. These findings demonstrate that brief exposure to ecologically valid and sensory redundant stimulus pairs, such as voices and faces, induces specific multisensory associations. Consistent with predictive coding theories, associative representations become thereafter available for unimodal perception and facilitate object recognition. These data suggest that for natural objects effective predictive signals can be generated across sensory systems and proceed by optimization of functional connectivity between specialized cortical sensory modules.

  16. Machine Recognition vs Human Recognition of Voices

    Science.gov (United States)

    2012-05-01

    recognized. The accuracy of speaker recognition for disyllables was 87%. For monosyllables, it was 81%, consonant-vowel excerpts were 63%, and...vowel excerpts were 56%. Thus, they demonstrated that the identification performance decreased as the number of phonemes decreased. In [2], the...will still sound natural and the performance of listeners could be tied directly to the degradation of particular frequencies. If the performance

  17. Pegembangan Game dengan Menggunakan Teknologi Voice Recognition Berbasis Android [Game Development Using Android-Based Voice Recognition Technology]

    Directory of Open Access Journals (Sweden)

    Franky Hadinata Marpaung

    2014-06-01

    The purpose of this research is to create a new kind of game by using technology that is rarely used in current games. It is developed as an entertainment medium and also a social medium in which users can play games together via multiplayer mode. This research uses the Scrum development method since it supports small-scale developers and incremental software delivery during development. Using this game application, users can play and watch interesting animations by controlling them with their voice, listen to the character imitating the user's voice, and play various mini games in either single-player or multiplayer mode via a Bluetooth connection. The conclusion is that the game application My Name is Dug uses voice recognition and inter-device connection as its main features. It also has various mini games that support both single player and multiplayer.

  18. Improving Speaker Recognition by Biometric Voice Deconstruction

    Directory of Open Access Journals (Sweden)

    Luis Miguel Mazaira-Fernández

    2015-09-01

    Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g. YouTube) to broadcast its message. In this new scenario, classical identification methods (such as fingerprints or face recognition) have been forcedly replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. Through the present paper, a new methodology to characterize speakers will be shown. This methodology benefits from the advances achieved in recent years in understanding and modelling voice production. The paper hypothesizes that a gender-dependent characterization of speakers combined with the use of a new set of biometric parameters extracted from the components resulting from the deconstruction of the voice into its glottal source and vocal tract estimates, will enhance recognition rates when compared to classical approaches. A general description of the main hypothesis and the methodology followed to extract gender-dependent extended biometric parameters is given. Experimental validation is carried out both on a highly controlled acoustic condition database, and on a mobile phone network recorded under non-controlled acoustic conditions.

  19. Improving Speaker Recognition by Biometric Voice Deconstruction.

    Science.gov (United States)

    Mazaira-Fernandez, Luis Miguel; Álvarez-Marquina, Agustín; Gómez-Vilda, Pedro

    2015-01-01

    Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g., YouTube) to broadcast its message. In this new scenario, classical identification methods (such as fingerprints or face recognition) have been forcedly replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. The present study benefits from the advances achieved during last years in understanding and modeling voice production. The paper hypothesizes that a gender-dependent characterization of speakers combined with the use of a set of features derived from the components, resulting from the deconstruction of the voice into its glottal source and vocal tract estimates, will enhance recognition rates when compared to classical approaches. A general description about the main hypothesis and the methodology followed to extract the gender-dependent extended biometric parameters is given. Experimental validation is carried out both on a highly controlled acoustic condition database, and on a mobile phone network recorded under non-controlled acoustic conditions.

  20. Secure Recognition of Voice-Less Commands Using Videos

    Science.gov (United States)

    Yau, Wai Chee; Kumar, Dinesh Kant; Weghorn, Hans

    Interest in voice recognition technologies for internet applications is growing due to the flexibility of speech-based communication. The major drawback with the use of sound for internet access with computers is that the commands will be audible to other people in the vicinity. This paper examines a secure and voice-less method for recognition of speech-based commands using video without evaluating sound signals. The proposed approach represents mouth movements in the video data using 2D spatio-temporal templates (STT). Zernike moments (ZM) are computed from STT and fed into support vector machines (SVM) to be classified into one of the utterances. The experimental results demonstrate that the proposed technique produces a high accuracy of 98% in a phoneme classification task. The proposed technique is demonstrated to be invariant to global variations of illumination level. Such a system is useful for securely interpreting user commands for internet applications on mobile devices.

  1. Temporal voice areas exist in autism spectrum disorder but are dysfunctional for voice identity recognition

    Science.gov (United States)

    Borowiak, Kamila; von Kriegstein, Katharina

    2016-01-01

    The ability to recognise the identity of others is a key requirement for successful communication. Brain regions that respond selectively to voices exist in humans from early infancy on. Currently, it is unclear whether dysfunction of these voice-sensitive regions can explain voice identity recognition impairments. Here, we used two independent functional magnetic resonance imaging studies to investigate voice processing in a population that has been reported to have no voice-sensitive regions: autism spectrum disorder (ASD). Our results refute the earlier report that individuals with ASD have no responses in voice-sensitive regions: Passive listening to vocal, compared to non-vocal, sounds elicited typical responses in voice-sensitive regions in the high-functioning ASD group and controls. In contrast, the ASD group had a dysfunction in voice-sensitive regions during voice identity but not speech recognition in the right posterior superior temporal sulcus/gyrus (STS/STG)—a region implicated in processing complex spectrotemporal voice features and unfamiliar voices. The right anterior STS/STG correlated with voice identity recognition performance in controls but not in the ASD group. The findings suggest that right STS/STG dysfunction is critical for explaining voice recognition impairments in high-functioning ASD and show that ASD is not characterised by a general lack of voice-sensitive responses. PMID:27369067

  2. Building Domain Specific Languages for Voice Recognition Applications

    Directory of Open Access Journals (Sweden)

    Cristian IONITA

    2008-01-01

    This paper presents a method of implementing voice recognition for the control of software applications. The proposed solutions are based on transforming a subset of natural language into commands recognized by the application, using a formal language defined by means of a context-free grammar. The end of the paper presents how voice recognition and voice synthesis for the Romanian language can be integrated in Windows applications.

  3. Voice Recognition Algorithms using Mel Frequency Cepstral Coefficient (MFCC) and Dynamic Time Warping (DTW) Techniques

    CERN Document Server

    Muda, Lindasalwa; Elamvazuthi, I

    2010-01-01

    Digital processing of speech signals and voice recognition algorithms are very important for fast and accurate automatic voice recognition technology. The voice is a signal of infinite information. Direct analysis and synthesis of the complex voice signal is difficult because of the amount of information contained in the signal. Therefore, digital signal processes such as Feature Extraction and Feature Matching are introduced to represent the voice signal. Several methods such as Linear Predictive Coding (LPC), Hidden Markov Model (HMM), Artificial Neural Network (ANN), etc. are evaluated with a view to identifying a straightforward and effective method for the voice signal. The extraction and matching process is implemented right after the pre-processing or filtering of the signal is performed. Mel Frequency Cepstral Coefficients (MFCCs), a non-parametric method for modelling the human auditory perception system, are utilized as the extraction technique. The non-linear sequence alignment known as Dynamic Time Warping (DTW) intro...
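
    The Dynamic Time Warping step named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `dtw_distance` is hypothetical, the inputs are plain sequences of numbers rather than MFCC vectors, and the local cost is assumed to be the absolute difference.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping cost between two numeric sequences.

    Assumption: local cost is |x - y|; a real recognizer would compare
    per-frame feature vectors (e.g. MFCCs) with a vector distance instead.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # extend the best of: step in a only, step in b only, step in both
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]
```

    Because the alignment may repeat samples, a sequence and a time-stretched copy of it (for example `[1, 2, 3]` and `[1, 2, 2, 3]`) have zero cost, which is the property that makes DTW useful for matching utterances spoken at different speeds.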

  4. Familiar Person Recognition: Is Autonoetic Consciousness More Likely to Accompany Face Recognition Than Voice Recognition?

    Science.gov (United States)

    Barsics, Catherine; Brédart, Serge

    2010-11-01

    Autonoetic consciousness is a fundamental property of human memory, enabling us to experience mental time travel, to recollect past events with a feeling of self-involvement, and to project ourselves in the future. Autonoetic consciousness is a characteristic of episodic memory. By contrast, awareness of the past associated with a mere feeling of familiarity or knowing relies on noetic consciousness, depending on semantic memory integrity. Present research was aimed at evaluating whether conscious recollection of episodic memories is more likely to occur following the recognition of a familiar face than following the recognition of a familiar voice. Recall of semantic information (biographical information) was also assessed. Previous studies that investigated the recall of biographical information following person recognition used faces and voices of famous people as stimuli. In this study, the participants were presented with personally familiar people's voices and faces, thus avoiding the presence of identity cues in the spoken extracts and allowing a stricter control of frequency exposure with both types of stimuli (voices and faces). In the present study, the rate of retrieved episodic memories, associated with autonoetic awareness, was significantly higher from familiar faces than familiar voices even though the level of overall recognition was similar for both these stimuli domains. The same pattern was observed regarding semantic information retrieval. These results and their implications for current Interactive Activation and Competition person recognition models are discussed.

  5. Enhancing nursing practice by utilizing voice recognition for direct documentation.

    Science.gov (United States)

    Fratzke, Jason; Tucker, Sharon; Shedenhelm, Heidi; Arnold, Jackie; Belda, Tom; Petera, Michael

    2014-02-01

    Innovative strategies that preserve nursing time for direct patient care activities are needed. This study examined the utility, feasibility, and acceptability of voice recognition (VR) software to document nursing care and patient outcomes in an electronic health record in a simulated nursing care environment. A phase 1 trial included 5 iterative experiments with observations and nurse participant feedback to allow enhancements to the speech detection capabilities and refinement of the technology, software, and processes. Utility ratings improved over time; however, interference on nursing care remained a concern throughout. Nurse participants favored keyboard entry electronic health record, largely due to software and technical issues, but also relative to the culture shift the new technology brings to nursing practice. Successful adoption of VR technology by nursing will be dependent on receptiveness of the nurses and perceived benefits, timely access to education and training, and minimization of barriers to using the software.

  6. Exploring expressivity and emotion with artificial voice and speech technologies.

    Science.gov (United States)

    Pauletto, Sandra; Balentine, Bruce; Pidcock, Chris; Jones, Kevin; Bottaci, Leonardo; Aretoulaki, Maria; Wells, Jez; Mundy, Darren P; Balentine, James

    2013-10-01

    Emotion in audio-voice signals, as synthesized by text-to-speech (TTS) technologies, was investigated to formulate a theory of expression for user interface design. Emotional parameters were specified with markup tags, and the resulting audio was further modulated with post-processing techniques. Software was then developed to link a selected TTS synthesizer with an automatic speech recognition (ASR) engine, producing a chatbot that could speak and listen. Using these two artificial voice subsystems, investigators explored both artistic and psychological implications of artificial speech emotion. Goals of the investigation were interdisciplinary, with interest in musical composition, augmentative and alternative communication (AAC), commercial voice announcement applications, human-computer interaction (HCI), and artificial intelligence (AI). The work-in-progress points towards an emerging interdisciplinary ontology for artificial voices. As one study output, HCI tools are proposed for future collaboration.

  7. Noise Robust Speech Recognition Applied to Voice-Driven Wheelchair

    Science.gov (United States)

    Sasou, Akira; Kojima, Hiroaki

    2009-12-01

    Conventional voice-driven wheelchairs usually employ headset microphones that are capable of achieving sufficient recognition accuracy, even in the presence of surrounding noise. However, such interfaces require users to wear sensors such as a headset microphone, which can be an impediment, especially for the hand disabled. Conversely, it is also well known that the speech recognition accuracy drastically degrades when the microphone is placed far from the user. In this paper, we develop a noise robust speech recognition system for a voice-driven wheelchair. This system can achieve almost the same recognition accuracy as the headset microphone without wearing sensors. We verified the effectiveness of our system in experiments in different environments, and confirmed that our system can achieve almost the same recognition accuracy as the headset microphone without wearing sensors.

  8. When the face fits: recognition of celebrities from matching and mismatching faces and voices.

    Science.gov (United States)

    Stevenage, Sarah V; Neil, Greg J; Hamlin, Iain

    2014-01-01

    The results of two experiments are presented in which participants engaged in a face-recognition or a voice-recognition task. The stimuli were face-voice pairs in which the face and voice were co-presented and were either "matched" (same person), "related" (two highly associated people), or "mismatched" (two unrelated people). Analysis in both experiments confirmed that accuracy and confidence in face recognition was consistently high regardless of the identity of the accompanying voice. However accuracy of voice recognition was increasingly affected as the relationship between voice and accompanying face declined. Moreover, when considering self-reported confidence in voice recognition, confidence remained high for correct responses despite the proportion of these responses declining across conditions. These results converged with existing evidence indicating the vulnerability of voice recognition as a relatively weak signaller of identity, and results are discussed in the context of a person-recognition framework.

  9. The Neuropsychology of Familiar Person Recognition from Face and Voice

    OpenAIRE

    2014-01-01

    Prosopagnosia has been considered for a long period of time as the most important and almost exclusive disorder in the recognition of familiar people. In recent years, however, this conviction has been undermined by the description of patients showing a concomitant defect in the recognition of familiar faces and voices as a consequence of lesions encroaching upon the right anterior temporal lobe (ATL). These new data have obliged researchers to reconsider on one hand the construct of ‘associa...

  10. Emotional Recognition in Autism Spectrum Conditions from Voices and Faces

    Science.gov (United States)

    Stewart, Mary E.; McAdam, Clair; Ota, Mitsuhiko; Peppe, Sue; Cleland, Joanne

    2013-01-01

    The present study reports on a new vocal emotion recognition task and assesses whether people with autism spectrum conditions (ASC) perform differently from typically developed individuals on tests of emotional identification from both the face and the voice. The new test of vocal emotion contained trials in which the vocal emotion of the sentence…

  11. Voice recognition software can be used for scientific articles

    DEFF Research Database (Denmark)

    Pommergaard, Hans-Christian; Huang, Chenxi; Burcharth, Jacob

    2015-01-01

    INTRODUCTION: Dictation of scientific articles has been recognised as an efficient method for producing high-quality, first article drafts. However, standardised transcription service by a secretary may not be available for all researchers and voice recognition software (VRS) may therefore...

  12. Native voice, self-concept and the moral case for personalized voice technology.

    Science.gov (United States)

    Nathanson, Esther

    2017-01-01

    Purpose: (1) To explore the role of native voice and effects of voice loss on self-concept and identity, and survey the state of assistive voice technology; (2) to establish the moral case for developing personalized voice technology. Methods: This narrative review examines published literature on the human significance of voice, the impact of voice loss on self-concept and identity, and the strengths and limitations of current voice technology. Based on the impact of voice loss on self and identity, and voice technology limitations, the moral case for personalized voice technology is developed. Results: Given the richness of information conveyed by voice, loss of voice constrains expression of the self, but the full impact is poorly understood. Augmentative and alternative communication (AAC) devices facilitate communication but, despite advances in this field, voice output cannot yet express the unique nuances of individual voice. The ethical principles of autonomy, beneficence and equality of opportunity establish the moral responsibility to invest in accessible, cost-effective, personalized voice technology. Conclusions: Although further research is needed to elucidate the full effects of voice loss on self-concept, identity and social functioning, current understanding of the profoundly negative impact of voice loss establishes the moral case for developing personalized voice technology. Implications for Rehabilitation: Rehabilitation of voice-disordered patients should facilitate self-expression, interpersonal connectedness and social/occupational participation. Proactive questioning about the psychological and social experiences of patients with voice loss is a valuable entry point for rehabilitation planning. Personalized voice technology would enhance sense of self, communicative participation and autonomy and promote shared healthcare decision-making. Further research is needed to identify the best strategies to preserve and strengthen identity and sense of

  13. Acoustic cues for the recognition of self-voice and other-voice

    Directory of Open Access Journals (Sweden)

    Mingdi Xu

    2013-10-01

    Full Text Available Self-recognition, being indispensable for successful social communication, has become a major focus in current social neuroscience. The physical aspects of the self are most typically manifested in the face and voice. Compared with the wealth of studies on self-face recognition, self-voice recognition (SVR) has not gained much attention. Converging evidence has suggested that the fundamental frequency (F0) and formant structures serve as the key acoustic cues for other-voice recognition (OVR). However, little is known about which, and how, acoustic cues are utilized for SVR as opposed to OVR. To address this question, we independently manipulated the F0 and formant information of recorded voices and investigated their contributions to SVR and OVR. Japanese participants were presented with recorded vocal stimuli and were asked to identify the speaker, either themselves or one of their peers. Six groups of 5 peers of the same sex participated in the study. Under conditions where the formant information was fully preserved and where only the frequencies lower than the third formant (F3) were retained, accuracies of SVR deteriorated significantly with the modulation of the F0, and the results were comparable for OVR. By contrast, under a condition where only the frequencies higher than F3 were retained, the accuracy of SVR was significantly higher than that of OVR throughout the range of F0 modulations, and the F0 scarcely affected the accuracies of SVR and OVR. Our results indicate that while both F0 and formant information are involved in SVR, as well as in OVR, the advantage of SVR is manifested only when major formant information for speech intelligibility is absent. These findings imply the robustness of self-voice representation, possibly by virtue of auditory familiarity and other factors such as its association with motor/articulatory representation.

  14. The Neuropsychology of Familiar Person Recognition from Face and Voice

    Directory of Open Access Journals (Sweden)

    Guido Gainotti

    2014-05-01

    Full Text Available Prosopagnosia has been considered for a long period of time as the most important and almost exclusive disorder in the recognition of familiar people. In recent years, however, this conviction has been undermined by the description of patients showing a concomitant defect in the recognition of familiar faces and voices as a consequence of lesions encroaching upon the right anterior temporal lobe (ATL). These new data have obliged researchers to reconsider on one hand the construct of ‘associative prosopagnosia’ and on the other hand current models of people recognition. A systematic review of the patterns of familiar people recognition disorders observed in patients with right and left ATL lesions has shown that in patients with right ATL lesions face familiarity feelings and the retrieval of person-specific semantic information from faces are selectively affected, whereas in patients with left ATL lesions the defect selectively concerns famous people naming. Furthermore, some patients with right ATL lesions and intact face familiarity feelings show a defect in the retrieval of person-specific semantic knowledge greater from face than from name. These data are at variance with current models assuming: (a) that familiarity feelings are generated at the level of person identity nodes (PINs) where information processed by various sensory modalities converge, and (b) that PINs provide a modality-free gateway to a single semantic system, where information about people is stored in an amodal format. They suggest, on the contrary: (a) that familiarity feelings are generated at the level of modality-specific recognition units; (b) that face and voice recognition units are represented more in the right than in the left ATLs; (c) that in the right ATL are mainly stored person-specific information based on a convergence of perceptual information, whereas in the left ATLs are represented verbally-mediated person-specific information.

  15. Voice Technology Using Personal Computers.

    Science.gov (United States)

    1987-01-01

    VOICE TECHNOLOGY USING PERSONAL COMPUTERS (U). Air Force Inst. of Tech., Wright-Patterson AFB, OH. G. L. Talbot. 1987 ... Inline($2E/$C6/$06/Int24Err/$01/$50/$89/$F8/$2E/$A2/Int24ErrCode/$58/$B0/$00/$89/$EC/$5D/$CF); ... { Turbo: PUSH BP (save caller's stack frame); MOV BP,SP (set up this procedure's stack frame); PUSH BP (?). Inline: MOV BYTE CS:[Int24Err],1 (set Int24Err to True); PUSH AX; MOV AX,DI (get INT

  16. Voice recognition software can be used for scientific articles

    DEFF Research Database (Denmark)

    Pommergaard, Hans-Christian; Huang, Chenxi; Burcharth, Jacob

    2015-01-01

    INTRODUCTION: Dictation of scientific articles has been recognised as an efficient method for producing high-quality, first article drafts. However, standardised transcription service by a secretary may not be available for all researchers and voice recognition software (VRS) may therefore be an alternative. The purpose of this study was to evaluate the out-of-the-box accuracy of VRS. METHODS: Eleven young researchers without dictation experience dictated the first draft of their own scientific article after thorough preparation according to a pre-defined schedule. The dictate transcribed by VRS...

  17. New Voices: Communication through Technology.

    Science.gov (United States)

    Exceptional Parent, 1983

    1983-01-01

    The article recalls working with young severely disabled children able to communicate only through eye movements or specially developed communication boards. These children could now be helped by sophisticated computerized technology, standard and specialized forms of which are described. The influence of positioning, portability, and other…

  18. Voice recognition software can be used for scientific articles

    DEFF Research Database (Denmark)

    Pommergaard, Hans-Christian; Huang, Chenxi; Burcharth, Jacob

    2015-01-01

    INTRODUCTION: Dictation of scientific articles has been recognised as an efficient method for producing high-quality, first article drafts. However, standardised transcription service by a secretary may not be available for all researchers and voice recognition software (VRS) may therefore be an alternative. The purpose of this study was to evaluate the out-of-the-box accuracy of VRS. METHODS: Eleven young researchers without dictation experience dictated the first draft of their own scientific article after thorough preparation according to a pre-defined schedule. The dictate transcribed by VRS was compared with the same dictate transcribed by an experienced research secretary, and the effect of adding words to the vocabulary of the VRS was investigated. The number of errors per hundred words was used as outcome. Furthermore, three experienced researchers assessed the subjective readability using...

  19. Embodied Transcription: A Creative Method for Using Voice-Recognition Software

    Science.gov (United States)

    Brooks, Christine

    2010-01-01

    Voice-recognition software is designed to be used by one user (voice) at a time, requiring a researcher to speak all of the words of a recorded interview to achieve transcription. Thus, the researcher becomes a conduit through which interview material is inscribed as written word. Embodied Transcription acknowledges performative and interpretative…

  20. Impact of PACS and Voice-Recognition Reporting on the Education of Radiology Residents

    OpenAIRE

    Gutierrez, Antonio J.; Mullins, Mark E.; Novelline, Robert A.

    2005-01-01

    Rationale and Objectives: The introduction of picture archiving and communication system (PACS) has decreased the time needed to interpret radiology examinations resulting in an increased workflow. Because of concerns that the increase in exam throughput and the use of voice recognition may have a negative impact upon radiology resident education, a survey was conducted to assess the impact of PACS and voice recognition. Materials and Methods: Residents at four diagnostic radiology training p...

  1. Superior voice recognition in a patient with acquired prosopagnosia and object agnosia.

    Science.gov (United States)

    Hoover, Adria E N; Démonet, Jean-François; Steeves, Jennifer K E

    2010-11-01

    Anecdotally, it has been reported that individuals with acquired prosopagnosia compensate for their inability to recognize faces by using other person identity cues such as hair, gait or the voice. Are they therefore superior at the use of non-face cues, specifically voices, to person identity? Here, we empirically measure person and object identity recognition in a patient with acquired prosopagnosia and object agnosia. We quantify person identity (face and voice) and object identity (car and horn) recognition for visual, auditory, and bimodal (visual and auditory) stimuli. The patient is unable to recognize faces or cars, consistent with his prosopagnosia and object agnosia, respectively. He is perfectly able to recognize people's voices and car horns and bimodal stimuli. These data show a reverse shift in the typical weighting of visual over auditory information for audiovisual stimuli in a compromised visual recognition system. Moreover, the patient shows selectively superior voice recognition compared to the controls revealing that two different stimulus domains, persons and objects, may not be equally affected by sensory adaptation effects. This also implies that person and object identity recognition are processed in separate pathways. These data demonstrate that an individual with acquired prosopagnosia and object agnosia can compensate for the visual impairment and become quite skilled at using spared aspects of sensory processing. In the case of acquired prosopagnosia it is advantageous to develop a superior use of voices for person identity recognition in everyday life. Copyright © 2010 Elsevier Ltd. All rights reserved.

  2. Voice recognition software can be used for scientific articles.

    Science.gov (United States)

    Pommergaard, Hans-Christian; Huang, Chenxi; Burcharth, Jacob; Rosenberg, Jacob

    2015-02-01

    Dictation of scientific articles has been recognised as an efficient method for producing high-quality, first article drafts. However, standardised transcription service by a secretary may not be available for all researchers and voice recognition software (VRS) may therefore be an alternative. The purpose of this study was to evaluate the out-of-the-box accuracy of VRS. Eleven young researchers without dictation experience dictated the first draft of their own scientific article after thorough preparation according to a pre-defined schedule. The dictate transcribed by VRS was compared with the same dictate transcribed by an experienced research secretary, and the effect of adding words to the vocabulary of the VRS was investigated. The number of errors per hundred words was used as outcome. Furthermore, three experienced researchers assessed the subjective readability using a Likert scale (0-10). Dragon Nuance Premium version 12.5 was used as VRS. The median number of errors per hundred words was 18 (range: 8.5-24.3), which improved when 15,000 words were added to the vocabulary. Subjective readability assessment showed that the texts were understandable with a median score of five (range: 3-9), which was improved with the addition of 5,000 words. The out-of-the-box performance of VRS was acceptable and improved after additional words were added. Further studies are needed to investigate the effect of additional software accuracy training.
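
    The outcome measure named in this abstract, errors per hundred words, can be sketched as follows. This is only a minimal illustration, assuming that errors are counted as word-level edit operations (substitutions, insertions and deletions) against the secretary's transcript; the abstract does not state the exact counting rule, and the function names and sample sentences are hypothetical.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Word-level edit distance between a reference and a VRS transcript."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] = edits needed to turn the first i-1 reference words
    # into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,                          # deletion
                curr[j - 1] + 1,                      # insertion
                prev[j - 1] + (ref_word != hyp_word)  # substitution or match
            ))
        prev = curr
    return prev[-1]


def errors_per_hundred_words(reference: str, hypothesis: str) -> float:
    """Edit errors normalised to 100 reference words (assumed definition)."""
    n_words = len(reference.split())
    if n_words == 0:
        raise ValueError("reference transcript is empty")
    return 100.0 * word_errors(reference, hypothesis) / n_words


# Hypothetical sample sentences, not data from the study.
secretary = "the patient was discharged after two days"
software = "the patient was discharge after to days"
rate = errors_per_hundred_words(secretary, software)
```

    Normalising by the length of the reference transcript keeps the figure comparable across drafts of different lengths.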

  3. Who gets credit for input? Demographic and structural status cues in voice recognition.

    Science.gov (United States)

    Howell, Taeya M; Harrison, David A; Burris, Ethan R; Detert, James R

    2015-11-01

    The authors investigate the employee features that, alongside overall voice expression, affect supervisors' voice recognition. Drawing primarily from status characteristics and network position theories, the authors propose and find in a study of 693 employees from 89 different credit union units that supervisors are more likely to credit those reporting the same amount of voice if the employees have higher ascribed or assigned (by the organization) status--cued by demographic variables such as majority ethnicity and full-time work hours. Further, supervisors are more likely to recognize voice from employees who have higher achieved status--cued by their centrality in informal social structures. The authors also find that even when certain groups of lower status employees speak up more, they cannot compensate for the negative effect of their demographic membership on voice recognition by their boss. The authors underscore how recognition of employee voice by supervisors matters for employees. It carries (mediates) the effects of voice expression and status onto performance evaluations 1 year later, which means that demographic differences in the assignment of credit for voice can serve as an implicit pathway for discrimination.

  4. Voice Assessment of Student Work: Recent Studies and Emerging Technologies

    Science.gov (United States)

    Eckhouse, Barry; Carroll, Rebecca

    2013-01-01

    Although relatively little attention has been given to the voice assessment of student work, at least when compared with more traditional forms of text-based review, the attention it has received strongly points to a promising form of review that has been hampered by the limits of an emerging technology. A fresh review of voice assessment in light…

  5. Voice identity recognition: functional division of the right STS and its behavioral relevance.

    Science.gov (United States)

    Schall, Sonja; Kiebel, Stefan J; Maess, Burkhard; von Kriegstein, Katharina

    2015-02-01

    The human voice is the primary carrier of speech but also a fingerprint for person identity. Previous neuroimaging studies have revealed that speech and identity recognition is accomplished by partially different neural pathways, despite the perceptual unity of the vocal sound. Importantly, the right STS has been implicated in voice processing, with different contributions of its posterior and anterior parts. However, the time point at which vocal and speech processing diverge is currently unknown. Also, the exact role of the right STS during voice processing is so far unclear because its behavioral relevance has not yet been established. Here, we used the high temporal resolution of magnetoencephalography and a speech task control to pinpoint transient behavioral correlates: we found, at 200 msec after stimulus onset, that activity in right anterior STS predicted behavioral voice recognition performance. At the same time point, the posterior right STS showed increased activity during voice identity recognition in contrast to speech recognition whereas the left mid STS showed the reverse pattern. In contrast to the highly speech-sensitive left STS, the current results highlight the right STS as a key area for voice identity recognition and show that its anatomical-functional division emerges around 200 msec after stimulus onset. We suggest that this time point marks the speech-independent processing of vocal sounds in the posterior STS and their successful mapping to vocal identities in the anterior STS.

  6. Voicing the Technological Body. Some Musicological Reflections on Combinations of Voice and Technology in Popular Music

    Directory of Open Access Journals (Sweden)

    Florian Heesch

    2016-05-01

    Full Text Available The article deals with interrelations of voice, body and technology in popular music from a musicological perspective. It is an attempt to outline a systematic approach to the history of music technology with regard to aesthetic aspects, taking the identity of the singing subject as a main point of departure for a hermeneutic reading of popular song. Although the argumentation is based largely on musicological research, it is also inspired by the notion of presentness as developed by theologian and media scholar Walter Ong. The variety of the relationships between voice, body, and technology with regard to musical representations of identity, in particular gender and race, is systematized alongside the following categories: (1) the “absence of the body,” that starts with the establishment of phonography; (2) “amplified presence,” as a signifier for uses of the microphone to enhance low sounds in certain manners; and (3) “hybridity,” including vocal identities that blend human body sounds and technological processing, whereby special focus is laid on uses of the vocoder and similar technologies.

  7. Gesture Recognition Technology: A Review

    Directory of Open Access Journals (Sweden)

    PALLAVI HALARNKAR

    2012-11-01

    Full Text Available Gesture Recognition Technology has evolved greatly over the years. The past has seen the contemporary Human-Computer Interface techniques and their drawbacks, which limit the speed and naturalness of the human brain and body. As a result gesture recognition technology has developed since the early 1900s with a view to achieving ease and lessening the dependence on devices like keyboards, mice and touchscreens. Attempts have been made to combine natural gestures to operate with the technology around us to enable us to make optimum use of our body gestures making our work faster and more human friendly. The present has seen huge development in this field ranging from devices like virtual keyboards, video game controllers to advanced security systems which work on face, hand and body recognition techniques. The goal is to make full use of the movements of the body and every angle made by the parts of the body in order to supplement technology to become human friendly and understand natural human behavior and gestures. The future of this technology is very bright with prototypes of amazing devices in research and development to make the world equipped with digital information at hand whenever and wherever required.

  8. Ability for voice recognition is a marker for dyslexia in children.

    Science.gov (United States)

    Perea, Manuel; Jiménez, María; Suárez-Coalla, Paz; Fernández, Nohemí; Viña, Cecilia; Cuetos, Fernando

    2014-01-01

    A recent voice recognition experiment conducted by Perrachione, Del Tufo, and Gabrieli (2011) revealed that, in normal adult readers, the accuracy at identifying human voices was better in the participants' mother tongue than in an unfamiliar language, while this difference was absent in a group of adults with dyslexia. This pattern favored a view of dyslexia as due to "fundamentally impoverished native-language phonological representations." To further examine this issue, we conducted two voice recognition experiments, one with children with/without dyslexia, and the other with adults with/without dyslexia. Results revealed that children/adults with dyslexia were less accurate at identifying voices than normal readers and, importantly, this effect was independent of language. These data are more consistent with the assumption of dyslexia as due to a deficit in multisensory integration rather than a deficit based on impoverished native-language phonologically based representations.

  9. Analysis of the influence of sound signal processing parameters on the quality voice command recognition

    Directory of Open Access Journals (Sweden)

    L. P. Dyuzhayev

    2014-04-01

    Full Text Available Introduction. Voice control of devices requires recognition of single (isolated) voice commands. Typically, this control method requires highly reliable voice recognition (at least 95% accuracy). It should be noted that voice commands are often pronounced in noisy conditions. Presently known speech recognition methods and algorithms do not make it possible to determine clearly which sound signal parameters provide the best results. The main part. The first level of voice recognition consists of preprocessing and extraction of acoustic features, which have a number of useful properties: they are easily calculated and provide a compact representation of the voice commands that is resistant to noise interference. At the next level, the given command is looked up in the reference dictionary. To obtain MFCC coefficients, the input file is divided into frames. Each frame is weighted by a window function and processed by a discrete Fourier transform. The resulting frequency-domain representation of the signal is divided into ranges using a set of triangular filters. The last step is a discrete cosine transform. The dynamic time warping method yields a value that is the inverse of the degree of similarity between the given command and a reference. Conclusions. The research showed that, in voice command recognition, optimum results in terms of quality/performance can be achieved with the following sound signal processing parameters: 8 kHz sample rate, frame duration 70–120 ms, Hamming window weighting function, and 512 Fourier samples.
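
    The processing chain described in this abstract (framing, Hamming window, discrete Fourier transform, triangular filters, discrete cosine transform, then dynamic time warping against a reference dictionary) can be sketched roughly as below. This is not the authors' implementation. The 8 kHz sample rate, Hamming window and 512 Fourier samples come from the abstract; the 512-sample frame (64 ms, slightly shorter than the reported 70–120 ms so that one frame fits one transform), the 50% overlap, the 20 mel-spaced filters, the 12 coefficients, the logarithm before the cosine transform (a standard MFCC step the abstract does not mention) and all function names are assumptions. A naive DFT is used only to keep the sketch dependency-free; a real system would use an FFT library.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz, from the abstract
N_FFT = 512         # Fourier samples, from the abstract
FRAME_LEN = 512     # assumption: 64 ms frame, fits one 512-point DFT
HOP = 256           # assumption: 50 % frame overlap
N_FILTERS = 20      # assumption: number of triangular filters
N_COEFFS = 12       # assumption: cepstral coefficients kept per frame


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank():
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    top = hz_to_mel(SAMPLE_RATE / 2)
    # Filter edges expressed in (fractional) DFT bin units.
    edges = [mel_to_hz(top * i / (N_FILTERS + 1)) * N_FFT / SAMPLE_RATE
             for i in range(N_FILTERS + 2)]
    bank = []
    for j in range(1, N_FILTERS + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        bank.append([max(0.0, min((k - lo) / (mid - lo), (hi - k) / (hi - mid)))
                     for k in range(N_FFT // 2 + 1)])
    return bank


def mfcc(signal):
    """Return one list of N_COEFFS cepstral coefficients per frame."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (FRAME_LEN - 1))
              for i in range(FRAME_LEN)]
    twiddle = [cmath.exp(-2j * math.pi * n / N_FFT) for n in range(N_FFT)]
    bank = mel_filterbank()
    features = []
    for start in range(0, len(signal) - FRAME_LEN + 1, HOP):
        frame = [signal[start + i] * window[i] for i in range(FRAME_LEN)]
        # Naive DFT power spectrum over the non-negative frequencies.
        power = [abs(sum(x * twiddle[(k * i) % N_FFT]
                         for i, x in enumerate(frame))) ** 2
                 for k in range(N_FFT // 2 + 1)]
        log_energy = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
                      for row in bank]
        # Unnormalised DCT-II of the log filter energies, skipping c = 0.
        features.append([sum(e * math.cos(math.pi * c * (m + 0.5) / N_FILTERS)
                             for m, e in enumerate(log_energy))
                         for c in range(1, N_COEFFS + 1)])
    return features


def dtw_distance(a, b):
    """Accumulated distance along the best warping path (lower = more similar)."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


def recognize(command, dictionary):
    """Return the label of the reference closest to the command under DTW."""
    return min(dictionary, key=lambda label: dtw_distance(command, dictionary[label]))
```

    In use, each reference command would be converted once with `mfcc` and stored in a dictionary keyed by its label; an incoming command is converted the same way and passed to `recognize`. Because dynamic time warping aligns sequences of different lengths, the command and the references need not contain the same number of frames.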

  10. Speech Recognition Technology Applied to Intelligent Mobile Navigation System

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The capability of human-computer interaction reflects the degree of intelligence of a mobile navigation system. In this paper, the navigation data and functions of a mobile navigation system are divided into system commands and non-system commands, and a group of speech commands is then abstracted. The paper applies speech recognition technology to an intelligent mobile navigation system to process speech commands and investigates in depth the integration of speech recognition technology with mobile navigation. Navigation operations can be performed by speech commands, which makes human-computer interaction easy during navigation. The speech command interface of the navigation system is implemented with Dutty ++ Software, which is based on the IBM ViaVoice speech recognition system. Navigation experiments showed that navigation can be done almost without a keyboard, which demonstrates that human-computer interaction by speech commands is very convenient and that reliability is also higher.

  11. Body expressions influence recognition of emotions in the face and voice.

    Science.gov (United States)

    Van den Stock, Jan; Righart, Ruthger; de Gelder, Beatrice

    2007-08-01

    The most familiar emotional signals consist of faces, voices, and whole-body expressions, but so far research on emotions expressed by the whole body is sparse. The authors investigated recognition of whole-body expressions of emotion in three experiments. In the first experiment, participants performed a body expression-matching task. Results indicate good recognition of all emotions, with fear being the hardest to recognize. In the second experiment, two alternative forced choice categorizations of the facial expression of a compound face-body stimulus were strongly influenced by the bodily expression. This effect was a function of the ambiguity of the facial expression. In the third experiment, recognition of emotional tone of voice was similarly influenced by task irrelevant emotional body expressions. Taken together, the findings illustrate the importance of emotional whole-body expressions in communication either when viewed on their own or, as is often the case in realistic circumstances, in combination with facial expressions and emotional voices.

  12. Effects of emotional and perceptual-motor stress on a voice recognition system's accuracy: An applied investigation

    Science.gov (United States)

    Poock, G. K.; Martin, B. J.

    1984-02-01

    This was an applied investigation examining the ability of a speech recognition system to recognize speakers' inputs when the speakers were under different stress levels. Subjects were asked to speak to a voice recognition system under three conditions: (1) normal office environment, (2) emotional stress, and (3) perceptual-motor stress. Results indicate a definite relationship between voice recognition system performance and the type of low stress reference patterns used to achieve recognition.

  13. Motorcycle Start-stop System based on Intelligent Biometric Voice Recognition

    Science.gov (United States)

    Winda, A.; E Byan, W. R.; Sofyan; Armansyah; Zariantin, D. L.; Josep, B. G.

    2017-03-01

    The current mechanical key in a motorcycle is prone to burglary, theft or misplacement. Intelligent biometric voice recognition is proposed as an alternative to replace this mechanism. The proposed system decides whether the voice belongs to the user and whether the word uttered by the user is ‘On’ or ‘Off’. The decision is sent to an Arduino in order to start or stop the engine. The recorded voice is processed to obtain features that are later used as input to the proposed system. The Mel-Frequency Cepstral Coefficient (MFCC) is adopted as the feature extraction technique. The extracted features are then used as input to the SVM-based identifier. Experimental results confirm the effectiveness of the proposed intelligent voice recognition and word recognition system. They show that the proposed method produces good training and testing accuracy, 99.31% and 99.43%, respectively. Moreover, the proposed system shows a false rejection rate (FRR) and false acceptance rate (FAR) of 0.18% and 17.58%, respectively. For intelligent word recognition, the training and testing accuracy are 100% and 96.3%, respectively.
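
    The two error rates reported in this abstract can be illustrated with a short sketch. It assumes the usual definitions: FRR is the share of genuine-user attempts that the system rejects, and FAR is the share of impostor attempts that it accepts. The function name and the toy trial data are hypothetical and are not taken from the study.

```python
def far_frr(is_genuine, accepted):
    """Return (FAR, FRR) for paired lists of ground truth and system decisions.

    is_genuine[i] is True when attempt i came from the enrolled user;
    accepted[i] is True when the system accepted attempt i.
    """
    genuine = [a for g, a in zip(is_genuine, accepted) if g]
    impostor = [a for g, a in zip(is_genuine, accepted) if not g]
    if not genuine or not impostor:
        raise ValueError("need at least one genuine and one impostor attempt")
    frr = genuine.count(False) / len(genuine)   # genuine attempts rejected
    far = impostor.count(True) / len(impostor)  # impostor attempts accepted
    return far, frr


# Toy data: four attempts by the owner, five by other speakers.
truth = [True, True, True, True, False, False, False, False, False]
decisions = [True, True, True, False, False, False, True, False, False]
far, frr = far_frr(truth, decisions)
```

    The two rates trade off against each other: a stricter acceptance threshold lowers FAR but raises FRR, so both are usually reported together, as they are here.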

  14. Recognition of voice commands using adaptation of foreign language speech recognizer via selection of phonetic transcriptions

    Science.gov (United States)

    Maskeliunas, Rytis; Rudzionis, Vytautas

    2011-06-01

    In recent years various commercial speech recognizers have become available. These recognizers provide the possibility to develop applications incorporating various speech recognition techniques easily and quickly. All of these commercial recognizers are typically targeted to widely spoken languages having large market potential; however, it may be possible to adapt available commercial recognizers for use in environments where less widely spoken languages are used. Since most commercial recognition engines are closed systems, the only avenue for adaptation is to establish ways of selecting proper phonetic transcriptions between the two languages. This paper deals with the methods to find the phonetic transcriptions for Lithuanian voice commands to be recognized using English speech engines. The experimental evaluation showed that it is possible to find phonetic transcriptions that will enable the recognition of Lithuanian voice commands with recognition accuracy of over 90%.

  15. (Almost) Word for Word: As Voice Recognition Programs Improve, Students Reap the Benefits

    Science.gov (United States)

    Smith, Mark

    2006-01-01

    Voice recognition software is hardly new--attempts at capturing spoken words and turning them into written text have been available to consumers for about two decades. But what was once an expensive and highly unreliable tool has made great strides in recent years, perhaps most recognized in programs such as Nuance's Dragon NaturallySpeaking…

  16. Using Voice Boards: pedagogical design, technological implementation, evaluation and reflections

    Directory of Open Access Journals (Sweden)

    Elisabeth Yaneske

    2010-12-01

    Full Text Available We present a case study to evaluate the use of a Wimba Voice Board to support asynchronous audio discussion. We discuss the learning strategy and pedagogic rationale when a Voice Board was implemented within an MA module for language learners, enabling students to create learning objects and facilitating peer-to-peer learning. Previously students studying the module had communicated using text-based synchronous and asynchronous discussion only. A common criticism of text-based media is the lack of non-verbal communication. Audio communication is a richer medium where use of pitch, tone, emphasis and inflection can increase personalisation and prevent misinterpretation. Feedback from staff and students on the affordances and constraints of voice communication are presented. Evaluations show that while there were several issues with the usability of the Wimba Voice Board, both staff and students felt the use of voice communication in an online environment had many advantages, including increased personalisation, motivation, and the opportunity to practice speaking and listening skills. However, some students were inhibited by feelings of embarrassment. The case study provides an in-depth study of Voice Boards, which makes an important contribution to the learning technology literature.

  17. Design and implementation of a user-oriented speech recognition interface: the synergy of technology and human factors

    NARCIS (Netherlands)

    Kloosterman, Sietse H.

    1994-01-01

    The design and implementation of a user-oriented speech recognition interface are described. The interface enables the use of speech recognition in so-called interactive voice response systems which can be accessed via a telephone connection. In the design of the interface a synergy of technology

  18. DESIGN AND IMPLEMENTATION OF A USER-ORIENTED SPEECH RECOGNITION INTERFACE - THE SYNERGY OF TECHNOLOGY AND HUMAN-FACTORS

    NARCIS (Netherlands)

    KLOOSTERMAN, SH

    The design and implementation of a user-oriented speech recognition interface are described. The interface enables the use of speech recognition in so-called interactive voice response systems which can be accessed via a telephone connection. In the design of the interface a synergy of technology

  19. Automatic speech recognition (ASR) and its use as a tool for assessment or therapy of voice, speech, and language disorders.

    Science.gov (United States)

    Kitzing, Peter; Maier, Andreas; Ahlander, Viveka Lyberg

    2009-01-01

    In general opinion computerized automatic speech recognition (ASR) seems to be regarded as a method only to accomplish transcriptions from spoken language to written text and as such quite insecure and rather cumbersome. However, due to great advances in computer technology and informatics methodology ASR has nowadays become quite dependable and easier to handle, and the number of applications has increased considerably. After some introductory background information on ASR a number of applications of great interest for professionals in voice, speech, and language therapy are pointed out. In the foreseeable future, the keyboard and mouse will by means of ASR technology be replaced in many functions by a microphone as the human-computer interface, and the computer will talk back via its loud-speaker. It seems important that professionals engaged in the care of oral communication disorders take part in this development so their clients may get the optimal benefit from this new technology.

  20. Application of Voice Recognition Input to Decision Support Systems

    Science.gov (United States)

    1988-12-01

    namely, a Bark-scale frequency warping and the incorporation of suprasegmental energy information. All distortion measures and their modifications were...lowest score; (2) Whereas the addition of suprasegmental energy information helped the recognition performance, the use of gain and absolute loudness

  1. Voice recognition through phonetic features with Punjabi utterances

    Science.gov (United States)

    Kaur, Jasdeep; Juglan, K. C.; Sharma, Vishal; Upadhyay, R. K.

    2017-07-01

    This paper deals with perception and disorders of speech in the Punjabi language. Given the importance of voice identification, various parameters of speaker identification have been studied. The speech material was recorded with a tape recorder in the speakers' normal and disguised modes of utterance. Out of the recorded speech materials, the utterances free from noise, etc. were selected for auditory and acoustic spectrographic analysis. The comparison of normal and disguised speech of seven subjects is reported. The fundamental frequency (F0) at similar places, plosive duration at certain phonemes, amplitude ratio (A1:A2), etc. were compared in normal and disguised speech. It was found that the formant frequency of normal and disguised speech remains almost similar only if it is compared at the position of the same vowel quality and quantity. If the vowel is more closed or more open in the disguised utterance, the formant frequency will change in comparison to the normal utterance. The amplitude ratio (A1:A2) is found to be speaker dependent. It remains unchanged in the disguised utterance. However, this value may shift in the disguised utterance if cross sectioning is not done at the same location.

  2. Voice input/output capabilities at Perception Technology Corporation

    Science.gov (United States)

    Ferber, Leon A.

    1977-01-01

    Condensed resumes of key company personnel at the Perception Technology Corporation are presented. The staff possesses recognition, speech synthesis, speaker authentication, and language identification. Hardware and software engineers' capabilities are included.

  3. Air bridge docking: on voice command recognition and synthesis technology in ATC

    Institute of Scientific and Technical Information of China (English)

    马林南

    2015-01-01

    Training in the standard phraseology of air traffic control instructions is important and indispensable. As China's civil aviation industry continues to develop and air traffic flow grows rapidly, training in standard radiotelephony phraseology is a main component of air traffic control simulation training. To change the current training systems' reliance on dedicated human pilot positions, this paper studies speech command recognition and synthesis technology in air traffic control and discusses several of its key technologies, with the aim of replacing the dedicated human pilot position with an automated pilot position in air traffic control simulation trainers.

  4. Recent Advances in Robust Speech Recognition Technology

    CERN Document Server

    Ramírez, Javier

    2011-01-01

    This E-book is a collection of articles that describe advances in speech recognition technology. Robustness in speech recognition refers to the need to maintain high speech recognition accuracy even when the quality of the input speech is degraded, or when the acoustical, articulate, or phonetic characteristics of speech in the training and testing environments differ. Obstacles to robust recognition include acoustical degradations produced by additive noise, the effects of linear filtering, nonlinearities in transduction or transmission, as well as impulsive interfering sources, and diminishe

  5. Voice emotion recognition by cochlear-implanted children and their normally-hearing peers.

    Science.gov (United States)

    Chatterjee, Monita; Zion, Danielle J; Deroche, Mickael L; Burianek, Brooke A; Limb, Charles J; Goren, Alison P; Kulkarni, Aditya M; Christensen, Julie A

    2015-04-01

    Despite their remarkable success in bringing spoken language to hearing impaired listeners, the signal transmitted through cochlear implants (CIs) remains impoverished in spectro-temporal fine structure. As a consequence, pitch-dominant information such as voice emotion, is diminished. For young children, the ability to correctly identify the mood/intent of the speaker (which may not always be visible in their facial expression) is an important aspect of social and linguistic development. Previous work in the field has shown that children with cochlear implants (cCI) have significant deficits in voice emotion recognition relative to their normally hearing peers (cNH). Here, we report on voice emotion recognition by a cohort of 36 school-aged cCI. Additionally, we provide for the first time, a comparison of their performance to that of cNH and NH adults (aNH) listening to CI simulations of the same stimuli. We also provide comparisons to the performance of adult listeners with CIs (aCI), most of whom learned language primarily through normal acoustic hearing. Results indicate that, despite strong variability, on average, cCI perform similarly to their adult counterparts; that both groups' mean performance is similar to aNHs' performance with 8-channel noise-vocoded speech; that cNH achieve excellent scores in voice emotion recognition with full-spectrum speech, but on average, show significantly poorer scores than aNH with 8-channel noise-vocoded speech. A strong developmental effect was observed in the cNH with noise-vocoded speech in this task. These results point to the considerable benefit obtained by cochlear-implanted children from their devices, but also underscore the need for further research and development in this important and neglected area. This article is part of a Special Issue.

  6. Village voice: towards inclusive information technologies

    Energy Technology Data Exchange (ETDEWEB)

    Garside, Ben

    2009-04-15

    A decade ago it was dubbed the 'digital divide'. Now, the gap in information and communications technologies (ICTs) between North and South is gradually shrinking. The developing world accounts for two-thirds of total mobile phone subscriptions, and Africa has the world's fastest growing mobile phone market. By gaining a toehold in affordable ICTs, the poor can access the knowledge and services they need, such as real-time market prices, to boost their livelihoods. But to be sustainable, technologies need to factor in social realities. These include how people already share knowledge, and adapt to introduced technologies: mobile phones, for instance, confer status but can eat into much-needed income. Many development agencies opt for technology-led solutions that fail to 'take'. Approaches that keep development concerns at their core and people as their central focus are key.

  7. Voice Technology Design Guides for Navy Training Systems.

    Science.gov (United States)

    1983-03-01

    This project was directed toward gathering information about applications of automated speech technology (AST) and...environmental events, automated performance measurement, and a strong voice interaction between the trainee and the system. Both successes and difficulties have...Strong interaction with the user community is necessary throughout the curriculum development. A subject matter expert, ideally, should

  8. Recognition disorders for famous faces and voices: a review of the literature and normative data of a new test battery.

    Science.gov (United States)

    Quaranta, Davide; Piccininni, Chiara; Carlesimo, Giovanni Augusto; Luzzi, Simona; Marra, Camillo; Papagno, Costanza; Trojano, Luigi; Gainotti, Guido

    2016-03-01

    Several anatomo-clinical investigations have shown that familiar face recognition disorders not due to high level perceptual defects are often observed in patients with lesions of the right anterior temporal lobe (ATL). The meaning of these findings is, however, controversial, because some authors claim that these patients show pure instances of modality-specific 'associative prosopagnosia', whereas other authors maintain that in these patients voice recognition is also impaired and that these patients have a 'multimodal person recognition disorder'. To solve the problem of the nature of famous faces recognition disorders in patients affected by right ATL lesions, it is therefore very important to verify with formal tests if these patients are or are not able to recognize others by voice, but a direct comparison between the two modalities is hindered by the fact that voice recognition is more difficult than face recognition. To circumvent this difficulty, we constructed a test battery in which subjects were requested to recognize the same persons (well-known at the national level) through their faces and voices, evaluating familiarity and identification processes. The present paper describes the 'Famous People Recognition Battery' and reports the normative data necessary to clarify the nature of person recognition disorders observed in patients affected by right ATL lesions.

  9. Voice Matching Using Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Abhishek Bal

    2014-03-01

    Full Text Available In this paper, the use of a Genetic Algorithm (GA) for voice recognition is described. The practical application of GAs to engineering problems is a rapidly emerging approach in control engineering and signal processing. Genetic algorithms are useful for searching large and poorly defined spaces in a multi-directional way. Voice is a signal of infinite information. Digital processing of the voice signal is very important for automatic voice recognition technology. Nowadays, voice processing is very important in security mechanisms due to the possibility of mimicry. Studying voice feature extraction is therefore necessary in military, hospital, telephone system, and investigation bureau settings. In order to extract valuable information from the voice signal, make decisions on the process, and obtain results, the data needs to be manipulated and analyzed. In this paper, if the instant voice is not matched with the same person’s reference voices in the database, then the GA is applied between two randomly chosen reference voices. The instant voice is then compared with the result of the GA, which comprises three main steps: selection, crossover and mutation. We illustrate our approach with different voice samples from people in our institution.
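
    The three GA steps named in this abstract can be illustrated with a minimal sketch. The abstract does not say how voices are encoded or scored, so everything here is an assumption for illustration: voices are short lists of numeric features, fitness is Euclidean distance to the instant voice, and the names `crossover`, `mutate`, and `evolve_match` are hypothetical.

```python
import random


def distance(a, b):
    """Euclidean distance between two feature vectors (assumed fitness measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: head of one parent joined to the tail of the other."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def mutate(vector, rng, rate=0.1, scale=0.05):
    """Perturb each feature with probability `rate` by small Gaussian noise."""
    return [x + rng.gauss(0.0, scale) if rng.random() < rate else x for x in vector]


def evolve_match(instant, references, generations=50, pop_size=20, seed=0):
    """Evolve candidates from two randomly chosen reference voices toward `instant`."""
    rng = random.Random(seed)
    population = [list(v) for v in rng.sample(references, 2)]
    for _ in range(generations):
        # Selection: keep the two candidates closest to the instant voice.
        population.sort(key=lambda v: distance(v, instant))
        parents = population[:2]
        # Crossover and mutation produce the rest of the next generation.
        children = [mutate(crossover(parents[0], parents[1], rng), rng)
                    for _ in range(pop_size - 2)]
        population = parents + children
    return min(population, key=lambda v: distance(v, instant))


# Hypothetical 4-dimensional feature vectors, not data from the paper.
references = [[0.2, 0.8, 0.5, 0.1], [0.6, 0.3, 0.4, 0.9]]
instant = [0.25, 0.35, 0.45, 0.85]
best = evolve_match(instant, references)
```

    Because the two best candidates are carried over unchanged each generation, the result is never farther from the instant voice than the closer of the two starting references.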

  10. Training Implications of Airborne Applications of Automated Speech Recognition Technology.

    Science.gov (United States)

    1980-10-01

    Coler, C. R. Automated speech recognition and man-computer interaction research at NASA Ames Research Center. In S. Harris (Ed.), Proceedings: Voice...Sons, Inc., 1964. 56 NAVTRAEQUIPCEN 80-D-0009-0155-1 Coler, C. R. Automated speech recognition and man-computer interaction research at NASA Ames

  11. Real-time, face recognition technology

    Energy Technology Data Exchange (ETDEWEB)

    Brady, S.

    1995-11-01

    The Institute for Scientific Computing Research (ISCR) at Lawrence Livermore National Laboratory recently developed the real-time, face recognition technology KEN. KEN uses novel imaging devices such as silicon retinas developed at Caltech or off-the-shelf CCD cameras to acquire images of a face and to compare them to a database of known faces in a robust fashion. The KEN-Online project makes that recognition technology accessible through the World Wide Web (WWW), an internet service that has recently seen explosive growth. A WWW client can submit face images, add them to the database of known faces and submit other pictures that the system tries to recognize. KEN-Online serves to evaluate the recognition technology and grow a large face database. KEN-Online includes the use of public domain tools such as mSQL for its name-database and perl scripts to assist the uploading of images.

  12. Students’ Voices about Learning with Technology

    Directory of Open Access Journals (Sweden)

    Ruth Geer

    2012-01-01

    Full Text Available Problem statement: This study argues for the inclusion of student voice as a valid means of identifying 21st century pedagogical approaches to learning. Today’s students are increasingly living and thriving in a digital world and have a new “digital vernacular” which leads to differences in the way students think about learning. Approach: In Australia many students are already immersed in technologies and have preconceived ideas of what technologies they can expect to use in the classroom and how they will learn. Our schools are slowly changing but are struggling to understand what a contemporary learning environment might look like. Current and emerging technologies are forcing teachers to rethink how best to prepare students for the demands and challenges of the 21st century. Results: Technology plays a key role in how students play, learn, gain information and interact with others. Teachers are challenged to find ways of tapping into the natural curiosities of students allowing them to do more learning on their own. This study explores the use of student voice in an Australian primary school as a valid method to inform teachers about what tools can best support students in their learning. Focus groups, questionnaires and drawings are used to identify technologies, strategies and settings that help students to learn. Conclusion: The findings indicate that students expect to use a variety of technologies in their learning as many students use technologies as a natural tool in their everyday life. This research attempts to clarify what a contemporary learning environment might look like and what teaching strategies and technologies can increase motivation and engagement thus improving student learning opportunities. The student data also includes suggestions to teachers on how they may provide rich learning experiences for students.

  13. Examining the effects of variation in emotional tone of voice on spoken word recognition.

    Science.gov (United States)

    Krestar, Maura L; McLennan, Conor T

    2013-09-01

    Emotional tone of voice (ETV) is essential for optimal verbal communication. Research has found that the impact of variation in nonlinguistic features of speech on spoken word recognition differs according to a time course. In the current study, we investigated whether intratalker variation in ETV follows the same time course in two long-term repetition priming experiments. We found that intratalker variability in ETVs affected reaction times to spoken words only when processing was relatively slow and difficult, not when processing was relatively fast and easy. These results provide evidence for the use of both abstract and episodic lexical representations for processing within-talker variability in ETV, depending on the time course of spoken word recognition.

  14. An Introduction to Face Recognition Technology

    Directory of Open Access Journals (Sweden)

    Shang-Hung Lin

    2000-01-01

    Full Text Available Recently face recognition has been attracting much attention in the network multimedia information access community. Areas such as network security, content indexing and retrieval, and video compression benefit from face recognition technology because "people" are the center of attention in a lot of video. Network access control via face recognition not only makes it virtually impossible for hackers to steal one's "password", but also increases user-friendliness in human-computer interaction. Indexing and/or retrieving video data based on the appearances of particular persons will be useful for users such as news reporters, political scientists, and moviegoers. For videophone and teleconferencing applications, face recognition also enables a more efficient coding scheme. In this paper, we give an introduction to this new information processing technology. The paper presents the generic framework of a face recognition system and the variations that a face recognizer frequently encounters. Several well-known face recognition algorithms, such as eigenfaces and neural networks, are also explained.

  15. It doesn't matter what you say: FMRI correlates of voice learning and recognition independent of speech content.

    Science.gov (United States)

    Zäske, Romi; Awwad Shiekh Hasan, Bashar; Belin, Pascal

    2017-09-01

    Listeners can recognize newly learned voices from previously unheard utterances, suggesting the acquisition of high-level speech-invariant voice representations during learning. Using functional magnetic resonance imaging (fMRI) we investigated the anatomical basis underlying the acquisition of voice representations for unfamiliar speakers independent of speech, and their subsequent recognition among novel voices. Specifically, listeners studied voices of unfamiliar speakers uttering short sentences and subsequently classified studied and novel voices as "old" or "new" in a recognition test. To investigate "pure" voice learning, i.e., independent of sentence meaning, we presented German sentence stimuli to non-German speaking listeners. To disentangle stimulus-invariant and stimulus-dependent learning, during the test phase we contrasted a "same sentence" condition in which listeners heard speakers repeating the sentences from the preceding study phase, with a "different sentence" condition. Voice recognition performance was above chance in both conditions although, as expected, performance was higher for same than for different sentences. During study phases activity in the left inferior frontal gyrus (IFG) was related to subsequent voice recognition performance and same versus different sentence condition, suggesting an involvement of the left IFG in the interactive processing of speaker and speech information during learning. Importantly, at test reduced activation for voices correctly classified as "old" compared to "new" emerged in a network of brain areas including temporal voice areas (TVAs) of the right posterior superior temporal gyrus (pSTG), as well as the right inferior/middle frontal gyrus (IFG/MFG), the right medial frontal gyrus, and the left caudate. This effect of voice novelty did not interact with sentence condition, suggesting a role of temporal voice-selective areas and extra-temporal areas in the explicit recognition of learned voice identity

  16. Educational Pedagogy Explored: Attachment, Voice, and Students’ Limited Recognition of the Purpose of Writing

    Directory of Open Access Journals (Sweden)

    Rebecca A. Fairchild

    2013-07-01

    Full Text Available The following teacher research case-study involved an exploration of educational pedagogy by working with a freshman composition student at a college university. All data collected for the study was gathered during the 2013 spring semester. The study was driven by an inquiry based approach where the researcher determined the center of focus that arose from an exploration of the student as a writer through a survey, a classroom observation, multiple one-on-one meetings, and email conversations. The focus area that arose was the student’s limited recognition that writing was done solely for school purposes. Related puzzlements stemming from this focus area included the student’s lack of attachment and lack of voice in her writing. The conclusive data provided insights for how to educate students in future classrooms regarding how vital it is for students to be able to attach themselves to their work.

  17. Voice Activity Detector of Wake-Up-Word Speech Recognition System Design on FPGA

    Directory of Open Access Journals (Sweden)

    Veton Z. Këpuska

    2014-12-01

    Full Text Available A typical speech recognition system is push-to-talk operated and requires manual activation. However, for those who use hands-busy applications, movement may be restricted or impossible. One alternative is to use a speech-only interface. The proposed method, called Wake-Up-Word Speech Recognition (WUW-SR), utilizes a speech-only interface. A WUW-SR system would allow the user to activate systems (cell phone, computer, etc.) with only speech commands instead of manual activation. The trend in WUW-SR hardware design is towards implementing a complete system on a single chip intended for various applications. This paper presents an experimental FPGA design and implementation of a novel architecture of a real-time feature extraction processor that includes a Voice Activity Detector (VAD) and feature extraction (MFCC, LPC, and ENH_MFCC). In the WUW-SR system, the recognizer front-end with VAD is located at the terminal, which is typically connected over a data network (e.g., to a server) for remote back-end recognition. VAD is responsible for segmenting the signal into speech-like and non-speech-like segments. For any given frame VAD reports one of two possible states: VAD_ON or VAD_OFF. The back-end is then responsible for scoring the features that are segmented during the VAD_ON stage. The most important characteristic of the presented design is that it should guarantee virtually 100% correct rejection of non-WUW (out-of-vocabulary, OOV) words while maintaining a correct acceptance rate of 99.9% or higher for in-vocabulary (INV) words. This requirement sets WUW-SR apart from other speech recognition tasks because no existing system can guarantee 100% reliability by any measure.
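
    The per-frame VAD_ON / VAD_OFF reporting described in this abstract can be sketched in software. The abstract does not describe the detector's decision rule, so this sketch assumes a plain energy threshold; the frame length, the threshold value, and the function names are illustrative assumptions and not taken from the FPGA design.

```python
import math


def frame_energy_db(frame):
    """Mean power of one frame in dB, with a small floor to avoid log(0)."""
    power = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(power + 1e-12)


def vad_states(samples, frame_len=160, threshold_db=-30.0):
    """Label each full frame VAD_ON or VAD_OFF by comparing its energy to a threshold."""
    states = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        is_speech_like = frame_energy_db(frame) > threshold_db
        states.append("VAD_ON" if is_speech_like else "VAD_OFF")
    return states


def speech_segments(states):
    """Collapse per-frame states into (first_frame, last_frame) runs of VAD_ON."""
    segments, start = [], None
    for i, state in enumerate(states):
        if state == "VAD_ON" and start is None:
            start = i
        elif state == "VAD_OFF" and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(states) - 1))
    return segments


# Synthetic test signal at an assumed 8 kHz rate: silence, a 440 Hz tone, silence.
silence = [0.0] * 320
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(480)]
states = vad_states(silence + tone + silence)
segments = speech_segments(states)
```

    In this sketch only the frames inside a VAD_ON run would be handed to the back-end for scoring, which mirrors the division of labour the abstract describes.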

  18. Technologies for Self-Determination for Youth with Developmental Disabilities. Technologies for Voice: A Critical Issues Brief

    Science.gov (United States)

    Skouge, James R.; Kelly, Mary L.; Roberts, Kelly D.; Leake, David W.; Stodden, Robert A.

    2007-01-01

    This paper focuses on "technologies for voice" that are related to the self-determination of youth with developmental disabilities. The authors describe a self-determination model that values family-focused, community-referenced pedagogies employing "new media" to give voice to youth and their families. In line with the adage that a picture is…

  19. Using Voice Boards: Pedagogical Design, Technological Implementation, Evaluation and Reflections

    Science.gov (United States)

    Yaneske, Elisabeth; Oates, Briony

    2011-01-01

    We present a case study to evaluate the use of a Wimba Voice Board to support asynchronous audio discussion. We discuss the learning strategy and pedagogic rationale when a Voice Board was implemented within an MA module for language learners, enabling students to create learning objects and facilitating peer-to-peer learning. Previously students…

  20. Application of Multi-Tier Applications Technology Datasnap in Designing a System of Automatic Segmentation and Recognition of Speech Signal

    Directory of Open Access Journals (Sweden)

    Yedilkhan N. Amirgaliyev

    2016-03-01

    Full Text Available In this paper we will address current issues in the field of development and application of automatic identification systems and segmentation of speech signals. The basic criteria for the shortcomings of such systems were formulated. The review of the types of speech recognition systems was conducted, and the optimum architecture for them, including information used in leading IT companies was described. The possibility of using multi-tier architectures for solving problems of speech recognition and their advantages were considered. Also practical implementation of multi-tier architecture based on DataSnap technology in voice recognition system for geo search in Kazakh language was described.

  1. Using Voice Boards: pedagogical design, technological implementation, evaluation and reflections

    OpenAIRE

    Yaneske, Elisabeth; Oates, Briony

    2010-01-01

    We present a case study to evaluate the use of a Wimba Voice Board to support asynchronous audio discussion. We discuss the learning strategy and pedagogic rationale when a Voice Board was implemented within an MA module for language learners, enabling students to create learning objects and facilitating peer-to-peer learning. Previously students studying the module had communicated using text-based synchronous and asynchronous discussion only. A common criticism of text-based media is the la...

  2. Automatic Speech Recognition Systems for the Evaluation of Voice and Speech Disorders in Head and Neck Cancer

    Directory of Open Access Journals (Sweden)

    Andreas Maier

    2010-01-01

    Full Text Available In patients suffering from head and neck cancer, speech intelligibility is often restricted. For assessment and outcome measurements, automatic speech recognition systems have previously been shown to be appropriate for objective and quick evaluation of intelligibility. In this study we investigate the applicability of the method to speech disorders caused by head and neck cancer. Intelligibility was quantified by speech recognition on recordings of a standard text read by 41 German laryngectomized patients with cancer of the larynx or hypopharynx and 49 German patients who had suffered from oral cancer. The speech recognition provides the percentage of correctly recognized words of a sequence, that is, the word recognition rate. Automatic evaluation was compared to perceptual ratings by a panel of experts and to an age-matched control group. Both patient groups showed significantly lower word recognition rates than the control group. Automatic speech recognition yielded word recognition rates which complied with experts' evaluation of intelligibility on a significant level. Automatic speech recognition serves as a good means with low effort to objectify and quantify the most important aspect of pathologic speech—the intelligibility. The system was successfully applied to voice and speech disorders.
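
    The word recognition rate defined in this abstract, the percentage of correctly recognized words of a sequence, can be computed with a short sketch. The abstract does not state how recognizer output is aligned with the reference text, so this version assumes that a word counts as correct when it appears in order in both sequences (a longest-common-subsequence alignment); the function names and the example sentences are hypothetical.

```python
def correct_words(reference, hypothesis):
    """Count reference words matched in order by the hypothesis (LCS length)."""
    ref, hyp = reference.split(), hypothesis.split()
    # table[i][j] holds the LCS length of ref[:i] and hyp[:j].
    table = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i, ref_word in enumerate(ref, 1):
        for j, hyp_word in enumerate(hyp, 1):
            if ref_word == hyp_word:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(ref)][len(hyp)]


def word_recognition_rate(reference, hypothesis):
    """Percentage of reference words that were correctly recognized."""
    total = len(reference.split())
    if total == 0:
        raise ValueError("reference text must contain at least one word")
    return 100.0 * correct_words(reference, hypothesis) / total


# Hypothetical read text and recognizer output, for illustration only.
rate = word_recognition_rate("the north wind and the sun",
                             "the north wind at the son")
```

    Here four of the six reference words are matched, so the rate is about 66.7%. A lower rate for a speaker reading the same text would indicate lower intelligibility under this measure.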

  3. Study and recognition of pathological voice

    Institute of Scientific and Technical Information of China (English)

    陈承义; 高俊芬

    2013-01-01

    By analyzing the mechanism of voice production, traditional acoustic parameters of normal and pathological voices (fundamental frequency, formants, and Mel-frequency cepstral coefficients (MFCC)) and nonlinear feature parameters (box-counting dimension and intercept) are extracted as the feature vector set for pathological voice recognition. A Gaussian mixture model (GMM) is used to model and recognize 156 normal and 146 pathological voice samples. The results show that the nonlinear parameters, box-counting dimension and intercept, distinguish well between normal and pathological voices, and that combining them with the traditional acoustic parameters fundamental frequency and formants achieves a recognition rate of 92.60%.

  4. Speech Recognition Technology for Hearing Disabled Community

    Directory of Open Access Journals (Sweden)

    Tanvi Dua

    2014-09-01

    Full Text Available As the number of people with hearing disabilities is increasing significantly worldwide, technology is needed to fill the communication gap between Deaf and Hearing communities. To fill this gap and to allow people with hearing disabilities to communicate, this paper suggests a framework that contributes to the efficient integration of people with hearing disabilities. This paper presents a robust speech recognition system, which converts continuous speech into text and image. An accuracy of 95% is obtained with a small vocabulary of 20 greeting sentences in continuous speech form, tested in a speaker-independent mode. In this testing phase all these continuous sentences were given as live input to the proposed system.

  5. Applications of VoiceThread(©) Technology in Graduate Nursing Education.

    Science.gov (United States)

    Donnelly, Mary K; Kverno, Karan S; Belcher, Anne E; Ledebur, Lindsay R; Gerson, Linda D

    2016-11-01

    Online graduate courses provide opportunities for faculty to use technology and digital applications to enhance student learning and learning environments. In nursing education, as we become increasingly dependent on technology, it is important to ensure that both faculty and students add digital literacy to their repertoire of knowledge and skills. VoiceThread(©), one type of Web-based digital application tool, allows students and faculty to verbally communicate and collaborate asynchronously. This article discusses the use of VoiceThread technology in graduate nursing education and offers four examples of VoiceThread teaching methods: personal introductions, issues discussions, case presentations, and the elevator speech. Student participation in VoiceThread assignments is evaluated using leveled rubrics. A poll of the students in one of the graduate courses showed high overall satisfaction with VoiceThread in the online classroom. Strategies for effective use of VoiceThread technology to enhance student engagement and learning are recommended. [J Nurs Educ. 2016;55(11):655-658.]. Copyright 2016, SLACK Incorporated.

  6. Scientific Bases of Human-Machine Communication by Voice

    Science.gov (United States)

    Schafer, Ronald W.

    1995-10-01

    The scientific bases for human-machine communication by voice are in the fields of psychology, linguistics, acoustics, signal processing, computer science, and integrated circuit technology. The purpose of this paper is to highlight the basic scientific and technological issues in human-machine communication by voice and to point out areas of future research opportunity. The discussion is organized around the following major issues in implementing human-machine voice communication systems: (i) hardware/software implementation of the system, (ii) speech synthesis for voice output, (iii) speech recognition and understanding for voice input, and (iv) usability factors related to how humans interact with machines.

  7. A self-teaching image processing and voice-recognition-based, intelligent and interactive system to educate visually impaired children

    Science.gov (United States)

    Iqbal, Asim; Farooq, Umar; Mahmood, Hassan; Asad, Muhammad Usman; Khan, Akrama; Atiq, Hafiz Muhammad

    2010-02-01

    A self-teaching system based on image processing and voice recognition is developed to educate visually impaired children, chiefly in their primary education. The system comprises a computer, a vision camera, an ear speaker and a microphone. The camera, attached to the computer system, is mounted on the ceiling opposite (at the required angle) the desk on which the book is placed. Sample images and voices in the form of instructions and commands for English and Urdu alphabets, numeric digits, operators and shapes are already stored in the database. A blind child first reads the embossed character (object) with the help of fingers, then speaks the answer (the name of the character, shape, etc.) into the microphone. When the microphone receives the child's voice command, the camera takes an image, which is processed by a MATLAB® program developed with the help of the Image Acquisition and Image Processing toolboxes; the program generates a response or the required set of instructions to the child via the ear speaker, resulting in self-education of a visually impaired child. A speech recognition program is also developed in MATLAB® with the help of the Data Acquisition and Signal Processing toolboxes, which records and processes the command of the blind child.

  8. 17 Ways to Say Yes: Toward Nuanced Tone of Voice in AAC and Speech Technology.

    Science.gov (United States)

    Pullin, Graham; Hennig, Shannon

    2015-06-01

    People with complex communication needs who use speech-generating devices have very little expressive control over their tone of voice. Despite its importance in human interaction, the issue of tone of voice remains all but absent from AAC research and development. In this paper, we describe three interdisciplinary projects, past, present and future: the critical design collection Six Speaking Chairs has provoked deeper discussion and inspired a social model of tone of voice; the speculative concept Speech Hedge illustrates challenges and opportunities in designing more expressive user interfaces; the pilot project Tonetable could enable participatory research and seed a research network around tone of voice. We speculate that more radical interactions might expand frontiers of AAC and disrupt speech technology as a whole.

  9. Self Assistive Technology for Disabled People – Voice Controlled Wheel Chair and Home Automation System

    Directory of Open Access Journals (Sweden)

    R. Puviarasi

    2014-07-01

    Full Text Available This paper describes the design of an innovative, low-cost self-assistive technology that facilitates the control of a wheelchair and home appliances through voice commands from disabled people. The proposed system offers an alternative for physically challenged people with quadriplegia, who are permanently unable to move their limbs (but who are able to speak and hear), and for elderly people, letting them control the motion of the wheelchair and home appliances using their voices to lead an independent, confident and enjoyable life. The performance of this microcontroller-based, voice-integrated design is evaluated in terms of accuracy and velocity in various environments. The results show that it could be part of an assistive technology for disabled persons, without requiring any third person's assistance.

  10. The Cambridge Mindreading Face-Voice Battery for Children (CAM-C): complex emotion recognition in children with and without autism spectrum conditions.

    Science.gov (United States)

    Golan, Ofer; Sinai-Gavrilov, Yana; Baron-Cohen, Simon

    2015-01-01

    Difficulties in recognizing emotions and mental states are central characteristics of autism spectrum conditions (ASC). However, emotion recognition (ER) studies have focused mostly on recognition of the six 'basic' emotions, usually using still pictures of faces. This study describes a new battery of tasks for testing recognition of nine complex emotions and mental states from video clips of faces and from voice recordings taken from the Mindreading DVD. This battery (the Cambridge Mindreading Face-Voice Battery for Children or CAM-C) was given to 30 high-functioning children with ASC, aged 8 to 11, and to 25 matched controls. The ASC group scored significantly lower than controls on complex ER from faces and voices. In particular, participants with ASC had difficulty with six out of nine complex emotions. Age was positively correlated with all task scores, and verbal IQ was correlated with scores in the voice task. CAM-C scores were negatively correlated with parent-reported level of autism spectrum symptoms. Children with ASC show deficits in recognition of complex emotions and mental states from both facial and vocal expressions. The CAM-C may be a useful test for endophenotypic studies of ASC and is one of the first to use dynamic stimuli as an assay to reveal the ER profile in ASC. It complements the adult version of the CAM Face-Voice Battery, thus providing opportunities for developmental assessment of social cognition in autism.

  11. A General Summary of the Research Status Quo of Voice Emotion Recognition

    Institute of Scientific and Technical Information of China (English)

    何秉羲

    2015-01-01

    Starting from the concept and process of voice emotion recognition, this article comprehensively reviews the research results achieved in recent years at each stage of the recognition process and discusses prospects for future research and development.

  12. Transmission by an Embedded System with Enhancements in Voice Processing Technologies

    Directory of Open Access Journals (Sweden)

    G.Sitha Annapurna

    2014-03-01

    Full Text Available The paper reports a robot that can transmit data such as video, audio and images and that can be controlled using the human voice. There are two embedded systems: the first is the robot controlling system (MASTER), which is used to control the robot; the second is the voice-controlled robot (SLAVE), which responds according to the instructions coming from the controlling system. The two embedded systems communicate wirelessly. Any one of several wireless protocols, such as IR, NFC, Bluetooth, Zigbee or Wi-Fi, can be used to establish a bridge between the MASTER and the SLAVE. The voice-controlled robot understands instructions with the help of the voice recognition system. Sphinx-4 is a speech recognizer written entirely in the Java programming language. Sphinx-4 started out as a port of Sphinx-3 to the Java programming language, but evolved into a recognizer designed to be much more flexible than Sphinx-3, thus becoming an excellent platform for speech research. Sphinx-4 is an HMM-based speech recognizer; HMM stands for Hidden Markov Model, a type of statistical model.

  13. Giving Voice to Emotion: Voice Analysis Technology Uncovering Mental States is Playing a Growing Role in Medicine, Business, and Law Enforcement.

    Science.gov (United States)

    Allen, Summer

    2016-01-01

    It's tough to imagine anything more frustrating than interacting with a call center. Generally, people don't reach out to call centers when they're happy; they're usually trying to get help with a problem or gearing up to do battle over a billing error. Add in an automatic phone tree, and you have a recipe for annoyance. But what if that robotic voice offering you a smorgasbord of numbered choices could tell that you were frustrated and then funnel you to an actual human being? This type of voice analysis technology exists, and it's just one example of the many ways that computers can use your voice to extract information about your mental and emotional state, including information you may not think of as being accessible through your voice alone.

  14. Revisiting vocal perception in non-human animals: a review of vowel discrimination, speaker voice recognition, and speaker normalization

    Directory of Open Access Journals (Sweden)

    Buddhamas eKriengwatana

    2015-01-01

    Full Text Available The extent to which human speech perception evolved by taking advantage of predispositions and pre-existing features of vertebrate auditory and cognitive systems remains a central question in the evolution of speech. This paper reviews asymmetries in vowel perception, speaker voice recognition, and speaker normalization in non-human animals – topics that have not been thoroughly discussed in relation to the abilities of non-human animals, but are nonetheless important aspects of vocal perception. Throughout this paper we demonstrate that addressing these issues in non-human animals is relevant and worthwhile because many non-human animals must deal with similar issues in their natural environment. That is, they must also discriminate between similar-sounding vocalizations, determine signaler identity from vocalizations, and resolve signaler-dependent variation in vocalizations from conspecifics. Overall, we find that, although plausible, the current evidence is insufficiently strong to conclude that directional asymmetries in vowel perception are specific to humans, or that non-human animals can use voice characteristics to recognize human individuals. However, we do find some indication that non-human animals can normalize speaker differences. Accordingly, we identify avenues for future research that would greatly improve and advance our understanding of these topics.

  15. Dynamic gesture recognition based on multiple sensors fusion technology.

    Science.gov (United States)

    Wenhui, Wang; Xiang, Chen; Kongqiao, Wang; Xu, Zhang; Jihai, Yang

    2009-01-01

    This paper investigates the roles of a three-axis accelerometer, surface electromyography sensors and a webcam for dynamic gesture recognition. A decision-level multiple sensor fusion method based on action elements is proposed to distinguish a set of 20 kinds of dynamic hand gestures. Experiments are designed and conducted to collect the three kinds of sensor data streams simultaneously during gesture execution and to compare the performance of different sensor subsets in gesture recognition. Experimental results from three subjects show that the combination of the three kinds of sensors achieves recognition accuracies of 87.5%-91.8%, which are considerably higher than those of the single-sensor conditions. This study is valuable for realizing continuous and dynamic gesture recognition based on multiple sensor fusion technology for multi-model interaction.
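
    As a rough illustration of what decision-level fusion means in general, the sketch below combines per-sensor class scores by a weighted sum and picks the highest-scoring class. It is not the action-element method of the paper; the function name, the sensor names, the weights and the scores are all hypothetical.

```python
from typing import Dict, List


def fuse_decisions(scores_by_sensor: Dict[str, List[float]],
                   weights: Dict[str, float]) -> int:
    """Return the index of the class with the highest weighted score sum.

    Each sensor's classifier is assumed (hypothetically) to output one
    score per gesture class; fusion happens on those outputs only.
    """
    n_classes = len(next(iter(scores_by_sensor.values())))
    fused = [0.0] * n_classes
    for sensor, scores in scores_by_sensor.items():
        if len(scores) != n_classes:
            raise ValueError(f"sensor {sensor!r} has a different class count")
        weight = weights.get(sensor, 1.0)  # unlisted sensors get weight 1
        for k, score in enumerate(scores):
            fused[k] += weight * score
    return max(range(n_classes), key=fused.__getitem__)


# Hypothetical scores for three gesture classes from three sensors.
example_scores = {
    "accelerometer": [0.6, 0.3, 0.1],
    "semg": [0.2, 0.5, 0.3],
    "webcam": [0.1, 0.6, 0.3],
}
equal_weights = {"accelerometer": 1.0, "semg": 1.0, "webcam": 1.0}
print(fuse_decisions(example_scores, equal_weights))
```

    With equal weights, the two sensors that favor the second class outvote the accelerometer; raising the accelerometer's weight can change the fused decision, which is the kind of trade-off a fusion scheme has to tune.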

  16. Giving Canadian Science, Mathematics, and Technology Education an Independent Voice

    Science.gov (United States)

    Hodson, Derek

    2015-01-01

    It is noted that the "Canadian Journal of Science, Mathematics and Technology Education" (CJSMTE) was founded with the support of a donation of $1.0 million from the Imperial Oil Charitable Foundation. Four goals were uppermost in the thinking behind the journal: first, it should be bilingual; second, it should be cross-disciplinary;…

  17. Interface Everywhere: Further Development of a Gesture and Voice Commanding Interface Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Natural User Interface (NUI) is a term used to describe a number of technologies such as speech recognition, multi-touch, and kinetic interfaces. Gesture and voice...

  18. Design and Implementation of Monophones and Triphones-Based Speech Recognition Systems for Voice Activated Telephony

    Directory of Open Access Journals (Sweden)

    Rupayan Das

    2013-07-01

    Full Text Available Speech recognition is the ability of a machine or program to convert spoken words into their equivalent text form. Nowadays, most recognition systems use Hidden Markov Models for modeling the spoken utterances. In this paper we have implemented two speaker-independent speech recognition systems which include all the words required for dialing a phone. The systems contain 42 words, including the digits from zero to nine and the names of 20 persons. A total of 16,800 utterances have been used for training each system. The two systems are able to recognize continuous speech and are implemented with monophones and triphones, respectively, using HTK. Experimental results show an accuracy of 74.11% for monophone-based models and 93.77% for triphone-based models.
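
    To make the monophone/triphone distinction concrete, the sketch below expands a monophone sequence into word-internal context-dependent labels written in the left-centre+right style that HTK uses for triphone names. The helper name and the phone transcription of "six" are illustrative assumptions, not taken from the paper.

```python
from typing import List


def to_triphones(phones: List[str]) -> List[str]:
    """Expand monophones into word-internal labels of the form l-c+r.

    The first phone has no left context and the last has no right
    context, so those labels are biphones (or a bare monophone for
    a one-phone word).
    """
    labels = []
    for i, phone in enumerate(phones):
        left = f"{phones[i - 1]}-" if i > 0 else ""
        right = f"+{phones[i + 1]}" if i < len(phones) - 1 else ""
        labels.append(f"{left}{phone}{right}")
    return labels


# Illustrative transcription of the digit "six".
print(to_triphones(["s", "ih", "k", "s"]))
```

    Each label models a phone together with its neighbours, which is why a triphone system can capture coarticulation that a monophone system averages away, at the cost of many more models to train.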

  19. Implementation of a Tour Guide Robot System Using RFID Technology and Viterbi Algorithm-Based HMM for Speech Recognition

    Directory of Open Access Journals (Sweden)

    Neng-Sheng Pai

    2014-01-01

    Full Text Available This paper applied speech recognition and RFID technologies to develop an omni-directional mobile robot into a robot with voice control and guide introduction functions. For speech recognition, the speech signals were captured by short-time processing. The speaker first recorded the isolated words for the robot to create a speech database of specific speakers. After pre-processing of this speech database, the feature parameters of cepstrum and delta-cepstrum were obtained using linear predictive coefficients (LPC). Then, the Hidden Markov Model (HMM) was used for model training on the speech database, and the Viterbi algorithm was used to find an optimal state sequence as the reference sample for speech recognition. The trained reference model was put into the industrial computer on the robot platform, and the user entered the isolated words to be tested. After the input was processed in the same way and compared with the previously trained reference models, the path with the maximum total probability across the various models, found using the Viterbi algorithm, gave the recognition result. Finally, the speech recognition and RFID systems were tested in an actual environment to prove their feasibility and stability, and were implemented on the omni-directional mobile robot.
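
    The Viterbi step mentioned above can be sketched for a small discrete HMM as follows. This is a generic textbook version in the log domain, not the authors' implementation; the state names, observation symbols and all probabilities are made-up values chosen only so the example runs.

```python
import math
from typing import Dict, List, Tuple


def viterbi(obs: List[str],
            states: List[str],
            start_p: Dict[str, float],
            trans_p: Dict[str, Dict[str, float]],
            emit_p: Dict[str, Dict[str, float]]) -> Tuple[float, List[str]]:
    """Return (log-probability, state path) of the most likely state sequence."""
    def log(p: float) -> float:
        return math.log(p) if p > 0 else float("-inf")

    # Initialisation with the first observation.
    best = {s: log(start_p[s]) + log(emit_p[s][obs[0]]) for s in states}
    back: List[Dict[str, str]] = []

    # Recursion: keep the best predecessor for every state at every step.
    for symbol in obs[1:]:
        prev, best, pointers = best, {}, {}
        for s in states:
            pred = max(states, key=lambda r: prev[r] + log(trans_p[r][s]))
            best[s] = prev[pred] + log(trans_p[pred][s]) + log(emit_p[s][symbol])
            pointers[s] = pred
        back.append(pointers)

    # Backtracking from the best final state.
    last = max(states, key=lambda s: best[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return best[last], path


# Hypothetical two-state model over three quantised feature symbols.
states = ["s1", "s2"]
start_p = {"s1": 0.6, "s2": 0.4}
trans_p = {"s1": {"s1": 0.7, "s2": 0.3}, "s2": {"s1": 0.4, "s2": 0.6}}
emit_p = {"s1": {"a": 0.5, "b": 0.4, "c": 0.1},
          "s2": {"a": 0.1, "b": 0.3, "c": 0.6}}
log_prob, path = viterbi(["a", "b", "c"], states, start_p, trans_p, emit_p)
print(path, math.exp(log_prob))
```

    For isolated-word recognition of the kind described, one such model would be trained per word, and the word whose model yields the highest Viterbi score for the input would be reported as the result.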

  20. Analyzing of MOS and Codec Selection for Voice over IP Technology

    Directory of Open Access Journals (Sweden)

    Mohd Nazri Ismail

    2009-01-01

    Full Text Available In this research, we propose an architectural solution to implement the voice over IP (VoIP) service in a campus network environment. VoIP technology is currently a widely discussed topic. Today, deploying this technology in an organization can bring great financial benefit over traditional telephony. Therefore, this study analyzes VoIP codec selection and investigates Mean Opinion Score (MOS) performance in relation to the quality of service delivered by soft phones and IP phones. The study focuses on voice quality prediction, namely (i) the accuracy of MOS between an automated system and human perception and (ii) performance measurement of different codec types via human perception using the MOS technique. A network management system (NMS) is used to monitor and capture the performance of VoIP in the campus environment. In addition, the main aim of implementing soft phones and IP phones in the campus environment is to define the best codec selection for use in an operational environment. The findings show that MOS measurement through both automated and manual systems is able to predict and evaluate VoIP performance. In addition, based on manual MOS measurement, VoIP conversations over LAN provide better reliability and availability than those over WAN.
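
    A manual MOS is conventionally the arithmetic mean of listener ratings on a 1 to 5 scale, so codec selection by human perception can be sketched as below. The codec labels and the ratings are invented for illustration and do not come from the study.

```python
from statistics import mean
from typing import Dict, List


def mean_opinion_score(ratings: List[int]) -> float:
    """Average listener ratings given on the usual 1 (bad) to 5 (excellent) scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    return mean(ratings)


def best_codec(ratings_by_codec: Dict[str, List[int]]) -> str:
    """Return the codec label whose ratings give the highest MOS."""
    return max(ratings_by_codec,
               key=lambda codec: mean_opinion_score(ratings_by_codec[codec]))


# Hypothetical listener ratings for three unnamed codecs.
example_ratings = {
    "codec_a": [4, 5, 4, 4],
    "codec_b": [4, 3, 4, 3],
    "codec_c": [3, 3, 4, 3],
}
print(best_codec(example_ratings))
```

    Comparing such manual scores against the values reported by an automated measurement tool is one simple way to check how well the automated estimate tracks human perception.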

  1. A Real-Time Face Motion Based Approach towards Modeling Socially Assistive Wireless Robot Control with Voice Recognition

    Directory of Open Access Journals (Sweden)

    Abhinaba Bhattacharjee

    2015-10-01

    Full Text Available The robotics domain has a number of general design requirements that call for close integration of planning, sensing, control and modeling, and the robot must take into account the interactions between itself, its task and its surrounding environment. Considering these fundamental configurations, the main motive is to design a system with user-friendly interfaces that can control embedded robotic systems by natural means. While earlier works have focused primarily on issues such as manipulation and navigation, this proposal presents a conceptual and intuitive approach to man-machine interaction that provides secured, live biometric logical authorization of user access while interacting intelligently with the control station to navigate advanced gesture-controlled wireless robotic prototypes or mobile surveillance systems along desired directions through required displacements. The intuitions are based on tracking real-time three-dimensional face motions using skin tone segmentation and maximum area considerations of segmented face-like blobs, or on directing the system with voice commands using real-time speech recognition. The system implementation requires designing a user interface to communicate wirelessly between the control station and the prototypes, either by accessing the internet over an encrypted Wi-Fi Protected Access (WPA) connection via an HTML web page for communicating with face motions, or with the help of natural voice commands like “Trace 5 squares”, “Trace 10 triangles”, “Move 10 meters”, etc., evaluated on an iRobot Create over Bluetooth connectivity using a Bluetooth Access Module (BAM). Such an implementation can prove highly effective for designing systems for elderly aid and for maneuvering the physically challenged.

  2. DLMS Voice Data Entry.

    Science.gov (United States)

    1980-06-01

    between operator and computer displayed on ADM-3A 20c A-I Possible Hardware Configuration for a Multistation Cartographic VDES ...this program a Voice Recognition System (VRS) which can be used to explore the use of voice data entry (VDE) in the DLMS or other cartographic data...Multi-Station Cartographic Voice Data Entry System An engineering development model voice data entry system (VDES) could be most efficiently

  3. A basic study on application of voice recognition input to an electronic nursing record system -evaluation of the function as an input interface-.

    Science.gov (United States)

    Marukami, Terutaka; Tani, Shoko; Matsuda, Atsuko; Takemoto, Keiko; Shindo, Akiko; Inada, Hiroshi

    2012-06-01

    As computerization in the nursing field has progressed, electronic nursing record systems are gradually being introduced in medical institutions in Japan. Although electronic nursing record systems are expected to reduce the nursing workload, information is still entered by conventional keyboard operation, which poses problems of input time and operability for ordinary nurses who are unfamiliar with computers. In the present study, we conducted a basic study on the application of voice recognition input to an electronic nursing record system. Voice input has recently been introduced to electronic medical record systems in a few clinics. However, so far the entered information cannot be processed because the information in the medical record must be entered as free text. Therefore, we devised a template for an electronic nursing record system and introduced it into the system to allow simple information entry and easy processing of the entered information. Furthermore, an input experiment with volunteer subjects was carried out to evaluate voice input with the template as an input interface of an electronic nursing record system. The results of the experiment revealed that voice input was clearly faster than keyboard input and that the operability of voice input was superior to that of keyboard input, although none of the subjects had prior experience with voice input. These results suggest that our method, voice input using the template we devised, might be useful as an input interface for an electronic nursing record system.

  4. Offline Chinese handwriting recognition: an assessment of current technology

    Institute of Scientific and Technical Information of China (English)

    Sargur N. Srihari; Xuanshen Yang; Gregory R. Ball

    2007-01-01

    Offline Chinese handwriting recognition (OCHR) is a typically difficult pattern recognition problem. Many authors have presented various approaches to recognizing its different aspects. We present a survey and an assessment of relevant papers appearing in recent publications of relevant conferences and journals, including those appearing in ICDAR, SDIUT, IWFHR, ICPR, PAMI, PR, PRL, SPIE-DRR, and IJDAR. The methods are assessed in the sense that we document their technical approaches, strengths, and weaknesses, as well as the data sets on which they were reportedly tested and on which results were generated. We also identify a list of technology gaps with respect to Chinese handwriting recognition and identify technical approaches that show promise in these areas as well as identify the leading researchers for the applicable topics, discussing difficulties associated with any given approach.

  5. [Recognition of corn seeds based on pattern recognition and near infrared spectroscopy technology].

    Science.gov (United States)

    Liu, Tian-Ling; Su, Qi-Ya; Sun, Qun; Yang, Li-Ming

    2012-06-01

    Pattern recognition technology and data mining methods have become a hot topic in chemometrics. Near infrared (NIR) spectroscopic analysis has been widely used in spectrum signal processing and modeling due to its advantages of quickness, simplicity and nondestructiveness. Based on five different methods of pattern recognition, namely the locally linear embedding (LLE), wavelet transform (WT), principal component analysis (PCA), partial least squares (PLS) and support vector machine (SVM), the pattern recognition system for corn seeds is proposed using NIR technology, and applied to classification of 108 hybrid samples and 178 female samples for corn seeds. Firstly, we get rid of noise or reduce the dimension using LLE, WT, PCA and PLS, and then use SVM to identify two-class samples. In the meantime, 1-norm SVM is the method of direct classification and identification. Experimental results for three different spectral regions show that the performances of three methods, i. e. PCA+SVM, LLE+SVM, PLS+SVM, are superior to WT+SVM and 1-norm SVM methods, and obtain a high classification accuracy, which indicates the feasibility and effectiveness of the proposed methods. Moreover, this investigation provides the theoretical support and practical method for recognition of corn seeds utilizing near infrared spectral data.

  6. Multimodal user input to supervisory control systems - Voice-augmented keyboard

    Science.gov (United States)

    Mitchell, Christine M.; Forren, Michelle G.

    1987-01-01

    The use of a voice-augmented keyboard input modality is evaluated in a supervisory control application. An implementation of voice recognition technology in supervisory control is proposed: voice is used to request display pages, while the keyboard is used to input system reconfiguration commands. Twenty participants controlled GT-MSOCC, a high-fidelity simulation of the operator interface to a NASA ground control system, via a workstation equipped with either a single keyboard or a voice-augmented keyboard. Experimental results showed that in all cases where significant performance differences occurred, performance with the voice-augmented keyboard modality was inferior to and had greater variance than the keyboard-only modality. These results suggest that current moderately priced voice recognition systems are an inappropriate human-computer interaction technology in supervisory control systems.

  7. Administration of neuropsychological tests using interactive voice response technology in the elderly: validation and limitations.

    Science.gov (United States)

    Miller, Delyana Ivanova; Talbot, Vincent; Gagnon, Michèle; Messier, Claude

    2013-01-01

    Interactive voice response (IVR) systems are computer programs, which interact with people to provide a number of services from business to health care. We examined the ability of an IVR system to administer and score a verbal fluency task (fruits) and the digit span forward and backward in 158 community dwelling people aged between 65 and 92 years of age (full scale IQ of 68-134). Only six participants could not complete all tasks mostly due to early technical problems in the study. Participants were also administered the Wechsler Intelligence Scale fourth edition (WAIS-IV) and Wechsler Memory Scale fourth edition subtests. The IVR system correctly recognized 90% of the fruits in the verbal fluency task and 93-95% of the number sequences in the digit span. The IVR system typically underestimated the performance of participants because of voice recognition errors. In the digit span, these errors led to the erroneous discontinuation of the test: however the correlation between IVR scoring and clinical scoring was still high (93-95%). The correlation between the IVR verbal fluency and the WAIS-IV Similarities subtest was 0.31. The correlation between the IVR digit span forward and backward and the in-person administration was 0.46. We discuss how valid and useful IVR systems are for neuropsychological testing in the elderly.
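
    The correlations reported above compare scores from two ways of administering the same test. As a reminder of how such a coefficient is obtained, here is a minimal Pearson correlation sketch; the function name and the two score lists are hypothetical and are not data from the study.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two paired score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical digit-span scores for five participants.
ivr_scores = [5, 6, 4, 7, 5]
in_person_scores = [6, 6, 5, 7, 6]
print(round(pearson_r(ivr_scores, in_person_scores), 2))
```

    A systematic underestimate by one method, such as the recognition errors described in the abstract, lowers the scores but does not by itself lower the correlation; only inconsistent errors across participants do.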

  8. Administration of Neuropsychological Tests Using Interactive Voice Response Technology in the Elderly: Validation and Limitations

    Science.gov (United States)

    Miller, Delyana Ivanova; Talbot, Vincent; Gagnon, Michèle; Messier, Claude

    2013-01-01

    Interactive voice response (IVR) systems are computer programs, which interact with people to provide a number of services from business to health care. We examined the ability of an IVR system to administer and score a verbal fluency task (fruits) and the digit span forward and backward in 158 community dwelling people aged between 65 and 92 years of age (full scale IQ of 68–134). Only six participants could not complete all tasks mostly due to early technical problems in the study. Participants were also administered the Wechsler Intelligence Scale fourth edition (WAIS-IV) and Wechsler Memory Scale fourth edition subtests. The IVR system correctly recognized 90% of the fruits in the verbal fluency task and 93–95% of the number sequences in the digit span. The IVR system typically underestimated the performance of participants because of voice recognition errors. In the digit span, these errors led to the erroneous discontinuation of the test: however the correlation between IVR scoring and clinical scoring was still high (93–95%). The correlation between the IVR verbal fluency and the WAIS-IV Similarities subtest was 0.31. The correlation between the IVR digit span forward and backward and the in-person administration was 0.46. We discuss how valid and useful IVR systems are for neuropsychological testing in the elderly. PMID:23950755

  9. Administration of neuropsychological tests using interactive voice response technology in the elderly: validation and limitations

    Directory of Open Access Journals (Sweden)

    Delyana Ivanova Miller

    2013-08-01

    Full Text Available Interactive voice response (IVR) systems are computer programs, which interact with people to provide a number of services from business to health care. We examined the ability of an IVR system to administer and score a verbal fluency task (fruits) and the digit span forward and backward in 158 community dwelling people aged between 65 and 92 years of age (full scale IQ of 68 to 134). Only 6 participants could not complete all tasks mostly due to early technical problems in the study. Participants were also administered the WAIS-IV and WMS-IV sub-tests. The IVR system correctly recognized 90% of the fruits in the verbal fluency task and 93-95% of the number sequences in the digit span. The IVR system typically underestimated the performance of participants because of voice recognition errors. In the digit span, these errors led to the erroneous discontinuation of the test: however the correlation between IVR scoring and clinical scoring was still high (93-95%). The correlation between the IVR verbal fluency and the WAIS-IV Similarities sub-test was 0.31. The correlation between the IVR digit span forward and backward and the in-person administration was 0.46. We discuss how valid and useful IVR systems are for neuropsychological testing in the elderly.

  10. Speech Rate Control for Improving Elderly Speech Recognition of Smart Devices

    Directory of Open Access Journals (Sweden)

    SON, G.

    2017-05-01

    Full Text Available Although smart devices have become a widely adopted tool for communication in modern society, they still present a steep learning curve for the elderly. By introducing a voice-based interface using voice recognition technology, smart devices can become more user-friendly and useful to the elderly. However, the voice recognition technology used in current devices is attuned to the voice patterns of the young, so speech recognition falters when an elderly user speaks into the device. This paper identifies the elderly's improper per-syllable speech rate as a contributor to failures of the voice recognition system. Upon modifying the speech rate of each syllable, the voice recognition rate increased by 12.3%. This paper demonstrates that by simply modifying the per-syllable speech rate, which is one of the factors that cause errors in voice recognition, the recognition rate can be substantially increased. Such improvements in voice recognition technology can make it easier for the elderly to operate smart devices, allowing them to be more socially connected in a mobile world and to access information at their fingertips. It may also help bridge the communication divide between generations.

  11. Evaluation of MPEG-7-Based Audio Descriptors for Animal Voice Recognition over Wireless Acoustic Sensor Networks.

    Science.gov (United States)

    Luque, Joaquín; Larios, Diego F; Personal, Enrique; Barbancho, Julio; León, Carlos

    2016-05-18

    Environmental audio monitoring is a huge area of interest for biologists all over the world. This is why several audio monitoring systems have been proposed in the literature, which can be classified into two different approaches: acquisition and compression of all audio patterns in order to send them as raw data to a main server; or specific recognition systems based on audio patterns. The first approach has the drawback that a large amount of information must be stored in a main server. Moreover, this information requires a considerable amount of effort to analyze. The second approach has the drawback of a lack of scalability when new patterns need to be detected. To overcome these limitations, this paper proposes an environmental Wireless Acoustic Sensor Network architecture focused on the use of generic descriptors based on the MPEG-7 standard. These descriptors prove suitable for recognizing different patterns, allowing high scalability. The proposed parameters have been tested for recognizing different behaviors of two anuran species that live in Spanish natural parks, the Epidalea calamita and the Alytes obstetricans toads, and show high classification performance.

  12. NEW HOPE AND TECHNOLOGICAL ISSUES ON VOICE AND VIDEO OVER ATM

    Directory of Open Access Journals (Sweden)

    Dr.S.S.Riaz Ahamed

    2010-12-01

    Full Text Available Changes in the structure of the telecommunications industry and market conditions have brought new opportunities and challenges for network operators and public service providers. Networks that have been primarily focused on providing better voice services are evolving to meet new multimedia communications challenges and competitive pressures. Services based on asynchronous transfer mode (ATM) provide the flexible infrastructure essential for success in this evolving market. ATM, which was once envisioned as the technology of future public networks, is now a reality, with service providers around the world introducing and rolling out ATM and ATM-based services.

  13. Identification of Alfalfa Leaf Diseases Using Image Recognition Technology

    Science.gov (United States)

    Qin, Feng; Liu, Dongxia; Sun, Bingda; Ruan, Liu; Ma, Zhanhong; Wang, Haiguang

    2016-01-01

    Common leaf spot (caused by Pseudopeziza medicaginis), rust (caused by Uromyces striatus), Leptosphaerulina leaf spot (caused by Leptosphaerulina briosiana) and Cercospora leaf spot (caused by Cercospora medicaginis) are the four common types of alfalfa leaf diseases. Timely and accurate diagnoses of these diseases are critical for disease management, alfalfa quality control and the healthy development of the alfalfa industry. In this study, the identification and diagnosis of the four types of alfalfa leaf diseases were investigated using pattern recognition algorithms based on image-processing technology. A sub-image with one or multiple typical lesions was obtained by artificial cutting from each acquired digital disease image. Then the sub-images were segmented using twelve lesion segmentation methods integrated with clustering algorithms (including K_means clustering, fuzzy C-means clustering and K_median clustering) and supervised classification algorithms (including logistic regression analysis, Naive Bayes algorithm, classification and regression tree, and linear discriminant analysis). After a comprehensive comparison, the segmentation method integrating the K_median clustering algorithm and linear discriminant analysis was chosen to obtain lesion images. After the lesion segmentation using this method, a total of 129 texture, color and shape features were extracted from the lesion images. Based on the features selected using three methods (ReliefF, 1R and correlation-based feature selection), disease recognition models were built using three supervised learning methods, including the random forest, support vector machine (SVM) and K-nearest neighbor methods. A comparison of the recognition results of the models was conducted. The results showed that when the ReliefF method was used for feature selection, the SVM model built with the most important 45 features (selected from a total of 129 features) was the optimal model. For this SVM model, the

  14. Making women's voices heard: technological change and women's employment in Malaysia.

    Science.gov (United States)

    Ng Choon Sim, C

    1999-01-01

    This paper examines the 1994-96 UN University Institute for New Technologies policy research project on technological change and women's employment in Asia. The project was conducted to provide a voice for nongovernmental organizations (NGOs) representing women workers. It focuses on the Malaysian experience in terms of the impact of technology on women's work and employment in the telecommunications and electronics industries. The results of the NGO research project revealed that the shift to more intensive production has no uniform impact on women. Although new jobs were created, women's employment status remains vulnerable: female workers fear technological redundancy, casualization of labor, and the health and safety hazards associated with new technology. The situation in Malaysia is a good example of the effect of industrialization on women's rights. Although cutting-edge technology, combined with restructuring, has yielded some benefits in terms of a vastly expanded network and services, better performance and economies of scale, the majority of women remained employed in the low-skilled or semi-skilled categories. To upgrade women's employment status along with technological advancement, open communication and cooperation of all types are needed to ensure a successful outcome.

  15. Intelligent Facial Recognition Systems: Technology advancements for security applications

    Energy Technology Data Exchange (ETDEWEB)

    Beer, C.L.

    1993-07-01

    Insider problems such as theft and sabotage can occur within the security and surveillance realm of operations when unauthorized people obtain access to sensitive areas. A possible solution to these problems is a means to identify individuals (not just credentials or badges) in a given sensitive area and provide full time personnel accountability. One approach desirable at Department of Energy facilities for access control and/or personnel identification is an Intelligent Facial Recognition System (IFRS) that is non-invasive to personnel. Automatic facial recognition does not require the active participation of the enrolled subjects, unlike most other biological measurement (biometric) systems (e.g., fingerprint, hand geometry, or eye retinal scan systems). It is this feature that makes an IFRS attractive for applications other than access control such as emergency evacuation verification, screening, and personnel tracking. This paper discusses current technology that shows promising results for DOE and other security applications. A survey of research and development in facial recognition identified several companies and universities that were interested and/or involved in the area. A few advanced prototype systems were also identified. Sandia National Laboratories is currently evaluating facial recognition systems that are in the advanced prototype stage. The initial application for the evaluation is access control in a controlled environment with a constant background and with cooperative subjects. Further evaluations will be conducted in a less controlled environment, which may include a cluttered background and subjects that are not looking towards the camera. The outcome of the evaluations will help identify areas of facial recognition systems that need further development and will help to determine the effectiveness of the current systems for security applications.

  16. A meta-analysis of in-vehicle and nomadic voice-recognition system interaction and driving performance.

    Science.gov (United States)

    Simmons, Sarah M; Caird, Jeff K; Steel, Piers

    2017-09-01

    Driver distraction is a growing and pervasive issue that requires multiple solutions. Voice-recognition (V-R) systems may decrease the visual-manual (V-M) demands of a wide range of in-vehicle system and smartphone interactions. However, the degree that V-R systems integrated into vehicles or available in mobile phone applications affect driver distraction is incompletely understood. A comprehensive meta-analysis of experimental studies was conducted to address this knowledge gap. To meet study inclusion criteria, drivers had to interact with a V-R system while driving and doing everyday V-R tasks such as dialing, initiating a call, texting, emailing, destination entry or music selection. Coded dependent variables included detection, reaction time, lateral position, speed and headway. Comparisons of V-R systems with baseline driving and/or a V-M condition were also coded. Of 817 identified citations, 43 studies involving 2000 drivers and 183 effect sizes (r) were analyzed in the meta-analysis. Compared to baseline, driving while interacting with a V-R system is associated with increases in reaction time and lane positioning, and decreases in detection. When V-M systems were compared to V-R systems, drivers had slightly better performance with the latter system on reaction time, lane positioning and headway. Although V-R systems have some driving performance advantages over V-M systems, they have a distraction cost relative to driving without any system at all. The pattern of results indicates that V-R systems impose moderate distraction costs on driving. In addition, drivers minimally engage in compensatory performance adjustments such as reducing speed and increasing headway while using V-R systems. Implications of the results for theory, design guidelines and future research are discussed. Copyright © 2017 Elsevier Ltd. All rights reserved.

  17. Innovative Technology for the Assisted Delivery of Intensive Voice Treatment (LSVT[R]LOUD) for Parkinson Disease

    Science.gov (United States)

    Halpern, Angela E.; Ramig, Lorraine O.; Matos, Carlos E. C.; Petska-Cable, Jill A.; Spielman, Jennifer L.; Pogoda, Janice M.; Gilley, Phillip M.; Sapir, Shimon; Bennett, John K.; McFarland, David H.

    2012-01-01

    Purpose: To assess the feasibility and effectiveness of a newly developed assistive technology system, Lee Silverman Voice Treatment Companion (LSVT[R] Companion[TM], hereafter referred to as "Companion"), to support the delivery of LSVT[R]LOUD, an efficacious speech intervention for individuals with Parkinson disease (PD). Method: Sixteen…

  19. Infrared facial recognition technology being pushed toward emerging applications

    Science.gov (United States)

    Evans, David C.

    1997-02-01

    Human identification is a two-step process of initial identity assignment and later verification or recognition. The positive identification requirement is a major part of the classic security, legal, banking, and police task of granting or denying access to a facility, authority to take an action or, in police work, to identify or verify the identity of an individual. To meet this requirement, a three-part research and development (R&D) effort was undertaken by Betac International Corporation, through its subsidiaries of Betac Corporation and Technology Recognition Systems, to develop an automated access control system using infrared (IR) facial images to verify the identity of an individual in real time. The system integrates IR facial imaging and a computer-based matching algorithm to perform the human recognition task rapidly, accurately, and nonintrusively, based on three basic principles: every human IR facial image (or thermogram) is unique to that individual; an IR camera can be used to capture human thermograms; and captured thermograms can be digitized, stored, and matched using a computer and mathematical algorithms. The first part of the development effort, an operator-assisted IR image matching proof-of-concept demonstration, was successfully completed in the spring of 1994. The second part of the R&D program, the design and evaluation of a prototype automated access control unit using the IR image matching technology, was completed in April 1995. This paper describes the final development effort to identify, assess, and evaluate the availability and suitability of robust image matching algorithms capable of supporting and enhancing the use of IR facial recognition technology. The most promising mature and available image matching algorithm was integrated into a demonstration access control unit (ACU) using a state-of-the-art IR imager and a performance evaluation was compared against that of a prototype automated ACU using a less robust algorithm and a

  20. Speaker Recognition

    DEFF Research Database (Denmark)

    Mølgaard, Lasse Lohilahti; Jørgensen, Kasper Winther

    2005-01-01

    Speaker recognition is basically divided into speaker identification and speaker verification. Verification is the task of automatically determining if a person really is the person he or she claims to be. This technology can be used as a biometric feature for verifying the identity of a person...... in applications like banking by telephone and voice mail. The focus of this project is speaker identification, which consists of mapping a speech signal from an unknown speaker to a database of known speakers, i.e. the system has been trained with a number of speakers which the system can recognize....
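
    The distinction drawn in the record above between identification and verification can be illustrated with a minimal sketch. It assumes each speaker is already represented by a fixed-length feature vector; the cosine similarity score, the function names and the 0.9 threshold are hypothetical choices for illustration, not the method used in the project.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def identify(sample, enrolled):
    """Identification: map an unknown sample to the closest enrolled speaker."""
    return max(enrolled, key=lambda name: cosine(sample, enrolled[name]))

def verify(sample, claimed, enrolled, threshold=0.9):
    """Verification: accept or reject a claimed identity (threshold is assumed)."""
    return cosine(sample, enrolled[claimed]) >= threshold

# Toy database of known speakers (hypothetical two-dimensional features).
enrolled = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
sample = [0.9, 0.1]
print(identify(sample, enrolled))
print(verify(sample, "alice", enrolled))
print(verify(sample, "bob", enrolled))
```

    Identification always returns some enrolled speaker, whereas verification answers only yes or no for the single claimed identity.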

  1. Voice integrated systems

    Science.gov (United States)

    Curran, P. Mike

    1977-01-01

    The program at Naval Air Development Center was initiated to determine the desirability of interactive voice systems for use in airborne weapon systems crew stations. A voice recognition and synthesis system (VRAS) was developed and incorporated into a human centrifuge. The speech recognition aspect of VRAS was developed using a voice command system (VCS) developed by Scope Electronics. The speech synthesis capability was supplied by a Votrax, VS-5, speech synthesis unit built by Vocal Interface. The effects of simulated flight on automatic speech recognition were determined by repeated trials in the VRAS-equipped centrifuge. The relationship of vibration, G, O2 mask, mission duration, and cockpit temperature and voice quality was determined. The results showed that: (1) voice quality degrades after 0.5 hours with an O2 mask; (2) voice quality degrades under high vibration; and (3) voice quality degrades under high levels of G. The voice quality studies are summarized. These results were obtained with a baseline of 80 percent recognition accuracy with VCS.

  2. New Ideas for Speech Recognition and Related Technologies

    Energy Technology Data Exchange (ETDEWEB)

    Holzrichter, J F

    2002-06-17

    The ideas relating to the use of organ motion sensors for the purposes of speech recognition were first described by the author in spring 1994. During the past year, a series of productive collaborations between the author, Tom McEwan and Larry Ng ensued and have led to demonstrations, new sensor ideas, and algorithmic descriptions of a large number of speech recognition concepts. This document summarizes the basic concepts of recognizing speech once organ motions have been obtained. Micro power radars and their uses for the measurement of body organ motions, such as those of the heart and lungs, have been demonstrated by Tom McEwan over the past two years. McEwan and I conducted a series of experiments, using these instruments, on vocal organ motions beginning in late spring, during which we observed motions of vocal folds (i.e., cords), tongue, jaw, and related organs that are very useful for speech recognition and other purposes. These will be reviewed in a separate paper. Since late summer 1994, Lawrence Ng and I have worked to make many of the initial recognition ideas more rigorous and to investigate the applications of these new ideas to new speech recognition algorithms, to speech coding, and to speech synthesis. I introduce some of those ideas in section IV of this document, and we describe them more completely in the document following this one, UCRL-UR-120311. For the design and operation of micro-power radars and their application to body organ motions, the reader may contact Tom McEwan directly. The capability for using EM sensors (i.e., radar units) to measure body organ motions and positions has been available for decades. Impediments to their use appear to have been size, excessive power, lack of resolution, and lack of understanding of the value of organ motion measurements, especially as applied to speech related technologies. However, with the invention of very low power, portable systems as demonstrated by McEwan at LLNL researchers have begun

  3. Adoption of Speech Recognition Technology in Community Healthcare Nursing.

    Science.gov (United States)

    Al-Masslawi, Dawood; Block, Lori; Ronquillo, Charlene

    2016-01-01

    Adoption of new health information technology has been shown to be challenging. However, the degree to which new technology will be adopted can be predicted by measures of usefulness and ease of use. This work focuses on these key determining factors in the design of a wound documentation tool. In the context of wound care at home, consistent with evidence in the literature from similar settings, use of Speech Recognition Technology (SRT) for patient documentation has shown promise. To achieve a user-centred design, the results of ethnographic fieldwork are used to inform SRT features; furthermore, exploratory prototyping is used to collect feedback about the wound documentation tool from home care nurses. During this study, measures developed for healthcare applications of the Technology Acceptance Model will be used to identify SRT features that improve usefulness (e.g. increased accuracy, saving time) or ease of use (e.g. lowering mental/physical effort, easy-to-remember tasks). The identified features will be used to create a low-fidelity prototype that will be evaluated in future experiments.

  4. Voice-Controlled Artificial Handspeak System

    Directory of Open Access Journals (Sweden)

    Carlo Fonda

    2014-01-01

    Full Text Available A man-machine interaction project is described which aims to establish an automated voice-to-sign-language translator for communication with the deaf using integrated open technologies. The first prototype consists of a robotic hand, designed with OpenSCAD and manufactured with a low-cost 3D printer, which smoothly reproduces the alphabet of the sign language under voice control alone. The core automation comprises an Arduino UNO controller that activates a set of servo motors following instructions from a Raspberry Pi mini-computer running the open-source speech recognition engine Julius. We discuss its features, limitations and possible future developments.

  5. Giving children voice in the design of technology for education in the developing world

    Directory of Open Access Journals (Sweden)

    Helene Gelderblom

    2014-10-01

    Full Text Available Of the numerous projects that involve ICTs to solve the problems of the developing world, many are unsuccessful. Reasons include lack of attention to how the human and social systems need to adapt to the new technologies, problems with the intent of the initiators, and lack of user involvement. Focusing on the design of ICT for education and acknowledging the range of complex reasons for possible failure, this article focuses on the lack of involvement of end users (specifically children) in the design and development of ICT solutions. Children in the developing world are not given a voice when it comes to the design of technology aimed at providing them with better education. Through examination of the concept of “children’s voice” as well as through discussion of a practical design case to support underprivileged children in South Africa, this article shows that (1) listening to children requires that adult co-designers have the correct attitude towards their child partners and that they are committed to really hearing them; (2) power relations and context play an important role in the contribution children can make; and (3) South African children have the ability to provide essential input into the design of technology for education.

  6. Speech Recognition: A World of Opportunities

    Science.gov (United States)

    PACER Center, 2004

    2004-01-01

    Speech recognition technology helps people with disabilities interact with computers more easily. People with motor limitations, who cannot use a standard keyboard and mouse, can use their voices to navigate the computer and create documents. The technology is also useful to people with learning disabilities who experience difficulty with spelling…

  7. 17 Ways to Say Yes: Toward Nuanced Tone of Voice in AAC and Speech Technology

    OpenAIRE

    Pullin, Graham; Hennig, Shannon

    2015-01-01

    People with complex communication needs who use speech-generating devices have very little expressive control over their tone of voice. Despite its importance in human interaction, however, the issue of tone of voice remains all but absent from AAC research and development. In this paper, we describe three interdisciplinary projects, past, present and future: The critical design collection Six Speaking Chairs has provoked deeper discussion and inspired a social model of tone of voice;...

  8. Voice, Schooling, Inequality, and Scale

    Science.gov (United States)

    Collins, James

    2013-01-01

    The rich studies in this collection show that the investigation of voice requires analysis of "recognition" across layered spatial-temporal and sociolinguistic scales. I argue that the concepts of voice, recognition, and scale provide insight into contemporary educational inequality and that their study benefits, in turn, from paying attention to…

  9. FUNDAMENTALS OF SPEAKER RECOGNITION

    OpenAIRE

    ERTAŞ, Figen

    2000-01-01

    The explosive growth of information technology in the last decade has made a considerable impact on the design and construction of systems for human-machine communication, which is becoming increasingly important in many aspects of life. Amongst other speech processing tasks, a great deal of attention has been devoted to developing procedures that identify people from their voices, and the design and construction of speaker recognition systems has been a fascinating enterprise pursued over ma...

  10. Voice-coil technology for the E-ELT M4 Adaptive Unit

    Science.gov (United States)

    Gallieni, D.; Tintori, M.; Mantegazza, M.; Anaclerio, E.; Crimella, L.; Acerboni, M.; Biasi, R.; Angerer, G.; Andrigettoni, M.; Merler, A.; Veronese, D.; Carel, J.-L.; Marque, G.; Molinari, E.; Tresoldi, D.; Toso, G.; Spanó, P.; Riva, M.; Mazzoleni, R.; Riccardi, A.; Mantegazza, P.; Manetti, M.; Morandini, M.; Vernet, E.; Hubin, N.; Jochum, L.; Madec, P.; Dimmler, M.; Koch, F.

    We present our design of the E-ELT M4 Adaptive Unit based on voice-coil driven deformable mirror technology. This technology was developed by the INAF-Arcetri, Microgate and ADS team over the past 15 years and has been adopted by a number of large ground-based telescopes such as the MMT, LBT, Magellan and, most recently, the VLT in the frame of the Adaptive Telescope Facility project. Our design is based on contactless force actuators made of permanent magnets glued on the back of the deformable mirror and coils mounted on a stiff reference structure. We use capacitive sensors to close a position loop co-located with each actuator. Dedicated high-performance parallel processors implement the local decentralized control at actuator level and a centralized feed-forward computation of all the actuator forces. In our previous systems this achieved dynamic performance well in line with the requirements of the M4 Adaptive Unit (M4AU) case. The actuator density of our design is on the order of 30-mm spacing, for a total of about 6000 actuators on the M4AU, which fulfills the fitting error and correction requirements of the E-ELT high-order DM. Moreover, our contactless technology makes the deformable mirror tolerant to up to 5% actuator failures without impairing the system's ability to reach its specified performance, besides allowing large mechanical tolerances between the reference structure and the deformable mirror. Finally, we present the Demonstration Prototype we are building in the frame of the M4AU Phase B study to measure the optical dynamic performance predicted by our design. This prototype will be fully representative of the M4AU features; in particular, it will address the controllability of two adjacent segments of the 2-mm thick mirror and implement the actuator "brick" modular concept that has been adopted to dramatically improve the maintainability of the final unit.

  11. Face and Voice Recognition Algorithms of Sign-in System for Underground Coalmine

    Institute of Scientific and Technical Information of China (English)

    王君; 李成武; 杨茜; 刘世森

    2012-01-01

    Safety accidents occur from time to time in coal mines. A sign-in management system can track, promptly and accurately, which personnel have entered or left the mine, which supports safe production and timely rescue. When traditional sign-in management systems are used in mines, dim lighting, dust adhering to faces, and interfering noise lower the detection rate of sign-in recognition. To address this, the paper proposes a face recognition method that combines the Karhunen-Loeve (KL) transform with a Tree-Augmented Naive Bayesian network (TAN) classifier, supplemented by voice recognition. Morphological filtering quickly removes most of the useless background, which speeds up processing and makes feature points more prominent. The system automatically selects image recognition or voice recognition according to the specific environment, which raises recognition accuracy. Simulation results show that the method combined with voice recognition reduces computational complexity, improves the personnel recognition rate, and enhances adaptability.

  12. Validity of jitter measures in non-quasi-periodic voices. Part I: perceptual and computer performances in cycle pattern recognition.

    Science.gov (United States)

    Dejonckere, Philippe; Schoentgen, Jean; Giordano, Andrea; Fraj, Samia; Bocchi, Leonardo; Manfredi, Claudia

    2011-07-01

    The limit of about 5% for reliable quantification of jitter in sustained vowels of dysphonic voices, a widely accepted guideline, deserves critical analysis. The present study pertains to the effect of experience and training on the perceptual (visual) capability of correctly identifying periods in (highly) perturbed signals, and to a comparison of the performance of several programs for voice analysis. Synthesized realistic vowels (/a:/) with exactly known jitter (2.7%-31.5%) are used as material. After selection and training, experienced raters demonstrate excellent agreement in correctly identifying periods up to high values of input jitter. Perceptual rating outperforms all computer programs in accuracy. Most programs remain reliable up to 10% jitter; one of them measures correctly up to the highest level.
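
    As a reading aid for the jitter percentages quoted in the record above, the following minimal sketch computes one commonly used measure, local jitter: the mean absolute difference between consecutive cycle lengths divided by the mean cycle length. Whether the study uses exactly this definition is an assumption, and the function name and sample values are hypothetical.

```python
def jitter_percent(periods):
    """Local jitter in percent for a sequence of glottal cycle lengths.

    Assumed definition: mean absolute difference between consecutive
    periods, divided by the mean period, times 100.
    """
    if len(periods) < 2:
        raise ValueError("need at least two periods")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100.0 * mean_diff / mean_period

# Hypothetical cycle lengths in milliseconds.
print(jitter_percent([10.0, 10.0, 10.0]))       # perfectly periodic voice
print(jitter_percent([9.0, 11.0, 9.0, 11.0]))   # strongly perturbed voice
```

    The measure depends entirely on the cycle boundaries being marked correctly, which is the step the study evaluates for human raters and analysis programs.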

  13. An Overview of Optical Character Recognition (OCR) Technology and Techniques.

    Science.gov (United States)

    1978-06-01

    using optico-electric filters. In practice, the choice of preprocessing techniques must necessarily be related to the recognition method. For example...interface), Recognition Unit and Operator Communication device. A brief description of each major system component including the Input Sensor follows...accomplish several types of mark-sensor recognition. Together, they are capable of recognizing machine and handprinted data intermixed on the same line.

  14. Smartphone App for Voice Disorders

    Science.gov (United States)

    Feature: Taste, Smell, Hearing, Language, Voice, Balance. Past Issues / Fall 2013 ... developed a mobile monitoring device that relies on smartphone technology to gather a week's worth of talking, ...

  15. FUNDAMENTALS OF SPEAKER RECOGNITION

    Directory of Open Access Journals (Sweden)

    Figen ERTAŞ

    2000-02-01

    Full Text Available The explosive growth of information technology in the last decade has made a considerable impact on the design and construction of systems for human-machine communication, which is becoming increasingly important in many aspects of life. Amongst other speech processing tasks, a great deal of attention has been devoted to developing procedures that identify people from their voices, and the design and construction of speaker recognition systems has been a fascinating enterprise pursued over many decades. This paper introduces speaker recognition in general and discusses its relevant parameters in relation to system performance.

  16. A Voice-Activated, Interactive Videodisc Case Study for Use in the Medical School Classroom.

    Science.gov (United States)

    Harless, William G.; And Others

    1986-01-01

    The Technological Innovations in Medical Education (TIME) Project of the Lister Hill National Center for Biomedical Communications is exploring the use of interactive videodisc, microcomputer, and voice recognition technology to create interactive case studies of simulated patients to train second-year medical students in the introduction to…

  18. Facial Recognition Technology: An analysis with scope in India

    CERN Document Server

    Thorat, S B; Dandale, Jyoti P

    2010-01-01

    A facial recognition system is a computer application for automatically identifying or verifying a person from a digital image or a video frame from a video source. One way to do this is by comparing selected facial features from the image with a facial database. It is typically used in security systems and can be compared to other biometrics such as fingerprint or iris recognition systems. In this paper we focus on 3-D facial recognition systems and biometric facial recognition systems. We critique facial recognition systems, describing their effectiveness and weaknesses. This paper also introduces the scope of recognition systems in India.

  19. Mandarin recognition over the telephone

    Science.gov (United States)

    Kao, Yuhung

    1996-06-01

    Mandarin Chinese is the official language in China and Taiwan; it is the native language of a quarter of the world population. As the services enabled by speech recognition technology (e.g. telephone voice dialing, information query) become more popular in English, we would like to extend this capability to other languages. Mandarin is one of the major languages under research in our laboratory. This paper describes how we extend our work in English speech recognition into Mandarin. We describe the corpus, Voice Across Taiwan; the training of a complete set of Mandarin syllable models; preliminary performance results; and error analysis. A fast prototyping system was built in which a user can write any context-free grammar with no restriction on vocabulary, and the grammar can then be compiled into recognition models. It enables users to quickly test the performance of a new vocabulary.

  20. Privacy in the Face of Surveillance: Fourth Amendment Considerations for Facial Recognition Technology

    Science.gov (United States)

    2015-03-01

    five decades later, are: “head rotation and tilt, lighting intensity and angle, facial expression, aging, etc.”51 47 Charlie Savage, “Facial ...OF SURVEILLANCE: FOURTH AMENDMENT CONSIDERATIONS FOR FACIAL RECOGNITION TECHNOLOGY by Eric Z. Wynn March 2015 Thesis Advisor: Carolyn... FACIAL RECOGNITION TECHNOLOGY 6. AUTHOR(S) Eric Z. Wynn 7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES) Naval Postgraduate School Monterey

  1. Advances in Speech Recognition

    CERN Document Server

    Neustein, Amy

    2010-01-01

    This volume comprises contributions from eminent leaders in the speech industry, and presents a comprehensive and in-depth analysis of the progress of speech technology in the topical areas of mobile settings, healthcare and call centers. The material addresses the technical aspects of voice technology within the framework of societal needs, such as the use of speech recognition software to produce up-to-date electronic health records, notwithstanding patients making changes to health plans and physicians. Included will be discussion of speech engineering, linguistics, human factors ana

  2. Using Automatic Speech Recognition Technology with Elicited Oral Response Testing

    Science.gov (United States)

    Cox, Troy L.; Davies, Randall S.

    2012-01-01

    This study examined the use of automatic speech recognition (ASR) scored elicited oral response (EOR) tests to assess the speaking ability of English language learners. It also examined the relationship between ASR-scored EOR and other language proficiency measures and the ability of the ASR to rate speakers without bias to gender or native…

  3. CCD camera automatic calibration technology and ellipse recognition algorithm

    Institute of Scientific and Technical Information of China (English)

    Changku Sun; Xiaodong Zhang; Yunxia Qu

    2005-01-01

    A novel two-dimensional (2D) pattern used in camera calibration is presented. With one feature circle located at the center, an array of circles is photo-etched on this pattern. An ellipse recognition algorithm is proposed to implement the acquisition of interest calibration points without human intervention. According to the circle arrangement of the pattern, the relation between three-dimensional (3D) and 2D coordinates of these points can be established automatically and accurately. These calibration points are computed for intrinsic parameters calibration of charge-coupled device (CCD) camera with Tsai method. A series of experiments have shown that the algorithm is robust and reliable with the calibration error less than 0.4 pixel. This new calibration pattern and ellipse recognition algorithm can be widely used in computer vision.

  4. Review of Speech-to-Text Recognition Technology for Enhancing Learning

    Science.gov (United States)

    Shadiev, Rustam; Hwang, Wu-Yuin; Chen, Nian-Shing; Huang, Yueh-Min

    2014-01-01

    This paper reviewed literature from 1999 to 2014 inclusively on how Speech-to-Text Recognition (STR) technology has been applied to enhance learning. The first aim of this review is to understand how STR technology has been used to support learning over the past fifteen years, and the second is to analyze all research evidence to understand how…

  5. The Affordance of Speech Recognition Technology for EFL Learning in an Elementary School Setting

    Science.gov (United States)

    Liaw, Meei-Ling

    2014-01-01

    This study examined the use of speech recognition (SR) technology to support a group of elementary school children's learning of English as a foreign language (EFL). SR technology has been used in various language learning contexts. Its application to EFL teaching and learning is still relatively recent, but a solid understanding of its…

  8. Investigation on the Realization of Voice Technology in the Early Stage of LTE Construction

    Institute of Scientific and Technical Information of China (English)

    2013-01-01

    Based on an analysis of the technical characteristics of voice over IP (VoIP), this paper discusses the key factors, such as voice protocols, speech coding, delay and jitter, that should be considered when VoIP is carried over LTE. According to the main technical characteristics of LTE, the ultimate solution for voice over LTE is IMS, while the initial solution is a dual-standby terminal or CSFB.

  9. Survey of Commercial Technologies for Face Recognition in Video

    Science.gov (United States)

    2014-09-01

    search facial components, identify a gestalt face 11 and compare it to a stored set of facial characteristics of known human faces. 3.2 Recognition System...theorize that a face is not merely a set of facial features but is rather something meaningful in its form. This is consistent with the Gestalt theory that...an image is seen in its entirety, not by its individual parts. Hence, the “gestalt face” refers to a holistic representation of face. Gestalt’s theory

  10. Study of power customer service quality detection and analysis based on voice analysis technology

    Institute of Scientific and Technical Information of China (English)

    王大伟

    2014-01-01

    Speech analysis technology uses speech recognition as its core technology to convert unstructured voice information into a structured index, enabling knowledge mining of massive recording and audio files. To further improve power customer service quality, voice analysis is applied to customer service recordings to assist quality inspection and continually raise customer satisfaction. This paper studies the systematic mining of the large volume of customer service recordings stored by call centers, applying speech analysis technology along dimensions such as speech recognition and sentiment analysis. Comprehensive mining and analysis of user behavior data supports an accurate grasp of customer needs and timely, well-founded market decisions.

  11. Voice application development for Android

    CERN Document Server

    McTear, Michael

    2013-01-01

    This book will give beginners an introduction to building voice-based applications on Android. It will begin by covering the basic concepts and will build up to creating a voice-based personal assistant. By the end of this book, you should be in a position to create your own voice-based applications on Android from scratch in next to no time. Voice Application Development for Android is for all those who are interested in speech technology and for those who, as owners of Android devices, are keen to experiment with developing voice apps for their devices. It will also be useful as a starting po

  12. Voice-Controlled Artificial Handspeak System

    Directory of Open Access Journals (Sweden)

    Jonathan Gatti

    2014-04-01

    Full Text Available A man-machine interaction project is described which aims to establish an automated voice to sign language translator for communication with the deaf using integrated open technologies. The first prototype consists of a robotic hand designed with OpenSCAD and manufactured with a low-cost 3D printer which smoothly reproduces the alphabet of the sign language controlled by voice only. The core automation comprises an Arduino UNO controller used to activate a set of servo motors that follow instructions from a Raspberry Pi mini-computer having installed the open source speech recognition engine Julius. We discuss its features, limitations and possible future developmen

  13. Survey of face recognition technology

    Institute of Scientific and Technical Information of China (English)

    何春

    2016-01-01

    This paper first introduces face recognition technology, then reviews the development of face recognition research and the basic classification of recognition methods. It then discusses the current mainstream face recognition methods in detail and, finally, sets out the problems facing face recognition technology and directions for future research.

  14. The expression and recognition of emotions in the voice across five nations: A lens model analysis based on acoustic features.

    Science.gov (United States)

    Laukka, Petri; Elfenbein, Hillary Anger; Thingujam, Nutankumar S; Rockstuhl, Thomas; Iraki, Frederick K; Chui, Wanda; Althoff, Jean

    2016-11-01

    This study extends previous work on emotion communication across cultures with a large-scale investigation of the physical expression cues in vocal tone. In doing so, it provides the first direct test of a key proposition of dialect theory, namely that greater accuracy of detecting emotions from one's own cultural group (known as in-group advantage) results from a match between culturally specific schemas in emotional expression style and culturally specific schemas in emotion recognition. Study 1 used stimuli from 100 professional actors from five English-speaking nations vocally conveying 11 emotional states (anger, contempt, fear, happiness, interest, lust, neutral, pride, relief, sadness, and shame) using standard-content sentences. Detailed acoustic analyses showed many similarities across groups, and yet also systematic group differences. This provides evidence for cultural accents in expressive style at the level of acoustic cues. In Study 2, listeners evaluated these expressions in a 5 × 5 design balanced across groups. Cross-cultural accuracy was greater than expected by chance. However, there was also in-group advantage, which varied across emotions. A lens model analysis of fundamental acoustic properties examined patterns in emotional expression and perception within and across groups. Acoustic cues were used relatively similarly across groups both to produce and judge emotions, and yet there were also subtle cultural differences. Speakers appear to have a culturally nuanced schema for enacting vocal tones via acoustic cues, and perceivers have a culturally nuanced schema in judging them. Consistent with dialect theory's prediction, in-group judgments showed a greater match between these schemas used for emotional expression and perception. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  15. Frequency and analysis of non-clinical errors made in radiology reports using the National Integrated Medical Imaging System voice recognition dictation software.

    Science.gov (United States)

    Motyer, R E; Liddy, S; Torreggiani, W C; Buckley, O

    2016-11-01

    Voice recognition (VR) dictation of radiology reports has become the mainstay of reporting in many institutions worldwide. Despite its benefits, such software is not without limitations, and transcription errors have been widely reported. The aim was to evaluate the frequency and nature of non-clinical transcription errors using VR dictation software. A retrospective audit of 378 finalised radiology reports was performed. Errors were counted and categorised by significance, error type and sub-type. Data regarding imaging modality, report length and dictation time were collected. 67 (17.72%) reports contained ≥1 errors, with 7 (1.85%) containing 'significant' and 9 (2.38%) containing 'very significant' errors. A total of 90 errors were identified from the 378 reports analysed, with 74 (82.22%) classified as 'insignificant', 7 (7.78%) as 'significant' and 9 (10%) as 'very significant'. 68 (75.56%) errors were 'spelling and grammar', 20 (22.22%) 'missense' and 2 (2.22%) 'nonsense'. 'Punctuation' error was the most common sub-type, accounting for 27 errors (30%). Complex imaging modalities had higher error rates per report and per sentence. Computed tomography contained 0.040 errors per sentence compared to plain film with 0.030. Longer reports had a higher error rate, with reports of >25 sentences containing an average of 1.23 errors per report compared to 0.09 for reports of 0-5 sentences. These findings highlight the limitations of VR dictation software. While most errors were deemed insignificant, there were occurrences of error with the potential to alter report interpretation and patient management. Longer reports and reports on more complex imaging had higher error rates, and this should be taken into account by the reporting radiologist.
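    The percentages quoted in this abstract can be reproduced from the raw counts it gives. The short Python sketch below is only an arithmetic check; the counts are taken from the abstract, while the function and variable names are illustrative and not part of the study.

```python
# Arithmetic check of the percentages reported in the abstract.
# Counts come from the abstract; all names here are illustrative.
TOTAL_REPORTS = 378
TOTAL_ERRORS = 90


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Share of reports containing at least one error.
reports_with_errors = pct(67, TOTAL_REPORTS)

# Errors broken down by significance (counts sum to 90).
significance_counts = {"insignificant": 74, "significant": 7, "very significant": 9}
by_significance = {k: pct(n, TOTAL_ERRORS) for k, n in significance_counts.items()}

# Errors broken down by type (counts sum to 90).
type_counts = {"spelling and grammar": 68, "missense": 20, "nonsense": 2}
by_type = {k: pct(n, TOTAL_ERRORS) for k, n in type_counts.items()}

if __name__ == "__main__":
    print(reports_with_errors)
    print(by_significance)
    print(by_type)
```

    Both breakdowns sum to the 90 errors reported, so the two categorisations are mutually consistent.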

  16. The Poetics of "Pattern Recognition": William Gibson's Shifting Technological Subject

    Science.gov (United States)

    Wetmore, Alex

    2007-01-01

    William Gibson's 1984 cyberpunk novel "Neuromancer" continues to be a touchstone in cultural representations of the impact of new information and communication technologies on the self. As critics have noted, the posthumanist, capital-driven, urban landscape of "Neuromancer" resembles a Foucaultian vision of a panoptically engineered social space…

  17. Health Care in Home Automation Systems with Speech Recognition and Mobile Technology

    Directory of Open Access Journals (Sweden)

    Jasmin Kurti

    2016-08-01

    Full Text Available Home automation systems use technology to facilitate the lives of the people using them, and they are especially useful for assisting the elderly and persons with special needs. These kinds of systems have been a popular research subject in the last few years. In this work, I present the design and development of a system that provides a life assistant service in a home environment: a smart home-based healthcare system controlled with speech recognition and mobile technology. This includes developing software with speech recognition, speech synthesis, face recognition, controls for Arduino hardware, and a smartphone application for remotely controlling the system. With the developed system, the elderly and persons with special needs can live independently and securely in their own homes with care facilities. This system is tailored towards the elderly and disabled, but it can also be embedded in any home and used by anybody. It provides healthcare, security, entertainment, and total local and remote control of the home.

  18. Voice Disorders

    Science.gov (United States)

    Voice is the sound made by air passing from your lungs through your larynx, or voice box. In your larynx are your vocal cords, ... to make sound. For most of us, our voices play a big part in who we are, ...

  19. Every Voice

    Science.gov (United States)

    Patrick, Penny

    2008-01-01

    This article discusses how the author develops an approach that allows her students, who are part of the marginalized population, to learn the power of their own voices--not just their writing voices, but their oral voices as well. The author calls it "TWIST": Thoughts, Writing folder, Inquiring mind, Supplies, and Teamwork. It is where…

  1. Voice restoration

    NARCIS (Netherlands)

    Hilgers, F.J.M.; Balm, A.J.M.; van den Brekel, M.W.M.; Tan, I.B.; Remacle, M.; Eckel, H.E.

    2010-01-01

    Surgical prosthetic voice restoration is the best possible option for patients to regain oral communication after total laryngectomy. It is considered to be the present "gold standard" for voice rehabilitation of laryngectomized individuals. Surgical prosthetic voice restoration, in essence, is alwa

  2. Ethnographic Voice Memo Narratives

    DEFF Research Database (Denmark)

    Rasmussen, Mette Apollo; Conradsen, Maria Bosse

    1800-01-01

    -based technique which actively involves actors in producing ethnography-based data concerning their everyday practice. With the help of smartphone technology it is possible to complement ethnography-based research methods when involving the actors and having them create small voice memo narratives. The voice...... memos create insights of actors' everyday practice, without the direct presence of a researcher and could be considered a step towards meeting the dilemmas of research in complex fieldwork settings....

  3. Automatic Speech Recognition Technology as an Effective Means for Teaching Pronunciation

    Science.gov (United States)

    Elimat, Amal Khalil; AbuSeileek, Ali Farhan

    2014-01-01

    This study aimed to explore the effect of using automatic speech recognition technology (ASR) on the third grade EFL students' performance in pronunciation, whether teaching pronunciation through ASR is better than regular instruction, and the most effective teaching technique (individual work, pair work, or group work) in teaching pronunciation…

  4. Developing and Evaluating an Oral Skills Training Website Supported by Automatic Speech Recognition Technology

    Science.gov (United States)

    Chen, Howard Hao-Jan

    2011-01-01

    Oral communication ability has become increasingly important to many EFL students. Several commercial software programs based on automatic speech recognition (ASR) technologies are available but their prices are not affordable for many students. This paper will demonstrate how the Microsoft Speech Application Software Development Kit (SASDK), a…

  6. Object/Shape Recognition Technology: An Assessment of the Feasibility of Implementation at Defense Logistics Agency Disposition Services

    Science.gov (United States)

    2015-02-25

    IV.  ANALYSIS OF THE CURRENT PROPERTY PROCESS AT DEFENSE LOGISTICS AGENCY DISPOSITION SERVICES AND MATURITY ASSESSMENT OF OBJECT/SHAPE RECOGNITION...implement full automation with optical sorting and data mining that included sensors, laser, object/shape recognition technology on conveyor belt...the current state of object/shape recognition technology and assess the feasibility of implementing it at DLA DS. C. RESEARCH QUESTIONS, SCOPE AND

  7. Industrial Applications of Automatic Speech Recognition Systems

    Directory of Open Access Journals (Sweden)

    Dr. Jayashri Vajpai

    2016-03-01

    Full Text Available Current trends in developing technologies form important bridges to the future, fortified by the early and productive use of technology for enriching human life. Speech signal processing, which includes automatic speech recognition, synthetic speech, and natural language processing, is beginning to have a significant impact on business, industry and the ease of operation of personal computers. Apart from this, it facilitates a deeper understanding of the complex mechanisms by which the human brain functions. Advances in speech recognition technology over the past five decades have enabled a wide range of industrial applications. Yet today's applications provide only a small preview of a rich future for speech and voice interface technology, which will eventually replace keyboards with microphones in human-machine interfaces that provide easy access to increasingly intelligent machines. It also shows how the capabilities of speech recognition systems in industrial applications are evolving over time to usher in the next generation of voice-enabled services. This paper aims to present an effective survey of the speech recognition technology described in the available literature and to integrate the insights gained during the study of individual research and developments. The current applications of speech recognition in the real world and in industry are also outlined, with special reference to applications in the areas of medicine, industrial robotics, forensics, defence and aviation.

  8. Leveraging voice

    DEFF Research Database (Denmark)

    Frølunde, Lisbeth

    2017-01-01

    researchers improve our practices and how could digital online video help offer more positive stories about research and higher education? How can academics in higher education be better to tell about our research, thereby reclaiming and leveraging our voice in a post-factual era? As higher education......This paper speculates on how researchers share research without diluting our credibility and how to make strategies for the future. It also calls for consideration of new traditions and practices for communicating knowledge to a wider audience across multiple media platforms. How might we...... continues to engage with digital and networked technologies it becomes increasingly relevant to question why and how academics could (re) position research knowledge in the digital and online media landscape of today and the future. The paper highlights methodological issues that arise in relation...

  9. Research on PCA face recognition technology

    Institute of Scientific and Technical Information of China (English)

    李婧; 李志强

    2016-01-01

    With the development of intelligent technology, face recognition technology has steadily improved. Face recognition is an identity-recognition technology that draws on computer vision, image processing and other advanced technologies, and completes recognition by attending to detailed features such as facial contours. To further improve face recognition, it is necessary to study PCA face recognition technology. Starting from the basic face recognition process and the methods commonly used in face detection, this article examines PCA face recognition technology from three aspects.

  10. Performance Evaluation of Speech Recognition Systems as a Next-Generation Pilot-Vehicle Interface Technology

    Science.gov (United States)

    Arthur, Jarvis J., III; Shelton, Kevin J.; Prinzel, Lawrence J., III; Bailey, Randall E.

    2016-01-01

    During the flight trials known as Gulfstream-V Synthetic Vision Systems Integrated Technology Evaluation (GV-SITE), a Speech Recognition System (SRS) was used by the evaluation pilots. The SRS was intended to be an intuitive interface for display control (rather than knobs, buttons, etc.). This paper describes the performance of the current "state of the art" SRS. The commercially available technology was evaluated as an application for possible inclusion in commercial aircraft flight decks as a crew-to-vehicle interface. Specifically, the technology is to be used as an interface from aircrew to the onboard displays, controls, and flight management tasks. A flight test of an SRS as well as a laboratory test was conducted.

  11. Speech Recognition: How Do We Teach It?

    Science.gov (United States)

    Barksdale, Karl

    2002-01-01

    States that growing use of speech recognition software has made voice writing an essential computer skill. Describes how to present the topic, develop basic speech recognition skills, and teach speech recognition outlining, writing, proofreading, and editing. (Contains 14 references.) (SK)

  12. Recognition and development of "educational technology" as a scientific field and school subject

    Directory of Open Access Journals (Sweden)

    Danilović Mirčeta S.

    2004-01-01

    Full Text Available The paper explores the process of development, establishment and recognition of "educational technology" as an independent scientific field and a separate teaching subject at universities. The paper points to: (a) the problems that this field deals with or should deal with, (b) knowledge needed for the profession of "educational technologist", (c) various scientific institutions across the world involved in educational technology, (d) scientific journals treating issues of modern educational technology, (e) the authors, i.e. psychologists and educators, who developed and formulated the basic principles of this scientific field, (f) educational features and potentials of educational technologies. Emphasis is placed on the role and importance of AV technology in the development, establishment and recognition of educational technology, and it is also pointed out that AV technology, i.e. AV teaching aids and a movement for visualization of teaching, were its forerunners and crucial factors for its establishment and development into an independent area of teaching, i.e. a school subject. In summary it is stressed that educational technology provides for the execution of instruction through emission, transmission, selection, coding, decoding, reception, memorization and transformation of all types of pieces of information in teaching.

  13. The method of parallel-hierarchical transformation for rapid recognition of dynamic images using GPGPU technology

    Science.gov (United States)

    Timchenko, Leonid; Yarovyi, Andrii; Kokriatskaya, Nataliya; Nakonechna, Svitlana; Abramenko, Ludmila; Ławicki, Tomasz; Popiel, Piotr; Yesmakhanova, Laura

    2016-09-01

    The paper presents a method of parallel-hierarchical transformations for rapid recognition of dynamic images using GPU technology. Direct parallel-hierarchical transformations are based on a cluster CPU- and GPU-oriented hardware platform. Mathematical models of training of the parallel-hierarchical (PH) network for the transformation are developed, as well as a training method of the PH network for recognition of dynamic images. This research is most topical for problems of organizing high-performance computations on very large arrays of information designed to implement multi-stage sensing and processing as well as compaction and recognition of data in informational structures and computer devices. This method has such advantages as high performance through the use of recent advances in parallelization, the possibility of working with images of ultra-high dimension, ease of scaling when the number of nodes in the cluster changes, and automatic scanning of the local network to detect compute nodes.

  14. Smart Homes with Voice Activated Systems for Disabled People

    Directory of Open Access Journals (Sweden)

    Bekir Busatlic

    2017-02-01

    Full Text Available Smart home refers to the application of various technologies to semi-unsupervised home control. It refers to systems that control temperature, lighting, door locks, windows and many other appliances. The aim of this study was to design a system that will use existing technology to showcase how it can benefit people with disabilities. This work uses only off-the-shelf products (smart home devices and controllers, speech recognition technology, open-source code libraries). The Voice Activated Smart Home application was developed to demonstrate online grocery shopping and home control using voice commands, and was tested by measuring its effectiveness in performing tasks as well as its efficiency in recognizing user speech input.

  15. Voice over IP: how computing technology is being used in mobile communications.

    Science.gov (United States)

    Johnson, William

    2005-01-01

    This article explains how computing technology was used to address the need for mobile communications among nursing staff. In 2004, nursing staff at Fauquier Hospital relocated from one nursing floor in an older building to two floors in a new structure. This resulted in complaints and supervision issues as nursing managers, who had previously been relatively sedentary, now became quite mobile as they attempted to control nursing operations on two separate floors. Complaints arose from several sources. Nursing staff and managers both complained about the increased difficulty in communicating with each other. Physicians expressed frustration to hospital administration at playing "telephone tag" with managers. The solution involved Internet Protocol technology that is in widespread use on most computer networks. The article details how this technology was selected over several other communications technologies and used to implement wireless telephony over the hospital's existing computer network. It reviews key standards and technologies and issues surrounding their use. Finally, the article demonstrates how this computing technology improved patient care by facilitating mobile communications.

  16. Wireless Technology Recognition Based on RSSI Distribution at Sub-Nyquist Sampling Rate for Constrained Devices.

    Science.gov (United States)

    Liu, Wei; Kulin, Merima; Kazaz, Tarik; Shahid, Adnan; Moerman, Ingrid; De Poorter, Eli

    2017-09-12

    Driven by the fast growth of wireless communication, the trend of sharing spectrum among heterogeneous technologies becomes increasingly dominant. Identifying concurrent technologies is an important step towards efficient spectrum sharing. However, due to the complexity of recognition algorithms and the strict condition of sampling speed, communication systems capable of recognizing signals other than their own type are extremely rare. This work proves that multi-model distribution of the received signal strength indicator (RSSI) is related to the signals' modulation schemes and medium access mechanisms, and RSSI from different technologies may exhibit highly distinctive features. A distinction is made between technologies with a streaming or a non-streaming property, and appropriate feature spaces can be established either by deriving parameters such as packet duration from RSSI or directly using RSSI's probability distribution. An experimental study shows that even RSSI acquired at a sub-Nyquist sampling rate is able to provide sufficient features to differentiate technologies such as Wi-Fi, Long Term Evolution (LTE), Digital Video Broadcasting-Terrestrial (DVB-T) and Bluetooth. The usage of the RSSI distribution-based feature space is illustrated via a sample algorithm. Experimental evaluation indicates that more than 92% accuracy is achieved with the appropriate configuration. As the analysis of RSSI distribution is straightforward and less demanding in terms of system requirements, we believe it is highly valuable for recognition of wideband technologies on constrained devices in the context of dynamic spectrum access.
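    The idea of using the RSSI probability distribution directly as a feature space can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the bin edges, the L1 nearest-reference rule, the synthetic "streaming" and "bursty" signal models and all names below are assumptions made purely for illustration.

```python
import random

# Hypothetical binning parameters (dBm); not taken from the paper.
LO_DBM, HI_DBM, BIN_WIDTH = -100, -20, 5


def rssi_histogram(samples):
    """Normalised histogram (empirical probability distribution) of RSSI samples."""
    n_bins = (HI_DBM - LO_DBM) // BIN_WIDTH
    counts = [0] * n_bins
    for s in samples:
        # Clamp out-of-range samples into the first or last bin.
        idx = min(max(int((s - LO_DBM) // BIN_WIDTH), 0), n_bins - 1)
        counts[idx] += 1
    return [c / len(samples) for c in counts]


def l1_distance(p, q):
    """Sum of absolute differences between two distributions."""
    return sum(abs(a - b) for a, b in zip(p, q))


def classify(samples, references):
    """Return the label whose reference distribution is closest in L1 distance."""
    feature = rssi_histogram(samples)
    return min(references, key=lambda label: l1_distance(feature, references[label]))


def synth_streaming(rng, n):
    """Synthetic always-on transmitter: RSSI stays near one power level."""
    return [rng.gauss(-50, 2) for _ in range(n)]


def synth_bursty(rng, n):
    """Synthetic packet-based transmitter: short bursts above a noise floor."""
    return [rng.gauss(-60, 2) if rng.random() < 0.3 else rng.gauss(-95, 2)
            for _ in range(n)]


# Build reference distributions from synthetic training traces.
_train_rng = random.Random(0)
references = {
    "streaming": rssi_histogram(synth_streaming(_train_rng, 2000)),
    "bursty": rssi_histogram(synth_bursty(_train_rng, 2000)),
}

if __name__ == "__main__":
    test_rng = random.Random(1)
    print(classify(synth_streaming(test_rng, 500), references))
    print(classify(synth_bursty(test_rng, 500), references))
```

    A histogram is cheap to compute and needs no demodulation, which is in the spirit of the abstract's point that RSSI-based features suit constrained devices; a real system would need measured traces and a more careful classifier.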

  17. Analyzing the mediated voice - a datasession

    DEFF Research Database (Denmark)

    Lawaetz, Anna

    Broadcast voices are technologically manipulated. Paradoxically, in order to achieve a certain authenticity or sound of “reality”, the voices are filtered and trained in order to reach the listeners. This “mise-en-scène” is important knowledge when it comes to the development of a consistent method of analysis of the mediated voice...

  19. Middle Years Science Teachers Voice Their First Experiences with Interactive Whiteboard Technology

    Science.gov (United States)

    Gadbois, Shannon A.; Haverstock, Nicole

    2012-01-01

    Among new technologies, interactive whiteboards (IWBs) particularly seem to engage students and offer entertainment value that may make them highly beneficial for learning. This study examined 10 Grade 6 teachers' initial experiences and uses of IWBs for teaching science. Through interviews, classroom visits, and field notes, the outcomes…

  20. Examining the Relationship Between Information Acquisition, Entrepreneurial Opportunity Recognition, and Innovation Performance in the High Technology Sector in Taiwan

    Science.gov (United States)

    Wang, Yu-Lin; Ellinger, Andrea D.

    2007-01-01

    The purpose of this study was to investigate the relationships between information acquisition, entrepreneurial opportunity recognition, and innovation performance in the high technology sector in Taiwan. The results suggest that both information acquisition and entrepreneurial opportunity recognition positively contribute to individual-level and…

  1. Sperry Univac speech communications technology

    Science.gov (United States)

    Medress, Mark F.

    1977-01-01

    Technology and systems for effective verbal communication with computers were developed. A continuous speech recognition system for verbal input, a word spotting system to locate key words in conversational speech, prosodic tools to aid speech analysis, and a prerecorded voice response system for speech output are described.

  2. Low-Resolution Vehicle Image Recognition Technology by Frame-Composition of Moving Images

    Science.gov (United States)

    Kanzawa, Yusuke; Kobayashi, Hiroki; Ohkawa, Takenao; Ito, Toshio

    The development of on-board driver assistance systems that alert drivers to the driving environment and to possible collisions with other vehicles has attracted much attention lately. In particular, many researchers have proposed forward vehicle recognition using a vehicle-mounted camera. In forward vehicle recognition, however, conventional methods have difficulty detecting the features of a distant vehicle because its image is too low-resolution (LR). This paper presents a vehicle image recognition technology that detects the features of a distant vehicle by frame-composition of moving images. To detect the features of a distant LR vehicle image, we use the moving images obtained from the on-board camera and apply super-resolution (SR) image reconstruction, which uses signal processing techniques to obtain a high-resolution image (or image sequence) from multiple observed LR images. Applying this technique to real road images, we show the effectiveness of the proposed method.

  4. Finding a Voice

    Science.gov (United States)

    Stuart, Shannon

    2012-01-01

    Schools have struggled for decades to provide expensive augmentative and alternative communication (AAC) resources for autistic students with communication challenges. Clunky voice output devices, often included in students' individualized education plans, cost about $8,000, a difficult expense to cover in hard times. However, mobile technology is…

  5. Mechanics of human voice production and control

    Science.gov (United States)

    Zhang, Zhaoyan

    2016-01-01

    As the primary means of communication, voice plays an important role in daily life. Voice also conveys personal information such as social status, personal traits, and the emotional state of the speaker. Mechanically, voice production involves complex fluid-structure interaction within the glottis and its control by laryngeal muscle activation. An important goal of voice research is to establish a causal theory linking voice physiology and biomechanics to how speakers use and control voice to communicate meaning and personal information. Establishing such a causal theory has important implications for clinical voice management, voice training, and many speech technology applications. This paper provides a review of voice physiology and biomechanics, the physics of vocal fold vibration and sound production, and laryngeal muscular control of the fundamental frequency of voice, vocal intensity, and voice quality. Current efforts to develop mechanical and computational models of voice production are also critically reviewed. Finally, issues and future challenges in developing a causal theory of voice production and perception are discussed. PMID:27794319

  6. Sustainable Consumer Voices

    DEFF Research Database (Denmark)

    Klitmøller, Anders; Rask, Morten; Jensen, Nevena

    2011-01-01

    Aiming to explore how user driven innovation can inform high level design strategies, an in-depth empirical study was carried out, based on data from 50 observations of private vehicle users. This paper reports the resulting 5 consumer voices: Technology Enthusiast, Environmentalist, Design Lover, Pragmatist and Status Seeker. Expedient use of the voices in creating design strategies is discussed, thus contributing directly to the practice of high level design managers. The main academic contribution of this paper is demonstrating how applied anthropology can be used to generate insights into disruptive emergence of product service systems, where quantitative user analyses rely on historical continuation.

  7. Familiarity and Voice Representation: From Acoustic-Based Representation to Voice Averages

    Directory of Open Access Journals (Sweden)

    Maureen Fontaine

    2017-07-01

    Full Text Available The ability to recognize an individual from their voice is a widespread ability with a long evolutionary history. Yet, the perceptual representation of familiar voices is ill-defined. In two experiments, we explored the neuropsychological processes involved in the perception of voice identity. We specifically explored the hypothesis that familiar voices (trained-to-familiar voices (Experiment 1) and famous voices (Experiment 2)) are represented as a whole complex pattern, well approximated by the average of multiple utterances produced by a single speaker. In Experiment 1, participants learned three voices over several sessions, and performed a three-alternative forced-choice identification task on original voice samples and several “speaker averages,” created by morphing across varying numbers of different vowels (e.g., [a] and [i]) produced by the same speaker. In Experiment 2, the same participants performed the same task on voice samples produced by familiar speakers. The two experiments showed that for famous voices, but not for trained-to-familiar voices, identification performance increased and response times decreased as a function of the number of utterances in the averages. This study sheds light on the perceptual representation of familiar voices, and demonstrates the power of averages in recognizing familiar voices. The speaker average captures the unique characteristics of a speaker, and thus retains the information essential for recognition; it acts as a prototype of the speaker.

  8. The Indoor Personnel Inertial Positioning Based on Image Recognition Technology in Substation Depth

    Directory of Open Access Journals (Sweden)

    Xu Min

    2017-01-01

    Full Text Available This paper presents an indoor inertial positioning technology for substation personnel based on depth image recognition. Scene modeling uses a 3D scanning depth camera for fast image acquisition. For image recognition at key positions, the front end uses an image recognition software algorithm to convert the image information into dot-matrix data and connects to the background server to obtain accurate positioning coordinates, which greatly reduces the amount of data compared with transmitting images directly for identification. Precise positioning of field personnel is achieved without deploying additional equipment, which effectively solves the problem that traditional Bluetooth, millimeter-wave, and laser positioning require a large amount of equipment to be laid out on site, and eliminates safety hazards in the substation. Finally, an inertial navigation algorithm uses a gyroscope and accelerometer to assist the positioning of field personnel, making it smooth, real-time, and accurate. The approach addresses the poor positioning accuracy of electric field operations and improves the working efficiency of field operators.

  9. The Technology of ZigBee for Building Wireless Voice Communications Network

    Institute of Scientific and Technical Information of China (English)

    陈海燕; 张晨

    2012-01-01

    Most current intercom equipment suffers from low security, unstable signals, susceptibility to conflicts, and frequency licensing issues. This paper designs a voice intercom system that uses ZigBee as its communication technology and markedly alleviates these problems. The system terminal uses the MSP430F149 microcontroller as its microprocessor together with the CC2430 RF chip from TI. Applying technology related to the ZigBee wireless communication standard, a ZigBee star network is built on the designed node hardware test platform to realize voice communication, and the entire system is tested. Test results show that wireless voice communication over ZigBee has clear advantages and meets the power consumption and transmission requirements of wireless personal area networks.

  10. Keeping Your Voice Healthy

    Science.gov (United States)

    ... heavily voice-related. Key Steps for Keeping Your Voice Healthy: Drink plenty of water. Moisture is good ...

  11. Multimodal recognition of emotions

    NARCIS (Netherlands)

    Datcu, D.

    2009-01-01

    This thesis proposes algorithms and techniques to be used for automatic recognition of six prototypic emotion categories by computer programs, based on the recognition of facial expressions and emotion patterns in voice. Considering the applicability in real-life conditions, the research is carried

  12. Movement recognition technology as a method of assessing spontaneous general movements in high risk infants

    Directory of Open Access Journals (Sweden)

    Claire Marcroft

    2015-01-01

    Full Text Available Preterm birth is associated with increased risks of neurological and motor impairments such as cerebral palsy. The risks are highest in those born at the lowest gestations. Early identification of those most at risk is challenging, meaning that a critical window of opportunity to improve outcomes through therapy-based interventions may be missed. Clinically, the assessment of spontaneous general movements is an important tool which can be used for the prediction of movement impairments in high risk infants. Movement recognition aims to capture and analyze relevant limb movements through computerized approaches focusing on continuous, objective, and quantitative assessment. Different methods of recording and analyzing infant movements have recently been explored in high risk infants. These range from camera-based solutions to body-worn miniaturized movement sensors used to record continuous time-series data that represent the dynamics of limb movements. Various machine learning methods have been developed and applied to the analysis of the recorded movement data. This analysis has focused on the detection and classification of atypical spontaneous general movements. This paper aims to identify recent translational studies using movement recognition technology as a method of assessing movement in high risk infants. The application of this technology within pediatric practice represents a growing area of inter-disciplinary collaboration which may lead to a greater understanding of the development of the nervous system in infants at high risk of motor impairment.

  13. Voice over Internet Protocol (VoIP) Technology as a Global Learning Tool: Information Systems Success and Control Belief Perspectives

    Science.gov (United States)

    Chen, Charlie C.; Vannoy, Sandra

    2013-01-01

    Voice over Internet Protocol (VoIP)-enabled online learning service providers are struggling with high attrition rates and low customer loyalty despite VoIP's high degree of system fit for online global learning applications. Effective solutions to this prevalent problem rely on the understanding of system quality, information quality, and…

  14. Citizen Journalism and Digital Voices: Instituting a Collaborative Process between Global Youth, Technology and Media for Positive Social Change

    Science.gov (United States)

    Worley, Robin

    2011-01-01

    Millions of youths in developing countries are described by UNICEF as "invisible and excluded." They live at the margins of society, facing challenges to their daily existence, powerless to make positive changes. But the emergence of citizen journalism and digital storytelling may offer these youths a chance to share their voices and…

  15. Using Continuous Voice Recognition Technology as an Input Medium to the Naval Warfare Interactive Simulation System (NWISS).

    Science.gov (United States)

    1984-06-01

    grammars cannot properly characterize major subsets of English sentences, unless sentence complexity is severely limited, they are quite appropriate for...a menu-driven facility, called GFIL, for creating grammars which is basically user friendly. With GRID, the designer inputs potential grammars for

  16. Objects Control through Speech Recognition Using LabVIEW

    Directory of Open Access Journals (Sweden)

    Ankush Sharma

    2013-01-01

    Full Text Available Speech is the natural form of human communication, and speech processing is one of the most stimulating areas of signal processing. Speech recognition technology has made it possible for computers to follow human voice commands and understand human languages. This paper designs a system for controlling objects (LED, toggle switch, etc.) through human speech by combining virtual instrumentation technology with speech recognition techniques. Password authentication is also provided. The system is built using LabVIEW programming concepts. A microphone takes voice commands from the user, and the microphone signal is interfaced with the LabVIEW code, which generates the appropriate control signals for the objects. The entire work was done on the LabVIEW platform.

  17. A Unique Wavelet Steganography Based Voice Biometric Protection Scheme

    Directory of Open Access Journals (Sweden)

    Sanjaypande M. B

    2013-03-01

    Full Text Available Voice biometrics is an easy and cost-effective biometric technique that requires minimal hardware and software complexity. A typical voice biometric system takes a voice phrase from the user, processes it with a Mel filter, and extracts vector-quantized features. Vector quantization reduces the codebook size but decreases the accuracy of recognition. Therefore, we propose a voice biometric system in which the non-quantized codebooks of the voice file are matched with the spoken phrase. To secure such a direct voice sample, we embed the voice file in a randomly selected image using the DWT technique. Imposters are exposed only to images and are unaware of the voice files. We show that the technique produces better efficiency in comparison to the VQ-based technique.

  18. A rapid automatic analyzer and its methodology for effective bentonite content based on image recognition technology

    Directory of Open Access Journals (Sweden)

    Wei Long

    2016-09-01

    Full Text Available Fast and accurate determination of effective bentonite content in used clay bonded sand is very important for selecting the correct mixing ratio and mixing process to obtain high-performance molding sand. Currently, the effective bentonite content is determined by testing the methylene blue absorbed in used clay bonded sand, which is usually a manual operation with some disadvantages including a complicated process, long testing time and low accuracy. A rapid automatic analyzer of the effective bentonite content in used clay bonded sand was developed based on image recognition technology. The instrument consists of auto stirring, auto liquid removal, auto titration, step-rotation and image acquisition components, and a processor. The image recognition method first decomposes the color images into three-channel gray images, based on the difference in how strongly light blue and dark blue register in the red, green and blue channels; it then performs gray value subtraction and gray level transformation on the gray images; finally, it extracts the outer light blue halo and the inner blue spot and calculates their area ratio. The titration is judged to have reached the end-point when the area ratio exceeds the set value.
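
    The halo-to-spot area ratio described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the instrument's actual method: the image is a nested list of (R, G, B) tuples, the channel subtraction is taken as blue minus red, and the thresholds `halo_min` and `spot_min` as well as the set value are hypothetical numbers chosen for the example.

```python
def halo_to_spot_ratio(image, halo_min=30, spot_min=100):
    """Area ratio of the light-blue halo to the dark-blue spot.

    image: rows of (r, g, b) tuples with 0-255 channel values.
    halo_min, spot_min: assumed thresholds on the blue-minus-red difference.
    """
    halo = spot = 0
    for row in image:
        for r, g, b in row:
            diff = b - r  # channel subtraction on the decomposed gray images
            if diff >= spot_min:
                spot += 1  # strongly blue: inner dark-blue spot
            elif diff >= halo_min:
                halo += 1  # weakly blue: outer light-blue halo
    if spot == 0:
        raise ValueError("no dark-blue spot pixels found")
    return halo / spot


def endpoint_reached(image, set_value):
    """Titration end-point test: area ratio above the set value."""
    return halo_to_spot_ratio(image) > set_value


# Tiny synthetic 3x3 "photo" (assumed colors): white paper, halo, spot.
WHITE, LIGHT, DARK = (255, 255, 255), (173, 216, 230), (0, 0, 139)
SAMPLE = [
    [WHITE, LIGHT, WHITE],
    [DARK, DARK, LIGHT],
    [DARK, DARK, WHITE],
]

if __name__ == "__main__":
    print(halo_to_spot_ratio(SAMPLE))
    print(endpoint_reached(SAMPLE, set_value=0.4))
```

    Counting pixels per class stands in for the area measurement; a real instrument would also need to locate the drop and handle lighting variation, which this sketch ignores.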

  19. Space Technology: A study of the significance of recognition for innovators of spinoff technologies. A case study on the impact of the space technology hall of fame award

    Science.gov (United States)

    1993-01-01

    This report represents the preliminary effort in studying the significance of recognition for innovators of spinoff technologies. The purpose of this initial year's effort in this area was to gather preliminary data and define the direction for the remainder of the research. This report focuses on the most recent recipients of the Hall of Fame Award, the developers of liquid-cooled garments. Liquid-cooled garments technology and its spinoffs were used as a case study to define and explore the factors involved in technology transfer and to consider the possible incentives in developing commercial applications including the Hall of Fame Award. Through interviews, views of award recipients were obtained on factors encouraging spinoffs as well as impediments to spinoffs. The researchers observed complex inter-relationships among the significant entities (government, individuals, large and small business), the importance of people, the importance of resource availability, and the significance of intrinsic motivation; drew preliminary conclusions pertaining to the direct and indirect influence of recognition like the Hall of Fame Award; and planned the direction for next year's follow-on research.

  20. Hands-free human-machine interaction with voice

    Science.gov (United States)

    Juang, B. H.

    2001-05-01

    Voice is a natural communication interface between a human and a machine. The machine, when placed in today's communication networks, may be configured to provide automation to save substantial operating cost, as demonstrated in AT&T's VRCP (Voice Recognition Call Processing), or to facilitate intelligent services, such as virtual personal assistants, to enhance individual productivity. These intelligent services often need to be accessible anytime, anywhere (e.g., in cars when the user is in a hands-busy-eyes-busy situation or during meetings where constantly talking to a microphone is either undesirable or impossible), and thus call for advanced signal processing and automatic speech recognition techniques which support what we call "hands-free" human-machine communication. These techniques entail a broad spectrum of technical ideas, ranging from use of directional microphones and acoustic echo cancellation to robust speech recognition. In this talk, we highlight a number of key techniques that were developed for hands-free human-machine communication in the mid-1990s after Bell Labs became a unit of Lucent Technologies. A video clip will be played to demonstrate the accomplishment.

  1. 3D Imaging for hand gesture recognition: Exploring the software-hardware interaction of current technologies

    Science.gov (United States)

    Periverzov, Frol; Ilieş, Horea T.

    2012-09-01

    Interaction with 3D information is one of the fundamental and most familiar tasks in virtually all areas of engineering and science. Several recent technological advances pave the way for developing hand gesture recognition capabilities available to all, which will lead to more intuitive and efficient 3D user interfaces (3DUI). These developments can unlock new levels of expression and productivity in all activities concerned with the creation and manipulation of virtual 3D shapes and, specifically, in engineering design. Building fully automated systems for tracking and interpreting hand gestures requires robust and efficient 3D imaging techniques as well as potent shape classifiers. We survey and explore current and emerging 3D imaging technologies, and focus, in particular, on those that can be used to build interfaces between the users' hands and the machine. The purpose of this paper is to categorize and highlight the relevant differences between these existing 3D imaging approaches in terms of the nature of the information provided, output data format, as well as the specific conditions under which these approaches yield reliable data. Furthermore we explore the impact of each of these approaches on the computational cost and reliability of the required image processing algorithms. Finally we highlight the main challenges and opportunities in developing natural user interfaces based on hand gestures, and conclude with some promising directions for future research.

  2. Engine Sound Recognition System Based on BP and ARM

    Institute of Scientific and Technical Information of China (English)

    姜愉

    2012-01-01

    To address automatic fee collection at highway toll stations and large paid parking lots, this paper introduces the design of an embedded engine sound recognition system based on ARM9 and embedded Linux, drawing on BP neural network recognition theory. The design uses an S3C2410 microprocessor and the embedded Linux operating system; the cross-compiled C program for engine sound recognition is ported to the operating system's file system to achieve real-time engine sound recognition. The paper describes the system's overall hardware and software framework and reports the results of recognizing the car type in real time from the engine sound. Field experiments confirm the system's accuracy, real-time performance, and validity.

  3. Effect of Technological Changes in Information Transfer on the Delivery of Pharmacy Services.

    Science.gov (United States)

    Barker, Kenneth N.; And Others

    1989-01-01

    Personal computer technology has arrived in health care. Specific technological advances are optical disc storage, smart cards, voice recognition, and robotics. This paper discusses computers in medicine, in nursing, in conglomerates, and with patients. Future health care will be delivered in primary care centers, medical supermarkets, specialized…

  5. Voice over IP Security

    CERN Document Server

    Keromytis, Angelos D

    2011-01-01

    Voice over IP (VoIP) and Internet Multimedia Subsystem technologies (IMS) are rapidly being adopted by consumers, enterprises, governments and militaries. These technologies offer higher flexibility and more features than traditional telephony (PSTN) infrastructures, as well as the potential for lower cost through equipment consolidation and, for the consumer market, new business models. However, VoIP systems also represent a higher complexity in terms of architecture, protocols and implementation, with a corresponding increase in the potential for misuse. In this book, the authors examine the

  6. Improved sensitivity of wearable nanogenerators made of electrospun Eu3+ doped P(VDF-HFP)/graphene composite nanofibers for self-powered voice recognition

    Science.gov (United States)

    Adhikary, Prakriti; Biswas, Anirban; Mandal, Dipankar

    2016-12-01

    Composite nanofibers of Eu3+ doped poly(vinylidene fluoride-co-hexafluoropropylene) (P(VDF-HFP))/graphene are prepared by the electrospinning technique for the fabrication of ultrasensitive wearable piezoelectric nanogenerators (WPNGs) where the post-poling technique is not necessary. It is found that the complete conversion of the piezoelectric β-phase and the improvement of the degree of crystallinity is governed by the incorporation of Eu3+ and graphene sheets into P(VDF-HFP) nanofibers. The flexible nanocomposite fibers are associated with a hypersensitive electronic transition that results in an intense red light emission, and WPNGs also have the capability of detecting external pressure as low as ~23 Pa with a higher degree of acoustic sensitivity, ~11 V Pa-1, than has ever been previously reported. This means that ultrasensitive WPNGs can be utilized to recognize human voices, which suggests they could be a potential tool in the biomedical and national security sectors. The capacitor’s ability to charge from abundant environmental vibrations, such as music, wind, body motion, etc, drives WPNGs as a power source for portable electronics. This fact may open up the prospect of using the Eu3+ doped P(VDF-HFP)/graphene composite electrospun nanofibers, with their multifunctional properties such as vibration sensitivity, wearability, red light emission capability and piezoelectric energy harvesting, for various promising applications in portable electronics, health care monitoring, noise detection and security monitoring.

  7. Nanomechanical recognition of prognostic biomarker suPAR with DVD-ROM optical technology

    Science.gov (United States)

    Bache, Michael; Bosco, Filippo G.; Brøgger, Anna L.; Frøhling, Kasper B.; Sonne Alstrøm, Tommy; Hwu, En-Te; Chen, Ching-Hsiu; Eugen-Olsen, Jesper; Hwang, Ing-Shouh; Boisen, Anja

    2013-11-01

    In this work the use of a high-throughput nanomechanical detection system based on a DVD-ROM optical drive and cantilever sensors is presented for the detection of urokinase plasminogen activator receptor inflammatory biomarker (uPAR). Several large scale studies have linked elevated levels of soluble uPAR (suPAR) to infectious diseases, such as HIV, and certain types of cancer. Using hundreds of cantilevers and a DVD-based platform, cantilever deflection response from antibody-antigen recognition is investigated as a function of suPAR concentration. The goal is to provide a cheap and portable detection platform which can carry valuable prognostic information. In order to optimize the cantilever response the antibody immobilization and unspecific binding are initially characterized using quartz crystal microbalance technology. Also, the choice of antibody is explored in order to generate the largest surface stress on the cantilevers, thus increasing the signal. Using optimized experimental conditions the lowest detectable suPAR concentration is currently around 5 nM. The results reveal promising research strategies for the implementation of specific biochemical assays in a portable and high-throughput microsensor-based detection platform.

  8. Toward Successful Implementation of Speech Recognition Technology: A Survey of SRT Utilization Issues in Healthcare Settings.

    Science.gov (United States)

    Clarke, Martina A; King, Joshua L; Kim, Min Soon

    2015-07-01

    To evaluate physician utilization of speech recognition technology (SRT) for medical documentation in two hospitals. A quantitative survey was used to collect data in the areas of practice, electronic equipment used for documentation, documentation created after providing care, and overall thoughts about and satisfaction with the SRT. The survey sample was from one rural and one urban facility in central Missouri. In addition, qualitative interviews were conducted with a chief medical officer and a physician champion regarding implementation issues, training, choice of SRT, and outcomes from their perspective. Seventy-one (60%) of the anticipated 125 surveys were returned. A total of 16 (23%) participants were practicing in internal medicine and 9 (13%) were practicing in family medicine. Fifty-six (79%) participants used a desktop and 14 (20%) used a laptop (2%) computer. SRT products from Nuance were the dominant SRT used by 59 participants (83%). Windows operating systems (Microsoft, Redmond, WA) was used by more than 58 (82%) of the survey respondents. With regard to user experience, 42 (59%) participants experienced spelling and grammatical errors, 15 (21%) encountered clinical inaccuracy, 9 (13%) experienced word substitution, and 4 (6%) experienced misleading medical information. This study shows critical issues of inconsistency, unreliability, and dissatisfaction in the functionality and usability of SRT. This merits further attention to improve the functionality and usability of SRT for better adoption within varying healthcare settings.

  9. Gender Based Emotion Recognition System for Telugu Rural Dialects Using Hidden Markov Models

    CERN Document Server

    D, Prasad Reddy P V G; Srinivas, Y; Brahmaiah, P

    2010-01-01

    Automatic emotion recognition in speech is a research area with a wide range of applications in human interactions. The basic mathematical tool used for emotion recognition is pattern recognition, which involves three operations: pre-processing, feature extraction, and classification. This paper introduces a procedure for emotion recognition using Hidden Markov Models (HMM), which are used to distinguish five emotional states: anger, surprise, happiness, sadness, and neutral. The approach is based on standard speech recognition technology using continuous hidden Markov models, through the selection of low-level features and the design of the recognition system. An emotional speech database from Telugu Rural Dialects of Andhra Pradesh (TRDAP) was designed using several speakers' voices covering the emotional states. The accuracy of recognizing five different emotions for both genders of classification is 80% for anger-emotion which is achieved by using the best combination of 39-dimensional feature vector for every f...

  10. Practice an Architecture of Voice, Video, and Integrated Data Technology

    Institute of Scientific and Technical Information of China (English)

    廖红云; 雷雪冰

    2011-01-01

    In the 1980s, the United States developed a TCP/IP-based communication system, the information superhighway, also known as the Internet. It has had a huge impact on how many companies and individuals do business. At the beginning of the 21st century, globally advanced technologies took TCP/IP equipment to the next level: the near-real-time transmission of time-sensitive voice and video over TCP/IP. This new initiative is known to the consumer market as AVVID, an Architecture of Voice, Video, and Integrated Data, an acronym that is changing the way the world does business. This paper discusses: (1) an introduction to AVVID technology; (2) an overview of AVVID hardware and software technology solutions; (3) migrating a current network to AVVID; (4) utilizing AVVID applications and software solutions; (5) design considerations.

  11. The Glasgow Voice Memory Test: Assessing the ability to memorize and recognize unfamiliar voices.

    Science.gov (United States)

    Aglieri, Virginia; Watson, Rebecca; Pernet, Cyril; Latinus, Marianne; Garrido, Lúcia; Belin, Pascal

    2017-02-01

    One thousand one hundred and twenty subjects as well as a developmental phonagnosic subject (KH) along with age-matched controls performed the Glasgow Voice Memory Test, which assesses the ability to encode and immediately recognize, through an old/new judgment, both unfamiliar voices (delivered as vowels, making language requirements minimal) and bell sounds. The inclusion of non-vocal stimuli allows the detection of significant dissociations between the two categories (vocal vs. non-vocal stimuli). The distributions of accuracy and sensitivity scores (d') reflected a wide range of individual differences in voice recognition performance in the population. As expected, KH showed a dissociation between the recognition of voices and bell sounds, her performance being significantly poorer than matched controls for voices but not for bells. By providing normative data of a large sample and by testing a developmental phonagnosic subject, we demonstrated that the Glasgow Voice Memory Test, available online and accessible from all over the world, can be a valid screening tool (~5 min) for a preliminary detection of potential cases of phonagnosia and of "super recognizers" for voices.
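    The sensitivity score (d') mentioned in this record is a standard signal-detection measure for old/new judgments. The following is a minimal sketch, assuming the common definition d' = z(hit rate) - z(false-alarm rate). The function name, the trial counts, and the log-linear correction for extreme rates are illustrative assumptions and are not taken from the Glasgow Voice Memory Test paper.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' for an old/new recognition task.

    Adds 0.5 to each cell (log-linear correction, an assumption here)
    so that rates of exactly 0 or 1 do not give infinite z-scores.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical listener: 8 old voices and 8 new voices (made-up counts).
print(round(d_prime(hits=6, misses=2, false_alarms=2, correct_rejections=6), 2))
```

    A listener who answers at chance has equal hit and false-alarm rates, so d' is 0; larger values indicate better discrimination between old and new items.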

  12. Multimodal approaches for emotion recognition: a survey

    Science.gov (United States)

    Sebe, Nicu; Cohen, Ira; Gevers, Theo; Huang, Thomas S.

    2005-01-01

    Recent technological advances have enabled human users to interact with computers in ways previously unimaginable. Beyond the confines of the keyboard and mouse, new modalities for human-computer interaction such as voice, gesture, and force-feedback are emerging. Despite important advances, one necessary ingredient for natural interaction is still missing: emotions. Emotions play an important role in human-to-human communication and interaction, allowing people to express themselves beyond the verbal domain. The ability to understand human emotions is desirable for the computer in several applications. This paper explores new ways of human-computer interaction that enable the computer to be more aware of the user's emotional and attentional expressions. We present the basic research in the field and the recent advances in emotion recognition from facial, voice, and physiological signals, where the different modalities are treated independently. We then describe the challenging problem of multimodal emotion recognition and we advocate the use of probabilistic graphical models when fusing the different modalities. We also discuss the difficult issues of obtaining reliable affective data, obtaining ground truth for emotion recognition, and the use of unlabeled data.

  13. Speaking and Nonspeaking Voice Professionals: Who Has the Better Voice?

    Science.gov (United States)

    Chitguppi, Chandala; Raj, Anoop; Meher, Ravi; Rathore, P K

    2017-04-18

    Voice professionals can be classified into two major subgroups: the primarily speaking and the primarily nonspeaking voice professionals. Nonspeaking voice professionals mainly include singers, whereas speaking voice professionals include the rest of the voice professionals. Although both of these groups have high vocal demands, it is currently unknown whether both groups show similar voice changes after their daily voice use. Comparison of these two subgroups of voice professionals has never been done before. This study aimed to compare the speaking voice of speaking and nonspeaking voice professionals with no obvious vocal fold pathology or voice-related complaints on the day of assessment. After obtaining relevant voice-related history, voice analysis and videostroboscopy were performed in 50 speaking and 50 nonspeaking voice professionals. Speaking voice professionals showed significantly higher incidence of voice-related complaints as compared with nonspeaking voice professionals. Voice analysis revealed that most acoustic parameters including fundamental frequency, jitter percent, and harmonic-to-noise ratio were significantly higher in speaking voice professionals, whereas videostroboscopy did not show any significant difference between the two groups. This is the first study of its kind to analyze the effect of daily voice use in the two subgroups of voice professionals with no obvious vocal fold pathology. We conclude that voice professionals should not be considered as a homogeneous group. The detrimental effects of excessive voice use were observed to occur more significantly in speaking voice professionals than in nonspeaking voice professionals. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  14. Feeling voices.

    Directory of Open Access Journals (Sweden)

    Paolo Ammirante

    Full Text Available Two experiments investigated deaf individuals' ability to discriminate between same-sex talkers based on vibrotactile stimulation alone. Nineteen participants made same/different judgments on pairs of utterances presented to the lower back through voice coils embedded in a conforming chair. Discrimination of stimuli matched for F0, duration, and perceived magnitude was successful for pairs of spoken sentences in Experiment 1 (median percent correct = 83%) and pairs of vowel utterances in Experiment 2 (median percent correct = 75%). Greater difference in spectral tilt between "different" pairs strongly predicted their discriminability in both experiments. The current findings support the hypothesis that discrimination of complex vibrotactile stimuli involves the cortical integration of spectral information filtered through frequency-tuned skin receptors.

  15. Using automatic identification technologies for logistic support on battlefields of the future

    OpenAIRE

    Kinkade, James D.

    1996-01-01

    This thesis analyzes potential uses of automatic identification technologies to support Army forces on future battlefields. The thesis emphasizes radio frequency (RF) tag systems, but also presents an overview and comparison of six other automatic identification technologies (bar codes, optical character recognition, magnetic stripe, smart cards, optical cards, and voice recognition). The dynamics shaping the Army of the future, the characteristics of that Army, and the characteristics of the...

  16. Comparison of post menopausal voice changes across professional and non-professional users of the voice

    Directory of Open Access Journals (Sweden)

    Pallavi Vishwas Sovani

    2010-12-01

    Full Text Available Menopause effects a permanent change in certain body functions, one of them being voice. Moreover, if the voice is used continuously as a part of one’s occupation, this may further impact postmenopausal voice changes. The present study investigated the impact of menopause and professional voice use, and their interaction effect, on the voice. Ninety-two women were classified into reproductive (52) and postmenopausal (40) groups. Each group was divided into Level II (teachers) and Level IV (clerks) of Koufman and Isaacson’s (1991) classification. Acoustic parameters were analyzed using the VisiPitch III software. Aerodynamic parameters were manually calculated. The VHI (Voice Handicap Index) was also included to improve the face validity of the study. Results suggest that Fo, SFo and MPT reduce post menopause while NHR and VTI increase. Some changes are accelerated in teachers as compared to clerks while some are decelerated. VHI scores of teachers are significantly greater than clerks, though not significantly different across menopause. Thus the presence or absence of voice use in one’s profession differentially affects postmenopausal changes. The study has implications in improving the condition of teachers in India, developing norms for menopausal changes and modifying allowable limits for voice recognition systems in future.

  17. Voice-controlled Debugging of Spreadsheets

    CERN Document Server

    Flood, Derek

    2008-01-01

    Developments in Mobile Computing are putting pressure on the software industry to research new modes of interaction that do not rely on the traditional keyboard and mouse combination. Computer users suffering from Repetitive Strain Injury also seek an alternative to keyboard and mouse devices to reduce suffering in wrist and finger joints. Voice-control is an alternative approach to spreadsheet development and debugging that has been researched and used successfully in other domains. While voice-control technology for spreadsheets is available its effectiveness has not been investigated. This study is the first to compare the performance of a set of expert spreadsheet developers that debugged a spreadsheet using voice-control technology and another set that debugged the same spreadsheet using keyboard and mouse. The study showed that voice, despite its advantages, proved to be slower and less accurate. However, it also revealed ways in which the technology might be improved to redress this imbalance.

  18. 77 FR 27501 - In the Matter of One Voice Technologies, Inc., Orchestra Therapeutics, Inc., Path 1 Network...

    Science.gov (United States)

    2012-05-10

    ... Technologies, Inc., Pavilion Energy Resources, Inc. (f/k/a Global Business Services, Inc.), Pine Valley Mining... securities of Pavilion Energy Resources, Inc. (f/k/a Global Business Services, Inc.) because it has not...

  19. Authentication System for Smart Homes Based on ARM7TDMI-S and IRIS-Fingerprint Recognition Technologies

    Directory of Open Access Journals (Sweden)

    Fredrick R. Ishengoma

    2014-12-01

    Full Text Available With the rapid advancement of technology, smart homes have become practical, and with them arises the need to solve the security challenges that accompany their operation. Passwords and identity cards have been used as traditional authentication mechanisms in home environments; however, rising misuse is proving these mechanisms less reliable. For instance, ID cards can be misplaced, copied, or counterfeited and then misused. Conversely, studies have shown that biometric authentication systems, particularly Iris Recognition Technology (IRT) and Fingerprint Recognition Technology (FRT), are the most reliable mechanisms to date, providing tremendous accuracy and speed. As the technology becomes less expensive, applying IRT and FRT in smart homes becomes a more reliable and appropriate solution to security challenges. In this paper, we present our approach to designing an authentication system for smart homes based on IRT, FRT, and ARM7TDMI-S. The system employs two biometric mechanisms for high reliability: initially, system users must enroll their fingerprints and eyes with the system. Iris and fingerprint biometrics are scanned and the images are stored in the database. In the authentication stage, FRT and IRT scan and analyze points of the user's current iris and fingerprint input and match them against the database contents. If one or more captured images do not match those in the database, the system does not grant authorization.
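    The decision rule described in this record (authorization is refused if one or more captured images fail to match) amounts to a logical AND over the two modalities. The sketch below illustrates only that rule; the function name, the similarity scores, and the 0.9 threshold are hypothetical and do not come from the paper, which does not state how matches are scored.

```python
def authorize(iris_score, fingerprint_score, threshold=0.9):
    """Grant access only if BOTH biometric match scores reach the threshold.

    Scores are hypothetical similarity values in [0, 1]; the threshold of
    0.9 is an illustrative assumption, not a value from the paper.
    """
    return iris_score >= threshold and fingerprint_score >= threshold


print(authorize(0.95, 0.97))  # both modalities match
print(authorize(0.95, 0.40))  # fingerprint mismatch, so access is denied
```

    Requiring both modalities to match lowers the chance of a false accept at the cost of more false rejects, which fits the record's emphasis on reliability.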

  20. Study of the Automotive Instrument with Real Voice Prompt Technology

    Institute of Scientific and Technical Information of China (English)

    范维全; 邓华

    2013-01-01

    At present, the indication and alarm information issued by instruments is mainly visual, such as digital displays, graphic and image displays, and text. An alarm instrument adds a bell, an auditory cue, while sending visual information, but this only attracts people's attention; the specific content is still conveyed visually. If digital speech technology is used in display and alarm instruments so that information suited to auditory delivery is conveyed by real voice, the advantages of hearing can be exploited to make up for the inadequacy of conveying information by visual signals alone.

  1. Research of Obstacle Recognition Technology in Cross-Country Environment for Unmanned Ground Vehicle

    Directory of Open Access Journals (Sweden)

    Zhao Yibing

    2014-01-01

    Full Text Available Aimed at the obstacle recognition problem of unmanned ground vehicles in cross-country environments, this paper uses a monocular vision sensor to recognize typical obstacles. First, a median filtering algorithm is applied during image preprocessing to eliminate noise. Second, an image segmentation method based on the Fisher criterion function is used to segment the region of interest. Then, a morphological method is used to process the segmented image in preparation for the subsequent analysis. Next, the color feature S, the color feature a, and the edge feature “verticality” of the image are extracted based on the HSI color space, the Lab color space, and binary images. Finally, a multifeature fusion algorithm based on Bayes classification theory is used for obstacle recognition. Test results show that the algorithm has good robustness and accuracy.
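    The first step of the pipeline above, median filtering, can be sketched as follows. This is a minimal illustration in plain Python, assuming a 3x3 window and unchanged border pixels; the record does not state the window size or border handling used by the authors, and the function name and sample image are made up.

```python
def median_filter_3x3(image):
    """Return a copy of a 2-D grayscale image (list of rows) with each
    interior pixel replaced by the median of its 3x3 neighborhood.
    Border pixels are copied unchanged (a simplifying assumption)."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            window = sorted(
                image[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            )
            out[r][c] = window[4]  # middle of the 9 sorted values
    return out


noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],  # 255 is an isolated "salt" noise pixel
    [10, 10, 10, 10],
]
print(median_filter_3x3(noisy)[1][1])
```

    Because the median ignores a single outlier in the window, an isolated noise pixel is replaced by the value of its neighbors while edges are blurred less than with a mean filter.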

  2. A preliminary analysis of human factors affecting the recognition accuracy of a discrete word recognizer for C3 systems

    Science.gov (United States)

    Yellen, H. W.

    1983-03-01

    Literature pertaining to Voice Recognition abounds with information relevant to the assessment of transitory speech recognition devices. In the past, engineering requirements have dictated the path this technology followed. But other factors do exist that influence recognition accuracy. This thesis explores the impact of Human Factors on the successful recognition of speech, principally addressing the differences or variability among users. A Threshold Technology T-600 was used with a 100-utterance vocabulary to test 44 subjects. A statistical analysis was conducted on 5 generic categories of Human Factors: Occupational, Operational, Psychological, Physiological and Personal. How the equipment is trained and the experience level of the speaker were found to be key characteristics influencing recognition accuracy. To a lesser extent, computer experience, time or week, accent, vital capacity and rate of air flow, speaker cooperativeness and anxiety were found to affect overall error rates.

  3. Fractal Dimension of Voice-Signal Waveforms

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The fractal dimension is one important parameter that characterizes waveforms. In this paper, we derive a new method to calculate fractal dimension of digital voice-signal waveforms. We show that fractal dimension is an efficient tool for speaker recognition or speech recognition. It can be used to identify different speakers or distinguish speech. We apply our results to Chinese speaker recognition and numerical experiment shows that fractal dimension is an efficient parameter to characterize individual Chinese speakers. We have developed a semiautomatic voiceprint analysis system based on the theory of this paper and former researches.

  4. Perceiving a stranger's voice as being one's own: a 'rubber voice' illusion?

    Directory of Open Access Journals (Sweden)

    Zane Z Zheng

    Full Text Available We describe an illusion in which a stranger's voice, when presented as the auditory concomitant of a participant's own speech, is perceived as a modified version of their own voice. When the congruence between utterance and feedback breaks down, the illusion is also broken. Compared to a baseline condition in which participants heard their own voice as feedback, hearing a stranger's voice induced robust changes in the fundamental frequency (F0) of their production. Moreover, the shift in F0 appears to be feedback dependent, since shift patterns depended reliably on the relationship between the participant's own F0 and the stranger-voice F0. The shift in F0 was evident both when the illusion was present and after it was broken, suggesting that auditory feedback from production may be used separately for self-recognition and for vocal motor control. Our findings indicate that self-recognition of voices, like other body attributes, is malleable and context dependent.

  5. Using voice input and audio feedback to enhance the reality of a virtual experience

    Energy Technology Data Exchange (ETDEWEB)

    Miner, N.E.

    1994-04-01

    Virtual Reality (VR) is a rapidly emerging technology which allows participants to experience a virtual environment through stimulation of the participant's senses. Intuitive and natural interactions with the virtual world help to create a realistic experience. Typically, a participant is immersed in a virtual environment through the use of a 3-D viewer. Realistic, computer-generated environment models and accurate tracking of a participant's view are important factors for adding realism to a virtual experience. Stimulating a participant's sense of sound and providing a natural form of communication for interacting with the virtual world are equally important. This paper discusses the advantages and importance of incorporating voice recognition and audio feedback capabilities into a virtual world experience. Various approaches and levels of complexity are discussed. Examples of the use of voice and sound are presented through the description of a research application developed in the VR laboratory at Sandia National Laboratories.

  6. Research on face recognition technology based on dynamic video

    Institute of Scientific and Technical Information of China (English)

    茹志鹃

    2016-01-01

    With the development of high technology, face recognition technology is also developing continuously. A variety of face recognition technologies have now appeared on the market. As a typical application of biometrics, face recognition has been applied in various fields, such as national defense, justice, and finance, and has received public attention and recognition.

  7. Differences in Access to Information and Communication Technologies: Voices of British Muslim Teenage Girls at Islamic Faith Schools

    Science.gov (United States)

    Hardaker, Glenn; Sabki, Aishah; Qazi, Atika; Iqbal, Javed

    2017-01-01

    Purpose: Most research on information and communication technologies (ICT) differences has been related to gender and ethnicity, and to a lesser extent religious affiliation. The purpose of this paper is to contribute to this field of research by situating the discussion in the context of British Muslims and extending current research into ICT…

  8. Dimensionality in voice quality.

    Science.gov (United States)

    Bele, Irene Velsvik

    2007-05-01

    This study concerns speaking voice quality in a group of male teachers (n = 35) and male actors (n = 36), as the purpose was to investigate normal and supranormal voices. The goal was the development of a method of valid perceptual evaluation for normal to supranormal and resonant voices. The voices (text reading at two loudness levels) had been evaluated by 10 listeners, for 15 vocal characteristics using VA scales. In this investigation, the results of an exploratory factor analysis of the vocal characteristics used in this method are presented, reflecting four dimensions of major importance for normal and supranormal voices. Special emphasis is placed on the effects on voice quality of a change in the loudness variable, as two loudness levels are studied. Furthermore, the vocal characteristics Sonority and Ringing voice quality are paid special attention, as the essence of the term "resonant voice" was a basic issue throughout a doctoral dissertation where this study was included.

  9. Voice box (image)

    Science.gov (United States)

    The larynx, or voice box, is located in the neck and performs several important functions in the body. The larynx is involved in swallowing, breathing, and voice production. Sound is produced when the air which ...

  10. Voice and Aging

    Science.gov (United States)

    ... dramatic voice changes are those during childhood and adolescence. The larynx (or voice box) and vocal cord tissues do not fully mature until late teenage years. Hormone-related changes during adolescence are ...

  11. N-best: The Northern- and Southern-Dutch Benchmark Evaluation of Speech recognition Technology

    NARCIS (Netherlands)

    Kessens, J.M.; Leeuwen, D.A. van

    2007-01-01

    In this paper, we describe N-best 2008, the first Large Vocabulary Speech Recognition (LVCSR) benchmark evaluation held for the Dutch language. Both the accent as spoken in the Netherlands (Northern-Dutch) and in Belgium (Southern-Dutch or Flemish), will be evaluated. The evaluation tasks are

  12. Voice and endocrinology

    OpenAIRE

    KVS Hari Kumar; Anurag Garg; Ajai Chandra, N. S.; Singh, S. P.; Rakesh Datta

    2016-01-01

    Voice is one of the advanced features of natural evolution that differentiates human beings from other primates. The human voice is capable of conveying the thoughts into spoken words along with a subtle emotion to the tone. This extraordinary character of the voice in expressing multiple emotions is the gift of God to the human beings and helps in effective interpersonal communication. Voice generation involves close interaction between cerebral signals and the peripheral apparatus consistin...

  13. Delay related issues in integrated voice and data networks

    Science.gov (United States)

    Gruber, J. G.

    1981-06-01

    The described investigation is concerned with the problem of transmitting voice with data in a computer communications network. The motivations for considering mixed voice and data traffic in such a shared network environment include the advent of new voice related applications with the technology now existing to economically support them, and the desire to plan for and design future integrated networks for reasons of economy and flexibility. Attention is given to the problem of variable delays in a shared network environment handling voice traffic. Previous work in packetized voice, as well as various approaches to integrated voice and data transmission, are reviewed. These approaches may be regarded as enhanced versions of circuit, packet, and hybrid switching. The impact of network interfacing and delay considerations for voice traffic is discussed.

  14. Writing with Voice

    Science.gov (United States)

    Kesler, Ted

    2012-01-01

    In this Teaching Tips article, the author argues for a dialogic conception of voice, based in the work of Mikhail Bakhtin. He demonstrates a dialogic view of voice in action, using two writing examples about the same topic from his daughter, a fifth-grade student. He then provides five practical tips for teaching a dialogic conception of voice in…

  15. Tips for Healthy Voices

    Science.gov (United States)

    ... social interaction as well as for most people’s occupation. Proper care and use of your voice will give you the best chance for having a healthy voice for your entire lifetime. Hoarseness or roughness in your voice is often ...

  16. Voice Compression Technology in Wireless Through-the-earth Communication in Mines

    Institute of Scientific and Technical Information of China (English)

    赵红玉; 彭慧

    2015-01-01

    Voice transmission in wireless through-the-earth communication in mines suffers from heavy noise interference, a narrow transmission channel, and limited practical transmission distance. By comparing current voice compression technologies, the study argues that delta modulation (DM) and its improved variant, adaptive delta modulation (ADM), are not suitable as compression coding schemes for voice in through-the-earth wireless communication. Compressed sensing of speech, a recently proposed compression coding scheme, can sample and compress a signal at a rate far below the Nyquist sampling rate; because of its limited speech reconstruction quality, complex coding, and high technical difficulty, it is not currently suitable for this purpose, but it may become a wireless communication compression technology in the future. Code excited linear prediction (CELP) coding is comparatively well suited as the voice compression coding scheme for through-the-earth wireless communication.
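    For readers unfamiliar with delta modulation, the scheme this record evaluates, the sketch below shows textbook linear DM with a fixed step size. It is an illustration under stated assumptions, not the coder studied in the paper: the function names, the 0.1 step, and the sample values are all hypothetical.

```python
def dm_encode(samples, step=0.1):
    """Linear delta modulation: emit 1 bit per sample.

    The bit is 1 if the input is at or above the running approximation,
    else 0; the approximation then moves up or down by one fixed step.
    """
    bits, approx = [], 0.0
    for x in samples:
        bit = 1 if x >= approx else 0
        approx += step if bit else -step
        bits.append(bit)
    return bits


def dm_decode(bits, step=0.1):
    """Rebuild the staircase approximation from the bit stream."""
    out, approx = [], 0.0
    for bit in bits:
        approx += step if bit else -step
        out.append(approx)
    return out


signal = [0.05, 0.15, 0.25, 0.2, 0.1]  # made-up input samples
bits = dm_encode(signal)
print(bits)
print(dm_decode(bits))
```

    Adaptive delta modulation differs in that the step size changes with the recent bit pattern rather than staying fixed; that variant is not shown here.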

  17. Investigating Applications of Speech-to-Text Recognition Technology for a Face-to-Face Seminar to Assist Learning of Non-Native English-Speaking Participants

    Science.gov (United States)

    Shadiev, Rustam; Hwang, Wu-Yuin; Huang, Yueh-Min; Liu, Chia-Ju

    2016-01-01

    This study applied speech-to-text recognition (STR) technology to assist non-native English-speaking participants to learn at a seminar given in English. How participants used transcripts generated by the STR technology for learning and their perceptions toward the STR were explored. Three main findings are presented in this study. Most…

  18. Analysis of Documentation Speed Using Web-Based Medical Speech Recognition Technology: Randomized Controlled Trial.

    Science.gov (United States)

    Vogel, Markus; Kaisers, Wolfgang; Wassmuth, Ralf; Mayatepek, Ertan

    2015-11-03

    Clinical documentation has undergone a change due to the usage of electronic health records. The core element is to capture clinical findings and document therapy electronically. Health care personnel spend a significant portion of their time on the computer. Alternatives to self-typing, such as speech recognition, are currently believed to increase documentation efficiency and quality, as well as satisfaction of health professionals while accomplishing clinical documentation, but few studies in this area have been published to date. This study describes the effects of using a Web-based medical speech recognition system for clinical documentation in a university hospital on (1) documentation speed, (2) document length, and (3) physician satisfaction. Reports of 28 physicians were randomized to be created with (intervention) or without (control) the assistance of a Web-based system of medical automatic speech recognition (ASR) in the German language. The documentation was entered into a browser's text area and the time to complete the documentation including all necessary corrections, correction effort, number of characters, and mood of participant were stored in a database. The underlying time comprised text entering, text correction, and finalization of the documentation event. Participants self-assessed their moods on a scale of 1-3 (1=good, 2=moderate, 3=bad). Statistical analysis was done using permutation tests. The number of clinical reports eligible for further analysis stood at 1455. Out of 1455 reports, 718 (49.35%) were assisted by ASR and 737 (50.65%) were not assisted by ASR. Average documentation speed without ASR was 173 (SD 101) characters per minute, while it was 217 (SD 120) characters per minute using ASR. The overall increase in documentation speed through Web-based ASR assistance was 26% (P=.04). Participants documented an average of 356 (SD 388) characters per report when not assisted by ASR and 649 (SD 561) characters per report when assisted
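    The record states that the analysis used permutation tests without giving details. As a hedged illustration of the general idea, the sketch below runs an exact two-sided permutation test on a difference in means. The function name and the eight speed values are invented for demonstration; they are not data from the study, and the study's actual test statistic is not known from the abstract.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test for a difference in means.

    Enumerates every way of relabeling the pooled values into groups of
    the original sizes and returns the share of relabelings whose
    absolute mean difference is at least as large as the observed one.
    Only practical for small samples.
    """
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    n_a, total = len(group_a), sum(pooled)
    extreme = count = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        mean_a = sum_a / n_a
        mean_b = (total - sum_a) / (len(pooled) - n_a)
        count += 1
        if abs(mean_a - mean_b) >= observed - 1e-12:
            extreme += 1
    return extreme / count


# Hypothetical characters-per-minute values, invented for illustration.
with_asr = [220, 240, 200, 260]
without_asr = [150, 180, 170, 160]
print(permutation_p_value(with_asr, without_asr))
```

    With larger samples, the full enumeration is usually replaced by a random subset of relabelings, since the number of combinations grows very quickly.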

  19. [Research on Barrier-free Home Environment System Based on Speech Recognition].

    Science.gov (United States)

    Zhu, Husheng; Yu, Hongliu; Shi, Ping; Fang, Youfang; Jian, Zhuo

    2015-10-01

    The number of people with physical disabilities is increasing year by year, and the trend of population aging is more and more serious. In order to improve the quality of the life, a control system of accessible home environment for the patients with serious disabilities was developed to control the home electrical devices with the voice of the patients. The control system includes a central control platform, a speech recognition module, a terminal operation module, etc. The system combines the speech recognition control technology and wireless information transmission technology with the embedded mobile computing technology, and interconnects the lamp, electronic locks, alarms, TV and other electrical devices in the home environment as a whole system through a wireless network node. The experimental results showed that speech recognition success rate was more than 84% in the home environment.

  20. Voice Recognition Accuracy: What Is Acceptable?

    Science.gov (United States)

    1982-11-01

    127. Modesto Sumter 128. Worchester Catskills 129. Huntsville Janesville 130. Waterville Osage Beach 131. Baton Rouge Phoenix 132. Marquette Billings...133. New Orleans Antingua 134. Walla Walla Modesto 135. Tupelo Augusta 136. Astoria Greinville 137. Catskills Bermuda 138. Atlanta Huntsville 139...78 Sort Sos 79 Type Up 80 Debug 81 Papa Alpha 82 Quebec Stack 83 Romeo Tango 84 Sierra Alpha 85 Tango Romeo 86 Uniform Nine 87 Victor

  1. Controlling An Electric Car Starter System Through Voice

    Directory of Open Access Journals (Sweden)

    A.B. Muhammad Firdaus

    2015-04-01

    Full Text Available Automobiles have become one of the most common modes of transportation, since a large number of Malaysians can afford to own a car. Many in-car technologies are available on the market; one of them is voice control. Voice recognition is the process of automatically recognizing a certain word spoken by a particular speaker based on individual information contained in speech waves. This paper aims to build a car controlled by the human voice. An essential pre-processing step in voice recognition systems is detecting the presence of noise. Sensitivity to speech variability, insufficient recognition accuracy, and vulnerability to mimicry are among the main technical obstacles that prevent the widespread adoption of speech-based recognition systems. Voice recognition systems work reasonably well in quiet conditions but poorly in noisy conditions or over distorted channels. The key focus of the project is to control an electric car starter system.

  2. Voice handicap in singers.

    Science.gov (United States)

    Murry, Thomas; Zschommler, Anne; Prokop, Jan

    2009-05-01

    The study aimed to determine the differences in responses to the Voice Handicap Index (VHI-10) between singers and nonsingers and to evaluate the ranked order differences of the VHI-10 statements for both groups. The VHI-10 was modified to include statements related to the singing voice for comparison to the original VHI-10. Thirty-five nonsingers with documented voice disorders responded to the VHI-10. A second group, consisting of 35 singers with voice complaints, responded to the VHI-10 with three statements added specifically addressing the singing voice. Data from both groups were analyzed in terms of overall subject self-rating of voice handicap and the rank order of statements from least to most important. The difference between the mean VHI-10 for the singers and nonsingers was not statistically significant, thus, supporting the validity of the VHI-10. However, the 10 statements were ranked differently in terms of their importance by both groups. In addition, when three statements related specifically to the singing voice were substituted in the original VHI-10, the singers judged their voice problem to be more severe than when using the original VHI-10. The type of statements used to assess self-perception of voice handicap may be related to the subject population. Singers with voice problems do not rate their voices to be more handicapped than nonsingers unless statements related specifically to singing are included.

  3. Space technology: A study of the significance of recognition for innovators of spinoff technologies. 1993 activities/1994, 1995 plans

    Science.gov (United States)

    1994-01-01

    During the past 30 years as NASA has conducted technology transfer programs, it has gained considerable experience - particularly pertaining to the processes. However, three areas have not had much scrutiny: the examination of the contributions of the individuals who have developed successful spinoffs, the commercial success of the spinoffs themselves, and the degree to which they are understood by the public. In short, there has been limited evaluation to measure the success of technology transfer efforts mandated by Congress. Research conducted during the first year of a three-year NASA grant to the United States Space Foundation has taken the initial steps toward measuring the success of methodologies to accomplish that Congressionally-mandated technology transfer. In particular, the US Space Foundation, in cooperation with ARAC, technology transfer experts; JKA, a nationally recognized themed entertainment design company; and top evaluation consultants, inaugurated and evaluated a fresh approach including commercial practices to encourage, motivate, and energize technology transfer by: recognizing already successful efforts (Space Technology Hall of Fame Award), drawing potential business and industrial players into the process (Space Commerce Expo), and informing and motivating the general public (Space Technology Hall of Fame public venues). The first year's efforts are documented and directions for the future are outlined.

  4. LABORATORY VOICE DATA ENTRY SYSTEM.

    Energy Technology Data Exchange (ETDEWEB)

    PRAISSMAN, J.L.; SUTHERLAND, J.C.

    2003-04-01

    We have assembled a system using a personal computer workstation equipped with standard office software, an audio system, speech recognition software and an inexpensive radio-based wireless microphone that permits laboratory workers to enter or modify data while performing other work. Speech recognition permits users to enter data while their hands are holding equipment or they are otherwise unable to operate a keyboard. The wireless microphone allows unencumbered movement around the laboratory without a "tether" that might interfere with equipment or experimental procedures. To evaluate the potential of voice data entry in a laboratory environment, we developed a prototype relational database that records the disposal of radionuclides and/or hazardous chemicals. Current regulations in our laboratory require that each such item being discarded be inventoried and that documents be prepared that summarize the contents of each container used for disposal. Using voice commands, the user enters items into the database as each is discarded. Subsequently, the program prepares the required documentation.

  5. METHODS FOR QUALITY ENHANCEMENT OF USER VOICE SIGNAL IN VOICE AUTHENTICATION SYSTEMS

    Directory of Open Access Journals (Sweden)

    O. N. Faizulaieva

    2014-03-01

    Full Text Available The case for using the computer system user's voice in the authentication process is made. The scientific task of improving the signal/noise ratio of the user voice signal in the authentication system is considered. The object of study is the process of input and extraction of the voice signal of the authentication system user in computer systems and networks. Methods and means for input and extraction of the voice signal against external interference signals are researched. Methods for quality enhancement of the user voice signal in voice authentication systems are suggested. As modern computer facilities, including mobile ones, have a two-channel audio card, the use of two microphones is proposed in the voice signal input system of the authentication system. In addition, the task of forming a lobe of the microphone array in a desired area of voice signal registration (100 Hz to 8 kHz) is solved. Using the directional properties of the proposed microphone array makes it possible to reduce the influence of external interference signals by a factor of two or three in the frequency range from 4 to 8 kHz. The possibilities for implementing space-time processing of the recorded signals using constant and adaptive weighting factors are investigated. The simulation results of the proposed system for input and extraction of signals during digital processing of narrowband signals are presented. The proposed solutions make it possible to improve the signal/noise ratio of the recorded useful signals by 10 to 20 dB under the influence of external interference signals in the frequency range from 4 to 8 kHz. The results may be useful to specialists working in the field of voice recognition and speaker discrimination.

  6. Separation of Singing Voice from Music Accompaniment for Monaural Recordings

    Science.gov (United States)

    2005-09-01

    singing voice is a key step for applications such as karaoke [43] and currently it remains labor-intensive work. Automating this process therefore will be...copyright issues. On the other hand some modern commercial karaoke compact disks (CDs) are recorded with multiplex technology in which singing voice...extracted. We extracted 10 songs from karaoke CDs obtained from [1] to construct a database for singing voice detection. These songs are sampled at 16

  7. Nanomechanical recognition of prognostic biomarker suPAR with DVD-ROM optical technology

    DEFF Research Database (Denmark)

    Bache, Michael; Bosco, Filippo; Brøgger, Anna Line

    2013-01-01

    In this work the use of a high-throughput nanomechanical detection system based on a DVD-ROM optical drive and cantilever sensors is presented for the detection of urokinase plasminogen activator receptor inflammatory biomarker (uPAR). Several large scale studies have linked elevated levels of soluble uPAR (suPAR) to infectious diseases, such as HIV, and certain types of cancer. Using hundreds of cantilevers and a DVD-based platform, cantilever deflection response from antibody–antigen recognition is investigated as a function of suPAR concentration. The goal is to provide a cheap and portable...

  8. Adherence to self-monitoring via interactive voice response technology in an eHealth intervention targeting weight gain prevention among Black women: randomized controlled trial.

    Science.gov (United States)

    Steinberg, Dori M; Levine, Erica L; Lane, Ilana; Askew, Sandy; Foley, Perry B; Puleo, Elaine; Bennett, Gary G

    2014-04-29

    eHealth interventions are effective for weight control and have the potential for broad reach. Little is known about the use of interactive voice response (IVR) technology for self-monitoring in weight control interventions, particularly among populations disproportionately affected by obesity. This analysis sought to examine patterns and predictors of IVR self-monitoring adherence and the association between adherence and weight change among low-income black women enrolled in a weight gain prevention intervention. The Shape Program was a randomized controlled trial comparing a 12-month eHealth behavioral weight gain prevention intervention to usual care among overweight and obese black women in the primary care setting. Intervention participants (n=91) used IVR technology to self-monitor behavior change goals (eg, no sugary drinks, 10,000 steps per day) via weekly IVR calls. Weight data were collected in clinic at baseline, 6, and 12 months. Self-monitoring data were stored in a study database and adherence was operationalized as the percent of weeks with a successful IVR call. Over 12 months, the average IVR completion rate was 71.6% (SD 28.1) and 52% (47/91) had an IVR completion rate ≥80%. At 12 months, IVR call completion was significantly correlated with weight loss (r =-.22; P=.04) and participants with an IVR completion rate ≥80% had significantly greater weight loss compared to those with an IVR completion rate <80% [...] educated participants were more likely to achieve high IVR call completion. Participants reported positive attitudes toward IVR self-monitoring. Adherence to IVR self-monitoring was high among socioeconomically disadvantaged black women enrolled in a weight gain prevention intervention. Higher adherence to IVR self-monitoring was also associated with greater weight change. IVR is an effective and useful tool to promote self-monitoring and has the potential for widespread use and long-term sustainability. Clinicaltrials.gov NCT00938535; http
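    The adherence measure described in this abstract (percent of weeks with a successful IVR call, with 80% as the cut-off for high adherence) can be sketched as follows. This is only an illustration: the function names and the 52-week call record are hypothetical and are not taken from the study.

```python
def adherence_percent(weekly_calls):
    """Percent of weeks with a successful IVR call.

    weekly_calls is a list of booleans, one per study week
    (hypothetical data layout, assumed for this sketch).
    """
    if not weekly_calls:
        raise ValueError("at least one week is required")
    successful = sum(1 for ok in weekly_calls if ok)
    return 100.0 * successful / len(weekly_calls)


def is_high_adherence(weekly_calls, threshold=80.0):
    """True when the completion rate meets the 80% cut-off used in the abstract."""
    return adherence_percent(weekly_calls) >= threshold


# Hypothetical participant: 42 successful calls out of 52 weeks.
calls = [True] * 42 + [False] * 10
print(round(adherence_percent(calls), 1), is_high_adherence(calls))
```

    The same arithmetic reproduces the reported group share: 47 of 91 participants is about 52%.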

  9. Ageing Voices: The Effect of Changes in Voice Parameters on ASR Performance

    Directory of Open Access Journals (Sweden)

    Ravichander Vipperla

    2010-01-01

    Full Text Available With ageing, human voices undergo several changes which are typically characterized by increased hoarseness and changes in articulation patterns. In this study, we have examined the effect on Automatic Speech Recognition (ASR) and found that the Word Error Rate (WER) on older voices is 10% absolute higher than that on adult voices. Subsequently, we compared several voice source parameters including fundamental frequency, jitter, shimmer, harmonicity, and cepstral peak prominence of adult and older males. Several of these parameters show statistically significant differences between the two groups. However, artificially increasing jitter and shimmer measures does not affect the ASR accuracies significantly. Artificially lowering the fundamental frequency degrades the ASR performance marginally, but this drop in performance can be overcome to some extent using Vocal Tract Length Normalisation (VTLN). Overall, we observe that the changes in the voice source parameters do not have a significant impact on ASR performance. Comparison of the likelihood scores of all the phonemes for the two age groups shows that there is a systematic mismatch in the acoustic space of the two age groups. Comparison of the phoneme recognition rates shows that mid vowels, nasals, and phonemes that depend on the ability to create constrictions with the tongue tip for articulation are more affected by ageing than other phonemes.
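    For readers unfamiliar with the metric, Word Error Rate is conventionally the word-level edit distance (substitutions, deletions, insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below shows that standard definition; it is not the evaluation code used in the study, and the example sentences are invented.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One deleted word out of four reference words.
print(word_error_rate("turn on the lamp", "turn on lamp"))
```

    A difference of "10% absolute" means the two WER values differ by 0.10 on this scale, not that one is 10% of the other.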

  10. A Voice Operated Tour Planning System for Autonomous Mobile Robots

    Directory of Open Access Journals (Sweden)

    Charles V. Smith III

    2010-06-01

    Full Text Available Control systems driven by voice recognition software have been implemented before but lacked the context driven approach to generate relevant responses and actions. A partially voice activated control system for mobile robotics is presented that allows an autonomous robot to interact with people and the environment in a meaningful way, while dynamically creating customized tours. Many existing control systems also require substantial training for voice application. The system proposed requires little to no training and is adaptable to chaotic environments. The traversable area is mapped once and from that map a fully customized route is generated to the user

  11. Design of Intelligent-tracking Car Based on Infrared Photoelectric Sensor and Speech Recognition Technology

    Institute of Scientific and Technical Information of China (English)

    张兢; 王猛; 李成勇; 李雪梅; 徐伟

    2012-01-01

    An intelligent-tracking car has been designed based on an infrared photoelectric sensor and speech recognition technology. The car uses a reflective infrared photoelectric sensor to obtain the path information and adopts the SPCE061A single-chip microcontroller from Sunplus as the core processor of the control circuit. The car adjusts its direction and speed according to the location of the black line in the path information to implement self-tracking. Speech recognition technology, based on the voice processing functions of the SPCE061A, is adopted to achieve voice control of the car. The design has a simple structure and runs steadily and reliably. It can be used in such fields as smart wheelchairs, intelligent toys and unmanned vehicles.

  12. Singing voice outcomes following singing voice therapy.

    Science.gov (United States)

    Dastolfo-Hromack, Christina; Thomas, Tracey L; Rosen, Clark A; Gartner-Schmidt, Jackie

    2016-11-01

    The objectives of this study were to describe singing voice therapy (SVT), describe referred patient characteristics, and document the outcomes of SVT. Study design: retrospective. Records of patients receiving SVT between June 2008 and June 2013 were reviewed (n = 51). All diagnoses were included. Demographic information, number of SVT sessions, and symptom severity were retrieved from the medical record. Symptom severity was measured via the 10-item Singing Voice Handicap Index (SVHI-10). Treatment outcome was analyzed by diagnosis, history of previous training, and SVHI-10. SVHI-10 scores decreased following SVT (mean change = 11, 40% decrease) (P [...] singing lessons (n = 10) also completed an average of three SVT sessions. Primary muscle tension dysphonia (MTD1) and benign vocal fold lesion (lesion) were the most common diagnoses. Most patients (60%) had previous vocal training. SVHI-10 decrease was not significantly different between MTD and lesion. This is the first outcome-based study of SVT in a disordered population. Diagnosis of MTD or lesion did not influence treatment outcomes. Duration of SVT was short (approximately three sessions). Voice care providers are encouraged to partner with a singing voice therapist to provide optimal care for the singing voice. This study supports the use of SVT as a tool for the treatment of singing voice disorders. Level of evidence: 4. Laryngoscope, 126:2546-2551, 2016. © 2016 The American Laryngological, Rhinological and Otological Society, Inc.

  13. Intelligent Home Speech Recognition System Based on NL6621

    Institute of Scientific and Technical Information of China (English)

    王爱芸

    2015-01-01

    Research on practical speech recognition systems for the intelligent home is very important for the development of the smart home. Based on an analysis of embedded speech recognition technology and smart home control technology, voice is recorded using the NL6621 board as the platform and the VS1003 as the audio decoding chip. A Hidden Markov Model (HMM) algorithm is used to carry out voice model training and voice matching, thereby realizing a smart home voice control system. Experiments show that the voice control system has a high recognition rate and good real-time performance.
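    As a rough illustration of HMM-based "voice matching", one common approach is to keep one trained HMM per command and pick the model that assigns the highest likelihood to the observed feature sequence, computed with the forward algorithm. The sketch below assumes discrete observation symbols and already-trained parameters; the two toy models and their names are invented for illustration and are not the system described in the abstract.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete HMM via the forward algorithm.

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability that state s emits symbol o
    No scaling is applied, so this is only suitable for short sequences.
    """
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][symbol]
            for s in range(n)
        ]
    return sum(alpha)


def best_match(obs, models):
    """Name of the model with the highest likelihood for obs."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))


# Two hypothetical two-state command models over symbols {0, 1}.
left_to_right = [[0.5, 0.5], [0.0, 1.0]]
models = {
    "lamp_on": ([1.0, 0.0], left_to_right, [[0.9, 0.1], [0.1, 0.9]]),
    "lamp_off": ([1.0, 0.0], left_to_right, [[0.1, 0.9], [0.9, 0.1]]),
}
print(best_match([0, 1], models))
```

    Practical recognizers work in the log domain (or rescale alpha at each step) to avoid numerical underflow on long utterances, and usually model continuous acoustic features rather than discrete symbols.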

  14. GIS technology in regional recognition of the distribution pattern of multifloral honey: The chemical traits in Serbia

    Directory of Open Access Journals (Sweden)

    Radović D.I.

    2014-01-01

    Full Text Available GIS is a computer-based system to input, store, manipulate, analyze and output spatially referenced data. GIS has a wide range of applications, which generally serve mapping, measurement, monitoring, modeling and management. In this study, GIS technology was used for the regional recognition of origin and distribution patterns of multifloral honey chemical traits in Serbia. This included organizing and analyzing the spatial and attributive data of 164 honey samples collected from different regions of Serbia during the harvesting season of 2009. Multifloral honey was characterized with regard to mineral composition, sugar content and basic physicochemical properties. The kriging method of Geostatistical Analyst was used for interpolation to predict values of a sampled variable over the whole territory of Serbia. [Project of the Ministry of Science of the Republic of Serbia, No. III 46002, OI 172017 and 451-03-2372-IP Type 1/107

  15. Design of a digital voice data compression technique for orbiter voice channels

    Science.gov (United States)

    1975-01-01

    Candidate techniques were investigated for digital voice compression to a transmission rate of 8 kbps. Good voice quality, speaker recognition, and robustness in the presence of error bursts were considered. The technique of delayed-decision adaptive predictive coding is described and compared with conventional adaptive predictive coding. Results include a set of experimental simulations recorded on analog tape. The two FM broadcast segments produced show the delayed-decision technique to be virtually undegraded or minimally degraded at .001 and .01 Viterbi decoder bit error rates. Preliminary estimates of the hardware complexity of this technique indicate potential for implementation in space shuttle orbiters.

  16. Clinical Voices - an update

    DEFF Research Database (Denmark)

    Fusaroli, Riccardo; Weed, Ethan

    Anomalous aspects of speech and voice, including pitch, fluency, and voice quality, are reported to characterise many mental disorders. However, it has proven difficult to quantify and explain this oddness of speech by employing traditional statistical methods. In this talk we will show how the temporal dynamics of the voice in Asperger's patients enable us to automatically reconstruct the diagnosis, and assess the Autism quotient score. We then generalise the findings to Danish and American children with autism....

  17. Effects of Medications on Voice

    Science.gov (United States)

    ... Effects of Medications on Voice. Patient Health Information ... entnet.org. Could Your Medication Be Affecting Your Voice? Some medications including prescription, over-the-counter, and ...

  18. Qos and Voice Over IP

    Directory of Open Access Journals (Sweden)

    Adrian GHENCEA

    Full Text Available As Voice over Internet Protocol (VoIP) technology matures, companies are increasingly adopting it to cut costs, improve efficiency and enhance customer service. Using the Internet as an existing network for integrating data and telecom systems through intelligent VoIP yields a range of benefits: lower long distance costs, cost cuts in cabling processes and more flexible telephony management. However, as voice over IP services grow in popularity, major threats arise: this rapid growth leads to traffic congestion, security is jeopardized and the poor quality of calls affects communication. The objective of this article is to present all the elements that can affect voice quality in a VoIP network and to provide methods for solving them. A detailed analysis of how to minimize the impact of implementing QoS will be made, and at the end solutions for management strategies will be proposed.

  19. A New Technology: 3D Facial Recognition

    Institute of Scientific and Technical Information of China (English)

    王玥; 李丽娜

    2014-01-01

    3D face recognition is a technology with a reliable recognition rate in the field of facial recognition, and it has been widely applied in sensitive places in China and abroad. This paper describes the development of 3D facial recognition, its technical characteristics, its difficulties and its application hotspots. Finally, the future development of 3D facial recognition is discussed.

  20. Research and Implementation of Text Recognition Technology Based on VB

    Institute of Scientific and Technical Information of China (English)

    钱文婷

    2013-01-01

    Text recognition is a technology that uses computers to recognize characters automatically, and it is an important field of pattern recognition applications. This paper introduces a method for text recognition in the Visual Basic 6.0 programming environment, using the Microsoft Office Document Imaging component of Microsoft Office and the Kodak image edit control.

  1. Voice-based assessments of trustworthiness, competence, and warmth in blind and sighted adults

    OpenAIRE

    Oleszkiewicz, Anna; Pisanski, Katarzyna; Lachowicz-Tabaczek, Kinga; Sorokowska, Agnieszka

    2016-01-01

    The study of voice perception in congenitally blind individuals allows researchers rare insight into how a lifetime of visual deprivation affects the development of voice perception. Previous studies have suggested that blind adults outperform their sighted counterparts in low-level auditory tasks testing spatial localization and pitch discrimination, as well as in verbal speech processing; however, blind persons generally show no advantage in nonverbal voice recognition or discrimination tas...

  2. Automatic Speech Recognition and Training for Severely Dysarthric Users of Assistive Technology: The STARDUST Project

    Science.gov (United States)

    Parker, Mark; Cunningham, Stuart; Enderby, Pam; Hawley, Mark; Green, Phil

    2006-01-01

    The STARDUST project developed robust computer speech recognizers for use by eight people with severe dysarthria and concomitant physical disability to access assistive technologies. Independent computer speech recognizers trained with normal speech are of limited functional use by those with severe dysarthria due to limited and inconsistent…

  3. View of Recent Advances in Iris Recognition Technology

    Institute of Scientific and Technical Information of China (English)

    尚睿

    2015-01-01

    This paper introduces the history, development and present situation of iris recognition technology, and describes in detail the innovations and progress reported in the latest iris recognition literature. Finally, the conclusion is drawn that an appropriate method can locate the inner and outer iris boundaries fairly accurately in a short time, which noticeably improves the efficiency of iris recognition.

  4. Onset and Maturation of Fetal Heart Rate Response to the Mother's Voice over Late Gestation

    Science.gov (United States)

    Kisilevsky, Barbara S.; Hains, Sylvia M. J.

    2011-01-01

    Background: Term fetuses discriminate their mother's voice from a female stranger's, suggesting recognition/learning of some property of her voice. Identification of the onset and maturation of the response would increase our understanding of the influence of environmental sounds on the development of sensory abilities and identify the period when…

  5. EXPERIMENTAL STUDY OF FIRMWARE FOR INPUT AND EXTRACTION OF USER’S VOICE SIGNAL IN VOICE AUTHENTICATION SYSTEMS

    Directory of Open Access Journals (Sweden)

    O. N. Faizulaieva

    2014-09-01

    Full Text Available The scientific task of improving the signal-to-noise ratio of the user's voice signal in computer systems and networks during user voice authentication is considered. The object of study is the process of input and extraction of the voice signal of the authentication system user in computer systems and networks. Methods and means for input and extraction of the voice signal against a background of external interference signals are investigated. Ways of improving the quality of the user's voice signal in voice authentication systems are investigated experimentally. Firmware for an experimental unit for input and extraction of the user's voice signal under external interference is considered. As modern computers, including mobile ones, have a two-channel audio card, two microphones are used for voice signal input. The distance between the sonic-wave sensors is 20 mm, which forms a single directional pattern lobe of the microphone array in the desired range of voice signal registration (from 100 Hz to 8 kHz). According to the results of experimental studies, using the directional properties of the proposed microphone array and space-time processing of the recorded signals with constant and adaptive weighting factors has made it possible to reduce the influence of interference signals considerably. The results of experimental studies of the firmware for input and extraction of the user's voice signal under external interference are shown. The proposed solutions make it possible to improve the signal/noise ratio of the recorded useful signals by up to 20 dB under the influence of external interference signals in the frequency range from 4 to 8 kHz. The results may be useful to specialists working in the field of voice recognition and speaker discrimination.
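    The 20 mm spacing and the 8 kHz upper limit quoted in this abstract are consistent with the usual half-wavelength rule for a two-element array: spatial aliasing (extra lobes) is avoided while the spacing does not exceed half the wavelength. The sketch below works through that arithmetic and the inter-microphone delay a delay-and-sum beamformer would compensate. The speed of sound (343 m/s) and the 48 kHz sample rate are assumptions of this sketch, and the abstract does not say that the authors used this criterion.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly 20 °C


def max_unaliased_frequency_hz(spacing_m, c=SPEED_OF_SOUND_M_S):
    """Highest frequency for which spacing <= wavelength / 2."""
    if spacing_m <= 0:
        raise ValueError("spacing must be positive")
    return c / (2.0 * spacing_m)


def delay_samples(spacing_m, angle_deg, sample_rate_hz, c=SPEED_OF_SOUND_M_S):
    """Arrival-time difference between the two microphones, in samples.

    angle_deg is measured from broadside (0 = source straight ahead).
    """
    delay_s = spacing_m * math.sin(math.radians(angle_deg)) / c
    return delay_s * sample_rate_hz


print(max_unaliased_frequency_hz(0.020))      # about 8575 Hz for 20 mm
print(delay_samples(0.020, 90.0, 48_000.0))   # hypothetical 48 kHz capture
```

    With these assumptions the limit comes out slightly above 8 kHz, which matches the stated registration range.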

  6. [Information technology in learning sign language].

    Science.gov (United States)

    Hernández, Cesar; Pulido, Jose L; Arias, Jorge E

    2015-01-01

    To develop a technological tool that improves the initial learning of sign language in hearing-impaired children. This research was conducted in three phases: requirements gathering, design and development of the proposed device, and validation and evaluation of the device. Through the use of information technology and with the advice of special education professionals, we were able to develop an electronic device that facilitates the learning of sign language in deaf children. It consists mainly of a graphic touch screen, a voice synthesizer, and a voice recognition system. Validation was performed with deaf children at the Filadelfia School in the city of Bogotá. A learning methodology was established that improves learning times through a small, portable, lightweight, and educational technological prototype. Tests showed the effectiveness of this prototype, achieving a 32% reduction in the initial learning time for sign language in deaf children.

  7. [Emphasizing the technological specification of glaucoma examination to promote mutual recognition to the results].

    Science.gov (United States)

    Wang, Ningli

    2014-05-01

    It is very important to establish technological specifications in the medical field. Neglecting them causes many problems, including difficulty in mutual recognition of examination results, waste of medical resources, increased time and expense for patients, a lack of guarantees for medical quality and safety, and an inability to support further research based on collected medical data. Although the glaucoma field involves many technical operations, unified technical specifications are still lacking in our country, which has a certain influence on clinical and research work. Only by increasing the understanding of the importance of technological specifications and establishing and applying technical specifications for glaucoma can the overall level of glaucoma diagnosis and treatment be raised across the whole country.

  8. Voiced Reading and Rhythm

    Institute of Scientific and Technical Information of China (English)

    詹艳萍

    2007-01-01

    Since voiced reading is an important way of learning English, rhythm is the most critical factor in reading aloud well. This article illustrates the relationship between rhythm and voiced reading, the importance of rhythm, and methods for developing a sense of rhythm.

  9. Clinical Voices - an update

    DEFF Research Database (Denmark)

    Fusaroli, Riccardo; Weed, Ethan

    Anomalous aspects of speech and voice, including pitch, fluency, and voice quality, are reported to characterise many mental disorders. However, it has proven difficult to quantify and explain this oddness of speech by employing traditional statistical methods. In this talk we will show how...

  10. Borderline Space for Voice

    Science.gov (United States)

    Batchelor, Denise

    2012-01-01

    Being on the borderline as a student in higher education is not always negative, to do with marginalisation, exclusion and having a voice that is vulnerable. Paradoxically, being on the edge also has positive connections with integration, inclusion and having a voice that is strong. Alternative understandings of the concept of borderline space can…

  11. Voice and endocrinology

    Directory of Open Access Journals (Sweden)

    KVS Hari Kumar

    2016-01-01

    Full Text Available Voice is one of the advanced features of natural evolution that differentiates human beings from other primates. The human voice is capable of conveying the thoughts into spoken words along with a subtle emotion to the tone. This extraordinary character of the voice in expressing multiple emotions is the gift of God to the human beings and helps in effective interpersonal communication. Voice generation involves close interaction between cerebral signals and the peripheral apparatus consisting of the larynx, vocal cords, and trachea. The human voice is susceptible to the hormonal changes throughout life right from the puberty until senescence. Thyroid, gonadal and growth hormones have tremendous impact on the structure and function of the vocal apparatus. The alteration of voice is observed even in physiological states such as puberty and menstruation. Astute clinical observers make out the changes in the voice and refer the patients for endocrine evaluation. In this review, we shall discuss the hormonal influence on the voice apparatus in normal and endocrine disorders.

  12. Face the voice

    DEFF Research Database (Denmark)

    Lønstrup, Ansa

    2014-01-01

    will be based on a reception aesthetic and phenomenological approach, the latter as presented by Don Ihde in his book Listening and Voice. Phenomenologies of Sound, and my analytical sketches will be related to theoretical statements concerning the understanding of voice and media (Cavarero, Dolar, La...

  13. Ontario's Student Voice Initiative

    Science.gov (United States)

    Courtney, Jean

    2014-01-01

    This article describes in some detail aspects of the Student Voice initiative funded and championed by Ontario's Ministry of Education since 2008. The project enables thousands of students to make their voices heard in meaningful ways and to participate in student-led research. Some students from grades 7 to 12 become members of the Student…

  14. Technology

    Directory of Open Access Journals (Sweden)

    Xu Jing

    2016-01-01

    Full Text Available The traditional method of reading answer cards uses an OMR (Optical Mark Reader), which most commonly requires special-purpose cards, is not versatile, and is costly. To address these problems, this paper proposes an answer card identification method based on pattern recognition. A Line Segment Detector is used to detect the tilt of the image, tilted images are corrected by rotation, and the answers on the answer sheet are then located and detected. Pattern recognition technology enables automatic reading with high accuracy and faster detection.
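
    The tilt-correction step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes line segments have already been detected (for example by a Line Segment Detector) and are supplied as endpoint pairs, and the function names `estimate_skew_deg` and `rotate_point` are hypothetical.

```python
import math
from statistics import median


def estimate_skew_deg(segments):
    """Estimate page tilt in degrees as the median angle of near-horizontal segments.

    `segments` is a list of ((x1, y1), (x2, y2)) endpoint pairs (assumed input).
    """
    angles = []
    for (x1, y1), (x2, y2) in segments:
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
        # Fold the direction so a segment and its reverse give the same angle.
        if angle > 90:
            angle -= 180
        elif angle <= -90:
            angle += 180
        if abs(angle) <= 45:  # keep only roughly horizontal segments
            angles.append(angle)
    return median(angles) if angles else 0.0


def rotate_point(x, y, angle_deg, cx=0.0, cy=0.0):
    """Rotate (x, y) about (cx, cy) by -angle_deg to undo the estimated tilt."""
    t = math.radians(-angle_deg)
    dx, dy = x - cx, y - cy
    return (cx + dx * math.cos(t) - dy * math.sin(t),
            cy + dx * math.sin(t) + dy * math.cos(t))
```

    Using the median rather than the mean keeps a few stray segments (for example, handwriting strokes) from dominating the estimate; that choice is an assumption of this sketch.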

  15. EasyVoice: Integrating voice synthesis with Skype

    CERN Document Server

    Condado, Paulo A

    2007-01-01

    This paper presents EasyVoice, a system that integrates voice synthesis with Skype. EasyVoice allows a person with voice disabilities to talk with another person located anywhere in the world, removing an important obstacle that affects these people during a phone or VoIP-based conversation.

  16. Voice and Data Network of Convergence and the Application of Voice over IP

    Energy Technology Data Exchange (ETDEWEB)

    Eldridge, J.M.

    2000-11-01

    This paper looks at emerging technologies for converging voice and data networks and telephony transport over a data network using Internet Protocols. Considered are the benefits and drivers for this convergence. The paper describes these new technologies, how they are being used, and their application to Sandia.

  17. Automatic Speech Acquisition and Recognition for Spacesuit Audio Systems

    Science.gov (United States)

    Ye, Sherry

    2015-01-01

    NASA has a widely recognized but unmet need for novel human-machine interface technologies that can facilitate communication during astronaut extravehicular activities (EVAs), when loud noises and strong reverberations inside spacesuits make communication challenging. WeVoice, Inc., has developed a multichannel signal-processing method for speech acquisition in noisy and reverberant environments that enables automatic speech recognition (ASR) technology inside spacesuits. The technology reduces noise by exploiting differences between the statistical nature of signals (i.e., speech) and noise that exists in the spatial and temporal domains. As a result, ASR accuracy can be improved to the level at which crewmembers will find the speech interface useful. System components and features include beam forming/multichannel noise reduction, single-channel noise reduction, speech feature extraction, feature transformation and normalization, feature compression, and ASR decoding. Arithmetic complexity models were developed and will help designers of real-time ASR systems select proper tasks when confronted with constraints in computational resources. In Phase I of the project, WeVoice validated the technology. The company further refined the technology in Phase II and developed a prototype for testing and use by suited astronauts.

  18. Smartphones Offer New Opportunities in Clinical Voice Research.

    Science.gov (United States)

    Manfredi, C; Lebacq, J; Cantarella, G; Schoentgen, J; Orlandi, S; Bandini, A; DeJonckere, P H

    2017-01-01

    Smartphone technology provides new opportunities for recording standardized voice samples of patients and sending the files by e-mail to the voice laboratory. This drastically improves the collection of baseline data, as used in research on efficiency of voice treatments. However, the basic requirement is the suitability of smartphones for recording and digitizing pathologic voices (mainly characterized by period perturbations and noise) without significant distortion. In this experiment, two smartphones (a very inexpensive one and a high-level one) were tested and compared with direct microphone recordings in a soundproof room. The voice stimuli consisted in synthesized deviant voice samples (median of fundamental frequency: 120 and 200 Hz) with three levels of jitter and three levels of added noise. All voice samples were analyzed using PRAAT software. The results show high correlations between jitter, shimmer, and noise-to-harmonics ratio measured on the recordings via both smartphones, the microphone, and measured directly on the sound files from the synthesizer. Smartphones thus appear adequate for reliable recording and digitizing of pathologic voices. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
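
    The perturbation measures named above can be illustrated with a short sketch. It assumes the commonly used "local" definitions (mean absolute difference between consecutive cycles, divided by the mean), with cycle periods and peak amplitudes already extracted; the function name `local_perturbation` is hypothetical and this is not PRAAT's code.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean value.

    Applied to cycle periods this gives local jitter; applied to cycle peak
    amplitudes it gives local shimmer (both as fractions, not percentages).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))
```

    For example, `local_perturbation(periods)` on a list of period durations in seconds returns the jitter of that voiced stretch, and a perfectly periodic signal returns 0.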

  19. Voice Savers for Music Teachers

    Science.gov (United States)

    Cookman, Starr

    2012-01-01

    Music teachers are in a class all their own when it comes to voice use. These elite vocal athletes require stamina, strength, and flexibility from their voices day in, day out for hours at a time. Voice rehabilitation clinics and research show that music education ranks high among the professionals most commonly affected by voice problems.…

  20. Voice-to-Phoneme Conversion Algorithms for Voice-Tag Applications in Embedded Platforms

    Directory of Open Access Journals (Sweden)

    Yan Ming Cheng

    2008-08-01

    Full Text Available We describe two voice-to-phoneme conversion algorithms for speaker-independent voice-tag creation specifically targeted at applications on embedded platforms. These algorithms (batch mode and sequential) are compared in speech recognition experiments where they are first applied in a same-language context in which both acoustic model training and voice-tag creation and application are performed on the same language. Then, their performance is tested in a cross-language setting where the acoustic models are trained on a particular source language while the voice-tags are created and applied on a different target language. In the same-language environment, both algorithms either perform comparably to or significantly better than the baseline where utterances are manually transcribed by a phonetician. In the cross-language context, the voice-tag performances vary depending on the source-target language pair, with the variation reflecting predicted phonological similarity between the source and target languages. Among the most similar languages, performance nears that of the native-trained models and surpasses the native reference baseline.

  1. Additive attacks on speaker recognition

    Science.gov (United States)

    Farrokh Baroughi, Alireza; Craver, Scott

    2014-02-01

    Speaker recognition is used to identify a speaker's voice from among a group of known speakers. A common method of speaker recognition is a classification based on cepstral coefficients of the speaker's voice, using a Gaussian mixture model (GMM) to model each speaker. In this paper we try to fool a speaker recognition system using additive noise such that an intruder is recognized as a target user. Our attack uses a mixture selected from a target user's GMM model, inverting the cepstral transformation to produce noise samples. In our 5-speaker database, we achieve an attack success rate of 50% with a noise signal at 10 dB SNR, and 95% by increasing noise power to 0 dB SNR. The importance of this attack is its simplicity and flexibility: it can be employed in real time with no processing of an attacker's voice, and little computation is needed at the moment of detection, allowing the attack to be performed by a small portable device. For any target user, knowing that user's model or voice sample is sufficient to compute the attack signal, and it is enough that the intruder plays it while he/she is speaking to be classified as the victim.
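
    The GMM-based classification that the attack targets can be sketched as follows. This is a generic illustration under stated assumptions (diagonal covariances, feature vectors such as cepstral coefficients already computed); the function names and the model layout `(weights, means, variances)` are hypothetical and are not taken from the paper.

```python
import math


def gmm_log_likelihood(frame, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        component_logs.append(log_p)
    peak = max(component_logs)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


def identify_speaker(frames, models):
    """Return the name of the speaker model with the highest total log-likelihood.

    `models` maps a speaker name to (weights, means, variances).
    """
    scores = {name: sum(gmm_log_likelihood(f, *params) for f in frames)
              for name, params in models.items()}
    return max(scores, key=scores.get)
```

    Because the decision depends only on which model scores highest, any added signal that raises the target model's likelihood more than the others can shift the result, which is the weakness the abstract describes.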

  2. Dominant Voice in Hamlet

    Institute of Scientific and Technical Information of China (English)

    李丹

    2015-01-01

    The Tragedy of Hamlet dramatizes the revenge Prince Hamlet exacts on his uncle Claudius for murdering King Hamlet, Claudius's brother and Prince Hamlet's father, and then succeeding to the throne and taking as his wife Gertrude, the old king's widow and Prince Hamlet's mother. This paper will discuss something about dominant voice in the play. Dominant voice is the major voice in the country, the society, or the whole world. Those people who have the power or

  3. Far-Field Voice Activity Detection and Its Applications in Adverse Acoustic Environments

    DEFF Research Database (Denmark)

    Petsatodis, Theodoros

    2012-01-01

    Voice Activity Detection (VAD), being in the focus of speech processing research for many years, is nowadays a mature technology with application in several sectors. Embedded VAD components in telecommunications systems (like in cellular telephony) attempt to reduce power consumption of transmitters and bandwidth utilization. VAD technology is also integrated in speech-processing systems, such as Speaker Identification, Automatic Event Detection, and Automatic Speech Recognition, to prevent their operation in the absence of speech, and thus reduce the error rates of each of these systems. The performance of VAD systems depends strongly on various factors, including the discriminative ability of the classification criterion employed, the dynamics of the additive noise and the signal to noise ratio. Speech signals transmitted within reverberant enclosures and captured using far-field microphones...
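
    As a point of reference for the classification criterion mentioned above, here is a minimal energy-threshold VAD sketch. It is a deliberately simple baseline, not the far-field method of this work; the frame length, the 10 dB margin, and the function names are assumptions chosen for illustration.

```python
import math


def frame_energies_db(samples, frame_len):
    """Mean power of each non-overlapping frame, in dB."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(s * s for s in frame) / frame_len
        energies.append(10 * math.log10(power + 1e-12))  # offset avoids log(0)
    return energies


def energy_vad(samples, frame_len=160, margin_db=10.0):
    """Flag frames whose energy exceeds the quietest frame by more than margin_db."""
    energies = frame_energies_db(samples, frame_len)
    if not energies:
        return []
    noise_floor = min(energies)
    return [e > noise_floor + margin_db for e in energies]
```

    A fixed margin above the quietest frame works only when the noise is roughly stationary and the signal-to-noise ratio is high, which is why the factors listed in the abstract matter in reverberant, far-field conditions.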

  4. Experiences with Voice to Design Ceramics

    DEFF Research Database (Denmark)

    Hansen, Flemming Tvede; Jensen, Kristoffer

    2013-01-01

    This article presents SoundShaping, a system to create ceramics from the human voice and thus how digital technology makes new possibilities in ceramic craft. The article is about how experiential knowledge that the craftsmen gains in a direct physical and tactile interaction with a responding... The shape is output to a 3D printer to make ceramic results. The system demonstrates the close connection between digital technology and craft practice. Several experiments and reflections demonstrate the validity of this work.

  5. Experiences with voice to design ceramics

    DEFF Research Database (Denmark)

    Hansen, Flemming Tvede; Jensen, Kristoffer

    2014-01-01

    This article presents SoundShaping, a system to create ceramics from the human voice and thus how digital technology makes new possibilities in ceramic craft. The article is about how experiential knowledge that the craftsmen gains in a direct physical and tactile interaction with a responding... The shape is output to a 3D printer to make ceramic results. The system demonstrates the close connection between digital technology and craft practice. Several experiments and reflections demonstrate the validity of this work.

  6. SPEECH EMOTION RECOGNITION USING MODIFIED QUADRATIC DISCRIMINATION FUNCTION

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    The Quadratic Discrimination Function (QDF) is commonly used in speech emotion recognition, on the premise that the input data are normally distributed. In this paper, we propose a transformation to normalize the emotional features, and then derive a Modified QDF (MQDF) for speech emotion recognition. Features based on prosody and voice quality are extracted, and a Principal Component Analysis Neural Network (PCANN) is used to reduce the dimension of the feature vectors. The results show that voice quality features are an effective supplement for recognition, and that the method in this paper can improve the recognition rate effectively.
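
    For orientation, a plain quadratic discriminant score under a Gaussian class model can be sketched as below. This shows the standard QDF only, restricted to diagonal covariances to keep the example short; it is not the modified function (MQDF) proposed in the paper, and the names `qdf_score` and `classify` are hypothetical.

```python
import math


def qdf_score(x, mean, variances, prior):
    """Quadratic discriminant score for a Gaussian class with diagonal covariance.

    Constant terms shared by all classes are dropped.
    """
    score = math.log(prior)
    for xi, mi, vi in zip(x, mean, variances):
        score -= 0.5 * (math.log(vi) + (xi - mi) ** 2 / vi)
    return score


def classify(x, classes):
    """`classes` maps a label to (mean, variances, prior); returns the best label."""
    return max(classes, key=lambda label: qdf_score(x, *classes[label]))
```

    The score is only meaningful if each class is close to Gaussian, which is the premise the abstract says motivates normalizing the features first.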

  7. Voice disorders without organic diseases of the larynx. A 10-year review of 62 patients.

    Science.gov (United States)

    Watanabe, Y; Miura, M; Shoji, H

    1983-01-01

    We reviewed the clinical records of 62 patients with voice disorders without organic diseases of the larynx who were examined in the Department of Otolaryngology, Kurume University Hospital during the 10 years from 1971 to 1980. There were 9 patients with psychogenic dysphonia, 24 with vocal abuse, 4 with spastic dysphonia, 9 with mutational voice disturbance, 8 with virilization of voice, 7 with dysphonia attributed to diseases of other organs than the larynx, and 1 with senile change of voice. The clinical service to the patients with these kinds of dysphonia has been poor in Japan for lack of trained voice pathologists. This has been also the case with our department. Recognition for the necessity of voice pathologists is strongly demanded.

  8. Research Progress of 3D Facial Expression Recognition Technology

    Institute of Scientific and Technical Information of China (English)

    魏永超; 庄夏; 傅强; 杜冬

    2015-01-01

    The rapid development of three-dimensional (3D) acquisition devices has greatly promoted research on 3D data, and achievements in 3D facial expression recognition based on 3D face data are constantly emerging. 3D facial expression recognition can largely overcome the pose and illumination problems of two-dimensional (2D) recognition. This paper gives a systematic summary of 3D expression recognition technologies, with emphasis on the key technologies: expression feature extraction, expression coding and classification, and expression databases. It also offers research suggestions for 3D expression recognition. 3D facial expression recognition technology basically meets requirements in recognition rate, but its real-time performance needs further optimization. The content has reference value for researchers in the field.

  9. Analyses on infrared optoelectronics recognition technology in missile defense system

    Institute of Scientific and Technical Information of China (English)

    吴瑕; 周焰; 崔建; 杨龙坡

    2009-01-01

    Target recognition is one of the core problems of ballistic missile defense systems. In view of the different infrared characteristics that threat target groups present in each phase of flight during ballistic missile penetration, the latest developments in SBIRS (the space-based infrared system) and kinetic kill vehicles, and their technical means of infrared target recognition, are introduced. The infrared optoelectronic methods and technologies applied to ballistic target recognition in anti-missile systems are described systematically with respect to key technologies such as infrared temperature measurement, radiant intensity measurement, and infrared imaging. The corresponding counter-recognition means, infrared stealth and infrared decoys, are also discussed. Finally, research trends in infrared target recognition and counter-recognition in missile defense systems are surveyed, and general suggestions for target recognition research in missile defense systems are offered.

  10. The Technology and Application For Fingerprint Images Recognition

    Institute of Scientific and Technical Information of China (English)

    王波涛; 蔡安妮; 孙景鳌

    2001-01-01

    This paper introduces the definition and development of biometrics, and focuses on the principles and applications of fingerprint recognition, the most popular biometric technology.

  11. Student Attendance System Based on Face Recognition Technology

    Institute of Scientific and Technical Information of China (English)

    董雷刚; 崔晓微; 张丹; 张华

    2014-01-01

    An attendance system for checking students' evening return to their dormitories is designed by combining embedded technology with face recognition technology. The embedded technology is mainly used for image processing and data transmission, and the face recognition technology is mainly used for the acquisition and recognition of facial images. The attendance terminal sends the face recognition results over the network to a back-end database, and the administrator of each dormitory building can view students' return status through a Web browser or an Android phone. The system effectively records students' dormitory attendance, prevents students from signing in for one another, and greatly reduces the workload of managers.

  12. Arabic Speech Recognition System using CMU-Sphinx4

    CERN Document Server

    Satori, H; Chenfour, N

    2007-01-01

    In this paper we present the creation of an Arabic version of an Automated Speech Recognition (ASR) system. The system is based on the open-source Sphinx-4 from Carnegie Mellon University, a speech recognition system based on discrete hidden Markov models (HMMs). We investigate the changes that must be made to the model to adapt it to Arabic voice recognition. Keywords: Speech recognition, Acoustic model, Arabic language, HMMs, CMUSphinx-4, Artificial intelligence.
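
    To make the discrete-HMM idea concrete, the forward algorithm for scoring an observation sequence can be sketched as follows. This is a generic textbook sketch, not Sphinx-4 code; the function name and the list-based model layout are assumptions.

```python
def forward_probability(observations, start, trans, emit):
    """P(observations | model) for a discrete HMM, computed by the forward algorithm.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability that state i emits symbol o
    """
    if not observations:
        raise ValueError("need at least one observation")
    n = len(start)
    alpha = [start[s] * emit[s][observations[0]] for s in range(n)]
    for obs in observations[1:]:
        alpha = [emit[j][obs] * sum(alpha[i] * trans[i][j] for i in range(n))
                 for j in range(n)]
    return sum(alpha)
```

    In a recognizer, one such model per word or phoneme is scored against the quantized acoustic observations and the best-scoring model is chosen; real systems work in the log domain to avoid underflow on long sequences.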

  13. Using the Voice to Design Ceramics

    DEFF Research Database (Denmark)

    Tvede Hansen, Flemming; Jensen, Kristoffer

    2011-01-01

    Digital technology makes new possibilities in ceramic craft. This project is about how experiential knowledge that craftsmen gain in a direct physical and tactile interaction with a responding material can be transformed and utilized in the use of digital technologies. The project presents SoundShaping, a system to create ceramics from the human voice. Based on a generic audio feature extraction system, and the principal component analysis to ensure that the pertinent information in the voice is used, a 3D shape is created using simple geometric rules. This shape is output to a 3D printer to make ceramic results. The system demonstrates the close connection between digital technology and craft practice.

  14. Experiences with Voice to Design Ceramics

    DEFF Research Database (Denmark)

    Hansen, Flemming Tvede; Jensen, Kristoffer

    2013-01-01

    This article presents SoundShaping, a system to create ceramics from the human voice and thus how digital technology makes new possibilities in ceramic craft. The article is about how experiential knowledge that craftsmen gain in a direct physical and tactile interaction with a responding material can be transformed and utilized in the use of digital technologies. SoundShaping is based on a generic audio feature extraction system and the principal component analysis to ensure that the pertinent information in the voice is used. Moreover, a 3D shape is created using simple geometric rules... The shape is output to a 3D printer to make ceramic results. The system demonstrates the close connection between digital technology and craft practice. Several experiments and reflections demonstrate the validity of this work.

  15. AN APPLICATION OF SPEAKER RECOGNITION USING ARTIFICIAL NEURAL NETWORKS

    Directory of Open Access Journals (Sweden)

    Murat CANER

    2006-02-01

    Full Text Available In this study, an artificial neural network (ANN), a model used frequently in recent years, is implemented to identify speakers. In general, recognition consists of three stages: processing the signal, obtaining attributes, and comparing them. Speech samples are converted into digital data by the PC's sound card. In the voice analysis stage, recurrent periods and white noise in the voice data are trimmed by the Hamming window method, and the voice attribute part of the digital data is obtained. LPC (linear predictive coding) and DFT (discrete Fourier transform) methods are used to obtain the attributes of the voice data. Of the 28 coefficients used for speaker recognition, 16 were obtained by DFT analysis and 12 by LPC analysis. The parameters that represent the speaker's voice are used for training and testing the ANN. A multilayer perceptron is used as the ANN architecture, and the backpropagation algorithm is used for training. Recordings of the sound "a" were taken from 7 different people and their attributes were extracted. The ANN was trained with these features to find the speaker who owns a sample voice. The recognition performance of the ANN was then tested using test data not used in training. Good results were obtained, with a low failure rate.
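
    The 28-coefficient feature vector (16 from DFT, 12 from LPC) can be sketched as follows. The abstract does not say which DFT bins or which LPC estimation method were used, so this sketch assumes the first 16 DFT magnitudes and the autocorrelation method with the Levinson-Durbin recursion; all function names are hypothetical.

```python
import cmath
import math


def hamming(n):
    """Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft_magnitudes(frame, count):
    """Magnitudes of the first `count` DFT bins of `frame` (direct O(N*count) sum)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(count)]


def lpc(frame, order):
    """LPC coefficients a[1..order] such that x[t] is predicted by sum_k a[k] * x[t-k]."""
    r = [sum(frame[t] * frame[t - lag] for t in range(lag, len(frame)))
         for lag in range(order + 1)]
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):  # Levinson-Durbin recursion
        if err <= 0:
            break  # signal is fully predicted (or silent); stop early
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1 - k * k)
    return a[1:]


def speaker_features(frame):
    """Window the frame, then concatenate 16 DFT magnitudes and 12 LPC coefficients."""
    windowed = [w * x for w, x in zip(hamming(len(frame)), frame)]
    return dft_magnitudes(windowed, 16) + lpc(windowed, 12)
```

    A vector produced by `speaker_features` would then be the input to the multilayer perceptron, with one output unit per speaker.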

  16. The value of visualizing tone of voice.

    Science.gov (United States)

    Pullin, Graham; Cook, Andrew

    2013-10-01

    Whilst most of us have an innate feeling for tone of voice, it is an elusive quality that even phoneticians struggle to describe with sufficient subtlety. For people who cannot speak themselves this can have particularly profound repercussions. Augmentative communication often involves text-to-speech, a technology that only supports a basic choice of prosody based on punctuation. Given how inherently difficult it is to talk about more nuanced tone of voice, there is a risk that its absence from current devices goes unremarked and unchallenged. Looking ahead optimistically to more expressive communication aids, their design will need to involve more subtle interactions with tone of voice, interactions that the people using them can understand and engage with. Interaction design can play a role in making tone of voice visible, tangible, and accessible. Two projects that have already catalysed interdisciplinary debate in this area, Six Speaking Chairs and Speech Hedge, are introduced together with responses. A broader role for design is advocated, as a means to opening up speech technology research to a wider range of disciplinary perspectives, and also to the contributions and influence of people who use it in their everyday lives.

  17. Voice over IP in Wireless Heterogeneous Networks

    DEFF Research Database (Denmark)

    Fathi, Hanane; Chakraborty, Shyam; Prasad, Ramjee

    The convergence of different types of traffic has preceded the convergence of systems and services in a wireless heterogeneous network. Voice and data traffic are usually treated separately in both 2G and 2.5G wireless networks. With advances in packet switching technology and especially with the d... The focus of Voice over IP in Wireless Heterogeneous Networks is on mechanisms that affect the VoIP user satisfaction while not explicitly involved in the media session. This relates to the extra delays introduced by the security and the signaling protocols used to set up an authorized VoIP session and to the disruption caused by the user mobility during the session. Voice over IP in Wireless Heterogeneous Networks thus investigates and proposes cross-layer techniques for realizing time-efficient control mechanisms for VoIP: signaling, mobility and security.

  18. Voice Therapy Practices and Techniques: A Survey of Voice Clinicians.

    Science.gov (United States)

    Mueller, Peter B.; Larson, George W.

    1992-01-01

    Eighty-three voice disorder therapists' ratings of statements regarding voice therapy practices indicated that vocal nodules are the most frequent disorder treated; vocal abuse and hard glottal attack elimination, counseling, and relaxation were preferred treatment approaches; and voice therapy is more effective with adults than with children.…

  19. Voice in early glottic cancer compared to benign voice pathology

    NARCIS (Netherlands)

    Van Gogh, C. D. L.; Mahieu, H. F.; Kuik, D. J.; Rinkel, R. N. P. M.; Langendijk, J. A.; Verdonck-de Leeuw, I. M.

    2007-01-01

    The purpose of this study is to compare (Dutch) Voice Handicap Index (VHIvumc) scores from a selected group of patients with voice problems after treatment for early glottic cancer with patients with benign voice disorders and subjects from the normal population. The study included a group of 35 pat

  20. The inner voice

    Directory of Open Access Journals (Sweden)

    Anthony James Ridgway

    2009-12-01

    Full Text Available The inner voice: we all know what it is because we all have it and use it when we are thinking or reading, for example. Little work has been done on it in our field, with the notable exception of Brian Tomlinson, but presumably it must be a cognitive phenomenon which is of great importance in thinking, language learning, and reading in a foreign language. The inner voice will be discussed as a cognitive psychological phenomenon associated with short-term memory, and distinguished from the inner ear. The process of speech recoding (the process of converting written language into the inner voice) will be examined, and the importance of developing the inner voice, as a means of both facilitating the production of a new language and enhancing the comprehension of a text in a foreign language, will be emphasized. Finally, ways of developing the inner voice in beginning and intermediate readers of a foreign language will be explored and recommended.

  1. Research and Implementation of Speech Recognition Technology Based on Asterisk

    Institute of Scientific and Technical Information of China (English)

    陈可新; 黄伟民

    2015-01-01

    This paper briefly analyzes the problems of traditional IVR systems in call centers, introduces the role of speech recognition technology in call centers, explains the principles of developing speech recognition functions with the Asterisk dial plan and the AGI interface, and finally gives an implementation in which an AGI program calls a speech recognition engine to recognize the speech of inbound callers.
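
    The AGI side of such a setup can be sketched as follows. AGI scripts receive a block of `agi_name: value` lines on standard input, ended by a blank line, and then exchange commands and `200 result=...` replies with Asterisk. The two helper names below are hypothetical, and the call to the speech recognition engine itself is omitted because the abstract does not describe it.

```python
def read_agi_environment(lines):
    """Parse the 'agi_name: value' header block that Asterisk sends at session start."""
    env = {}
    for line in lines:
        line = line.strip()
        if not line:  # a blank line ends the header block
            break
        key, _, value = line.partition(":")
        env[key.strip()] = value.strip()
    return env


def parse_agi_response(line):
    """Split a reply such as '200 result=1' into (status code, result string)."""
    code, _, rest = line.strip().partition(" ")
    result = rest.partition("result=")[2].split(" ")[0]
    return int(code), result
```

    In a real script, `read_agi_environment(sys.stdin)` would run first; the script would then write AGI commands to standard output and pass each reply line to `parse_agi_response`.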

  2. The effects of thematic context and presentation mode on memory for sentence voice.

    Science.gov (United States)

    Kerr, N H; Butler, S F; Maykuth, P L; Delis, D

    1982-05-01

    A sentence in discourse may appear in the passive voice to emphasize the logical object rather than the logical subject when it is thematically more important. Two experiments are reported that explore the impact of this textual function of voice on sentence memory. The first experiment required subjects to listen to prose passages and then recall them. Sentences were recalled predominantly in the active voice regardless of voice or thematic focus in the prose passage, showing that the English-language bias for the active voice was a more important determinant of sentence reconstruction than was the experimental manipulation of thematic context. The second experiment required subjects to listen to or read either prose passages or lists of unrelated sentences and then to try to recognize "key" sentences that were either unchanged or changed lexically, semantically, or in voice. Recognition, both overall and specifically for voice, was better for sentences that were read than for those that were heard, and recognition for semantic change was consistently higher than for any other. Only when passages were read was there evidence in support of a thematic textual influence on memory for sentence voice.

  3. Voice-based assessments of trustworthiness, competence, and warmth in blind and sighted adults.

    Science.gov (United States)

    Oleszkiewicz, Anna; Pisanski, Katarzyna; Lachowicz-Tabaczek, Kinga; Sorokowska, Agnieszka

    2017-06-01

    The study of voice perception in congenitally blind individuals allows researchers rare insight into how a lifetime of visual deprivation affects the development of voice perception. Previous studies have suggested that blind adults outperform their sighted counterparts in low-level auditory tasks testing spatial localization and pitch discrimination, as well as in verbal speech processing; however, blind persons generally show no advantage in nonverbal voice recognition or discrimination tasks. The present study is the first to examine whether visual experience influences the development of social stereotypes that are formed on the basis of nonverbal vocal characteristics (i.e., voice pitch). Groups of 27 congenitally or early-blind adults and 23 sighted controls assessed the trustworthiness, competence, and warmth of men and women speaking a series of vowels, whose voice pitches had been experimentally raised or lowered. Blind and sighted listeners judged both men's and women's voices with lowered pitch as being more competent and trustworthy than voices with raised pitch. In contrast, raised-pitch voices were judged as being warmer than were lowered-pitch voices, but only for women's voices. Crucially, blind and sighted persons did not differ in their voice-based assessments of competence or warmth, or in their certainty of these assessments, whereas the association between low pitch and trustworthiness in women's voices was weaker among blind than sighted participants. This latter result suggests that blind persons may rely less heavily on nonverbal cues to trustworthiness compared to sighted persons. Ultimately, our findings suggest that robust perceptual associations that systematically link voice pitch to the social and personal dimensions of a speaker can develop without visual input.

  4. Automatic Speech Recognition from Neural Signals: A Focused Review

    Directory of Open Access Journals (Sweden)

    Christian Herff

    2016-09-01

    Full Text Available Speech interfaces have become widely accepted and are nowadays integrated in various real-life applications and devices. They have become a part of our daily life. However, speech interfaces presume the ability to produce intelligible speech, which might be impossible due to loud environments, the risk of bothering bystanders, or an inability to produce speech (e.g., in patients suffering from locked-in syndrome). For these reasons it would be highly desirable to not speak but to simply envision oneself saying words or sentences. Interfaces based on imagined speech would enable fast and natural communication without the need for audible speech and would give a voice to otherwise mute people. This focused review analyzes the potential of different brain imaging techniques to recognize speech from neural signals by applying Automatic Speech Recognition technology. We argue that modalities based on metabolic processes, such as functional Near Infrared Spectroscopy and functional Magnetic Resonance Imaging, are less suited for Automatic Speech Recognition from neural signals due to low temporal resolution but are very useful for the investigation of the underlying neural mechanisms involved in speech processes. In contrast, electrophysiologic activity is fast enough to capture speech processes and is therefore better suited for ASR. Our experimental results indicate the potential of these signals for speech recognition from neural data with a focus on invasively measured brain activity (electrocorticography). As a first example of Automatic Speech Recognition techniques applied to neural signals, we discuss the Brain-to-text system.

  5. Effect of retroflex sounds on the recognition of Hindi stops

    Science.gov (United States)

    Dev, Amita; Agrawal, S. S.; Choudhary, D. Roy

    2004-05-01

    As development of the speech recognition system entirely depends upon the spoken language used for its development and the very fact that speech technology is highly language dependent and reverse engineering is not possible, there is an utmost need to develop such systems for Indian languages. In this paper we present the implementation of a time-delay neural network system (TDNN) in a modular fashion by exploiting the hidden structure of previously phonetic subcategory network for the recognition of Hindi consonants. For the present study we have selected all the Hindi phonemes for the recognition. A vocabulary of 207 Hindi words was designed for the task-specific environment and used as a database. For the recognition of phonemes a three-layered network was constructed and the network was trained using the backpropagation learning algorithm. Experiments were conducted to categorize the Hindi voiced and unvoiced stops, semivowels, vowels, nasals, and fricatives. A close observation of the confusion matrix of Hindi stops revealed maximum confusion of retroflex stops with their nonretroflex counterparts.
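
    The confusion-matrix inspection mentioned in this abstract can be illustrated with a minimal Python sketch. The label names and toy predictions below are hypothetical and are not data from the paper; the helper names are assumptions made for illustration only.

```python
from collections import Counter

def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true, predicted) label pair occurs."""
    return Counter(zip(true_labels, predicted_labels))

def most_confused_pair(counts):
    """Return the off-diagonal (true, predicted) pair with the most errors, or None."""
    errors = {pair: n for pair, n in counts.items() if pair[0] != pair[1]}
    return max(errors, key=errors.get) if errors else None

# Hypothetical labels: "T" stands for a retroflex stop, "t" for its dental counterpart.
true_labels = ["T", "T", "T", "t", "t", "k"]
predicted = ["t", "t", "T", "t", "T", "k"]

counts = confusion_counts(true_labels, predicted)
print(most_confused_pair(counts))
```

    In this toy data the retroflex stop is most often mistaken for its non-retroflex counterpart, which is the kind of pattern the abstract reports.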

  6. Mobile Phones and Voice-Based Educational Services in Rural India: Project RuralVoice

    OpenAIRE

    Ruohonen, Mikko; Turunen, Markku; Mahajan, Gururaj; Linna, Juhani; Kumar, Vivek; Das, Himadri

    2012-01-01

    Part 1: Mobile Learning; International audience; Voice-based services offer major business opportunities in developing areas such as India and Africa. In these areas mobile phones have become very popular, and their usage is increasing all the time. In this project, we study the deployment of voice-based mobile educational services for developing countries. Our study is based on a Spoken Web technology developed by IBM Research Labs, and our focus is on India’s Bottom of the Pyramid (BoP). It...

  7. VoiceRelay: voice key operation using visual basic.

    Science.gov (United States)

    Abrams, Lise; Jennings, David T

    2004-11-01

    Using a voice key is a popular method for recording vocal response times in a variety of language production tasks. This article describes a class module called VoiceRelay that can be easily utilized in Visual Basic programs for voice key operation. This software-based voice key offers the precision of traditional voice keys (although accuracy is system dependent), as well as the flexibility of volume and sensitivity control. However, VoiceRelay is a considerably less expensive alternative for recording vocal response times because it operates with existing PC hardware and does not require the purchase of external response boxes or additional experiment-generation software. A sample project demonstrating implementation of the VoiceRelay class module may be downloaded from the Psychonomic Society Web archive, www.psychonomic.org/archive.
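
    The idea of a software voice key can be sketched as follows. This is not the VoiceRelay class module (which is written in Visual Basic and whose internals the abstract does not describe); it is a minimal Python illustration that assumes a simple amplitude threshold, and the function name, the `min_run` parameter and the sample data are all hypothetical.

```python
def voice_onset_ms(samples, sample_rate_hz, threshold, min_run=3):
    """Return the vocal onset time in milliseconds, or None if no onset is found.

    The onset is the start of the first run of `min_run` consecutive samples
    whose absolute amplitude is at or above `threshold`. Requiring a short run
    makes the key less sensitive to single-sample clicks.
    """
    run = 0
    for i, sample in enumerate(samples):
        if abs(sample) >= threshold:
            run += 1
            if run == min_run:
                onset_index = i - min_run + 1
                return 1000.0 * onset_index / sample_rate_hz
        else:
            run = 0
    return None

# Hypothetical recording at 1 kHz: 100 samples of silence, then a loud burst.
recording = [0.0] * 100 + [0.8, -0.7, 0.9, -0.8, 0.6]
print(voice_onset_ms(recording, sample_rate_hz=1000, threshold=0.5))
```

    Raising `threshold` corresponds to lowering the sensitivity of the key; as the abstract notes, timing accuracy in practice depends on the system.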

  8. Distributed information-processing system with voice control based on OS Android

    Directory of Open Access Journals (Sweden)

    E. V. Apolonov

    2012-12-01

    Full Text Available Introduction: Trends in the growth of ACS and AIS and their use in everyday life are discussed. The need for a voice mode of human interaction with AIS is noted. It is observed that network integration of AIS allows their resources to be combined and contributes to progress in speech recognition. The emergence of smartphones and their widespread use motivates using them as personal voice terminals for access to distributed information networks. Main part: The possibility of using Android-based personal portable mobile devices (PPMD) as terminals and as autonomous units, as well as the possibility of using Windows-based stationary PCs as servers of a distributed data-processing system (DDPS) with voice control, is considered. Criteria for selecting the PPMD and the OS of client terminals, as well as the requirements for the DDPS and its structure, are formulated. Concepts for building the DDPS using "client - server" and "many clients - many servers" technologies are presented. The concepts of a PPMD virtual interface and a server virtual interface are proposed. Communication between threads within the process of the PPMD virtual interface of the client terminal, and the interaction between client and server processes in autonomous mode as well as in DDPS mode, are considered. The results of experimental tests of the DDPS prototype, exchanging data between Windows and Android clients and a Windows server, are reported; the accuracy and reliability of the embedded solutions and the scalability of the DDPS are confirmed. Conclusions: Modern Android-based PPMD can be used as terminal devices for building various specialized voice-controlled DDPS with "client - server" and "many clients - many servers" technologies. The APIs of PPMD with different OSes can be unified by implementing a virtual PPMD interface. Data exchange between DDPS processes is best implemented through Berkeley sockets, which are supported by most modern operating

  9. Integration of Voice and Gesture Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Speech recognition technology is relatively mature. In spite of this, it is not always accurate. Greater accuracy occurs when the speech is constrained by the...

  10. A pattern recognition mezzanine based on associative memory and FPGA technology for L1 track triggering at HL-LHC

    Energy Technology Data Exchange (ETDEWEB)

    Alunni, L. [INFN Sezione di Perugia (Italy); Biesuz, N. [INFN Sezione di Pisa (Italy); Bilei, G.M. [INFN Sezione di Perugia (Italy); Citraro, S. [Università di Pisa, Pisa (Italy); Crescioli, F. [LPNHE, Paris (France); Fanò, L. [INFN Sezione di Perugia (Italy); Fedi, G., E-mail: giacomo.fedi@pi.infn.it [INFN Sezione di Pisa (Italy); Magalotti, D. [INFN Sezione di Perugia (Italy); UNIMORE, Modena (Italy); Magazzù, G. [INFN Sezione di Pisa (Italy); Servoli, L.; Storchi, L. [INFN Sezione di Perugia (Italy); Palla, F. [INFN Sezione di Pisa (Italy); Placidi, P. [INFN Sezione di Perugia (Italy); DIEI, Perugia (Italy); Papi, A. [INFN Sezione di Perugia (Italy); Piadyk, Y. [LPNHE, Paris (France); Rossi, E. [INFN Sezione di Pisa (Italy); Spiezia, A. [IHEP (China)

    2016-07-11

    The increase of luminosity at HL-LHC will require the introduction of tracker information at the Level-1 trigger system for the experiments to maintain an acceptable trigger rate to select interesting events despite the one order of magnitude increase in the minimum bias interactions. To extract the track information within the required latency, dedicated hardware has to be used. We present the tests of a prototype system (Pattern Recognition Mezzanine) as the core of pattern recognition and track fitting for the HL-LHC ATLAS and CMS experiments, combining the power of both the Associative Memory custom ASIC and modern Field Programmable Gate Array (FPGA) devices.

  11. A pattern recognition mezzanine based on associative memory and FPGA technology for L1 track triggering at HL-LHC

    Science.gov (United States)

    Alunni, L.; Biesuz, N.; Bilei, G. M.; Citraro, S.; Crescioli, F.; Fanò, L.; Fedi, G.; Magalotti, D.; Magazzù, G.; Servoli, L.; Storchi, L.; Palla, F.; Placidi, P.; Papi, A.; Piadyk, Y.; Rossi, E.; Spiezia, A.

    2016-07-01

    The increase of luminosity at HL-LHC will require the introduction of tracker information at the Level-1 trigger system for the experiments to maintain an acceptable trigger rate to select interesting events despite the one order of magnitude increase in the minimum bias interactions. To extract the track information within the required latency, dedicated hardware has to be used. We present the tests of a prototype system (Pattern Recognition Mezzanine) as the core of pattern recognition and track fitting for the HL-LHC ATLAS and CMS experiments, combining the power of both the Associative Memory custom ASIC and modern Field Programmable Gate Array (FPGA) devices.

  12. Dissociating the cortical basis of memory for voices, words and tones.

    Science.gov (United States)

    Stevens, Alexander A

    2004-01-01

    Human speech carries both linguistic content and information about the speaker's identity and affect. While neuroimaging has been used extensively to study verbal memory, there has been little attention to the neural basis of memory for voices. Evidence from studies of aphasia and auditory agnosia suggests that voice memory may rely on anatomically distinct areas in right temporal and parietal lobe regions, but there are few data on the broader neural systems involved in voice memory. The present study tested the hypothesis that the neural systems involved in voice memory are functionally distinct from the systems involved in word recognition and are primarily located in the right cerebral hemisphere. Subjects performed two-back tasks in which they were required to alternately remember the voices speaking (Voice condition), and the words they produced (Word condition). A tone memory condition was also included, as a non-speech comparison. The contrast between the Voice and Word conditions revealed greater Voice-related effects in left temporal, right frontal and right medial parietal areas, while the Word-related effects appeared in left frontal and bilateral parietal areas. These findings map out a partially right-lateralized fronto-parietal network associated with voice memory, which can be distinguished from predominantly left-hemisphere regions associated with verbal working memory. These results provide further evidence that distinct neural systems are associated with the carrier waves of speech and word identity.

  13. Voices of courage

    Directory of Open Access Journals (Sweden)

    Noraida Abdullah Karim

    2007-07-01

    Full Text Available In May 2007 the Women’s Commission for Refugee Women and Children presented its annual Voices of Courage awards to three displaced people who have dedicated their lives to promoting economic opportunities for refugee and displaced women and youth. These are their (edited) testimonies.

  14. Listen to a voice

    DEFF Research Database (Denmark)

    Hølge-Hazelton, Bibi

    2001-01-01

    Listen to the voice of a young girl Lonnie, who was diagnosed with Type 1 diabetes at 16. Imagine that she is deeply involved in the social security system. She lives with her mother and two siblings in a working class part of a small town. She is at a special school for problematic youth, and he...

  15. Political animal voices

    NARCIS (Netherlands)

    Meijer, E.R.

    2017-01-01

    In this thesis, I develop a theory of political animal voices. The first part of the thesis focuses on non-human animal languages and forming interspecies worlds. I first investigate the relation between viewing language as exclusively human and seeing humans as categorically different from other

  16. The Voice of Tomorrow

    Institute of Scientific and Technical Information of China (English)

    Alan Burdick

    2003-01-01

    Have you heard Mike? Could be. Mike is a professional reader, and he's everywhere these days. On MapQuest, the Web-based map service, he'll read aloud whatever directions you ask for. If you like to have AOL or Yahoo! e-mail read aloud to you over the phone, that's Mike's voice you're hearing. Soon

  17. What the voice reveals.

    NARCIS (Netherlands)

    Ko, Sei Jin

    2007-01-01

    Given that the voice is our main form of communication, we know surprisingly little about how it impacts judgment and behavior. Furthermore, the modern advancement in telecommunication systems, such as cellular phones, has meant that a large proportion of our everyday interactions are conducted voca

  18. The Inner Voice

    Science.gov (United States)

    Ridgway, Anthony James

    2009-01-01

    The inner voice- we all know what it is because we all have it and use it when we are thinking or reading, for example. Little work has been done on it in our field, with the notable exception of Brian Tomlinson, but presumably it must be a cognitive phenomenon which is of great importance in thinking, language learning, and reading in a foreign…

  19. Moving beyond Youth Voice

    Science.gov (United States)

    Serido, Joyce; Borden, Lynne M.; Perkins, Daniel F.

    2011-01-01

    This study combines research documenting the benefits of positive relationships between youth and caring adults on a young person's positive development with studies on youth voice to examine the mechanisms through which participation in youth programs contributes to positive developmental outcomes. Specifically, the study explores whether youth's…

  20. Bodies and Voices

    DEFF Research Database (Denmark)

    A wide-ranging collection of essays centred on readings of the body in contemporary literary and socio-anthropological discourse, from slavery and rape to female genital mutilation, from clothing, ocular pornography, voice, deformation and transmutation to the imprisoned, dismembered, remembered...

  1. Voices for Careers.

    Science.gov (United States)

    York, Edwin G.; Kapadia, Madhu

    Listed in this annotated bibliography are 502 cassette tapes of value to career exploration for Grade 7 through the adult level, whether as individualized instruction, small group study, or total class activity. Available to New Jersey educators at no charge, this Voices for Careers System is also available for duplication on request from the New…

  2. What the voice reveals

    NARCIS (Netherlands)

    Ko, Sei Jin

    2007-01-01

    Given that the voice is our main form of communication, we know surprisingly little about how it impacts judgment and behavior. Furthermore, the modern advancement in telecommunication systems, such as cellular phones, has meant that a large proportion of our everyday interactions are conducted voca

  3. Bodies and Voices

    DEFF Research Database (Denmark)

    A wide-ranging collection of essays centred on readings of the body in contemporary literary and socio-anthropological discourse, from slavery and rape to female genital mutilation, from clothing, ocular pornography, voice, deformation and transmutation to the imprisoned, dismembered, remembered...

  4. Application of Image Recognition Technology in Coal Seam Identification

    Institute of Scientific and Technical Information of China (English)

    林雯

    2013-01-01

    Image recognition technology is widely used in geological exploration, coal seam exploration, meteorological exploration and other fields. This paper analyzes the coal-rock boundary identification technologies currently in wide use and proposes a coal seam identification method based on image recognition technology. Finally, these technologies are integrated into the album acquisition system. This work is of some positive significance for achieving unattended and fully automated coal mining in the future.

  5. Double Fourier analysis for Emotion Identification in Voiced Speech

    Science.gov (United States)

    Sierra-Sosa, D.; Bastidas, M.; Ortiz P., D.; Quintero, O. L.

    2016-04-01

    We propose a novel analysis alternative, based on two Fourier Transforms, for emotion recognition from speech. Fourier analysis allows different signals to be displayed and synthesized in terms of power spectral density distributions. A spectrogram of the voice signal is obtained by performing a short-time Fourier Transform with Gaussian windows; this spectrogram portrays frequency-related features, such as vocal tract resonances and quasi-periodic excitations during voiced sounds. Emotions induce such characteristics in speech, which become apparent in spectrogram time-frequency distributions. The time-frequency representation of the signal given by the spectrogram is then treated as an image and processed through a 2-dimensional Fourier Transform in order to perform a spatial Fourier analysis of it. Finally, features related to emotions in voiced speech are extracted and presented.
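
    The two-stage transform described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the window width, frame length, hop size and the synthetic test tone are hypothetical choices, a naive DFT stands in for an FFT library, and the emotion-related features the authors extract are not reproduced because the abstract does not specify them.

```python
import cmath
import math

def gaussian_window(n, sigma_ratio=0.4):
    """Gaussian window of length n; sigma is a fraction of the half-width (assumed value)."""
    center = (n - 1) / 2.0
    sigma = sigma_ratio * center
    return [math.exp(-0.5 * ((i - center) / sigma) ** 2) for i in range(n)]

def dft(values):
    """Naive O(n^2) discrete Fourier transform, adequate for a small demo."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]

def spectrogram(signal, frame_len, hop):
    """Short-time Fourier transform power: rows are time frames, columns are frequency bins."""
    window = gaussian_window(frame_len)
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        spectrum = dft(frame)
        rows.append([abs(c) ** 2 for c in spectrum[: frame_len // 2 + 1]])
    return rows

def dft2_magnitude(image):
    """Magnitude of the 2-D DFT of an image: transform every row, then every column."""
    row_pass = [dft(row) for row in image]
    n_rows, n_cols = len(row_pass), len(row_pass[0])
    col_pass = [dft([row_pass[r][c] for r in range(n_rows)]) for c in range(n_cols)]
    return [[abs(col_pass[c][r]) for c in range(n_cols)] for r in range(n_rows)]

# Synthetic stand-in for a voiced sound: a pure tone centred on bin 4 of a 32-sample frame.
tone = [math.sin(2 * math.pi * 4 * n / 32) for n in range(128)]
spec = spectrogram(tone, frame_len=32, hop=8)   # first Fourier stage
spec2d = dft2_magnitude(spec)                   # second Fourier stage on the "image"
print(len(spec), len(spec[0]))
```

    Treating the spectrogram as an image means that regular structure along the time or frequency axis shows up as energy at specific positions in the second transform.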

  6. Location-Enhanced Activity Recognition in Indoor Environments Using Off the Shelf Smart Watch Technology and BLE Beacons

    Science.gov (United States)

    Filippoupolitis, Avgoustinos; Oliff, William; Takand, Babak; Loukas, George

    2017-01-01

    Activity recognition in indoor spaces benefits context awareness and improves the efficiency of applications related to personalised health monitoring, building energy management, security and safety. The majority of activity recognition frameworks, however, employ a network of specialised building sensors or a network of body-worn sensors. As this approach suffers with respect to practicality, we propose the use of commercial off-the-shelf devices. In this work, we design and evaluate an activity recognition system composed of a smart watch, which is enhanced with location information coming from Bluetooth Low Energy (BLE) beacons. We evaluate the performance of this approach for a variety of activities performed in an indoor laboratory environment, using four supervised machine learning algorithms. Our experimental results indicate that our location-enhanced activity recognition system is able to reach a classification accuracy ranging from 92% to 100%, while without location information classification accuracy it can drop to as low as 50% in some cases, depending on the window size chosen for data segmentation. PMID:28555022
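
    The abstract notes that accuracy depends on the window size chosen for data segmentation. A minimal Python sketch of that segmentation step, with location added as a feature, is given below. The feature set (mean and standard deviation of acceleration magnitude, plus the beacon with the strongest signal) and all names and values are assumptions for illustration; the abstract does not state which features or classifiers the authors used.

```python
from statistics import mean, pstdev

def sliding_windows(samples, window_size, step):
    """Split a sample stream into fixed-size windows, advancing `step` samples each time."""
    return [
        samples[i:i + window_size]
        for i in range(0, len(samples) - window_size + 1, step)
    ]

def window_features(accel_window, rssi_by_beacon):
    """Build one feature record from accelerometer samples and BLE beacon signal strengths.

    `accel_window` is a list of (x, y, z) tuples; `rssi_by_beacon` maps a beacon
    name to its received signal strength in dBm (closer to zero means stronger).
    """
    magnitudes = [(x * x + y * y + z * z) ** 0.5 for x, y, z in accel_window]
    return {
        "mean_mag": mean(magnitudes),
        "std_mag": pstdev(magnitudes),
        "nearest_beacon": max(rssi_by_beacon, key=rssi_by_beacon.get),
    }

# Hypothetical data: ten accelerometer samples and two beacons.
accel = [(3.0, 4.0, 0.0)] * 10
windows = sliding_windows(accel, window_size=4, step=2)
features = window_features(windows[0], {"kitchen": -60, "desk": -80})
print(len(windows), features["nearest_beacon"])
```

    Each feature record would then be passed to a supervised classifier; a larger `window_size` gives smoother statistics but fewer, slower decisions.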

  7. Location-Enhanced Activity Recognition in Indoor Environments Using Off the Shelf Smart Watch Technology and BLE Beacons

    Directory of Open Access Journals (Sweden)

    Avgoustinos Filippoupolitis

    2017-05-01

    Full Text Available Activity recognition in indoor spaces benefits context awareness and improves the efficiency of applications related to personalised health monitoring, building energy management, security and safety. The majority of activity recognition frameworks, however, employ a network of specialised building sensors or a network of body-worn sensors. As this approach suffers with respect to practicality, we propose the use of commercial off-the-shelf devices. In this work, we design and evaluate an activity recognition system composed of a smart watch, which is enhanced with location information coming from Bluetooth Low Energy (BLE) beacons. We evaluate the performance of this approach for a variety of activities performed in an indoor laboratory environment, using four supervised machine learning algorithms. Our experimental results indicate that our location-enhanced activity recognition system is able to reach a classification accuracy ranging from 92% to 100%, while without location information classification accuracy it can drop to as low as 50% in some cases, depending on the window size chosen for data segmentation.

  8. Location-Enhanced Activity Recognition in Indoor Environments Using Off the Shelf Smart Watch Technology and BLE Beacons.

    Science.gov (United States)

    Filippoupolitis, Avgoustinos; Oliff, William; Takand, Babak; Loukas, George

    2017-05-27

    Activity recognition in indoor spaces benefits context awareness and improves the efficiency of applications related to personalised health monitoring, building energy management, security and safety. The majority of activity recognition frameworks, however, employ a network of specialised building sensors or a network of body-worn sensors. As this approach suffers with respect to practicality, we propose the use of commercial off-the-shelf devices. In this work, we design and evaluate an activity recognition system composed of a smart watch, which is enhanced with location information coming from Bluetooth Low Energy (BLE) beacons. We evaluate the performance of this approach for a variety of activities performed in an indoor laboratory environment, using four supervised machine learning algorithms. Our experimental results indicate that our location-enhanced activity recognition system is able to reach a classification accuracy ranging from 92% to 100%, while without location information classification accuracy it can drop to as low as 50% in some cases, depending on the window size chosen for data segmentation.

  9. Multidimensional assessment of strongly irregular voices such as in substitution voicing and spasmodic dysphonia: a compilation of own research.

    Science.gov (United States)

    Moerman, Mieke; Martens, Jean-Pierre; Dejonckere, Philippe

    2015-04-01

    This article is a compilation of own research performed during the European COoperation in Science and Technology (COST) action 2103: 'Advance Voice Function Assessment', an initiative of voice and speech processing teams consisting of physicists, engineers, and clinicians. This manuscript concerns analyzing largely irregular voicing types, namely substitution voicing (SV) and adductor spasmodic dysphonia (AdSD). A specific perceptual rating scale (IINFVo) was developed, and the Auditory Model Based Pitch Extractor (AMPEX), a piece of software that automatically analyses running speech and generates pitch values in background noise, was applied. The IINFVo perceptual rating scale has been shown to be useful in evaluating SV. The analysis of strongly irregular voices stimulated a modification of the European Laryngological Society's assessment protocol which was originally designed for the common types of (less severe) dysphonia. Acoustic analysis with AMPEX demonstrates that the most informative features are, for SV, the voicing-related acoustic features and, for AdSD, the perturbation measures. Poor correlations between self-assessment and acoustic and perceptual dimensions in the assessment of highly irregular voices argue for a multidimensional approach.

  10. Forensic Speaker Recognition: Law Enforcement and Counter-Terrorism

    CERN Document Server

    Patil, Hemant

    2012-01-01

    Forensic Speaker Recognition: Law Enforcement and Counter-Terrorism is an anthology of the research findings of 35 speaker recognition experts from around the world. The volume provides a multidimensional view of the complex science involved in determining whether a suspect’s voice truly matches forensic speech samples, collected by law enforcement and counter-terrorism agencies, that are associated with the commission of a terrorist act or other crimes. While addressing such topics as the challenges of forensic case work, handling speech signal degradation, analyzing features of speaker recognition to optimize voice verification system performance, and designing voice applications that meet the practical needs of law enforcement and counter-terrorism agencies, this material all sounds a common theme: how the rigors of forensic utility are demanding new levels of excellence in all aspects of speaker recognition. The contributors are among the most eminent scientists in speech engineering and signal process...

  11. Fiscal 1997 report on the introductory study. Human behavior recognition evaluation technology; 1997 nendo sentan kenkyu hokokusho. Ningen kodo ninchi hyoka gijutsu

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1998-03-01

    Attention was paid to the importance of human behavior technology for achieving true safety and comfort in daily life by adapting products and systems to all people. In view of the social and technological background, human behavior recognition evaluation technology avoids the economic and social losses caused by accidents and troubles and also provides a safe living environment for people, including the aged. Further, the gist of the project was proposed to make clear that it can guide product making from the viewpoint of developing products tailored to individuals (personal fit), and that it can create new added value while contributing to greater international competitiveness. Technology is developed for classifying concrete behavior patterns into types and accumulating them in forms usable at the time of product design and in emergencies, and for continuously measuring information on human behavior in daily life, on site and without restrictions. Supporting technology is developed for making the most of users' behavior information in product design and in emergencies. The spillover effects are also estimated. 76 refs., 13 figs., 9 tabs.

  12. Reaching out, inviting back: using Interactive voice response (IVR) technology to recycle relapsed smokers back to Quitline treatment – a randomized controlled trial

    Directory of Open Access Journals (Sweden)

    Carlini Beatriz H

    2012-07-01

    Full Text Available Abstract. Background: Tobacco dependence is a chronic, relapsing condition that typically requires multiple quit attempts and extended treatment. When offered the opportunity, relapsed smokers are interested in recycling back into treatment for a new, assisted quit attempt. This manuscript presents the results of a randomized controlled trial testing the efficacy of interactive voice response (IVR) in recycling low income smokers who had previously used quitline (QL) support back to QL support for a new quit attempt. Methods: A sample of 2985 previous QL callers were randomized to either receive IVR screening for current smoking (control group) or IVR screening plus an IVR intervention. The IVR intervention consists of automated questions to identify and address barriers to re-cycling in QL support, followed by an offer to be transferred to the QL and reinitiate treatment. Re-enrollment in QL services for both groups was documented. Results: The IVR system successfully reached 715 (23.9%) former QL participants. Of those, 27% (194/715) reported to the IVR system that they had quit smoking and were therefore excluded from the study and analysis. The trial’s final sample was composed of 521 current smokers. The re-enrollment rate was 3.3% for the control group and 28.2% for the intervention group (p …). Conclusion: Proactive IVR outreach is a promising tool to engage low income, relapsed smokers back into a new cycle of treatment. Integration of IVR intervention for recycling smokers with previous QL treatment has the potential to decrease tobacco-related disparities. Trial registration: ClinicalTrials.gov Identifier: NCT01260597

  13. Effects of a walking intervention using mobile technology and interactive voice response on serum adipokines among postmenopausal women at increased breast cancer risk

    Science.gov (United States)

    Llanos, Adana A.M.; Krok, Jessica L.; Peng, Juan; Pennell, Michael L.; Vitolins, Mara Z.; Degraffinreid, Cecilia R.; Paskett, Electra D.

    2014-01-01

    Practical methods to reduce the risk of obesity-related breast cancer among high-risk subgroups are lacking. Few studies have investigated the effects of exercise on circulating adipokines, which have been shown to be associated with obesity and breast cancer. The aim of this study was to examine the effects of a walking intervention on serum adiponectin, leptin and the adiponectin-to-leptin ratio (A/L). Seventy-one overweight and obese postmenopausal women at increased risk of developing breast cancer were stratified by BMI (25-30 kg/m2 or >30 kg/m2) and randomized to a 12-week, 2-arm walking intervention administered through interactive voice response (IVR) and mobile devices. The intervention arms were: IVR + coach and IVR + no coach condition. Pre-post changes in serum adiponectin, leptin and the A/L ratio were examined using mixed regression models, with ratio estimates (and 95% confidence intervals [CI]) corresponding to post-intervention adipokine concentrations relative to pre-intervention concentrations. While post-intervention effects included statistically significant improvements in anthropometric measures, the observed decreases in adiponectin and leptin (Ratio=0.86, 95% CI 0.74-1.01 and Ratio=0.94, 95% CI 0.87-1.01, respectively) and increase in A/L (Ratio=1.09, 95% CI 0.94-1.26) were not significant. Thus, these findings do not support significant effects of the walking intervention on circulating adipokines among overweight and obese postmenopausal women. Additional studies are essential to determine the most effective and practical lifestyle interventions that can promote beneficial modification of serum adipokine concentrations, which may prove useful for obesity-related breast cancer prevention. PMID:24435584

  14. Visual Guide Technology Based on Character Recognition and ROI

    Institute of Scientific and Technical Information of China (English)

    王帮元

    2015-01-01

    Characters are important information for identifying a product. Because the imaging quality of product surfaces varies, recognition accuracy often suffers when the target characters are unclear or background interference is strong. This paper therefore proposes a character recognition mechanism based on Emgucv and Tesseract to identify the characters on the surface film of a tablet PC. First, a webcam captures the character area of the tablet PC's surface film. Second, threshold segmentation of the acquired grayscale image yields a binary image containing the target. Third, morphological processing removes impurity interference and extracts the target features to obtain the region of interest (ROI). Finally, character recognition in the ROI is performed using the open-source Tesseract library. The image processing part of the system is implemented in C# with Emgucv; according to the character recognition results, a motion control card passes commands to the mechanism, which sorts the films and completes the visual guidance. Experimental tests of the system's performance indicate that this mechanism achieves better recognition than current character recognition technology.
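
    The threshold segmentation and morphological steps that lead to the ROI can be sketched in plain Python. The paper's system is written in C# with Emgucv and uses Tesseract for the final recognition step; neither is used here, and the 3x3 structuring element, the threshold level and the toy image are assumptions chosen for illustration.

```python
def threshold(gray, level):
    """Binarize a grayscale image (list of rows): 1 where pixel >= level, else 0."""
    return [[1 if px >= level else 0 for px in row] for row in gray]

def erode(binary):
    """3x3 erosion: a pixel stays 1 only if it and its 8 neighbours are all 1."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            if all(binary[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)):
                out[r][c] = 1
    return out

def dilate(binary):
    """3x3 dilation: a pixel becomes 1 if it or any in-bounds neighbour is 1."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w and binary[rr][cc]:
                        out[r][c] = 1
    return out

def bounding_box(binary):
    """Return (top, left, bottom, right) of all 1-pixels, or None if there are none."""
    points = [(r, c) for r, row in enumerate(binary) for c, v in enumerate(row) if v]
    if not points:
        return None
    rows = [r for r, _ in points]
    cols = [c for _, c in points]
    return (min(rows), min(cols), max(rows), max(cols))

# Toy 8x8 image: a bright 4x4 "character" block plus one isolated noise pixel.
image = [[0] * 8 for _ in range(8)]
for r in range(2, 6):
    for c in range(2, 6):
        image[r][c] = 200
image[0][7] = 255

binary = threshold(image, level=128)
opened = dilate(erode(binary))      # morphological opening removes the noise pixel
roi = bounding_box(opened)
print(roi)
```

    The resulting box is the region that would be cropped and passed to an OCR engine such as Tesseract.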

  15. April 16th : The World Voice Day

    NARCIS (Netherlands)

    Svec, Jan G.; Behlau, Mara

    2007-01-01

    Although the voice is used as an everyday basis of speech, most people realize its importance only when a voice problem arises. Increasing public awareness of the importance of the voice and alertness to voice problems are the main goals of the World Voice Day, which is celebrated yearly on April 16

  16. Risk factors for voice problems in teachers

    NARCIS (Netherlands)

    Kooijman, P. G. C.; de Jong, F. I. C. R. S.; Thomas, G.; Huinck, W.; Donders, R.; Graamans, K.; Schutte, H. K.

    2006-01-01

    In order to identify factors that are associated with voice problems and voice-related absenteeism in teachers, 1,878 questionnaires were analysed. The questionnaires inquired about personal data, voice complaints, voice-related absenteeism from work and conditions that may lead to voice complaints

  17. You're a What? Voice Actor

    Science.gov (United States)

    Liming, Drew

    2009-01-01

    This article talks about voice actors and features Tony Oliver, a professional voice actor. Voice actors help to bring one's favorite cartoon and video game characters to life. They also do voice-overs for radio and television commercials and movie trailers. These actors use the sound of their voice to sell a character's emotions--or an advertised…

  19. Risk factors for voice problems in teachers.

    NARCIS (Netherlands)

    Kooijman, P.G.C.; Jong, F.I.C.R.S. de; Thomas, G.; Huinck, W.J.; Donders, A.R.T.; Graamans, K.; Schutte, H.K.

    2006-01-01

    In order to identify factors that are associated with voice problems and voice-related absenteeism in teachers, 1,878 questionnaires were analysed. The questionnaires inquired about personal data, voice complaints, voice-related absenteeism from work and conditions that may lead to voice complaints

  20. Wireless Controlled Methods via Voice and Internet (e-mail) for Home Automation System

    Directory of Open Access Journals (Sweden)

    R.A.Ramlee

    2013-08-01

    This paper presents a wireless Home Automation System (HAS) that is operated mainly by computer. The system offers several control methods for the target electrical appliances, so that users' needs are met both at home and away. The computer application runs on Microsoft Windows and integrates speech recognition voice control through the Microsoft Speech Application Programming Interface (SAPI). Voice control is especially convenient for blind and paralyzed users at home. The system performs short-distance control over wireless Bluetooth and long-distance control through a Simple Mail Transfer Protocol (SMTP) email method. Short-distance control is control performed inside the house, whereas long-distance control can be performed anywhere from a device with internet access and a browser or email application. The system is intended to control home electrical appliances with a relatively low-cost design, a user-friendly interface and easy installation.

  1. Voice and GPS Based Navigation System For Visually Impaired

    Directory of Open Access Journals (Sweden)

    Harsha Gawari

    2014-04-01

    The paper presents the architecture and implementation of a system that helps visually impaired people navigate. The system combines GPS and voice recognition with obstacle avoidance. The visually impaired person issues a command by voice and receives directions as audio signals. Latitude and longitude values are received continuously from the GPS receiver, which uses the NMEA standard. An obstacle detector helps the user avoid obstacles by sending an audio message. Advances in voice recognition make it easier for visually impaired users to issue commands regarding directions.
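
    The abstract mentions that latitude and longitude arrive continuously from a GPS receiver speaking the NMEA standard. As a minimal sketch of what reading one such fix might look like, the Python below validates the checksum of an NMEA 0183 GGA sentence and converts its ddmm.mmmm fields to signed decimal degrees. The function names and the sample sentence are illustrative assumptions; nothing here is taken from the paper's implementation.

```python
def nmea_checksum(body: str) -> int:
    """XOR of all characters between '$' and '*', as NMEA 0183 defines it."""
    value = 0
    for ch in body:
        value ^= ord(ch)
    return value


def with_checksum(body: str) -> str:
    """Wrap a sentence body as '$<body>*<two hex digits>' (helper for examples)."""
    return f"${body}*{nmea_checksum(body):02X}"


def nmea_to_decimal(value: str, hemisphere: str) -> float:
    """Convert (d)ddmm.mmmm plus a hemisphere letter to signed decimal degrees."""
    dot = value.index(".")
    degrees = int(value[:dot - 2])      # everything before the two minute digits
    minutes = float(value[dot - 2:])    # mm.mmmm
    decimal = degrees + minutes / 60.0
    return -decimal if hemisphere in ("S", "W") else decimal


def parse_gga(sentence: str) -> tuple[float, float]:
    """Return (latitude, longitude) from a GGA sentence, or raise ValueError."""
    body, _, given = sentence.strip().lstrip("$").partition("*")
    if not given or nmea_checksum(body) != int(given, 16):
        raise ValueError("bad or missing NMEA checksum")
    fields = body.split(",")
    if not fields[0].endswith("GGA"):
        raise ValueError("not a GGA sentence")
    latitude = nmea_to_decimal(fields[2], fields[3])
    longitude = nmea_to_decimal(fields[4], fields[5])
    return latitude, longitude


if __name__ == "__main__":
    # Hypothetical sample sentence; the checksum is computed, not hard-coded.
    sample = with_checksum("GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,")
    print(parse_gga(sample))
```

    A real navigation aid would read such sentences from a serial port in a loop and pass each fix to its routing and audio components; those parts are outside this sketch.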

  2. Speech recognition in university classrooms

    OpenAIRE

    Wald, Mike; Bain, Keith; Basson, Sara H

    2002-01-01

    The LIBERATED LEARNING PROJECT (LLP) is an applied research project studying two core questions: 1) Can speech recognition (SR) technology successfully digitize lectures to display spoken words as text in university classrooms? 2) Can speech recognition technology be used successfully as an alternative to traditional classroom notetaking for persons with disabilities? This paper addresses these intriguing questions and explores the underlying complex relationship between speech recognition te...

  3. Speech recognition with amplitude and frequency modulations

    Science.gov (United States)

    Zeng, Fan-Gang; Nie, Kaibao; Stickney, Ginger S.; Kong, Ying-Yee; Vongphoe, Michael; Bhargave, Ashish; Wei, Chaogang; Cao, Keli

    2005-02-01

    Amplitude modulation (AM) and frequency modulation (FM) are commonly used in communication, but their relative contributions to speech recognition have not been fully explored. To bridge this gap, we derived slowly varying AM and FM from speech sounds and conducted listening tests using stimuli with different modulations in normal-hearing and cochlear-implant subjects. We found that although AM from a limited number of spectral bands may be sufficient for speech recognition in quiet, FM significantly enhances speech recognition in noise, as well as speaker and tone recognition. Additional speech reception threshold measures revealed that FM is particularly critical for speech recognition with a competing voice and is independent of spectral resolution and similarity. These results suggest that AM and FM provide independent yet complementary contributions to support robust speech recognition under realistic listening situations. Encoding FM may improve auditory scene analysis, cochlear-implant, and audiocoding performance. auditory analysis | cochlear implant | neural code | phase | scene analysis

  4. Keyboard With Voice Output

    Science.gov (United States)

    Huber, W. C.

    1986-01-01

    Voice synthesizer tells what key is about to be depressed. Verbal feedback useful for blind operators or where dim light prevents sighted operator from seeing keyboard. Also used where operator is busy observing other things while keying data into control system. Used as training aid for touch typing, and to train blind operators to use both standard and braille keyboards. Concept adapted to such equipment as typewriters, computers, calculators, telephones, cash registers, and on/off controls.

  5. Why Is My Voice Changing? (For Teens)

    Science.gov (United States)

    ... deeper than a girl's, though. What Causes My Voice to Change? At puberty, guys' bodies begin producing ...

  6. Common Problems That Can Affect Your Voice

    Science.gov (United States)

    ... that traditionally accompany gastroesophageal reflux disease (GERD). Voice Misuse and Overuse: Speaking is a physical task ...

  8. Secure voice for mobile satellite applications

    Science.gov (United States)

    Vaisnys, Arvydas; Berner, Jeff

    1990-01-01

    The initial system studies are described which were performed at JPL on secure voice for mobile satellite applications. Some options are examined for adapting existing Secure Telephone Unit III (STU-III) secure telephone equipment for use over a digital mobile satellite link, as well as for the evolution of a dedicated secure voice mobile earth terminal (MET). The work has included some lab and field testing of prototype equipment. The work is part of an ongoing study at JPL for the National Communications System (NCS) on the use of mobile satellites for emergency communications. The purpose of the overall task is to identify and enable the technologies which will allow the NCS to use mobile satellite services for its National Security Emergency Preparedness (NSEP) communications needs. Various other government agencies will also contribute to a mobile satellite user base, and for some of these, secure communications will be an essential feature.

  9. Pattern recognition

    CERN Document Server

    Theodoridis, Sergios

    2003-01-01

    Pattern recognition is a scientific discipline that is becoming increasingly important in the age of automation and information handling and retrieval. Pattern Recognition, 2e covers the entire spectrum of pattern recognition applications, from image analysis to speech recognition and communications. This book presents cutting-edge material on neural networks (a set of linked microprocessors that can form associations and use pattern recognition to "learn") and enhances student motivation by approaching pattern recognition from the designer's point of view. A direct result of more than 10

  10. Voice and silence in organizations

    Directory of Open Access Journals (Sweden)

    Moaşa, H.

    2011-01-01

    Unlike previous research on voice and silence, this article breaks the distance between the two and declines to treat them as opposites. Voice and silence are interrelated and intertwined strategic forms of communication which presuppose each other in such a way that the absence of one would completely minimize the other's presence. Social actors are not voice, or silence. Social actors can have voice or silence; they can do both because they operate at multiple levels and deal with multiple issues at different moments in time.

  11. VOICE REHABILITATION FOLLOWING TOTAL LARYNGECTOMY

    Directory of Open Access Journals (Sweden)

    Balasubramanian Thiagarajan

    2015-03-01

    Despite continuing advances in surgical management of laryngeal malignancy, total laryngectomy is still the treatment of choice in advanced laryngeal malignancies. Considering the longevity of the patient following total laryngectomy, various measures have been adopted in order to provide voice function to the patient. Significant advancements have taken place in voice rehabilitation of post-laryngectomy patients. Advancements in oncological surgical techniques and irradiation techniques have literally cured laryngeal malignancies. Among the various voice rehabilitation techniques available, TEP (tracheo-oesophageal puncture) is considered to be the gold standard. This article explores the various voice rehabilitation techniques available, with primary focus on TEP.

  12. Student Voice in the Mobile Phone Environment: A Grounded Theory Approach

    Science.gov (United States)

    Daher, Wajeeh

    2017-01-01

    Student voice is recently attracting educational researchers' attention for its influence on various aspects of student lives and futures, as well as social life in general. Mobile technologies are proliferating in social and practical life. This article studies student voice in carrying out outdoor activities with mobile phones. Thirty middle…

  13. ELearning Strategic Planning 2020: The Voice of Future Students as Stakeholders in Higher Education

    Science.gov (United States)

    Finger, Glenn; Smart, Vicky

    2013-01-01

    Most universities are undertaking information technology (IT) strategic planning. The development of those plans often includes the voices of academics and sometimes engages alumni and current students. However, few engage and acknowledge the voice of future students. This paper is situated within the "Griffith University 2020 Strategic…

  14. Multispectral Palmprint Recognition Using a Hybrid Feature

    CERN Document Server

    Mistani, Sina Akbari; Fatemizadeh, Emad

    2011-01-01

    Personal identification has been a major field of research in recent years. Biometrics-based technologies that exploit fingerprints, iris, face, voice and palmprints have been at the center of attention for solving this problem. Palmprints can be used instead of fingerprints, which were among the earliest of these biometric technologies. A palm is covered with the same skin as the fingertips but has a larger surface, giving more information than the fingertips. The major features of the palm are palm-lines, including principal lines, wrinkles and ridges. Using these lines is one of the most popular approaches to the palmprint recognition problem. Another robust feature is the wavelet energy of palms. In this paper, we use a hybrid of these two features. Moreover, multispectral analysis is applied to improve the performance of the system. The main steps of our approach are: extracting principal lines and computing the wavelet transform of the palm, computing block-based power of the resulting image...

  15. Speech emotion recognition based on statistical pitch model

    Institute of Scientific and Technical Information of China (English)

    WANG Zhiping; ZHAO Li; ZOU Cairong

    2006-01-01

    A modified Parzen-window method, which keeps high resolution at low frequencies and smoothness at high frequencies, is proposed to obtain a statistical model of pitch. A gender classification method using this statistical model is then proposed; it classifies gender with 98% accuracy on long sentences. After male and female voices are separated, the mean and standard deviation of the speech training samples for each emotion are used to create the corresponding emotion models. The Bhattacharyya distance between the test sample and the statistical pitch models is then used for emotion recognition in speech. Pitch normalization for male and female voices is also considered, in order to map them into a uniform space. Finally, a speech emotion recognition experiment based on K Nearest Neighbor achieves a correct rate of 81%, compared with only 73.85% when traditional parameters are used.
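
    To make the distance step concrete, here is a small sketch of the Bhattacharyya distance between two univariate Gaussian pitch models, each summarized by a mean and a standard deviation as the abstract describes. It assumes Gaussian models for simplicity; the paper's models come from a modified Parzen-window estimate, which is not reproduced here. The model names and pitch values are hypothetical and serve only to illustrate the nearest-model decision.

```python
import math


def bhattacharyya_gauss(mu1: float, sigma1: float, mu2: float, sigma2: float) -> float:
    """Bhattacharyya distance between N(mu1, sigma1^2) and N(mu2, sigma2^2)."""
    var_sum = sigma1 ** 2 + sigma2 ** 2
    mean_term = 0.25 * (mu1 - mu2) ** 2 / var_sum                      # separation of the means
    spread_term = 0.5 * math.log(var_sum / (2.0 * sigma1 * sigma2))    # mismatch of the spreads
    return mean_term + spread_term


def nearest_model(sample: tuple[float, float], models: dict[str, tuple[float, float]]) -> str:
    """Return the name of the model closest to the sample's (mean, std) pitch statistics."""
    mu, sigma = sample
    return min(models, key=lambda name: bhattacharyya_gauss(mu, sigma, *models[name]))


if __name__ == "__main__":
    # Hypothetical pitch statistics in Hz (mean, standard deviation) per emotion.
    emotion_models = {"neutral": (120.0, 15.0), "angry": (180.0, 40.0), "sad": (100.0, 10.0)}
    print(nearest_model((175.0, 35.0), emotion_models))
```

    The distance is zero for identical models and symmetric in its arguments, which makes it a convenient dissimilarity score for a nearest-neighbour style decision.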

  16. Recognition of Corn Seeds Based on Pattern Recognition and Near Infrared Spectroscopy Technology

    Institute of Scientific and Technical Information of China (English)

    刘天玲; 苏琪雅; 孙群; 杨丽明

    2012-01-01

    Pattern recognition technology and data mining methods have become a hot topic in chemometrics. Near infrared (NIR) spectroscopic analysis is widely used in spectrum signal processing and modeling because it is quick, simple and nondestructive. Based on five pattern recognition methods, namely locally linear embedding (LLE), wavelet transform (WT), principal component analysis (PCA), partial least squares (PLS) and support vector machine (SVM), a pattern recognition system for corn seeds was built using NIR technology and applied to the classification of 108 hybrid samples and 178 female-parent samples of corn seeds. First, LLE, WT, PCA or PLS is used to remove noise or reduce the dimension, and SVM then classifies the two classes of samples; 1-norm SVM, by contrast, classifies the samples directly. Experimental results for three different spectral regions show that PCA+SVM, LLE+SVM and PLS+SVM outperform WT+SVM and 1-norm SVM and achieve high classification accuracy, which indicates the feasibility and effectiveness of the proposed methods. This investigation also provides theoretical support and a practical method for recognizing corn seeds from near infrared spectral data.

  17. Recognition of Corn Seeds Based on Pattern Recognition and Near Infrared Spectroscopy Technology

    Institute of Scientific and Technical Information of China (English)

    刘天玲; 苏琪雅; 孙群; 杨丽明

    2012-01-01

    Pattern recognition technology and data mining methods have become a hot topic in chemometrics. Near infrared (NIR) spectroscopic analysis is widely used in spectrum signal processing and modeling because it is quick, simple and nondestructive. Based on five pattern recognition methods, namely locally linear embedding (LLE), wavelet transform (WT), principal component analysis (PCA), partial least squares (PLS) and support vector machine (SVM), a pattern recognition system for corn seeds is proposed using NIR technology and applied to the classification of 108 hybrid samples and 178 female-parent samples of corn seeds. First, LLE, WT, PCA or PLS is used to remove noise or reduce the dimension, and SVM then classifies the two classes of samples; 1-norm SVM, by contrast, classifies the samples directly. Numerical experiments over three different NIR spectral ranges show that PCA+SVM, LLE+SVM and PLS+SVM recognize the samples very well, while WT+SVM and 1-norm SVM also achieve fairly high classification accuracy. The results indicate the feasibility and effectiveness of the proposed methods and provide a theoretical basis and a practical method for seed recognition using near infrared spectroscopy and pattern recognition.

  18. The impact of voice on speech realization

    OpenAIRE

    Jelka Breznik

    2014-01-01

    The study discusses spoken literary language and the impact of voice on speech realization. The voice consists of a sound made by a human being using the vocal folds for talking, singing, laughing, crying, screaming… The human voice is specifically the part of human sound production in which the vocal folds (vocal cords) are the primary sound source. Our voice is our instrument and identity card. How does the voice (voice tone) affect others and how do they respond, positively or negatively? ...

  19. The Role of the Electronic Portfolio in Enhancing Information and Communication Technology and English Language Skills: The Voices of Six Malaysian Undergraduates

    Science.gov (United States)

    Thang, Siew Ming; Lee, Yit Sim; Zulkifli, Nurul Farhana

    2012-01-01

    This study investigated the effects of the construction and development of electronic portfolios (e-portfolios) on a small user population at a public university in Malaysia. The study was based on a three-month Information and Communication Technology (ICT) and language learning course offered to the undergraduates of the university. One of the…

  20. Recognition of Emerging Technology Trends. Class-selective study of citations in the U.S. Patent Citation Network

    CERN Document Server

    Bruck, Péter; Szente, Judit; Tobochnik, Jan; Érdi, Péter

    2016-01-01

    By adopting a citation-based recursive ranking method for patents, the evolution of new fields of technology can be traced. Specifically, it is demonstrated that laser/inkjet printer technology emerged from the recombination of two existing technologies: sequential printing and static image production. The dynamics of the citations coming from the different "precursor" classes illuminate the mechanism by which new fields emerge and make it possible to predict future technological development. For the patent network the optimal value of the PageRank damping factor is close to 0.5; applying d=0.85 leads to unacceptable ranking results.
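
    The role of the damping factor d can be illustrated with a minimal power-iteration sketch of standard PageRank on a toy citation graph. This is the textbook algorithm, not necessarily the exact recursive ranking used in the paper, and the five-patent graph below is a made-up example. Each patent passes a fraction d of its score to the patents it cites; the remainder, together with the score of patents that cite nothing, is spread uniformly.

```python
def pagerank(links: dict[str, list[str]], d: float = 0.5,
             tol: float = 1e-12, max_iter: int = 1000) -> dict[str, float]:
    """Standard PageRank by power iteration; links maps a node to the nodes it cites."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Score held by nodes without outgoing citations is redistributed uniformly.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new = {v: (1.0 - d) / n + d * dangling / n for v in nodes}
        for src, targets in links.items():
            for t in targets:
                new[t] += d * rank[src] / len(targets)
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


if __name__ == "__main__":
    # Hypothetical citation graph: newer patents (C, D, E) cite older ones (A, B).
    citations = {"A": [], "B": [], "C": ["A", "B"], "D": ["A"], "E": ["C", "A"]}
    for damping in (0.5, 0.85):
        scores = pagerank(citations, d=damping)
        print(damping, sorted(scores, key=scores.get, reverse=True))
```

    With a smaller d, less weight flows along long citation chains, which matches the intuition that the abstract's preferred value near 0.5 discounts indirect citations more strongly than the web-search default of 0.85.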

  1. Key Technologies in Speech Emotion Recognition

    Institute of Scientific and Technical Information of China (English)

    张雪英; 孙颖; 张卫; 畅江

    2015-01-01

    Emotional information in speech signals is an important information resource, and speech emotion recognition that relies solely on mathematical modeling and computation has proved inadequate. Emotion is a perceptual state toward people or things, expressed through the physiological and psychological changes that external stimuli trigger, so combining cognitive psychology with speech signal processing helps to handle emotional speech better. This paper first introduces the relation between speech emotion and human cognition and summarizes recent progress and results in the field, including the construction of emotion databases, the extraction of emotion features, and emotion recognition networks. It then introduces the application of a fuzzy cognitive map network, built on cognitive psychology, to emotional speech recognition. Next, it explores the mechanism by which the human brain perceives emotional speech and attempts to integrate event-related potentials into speech emotion recognition in order to improve recognition accuracy. These ideas offer a conception and outlook for the future cross-disciplinary development of emotional speech recognition and cognitive psychology.

  2. Handbook of Face Recognition

    CERN Document Server

    Li, Stan Z

    2011-01-01

    This highly anticipated new edition provides a comprehensive account of face recognition research and technology, spanning the full range of topics needed for designing operational face recognition systems. After a thorough introductory chapter, each of the following chapters focuses on a specific topic, reviewing background information, up-to-date techniques, and recent results, as well as offering challenges and future directions. Features: fully updated, revised and expanded, covering the entire spectrum of concepts, methods, and algorithms for automated face detection and recognition systems

  3. The Voice Handicap Index with Post-Laryngectomy Male Voices

    Science.gov (United States)

    Evans, Eryl; Carding, Paul; Drinnan, Michael

    2009-01-01

    Background: Surgical treatment for advanced laryngeal cancer involves complete removal of the larynx ("laryngectomy") and initial total loss of voice. Post-laryngectomy rehabilitation involves implementation of different means of "voicing" for these patients wherever possible. There is little information about laryngectomees'…

  4. Pedagogic Voice: Student Voice in Teaching and Engagement Pedagogies

    Science.gov (United States)

    Baroutsis, Aspa; McGregor, Glenda; Mills, Martin

    2016-01-01

    In this paper, we are concerned with the notion of "pedagogic voice" as it relates to the presence of student "voice" in teaching, learning and curriculum matters at an alternative, or second chance, school in Australia. This school draws upon many of the principles of democratic schooling via its utilisation of student voice…

  5. Low Impedance Voice Coils for Improved Loudspeaker Efficiency

    DEFF Research Database (Denmark)

    Iversen, Niels Elkjær; Knott, Arnold; Andersen, Michael A. E.

    2015-01-01

    In modern audio systems utilizing switch-mode amplifiers, the total efficiency is dominated by the rather poor efficiency of the loudspeaker. For decades voice coils have been designed so that nominal resistances of 4 to 8 Ohms are obtained, even though modern audio amplifiers using switch-mode technology can be designed for much lower loads. A thorough analysis of the loudspeaker efficiency is presented and its relation to the voice coil fill factor is described. A new parameter, the driver's mass ratio, is introduced; it indicates how much a fill factor optimization will improve a driver's efficiency. Different voice coil winding layouts are described and their fill factors analyzed. It is found that by lowering the nominal resistance of a voice coil, using rectangular wire, one can increase the fill factor. Three voice coils are designed for a standard 10” woofer and corresponding frequency...

  6. Chord Recognition Based on Temporal Correlation Support Vector Machine

    OpenAIRE

    Zhongyang Rao; Xin Guan; Jianfu Teng

    2016-01-01

    In this paper, we propose a method called temporal correlation support vector machine (TCSVM) for automatic major-minor chord recognition in audio music. We first use robust principal component analysis to separate the singing voice from the music to reduce the influence of the singing voice and consider the temporal correlations of the chord features. Using robust principal component analysis, we expect the low-rank component of the spectrogram matrix to contain the musical accompaniment and...

  7. Performance evaluation of UHF RFID technologies for real-time bus recognition in the Taipei Bus Station.

    Science.gov (United States)

    Own, Chung-Ming; Lee, Da-Sheng; Wang, Ti-Ho; Wang, De-Jun; Ting, Yu-Lun

    2013-06-18

    Transport stations such as airports, ports, and railways have adopted blocked-type pathway management to process and control travel systems in a one-directional manner. However, this excludes highway transportation where large buses have great variability and mobility; thus, an instant influx of numerous buses increases risks and complicates station management. Focusing on Taipei Bus Station, this study employed RFID technology to develop a system platform integrated with modern information technology that has numerous characteristics. This modern information technology comprised the following systems: ultra-high frequency (UHF) radio-frequency identification (RFID), ultrasound and license number identification, and backstage graphic controls. In conclusion, the system enabled management, bus companies, and passengers to experience the national bus station's new generation technology, which provides diverse information and synchronization functions. Furthermore, this technology reached a new milestone in the energy-saving and efficiency-increasing performance of Taiwan's buses.

  8. ALPHABET SIGN LANGUAGE RECOGNITION USING LEAP MOTION TECHNOLOGY AND RULE BASED BACKPROPAGATION-GENETIC ALGORITHM NEURAL NETWORK (RBBPGANN)

    Directory of Open Access Journals (Sweden)

    Wijayanti Nurul Khotimah

    2017-01-01

    Sign language recognition helps people with normal hearing communicate effectively with the deaf and hearing-impaired. According to a survey conducted by the Multi-Center Study in Southeast Asia, Indonesia ranked in the top four for the number of patients with hearing disability (4.6%). Sign language recognition is therefore important. Some research has been conducted in this field, and many types of neural network have been used to recognize many kinds of sign languages, but their performance needs to be improved. This work focuses on the ASL (Alphabet Sign Language) in SIBI (Sign System of Indonesian Language), which uses one hand and 26 gestures. Thirty-four features were extracted using Leap Motion. A new method, Rule Based-Backpropagation Genetic Algorithm Neural Network (RB-BPGANN), was then used to recognize these signs. This method combines rules with a Back Propagation Neural Network (BPGANN). In experiments, the proposed application recognized sign language with up to 93.8% accuracy. It performed well on large multiclass instances and can be a solution to the overfitting problem in neural network algorithms.

  9. The Application of Gesture Recognition Technology in the Internet of Things Courses

    Institute of Scientific and Technical Information of China (English)

    薛莹

    2016-01-01

    Human gesture is a natural and intuitive mode of communication. The latest Kinect device offers a new mode of human-computer interaction that can capture, track and interpret body movements, gestures and voice. This paper uses Kinect for human gesture recognition to provide contactless interaction in the day-to-day teaching of Internet of Things courses.

  10. Cloud Payment: Mobile Digital Wallets of Face Recognition Technology

    Institute of Scientific and Technical Information of China (English)

    舒晓苓

    2015-01-01

    This article starts from the security problems of traditional payment and early mobile payment. With the rapid development of mobile devices, traditional payment has gradually given way to mobile payment, which gave rise to the mobile digital wallet. The approach stores the important financial information a mobile digital wallet needs, such as bank account and credit card numbers, together with the face authentication data in the cloud, avoiding the security risks caused by losing the mobile device. Face recognition is bound to an identity ID to identify each user one-to-one, which greatly reduces the risk of the wallet's financial information being compromised. Experimental tests show that the face recognition method described achieves real-time face detection and recognition.

  11. Voice and Speech after Laryngectomy

    Science.gov (United States)

    Stajner-Katusic, Smiljka; Horga, Damir; Musura, Maja; Globlek, Dubravka

    2006-01-01

    The aim of the investigation is to compare voice and speech quality in alaryngeal patients using esophageal speech (ESOP, eight subjects), electroacoustical speech aid (EACA, six subjects) and tracheoesophageal voice prosthesis (TEVP, three subjects). The subjects reading a short story were recorded in the sound-proof booth and the speech samples…

  12. Voice Quality of Psychological Origin

    Science.gov (United States)

    Teixeira, Antonio; Nunes, Ana; Coimbra, Rosa Lidia; Lima, Rosa; Moutinho, Lurdes

    2008-01-01

    Variations in voice quality are essentially related to modifications of the glottal source parameters, such as F0, jitter, and shimmer. Voice quality is affected by prosody, emotional state, and vocal pathologies. Psychogenic vocal pathology is particularly interesting. In the present case study, the speaker naturally presented a…

  13. Voice handicap index in Swedish.

    Science.gov (United States)

    Ohlsson, Ann-Christine; Dotevall, Hans

    2009-01-01

    The objective of this study was to evaluate a Swedish version of the voice handicap index questionnaire (Sw-VHI). A total of 57 adult, dysphonic patients and 15 healthy controls completed the Sw-VHI and rated the degree of vocal fatigue and hoarseness on visual analogue scales. A perceptual voice evaluation was also performed. Test-retest reliability was analyzed in 38 subjects without voice complaints. Sw-VHI distinguished between dysphonic subjects and controls (P 0.84) and test-retest reliability (intraclass correlation coefficient >0.75) were good. Only moderate or weak correlations were found between Sw-VHI and the subjective and perceptual voice ratings. The data indicate that a difference above 13 points for the total Sw-VHI score and above 6 points for the Sw-VHI subscales is significant for an individual when comparing two different occasions. In conclusion, the Sw-VHI appears to be a robust instrument for assessment of the psycho-social impact of a voice disorder. However, Sw-VHI seems to, at least partly, capture different aspects of voice function to the subjective voice ratings and the perceptual voice evaluation.

  14. Enhancing Author's Voice through Scripting

    Science.gov (United States)

    Young, Chase J.; Rasinski, Timothy V.

    2011-01-01

    The authors suggest using scripting as a strategy to mentor and enhance author's voice in writing. Through gradual release, students use authentic literature as a model for writing with voice. The authors also propose possible extensions for independent practice, integration across content areas, and tips for evaluation.

  15. Voices in History

    Directory of Open Access Journals (Sweden)

    Ivan Leudar

    2001-06-01

    Full Text Available Experiences of “hearing voices” nowadays usually count as verbal hallucinations and they indicate serious mental illness. Some are first rank symptoms of schizophrenia, and the mass media, at least in Britain, tend to present them as antecedents of impulsive violence. They are, however, also found in other psychiatric conditions and epidemiological surveys reveal that even individuals with no need of psychiatric help can hear voices, sometimes following bereavement or abuse, but sometimes for no discernible reason. So do these experiences necessarily mean insanity and violence, and must they be thought of as pathogenic hallucinations; or are there other ways to understand them and live with them, and with what consequences? One way to make our thinking more flexible is to turn to history. We find that hearing voices was always an enigmatic experience, and the people who had it were rare. The gallery of voice hearers is, though, distinguished and it includes Galilei, Bunyan and St Teresa. Socrates heard a daemon who guided his actions, but in his time this did not signify madness, nor was it described as a hallucination. Yet in 19th century French psychological medicine the daemon became a hallucination and Socrates was retrospectively diagnosed as mentally ill. This paper examines the controversies which surrounded the experience at different points in history as well as the practice of retrospective psychiatry. The conclusion reached on the basis of the historical materials is that the experience and the ontological status it is ascribed are not trans-cultural or trans-historic but situated both in history and in the contemporary conflicts.

  16. Facing Sound - Voicing Art

    DEFF Research Database (Denmark)

    Lønstrup, Ansa

    2013-01-01

    This article is based on examples of contemporary audiovisual art, with a special focus on the Tony Oursler exhibition Face to Face at Aarhus Art Museum ARoS in Denmark in March-July 2012. My investigation involves a combination of qualitative interviews with visitors, observations of the audience's interactions with the exhibition and the artwork in the museum space and short analyses of individual works of art based on reception aesthetics and phenomenology and inspired by newer writings on sound, voice and listening.

  17. Effects on vocal range and voice quality of singing voice training: the classically trained female voice.

    Science.gov (United States)

    Pabon, Peter; Stallinga, Rob; Södersten, Maria; Ternström, Sten

    2014-01-01

    A longitudinal study was performed on the acoustical effects of singing voice training under a given study program, using the voice range profile (VRP). Pretraining and posttraining recordings were made of students who participated in a 3-year bachelor singing study program. A questionnaire that included questions on optimal range, register use, classification, vocal health and hygiene, mixing technique, and training goals was used to rate and categorize self-assessed voice changes. Based on the responses, a subgroup of 10 classically trained female voices was selected, which was homogeneous enough for effects of training to be identified. The VRP perimeter contour was analyzed for effects of voice training. Also, a mapping within the VRP of voice quality, as expressed by the crest factor, was used to indicate the register boundaries and to monitor the acoustical consequences of the newly learned vocal technique of "mixed voice." VRPs were averaged across subjects. Findings were compared with the self-assessed vocal changes. Pre/post comparison of the average VRPs showed, in the midrange, (1) a decrease in the VRP area that was associated with the loud chest voice, (2) a reduction of the crest factor values, and (3) a reduction of maximum sound pressure level values. The students' self-evaluations of the voice changes appeared in some cases to contradict the VRP findings. VRPs of individual voices were seen to change over the course of a singing education. These changes were manifest also in the average group. High-resolution computerized recording, complemented with an acoustic register marker, allows a meaningful assessment of some effects of training, on an individual basis and for groups that comprise singers of a specific genre. It is argued that this kind of investigation is possible only within a focused training program, given by a faculty who has agreed on the goals. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  18. Gesture Recognition Method Based on Computer Vision Technology

    Institute of Scientific and Technical Information of China (English)

    陈冰超; 李永刚

    2014-01-01

    This paper designs a gesture recognition method based on computer vision technology. Gesture images are extracted from video and pre-processed; image segmentation and feature extraction follow, and the result is matched against an established gesture model library to obtain the final recognition result. Experiments show that the system can identify simple human-computer interaction gestures with good accuracy and stability.

  19. Questioning Photovoice Research: Whose Voice?

    Science.gov (United States)

    Evans-Agnew, Robin A; Rosemberg, Marie-Anne S

    2016-07-01

    Photovoice is an important participatory research tool for advancing health equity. Our purpose is to critically review how participant voice is promoted through the photovoice process of taking and discussing photos and adding text/captions. PubMed, Scopus, PsycINFO, and Web of Science databases were searched from the years 2008 to 2014 using the keywords photovoice, photonovella, photovoice and social justice, and photovoice and participatory action research. Research articles were reviewed for how participant voice was (a) analyzed, (b) exhibited in community forums, and (c) disseminated through published manuscripts. Of 21 studies, 13 described participant voice in the data analysis, 14 described participants' control over exhibiting photo-texts, seven manuscripts included a comprehensive set of photo-texts, and none described participant input on choice of manuscript photo-texts. Photovoice designs vary in the advancement of participant voice, with the least advancement occurring in manuscript publication. Future photovoice researchers should expand approaches to advancing participant voice.

  20. Voice quality of psychological origin.

    Science.gov (United States)

    Teixeira, Antonio; Nunes, Ana; Coimbra, Rosa Lídia; Lima, Rosa; Moutinho, Lurdes

    2008-01-01

    Variations in voice quality are essentially related to modifications of the glottal source parameters, such as: F0, jitter, and shimmer. Voice quality is affected by prosody, emotional state, and vocal pathologies. Psychogenic vocal pathology is particularly interesting. In the present case study, the speaker naturally presented a ventricular band voice whereas in a controlled production he was able to use a more normal phonation process. A small corpus was recorded which included sustained vowels and short sentences in both registers. A normal speaker was also recorded in similar tasks. Annotation and extraction of parameters were made using Praat's voice report function. Application of the Hoarseness Diagram to sustained productions situates this case in the pseudo-glottic phonation region. Analysis of several different parameters related to F0, jitter, shimmer, and harmonicity revealed that the speaker with psychogenic voice was capable of controlling certain parameters (e.g. F0 maximum) but was unable to correct others such as shimmer.

  1. VOICE OVER INTERNET PROTOCOL (VOIP): FUTURE POTENTIAL

    Directory of Open Access Journals (Sweden)

    Deepti Kumari

    2015-10-01

    Full Text Available VoIP (voice over IP) delivers standard voice over telephone services over Internet Protocol (IP). VoIP is the technology of digitizing sound, compressing it, breaking it up into data packets, and sending it over an IP (Internet Protocol) network where it is reassembled, decompressed, and converted back into an analog wave form. Gateways are the key component required to facilitate IP Telephony. A gateway is used to bridge the traditional circuit switched PSTN with the packet switched Internet. The paper covers software, hardware and protocol requirements followed by weighing the VoIP advantages such as low cost, portability, free and advanced features, bandwidth efficiency, call recording and monitoring against the VoIP disadvantages such as power dependency, quality of voice and service, security, and reliability. With ever increasing internet penetration and better broadband connectivity, VoIP is going to expand further with businesses already using VoIP standalone or in a hybrid format, although our focus and scope here remains VoIP. Mobile VoIP, an infant with less than 4% market share, has so far been focusing on increasing active subscriptions without a sustainable revenue model, but has the potential and is going to see tussle with static VoIP for space in days ahead.
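
    The digitize, compress, packetize, reassemble and decompress chain described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: it uses the continuous mu-law companding formula with mu = 255 (real codecs such as G.711 use a segmented approximation), and the packet layout (sequence number, payload) and all function names are hypothetical rather than any real VoIP stack's API.

```python
import math
import random

MU = 255  # companding constant (assumption for this sketch)


def mu_law_encode(x):
    """Compress one sample in [-1, 1] into a single byte value (0..255)."""
    x = max(-1.0, min(1.0, x))
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((y + 1) / 2 * 255))


def mu_law_decode(b):
    """Expand one byte value back into an approximate sample in [-1, 1]."""
    y = b / 255 * 2 - 1
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def packetize(payload, size):
    """Split the byte stream into (sequence number, chunk) packets."""
    return [(seq, payload[i:i + size])
            for seq, i in enumerate(range(0, len(payload), size))]


def reassemble(packets):
    """Restore the byte stream by ordering packets on their sequence number."""
    return b"".join(chunk for _, chunk in sorted(packets, key=lambda p: p[0]))


# 20 ms of a 440 Hz tone sampled at 8 kHz stands in for digitized speech.
samples = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(160)]
encoded = bytes(mu_law_encode(s) for s in samples)

packets = packetize(encoded, 40)
random.Random(0).shuffle(packets)          # simulate out-of-order arrival

decoded = [mu_law_decode(b) for b in reassemble(packets)]
max_err = max(abs(a - b) for a, b in zip(samples, decoded))
print(len(packets), "packets, max reconstruction error", round(max_err, 4))
```

    The sequence number is what lets the receiver restore order; a real system would also need timestamps, jitter buffering and loss concealment, which this sketch omits.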

  2. Voice over Internet Protocol (VOIP): Future Potential

    Directory of Open Access Journals (Sweden)

    Ms. Deepti

    2014-11-01

    Full Text Available VoIP (voice over IP) delivers standard voice over telephone services over Internet Protocol (IP). VoIP is the technology of digitizing sound, compressing it, breaking it up into data packets, and sending it over an IP (Internet Protocol) network where it is reassembled, decompressed, and converted back into an analog wave form. Gateways are the key component required to facilitate IP Telephony. A gateway is used to bridge the traditional circuit switched PSTN with the packet switched Internet. The paper covers software, hardware and protocol requirements followed by weighing the VoIP advantages such as low cost, portability, free and advanced features, bandwidth efficiency, call recording and monitoring against the VoIP disadvantages such as power dependency, quality of voice and service, security, and reliability. With ever increasing internet penetration and better broadband connectivity, VoIP is going to expand further with businesses already using VoIP standalone or in a hybrid format, although our focus and scope here remains VoIP. Mobile VoIP, an infant with less than 4% market share, has so far been focusing on increasing active subscriptions without a sustainable revenue model, but has the potential and is going to see tussle with static VoIP for space in days ahead.

  3. Muscular tension and body posture in relation to voice handicap and voice quality in teachers with persistent voice complaints.

    NARCIS (Netherlands)

    Kooijman, P.G.C.; Jong, F.I.C.R.S. de; Oudes, M.J.; Huinck, W.J.; Acht, H. van; Graamans, K.

    2005-01-01

    The aim of this study was to investigate the relationship between extrinsic laryngeal muscular hypertonicity and deviant body posture on the one hand and voice handicap and voice quality on the other hand in teachers with persistent voice complaints and a history of voice-related absenteeism. The st

  4. Objective voice and speech analysis of persons with chronic hoarseness by prosodic analysis of speech samples.

    Science.gov (United States)

    Haderlein, Tino; Döllinger, Michael; Matoušek, Václav; Nöth, Elmar

    2016-10-01

    Automatic voice assessment is often performed using sustained vowels. In contrast, speech analysis of read-out texts can be applied to voice and speech assessment. Automatic speech recognition and prosodic analysis were used to find regression formulae between automatic and perceptual assessment of four voice and four speech criteria. The regression was trained with 21 men and 62 women (average age 49.2 years) and tested with another set of 24 men and 49 women (48.3 years), all suffering from chronic hoarseness. They read the text 'Der Nordwind und die Sonne' ('The North Wind and the Sun'). Five voice and speech therapists evaluated the data on 5-point Likert scales. Ten prosodic and recognition accuracy measures (features) were identified which describe all the examined criteria. Inter-rater correlation within the expert group was between r = 0.63 for the criterion 'match of breath and sense units' and r = 0.87 for the overall voice quality. Human-machine correlation was between r = 0.40 for the match of breath and sense units and r = 0.82 for intelligibility. The perceptual ratings of different criteria were highly correlated with each other. Likewise, the feature sets modeling the criteria were very similar. The automatic method is suitable for assessing chronic hoarseness in general and for subgroups of functional and organic dysphonia. In its current version, it is almost as reliable as a randomly picked rater from a group of voice and speech therapists.

  5. Commercial applications of speech interface technology: an industry at the threshold.

    Science.gov (United States)

    Oberteuffer, J A

    1995-10-24

    Speech interface technology, which includes automatic speech recognition, synthetic speech, and natural language processing, is beginning to have a significant impact on business and personal computer use. Today, powerful and inexpensive microprocessors and improved algorithms are driving commercial applications in computer command, consumer, data entry, speech-to-text, telephone, and voice verification. Robust speaker-independent recognition systems for command and navigation in personal computers are now available; telephone-based transaction and database inquiry systems using both speech synthesis and recognition are coming into use. Large-vocabulary speech interface systems for document creation and read-aloud proofing are expanding beyond niche markets. Today's applications represent a small preview of a rich future for speech interface technology that will eventually replace keyboards with microphones and loud-speakers to give easy accessibility to increasingly intelligent machines.

  6. Bodies, Spaces, Voices, Silences

    Directory of Open Access Journals (Sweden)

    Donatella Mazzoleni

    2013-07-01

    Full Text Available Good architecture should not only provide functional, formal and technical quality for urban spaces, but also let the voice of the city be perceived, listened to and enjoyed. Every city has its specific sound identity, or “ISO” (R. O. Benenzon), made up of a complex texture of background noises and fluctuating sound figures that emerge and disappear in a game of continuous fadings. For instance, the ISO of Naples is characterized by a widespread need to hear the sound of one’s own and others’ voices returned, and by a hatred of silence. Cities may fall ill: illness from noise, within overcrowded neighbourhoods, or illness from silence, in the forced isolation of the peripheries. The proposal of an urban music therapy denotes a novel, broadened interdisciplinary research path, where architecture, music, medicine, psychology and communication science may converge to rebalance the spaces and relational life of the urban community, through care for its bodily and sonic dimensions.

  7. Use of portable digital media players increases patient motivation and practice in voice therapy.

    Science.gov (United States)

    van Leer, Eva; Connor, Nadine P

    2012-07-01

    There are many documented barriers to successful adherence to voice therapy. However, methods for facilitating adherence are not well understood. The purpose of this study was to determine if patient adherence and motivation for practice could be improved by providing patients with practice support between sessions using mobile treatment videos. Thirteen voice therapy participants were provided with portable media players containing videos of voice exercises exemplified by their therapists and themselves. A randomized crossover design of two conditions was used: (1) standard of care voice therapy where participants were provided with written homework descriptions; and (2) video-enhanced voice therapy where participants received a portable digital media player with clinician and self-videos. The duration of each condition was 1 week. Self-report measures of practice frequency and aspects of motivation were obtained at the end of each session. Practice of voice exercises was significantly greater in the video-enhanced voice therapy condition than in the standard of care "written" condition (Pdigital media players in voice therapy for individuals who are comfortable using such technology. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  8. Recent progress in fingerprint recognition

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    Fingerprint recognition is increasingly used for personal identification in everyday civilian life, for example in ID cards and fingerprint-protected hard disks. Great improvement has been achieved in on-line fingerprint sensing technology and automatic fingerprint recognition algorithms. Various fingerprint recognition techniques, including fingerprint acquisition, classification, enhancement and matching, have improved considerably. This paper reviews recent advances in fingerprint recognition and summarizes the algorithms proposed for each step, with a special focus on the enhancement of low-quality fingerprints and the matching of distorted fingerprint images. Both issues are believed to be significant and challenging tasks. In addition, we discuss the common evaluation of fingerprint recognition algorithms in the Fingerprint Verification Competition 2004 (FVC2004) and the Fingerprint Vendor Technology Evaluation 2003 (FpVTE2003), on the basis of which the performance of a recognition algorithm can be measured objectively and uniformly.

  9. Crossing Cultures with Multi-Voiced Journals

    Science.gov (United States)

    Styslinger, Mary E.; Whisenant, Alison

    2004-01-01

    In this article, the authors discuss the benefits of using multi-voiced journals as a teaching strategy in reading instruction. Multi-voiced journals, an adaptation of dual-voiced journals, encourage responses to reading in varied, cultured voices of characters. It is similar to reading journals in that they prod students to connect to the lives…

  10. Pattern Recognition Methods and Features Selection for Speech Emotion Recognition System

    Science.gov (United States)

    Partila, Pavol; Voznak, Miroslav; Tovarek, Jaromir

    2015-01-01

    The impact of the classification method and feature selection on speech emotion recognition accuracy is discussed in this paper. Selecting the correct parameters in combination with the classifier is an important part of reducing the computational complexity of the system. This step is especially necessary for systems that will be deployed in real-time applications. The development and improvement of speech emotion recognition systems is motivated by their wide usability in today's automatic voice-controlled systems. The Berlin database of emotional recordings was used in this experiment. Classification accuracy of artificial neural networks, k-nearest neighbours, and Gaussian mixture models is measured considering the selection of prosodic, spectral, and voice quality features. The purpose was to find an optimal combination of methods and group of features for stress detection in human speech. The research contribution lies in the design of the speech emotion recognition system with regard to its accuracy and efficiency. PMID:26346654
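
    One of the classifiers compared in this abstract, k-nearest neighbours, can be sketched in a few lines. The sketch below is illustrative only: the two features (assumed to be already normalized to the range 0 to 1), the class labels and the training points are hypothetical and do not come from the Berlin database or from the authors' system.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training vectors closest to query.

    train is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical normalized features: (mean pitch, mean energy).
train = [
    ((0.90, 0.80), "anger"),
    ((0.80, 0.90), "anger"),
    ((0.85, 0.70), "anger"),
    ((0.20, 0.30), "neutral"),
    ((0.30, 0.20), "neutral"),
    ((0.25, 0.25), "neutral"),
]

print(knn_predict(train, (0.80, 0.80)))   # query close to the "anger" cluster
print(knn_predict(train, (0.30, 0.30)))   # query close to the "neutral" cluster
```

    Because k-NN relies on raw distances, features on different scales should be normalized first; which features are included determines both accuracy and computational cost.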

  11. Pattern Recognition Methods and Features Selection for Speech Emotion Recognition System

    Directory of Open Access Journals (Sweden)

    Pavol Partila

    2015-01-01

    Full Text Available The impact of the classification method and feature selection on speech emotion recognition accuracy is discussed in this paper. Selecting the correct parameters in combination with the classifier is an important part of reducing the computational complexity of the system. This step is especially necessary for systems that will be deployed in real-time applications. The development and improvement of speech emotion recognition systems is motivated by their wide usability in today's automatic voice-controlled systems. The Berlin database of emotional recordings was used in this experiment. Classification accuracy of artificial neural networks, k-nearest neighbours, and Gaussian mixture models is measured considering the selection of prosodic, spectral, and voice quality features. The purpose was to find an optimal combination of methods and group of features for stress detection in human speech. The research contribution lies in the design of the speech emotion recognition system with regard to its accuracy and efficiency.

  12. A lesson in listening: Is the student voice heard in the rush to ...

    African Journals Online (AJOL)

    African Journal of Health Professions Education ... the call to incorporate technology in teaching and learning in higher education is increasing. The student voice in the planning and implementation of blended learning strategies is, however, ...

  13. Coaching Academic English through Voice and Text Production Models

    Science.gov (United States)

    Greenman, Caroline

    2004-01-01

    We report on how technological developments have enabled us to change our concepts and practices regarding voice and text coaching and how this in turn has raised the level of literary competence among non-native doctoral students seeking publication in English in scientific journals. We describe models for marking, peer reviewing and coaching…

  14. Image/Music/Voice: Song Dubbing in Hollywood Musicals.

    Science.gov (United States)

    Siefert, Marsha

    1995-01-01

    Uses the practice of song dubbing in the Hollywood film musical to explore the implications and consequences of the singing voice for imaging practices in the 1930s through 1960s. Discusses the ideological, technological, and socioeconomic basis for song dubbing. Discusses gender, race, and ethnicity patterns of image-sound practices. (SR)

  15. Mobile user experience for voice services: A theoretical framework

    CSIR Research Space (South Africa)

    Botha, Adèle

    2012-02-01

    Full Text Available The purpose of this paper is to provide a “Mobile User Experience Framework for Voice services.” The rapid spread of mobile cellular technology within Africa has made it a prime vehicle for accessing services and content. The challenge remains...

  16. A Conjoint Analysis of Voice Over IP Attributes.

    Science.gov (United States)

    Zubey, Michael L.; Wagner, William; Otto, James R.

    2002-01-01

    Managers need to understand the tradeoffs associated with voice over Internet protocol (VoIP) networks as compared to the Public Switched Telephone Network (PSTN). This article measures the preference structures between IP telephony and PSTN services using conjoint analysis. The purpose is to suggest VoIP technology attributes that best meet…

  17. Study on Digital Watermarking Technology of Image Recognition Technology

    Institute of Scientific and Technical Information of China (English)

    王晶

    2015-01-01

    With the development of science and technology, digital information technology is advancing rapidly; the popularity of the Internet in particular has led to unprecedented access to information, and information security has become one of today's hot topics. Digital image watermarking is a new anti-piracy technology developed from the theory of information hiding; it uses digital image processing to hide information and plays an important role in copyright protection for digital works. This paper mainly discusses the current state of development and the main applications of digital watermarking technology, along with its practical applications and future development directions.

  18. Lexical frequency and voice assimilation.

    Science.gov (United States)

    Ernestus, Mirjam; Lahey, Mybeth; Verhees, Femke; Baayen, R Harald

    2006-08-01

    Acoustic duration and degree of vowel reduction are known to correlate with a word's frequency of occurrence. The present study broadens the research on the role of frequency in speech production to voice assimilation. The test case was regressive voice assimilation in Dutch. Clusters from a corpus of read speech were more often perceived as unassimilated in lower-frequency words and as either completely voiced (regressive assimilation) or, unexpectedly, as completely voiceless (progressive assimilation) in higher-frequency words. Frequency did not predict the voice classifications over and above important acoustic cues to voicing, suggesting that the frequency effects on the classifications were carried exclusively by the acoustic signal. The duration of the cluster and the period of glottal vibration during the cluster decreased while the duration of the release noises increased with frequency. This indicates that speakers reduce articulatory effort for higher-frequency words, with some acoustic cues signaling more voicing and others less voicing. A higher frequency leads not only to acoustic reduction but also to more assimilation.

  19. Voice over IP in Wireless Heterogeneous Networks

    DEFF Research Database (Denmark)

    Fathi, Hanane; Chakraborty, Shyam; Prasad, Ramjee

    The convergence of different types of traffic has preceded the convergence of systems and services in a wireless heterogeneous network. Voice and data traffic are usually treated separately in both 2G and 2.5G wireless networks. With advances in packet switching technology and especially......IP communications are difficult to achieve in a time-varying environment due to channel errors and traffic congestion and across different systems. The provision of VoIP in wireless heterogeneous networks requires a set of time-efficient control mechanisms to support a VoIP session with acceptable quality...

  20. Pattern Recognition Technology Based on Continuous Hidden Markov Model and its Application

    Institute of Scientific and Technical Information of China (English)

    刘伯高

    2015-01-01

    This paper systematically studies a specific speech recognition algorithm that uses a genetic algorithm to train a continuous hidden Markov model. Based on this speech recognition technology, a voiceprint recognition system for community correction subjects was then designed in detail for the Shenzhen City Bureau of Justice. Results from the system in operation show that the recognition algorithm, which uses a genetic algorithm to train the continuous hidden Markov model, is fast and achieves a high recognition rate. The construction of pattern-recognition-based voiceprint recognition systems for judicial community correction is still at an early stage in China's judicial system, and promoting and building such systems has important practical significance.

  1. Facial, vocal and musical emotion recognition is altered in paranoid schizophrenic patients.

    Science.gov (United States)

    Weisgerber, Anne; Vermeulen, Nicolas; Peretz, Isabelle; Samson, Séverine; Philippot, Pierre; Maurage, Pierre; De Graeuwe D'Aoust, Catherine; De Jaegere, Aline; Delatte, Benoît; Gillain, Benoît; De Longueville, Xavier; Constant, Eric

    2015-09-30

    Disturbed processing of emotional faces and voices is typically observed in schizophrenia. This deficit leads to impaired social cognition and interactions. In this study, we investigated whether impaired processing of emotions also affects musical stimuli, which are widely present in daily life and known for their emotional impact. Thirty schizophrenic patients and 30 matched healthy controls evaluated the emotional content of musical, vocal and facial stimuli. Schizophrenic patients are less accurate than healthy controls in recognizing emotion in music, voices and faces. Our results confirm impaired recognition of emotion in voice and face stimuli in schizophrenic patients and extend this observation to the recognition of emotion in musical stimuli.

  2. Face Recognition Technology Based on Gabor Wavelet

    Institute of Scientific and Technical Information of China (English)

    张秀艳; 裴雷雷

    2012-01-01

    The paper first enhances the overall contrast of the images through pre-processing such as histogram equalization, making image details clearer. It then applies the Gabor wavelet transform, selecting different scales and orientations to extract facial expression features. Finally, a comparison of experimental results shows that pre-processed images save a large amount of computing time during the wavelet transform and improve the recognition rate.
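
    The histogram equalization step mentioned in this abstract can be sketched as below for an 8-bit grayscale image stored as a list of rows. This is a generic textbook formulation, not the authors' implementation; the function name and the tiny test image are hypothetical.

```python
def equalize_histogram(image, levels=256):
    """Spread grey levels over the full range using the cumulative histogram."""
    pixels = [p for row in image for p in row]
    total = len(pixels)

    # Histogram, then cumulative distribution function (CDF) per grey level.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, running = [0] * levels, 0
    for level in range(levels):
        running += hist[level]
        cdf[level] = running

    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:            # constant image: nothing to stretch
        return [row[:] for row in image]

    # Map each level so that the output CDF is approximately linear.
    scale = (levels - 1) / (total - cdf_min)
    lookup = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lookup[p] for p in row] for row in image]


# Hypothetical low-contrast 2x3 image whose values occupy only 50..53.
low_contrast = [[50, 50, 51],
                [52, 52, 53]]
equalized = equalize_histogram(low_contrast)
print(equalized)
```

    After equalization the narrow band of input levels is stretched across the full 0 to 255 range, which is the contrast enhancement the abstract refers to before feature extraction.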

  3. Emotional cues during simultaneous face and voice processing: electrophysiological insights.

    Directory of Open Access Journals (Sweden)

    Taosheng Liu

    Full Text Available Both facial expression and tone of voice represent key signals of emotional communication but their brain processing correlates remain unclear. Accordingly, we constructed a novel implicit emotion recognition task consisting of simultaneously presented human faces and voices with neutral, happy, and angry valence, within the context of recognizing monkey faces and voices task. To investigate the temporal unfolding of the processing of affective information from human face-voice pairings, we recorded event-related potentials (ERPs) to these audiovisual test stimuli in 18 normal healthy subjects; N100, P200, N250, P300 components were observed at electrodes in the frontal-central region, while P100, N170, P270 were observed at electrodes in the parietal-occipital region. Results indicated a significant audiovisual stimulus effect on the amplitudes and latencies of components in frontal-central (P200, P300, and N250) but not the parietal occipital region (P100, N170 and P270). Specifically, P200 and P300 amplitudes were more positive for emotional relative to neutral audiovisual stimuli, irrespective of valence, whereas N250 amplitude was more negative for neutral relative to emotional stimuli. No differentiation was observed between angry and happy conditions. The results suggest that the general effect of emotion on audiovisual processing can emerge as early as 200 msec (P200 peak latency) post stimulus onset, in spite of implicit affective processing task demands, and that such effect is mainly distributed in the frontal-central region.

  4. Emotional cues during simultaneous face and voice processing: electrophysiological insights.

    Science.gov (United States)

    Liu, Taosheng; Pinheiro, Ana; Zhao, Zhongxin; Nestor, Paul G; McCarley, Robert W; Niznikiewicz, Margaret A

    2012-01-01

    Both facial expression and tone of voice represent key signals of emotional communication but their brain processing correlates remain unclear. Accordingly, we constructed a novel implicit emotion recognition task consisting of simultaneously presented human faces and voices with neutral, happy, and angry valence, within the context of recognizing monkey faces and voices task. To investigate the temporal unfolding of the processing of affective information from human face-voice pairings, we recorded event-related potentials (ERPs) to these audiovisual test stimuli in 18 normal healthy subjects; N100, P200, N250, P300 components were observed at electrodes in the frontal-central region, while P100, N170, P270 were observed at electrodes in the parietal-occipital region. Results indicated a significant audiovisual stimulus effect on the amplitudes and latencies of components in frontal-central (P200, P300, and N250) but not the parietal occipital region (P100, N170 and P270). Specifically, P200 and P300 amplitudes were more positive for emotional relative to neutral audiovisual stimuli, irrespective of valence, whereas N250 amplitude was more negative for neutral relative to emotional stimuli. No differentiation was observed between angry and happy conditions. The results suggest that the general effect of emotion on audiovisual processing can emerge as early as 200 msec (P200 peak latency) post stimulus onset, in spite of implicit affective processing task demands, and that such effect is mainly distributed in the frontal-central region.

  5. Web Voice Browser Based on an ISLPC Text-to-Speech Algorithm

    Institute of Scientific and Technical Information of China (English)

    LIAO Rikun; JI Yuefeng; LI Hui

    2006-01-01

    A kind of Web voice browser based on improved synchronous linear predictive coding (ISLPC) and Text-to-Speech (TTS) algorithm and Internet application was proposed. The paper analyzes the features of TTS system with ISLPC speech synthesis and discusses the design and implementation of ISLPC TTS-based Web voice browser. The browser integrates Web technology, Chinese information processing, artificial intelligence and the key technology of Chinese ISLPC speech synthesis. It's a visual and audible web browser that can improve information precision for network users. The evaluation results show that ISLPC-based TTS model has a better performance than other browsers in voice quality and capability of identifying Chinese characters.

  6. Deploying a simple voice over IP network using a simulation tool

    OpenAIRE

    Limbu, Prajil

    2016-01-01

    Voice over IP is a major advancement in the field of IP communications systems technology since the advent of Internet. It is a communication technology which enables a device to transmit and receive voice traffic with the help of an IP-based network such as the Internet. Various types and deployments of Voice over IP are prevailing due to its popularity since its origin. Since its advent, it has managed to evolve and has given a platform to be benefited with its numerous advantages not only ...

  7. Memory for faces and voices varies as a function of sex and expressed emotion.

    Science.gov (United States)

    S Cortes, Diana; Laukka, Petri; Lindahl, Christina; Fischer, Håkan

    2017-01-01

    We investigated how memory for faces and voices (presented separately and in combination) varies as a function of sex and emotional expression (anger, disgust, fear, happiness, sadness, and neutral). At encoding, participants judged the expressed emotion of items in forced-choice tasks, followed by incidental Remember/Know recognition tasks. Results from 600 participants showed that accuracy (hits minus false alarms) was consistently higher for neutral compared to emotional items, whereas accuracy for specific emotions varied across the presentation modalities (i.e., faces, voices, and face-voice combinations). For the subjective sense of recollection ("remember" hits), neutral items received the highest hit rates only for faces, whereas for voices and face-voice combinations anger and fear expressions instead received the highest recollection rates. We also observed better accuracy for items by female expressers, and own-sex bias where female participants displayed memory advantage for female faces and face-voice combinations. Results further suggest that own-sex bias can be explained by recollection, rather than familiarity, rates. Overall, results show that memory for faces and voices may be influenced by the expressions that they carry, as well as by the sex of both items and participants. Emotion expressions may also enhance the subjective sense of recollection without enhancing memory accuracy.

  8. A Transliteration Algorithm for Adapting a Japanese Voice Controlled Browser to English

    Science.gov (United States)

    Saito, Kuniko; Shinohara, Akio; Nagata, Masaaki; Ohara, Hisashi

    We propose a novel algorithm to transliterate English to Japanese and its application to a voice controlled browser, which enables ordinary Japanese people to browse English Web sites by voice. Speech recognition software designed for native English speakers does not work for most Japanese speakers because they do not pronounce English as native English speakers do. Therefore, we combined Japanese speech recognition software with English-to-Japanese transliteration software. The accuracy of our transliteration algorithm is 80% recall for the top candidate, and 92% recall for the top three candidates. The browser using this transliteration algorithm makes it possible for Japanese speakers to navigate English Web pages by voice commands almost as accurately as Japanese pages.

  9. Design of household control system based on speech recognition

    Institute of Scientific and Technical Information of China (English)

    黄辉健; 程良鸿; 黄明杰; 林垣华; 李志杰

    2014-01-01

    This paper studies speaker-dependent speech recognition and control based on the Sunplus SPCE061A and applies speech recognition technology to a home control system. It proposes a control scheme that is simple to operate, easy to extend, and suited to home use, and analyzes the system's hardware composition and software design flow. It also describes building Android smartphone control software, based on Bluetooth communication, on the Google App Inventor platform. Practical tests show that the system successfully implements voice control of home appliances and remote control from Android smartphones.

  10. The voice of emotion across species: how do human listeners recognize animals' affective states?

    Directory of Open Access Journals (Sweden)

    Marina Scheumann

    Full Text Available Voice-induced cross-taxa emotional recognition is the ability to understand the emotional state of another species based on its voice. In the past, induced affective states, experience-dependent higher cognitive processes or cross-taxa universal acoustic coding and processing mechanisms have been discussed to underlie this ability in humans. The present study sets out to distinguish the influence of familiarity and phylogeny on voice-induced cross-taxa emotional perception in humans. For the first time, two perspectives are taken into account: the self- (i.e. emotional valence induced in the listener) versus the others-perspective (i.e. correct recognition of the emotional valence of the recording context). Twenty-eight male participants listened to 192 vocalizations of four different species (human infant, dog, chimpanzee and tree shrew). Stimuli were recorded either in an agonistic (negative emotional valence) or affiliative (positive emotional valence) context. Participants rated the emotional valence of the stimuli adopting self- and others-perspective by using a 5-point version of the Self-Assessment Manikin (SAM). Familiarity was assessed based on subjective rating, objective labelling of the respective stimuli and interaction time with the respective species. Participants reliably recognized the emotional valence of human voices, whereas the results for animal voices were mixed. The correct classification of animal voices depended on the listener's familiarity with the species and the call type/recording context, whereas there was less influence of induced emotional states and phylogeny. Our results provide first evidence that explicit voice-induced cross-taxa emotional recognition in humans is shaped more by experience-dependent cognitive mechanisms than by induced affective states or cross-taxa universal acoustic coding and processing mechanisms.

  11. Optical gesture sensing and depth mapping technologies for head-mounted displays: an overview

    Science.gov (United States)

    Kress, Bernard; Lee, Johnny

    2013-05-01

    Head Mounted Displays (HMDs), and especially see-through HMDs, have gained renewed interest in recent times, and for the first time outside the traditional military and defense realm, as several high-profile consumer electronics companies prepare to bring products to market. Consumer electronics HMDs have quite different requirements and constraints than their military counterparts. Voice commands are the de facto interface for such devices, but when voice recognition does not work (no connection to the cloud, for example), trackpad and gesture sensing technologies have to be used to communicate information to the device. In this paper we review the various technologies developed today that integrate optical gesture sensing in a small footprint, as well as the various related 3D depth mapping sensors.

  12. Voice Habits and Behaviors: Voice Care Among Flamenco Singers.

    Science.gov (United States)

    Garzón García, Marina; Muñoz López, Juana; Y Mendoza Lara, Elvira

    2017-03-01

    The purpose of this study is to analyze the vocal behavior of flamenco singers, as compared with classical music singers, to establish a differential vocal profile of voice habits and behaviors in flamenco music. Bibliographic review was conducted, and the Singer's Vocal Habits Questionnaire, an experimental tool designed by the authors to gather data regarding hygiene behavior, drinking and smoking habits, type of practice, voice care, and symptomatology perceived in both the singing and the speaking voice, was administered. We interviewed 94 singers, divided into two groups: the flamenco experimental group (FEG, n = 48) and the classical control group (CCG, n = 46). Frequency analysis, a Likert scale, and discriminant and exploratory factor analysis were used to obtain a differential profile for each group. The FEG scored higher than the CCG in speaking voice symptomatology. The FEG scored significantly higher than the CCG in use of "inadequate vocal technique" when singing. Regarding voice habits, the FEG scored higher in "lack of practice and warm-up" and "environmental habits." A total of 92.6% of the subjects classified themselves correctly in each group. The Singer's Vocal Habits Questionnaire has proven effective in differentiating flamenco and classical singers. Flamenco singers are exposed to numerous vocal risk factors that make them more prone to vocal fatigue, mucosa dehydration, phonotrauma, and muscle stiffness than classical singers. Further research is needed in voice training in flamenco music, as a means to strengthen the voice and enable it to meet the requirements of this musical genre. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  13. Prototype app for voice therapy: a peer review.

    Science.gov (United States)

    Lavaissiéri, Paula; Melo, Paulo Eduardo Damasceno

    2017-03-09

    Voice speech therapy promotes changes in patients' voice-related habits and rehabilitation. Speech-language therapists use a host of materials ranging from pictures to electronic resources and computer tools as aids in this process. Mobile technology is attractive, interactive and a nearly constant feature in the daily routine of a large part of the population and has a growing application in healthcare. The aim was to develop a prototype application for voice therapy, submit it to peer assessment, and improve the initial prototype based on these assessments. A prototype of the Q-Voz application was developed based on Apple's Human Interface Guidelines. The prototype was analyzed by seven speech therapists who work in the voice area. Improvements to the product were made based on these assessments. All features of the application were considered satisfactory by most evaluators. All evaluators found the application very useful; evaluators reported that patients would find it easier to make changes in voice behavior with the application than without it; the evaluators stated they would use this application with their patients with dysphonia and in the process of rehabilitation and that the application offers useful tools for voice self-management. Based on the suggestions provided, six improvements were made to the prototype. The prototype Q-Voz Application was developed and evaluated by seven judges and subsequently improved. All evaluators stated they would use the application with their patients undergoing rehabilitation, indicating that the Q-Voz Application for mobile devices can be considered an auxiliary tool for voice speech therapy.

  14. Voices of the Unheard

    DEFF Research Database (Denmark)

    Matthiesen, Noomi Christine Linde

    2014-01-01

    . They were in two different classes at both schools, i.e. four classes in total. The families were followed for 18 months. Formal interviews were conducted with mothers and teachers, parent-teacher conferences were recorded, participant observations were conducted in classrooms and playgrounds, afterschool...... is that Somali diaspora parents (and with special focus on mothers as these were the parents who took most responsibility in the four cases of this research) have difficulty expressing their opinions as there are structural, historical and social dynamics that create conditions in which their voices...... are silenced, or at least restricted significantly, resulting in marginalizing consequences. The focus in each article is on here-and-now interactional dynamics but in order to understand these constitutive negotiations, it is argued that the analysis must be situated in a description of the constituted...

  15. Passing on power & voice

    DEFF Research Database (Denmark)

    Noer, Vibeke Røn; Nielsen, Cathrine Sand

    2014-01-01

    . The education lasts for 3.5 years and the landmark of the educational model is the continuous shifts between teaching in classroom and teaching in clinical practice. Clinical teaching takes place at approved clinical placement institutions in hospitals and in the social and health care services outside...... intention of gaining knowledge about other possible ways to perform the education. The class, named the E-class, followed what in the field was named ‘an experimental educational model based on experienced-based learning’ (Nielsen et al. 2011). The experiential educational model is argued as an experiment.......aspx Higher degree of student involvement in planning as well as teaching was in the field presented as a part of ‘the overall educational approach’. In the course ‘Acute, Critical Nursing & Terminal, Palliative Care’ this was transferred into an innovative pedagogy with the intent to pass on power and voice...

  16. Voice stress analysis

    Science.gov (United States)

    Brenner, Malcolm; Shipp, Thomas

    1988-01-01

    In a study of the validity of eight candidate voice measures (fundamental frequency, amplitude, speech rate, frequency jitter, amplitude shimmer, Psychological Stress Evaluator scores, energy distribution, and the derived measure of the above measures) for determining psychological stress, 17 males age 21 to 35 were subjected to a tracking task on a microcomputer CRT while parameters of vocal production as well as heart rate were measured. Findings confirm those of earlier studies that increases in fundamental frequency, amplitude, and speech rate are found in speakers involved in extreme levels of stress. In addition, it was found that the same changes appear to occur in a regular fashion within a more subtle level of stress that may be characteristic, for example, of routine flying situations. None of the individual speech measures performed as robustly as did heart rate.

  17. Voice over IP

    OpenAIRE

    Mantula, Juha

    2006-01-01

    This thesis deals with Voice over Internet Protocol technology and the opportunities it brings to business life. The theoretical part covers the protocols and standards that are important for VoIP and the features of VoIP, and presents various voice applications that make use of VoIP technology. The empirical part examines the use of Skype at Viestintä Ky Pitkäranta. The aim of the work is to identify the strengths and weaknesses of VoIP and how the technology can be put to use in daily ...

  18. Dialogic Showcases Innovative Asian CT Solutions at Voice Asia '98

    Institute of Scientific and Technical Information of China (English)

    1998-01-01

    Dialogic Corporation showcases Computer Telephony (CT) solutions from some of Asia's leading CT developers at the Voice Asia '98 show. These vendors display the latest Asian solutions for IP Telephony, Speech Recognition, Telephone Company Enhanced Services Platform, Call Center and Unified Messaging, Open Switch and CT Servers.

  19. Application of AJAX Technology in Gesture Recognition System

    Institute of Scientific and Technical Information of China (English)

    王仁丽; 王倩

    2016-01-01

    By comparison with the traditional Web application model, this paper describes the basic principles and key techniques of Ajax and gives an application example: in a gesture recognition system, a Web server is built with Python scripts, and the use of Ajax for timed automatic refreshing of data on the Web front end is implemented and analyzed. Experimental results show that applying Ajax reduces user waiting time and server load, yielding good system performance and user experience.
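    The server side of such a setup might look roughly like the sketch below, which uses only Python's standard `http.server` module. The `/gesture` path, the `STATE` dictionary, and the function names are hypothetical and chosen for illustration; the abstract does not specify the authors' server design. A page would poll this endpoint on a timer with an asynchronous request and update only the affected part of the display, rather than reloading the whole page.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical shared state that a gesture recognizer would update.
STATE = {"gesture": "none", "seq": 0}


def gesture_payload(state):
    """Serialize the latest recognition result as a small JSON document."""
    return json.dumps(state).encode("utf-8")


class GestureHandler(BaseHTTPRequestHandler):
    """Answers each Ajax poll with the current state only, not a full page."""

    def do_GET(self):
        if self.path != "/gesture":          # hypothetical endpoint name
            self.send_error(404)
            return
        body = gesture_payload(STATE)
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=8000):
    """Create (but do not start) the server; call serve_forever() to run it."""
    return HTTPServer(("127.0.0.1", port), GestureHandler)
```

    Because each poll returns only a few bytes of JSON, the server does less work per refresh than it would when re-rendering a page, which is consistent with the reduced waiting time and server load reported above.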

  20. Research and Practice of EMR based on Voice Cloud Computing

    Institute of Scientific and Technical Information of China (English)

    徐冬; 陶石; 刘雨生

    2012-01-01

    With the introduction of the cloud computing concept, cloud-based speech recognition technology has developed rapidly. Voice cloud computing delivers ASR (Automatic Speech Recognition) computing as a service. Building on cloud computing technology for Chinese speech recognition and on practical experience with template-based EMRs (Electronic Medical Records), this article explores best practice for integrating Chinese speech recognition technology with clinical EMRs.

  1. Image Processing Technology in the Motor Vehicle License Plate Recognition Technology Application

    Institute of Scientific and Technical Information of China (English)

    宁彬

    2013-01-01

    This paper analyzes the application of image processing technology in automatic motor vehicle license plate recognition. Following the steps of plate localization (conversion from a color image to grayscale), plate region segmentation, and plate position correction, it analyzes the recognition of plate characters and improves the automatic recognition technique. An automatic license plate recognition system designed on the basis of image processing technology plays a major role in keeping traffic running smoothly. In practical use, image processing technology performs well in automatic license plate recognition and is worth wider adoption.

  2. Introduction to Arabic Speech Recognition Using CMUSphinx System

    CERN Document Server

    Satori, H; Chenfour, N

    2007-01-01

    In this paper Arabic is investigated from the point of view of the speech recognition problem. We propose a novel approach to building an Arabic Automated Speech Recognition (ASR) system. This system is based on the open-source CMU Sphinx-4 from Carnegie Mellon University. CMU Sphinx is a large-vocabulary, speaker-independent, continuous speech recognition system based on discrete Hidden Markov Models (HMMs). We build a model using utilities from the open-source CMU Sphinx. We will demonstrate the possible adaptability of this system to Arabic voice recognition.

  3. Citizen voices performing public participation in science and environment communication

    CERN Document Server

    Carvalho, Anabela; Doyle, Julie

    2012-01-01

    How is "participation" ascribed meaning and practised in science and environment communication? And how are citizen voices articulated, invoked, heard, marginalised or silenced in those processes? Citizen Voices takes its starting point in the so-called dialogic or participatory turn in scientific and environmental governance in which practices claiming to be based on principles of participation, dialogue and citizen involvement have proliferated. The book goes beyond the buzzword of "participation" in order to give empirically rich, theoretically informed and critical accounts of how citizen participation is understood and enacted in mass mediation and public engagement practices. A diverse series of studies across Europe and the US are presented, providing readers with empirical insights into the articulation of citizen voices in different national, cultural and institutional contexts. Building bridges across media and communication studies, science and technology studies, environmental studies and urban pl...

  4. Voice and choice by delegation.

    Science.gov (United States)

    van de Bovenkamp, Hester; Vollaard, Hans; Trappenburg, Margo; Grit, Kor

    2013-02-01

    In many Western countries, options for citizens to influence public services are increased to improve the quality of services and democratize decision making. Possibilities to influence are often cast into Albert Hirschman's taxonomy of exit (choice), voice, and loyalty. In this article we identify delegation as an important addition to this framework. Delegation gives individuals the chance to practice exit/choice or voice without all the hard work that is usually involved in these options. Empirical research shows that not many people use their individual options of exit and voice, which could lead to inequality between users and nonusers. We identify delegation as a possible solution to this problem, using Dutch health care as a case study to explore this option. Notwithstanding various advantages, we show that voice and choice by delegation also entail problems of inequality and representativeness.

  5. The Christian voice in philosophy

    Directory of Open Access Journals (Sweden)

    Stuart Fowler

    1982-03-01

    Full Text Available In this paper the Rev. Stuart Fowler outlines a Christian voice in Philosophy and urges the Christian philosopher to investigate his position and his stance with integrity and honesty.

  6. Voice Force tulekul / Tõnu Ojala

    Index Scriptorium Estoniae

    Ojala, Tõnu, 1969-

    2005-01-01

    On an event of the jubilee season of the Tallinn University of Technology Academic Male Choir, which is celebrating its 60th anniversary: the a cappella pop group festival Voice Force (concerts on 12 Nov. at the club Parlament and on 3 Dec. at the Russian Cultural Centre)

  8. Assessment voice synthesizers for reading in digital books

    Directory of Open Access Journals (Sweden)

    Sérvulo Fernandes da Silva Neto

    2013-07-01

    Full Text Available Digital accessibility offers ways of accessing information in digital media that help people with different types of disabilities interact better with the computer regardless of their limitations. These tools include voice synthesizers, which are meant to simplify access to any knowledge recorded through digital technologies. However, such tools originally emerged in countries where other languages are spoken, which leads to the following research problem: are voice synthesizers appropriate for reading digital books in the Portuguese language? The objective of this study was to analyze and classify different voice synthesizer software tools, in combination with digital book reader software, as support for accessibility to e-books in Portuguese. Voice synthesizer applications were identified through a literature review and make up the sample analyzed in this work. A simplified version of the Multiple Criteria Decision Support method (MMDA) was used to assess them. The study considered 12 e-book readers and 11 voice synthesizers, tested with six e-book formats (EPUB, PDF, HTML, DOC, TXT, and Mobi). According to the results, the Virtual Vision software achieved the highest score. Regarding formats, PDF obtained the best score when the results of the three synthesizers were summed. Within the universe studied, it was found that many synthesizers simply cannot be used because they do not support the Portuguese language.

  9. Feature Extraction of Voice Segments Using Cepstral Analysis for Voice Regeneration

    OpenAIRE

    Banerjee, P. S.; Baisakhi Chakraborty; Jaya Banerjee

    2015-01-01

    Even though a lot of work has been done on speech-to-text and text-to-speech conversion, voice detection, and similarity analysis of two voice samples, very little emphasis has been given to voice regeneration. General algorithms for distinct voice checking for two voice sources paved the way for our endeavor in reconstructing the voice from the source voice samples provided. By utilizing these algorithms and putting further stress on the feature extraction part we tried to fabricate the source voice wi...
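    The cepstral analysis named in the title of this record can be illustrated with a minimal sketch of the real cepstrum, defined as the inverse Fourier transform of the log magnitude spectrum of a frame. The function names, the naive DFT, and the `eps` guard are assumptions made for illustration; the truncated abstract does not say which cepstral variant or implementation the authors used.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform, adequate for short frames."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def real_cepstrum(frame, eps=1e-12):
    """Real cepstrum: inverse DFT of the log magnitude spectrum of a frame."""
    n = len(frame)
    # eps (an illustrative guard) avoids log(0) for empty spectral bins.
    log_mag = [math.log(abs(c) + eps) for c in dft(frame)]
    # The log magnitude of a real frame is symmetric, so the inverse
    # transform is real up to rounding; keep the real part.
    return [
        sum(log_mag[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


# An impulse of height 2 has a flat spectrum, so only coefficient 0 is non-zero.
coeffs = real_cepstrum([2.0] + [0.0] * 15)
```

    Taking the logarithm turns the product of excitation and vocal-tract spectra into a sum, which is why low-order cepstral coefficients are commonly used as compact voice features.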

  10. Voice Simulation in Nursing Education.

    Science.gov (United States)

    Kepler, Britney B; Lee, Heeyoung; Kane, Irene; Mitchell, Ann M

    2016-01-01

    The goal of this study was to improve prelicensure nursing students' attitudes toward and self-efficacy related to delivering nursing care to patients with auditory hallucinations. Based on the Hearing Voices That Are Distressing curriculum, 87 participants were instructed to complete 3 tasks while wearing headphones delivering distressing voices. Comparing presimulation and postsimulation results, this study suggests that the simulation significantly improved attitudes toward patients with auditory hallucinations; however, self-efficacy related to caring for these patients remained largely unchanged.

  11. Work-related voice disorder

    OpenAIRE

    Paulo Eduardo Przysiezny; Luciana Tironi Sanson Przysiezny

    2015-01-01

    INTRODUCTION: Dysphonia is the main symptom of the disorders of oral communication. However, voice disorders also present with other symptoms such as difficulty in maintaining the voice (asthenia), vocal fatigue, variation in habitual vocal fundamental frequency, hoarseness, lack of vocal volume and projection, loss of vocal efficiency, and weakness when speaking. There are several proposals for the etiologic classification of dysphonia: functional, organofunctional, organic, and work-related...

  12. Tracheostomy cannulas and voice prosthesis.

    Science.gov (United States)

    Kramp, Burkhard; Dommerich, Steffen

    2009-01-01

    Cannulas and voice prostheses are mechanical aids for patients who had to undergo tracheotomy or laryngectomy for different reasons. For better understanding of the function of those artificial devices, first the indications and particularities of the previous surgical intervention are described in the context of this review. Despite the established procedure of percutaneous dilatation tracheotomy e.g. in intensive care units, the application of epithelised tracheostomas has its own position, especially when airway obstruction is persistent (e.g. caused by traumata, inflammations, or tumors) and a longer artificial ventilation or special care of the patient are required. In order to keep the airways open after tracheotomy, tracheostomy cannulas of different materials with different functions are available. For each patient the most appropriate type of cannula must be found. Voice prostheses are meanwhile the device of choice for rapid and efficient voice rehabilitation after laryngectomy. Individual sizes and materials allow adaptation of the voice prostheses to the individual anatomical situation of the patients. The combined application of voice prostheses with HME (Heat and Moisture Exchanger) allows a good vocal as well as pulmonary rehabilitation. Precondition for efficient voice prosthesis is the observation of certain surgical principles during laryngectomy. The duration of the prosthesis mainly depends on material properties and biofilms, mostly consisting of fungi and bacteria. The quality of voice with valve prosthesis is clearly superior to esophagus prosthesis or electro-laryngeal voice. Whenever possible, tracheostoma valves for free-hand speech should be applied. Physicians taking care of patients with speech prostheses after laryngectomy should know exactly what to do in case the device fails or gets lost.

  13. Challenging Institutional Conventions and Forming a Voice through Creativity

    DEFF Research Database (Denmark)

    Nielsen, Margit Saltofte

    2013-01-01

    This article explores and discusses examples of students’ everyday creativity that seem to be overlooked by teachers but are acknowledged by ‘peers’ in a 9th Grade (age 15–16) at a Danish free school. Creativity emerged as part of the everyday student interactions at school in ‘in-between’ social...... spaces, outside the formal teaching zones. Creative activities took place in the interstitial zones of time and space, where they gave voice to those students whose voice is not always heard in the formal teaching context. Creativity occurred also among students as a way to challenge institutional...... conditions and this practice gave them recognition by their peers. The argument is being made that students’ interactions in these zones draw on other forms of knowledge and ways of performing than those used in structured teaching zones. The creativity expressed in interstitial zones contributes to forming...

  14. Phoneme Recognition Using Acoustic Events

    CERN Document Server

    Huebener, K; Huebener, Kai; Carson-Berndsen, Julie

    1994-01-01

    This paper presents a new approach to phoneme recognition using nonsequential sub-phoneme units. These units are called acoustic events and are phonologically meaningful as well as recognizable from speech signals. Acoustic events form a phonologically incomplete representation as compared to distinctive features. This problem may partly be overcome by incorporating phonological constraints. Currently, 24 binary events describing manner and place of articulation, vowel quality and voicing are used to recognize all German phonemes. Phoneme recognition in this paradigm consists of two steps: After the acoustic events have been determined from the speech signal, a phonological parser is used to generate syllable and phoneme hypotheses from the event lattice. Results obtained on a speaker-dependent corpus are presented.

  15. Voice Collection under Different Spectrum

    Directory of Open Access Journals (Sweden)

    Min Li

    2013-05-01

    Full Text Available Based on short-time Fourier transform theory and the principles of digital filtering, this paper establishes a mathematical model for collecting voice signals in different spectral bands. A voice signal is a non-stationary process, whereas the standard Fourier transform applies only to periodic signals, transient signals, or stationary random signals, so it cannot be applied directly to speech. By varying the input types and parameters, the paper analyzes the spectrum of the collected original voice signal on the MATLAB software platform and implements extraction, recording, and playback of the speech signal at different frequencies. The waveforms can be displayed clearly on the graphical user interface, and the voice output is clearer. The results were also verified on a hardware platform consisting of a TMS320VC5509A chip and a TLV320AIC23 voice chip. They showed that the model for extracting voice signals under different spectra was scientific, rational, and effective.
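
    The short-time Fourier transform mentioned in this abstract can be sketched as follows. This is a minimal illustration in Python rather than the paper's MATLAB implementation; the Hann window, the frame length, the hop size, and the two-tone test signal are all assumptions chosen for the example.

```python
import cmath
import math

def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def dft(frame):
    """Naive O(n^2) discrete Fourier transform of one frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def stft(signal, frame_len=64, hop=32):
    """Magnitude spectrogram: one row per frame, bins 0..frame_len // 2."""
    window = hann(frame_len)
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Window each short frame so it can be treated as locally stationary.
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        spectrum = dft(frame)
        rows.append([abs(c) for c in spectrum[:frame_len // 2 + 1]])
    return rows

def tone(freq_hz, n_samples, fs):
    """Sine tone used only as a stand-in for a recorded voice signal."""
    return [math.sin(2 * math.pi * freq_hz * i / fs) for i in range(n_samples)]

# Hypothetical non-stationary test signal: 500 Hz, then 2000 Hz, sampled at 8 kHz.
fs = 8000
signal = tone(500, 512, fs) + tone(2000, 512, fs)
spec = stft(signal)
# Index of the strongest frequency bin in each frame (bin spacing is fs / 64 = 125 Hz).
peak_bins = [row.index(max(row)) for row in spec]
```

    A single Fourier transform of the whole signal would show both tones but not when each occurs; the per-frame peaks recover the change over time, which is the reason for using short frames on speech.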

  16. Speech Recognition on Mobile Devices

    DEFF Research Database (Denmark)

    Tan, Zheng-Hua; Lindberg, Børge

    2010-01-01

    The enthusiasm of deploying automatic speech recognition (ASR) on mobile devices is driven both by remarkable advances in ASR technology and by the demand for efficient user interfaces on such devices as mobile phones and personal digital assistants (PDAs). This chapter presents an overview of ASR...... in the mobile context covering motivations, challenges, fundamental techniques and applications. Three ASR architectures are introduced: embedded speech recognition, distributed speech recognition and network speech recognition. Their pros and cons and implementation issues are discussed. Applications within...... command and control, text entry and search are presented with an emphasis on mobile text entry....

  17. Speech Recognition System For Robotic Control And Movement

    Directory of Open Access Journals (Sweden)

    Biraja Nalini Rout

    2015-08-01

    Full Text Available Voice and data recognition is currently one of the most sought-after fields in artificial intelligence and robotic engineering. The idea is to derive a voice-to-voice intelligent system that operates purely on spoken instructions, using a specialized voice recognition (VR) module, a microcontroller, a set of wheels, and a movable arm. Real-time voice inputs are fed to the VR module, which processes the audio signals and produces output in audio format. The system includes an IDE for both Windows and UNIX-based operating systems for manipulating and processing instructions at both the software and hardware levels. It can also perform a basic set of manual operations decided by the expert system. The VR module processes the data using a multilayer perceptron to generate the required result. The movable arm picks and places objects according to the given voice instructions. The system is intended to substitute for manual work at both personal and professional levels.

  18. The impact of voice on speech realization

    Directory of Open Access Journals (Sweden)

    Jelka Breznik

    2014-12-01

    Full Text Available The study discusses spoken literary language and the impact of voice on speech realization. The voice consists of a sound made by a human being using the vocal folds for talking, singing, laughing, crying, screaming… The human voice is specifically the part of human sound production in which the vocal folds (vocal cords) are the primary sound source. Our voice is our instrument and identity card. How does the voice (voice tone) affect others and how do they respond, positively or negatively? How important is voice (voice tone) in the communication process? The study presents how certain individuals perceive voice. The results of the research on the relationships between the spoken word, excellent speaker, voice and description / definition / identification of specific voices done by experts in the field of speech and voice as well as non-professionals are presented. The study encompasses two focus groups. One consists of amateurs (non-specialists in the field of speech or voice) who have no knowledge in this field and the other consists of professionals who work with speech or language or voice. The questions progressed from general to specific, directly related to the topic. The purpose of this method of questioning was to create a relaxed atmosphere, promote discussion, allow participants to interact and complement one another, and encourage self-listening and additional comments.

  19. Towards very large vocabulary word recognition

    Science.gov (United States)

    Waibel, A.

    1982-11-01

    In this paper, preliminary considerations and some experimental results are presented in an effort to design Very Large Vocabulary Recognition (VLVR) systems. We will first consider the applicability of current recognition techniques and argue their inadequacy for VLVR. Possible alternate strategies will be explored and their potential usefulness statistically evaluated. Our results indicate that suprasegmental cues such as syllabification, stress patterns, rhythmic patterns and the voiced-unvoiced patterns in the syllables of a word provide powerful mechanisms for search space reduction. Suprasegmental features could thus operate in a complementary fashion to segmental features.
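
    The search space reduction described in this abstract can be illustrated with a small sketch. It is not the paper's method: the toy lexicon, the stress and voicing encodings, and the function names are all hypothetical, and a real system would estimate these cues from the speech signal.

```python
from collections import defaultdict

# Hypothetical toy lexicon: word -> (stress pattern, voicing pattern).
# Stress: one digit per syllable (1 = stressed, 0 = unstressed).
# Voicing: 'V' = fully voiced syllable, 'U' = syllable with an unvoiced segment.
LEXICON = {
    "banana":   ("010", "VVV"),
    "computer": ("010", "UUU"),
    "potato":   ("010", "UUU"),
    "window":   ("10", "VV"),
    "paper":    ("10", "UU"),
    "record":   ("01", "VU"),
    "animal":   ("100", "VVV"),
}

def build_index(lexicon):
    """Group words by their suprasegmental key so lookup is a single dict access."""
    index = defaultdict(set)
    for word, key in lexicon.items():
        index[key].add(word)
    return index

def candidates(index, stress, voicing):
    """Words compatible with the observed stress and voicing patterns."""
    return index.get((stress, voicing), set())

index = build_index(LEXICON)
shortlist = candidates(index, "010", "UUU")
# Fraction of the lexicon eliminated before any segmental matching is done.
reduction = 1 - len(shortlist) / len(LEXICON)
```

    Segmental (phoneme-level) matching then only needs to distinguish the words left in the shortlist, which is the complementary role the abstract describes.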

  20. VoIP Gateway Technology for Vehicle-Borne Voice Centralized Controller

    Institute of Scientific and Technical Information of China (English)

    李志国; 李乔

    2012-01-01

    The overall design of the vehicle-borne voice centralized controller and the general structure of its voice over internet protocol (VoIP) networking are described. The VoIP gateway design is illustrated in terms of the gateway's voice modes, recording management, and message forwarding mechanism. Experimental results show that the vehicle-borne voice centralized controller enables interconnection between the field data communication network and the wireless network, thus enhancing command and dispatching capabilities.

  1. Future Educators' Explaining Voices

    Science.gov (United States)

    de Oliveira, Janaina Minelli; Caballero, Pablo Buenestado; Camacho, Mar

    2013-01-01

    Teacher education programs must offer pre-service students innovative technology-supported learning environments, guiding them in the revision of their preconceptions on literacy and technology. This paper presents a case study that uses podcasts to inquire into future educators' views on technology and the digital age. Results show future…

  2. Does knowing speaker sex facilitate vowel recognition at short durations?

    Science.gov (United States)

    Smith, David R R

    2014-05-01

    A man, woman or child saying the same vowel do so with very different voices. The auditory system solves the complex problem of extracting what the man, woman or child has said despite substantial differences in the acoustic properties of their voices. Much of the acoustic variation between the voices of men and women is due to changes in the underlying anatomical mechanisms for producing speech. If the auditory system knew the sex of the speaker then it could potentially correct for speaker-sex-related acoustic variation, thus facilitating vowel recognition. This study measured the minimum stimulus duration necessary to accurately discriminate whether a brief vowel segment was spoken by a man or woman, and the minimum stimulus duration necessary to accurately recognise what vowel was spoken. Results showed that reliable vowel recognition precedes reliable speaker sex discrimination, thus questioning the use of speaker sex information in compensating for speaker-sex-related acoustic variation in the voice. Furthermore, the pattern of performance across experiments where the fundamental frequency and formant frequency information of speakers' voices were systematically varied was markedly different depending on whether the task was speaker-sex discrimination or vowel recognition. This argues for there being little relationship between perception of speaker sex (indexical information) and perception of what has been said (linguistic information) at short durations. Copyright © 2014 Elsevier B.V. All rights reserved.

  3. On the definition and interpretation of voice selective activation in the temporal cortex

    Directory of Open Access Journals (Sweden)

    Anja Bethmann

    2014-07-01

    Full Text Available Regions along the superior temporal sulci and in the anterior temporal lobes have been found to be involved in voice processing. It has even been argued that parts of the temporal cortices serve as voice-selective areas. Yet, evidence for voice-selective activation in the strict sense is still missing. The current fMRI study aimed at assessing the degree of voice-specific processing in different parts of the superior and middle temporal cortices. To this end, voices of famous persons were contrasted with widely different categories, which were sounds of animals and musical instruments. The argumentation was that only brain regions with statistically proven absence of activation by the control stimuli may be considered as candidates for voice-selective areas. Neural activity was found to be stronger in response to human voices in all analyzed parts of the temporal lobes except for the middle and posterior STG. More importantly, the activation differences between voices and the other environmental sounds increased continuously from the mid-posterior STG to the anterior MTG. Here, only voices but not the control stimuli excited an increase of the BOLD response above a resting baseline level. The findings are discussed with reference to the function of the anterior temporal lobes in person recognition and the general question on how to define selectivity of brain regions for a specific class of stimuli or tasks. In addition, our results corroborate recent assumptions about the hierarchical organization of auditory processing building on a processing stream from the primary auditory cortices to anterior portions of the temporal lobes.

  4. Research on Visual Gesture Recognition Technology and Its Application

    Institute of Scientific and Technical Information of China (English)

    张圆圆

    2015-01-01

    This paper studies vision-based gesture recognition technology and, on the OpenCV platform, implements an application of the technology to multimedia teaching: during a slideshow, page turning can be controlled by dynamic gestures. First, images are captured by a camera, and gesture detection is performed using background subtraction combined with color histograms to detect dynamic information. Second, after analyzing and comparing several dynamic gesture tracking algorithms, the mainstream nonlinear tracking algorithm, the particle filter, is adopted. Finally, the gesture recognition results are applied to the playback of multimedia presentations, realizing real-time control of PPT page turning by dynamic hand gestures.
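
    The background subtraction step named in this abstract can be sketched in a few lines. This is a plain-Python illustration under stated assumptions, not the paper's OpenCV code: frames are small grayscale grids of integers, the threshold of 30 is arbitrary, and the color-histogram and particle-filter stages are omitted.

```python
def foreground_mask(background, frame, threshold=30):
    """Mark pixels whose gray level differs from the background by more than threshold."""
    return [[1 if abs(p - b) > threshold else 0 for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

def bounding_box(mask):
    """Smallest (top, left, bottom, right) box covering all foreground pixels, or None."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))

# Hypothetical 5x5 scene: uniform background, with a bright "hand" region in the frame.
background = [[10] * 5 for _ in range(5)]
frame = [row[:] for row in background]
for r in (1, 2):
    for c in (2, 3):
        frame[r][c] = 200

mask = foreground_mask(background, frame)
box = bounding_box(mask)
```

    The bounding box is the kind of region a tracker would then follow from frame to frame.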

  5. Research on IM Protocol Recognition System Based on DPI Technology

    Institute of Scientific and Technical Information of China (English)

    王凯; 吴君钦

    2013-01-01

    Lawbreakers use IM protocol communication software to leak state and corporate secrets and to spread reactionary remarks. To address this problem, this paper studies and analyzes in depth the IM protocols of a variety of instant messaging software, summarizes the defects of previous IM protocol identification systems, and, applying DPI technology, designs a new IM protocol detection system: the IM protocol recognition system based on DPI technology. The system can effectively identify and monitor a variety of instant messaging software. Real-time monitoring experiments on the text messages of instant messaging software such as QQ, fetion, MSN, the Sina microblogging desktop edition, googletalk, and yahoomsg verify that the system has a very high recognition rate and excellent stability.
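
    Deep packet inspection of the kind this abstract refers to typically matches payload bytes against protocol signatures. The sketch below shows only that general idea; the protocol names, byte prefixes, and port numbers are invented placeholders, not the signatures of QQ, MSN, or any other real IM protocol, and the paper's actual rules are not given in the abstract.

```python
# Placeholder signatures, invented for illustration. Real signatures would have
# to be derived from captured traffic of each instant messaging client.
SIGNATURES = {
    "im_alpha": {"prefix": b"\x02AL", "port": 8000},
    "im_beta": {"prefix": b"BETA/1.0 ", "port": None},  # None = any port
}

def classify(payload, dst_port):
    """Return the first protocol whose payload prefix (and port, if set) matches."""
    for name, sig in SIGNATURES.items():
        if not payload.startswith(sig["prefix"]):
            continue
        if sig["port"] is not None and sig["port"] != dst_port:
            continue
        return name
    return "unknown"

labels = [
    classify(b"\x02ALhello", 8000),
    classify(b"\x02ALhello", 80),
    classify(b"BETA/1.0 LOGIN", 443),
    classify(b"GET / HTTP/1.1", 80),
]
```

    Matching on payload content rather than on port alone is what distinguishes DPI from simple port-based classification.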

  6. Touchless palmprint recognition systems

    CERN Document Server

    Genovese, Angelo; Scotti, Fabio

    2014-01-01

    This book examines the context, motivation and current status of biometric systems based on the palmprint, with a specific focus on touchless and less-constrained systems. It covers new technologies in this rapidly evolving field and is one of the first comprehensive books on palmprint recognition systems. It discusses the research literature and the most relevant industrial applications of palmprint biometrics, including the low-cost solutions based on webcams. The steps of biometric recognition are described in detail, including acquisition setups, algorithms, and evaluation procedures. Const

  7. Highly flexible self-powered sensors based on printed circuit board technology for human motion detection and gesture recognition

    Science.gov (United States)

    Fuh, Yiin-Kuen; Ho, Hsi-Chun

    2016-03-01

    In this paper, we demonstrate a new integration of printed circuit board (PCB) technology-based self-powered sensors (PSSs) and direct-write, near-field electrospinning (NFES) with polyvinylidene fluoride (PVDF) micro/nano fibers (MNFs) as source materials. Integration with PCB technology is highly desirable for affordable mass production. In addition, we systematically investigate the effects of electrodes with intervals in the range of 0.15 mm to 0.40 mm on the resultant PSS output voltage and current. The results show that at a strain of 0.5% and 5 Hz, a PSS with a gap interval of 0.15 mm produces a maximum output voltage of 3 V and a maximum output current of 220 nA. Under the same dimensional constraints, the MNFs are massively connected in series (via accumulation of continuous MNFs across the gaps) and in parallel (via accumulation of parallel MNFs on the same gap) simultaneously. Finally, encapsulation in a flexible polymer with different interval electrodes demonstrated that electrical superposition can be realized by connecting MNFs collectively and effectively in serial/parallel patterns to achieve a high current and high voltage output, respectively. Further improvement in PSSs based on the effect of cooperativity was experimentally realized by rolling up the device into a cylindrical shape, resulting in a 130% increase in power output due to the cooperative effect. We assembled the piezoelectric MNF sensors on gloves, bandages and stockings to fabricate devices that can detect different types of human motion, including finger motion and various flexing and extensions of an ankle. The firmly glued PSSs were tested on the glove and ankle respectively to detect and harvest the various movements and the output voltage was recorded as ∼1.5 V under jumping movement (one PSS) and ∼4.5 V for the clenched fist with five fingers bent concurrently (five PSSs). This research shows that piezoelectric MNFs not only have a huge impact on harvesting various external

  9. Voice recognition software can be used for scientific articles

    DEFF Research Database (Denmark)

    Pommergaard, Hans-Christian; Huang, Chenxi; Burcharth, Jacob

    2015-01-01

    be an alternative. The purpose of this study was to evaluate the out-of-the-box accuracy of VRS. METHODS: Eleven young researchers without dictation experience dictated the first draft of their own scientific article after thorough preparation according to a pre-defined schedule. The dictate transcribed by VRS...... with a median score of five (range: 3-9), which was improved with the addition of 5,000 words. CONCLUSION: The out-of-the-box performance of VRS was acceptable and improved after additional words were added. Further studies are needed to investigate the effect of additional software accuracy training....

  10. Practical Voice Recognition for the Aircraft Cockpit Project

    Data.gov (United States)

    National Aeronautics and Space Administration — This proposal responds to the urgent need for improved pilot interfaces in the modern aircraft cockpit. Recent advances in aircraft equipment bring tremendous...

  11. Voice Interactive Systems Technology (VIST) Research.

    Science.gov (United States)

    1984-01-01

    ...UNCLASSIFIED SECURITY CLASSIFICATION OF THIS PAGE...Ground DEC PDP-11/45 SYSTEM GENERATION INFORMATION RSX-11M Operating System Initialization Command File: File Name: INIT7.CMD Contents: SET/SPEED=TT7...

  12. Indonesian Automatic Speech Recognition For Command Speech Controller Multimedia Player

    Directory of Open Access Journals (Sweden)

    Vivien Arief Wardhany

    2014-12-01

    Full Text Available One aim of multimedia device development is control by voice. At present, only English voice commands can be recognized. To overcome this, recognition is performed using an Indonesian language model, acoustic model, and dictionary. The automatic speech recognizer is built on the CMU Sphinx engine, with the English language database modified for Indonesian, and XBMC is used as the multimedia player. The experiment used 10 volunteers, 5 male and 5 female, testing 7 commands. Ten samples were taken for each command, after which each volunteer performed 10 test commands. Each volunteer also had to try all 7 commands provided. According to the classification percentage table, the word “kanan” was recognized most often, at 83%, while “pilih” was recognized least often. The word most often misclassified was “kembali”, at 67%, while “kanan” was misclassified least often. For male speakers, several commands, such as “kembali”, “utama”, “atas”, and “bawah”, had a low recognition rate (RR). In particular, “kembali” could not be recognized as a command in female voices and had an RR of only 4% in male voices; this is because no English word is close to “kembali”, so the system fails to recognize the command. The command “pilih” had an RR of 80% for female voices but only 4% for male voices. This is mostly due to the different voice characteristics of adult males and females: men have lower voice frequencies (85 to 180 Hz) than women (165 to 255 Hz). The results also showed that each person had a different recognition rate because of differences in tone, pronunciation, and speed of speech. Further work is needed to improve the accuracy of the Indonesian automatic speech recognition system.
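
    The per-command recognition rate reported in this abstract is a simple ratio of correctly recognized utterances to total utterances. A minimal sketch of that calculation follows; the trial log is made up for illustration and does not reproduce the study's data, and the function name is hypothetical.

```python
from collections import Counter

def recognition_rates(trials):
    """trials: iterable of (spoken_command, recognized_text) pairs.

    Returns {command: fraction of its utterances recognized correctly}.
    """
    total = Counter()
    correct = Counter()
    for spoken, recognized in trials:
        total[spoken] += 1
        if recognized == spoken:
            correct[spoken] += 1
    return {cmd: correct[cmd] / total[cmd] for cmd in total}

# Made-up trial log, not the study's data. An empty string means no command was recognized.
trials = [
    ("kanan", "kanan"), ("kanan", "kanan"), ("kanan", "kanan"), ("kanan", "atas"),
    ("pilih", "pilih"), ("pilih", ""), ("pilih", "bawah"), ("pilih", ""),
]
rates = recognition_rates(trials)
```

    Splitting the same log by speaker gender before calling the function would give the male and female rates that the abstract compares.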

  13. Towards Real-Time Speech Emotion Recognition for Affective E-Learning

    Science.gov (United States)

    Bahreini, Kiavash; Nadolski, Rob; Westera, Wim

    2016-01-01

    This paper presents the voice emotion recognition part of the FILTWAM framework for real-time emotion recognition in affective e-learning settings. FILTWAM (Framework for Improving Learning Through Webcams And Microphones) intends to offer timely and appropriate online feedback based upon learner's vocal intonations and facial expressions in order…

  14. Word Intelligibility in Multi-voice Singing: The Influence of Chorus Size.

    Science.gov (United States)

    Condit-Schultz, Nathaniel; Huron, David

    2017-01-01

    This study investigated how the intelligibility of sung words is influenced by the number of singers in a choral music style. The study used a repeated-measures factorial design. One hundred forty-nine participants listened to recordings of spoken and sung English words and attempted to identify the words. Each stimulus word was sung or spoken in sync by either one, four, eight, sixteen, or twenty-seven members of a high-quality Soprano Alto Tenor Bass (SATB) choir. In general, single-voice word recognition was higher than multi-voice word recognition in the sung condition. However, the difference between four concurrent singers and the full choir was negligible; that is, reduced intelligibility with multiple singers shows little sensitivity to the number of singers. The principal effect of voice density on intelligibility is found to occur with coda consonants, a result consistent with the importance many choral conductors attribute to coordinating word offsets. In particular, the plosives /b/, /d/, /g/, and /p/ are easily confused. Coda liquids (/l/, /r/) were also found to be a source of confusion. Finally, an increasing density of voices appears to have a facilitating effect for the coda nasal /m/. Groups of four or more choral singers do appear to be less intelligible than single singers, although the observed effect is modest. However, increasing the number of singers in a choral texture beyond four singers does not appear to further degrade intelligibility. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  15. [A comparative study of pathological voice based on traditional acoustic characteristics and nonlinear features].

    Science.gov (United States)

    Gan, Deying; Hu, Weiping; Zhao, Bingxin

    2014-10-01

    By analyzing the mechanism of pronunciation, traditional acoustic parameters, including fundamental frequency, Mel frequency cepstral coefficients (MFCC), linear prediction cepstrum coefficient (LPCC), frequency perturbation, amplitude perturbation, and nonlinear characteristic parameters, including entropy (sample entropy, fuzzy entropy, multi-scale entropy), box-counting dimension, intercept and Hurst, are extracted as feature vectors for identification of pathological voice. Seventy-eight normal voice samples and 73 pathological voice samples for /a/, and 78 normal samples and 80 pathological samples for /i/ are recognized based on support vector machine (SVM). The results showed that compared with traditional acoustic parameters, nonlinear characteristic parameters could be well used to distinguish between healthy and pathological voices, and the recognition rates for /a/ were all higher than those for /i/ except for multi-scale entropy. That is why the /a/ sound data is used widely in related research at home and abroad for obtaining better identification of pathological voices. Adopting multi-scale entropy for /i/ could obtain higher recognition rate than /a/ between healthy and pathological samples, which may provide some useful inspiration for evaluating vocal compensatory function.
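
    Sample entropy, one of the nonlinear features listed in this abstract, can be sketched as below. This follows the commonly used SampEn(m, r) definition with Chebyshev distance; the defaults m = 2 and r = 0.2 times the standard deviation are conventional choices assumed here, not parameters taken from the paper, and the two test signals are synthetic stand-ins for voice recordings.

```python
import math
import random

def sample_entropy(series, m=2, r=0.2):
    """SampEn(m, r), with tolerance r given as a fraction of the standard deviation."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    tol = r * std

    def count_matches(length):
        # The same n - m starting points are used for both template lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance: largest pointwise difference between templates.
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= tol:
                    count += 1
        return count

    b = count_matches(m)      # matching template pairs of length m
    a = count_matches(m + 1)  # pairs that still match at length m + 1
    if a == 0 or b == 0:
        return float("inf")
    return -math.log(a / b)

# Synthetic stand-ins: a regular periodic signal and an irregular one.
periodic = [math.sin(2 * math.pi * i / 20) for i in range(200)]
rng = random.Random(0)
irregular = [rng.uniform(-1, 1) for _ in range(200)]

regular_score = sample_entropy(periodic)
irregular_score = sample_entropy(irregular)
```

    A more irregular signal yields a higher value, which is why such measures are candidates for separating healthy from pathological voices; in the study the resulting feature vectors are passed to a support vector machine.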

  16. Facial Recognition

    National Research Council Canada - National Science Library

    Mihalache Sergiu; Stoica Mihaela-Zoica

    2014-01-01

    .... From birth, faces are important in the individual's social interaction. Face perceptions are very complex as the recognition of facial expressions involves extensive and diverse areas in the brain...

  17. Fake Base Station Recognition and Locating Technology Research

    Institute of Scientific and Technical Information of China (English)

    周之童; 夏子焱; 邢佳帅; 李珍妮

    2014-01-01

    As the latest high-tech means of crime, fake base stations are highly mobile and well camouflaged, so the technology spread widely on the underground market as soon as it was introduced into China. The Ministry of Public Security's repeated special operations have all combated fake base stations by tracing sources and raiding hideouts, which makes it difficult to catch offenders in the act. Based on the operating principle of fake base stations, this paper studies a comprehensive method for detecting, identifying, and locating fake base station systems.

  18. IgE recognition of chimeric isoforms of the honeybee (Apis mellifera) venom allergen Api m 10 evaluated by protein array technology.

    Science.gov (United States)

    Van Vaerenbergh, Matthias; De Smet, Lina; Rafei-Shamsabadi, David; Blank, Simon; Spillner, Edzard; Ebo, Didier G; Devreese, Bart; Jakob, Thilo; de Graaf, Dirk C

    2015-02-01

    Api m 10 has recently been established as novel major allergen that is recognized by more than 60% of honeybee venom (HBV) allergic patients. Previous studies suggest Api m 10 protein heterogeneity which may have implications for diagnosis and immunotherapy of HBV allergy. In the present study, RT-PCR revealed the expression of at least nine additional Api m 10 transcript isoforms by the venom glands. Two distinct mechanisms are responsible for the generation of these isoforms: while the previously known variant 2 is produced by an alternative splicing event, novel identified isoforms are intragenic chimeric transcripts. To the best of our knowledge, this is the first report of the identification of chimeric transcripts generated by the honeybee. By a retrospective proteomic analysis we found evidence for the presence of several of these isoforms in the venom proteome. Additionally, we analyzed IgE reactivity to different isoforms by protein array technology using sera from HBV allergic patients, which revealed that IgE recognition of Api m 10 is both isoform- and patient-specific. While it was previously demonstrated that the majority of HBV allergic patients display IgE reactivity to variant 2, our study also shows that some patients lacking IgE antibodies for variant 2 display IgE reactivity to two of the novel identified Api m 10 variants, i.e. variants 3 and 4.

  19. Fingerprint recognition

    OpenAIRE

    Diefenderfer, Graig T.

    2006-01-01

    The use of biometrics is an evolving component in today's society. Fingerprint recognition continues to be one of the most widely used biometric systems. This thesis explores the various steps present in a fingerprint recognition system. The study develops a working algorithm to extract fingerprint minutiae from an input fingerprint image. This stage incorporates a variety of image pre-processing steps necessary for accurate minutiae extraction and includes two different methods of ridge thin...

  20. Research on Key Technology of Reliability Design of Vehicular Speech Recognition

    Institute of Scientific and Technical Information of China (English)

    张方伟; 丁武俊; 陈文强; 潘之杰; 赵福全

    2012-01-01

    At present, systems based on speech recognition technology, such as Bluetooth hands-free and voice control, are widely used in more and more vehicle models, but the reliability of speech recognition is poor. From a system development perspective, and taking the complex in-vehicle environment as a basis, this paper examines how to improve the reliability of speech recognition systems in terms of speech recognition logic, keyword selection, speech recognition technology, and wiring harness and microphone layout, so as to maximize the value of the system and offer more user-friendly service. The results indicate that the reliability of speech recognition is greatly improved with this approach.

  1. Children's Voice or Children's Voices? How Educational Research Can Be at the Heart of Schooling

    Science.gov (United States)

    Stern, Julian

    2015-01-01

    There are problems with considering children and young people in schools as quite separate individuals, and with considering them as members of a single collectivity. The tension is represented in the use of "voice" and "voices" in educational debates. Voices in dialogue, in contrast to "children's voice", are…

  2. Voice Over Internet Protocol (VoIP) in a Control Center Environment

    Science.gov (United States)

    Pirani, Joseph; Calvelage, Steven

    2010-01-01

    The technology of transmitting voice over data networks has been available for over 10 years. Mass market VoIP services for consumers to make and receive standard telephone calls over broadband Internet networks have grown in the last 5 years. While operational costs are lower with VoIP implementations than with time division multiplexing (TDM) based voice switches, is it still advantageous to convert a mission control center's voice system to this newer technology? Marshall Space Flight Center (MSFC) Huntsville Operations Support Center (HOSC) has converted its mission voice services to a commercial product that utilizes VoIP technology. Results from this testing, design, and installation have shown unique considerations that must be addressed before user operations. There are many factors to consider for a control center voice design. Technology advantages and disadvantages were investigated as they relate to cost. There were integration concerns which could lead to complex failure scenarios but simpler integration for the mission infrastructure. MSFC HOSC will benefit from this voice conversion with less product replacement cost, less operations cost and a more integrated mission services environment.

  3. Voice complaints, risk factors for voice problems and history of voice problems in relation to puberty in female student teachers.

    NARCIS (Netherlands)

    Thomas, G.; Jong, F.I.C.R.S. de; Kooijman, P.G.C.; Donders, A.R.T.; Cremers, C.W.R.J.

    2006-01-01

    The aim of the study was to estimate voice complaints, risk factors for voice complaints and history of voice problems in student teachers before they embarked on their professional teaching career. A cross-sectional questionnaire survey was performed among female student teachers. The response rate

  5. Age Dependent Face Recognition using Eigenface

    OpenAIRE

    Hlaing Htake Khaung Tin

    2013-01-01

    Face recognition is the most successful form of human surveillance. Face recognition technology, which is used to improve human efficiency in recognizing faces, is one of the fastest growing fields in the biometric industry. In the first stage, age is classified into eleven categories that distinguish a person's age group. The second stage of the process is face recognition based on the predicted age. Age prediction has considerable potential applications in human comp...

  6. Pattern recognition, machine intelligence and biometrics

    CERN Document Server

    Wang, Patrick S P

    2012-01-01

    "Pattern Recognition, Machine Intelligence and Biometrics" covers the most recent developments in Pattern Recognition and its applications, using artificial intelligence technologies within an increasingly critical field. It covers topics such as: image analysis and fingerprint recognition; facial expressions and emotions; handwriting and signatures; iris recognition; hand-palm gestures; and multimodal based research. The applications span many fields, from engineering, scientific studies and experiments, to biomedical and diagnostic applications, to personal identification and homeland secu

  7. Quick Statistics about Voice, Speech, and Language

    Science.gov (United States)

    Quick Statistics About Voice, Speech, Language ... no 205. Hyattsville, MD: National Center for Health Statistics. 2015. Hoffman HJ, Li C-M, Losonczy K, ...

  8. Introduction: Textual and contextual voices of translation

    DEFF Research Database (Denmark)

    2017-01-01

    Voices – marks of the tangle of subjectivities involved in textual processes – constitute the very fabric of texts in general and translations in particular. The title of this book, Textual and Contextual Voices of Translation, refers both to textual voices, that is, the voices found within the translated texts, and to contextual voices, that is, the voices of those involved in shaping, commenting, or otherwise influencing the textual voices. The latter appear in prefaces, reviews, and other texts that surround the translated texts and provide them with a context. Our main claim is that studying both the textual and contextual voices helps us better understand and explain the complexity of both the translation process and the translation product. The dovetailed approach to translation research that is advocated in this book aims at highlighting the diversity of participants, power positions...

  9. When noise becomes voice

    DEFF Research Database (Denmark)

    Veerasawmy, Rune; McCarthy, John

    2014-01-01

    In this paper, we present crowd experience as a novel concept when designing interactive technology for spectator crowds in public settings. Technology-mediated experiences in groups have already been given serious attention in the field of interaction design. However, crowd experiences are disti...

  10. Thread Recognition System Based on Machine Vision Technology

    Institute of Scientific and Technical Information of China (English)

    景敏

    2013-01-01

    Thread angle identification is a common method for distinguishing thread types. Traditional detection methods have many disadvantages, such as low efficiency, high cost, and gauges that wear easily, and they no longer meet the needs of efficient modern industry. A CCD is used to obtain the basic image of the thread, and the thread contour is analyzed through image smoothing, edge detection, binarization, and contour extraction. The thread angle parameters are then measured and identified. Methods for measuring thread angle parameters using machine vision are discussed, and a thread recognition system is designed that is based mainly on machine vision recognition technology and integrates visual sensing with an image processing system. The feasibility and correctness of the method are demonstrated in theory and in practice.
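
    The final measurement step described in this abstract can be illustrated with a minimal sketch. It assumes that the contour extraction has already produced two lists of (x, y) points, one for each flank of a single thread tooth, with x running along the thread axis. The function names and the least-squares line fit are illustrative assumptions and are not taken from the paper.

```python
import math

def fit_slope(points):
    """Least-squares slope of y against x for a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx

def thread_angle_deg(rising_flank, falling_flank):
    """Included angle between the two flanks of one tooth, in degrees.

    Assumes the rising flank has a positive slope and the falling flank
    a negative slope in the chosen image coordinates.
    """
    up = math.atan(fit_slope(rising_flank))      # angle of rising flank to the axis
    down = math.atan(fit_slope(falling_flank))   # negative for a falling flank
    return 180.0 - math.degrees(up - down)
```

    A measured angle could then be compared against nominal values, for example 60 degrees for ISO metric threads and 55 degrees for Whitworth threads, to decide the thread type.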

  11. Speaker's voice as a memory cue.

    Science.gov (United States)

    Campeanu, Sandra; Craik, Fergus I M; Alain, Claude

    2015-02-01

    Speaker's voice occupies a central role as the cornerstone of auditory social interaction. Here, we review the evidence suggesting that speaker's voice constitutes an integral context cue in auditory memory. Investigation into the nature of voice representation as a memory cue is essential to understanding auditory memory and the neural correlates which underlie it. Evidence from behavioral and electrophysiological studies suggests that while specific voice reinstatement (i.e., same speaker) often appears to facilitate word memory even without attention to voice at study, the presence of a partial benefit of similar voices between study and test is less clear. In terms of explicit memory experiments utilizing unfamiliar voices, encoding methods appear to play a pivotal role. Voice congruency effects have been found when voice is specifically attended at study (i.e., when relatively shallow, perceptual encoding takes place). These behavioral findings coincide with neural indices of memory performance such as the parietal old/new recollection effect and the late right frontal effect. The former distinguishes between correctly identified old words and correctly identified new words, and reflects voice congruency only when voice is attended at study. Characterization of the latter likely depends upon voice memory, rather than word memory. There is also evidence to suggest that voice effects can be found in implicit memory paradigms. However, the presence of voice effects appears to depend greatly on the task employed. Using a word identification task, perceptual similarity between study and test conditions is, as for explicit memory tests, crucial. In addition, the type of noise employed appears to have a differential effect. While voice effects have been observed when white noise is used at both study and test, using multi-talker babble does not confer the same results. In terms of neuroimaging research modulations, characterization of an implicit memory effect

  12. Speech technology and cinema: can they learn from each other?

    Science.gov (United States)

    Pauletto, Sandra

    2013-10-01

    The voice is the most important sound of a film soundtrack. It represents a character and it carries language. There are different types of cinematic voices: dialogue, internal monologues, and voice-overs. Conventionally, two main characteristics differentiate these voices: lip synchronization and the voice's attributes that make it appropriate for the character (for example, a voice that sounds very close to the audience can be appropriate for a narrator, but not for an onscreen character). What happens, then, if a film character can only speak through an asynchronous machine that produces a 'robot-like' voice? This article discusses the sound-related work and experimentation done by the author for the short film Voice by Choice. It also attempts to discover whether speech technology design can learn from its cinematic representation, and if such uncommon film protagonists can contribute creatively to transform the conventions of cinematic voices.

  13. Voicing Consciousness: The Mind in Writing

    Science.gov (United States)

    Luce-Kapler, Rebecca; Catlin, Susan; Sumara, Dennis; Kocher, Philomene

    2011-01-01

    In this paper, the authors investigate the enduring power of voice as a concept in writing pedagogy. They argue that one can benefit from considering Elbow's assertion that both text and voice be considered as important aspects of written discourse. In particular, voice is a powerful metaphor for the material, social and historical nature of…

  14. Understanding the 'Anorexic Voice' in Anorexia Nervosa.

    Science.gov (United States)

    Pugh, Matthew; Waller, Glenn

    2016-07-20

    In common with individuals experiencing a number of disorders, people with anorexia nervosa report experiencing an internal 'voice'. The anorexic voice comments on the individual's eating, weight and shape and instructs the individual to restrict or compensate. However, the core characteristics of the anorexic voice are not known. This study aimed to develop a parsimonious model of the voice characteristics that are related to key features of eating disorder pathology and to determine whether patients with anorexia nervosa fall into groups with different voice experiences. The participants were 49 women with full diagnoses of anorexia nervosa. Each completed validated measures of the power and nature of their voice experience and of their responses to the voice. Different voice characteristics were associated with current body mass index, duration of disorder and eating cognitions. Two subgroups emerged, with 'weaker' and 'stronger' voice experiences. Those with stronger voices were characterized by having more negative eating attitudes, more severe compensatory behaviours, a longer duration of illness and a greater likelihood of having the binge-purge subtype of anorexia nervosa. The findings indicate that the anorexic voice is an important element of the psychopathology of anorexia nervosa. Addressing the anorexic voice might be helpful in enhancing outcomes of treatments for anorexia nervosa, but that conclusion might apply only to patients with more severe eating psychopathology. Copyright © 2016 John Wiley & Sons, Ltd.

  15. Voice and culture: A prospect theory approach

    NARCIS (Netherlands)

    Paddock, E.L.; Ko, Junsu; Cropanzano, R.; Bagger, J.; El Akremi, A.; Camerman, A.; Greguras, G. J.; Mladinic, A.; Moliner, C.; Nam, K.; Törnblom, K.; Van den Bos, Kees

    2015-01-01

    The present study examines the congruence of individuals' minimum preferred amounts of voice with the prospect theory value function across nine countries. Accounting for previously ignored minimum preferred amounts of voice and actual voice amounts integral to testing the steepness of gain and loss

  16. Finding Voice: Learning about Language and Power

    Science.gov (United States)

    Christensen, Linda

    2011-01-01

    Christensen discusses why teachers need to teach students "voice" in its social and political context, to show the intersection of voice and power, to encourage students to ask, "Whose voices get heard? Whose are marginalized?" As Christensen writes, "Once students begin to understand that Standard English is one language among many, we can help…

  18. "Voice Forum" The Human Voice as Primary Instrument in Music Therapy

    DEFF Research Database (Denmark)

    Pedersen, Inge Nygaard; Storm, Sanne

    2009-01-01

    Aspects will be drawn on the human voice as a tool for embodying our psychological and physiological state, and attempting integration of feelings. Presentations and dialogues on different methods and techniques in "therapy-related body and voice work", as well as the human voice as a tool for nonverbal orientation and information both to ourselves and others. Focus on training on the voice instrument, the effect and impact of the human voice, and listening perspectives...

  19. Voice in political decision-making: the effect of group voice on perceived trustworthiness of decision makers and subsequent acceptance of decisions.

    Science.gov (United States)

    Terwel, Bart W; Harinck, Fieke; Ellemers, Naomi; Daamen, Dancker D L

    2010-06-01

    The implementation of carbon dioxide capture and storage technology (CCS) is considered an important climate change mitigation strategy, but the viability of this technology will depend on public acceptance of CCS policy decisions. The results of three experiments with students as participants show that whether or not interest groups receive an opportunity to express their opinions in the decision-making process (i.e., group voice) affects acceptance of CCS policy decisions, with inferred trustworthiness of the decision maker mediating this effect. Decision-making procedures providing different interest groups with equal opportunities to voice their opinions instigate more trust in the decision maker and, in turn, lead to greater willingness to accept decisions compared to no-voice procedures (i.e., unilateral decision-making; Study 1) and unequal group-voice procedures (i.e., when one type of interest group receives voice, but another type of interest group does not; Study 2). Study 3 further shows that an individual's own level of knowledge about CCS moderates the desire for an opportunity for members of the general public to voice opinions in the decision-making process, inferred trustworthiness of decision makers, and policy acceptance. These results imply that people care about voice in decision-making even when they are not directly personally involved in the decision-making process. We conclude that people tend to use procedural information when deciding to accept or oppose policy decisions on politically complex issues; hence, it is important that policymakers use fair group-voice procedures and that they communicate to the public how they arrive at their decisions. PsycINFO Database Record (c) 2010 APA, all rights reserved.

  20. Voice-Specialized Speech-Language Pathologist's Criteria for Discharge from Voice Therapy.

    Science.gov (United States)

    Gillespie, Amanda I; Gartner-Schmidt, Jackie

    2017-08-07

    No standard protocol exists to determine when a patient is ready and able to be discharged from voice therapy. The aim of the present study was to determine what factors speech-language pathologists (SLPs) deem most important when discharging a patient from voice therapy. A second aim was to determine if responses differed based on years of voice experience. Step 1: Seven voice-specialized SLPs generated a list of items thought to be relevant to voice therapy discharge. Step 2: Fifty voice-specialized SLPs rated each item on the list in terms of importance in determining discharge from voice therapy. Step 1: Four themes emerged (outcome measures, laryngeal appearance, SLP perceptions, and patient factors) as important items when determining discharge from voice therapy. Step 2: The top five most important criteria for discharge readiness were that the patient had to be able to (1) independently use a better voice (transfer), (2) function with his or her new voice production in activities of daily living (transfer), (3) differentiate between good and bad voice, (4) take responsibility for voice, and (5) sound better from baseline. Novice and experienced clinicians agreed between 94% and 97% concerning what was deemed "very important." SLPs agree that a patient's ability to use voice techniques in conversation and real-life situations outside of the therapy room is the most important determinant for voice therapy discharge. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  1. Multi-thread Parallel Speech Recognition for Mobile Applications

    Directory of Open Access Journals (Sweden)

    LOJKA Martin

    2014-05-01

    In this paper, a server-based solution for a multi-thread large-vocabulary automatic speech recognition engine is described, along with practical application examples for Android OS and HTML5. The basic idea was to make speech recognition available to a full variety of applications for computers and especially for mobile devices. The speech recognition engine should be independent of commercial products and services (where the dictionary cannot be modified). Using third-party services can also be a security and privacy problem in specific applications where unsecured audio data must not be sent to uncontrolled environments (voice data transferred to servers around the globe). Using our experience with speech recognition applications, we have been able to construct a multi-thread, server-based speech recognition solution with a simple application programming interface (API) to a speech recognition engine that can be modified to the specific needs of a particular application.
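
    The general idea of serving several recognition requests in parallel, each with its own application-specific vocabulary, might be sketched as follows. This is only an illustration under stated assumptions: `recognize_stub` is a hypothetical placeholder that does no real speech recognition, and the request format and worker count are invented here rather than taken from the paper.

```python
from concurrent.futures import ThreadPoolExecutor

def recognize_stub(audio, vocabulary):
    """Hypothetical stand-in for a recognition engine.

    It only picks a vocabulary entry from the audio length so that the
    sketch is runnable; a real engine would decode the audio.
    """
    return vocabulary[len(audio) % len(vocabulary)]

def serve_batch(requests, max_workers=4):
    """Handle (audio, vocabulary) requests concurrently on a thread pool.

    Results are returned in the same order as the requests.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda req: recognize_stub(*req), requests))
```

    Passing the vocabulary with each request reflects the abstract's point that the dictionary should be modifiable per application, which hosted commercial services may not allow.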

  2. Implementation of voice encryption technology based on AAC audio coding and chaotic encryption

    Institute of Scientific and Technical Information of China (English)

    王先泉

    2012-01-01

    A voice encryption scheme based on compression encoding and chaotic encryption is proposed in this paper to reduce the amount of voice data to be encrypted. The scheme encodes first and then encrypts: the AAC low-complexity encoding algorithm is used for speech coding, and the two-dimensional cat map algorithm is used for encryption. Results on an ARM9 hardware platform indicate that the AAC encoding compression ratio is 18:1, that the cat map encryption and decryption algorithm executes efficiently, and that there is no distortion after decryption. The experimental results confirm that the scheme is feasible. The innovation of this paper is the combination of an audio compression algorithm with chaotic encryption. Without affecting voice quality or encryption strength, it reduces the amount of data to be encrypted and the size of the final encrypted files, shortens the time spent encrypting and decrypting the voice data, and narrows the bandwidth required for secure voice communication.
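
    As a rough illustration of how a two-dimensional cat map can scramble and then exactly restore a block of encoded data, the sketch below applies the classical Arnold cat map, (x, y) -> (x + y, x + 2y) mod n, to the positions of an n-by-n block. The abstract does not give the map parameters, block layout, or key schedule used by the author, so all of these are assumptions, and position scrambling alone should not be regarded as secure encryption.

```python
def cat_map_scramble(block, n, rounds=1):
    """Permute an n*n block (flat list) with the Arnold cat map."""
    out = list(block)
    for _ in range(rounds):
        nxt = [0] * (n * n)
        for x in range(n):
            for y in range(n):
                nx, ny = (x + y) % n, (x + 2 * y) % n
                nxt[nx * n + ny] = out[x * n + y]
        out = nxt
    return out

def cat_map_unscramble(block, n, rounds=1):
    """Invert cat_map_scramble using the inverse map (2x - y, y - x) mod n."""
    out = list(block)
    for _ in range(rounds):
        prev = [0] * (n * n)
        for x in range(n):
            for y in range(n):
                px, py = (2 * x - y) % n, (y - x) % n
                prev[px * n + py] = out[x * n + y]
        out = prev
    return out
```

    Because the map is a bijection on the grid, unscrambling with the same number of rounds returns the original block exactly, which is consistent with the distortion-free decryption reported in the abstract. The map is periodic, so the number of rounds should not be a multiple of its period for the chosen n.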

  3. The development of the Spanish verb ir into auxiliary of voice

    DEFF Research Database (Denmark)

    Vinther, Thora

    2005-01-01

    Spanish, syntax, grammaticalisation, past participle, passive voice, middle voice, language development

  4. The Research of Bolt Cracks Discern System based on Image Recognition Technology

    Institute of Scientific and Technical Information of China (English)

    李亚琳; 王玉增; 李柏震; 刘双源

    2013-01-01

    This paper proposes a non-contact technique for identifying bolt cracks based on image recognition technology. Each bolt to be inspected is photographed with a high-precision industrial camera; the photographs are then processed with image thinning, exposure enhancement, edge detection, and related techniques to extract image features and ultimately identify cracks. The experiment shows that applying image recognition technology to bolt crack identification can greatly simplify the identification steps and improve recognition accuracy, providing a new direction of development for bolt crack detection.

  5. Objective Voice Parameters in Colombian School Workers with Healthy Voices

    Directory of Open Access Journals (Sweden)

    Lady Catherine Cantor Cutiva

    2015-09-01

    Objectives: To characterize the objective voice parameters among school workers, and to identify factors associated with three objective voice parameters, namely fundamental frequency, sound pressure level, and maximum phonation time. Materials and methods: We conducted a cross-sectional study among 116 Colombian teachers and 20 Colombian non-teachers. After signing the informed consent form, participants filled out a questionnaire. Then, a voice sample was recorded and evaluated perceptually by a speech therapist and by objective voice analysis with Praat software. Short-term environmental measurements of sound level, temperature, humidity, and reverberation time were conducted during visits at the workplaces, such as classrooms and offices. Linear regression analysis was used to determine associations between individual and work-related factors and objective voice parameters. Results: Compared with men, women had higher fundamental frequency (201 Hz for teachers and 209 Hz for non-teachers vs. 120 Hz for teachers and 127 Hz for non-teachers) and sound pressure level (82 dB vs. 80 dB), and shorter maximum phonation time (around 14 seconds vs. around 16 seconds). Female teachers younger than 50 years of age showed a significant tendency to speak with lower fundamental frequency and shorter maximum phonation time (MPT) compared with female teachers older than 50 years of age. Female teachers had significantly higher fundamental frequency (66 Hz), higher sound pressure level (2 dB), and shorter maximum phonation time (2 seconds) than male teachers. Conclusion: Female teachers younger than 50 years of age had significantly lower fundamental frequency (F0) and shorter MPT compared with those older than 50 years of age. The multivariate analysis showed that gender was a much more important determinant of variations in F0, sound pressure level (SPL), and MPT than age and teaching occupation. Objectively measured temperature also contributed to the changes in SPL among school workers.

  6. Playful Interaction with Voice Sensing Modular Robots

    DEFF Research Database (Denmark)

    Heesche, Bjarke; MacDonald, Ewen; Fogh, Rune

    2013-01-01

    This paper describes a voice sensor, suitable for modular robotic systems, which estimates the energy and fundamental frequency, F0, of the user’s voice. Through a number of example applications and tests with children, we observe how the voice sensor facilitates playful interaction between children and two different robot configurations. In future work, we will investigate if such a system can motivate children to improve voice control and explore how to extend the sensor to detect emotions in the user’s voice.
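
    The two quantities such a sensor reports, frame energy and F0, can be estimated in many ways. The sketch below uses mean squared amplitude for energy and a plain autocorrelation peak search for F0. This is a common textbook approach chosen for illustration; the abstract does not say which estimator the authors used, and the function names and search range are assumptions.

```python
import math

def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)

def estimate_f0(frame, sample_rate, f_min=80.0, f_max=500.0):
    """Estimate F0 in Hz from the strongest autocorrelation lag.

    Only lags corresponding to frequencies between f_min and f_max are
    searched. Returns 0.0 if no positive correlation peak is found.
    """
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0
```

    On real speech, a robust sensor would also need a voicing decision and some smoothing across frames, which are omitted here.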

  7. Recognition and Exteriority: Towards a Recognition-Theoretic Account of Globalization

    Directory of Open Access Journals (Sweden)

    Sebastian Purcell

    2011-06-01

    This essay aims to extend Paul Ricœur’s account of recognition to address some of the concerns of globalization, especially those voiced by Enrique Dussel. The extension is accomplished in two parts. First, Dussel’s account of spatial existence as dwelling is reviewed as it is pertinent to the concerns of globalization. Next, it is demonstrated that each of the aspects of Ricœur’s account of recognition may be given a spatial re-articulation. The results thus establish an outline of how recognition theory might address some of the concerns of globalization. The essay concludes with several consequences for the modification of recognition politics as one finds it in the late work of Ricœur and in Axel Honneth’s ongoing inquiries.

  8. Florida manatee avoidance technology: A pilot program by the Florida Fish and Wildlife Conservation Commission

    Science.gov (United States)

    Frisch, Katherine; Haubold, Elsa

    2003-10-01

    Since 1976, approximately 25% of the annual Florida manatee (Trichechus manatus latirostris) mortality has been attributed to collisions with watercraft. In 2001, the Florida Legislature appropriated $200,000 in funds for research projects using technological solutions to directly address the problem of collisions between manatees and watercraft. The Florida Fish & Wildlife Conservation Commission initially funded seven projects for the first two fiscal years. The selected proposals were designed to explore technology that had not previously been applied to the manatee/boat collision problem and included many acoustic concepts related to voice recognition, sonar, and an alerting device to be put on boats to warn manatees. The most promising results to date are from projects employing voice-recognition techniques to identify manatee vocalizations and warn boaters of the manatees' presence. Sonar technology, much like that used in fish finders, is promising but has met with regulatory problems regarding permitting and remains to be tested, as has the manatee-alerting device. The state of Florida found results of the initial years of funding compelling and plans to fund further manatee avoidance technology research in a continued effort to mitigate the problem of manatee/boat collisions.

  9. AN EXPERIMENT WITH THE VOICE TO DESIGN CERAMICS

    DEFF Research Database (Denmark)

    Hansen, Flemming Tvede

    2013-01-01

    This article is about how the experiential knowledge that craftsmen gain in direct physical interaction with a responding material can be transformed and utilized in the use of digital technologies. The article presents an experiment with a 3D interactive and dynamic system to create ceramics from the human voice, and thus how digital technology creates new possibilities in ceramic craft. A 3D digital shape is created using simple geometric rules and is output to a 3D printer to make ceramic objects. The system demonstrates the close connection between digital technology and craft practice.

  11. Voice synthesis using the three-dimensional digital waveguide mesh

    Science.gov (United States)

    Speed, Matthew DA

    The acoustic response of the vocal tract is fundamental to our interpretation of voice production. As an acoustic filter, it shapes the spectral envelope of vocal fold vibration towards resonant modes, or formants, whose behaviours form the most basic building blocks of phonetics. Physical models of the voice exploit this effect by modelling the nature of wave propagation in abstracted cylindrical constructs. Whilst effective, such approaches are limited in accuracy by their simplified geometrical analogue. Developments in numerical acoustics modelling meanwhile have seen the formalisation of higher dimensionality configurations of the same technologies, allowing a much closer geometrical representation of an acoustic field. The major focus of this thesis is the application of such a technique to the vocal tract, and comparison of its performance with lower dimensionality approaches. To afford the development of such models, a body of data is collected from Magnetic Resonance Imaging for a range of subjects, and procedures are developed for the decomposition of this imaging into suitable, efficient data structures for simulation. The simulation technique is exhaustively validated using a combination of bespoke measurement/inversion techniques and analytical determination of lower frequency behaviours. Finally, voice synthesis based on each numerical model is compared with acoustic recordings of the subjects involved and with equivalent simulations from lower dimensionality methods. It is found that application of a higher dimensionality method typically yields a more accurate frequency-domain representation of the voice, although in some cases lower dimensionality equivalents are seen to perform better at low frequencies.
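
    To give a feel for the kind of simulation involved, the sketch below performs one time step of a rectilinear three-dimensional mesh, written in its equivalent finite-difference form: the new pressure at a node is one third of the sum of its six neighbours minus the node's pressure two steps earlier. The abstract does not state the mesh topology, boundary model, or data structures used in the thesis, so the rectilinear layout, the zero-pressure boundary, and the nested-list storage are all simplifying assumptions.

```python
def step_mesh(p_now, p_prev, nx, ny, nz):
    """Advance a rectilinear 3D mesh by one time step.

    p_now and p_prev are nx*ny*nz nested lists of junction pressures at
    the current and previous steps. Boundary nodes are held at zero.
    """
    p_next = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            for k in range(1, nz - 1):
                neighbours = (p_now[i - 1][j][k] + p_now[i + 1][j][k]
                              + p_now[i][j - 1][k] + p_now[i][j + 1][k]
                              + p_now[i][j][k - 1] + p_now[i][j][k + 1])
                # Six-port lossless junction: (2/6) * sum of neighbours,
                # minus this node's pressure at the previous step.
                p_next[i][j][k] = neighbours / 3.0 - p_prev[i][j][k]
    return p_next
```

    In a vocal tract model, only nodes inside the segmented airway would be updated, and the boundary treatment would need to represent wall and lip conditions rather than a fixed zero.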

  12. Application Of t-Cherry Junction Trees in Pattern Recognition

    Directory of Open Access Journals (Sweden)

    Edith Kovacs

    2010-06-01

    Pattern recognition aims to classify data (patterns) based either on a priori knowledge or on statistical information extracted from the data. In this paper we concentrate on statistical pattern recognition using a new probabilistic approach that makes it possible to select the so-called 'informative' features. We develop a pattern recognition algorithm based on the conditional independence structure underlying the statistical data. Our method was successfully applied to a real problem of recognizing Parkinson's disease on the basis of voice disorders.

  13. Design of Speech Recognition System Based on Embedded Linux

    Institute of Scientific and Technical Information of China (English)

    钟豪; 张常年; 徐成波

    2014-01-01

    This paper presents the hardware and software design of a voice recognition system built on Samsung's S3C2440 and the high-performance speech recognition chip LD3320 from ICRoute. Under the embedded Linux operating system, it uses a multi-process mechanism to control the speech recognition chip, ultrasonic ranging, and the pan-tilt (PTZ) unit, and applies speech recognition technology to a multi-angle ultrasonic ranging system. Testing shows that the system can control the measurement direction by recognizing voice commands, without manual intervention, and finally plays back the measurement results by voice.

  14. VOICE QUALITY BEFORE AND AFTER THYROIDECTOMY

    Directory of Open Access Journals (Sweden)

    Dora CVELBAR

    2016-04-01

    Introduction: Voice disorders are a well-known complication often associated with thyroid gland diseases, and because voice is still the basic means of communication, it is very important to maintain its quality. Objectives: This study examined whether there is a statistically significant difference between the results of voice self-assessment, perceptual voice assessment, and acoustic voice analysis before and after thyroidectomy, and whether there are statistically significant correlations between the variables of voice self-assessment, perceptual assessment, and acoustic analysis before and after thyroidectomy. Methods: The study included 12 participants aged between 41 and 76. Voice self-assessment was conducted using the Croatian version of the Voice Handicap Index (VHI). Recorded reading samples were used for perceptual assessment and evaluated by two clinical speech and language therapists. Recorded samples of phonation were used for acoustic analysis, which was conducted using the acoustic program Praat. All of the data were processed using descriptive statistics and nonparametric statistical methods. Results: There were statistically significant differences between the results of voice self-assessment and the results of acoustic analysis before and after thyroidectomy. Statistically significant correlations were found between variables of perceptual assessment and acoustic analysis. Conclusion: The results indicate the importance of multidimensional preoperative and postoperative assessment. This kind of assessment allows the clinician to describe all of the voice features and to give the patient appropriate recommendations for further rehabilitation in order to optimize voice outcomes.

  15. Beyond Insularity: Releasing the Voices.

    Science.gov (United States)

    Greene, Maxine

    1993-01-01

    Aspects of English-as-a-Second-Language are discussed from the standpoint of a teacher-educator with a particular interest in philosophy, the arts, and humanities and what they signify for the schools. The idea of giving voice to all viewpoints and sociocultural circumstances is considered for content learning and heterogeneous grouping. (Contains…

  16. A voice and nothing more

    DEFF Research Database (Denmark)

    Mebus, Andreas Nozic Lindgren

    2012-01-01

    Andreas Mebus then focuses on a very specific aspect of speech, namely "the voice", in his article "A voice and nothing more – en filosofisk udredning af stemmen" (a philosophical account of the voice). Drawing on Mladen Dolar's theory of the voice, Mebus sets out the different aspects of the voice: as a bearer of meaning, as aesthetic...

  17. Voice, Citizenship, and Civic Action

    DEFF Research Database (Denmark)

    Tufte, Thomas

    2014-01-01

    In recent years the world has experienced a resurgence in practices of bottom-up communication for social change, a plethora of agency in which claims for voice and citizenship through massive civic action have conquered center stage in the public debate. This resurgence has sparked a series...

  18. The Performing Voice of Radio

    DEFF Research Database (Denmark)

    Lawaetz, Anna

    The ongoing international development of opening media archives for researchers as well as for broader audiences calls for a closer discussion of the mediated voice and how to analyse it. Which parameters can be analysed and which parameters are not covered by the analysis? Furthermore, how do we...

  19. Voice and choice by delegation

    NARCIS (Netherlands)

    van de Bovenkamp, H.; Vollaard, H.; Trappenburg, M.; Grit, K.

    2013-01-01

    In many Western countries, options for citizens to influence public services are increased to improve the quality of services and democratize decision making. Possibilities to influence are often cast into Albert Hirschman's taxonomy of exit (choice), voice, and loyalty. In this article we identify

  20. Work-related voice disorder

    Directory of Open Access Journals (Sweden)

    Paulo Eduardo Przysiezny

    2015-04-01

    Full Text Available INTRODUCTION: Dysphonia is the main symptom of the disorders of oral communication. However, voice disorders also present with other symptoms such as difficulty in maintaining the voice (asthenia), vocal fatigue, variation in habitual vocal fundamental frequency, hoarseness, lack of vocal volume and projection, loss of vocal efficiency, and weakness when speaking. There are several proposals for the etiologic classification of dysphonia: functional, organofunctional, organic, and work-related voice disorder (WRVD). OBJECTIVE: To conduct a literature review on WRVD and on the current Brazilian labor legislation. METHODS: This was a review article with bibliographical research conducted on the PubMed and Bireme databases, using the terms "work-related voice disorder", "occupational dysphonia", "dysphonia and labor legislation", and a review of relevant labor and social security laws. CONCLUSION: WRVD is frequently listed as a reason for work absenteeism, functional rehabilitation, or prolonged absence from work. Currently, forensic physicians have no comparative parameters to help with the analysis of vocal disorders. In certain situations WRVD may cause work disability. This disorder may be labor-related, or be an adjuvant factor to work-related diseases.