WorldWideScience

Sample records for voice recognition technology

  1. Success with voice recognition.

    Science.gov (United States)

    Sferrella, Sheila M

    2003-01-01

    You need a compelling reason to implement voice recognition technology. At my institution, the compelling reason was a turnaround time for Radiology results of more than two days. Only 41 percent of our reports were transcribed and signed within 24 hours. In November 1998, a team from Lehigh Valley Hospital went to RSNA and reviewed every voice system on the market. The evaluation was done with the radiologist workflow in mind, and we came back from the meeting with the vendor selection completed. The next steps included developing a business plan, approval of funds, reference calls to more than 15 sites and contract negotiation, all of which took about six months. The department of Radiology at Lehigh Valley Hospital and Health Network (LVHHN) is a multi-site center that performs over 360,000 procedures annually. The department handles all modalities of radiology: general diagnosis, neuroradiology, ultrasound, CT Scan, MRI, interventional radiology, arthrography, myelography, bone densitometry, nuclear medicine, PET imaging, vascular lab and other advanced procedures. The department consists of 200 FTEs and a medical staff of more than 40 radiologists. The budget is in the $10.3 million range. There are three hospital sites and four outpatient imaging center sites where services are provided. At Lehigh Valley Hospital, radiologists are not dedicated to one subspecialty, so implementing a voice system by modality was not an option. Because transcription was so far behind, we needed to eliminate that part of the process. As a result, we decided to deploy the system all at once and with the radiologists as editors. The planning and testing phase took about four months, and the implementation took two weeks. We deployed over 40 workstations and trained close to 50 physicians. The radiologists brought in an extra radiologist from our group for the two weeks of training. That allowed us to train without taking a radiologist out of the department. We trained three to six

  2. Literature review of voice recognition and generation technology for Army helicopter applications

    Science.gov (United States)

    Christ, K. A.

    1984-08-01

    This report is a literature review on the topics of voice recognition and generation. Areas covered are: manual versus vocal data input, vocabulary, stress and workload, noise, protective masks, feedback, and voice warning systems. Results of the studies presented in this report indicate that voice data entry has less of an impact on a pilot's flight performance, during low-level flying and other difficult missions, than manual data entry. However, the stress resulting from such missions may cause the pilot's voice to change, reducing the recognition accuracy of the system. The noise present in helicopter cockpits also causes the recognition accuracy to decrease. Noise-cancelling devices are being developed and improved upon to increase the recognition performance in noisy environments. Future research in the fields of voice recognition and generation should be conducted in the areas of stress and workload, vocabulary, and the types of voice generation best suited for the helicopter cockpit. Also, specific tasks should be studied to determine whether voice recognition and generation can be effectively applied.

  3. The Voice as Computer Interface: A Look at Tomorrow's Technologies.

    Science.gov (United States)

    Lange, Holley R.

    1991-01-01

    Discussion of voice as the communications device for computer-human interaction focuses on voice recognition systems for use within a library environment. Voice technologies are described, including voice response and voice recognition; examples of voice systems in use in libraries are examined; and further possibilities, including use with…

  4. FILTWAM and Voice Emotion Recognition

    NARCIS (Netherlands)

    Bahreini, Kiavash; Nadolski, Rob; Westera, Wim

    2014-01-01

    This paper introduces the voice emotion recognition part of our framework for improving learning through webcams and microphones (FILTWAM). This framework enables multimodal emotion recognition of learners during game-based learning. The main goal of this study is to validate the use of microphone

  5. Voice congruency facilitates word recognition.

    Directory of Open Access Journals (Sweden)

    Sandra Campeanu

    Behavioral studies of spoken word memory have shown that context congruency facilitates both word and source recognition, though the level at which context exerts its influence remains equivocal. We measured event-related potentials (ERPs) while participants performed both types of recognition task with words spoken in four voices. Two voice parameters (i.e., gender and accent) varied between speakers, with the possibility that none, one or two of these parameters was congruent between study and test. Results indicated that reinstating the study voice at test facilitated both word and source recognition, compared to similar or no context congruency at test. Behavioral effects were paralleled by two ERP modulations. First, in the word recognition test, the left parietal old/new effect showed a positive deflection reflective of context congruency between study and test words. Namely, the same speaker condition provided the most positive deflection of all correctly identified old words. In the source recognition test, a right frontal positivity was found for the same speaker condition compared to the different speaker conditions, regardless of response success. Taken together, the results of this study suggest that the benefit of context congruency is reflected behaviorally and in ERP modulations traditionally associated with recognition memory.

  6. Voice congruency facilitates word recognition.

    Science.gov (United States)

    Campeanu, Sandra; Craik, Fergus I M; Alain, Claude

    2013-01-01

    Behavioral studies of spoken word memory have shown that context congruency facilitates both word and source recognition, though the level at which context exerts its influence remains equivocal. We measured event-related potentials (ERPs) while participants performed both types of recognition task with words spoken in four voices. Two voice parameters (i.e., gender and accent) varied between speakers, with the possibility that none, one or two of these parameters was congruent between study and test. Results indicated that reinstating the study voice at test facilitated both word and source recognition, compared to similar or no context congruency at test. Behavioral effects were paralleled by two ERP modulations. First, in the word recognition test, the left parietal old/new effect showed a positive deflection reflective of context congruency between study and test words. Namely, the same speaker condition provided the most positive deflection of all correctly identified old words. In the source recognition test, a right frontal positivity was found for the same speaker condition compared to the different speaker conditions, regardless of response success. Taken together, the results of this study suggest that the benefit of context congruency is reflected behaviorally and in ERP modulations traditionally associated with recognition memory.

  7. Practical applications of interactive voice technologies: Some accomplishments and prospects

    Science.gov (United States)

    Grady, Michael W.; Hicklin, M. B.; Porter, J. E.

    1977-01-01

    A technology assessment of the application of computers and electronics to complex systems is presented. Three existing systems which utilize voice technology (speech recognition and speech generation) are described. Future directions in voice technology are also described.

  8. Electrolarynx Voice Recognition Utilizing Pulse Coupled Neural Network

    Directory of Open Access Journals (Sweden)

    Fatchul Arifin

    2010-08-01

    Laryngectomy patients cannot speak normally because their vocal cords have been removed. The easiest option for such patients to speak again is electrolarynx speech: the device is placed against the lower chin, and vibration of the neck while speaking is used to produce sound. Meanwhile, voice recognition technology has been advancing very rapidly, and it is expected that it can also serve laryngectomy patients who use an electrolarynx. This paper describes a system for electrolarynx speech recognition. Its two main parts are feature extraction and pattern recognition. A Pulse Coupled Neural Network (PCNN) is used to extract the features and characteristics of electrolarynx speech, and the effect of varying β (one of the PCNN parameters) was also examined. A multilayer perceptron is used to recognize the sound patterns. Two kinds of recognition were conducted in this paper: speech recognition, which recognizes a specific utterance regardless of speaker, and speaker recognition, which recognizes a specific utterance from a specific person. The system ran well. Electrolarynx speech recognition was tested by distinguishing “A” from "not A" utterances, and the system achieved 94.4% validation accuracy. Electrolarynx speaker recognition was tested by recognizing the word “saya” spoken by different speakers, and the system achieved 92.2% validation accuracy. The best β parameter of the PCNN for electrolarynx recognition was 3.
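
    The abstract describes a two-stage pipeline: PCNN-based feature extraction followed by a multilayer perceptron classifier. As a rough illustration of the second stage only, the sketch below trains an MLP on precomputed feature vectors; the feature matrix, labels and network size are placeholders, and the PCNN extraction step itself is not reproduced here.

      # Minimal sketch (Python/scikit-learn) of the MLP pattern-recognition stage.
      # X stands in for PCNN-derived feature vectors (one row per utterance);
      # y stands in for the labels ("A" vs "not A", or speaker identities).
      import numpy as np
      from sklearn.model_selection import train_test_split
      from sklearn.neural_network import MLPClassifier

      rng = np.random.default_rng(0)
      X = rng.normal(size=(200, 64))       # placeholder feature vectors
      y = rng.integers(0, 2, size=200)     # placeholder labels: 1 = "A", 0 = "not A"

      X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

      clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)
      clf.fit(X_train, y_train)
      print("validation accuracy:", clf.score(X_test, y_test))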

  9. Voice Recognition in Face-Blind Patients

    Science.gov (United States)

    Liu, Ran R.; Pancaroglu, Raika; Hills, Charlotte S.; Duchaine, Brad; Barton, Jason J. S.

    2016-01-01

    Right or bilateral anterior temporal damage can impair face recognition, but whether this is an associative variant of prosopagnosia or part of a multimodal disorder of person recognition is an unsettled question, with implications for cognitive and neuroanatomic models of person recognition. We assessed voice perception and short-term recognition of recently heard voices in 10 subjects with impaired face recognition acquired after cerebral lesions. All 4 subjects with apperceptive prosopagnosia due to lesions limited to fusiform cortex had intact voice discrimination and recognition. One subject with bilateral fusiform and anterior temporal lesions had a combined apperceptive prosopagnosia and apperceptive phonagnosia, the first such described case. Deficits indicating a multimodal syndrome of person recognition were found only in 2 subjects with bilateral anterior temporal lesions. All 3 subjects with right anterior temporal lesions had normal voice perception and recognition, 2 of whom performed normally on perceptual discrimination of faces. This confirms that such lesions can cause a modality-specific associative prosopagnosia. PMID:25349193

  10. Implicit multisensory associations influence voice recognition.

    Directory of Open Access Journals (Sweden)

    Katharina von Kriegstein

    2006-10-01

    Full Text Available Natural objects provide partially redundant information to the brain through different sensory modalities. For example, voices and faces both give information about the speech content, age, and gender of a person. Thanks to this redundancy, multimodal recognition is fast, robust, and automatic. In unimodal perception, however, only part of the information about an object is available. Here, we addressed whether, even under conditions of unimodal sensory input, crossmodal neural circuits that have been shaped by previous associative learning become activated and underpin a performance benefit. We measured brain activity with functional magnetic resonance imaging before, while, and after participants learned to associate either sensory redundant stimuli, i.e. voices and faces, or arbitrary multimodal combinations, i.e. voices and written names, ring tones, and cell phones or brand names of these cell phones. After learning, participants were better at recognizing unimodal auditory voices that had been paired with faces than those paired with written names, and association of voices with faces resulted in an increased functional coupling between voice and face areas. No such effects were observed for ring tones that had been paired with cell phones or names. These findings demonstrate that brief exposure to ecologically valid and sensory redundant stimulus pairs, such as voices and faces, induces specific multisensory associations. Consistent with predictive coding theories, associative representations become thereafter available for unimodal perception and facilitate object recognition. These data suggest that for natural objects effective predictive signals can be generated across sensory systems and proceed by optimization of functional connectivity between specialized cortical sensory modules.

  11. Robust matching for voice recognition

    Science.gov (United States)

    Higgins, Alan; Bahler, L.; Porter, J.; Blais, P.

    1994-10-01

    This paper describes an automated method of comparing a voice sample of an unknown individual with samples from known speakers in order to establish or verify the individual's identity. The method is based on a statistical pattern matching approach that employs a simple training procedure, requires no human intervention (transcription, word or phonetic marking, etc.), and makes no assumptions regarding the expected form of the statistical distributions of the observations. The content of the speech material (vocabulary, grammar, etc.) is not assumed to be constrained in any way. An algorithm is described which incorporates frame pruning and channel equalization processes designed to achieve robust performance with reasonable computational resources. An experimental implementation demonstrating the feasibility of the concept is described.

  12. Obligatory and facultative brain regions for voice-identity recognition

    Science.gov (United States)

    Roswandowitz, Claudia; Kappes, Claudia; Obrig, Hellmuth; von Kriegstein, Katharina

    2018-01-01

    Abstract Recognizing the identity of others by their voice is an important skill for social interactions. To date, it remains controversial which parts of the brain are critical structures for this skill. Based on neuroimaging findings, standard models of person-identity recognition suggest that the right temporal lobe is the hub for voice-identity recognition. Neuropsychological case studies, however, reported selective deficits of voice-identity recognition in patients predominantly with right inferior parietal lobe lesions. Here, our aim was to work towards resolving the discrepancy between neuroimaging studies and neuropsychological case studies to find out which brain structures are critical for voice-identity recognition in humans. We performed a voxel-based lesion-behaviour mapping study in a cohort of patients (n = 58) with unilateral focal brain lesions. The study included a comprehensive behavioural test battery on voice-identity recognition of newly learned (voice-name, voice-face association learning) and familiar voices (famous voice recognition) as well as visual (face-identity recognition) and acoustic control tests (vocal-pitch and vocal-timbre discrimination). The study also comprised clinically established tests (neuropsychological assessment, audiometry) and high-resolution structural brain images. The three key findings were: (i) a strong association between voice-identity recognition performance and right posterior/mid temporal and right inferior parietal lobe lesions; (ii) a selective association between right posterior/mid temporal lobe lesions and voice-identity recognition performance when face-identity recognition performance was factored out; and (iii) an association of right inferior parietal lobe lesions with tasks requiring the association between voices and faces but not voices and names. The results imply that the right posterior/mid temporal lobe is an obligatory structure for voice-identity recognition, while the inferior parietal

  13. Voice Response Systems Technology.

    Science.gov (United States)

    Gerald, Jeanette

    1984-01-01

    Examines two methods of generating synthetic speech in voice response systems, which allow computers to communicate in human terms (speech), using human interface devices (ears): phoneme and reconstructed voice systems. Considerations prior to implementation, current and potential applications, glossary, directory, and introduction to Input Output…

  14. Improving Speaker Recognition by Biometric Voice Deconstruction

    Directory of Open Access Journals (Sweden)

    Luis Miguel Mazaira-Fernández

    2015-09-01

    Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g., YouTube) to broadcast its message. In this new scenario, classical identification methods (such as fingerprints or face recognition) have been forcedly replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. Through the present paper, a new methodology to characterize speakers will be shown. This methodology benefits from the advances achieved during the last years in understanding and modelling voice production. The paper hypothesizes that a gender dependent characterization of speakers combined with the use of a new set of biometric parameters extracted from the components resulting from the deconstruction of the voice into its glottal source and vocal tract estimates, will enhance recognition rates when compared to classical approaches. A general description about the main hypothesis and the methodology followed to extract gender-dependent extended biometric parameters are given. Experimental validation is carried out both on a highly controlled acoustic condition database, and on a mobile phone network recorded under non-controlled acoustic conditions.

  15. Improving Speaker Recognition by Biometric Voice Deconstruction

    Science.gov (United States)

    Mazaira-Fernandez, Luis Miguel; Álvarez-Marquina, Agustín; Gómez-Vilda, Pedro

    2015-01-01

    Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g., YouTube) to broadcast its message. In this new scenario, classical identification methods (such as fingerprints or face recognition) have been forcedly replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. The present study benefits from the advances achieved during last years in understanding and modeling voice production. The paper hypothesizes that a gender-dependent characterization of speakers combined with the use of a set of features derived from the components, resulting from the deconstruction of the voice into its glottal source and vocal tract estimates, will enhance recognition rates when compared to classical approaches. A general description about the main hypothesis and the methodology followed to extract the gender-dependent extended biometric parameters is given. Experimental validation is carried out both on a highly controlled acoustic condition database, and on a mobile phone network recorded under non-controlled acoustic conditions. PMID:26442245

  16. Game Development Using Android-Based Voice Recognition Technology

    Directory of Open Access Journals (Sweden)

    Franky Hadinata Marpaung

    2014-06-01

    The purpose of this research is to create a new kind of game using technology that is rarely used in current games. It is developed as an entertainment medium and also a social medium in which users can play the games together in multiplayer mode. This research uses the Scrum development method since it supports small-scale development teams and incremental software delivery. Using this game application, users can play and watch interesting animations controlled by their voice, listen to the character imitating the user's voice, and play various mini games in either single-player or multiplayer mode via a Bluetooth connection. The conclusion is that the game application My Name is Dug uses voice recognition and inter-device connection as its main features. It also offers various mini games that support both single-player and multiplayer modes.

  17. Obligatory and facultative brain regions for voice-identity recognition.

    Science.gov (United States)

    Roswandowitz, Claudia; Kappes, Claudia; Obrig, Hellmuth; von Kriegstein, Katharina

    2018-01-01

    Recognizing the identity of others by their voice is an important skill for social interactions. To date, it remains controversial which parts of the brain are critical structures for this skill. Based on neuroimaging findings, standard models of person-identity recognition suggest that the right temporal lobe is the hub for voice-identity recognition. Neuropsychological case studies, however, reported selective deficits of voice-identity recognition in patients predominantly with right inferior parietal lobe lesions. Here, our aim was to work towards resolving the discrepancy between neuroimaging studies and neuropsychological case studies to find out which brain structures are critical for voice-identity recognition in humans. We performed a voxel-based lesion-behaviour mapping study in a cohort of patients (n = 58) with unilateral focal brain lesions. The study included a comprehensive behavioural test battery on voice-identity recognition of newly learned (voice-name, voice-face association learning) and familiar voices (famous voice recognition) as well as visual (face-identity recognition) and acoustic control tests (vocal-pitch and vocal-timbre discrimination). The study also comprised clinically established tests (neuropsychological assessment, audiometry) and high-resolution structural brain images. The three key findings were: (i) a strong association between voice-identity recognition performance and right posterior/mid temporal and right inferior parietal lobe lesions; (ii) a selective association between right posterior/mid temporal lobe lesions and voice-identity recognition performance when face-identity recognition performance was factored out; and (iii) an association of right inferior parietal lobe lesions with tasks requiring the association between voices and faces but not voices and names. The results imply that the right posterior/mid temporal lobe is an obligatory structure for voice-identity recognition, while the inferior parietal lobe is

  18. Familiar Person Recognition: Is Autonoetic Consciousness More Likely to Accompany Face Recognition Than Voice Recognition?

    Science.gov (United States)

    Barsics, Catherine; Brédart, Serge

    2010-11-01

    Autonoetic consciousness is a fundamental property of human memory, enabling us to experience mental time travel, to recollect past events with a feeling of self-involvement, and to project ourselves in the future. Autonoetic consciousness is a characteristic of episodic memory. By contrast, awareness of the past associated with a mere feeling of familiarity or knowing relies on noetic consciousness, depending on semantic memory integrity. The present research was aimed at evaluating whether conscious recollection of episodic memories is more likely to occur following the recognition of a familiar face than following the recognition of a familiar voice. Recall of semantic information (biographical information) was also assessed. Previous studies that investigated the recall of biographical information following person recognition used faces and voices of famous people as stimuli. In this study, the participants were presented with personally familiar people's voices and faces, thus avoiding the presence of identity cues in the spoken extracts and allowing a stricter control of frequency of exposure with both types of stimuli (voices and faces). In the present study, the rate of retrieved episodic memories, associated with autonoetic awareness, was significantly higher from familiar faces than familiar voices even though the level of overall recognition was similar for both these stimulus domains. The same pattern was observed regarding semantic information retrieval. These results and their implications for current Interactive Activation and Competition person recognition models are discussed.

  19. Investigations of Hemispheric Specialization of Self-Voice Recognition

    Science.gov (United States)

    Rosa, Christine; Lassonde, Maryse; Pinard, Claudine; Keenan, Julian Paul; Belin, Pascal

    2008-01-01

    Three experiments investigated functional asymmetries related to self-recognition in the domain of voices. In Experiment 1, participants were asked to identify one of three presented voices (self, familiar or unknown) by responding with either the right or the left-hand. In Experiment 2, participants were presented with auditory morphs between the…

  20. Voice Recognition Interface in the Rehabilitation of Combat Amputees

    National Research Council Canada - National Science Library

    Lenhart, Martha; Yancosek, Kathleen E

    2004-01-01

    The goal of this pilot study is to assess the impact of training on voice recognition software as part of the rehabilitation process that Military patients with amputation, or peripheral nerve loss...

  1. Exploring expressivity and emotion with artificial voice and speech technologies.

    Science.gov (United States)

    Pauletto, Sandra; Balentine, Bruce; Pidcock, Chris; Jones, Kevin; Bottaci, Leonardo; Aretoulaki, Maria; Wells, Jez; Mundy, Darren P; Balentine, James

    2013-10-01

    Emotion in audio-voice signals, as synthesized by text-to-speech (TTS) technologies, was investigated to formulate a theory of expression for user interface design. Emotional parameters were specified with markup tags, and the resulting audio was further modulated with post-processing techniques. Software was then developed to link a selected TTS synthesizer with an automatic speech recognition (ASR) engine, producing a chatbot that could speak and listen. Using these two artificial voice subsystems, investigators explored both artistic and psychological implications of artificial speech emotion. Goals of the investigation were interdisciplinary, with interest in musical composition, augmentative and alternative communication (AAC), commercial voice announcement applications, human-computer interaction (HCI), and artificial intelligence (AI). The work-in-progress points towards an emerging interdisciplinary ontology for artificial voices. As one study output, HCI tools are proposed for future collaboration.

  2. When the face fits: recognition of celebrities from matching and mismatching faces and voices.

    Science.gov (United States)

    Stevenage, Sarah V; Neil, Greg J; Hamlin, Iain

    2014-01-01

    The results of two experiments are presented in which participants engaged in a face-recognition or a voice-recognition task. The stimuli were face-voice pairs in which the face and voice were co-presented and were either "matched" (same person), "related" (two highly associated people), or "mismatched" (two unrelated people). Analysis in both experiments confirmed that accuracy and confidence in face recognition was consistently high regardless of the identity of the accompanying voice. However accuracy of voice recognition was increasingly affected as the relationship between voice and accompanying face declined. Moreover, when considering self-reported confidence in voice recognition, confidence remained high for correct responses despite the proportion of these responses declining across conditions. These results converged with existing evidence indicating the vulnerability of voice recognition as a relatively weak signaller of identity, and results are discussed in the context of a person-recognition framework.

  3. Acoustic cues for the recognition of self-voice and other-voice

    Directory of Open Access Journals (Sweden)

    Mingdi Xu

    2013-10-01

    Full Text Available Self-recognition, being indispensable for successful social communication, has become a major focus in current social neuroscience. The physical aspects of the self are most typically manifested in the face and voice. Compared with the wealth of studies on self-face recognition, self-voice recognition (SVR has not gained much attention. Converging evidence has suggested that the fundamental frequency (F0 and formant structures serve as the key acoustic cues for other-voice recognition (OVR. However, little is known about which, and how, acoustic cues are utilized for SVR as opposed to OVR. To address this question, we independently manipulated the F0 and formant information of recorded voices and investigated their contributions to SVR and OVR. Japanese participants were presented with recorded vocal stimuli and were asked to identify the speaker—either themselves or one of their peers. Six groups of 5 peers of the same sex participated in the study. Under conditions where the formant information was fully preserved and where only the frequencies lower than the third formant (F3 were retained, accuracies of SVR deteriorated significantly with the modulation of the F0, and the results were comparable for OVR. By contrast, under a condition where only the frequencies higher than F3 were retained, the accuracy of SVR was significantly higher than that of OVR throughout the range of F0 modulations, and the F0 scarcely affected the accuracies of SVR and OVR. Our results indicate that while both F0 and formant information are involved in SVR, as well as in OVR, the advantage of SVR is manifested only when major formant information for speech intelligibility is absent. These findings imply the robustness of self-voice representation, possibly by virtue of auditory familiarity and other factors such as its association with motor/articulatory representation.
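
    The stimulus manipulations described above are signal-processing operations: shifting F0 and retaining only the frequencies below or above the third formant (F3). The sketch below shows, under stated assumptions, one way such manipulations could be implemented; the cutoff value, filter order and the use of librosa/scipy are illustrative choices, not the authors' actual procedure.

      # Minimal sketch of F0 shifting plus below-/above-F3 filtering (not the authors' code).
      # The F3 cutoff and filter order are assumed values for illustration.
      import librosa
      from scipy.signal import butter, sosfiltfilt

      def manipulate(path, f0_shift_semitones=2.0, f3_cutoff_hz=2800.0):
          y, sr = librosa.load(path, sr=None)
          # F0 modulation: shift the pitch by a given number of semitones.
          y_shifted = librosa.effects.pitch_shift(y=y, sr=sr, n_steps=f0_shift_semitones)
          # "Below F3" condition: keep only frequencies under the assumed F3 cutoff.
          sos_lp = butter(8, f3_cutoff_hz, btype="low", fs=sr, output="sos")
          below_f3 = sosfiltfilt(sos_lp, y_shifted)
          # "Above F3" condition: keep only frequencies over the same cutoff.
          sos_hp = butter(8, f3_cutoff_hz, btype="high", fs=sr, output="sos")
          above_f3 = sosfiltfilt(sos_hp, y_shifted)
          return below_f3, above_f3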

  4. A Robust Multimodal Biometric Authentication Scheme with Voice and Face Recognition

    International Nuclear Information System (INIS)

    Kasban, H.

    2017-01-01

    This paper proposes a multimodal biometric scheme for human authentication based on the fusion of voice and face recognition. For voice recognition, three categories of features (statistical coefficients, cepstral coefficients and voice timbre) are used and compared. The voice identification modality is carried out using a Gaussian Mixture Model (GMM). For face recognition, three recognition methods (Eigenface, Linear Discriminant Analysis (LDA), and Gabor filter) are used and compared. The combination of the voice and face biometric systems into a single multimodal biometric system is performed using feature fusion and score fusion. This study shows that the best results are obtained using all the features (cepstral coefficients, statistical coefficients and voice timbre features) for voice recognition, the LDA face recognition method and score fusion for the multimodal biometric system.
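
    As a rough sketch of the score-fusion idea reported above as best-performing, the code below combines a GMM-based voice score with an LDA-based face score through a weighted sum. The features, fusion weight and decision threshold are invented placeholders, not values from the paper.

      # Minimal sketch of score-level fusion of a voice GMM and a face LDA classifier.
      # All data, the fusion weight and the threshold are placeholders.
      import numpy as np
      from sklearn.mixture import GaussianMixture
      from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

      rng = np.random.default_rng(1)
      voice_train = rng.normal(size=(100, 13))    # stand-in cepstral features of the enrolled user
      face_train = rng.normal(size=(100, 50))     # stand-in face features
      face_labels = rng.integers(0, 2, size=100)  # 1 = genuine, 0 = impostor

      voice_gmm = GaussianMixture(n_components=4, random_state=0).fit(voice_train)
      face_lda = LinearDiscriminantAnalysis().fit(face_train, face_labels)

      def fused_score(voice_feats, face_feats, w=0.5):
          # Weighted sum of the raw voice log-likelihood and the face LDA score
          # (no score calibration; illustration only).
          v = voice_gmm.score(voice_feats)                   # mean log-likelihood of the voice frames
          f = face_lda.decision_function(face_feats).mean()  # signed distance to the LDA boundary
          return w * v + (1.0 - w) * f

      probe_voice = rng.normal(size=(10, 13))
      probe_face = rng.normal(size=(1, 50))
      print("accept" if fused_score(probe_voice, probe_face) > 0.0 else "reject")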

  5. Robotics control using isolated word recognition of voice input

    Science.gov (United States)

    Weiner, J. M.

    1977-01-01

    A speech input/output system is presented that can be used to communicate with a task oriented system. Human speech commands and synthesized voice output extend conventional information exchange capabilities between man and machine by utilizing audio input and output channels. The speech input facility is comprised of a hardware feature extractor and a microprocessor implemented isolated word or phrase recognition system. The recognizer offers a medium sized (100 commands), syntactically constrained vocabulary, and exhibits close to real time performance. The major portion of the recognition processing required is accomplished through software, minimizing the complexity of the hardware feature extractor.

  6. Speech Recognition of Aged Voices in the AAL Context: Detection of Distress Sentences

    OpenAIRE

    Aman, Frédéric; Vacher, Michel; Rossato, Solange; Portet, François

    2013-01-01

    By 2050, about a third of the French population will be over 65. In the context of technology development aimed at helping aged people live independently at home, the CIRDO project aims at implementing an ASR system in a social inclusion product designed for elderly people in order to detect distress situations. Speech recognition systems present a higher word error rate when speech is uttered by elderly speakers compared to when non-aged voices are considered. Two...

  7. Voice reinstatement modulates neural indices of continuous word recognition.

    Science.gov (United States)

    Campeanu, Sandra; Craik, Fergus I M; Backer, Kristina C; Alain, Claude

    2014-09-01

    The present study was designed to examine listeners' ability to use voice information incidentally during spoken word recognition. We recorded event-related brain potentials (ERPs) during a continuous recognition paradigm in which participants indicated on each trial whether the spoken word was "new" or "old." Old items were presented at 2, 8 or 16 words following the first presentation. Context congruency was manipulated by having the same word repeated by either the same speaker or a different speaker. The different speaker could share the gender, accent or neither feature with the word presented the first time. Participants' accuracy was greatest when the old word was spoken by the same speaker than by a different speaker. In addition, accuracy decreased with increasing lag. The correct identification of old words was accompanied by an enhanced late positivity over parietal sites, with no difference found between voice congruency conditions. In contrast, an earlier voice reinstatement effect was observed over frontal sites, an index of priming that preceded recollection in this task. Our results provide further evidence that acoustic and semantic information are integrated into a unified trace and that acoustic information facilitates spoken word recollection. Copyright © 2014 Elsevier Ltd. All rights reserved.

  8. The Neuropsychology of Familiar Person Recognition from Face and Voice

    Directory of Open Access Journals (Sweden)

    Guido Gainotti

    2014-05-01

    Prosopagnosia has been considered for a long period of time as the most important and almost exclusive disorder in the recognition of familiar people. In recent years, however, this conviction has been undermined by the description of patients showing a concomitant defect in the recognition of familiar faces and voices as a consequence of lesions encroaching upon the right anterior temporal lobe (ATL). These new data have obliged researchers to reconsider on one hand the construct of ‘associative prosopagnosia’ and on the other hand current models of people recognition. A systematic review of the patterns of familiar people recognition disorders observed in patients with right and left ATL lesions has shown that in patients with right ATL lesions face familiarity feelings and the retrieval of person-specific semantic information from faces are selectively affected, whereas in patients with left ATL lesions the defect selectively concerns famous people naming. Furthermore, some patients with right ATL lesions and intact face familiarity feelings show a defect in the retrieval of person-specific semantic knowledge greater from faces than from names. These data are at variance with current models assuming: (a) that familiarity feelings are generated at the level of person identity nodes (PINs), where information processed by various sensory modalities converges, and (b) that PINs provide a modality-free gateway to a single semantic system, where information about people is stored in an amodal format. They suggest, on the contrary: (a) that familiarity feelings are generated at the level of modality-specific recognition units; (b) that face and voice recognition units are represented more in the right than in the left ATL; and (c) that the right ATL mainly stores person-specific information based on a convergence of perceptual information, whereas the left ATL represents verbally-mediated person-specific information.

  9. Voice recognition software can be used for scientific articles

    DEFF Research Database (Denmark)

    Pommergaard, Hans-Christian; Huang, Chenxi; Burcharth, Jacob

    2015-01-01

    INTRODUCTION: Dictation of scientific articles has been recognised as an efficient method for producing high-quality, first article drafts. However, standardised transcription service by a secretary may not be available for all researchers and voice recognition software (VRS) may therefore...... with a median score of five (range: 3-9), which was improved with the addition of 5,000 words. CONCLUSION: The out-of-the-box performance of VRS was acceptable and improved after additional words were added. Further studies are needed to investigate the effect of additional software accuracy training....

  10. Voice recognition software can be used for scientific articles

    DEFF Research Database (Denmark)

    Pommergaard, Hans-Christian; Huang, Chenxi; Burcharth, Jacob

    2015-01-01

    INTRODUCTION: Dictation of scientific articles has been recognised as an efficient method for producing high-quality, first article drafts. However, standardised transcription service by a secretary may not be available for all researchers and voice recognition software (VRS) may therefore...... be an alternative. The purpose of this study was to evaluate the out-of-the-box accuracy of VRS. METHODS: Eleven young researchers without dictation experience dictated the first draft of their own scientific article after thorough preparation according to a pre-defined schedule. The dictate transcribed by VRS...

  11. Evaluating a voice recognition system: finding the right product for your department.

    Science.gov (United States)

    Freeh, M; Dewey, M; Brigham, L

    2001-06-01

    The Department of Radiology at the University of Utah Health Sciences Center has been in the process of transitioning from the traditional film-based department to a digital imaging department for the past 2 years. The department is now transitioning from the traditional method of dictating reports (dictation by radiologist to transcription to review and signing by radiologist) to a voice recognition system. The transition to digital operations will not be complete until we have the ability to directly interface the dictation process with the image review process. Voice recognition technology has advanced to the level where it can and should be an integral part of the new way of working in radiology and is an integral part of an efficient digital imaging department. The transition to voice recognition requires the task of identifying the product and the company that will best meet a department's needs. This report introduces the methods we used to evaluate the vendors and the products available as we made our purchasing decision. We discuss our evaluation method and provide a checklist that can be used by other departments to assist with their evaluation process. The criteria used in the evaluation process fall into the following major categories: user operations, technical infrastructure, medical dictionary, system interfaces, service support, cost, and company strength. Conclusions drawn from our evaluation process will be detailed, with the intention being to shorten the process for others as they embark on a similar venture. As more and more organizations investigate the many products and services that are now being offered to enhance the operations of a radiology department, it becomes increasingly important that solid methods are used to most effectively evaluate the new products. This report should help others complete the task of evaluating a voice recognition system and may be adaptable to other products as well.

  12. Understanding the mechanisms of familiar voice-identity recognition in the human brain.

    Science.gov (United States)

    Maguinness, Corrina; Roswandowitz, Claudia; von Kriegstein, Katharina

    2018-03-31

    Humans have a remarkable skill for voice-identity recognition: most of us can remember many voices that surround us as 'unique'. In this review, we explore the computational and neural mechanisms which may support our ability to represent and recognise a unique voice-identity. We examine the functional architecture of voice-sensitive regions in the superior temporal gyrus/sulcus, and bring together findings on how these regions may interact with each other, and additional face-sensitive regions, to support voice-identity processing. We also contrast findings from studies on neurotypicals and clinical populations which have examined the processing of familiar and unfamiliar voices. Taken together, the findings suggest that representations of familiar and unfamiliar voices might dissociate in the human brain. Such an observation does not fit well with current models for voice-identity processing, which by-and-large assume a common sequential analysis of the incoming voice signal, regardless of voice familiarity. We provide a revised audio-visual integrative model of voice-identity processing which brings together traditional and prototype models of identity processing. This revised model includes a mechanism of how voice-identity representations are established and provides a novel framework for understanding and examining the potential differences in familiar and unfamiliar voice processing in the human brain. Copyright © 2018 Elsevier Ltd. All rights reserved.

  13. A Voice Processing Technology for Rural Specific Context

    Science.gov (United States)

    He, Zhiyong; Zhang, Zhengguang; Zhao, Chunshen

    During the promotion and application of rural informatization, voice interaction across different regional dialects is a very complex issue. Through an in-depth analysis of TTS core technologies, this paper presents methods for intelligent segmentation, word segmentation algorithms and intelligent voice thesaurus construction in different dialect contexts. It then describes a COM-based development methodology and programming method for implementing a voice processing system for this specific context. The method has reference value for rural dialect and voice processing applications.

  14. Analysis And Voice Recognition In Indonesian Language Using MFCC And SVM Method

    Directory of Open Access Journals (Sweden)

    Harvianto Harvianto

    2016-06-01

    Voice recognition technology is one of the biometric technologies. The voice is a unique human trait that allows one individual to be easily distinguished from another. Voice can also provide information such as the gender, emotion, and identity of the speaker. This research records human voices pronouncing the digits 0 through 9, with and without noise. Features of these recordings are extracted using Mel Frequency Cepstral Coefficients (MFCC). The mean, standard deviation, max, min, and combinations of them are used to construct the feature vectors. These feature vectors are then classified using a Support Vector Machine (SVM). There are two classification models: the first is based on the speaker and the second on the digit pronounced. The classification models are then validated by performing 10-fold cross-validation. The best average accuracy across the two classification models is 91.83%. This result is achieved using Mean + Standard deviation + Min + Max as features.
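
    A minimal sketch of the pipeline described above: MFCC extraction, Mean + Standard deviation + Min + Max summary features, and an SVM validated with 10-fold cross-validation. It assumes librosa and scikit-learn; the file names and labels are placeholders, not the study's data.

      # Minimal sketch: MFCC summary statistics + SVM with 10-fold cross-validation.
      # File paths and labels are placeholders; librosa and scikit-learn are assumed.
      import numpy as np
      import librosa
      from sklearn.model_selection import cross_val_score
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.svm import SVC

      def mfcc_stats(path, n_mfcc=13):
          y, sr = librosa.load(path, sr=None)
          mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # shape: (n_mfcc, n_frames)
          # Mean + standard deviation + min + max per coefficient (the best-performing combination).
          return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1),
                                 mfcc.min(axis=1), mfcc.max(axis=1)])

      # Placeholder recordings: 10 takes of each digit 0-9.
      wav_files = [f"digit_{d}_take_{t}.wav" for d in range(10) for t in range(10)]
      labels = np.array([d for d in range(10) for _ in range(10)])

      X = np.vstack([mfcc_stats(f) for f in wav_files])
      clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
      scores = cross_val_score(clf, X, labels, cv=10)              # 10-fold cross-validation
      print("mean accuracy:", scores.mean())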

  15. Effect of voice recognition on radiologist reporting time

    International Nuclear Information System (INIS)

    Bhan, S.N.; Coblentz, C.L.; Norman, G.R.; Ali, S.H.

    2008-01-01

    To study the effect that voice recognition (VR) has on radiologist reporting efficiency in a clinical setting and to identify variables associated with faster reporting time. Five radiologists were observed during the routine reporting of 402 plain radiograph studies using either VR (n = 217) or conventional dictation (CD) (n = 185). Two radiologists were observed reporting 66 computed tomography (CT) studies using either VR (n = 39) or CD (n = 27). The time spent per reporting cycle, defined as the radiologist's time spent on a study from report finalization to the subsequent report finalization, was compared. As well, characteristics about the radiologist and their reporting style were collected and correlated against reporting time. For plain radiographs, radiologists took 134% (P = 0.048) more time to produce reports using VR, but there was significant variability between radiologists. Significant associations with faster reporting times using VR included: English as a first language (r = -0.24), use of a template (r = -0.34), use of a headset microphone (r = -0.46), and increased experience with VR (r = -0.43). Experience as a staff radiologist and having a previous study for comparison did not correlate with reporting time. For CT, there was no significant difference in reporting time identified between VR and CD (P = 0.61). Overall, VR slightly decreases the reporting efficiency of radiologists. However, efficiency may be improved if English is a first language and a headset microphone, macros and templates are used. (author)

  16. Voice recognition software can be used for scientific articles.

    Science.gov (United States)

    Pommergaard, Hans-Christian; Huang, Chenxi; Burcharth, Jacob; Rosenberg, Jacob

    2015-02-01

    Dictation of scientific articles has been recognised as an efficient method for producing high-quality, first article drafts. However, standardised transcription service by a secretary may not be available for all researchers and voice recognition software (VRS) may therefore be an alternative. The purpose of this study was to evaluate the out-of-the-box accuracy of VRS. Eleven young researchers without dictation experience dictated the first draft of their own scientific article after thorough preparation according to a pre-defined schedule. The dictate transcribed by VRS was compared with the same dictate transcribed by an experienced research secretary, and the effect of adding words to the vocabulary of the VRS was investigated. The number of errors per hundred words was used as outcome. Furthermore, three experienced researchers assessed the subjective readability using a Likert scale (0-10). Dragon Nuance Premium version 12.5 was used as VRS. The median number of errors per hundred words was 18 (range: 8.5-24.3), which improved when 15,000 words were added to the vocabulary. Subjective readability assessment showed that the texts were understandable with a median score of five (range: 3-9), which was improved with the addition of 5,000 words. The out-of-the-box performance of VRS was acceptable and improved after additional words were added. Further studies are needed to investigate the effect of additional software accuracy training.

  17. Superior voice recognition in a patient with acquired prosopagnosia and object agnosia.

    Science.gov (United States)

    Hoover, Adria E N; Démonet, Jean-François; Steeves, Jennifer K E

    2010-11-01

    Anecdotally, it has been reported that individuals with acquired prosopagnosia compensate for their inability to recognize faces by using other person identity cues such as hair, gait or the voice. Are they therefore superior at the use of non-face cues, specifically voices, to person identity? Here, we empirically measure person and object identity recognition in a patient with acquired prosopagnosia and object agnosia. We quantify person identity (face and voice) and object identity (car and horn) recognition for visual, auditory, and bimodal (visual and auditory) stimuli. The patient is unable to recognize faces or cars, consistent with his prosopagnosia and object agnosia, respectively. He is perfectly able to recognize people's voices and car horns and bimodal stimuli. These data show a reverse shift in the typical weighting of visual over auditory information for audiovisual stimuli in a compromised visual recognition system. Moreover, the patient shows selectively superior voice recognition compared to the controls revealing that two different stimulus domains, persons and objects, may not be equally affected by sensory adaptation effects. This also implies that person and object identity recognition are processed in separate pathways. These data demonstrate that an individual with acquired prosopagnosia and object agnosia can compensate for the visual impairment and become quite skilled at using spared aspects of sensory processing. In the case of acquired prosopagnosia it is advantageous to develop a superior use of voices for person identity recognition in everyday life. Copyright © 2010 Elsevier Ltd. All rights reserved.

  18. Page Recognition: Quantum Leap In Recognition Technology

    Science.gov (United States)

    Miller, Larry

    1989-07-01

    No milestone has proven as elusive as the always-approaching "year of the LAN," but the "year of the scanner" might claim the silver medal. Desktop scanners have been around almost as long as personal computers. And everyone thinks they are used for obvious desktop-publishing and business tasks like scanning business documents, magazine articles and other pages, and translating those words into files your computer understands. But, until now, the reality fell far short of the promise. Because it's true that scanners deliver an accurate image of the page to your computer, but the software to recognize this text has been woefully disappointing. Old optical-character recognition (OCR) software recognized such a limited range of pages as to be virtually useless to real users. (For example, one OCR vendor specified 12-point Courier font from an IBM Selectric typewriter: the same font in 10-point, or from a Diablo printer, was unrecognizable!) Computer dealers have told me the chasm between OCR expectations and reality is so broad and deep that nine out of ten prospects leave their stores in disgust when they learn the limitations. And this is a very important, very unfortunate gap. Because the promise of recognition -- what people want it to do -- carries with it tremendous improvements in our productivity and ability to get tons of written documents into our computers where we can do real work with it. The good news is that a revolutionary new development effort has led to the new technology of "page recognition," which actually does deliver the promise we've always wanted from OCR. I'm sure every reader appreciates the breakthrough represented by the laser printer and page-makeup software, a combination so powerful it created new reasons for buying a computer. A similar breakthrough is happening right now in page recognition: the Macintosh (and, I must admit, other personal computers) equipped with a moderately priced scanner and OmniPage software (from Caere

  19. Voice recognition for radiology reporting: Is it good enough?

    International Nuclear Information System (INIS)

    Rana, D.S.; Hurst, G.; Shepstone, L.; Pilling, J.; Cockburn, J.; Crawford, M.

    2005-01-01

    AIM: To compare the efficiency and accuracy of radiology reports generated by voice recognition (VR) against the traditional tape dictation-transcription (DT) method. MATERIALS AND METHODS: Two hundred and twenty previously reported computed radiography (CR) and cross-sectional imaging (CSI) examinations were separately entered into the Radiology Information System (RIS) using both VR and DT. The times taken and errors found in the reports were compared using univariate analyses based upon the sign-test, and a general linear model constructed to examine the mean differences between the two methods. RESULTS: There were significant reductions (p<0.001) in the mean difference in the reporting times using VR compared with DT for the two reporting methods assessed (CR, +67.4; CSI, +122.1 s). There was a significant increase in the mean difference in the actual radiologist times using VR compared with DT in the CSI reports; -14.3 s, p=0.037 (more experienced user); -13.7 s, p=0.014 (less experienced user). There were significantly more total and major errors when using VR compared with DT for CR reports (-0.25 and -0.26, respectively), and in total errors for CSI (-0.75, p<0.001), but no difference in major errors (-0.16, p=0.168). Although there were significantly more errors with VR in the less experienced group of users (mean difference in total errors -0.90, and major errors -0.40, p<0.001), there was no significant difference in the more experienced (p=0.419 and p=0.814, respectively). CONCLUSIONS: VR is a viable reporting method for experienced users, with a quicker overall report production time (despite an increase in the radiologists' time) and a tendency to more errors for inexperienced users

  20. Voice Assessment of Student Work: Recent Studies and Emerging Technologies

    Science.gov (United States)

    Eckhouse, Barry; Carroll, Rebecca

    2013-01-01

    Although relatively little attention has been given to the voice assessment of student work, at least when compared with more traditional forms of text-based review, the attention it has received strongly points to a promising form of review that has been hampered by the limits of an emerging technology. A fresh review of voice assessment in light…

  1. Voicing the Technological Body. Some Musicological Reflections on Combinations of Voice and Technology in Popular Music

    Directory of Open Access Journals (Sweden)

    Florian Heesch

    2016-05-01

    Full Text Available The article deals with interrelations of voice, body and technology in popular music from a musicological perspective. It is an attempt to outline a systematic approach to the history of music technology with regard to aesthetic aspects, taking the identity of the singing subject as a main point of departure for a hermeneutic reading of popular song. Although the argumentation is based largely on musicological research, it is also inspired by the notion of presentness as developed by theologian and media scholar Walter Ong. The variety of the relationships between voice, body, and technology with regard to musical representations of identity, in particular gender and race, is systematized alongside the following cagories: (1 the “absence of the body,” that starts with the establishment of phonography; (2 “amplified presence,” as a signifier for uses of the microphone to enhance low sounds in certain manners; and (3 “hybridity,” including vocal identities that blend human body sounds and technological processing, whereby special focus is laid on uses of the vocoder and similar technologies.

  2. Evolving Spiking Neural Networks for Recognition of Aged Voices.

    Science.gov (United States)

    Silva, Marco; Vellasco, Marley M B R; Cataldo, Edson

    2017-01-01

    The aging of the voice, known as presbyphonia, is a natural process that can cause great change in vocal quality of the individual. This is a relevant problem to those people who use their voices professionally, and its early identification can help determine a suitable treatment to avoid its progress or even to eliminate the problem. This work focuses on the development of a new model for the identification of aging voices (independently of their chronological age), using as input attributes parameters extracted from the voice and glottal signals. The proposed model, named Quantum binary-real evolving Spiking Neural Network (QbrSNN), is based on spiking neural networks (SNNs), with an unsupervised training algorithm, and a Quantum-Inspired Evolutionary Algorithm that automatically determines the most relevant attributes and the optimal parameters that configure the SNN. The QbrSNN model was evaluated in a database composed of 120 records, containing samples from three groups of speakers. The results obtained indicate that the proposed model provides better accuracy than other approaches, with fewer input attributes. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  3. Motorcycle Start-stop System based on Intelligent Biometric Voice Recognition

    Science.gov (United States)

    Winda, A.; E Byan, W. R.; Sofyan; Armansyah; Zariantin, D. L.; Josep, B. G.

    2017-03-01

    The mechanical key currently used in motorcycles is prone to burglary, being stolen, or misplaced. Intelligent biometric voice recognition is proposed as an alternative to replace this mechanism. The proposed system decides whether the voice belongs to the user and whether the word uttered by the user is 'On' or 'Off'. The decision is sent to an Arduino in order to start or stop the engine. The recorded voice is processed to obtain features which are later used as input to the proposed system. The Mel-Frequency Cepstral Coefficient (MFCC) is adopted as the feature extraction technique. The extracted features are then used as input to the SVM-based identifier. Experimental results confirm the effectiveness of the proposed intelligent voice recognition and word recognition system. The proposed method produces good training and testing accuracy, 99.31% and 99.43%, respectively. Moreover, the proposed system shows a false rejection rate (FRR) of 0.18% and a false acceptance rate (FAR) of 17.58%. For the intelligent word recognition, the training and testing accuracy are 100% and 96.3%, respectively.
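
    The record above describes a standard MFCC-plus-SVM pipeline. As an illustration of that general approach only (not the authors' implementation; the file names, labels and SVM parameters below are assumptions), a minimal Python sketch:

    ```python
    # Illustrative MFCC + SVM speaker/keyword classifier (not the paper's code).
    import numpy as np
    import librosa
    from sklearn.svm import SVC

    def mfcc_features(path, sr=16000, n_mfcc=13):
        """Load a clip and return a fixed-length vector (mean and std of its MFCCs)."""
        y, sr = librosa.load(path, sr=sr)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
        return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

    # Placeholder file names and labels; a real system would use many clips per class.
    paths = ["user_on_01.wav", "user_off_01.wav", "other_speaker_01.wav"]
    labels = ["on", "off", "impostor"]

    X = np.array([mfcc_features(p) for p in paths])
    clf = SVC(kernel="rbf", C=10.0, gamma="scale").fit(X, np.array(labels))
    print(clf.predict(X[:1]))   # re-classify the first clip as a smoke test
    ```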

  4. Effects of emotional and perceptual-motor stress on a voice recognition system's accuracy: An applied investigation

    Science.gov (United States)

    Poock, G. K.; Martin, B. J.

    1984-02-01

    This was an applied investigation examining the ability of a speech recognition system to recognize speakers' inputs when the speakers were under different stress levels. Subjects were asked to speak to a voice recognition system under three conditions: (1) normal office environment, (2) emotional stress, and (3) perceptual-motor stress. Results indicate a definite relationship between voice recognition system performance and the type of low stress reference patterns used to achieve recognition.

  5. Pengoperasian Beban Listrik Fase Tunggal Terkendali Melalui Minimum System Berbasis Mikrokontroler Dan Sensor Voice Recognition (Vr)

    OpenAIRE

    Goeritno, Arief; Ginting, Sandy Ferdiansyah; Yatim, Rakhmad

    2017-01-01

    A microcontroller-based minimum system with a voice recognition (VR) sensor as the actuator controller has been used to operate a single-phase electrical load. The minimum system is a system assembled in two stages, namely (a) the circuit diagram and the physical form of the board and (b) the integrated wiring of the minimum system on the ATmega16 microcontroller system. The microcontroller system within the minimum system requires an embedded program, implemented through language-based programming ...

  6. Impact of a voice recognition system on report cycle time and radiologist reading time

    Science.gov (United States)

    Melson, David L.; Brophy, Robert; Blaine, G. James; Jost, R. Gilbert; Brink, Gary S.

    1998-07-01

    Because of its exciting potential to improve clinical service, as well as reduce costs, a voice recognition system for radiological dictation was recently installed at our institution. This system will be clinically successful if it dramatically reduces radiology report turnaround time without substantially affecting radiologist dictation and editing time. This report summarizes an observer study currently under way in which radiologist reporting times using the traditional transcription system and the voice recognition system are compared. Four radiologists are observed interpreting portable intensive care unit (ICU) chest examinations at a workstation in the chest reading area. Data are recorded with the radiologists using the transcription system and using the voice recognition system. The measurements distinguish between time spent performing clerical tasks and time spent actually dictating the report. Editing time and the number of corrections made are recorded. Additionally, statistics are gathered to assess the voice recognition system's impact on the report cycle time -- the time from report dictation to availability of an edited and finalized report -- and the length of reports.

  7. The recognition of female voice based on voice registers in singing techniques in real-time using hankel transform method and macdonald function

    Science.gov (United States)

    Meiyanti, R.; Subandi, A.; Fuqara, N.; Budiman, M. A.; Siahaan, A. P. U.

    2018-03-01

    A singer does not simply recite the lyrics of a song, but also uses particular vocal techniques to make it more beautiful. In singing, female voices typically span a more diverse set of voice registers than male voices. There are many registers of the human voice, but the registers used while singing include, among others, chest voice, head voice, falsetto, and vocal fry. This research on recognition of the female voice registers used in singing technique was built using Borland Delphi 7.0. The recognition process is performed on input from recorded voice samples and also in real time. The voice input yields weight energy values based on calculations using the Hankel transform method and Macdonald functions. The results showed that the accuracy of the system depends on the accuracy of the vocal technique that is trained and tested; the average recognition rate for recorded voice registers reached 48.75 percent, while the average recognition rate in real time reached 57 percent.
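
    The exact "weight energy" formulation used in the record above is not reproduced here; the sketch below only illustrates the two mathematical ingredients it names, a quadrature-based Hankel transform and the Macdonald function (the modified Bessel function of the second kind), with a placeholder signal and assumed grids and weighting:

    ```python
    # Quadrature-based Hankel transform plus a Macdonald-function weighting (illustrative;
    # the paper's exact pipeline and parameters are not specified here).
    import numpy as np
    from scipy.special import jv, kv   # jv: Bessel J_nu, kv: Macdonald function K_nu

    def hankel_transform(f, r, k, order=0):
        """Approximate H_nu{f}(k) = int f(r) J_nu(k r) r dr on the uniform grid r."""
        dr = r[1] - r[0]
        return np.array([np.sum(f * jv(order, ki * r) * r) * dr for ki in k])

    r = np.linspace(1e-3, 1.0, 2048)                 # time-like grid (assumption)
    f = np.exp(-5 * r) * np.sin(2 * np.pi * 8 * r)   # placeholder signal envelope
    k = np.linspace(0.1, 50.0, 200)                  # transform-variable grid (assumption)

    H = hankel_transform(f, r, k, order=0)
    weights = kv(0, k / k.max())                     # Macdonald-function weighting (assumption)
    print("weight energy:", np.sum(weights * H**2))  # one plausible scalar summary
    ```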

  8. Artificially intelligent recognition of Arabic speaker using voice print-based local features

    Science.gov (United States)

    Mahmood, Awais; Alsulaiman, Mansour; Muhammad, Ghulam; Akram, Sheeraz

    2016-11-01

    Local features for any pattern recognition system are based on the information extracted locally. In this paper, a local feature extraction technique was developed. This feature was extracted in the time-frequency plane by taking the moving average along the diagonal directions of the time-frequency plane. This feature captured the time-frequency events, producing a unique pattern for each speaker that can be viewed as a voice print of the speaker. Hence, we referred to this technique as a voice print-based local feature. The proposed feature was compared to other features, including the mel-frequency cepstral coefficient (MFCC), for speaker recognition using two different databases. One of the databases used in the comparison is a subset of an LDC database that consisted of two short sentences uttered by 182 speakers. The proposed feature attained a 98.35% recognition rate compared to 96.7% for MFCC using the LDC subset.
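
    One possible reading of the diagonal moving-average idea described above (the window length, spectrogram settings and summary statistic are assumptions; this is not the authors' implementation):

    ```python
    # Illustrative diagonal moving-average feature over a log-magnitude spectrogram.
    import numpy as np
    from scipy.signal import spectrogram

    def diagonal_ma_feature(x, fs=16000, win=5):
        """Smooth each diagonal of the time-frequency plane with a length-`win` moving
        average and keep its mean; the collected means form the feature vector."""
        _, _, S = spectrogram(x, fs=fs, nperseg=256, noverlap=128)
        S = np.log1p(S)
        kernel = np.ones(win) / win
        feats = []
        for offset in range(-(S.shape[0] - 1), S.shape[1]):
            d = np.diagonal(S, offset=offset)
            if d.size >= win:
                d = np.convolve(d, kernel, mode="valid")   # moving average along the diagonal
            feats.append(d.mean())
        return np.array(feats)

    x = np.random.randn(16000)            # placeholder standing in for one second of speech
    print(diagonal_ma_feature(x).shape)   # feature length depends on the spectrogram size
    ```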

  9. Educational Technology and Student Voice: Examining Teacher Candidates' Perceptions

    Science.gov (United States)

    Byker, Erik Jon; Putman, S. Michael; Handler, Laura; Polly, Drew

    2017-01-01

    Student Voice is a term that honors the participatory roles that students have when they enter learning spaces like classrooms. Student Voice is the recognition of students' choice, creativity, and freedom. Seminal educationists--like Dewey and Montessori--centered the purposes of education in the flourishing and valuing of Student Voice. This…

  10. Voice Activity Detection. Fundamentals and Speech Recognition System Robustness

    OpenAIRE

    Ramirez, J.; Gorriz, J. M.; Segura, J. C.

    2007-01-01

    This chapter presents an overview of the main challenges in robust speech detection and a review of the state of the art and applications. VADs are frequently used in a number of applications including speech coding, speech enhancement and speech recognition. A precise VAD extracts a set of discriminative speech features from the noisy speech and formulates the decision in terms of a well-defined rule. The chapter summarizes three robust VAD methods that yield high speech/non-speech discri...
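
    As a minimal illustration of the kind of well-defined decision rule a VAD applies (a plain energy threshold, not one of the three methods summarized in the chapter; the frame length and threshold are assumptions):

    ```python
    # Minimal energy-based voice activity detector (illustrative; not from the chapter).
    import numpy as np

    def energy_vad(x, frame_len=400, hop=160, threshold_db=-20.0):
        """Return a boolean speech/non-speech decision per frame based on log energy."""
        n_frames = 1 + max(0, (len(x) - frame_len) // hop)
        decisions = np.zeros(n_frames, dtype=bool)
        ref = np.max(np.abs(x)) + 1e-12
        for i in range(n_frames):
            frame = x[i * hop : i * hop + frame_len]
            energy_db = 20 * np.log10(np.sqrt(np.mean(frame**2)) / ref + 1e-12)
            decisions[i] = energy_db > threshold_db   # rule: frame energy above threshold
        return decisions

    # Placeholder input: quiet noise with a louder tone burst standing in for speech.
    x = 0.01 * np.random.randn(16000)
    x[6000:10000] += 0.5 * np.sin(2 * np.pi * 300 * np.arange(4000) / 16000)
    print(energy_vad(x).astype(int))
    ```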

  11. Use of voice recognition software in an outpatient pediatric specialty practice.

    Science.gov (United States)

    Issenman, Robert M; Jaffer, Iqbal H

    2004-09-01

    Voice recognition software (VRS), with specialized medical vocabulary, is being promoted to enhance physician efficiency, decrease costs, and improve patient safety. This study reports the experience of a pediatric subspecialist (pediatric gastroenterology) physician with the use of Dragon Naturally Speaking (version 6; ScanSoft Inc, Peabody, MA), incorporated for use with a proprietary electronic medical record, in a large university medical center ambulatory care service. After 2 hours of group orientation and 2 hours of individual VRS instruction, the physician trained the software for 1 month (30 letters) during a hospital slowdown. Set-up, dictation, and correction times for the physician and medical transcriptionist were recorded for these training sessions, as well as for 42 subsequently dictated letters. Figures were extrapolated to the yearly clinic volume for the physician, to estimate costs (physician: 110 dollars per hour; transcriptionist: 11 dollars per hour, US dollars). The use of VRS required an additional 200% of physician dictation and correction time (9 minutes vs 3 minutes), compared with the use of electronic signatures for letters typed by an experienced transcriptionist and imported into the electronic medical record. When the cost of the license agreement and the costs of physician and transcriptionist time were included, the use of the software cost 100% more, for the amount of dictation performed annually by the physician. VRS is an intriguing technology. It holds the possibility of streamlining medical practice. However, the learning curve and accuracy of the tested version of the software limit broad physician acceptance at this time.

  12. Voice recognition through phonetic features with Punjabi utterances

    Science.gov (United States)

    Kaur, Jasdeep; Juglan, K. C.; Sharma, Vishal; Upadhyay, R. K.

    2017-07-01

    This paper deals with the perception and disorders of speech in view of the Punjabi language. Given the importance of voice identification, various parameters of speaker identification have been studied. The speech material was recorded with a tape recorder in both the normal and disguised modes of utterance. From the recorded speech materials, the utterances free from noise were selected for auditory and acoustic spectrographic analysis. The comparison of normal and disguised speech of seven subjects is reported. The fundamental frequency (F0) at similar places, the plosive duration at certain phonemes, the amplitude ratio (A1:A2), etc., were compared in normal and disguised speech. It was found that the formant frequencies of normal and disguised speech remain almost similar only if they are compared at positions of the same vowel quality and quantity. If the vowel is more closed or more open in the disguised utterance, the formant frequency will change in comparison to the normal utterance. The amplitude ratio (A1:A2) is found to be speaker dependent. It remains unchanged in the disguised utterance. However, this value may shift in the disguised utterance if cross-sectioning is not done at the same location.

  13. Advanced distributed simulation technology: Digital Voice Gateway Reference Guide

    Science.gov (United States)

    Vanhook, Dan; Stadler, Ed

    1994-01-01

    The Digital Voice Gateway (referred to as the 'DVG' in this document) transmits and receives four full-duplex encoded speech channels over Ethernet. The information in this document applies only to DVGs running the firmware version listed on the title page. This document, previously named Digital Voice Gateway Reference Guide, BBN Systems and Technologies Corporation, Cambridge, MA 02138, was revised for revision 2.00. The new revision changes the network protocol used by the DVG to comply with the SINCGARS radio simulation (for SIMNET 6.6.1). Because of the extensive changes in revision 2.00, a separate document was created rather than supplying change pages.

  14. Two-component network model in voice identification technologies

    Directory of Open Access Journals (Sweden)

    Edita K. Kuular

    2018-03-01

    Full Text Available Among the most important parameters of biometric systems with voice modalities that determine their effectiveness, along with reliability and noise immunity, the speed of identification and verification of a person has been accentuated. This parameter is especially sensitive when processing large-scale voice databases in a real-time regime. Many research studies in this area are aimed at developing new, and improving existing, algorithms for representing and processing voice records to ensure high performance of voice biometric systems. Here it seems promising to apply a modern approach based on the complex network platform for solving massive problems with a large number of elements, taking their interrelationships into account. There are known works which, while solving problems of analysis and recognition of faces from photographs, transform the images into complex networks for subsequent processing by standard techniques. One of the first applications of complex networks to sound series (musical and speech analysis) is the description of frequency characteristics by constructing network models, i.e., converting the series into networks. On the network ontology platform, a previously proposed technique of audio information representation aimed at its automatic analysis and speaker recognition has been developed. This implies converting the information into the form of an associative semantic (cognitive) network structure with both amplitude and frequency components. Two speaker exemplars have been recorded and transformed into pertinent networks, with subsequent comparison of their topological metrics. The set of topological metrics for each of the network models (the amplitude and frequency ones) is a vector, and together they combine into a matrix, serving as a digital "network" voiceprint. The proposed network approach, with its sensitivity to personal conditions (physiological, psychological, emotional), might be useful not only for person identification
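
    The associative semantic network construction itself is not detailed in the record above; as a generic stand-in for converting a series into a network and comparing topological metrics, the sketch below builds a natural visibility graph over two placeholder "speaker" envelopes and compares small metric vectors:

    ```python
    # Illustrative series-to-network conversion and topological "voiceprint" comparison.
    # The paper's associative semantic network is not reproduced; a natural visibility
    # graph over a signal envelope is used here as a generic stand-in.
    import numpy as np
    import networkx as nx

    def visibility_graph(series):
        """Natural visibility graph: nodes are samples; i and j are linked if every
        intermediate sample lies strictly below the straight line joining them."""
        g = nx.Graph()
        n = len(series)
        g.add_nodes_from(range(n))
        for i in range(n - 1):
            for j in range(i + 1, n):
                visible = all(
                    series[k] < series[i] + (series[j] - series[i]) * (k - i) / (j - i)
                    for k in range(i + 1, j)
                )
                if visible:
                    g.add_edge(i, j)
        return g

    def topological_metrics(g):
        """A small metric vector: density, average clustering, mean degree."""
        degrees = [d for _, d in g.degree()]
        return np.array([nx.density(g), nx.average_clustering(g), np.mean(degrees)])

    # Placeholder envelopes standing in for two speakers' recordings.
    speaker_a = np.abs(np.sin(np.linspace(0, 6 * np.pi, 80))) + 0.1 * np.random.rand(80)
    speaker_b = np.abs(np.sin(np.linspace(0, 9 * np.pi, 80))) + 0.1 * np.random.rand(80)

    va, vb = (topological_metrics(visibility_graph(s)) for s in (speaker_a, speaker_b))
    print("metric vectors:", va, vb, "distance:", np.linalg.norm(va - vb))
    ```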

  15. Emotion Recognition From Singing Voices Using Contemporary Commercial Music and Classical Styles.

    Science.gov (United States)

    Hakanpää, Tua; Waaramaa, Teija; Laukkanen, Anne-Maria

    2018-02-22

    This study examines the recognition of emotion in contemporary commercial music (CCM) and classical styles of singing. This information may be useful in improving the training of interpretation in singing. This is an experimental comparative study. Thirteen singers (11 female, 2 male) with a minimum of 3 years' professional-level singing studies (in CCM or classical technique or both) participated. They sang at three pitches (females: a, e1, a1, males: one octave lower) expressing anger, sadness, joy, tenderness, and a neutral state. Twenty-nine listeners listened to 312 short (0.63- to 4.8-second) voice samples, 135 of which were sung using a classical singing technique and 165 of which were sung in a CCM style. The listeners were asked which emotion they heard. Activity and valence were derived from the chosen emotions. The percentage of correct recognitions out of all the answers in the listening test (N = 9048) was 30.2%. The recognition percentage for the CCM-style singing technique was higher (34.5%) than for the classical-style technique (24.5%). Valence and activation were better perceived than the emotions themselves, and activity was better recognized than valence. A higher pitch was more likely to be perceived as joy or anger, and a lower pitch as sorrow. Both valence and activation were better recognized in the female CCM samples than in the other samples. There are statistically significant differences in the recognition of emotions between classical and CCM styles of singing. Furthermore, in the singing voice, pitch affects the perception of emotions, and valence and activity are more easily recognized than emotions. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  16. Cultural in-group advantage: emotion recognition in African American and European American faces and voices.

    Science.gov (United States)

    Wickline, Virginia B; Bailey, Wendy; Nowicki, Stephen

    2009-03-01

    The authors explored whether there were in-group advantages in emotion recognition of faces and voices by culture or geographic region. Participants were 72 African American students (33 men, 39 women), 102 European American students (30 men, 72 women), 30 African international students (16 men, 14 women), and 30 European international students (15 men, 15 women). The participants determined emotions in African American and European American faces and voices. Results showed an in-group advantage, sometimes by culture, less often by race, in recognizing facial and vocal emotional expressions. African international students were generally less accurate at interpreting American nonverbal stimuli than were European American, African American, and European international peers. Results suggest that, although partly universal, emotional expressions have subtle differences across cultures that persons must learn.

  17. Village voice: towards inclusive information technologies

    Energy Technology Data Exchange (ETDEWEB)

    Garside, Ben

    2009-04-15

    A decade ago it was dubbed the 'digital divide'. Now, the gap in information and communications technologies (ICTs) between North and South is gradually shrinking. The developing world accounts for two-thirds of total mobile phone subscriptions, and Africa has the world's fastest growing mobile phone market. By gaining a toehold in affordable ICTs, the poor can access the knowledge and services they need, such as real-time market prices, to boost their livelihoods. But to be sustainable, technologies need to factor in social realities. These include how people already share knowledge, and adapt to introduced technologies: mobile phones, for instance, confer status but can eat into much-needed income. Many development agencies opt for technology-led solutions that fail to 'take'. Approaches that keep development concerns at their core and people as their central focus are key.

  19. A memory like a female Fur Seal: long-lasting recognition of pup's voice by mothers.

    Science.gov (United States)

    Mathevon, Nicolas; Charrier, Isabelle; Aubin, Thierry

    2004-06-01

    In colonial mammals like fur seals, mutual vocal recognition between mothers and their pup is of primary importance for breeding success. Females alternate feeding sea-trips with suckling periods on land, and when coming back from the ocean, they have to vocally find their offspring among numerous similar-looking pups. Young fur seals emit a 'mother-attraction call' that presents individual characteristics. In this paper, we review the perceptual process of pup's call recognition by Subantarctic Fur Seal Arctocephalus tropicalis mothers. To identify their progeny, females rely on the frequency modulation pattern and spectral features of this call. As the acoustic characteristics of a pup's call change throughout the lactation period due to the growing process, mothers have thus to refine their memorization of their pup's voice. Field experiments show that female Fur Seals are able to retain all the successive versions of their pup's call.

  20. An Introduction to Face Recognition Technology

    Directory of Open Access Journals (Sweden)

    Shang-Hung Lin

    2000-01-01

    Full Text Available Face recognition has recently attracted much attention in the network multimedia information access community.  Areas such as network security, content indexing and retrieval, and video compression benefit from face recognition technology because "people" are the center of attention in a lot of video.  Network access control via face recognition not only makes it virtually impossible for hackers to steal one's "password", but also increases the user-friendliness of human-computer interaction.  Indexing and/or retrieving video data based on the appearances of particular persons will be useful for users such as news reporters, political scientists, and moviegoers.  For videophone and teleconferencing applications, face recognition also enables a more efficient coding scheme.  In this paper, we give an introductory overview of this new information processing technology.  The paper shows the reader the generic framework for a face recognition system and the variants that are frequently encountered by the face recognizer.  Several famous face recognition algorithms, such as eigenfaces and neural networks, are also explained.
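
    The eigenfaces idea mentioned above can be sketched compactly as PCA over flattened face images followed by nearest-neighbour matching in the reduced space; the image array and labels below are random placeholders, not data from the paper:

    ```python
    # Minimal eigenfaces sketch: PCA over flattened face images, then nearest-neighbour
    # matching in the reduced "face space" (illustrative; not the paper's implementation).
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.neighbors import KNeighborsClassifier

    # Placeholder data: 40 "face images" of 32x32 pixels and their identity labels.
    rng = np.random.default_rng(0)
    faces = rng.random((40, 32 * 32))          # rows are flattened images (assumption)
    identities = np.repeat(np.arange(10), 4)   # 10 people, 4 images each (assumption)

    pca = PCA(n_components=20, whiten=True).fit(faces)   # eigenfaces = pca.components_
    projected = pca.transform(faces)                     # coordinates in face space

    matcher = KNeighborsClassifier(n_neighbors=1).fit(projected, identities)
    probe = pca.transform(faces[:1])                     # project a probe image
    print("predicted identity:", matcher.predict(probe)[0])
    ```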

  1. It doesn't matter what you say: FMRI correlates of voice learning and recognition independent of speech content.

    Science.gov (United States)

    Zäske, Romi; Awwad Shiekh Hasan, Bashar; Belin, Pascal

    2017-09-01

    Listeners can recognize newly learned voices from previously unheard utterances, suggesting the acquisition of high-level speech-invariant voice representations during learning. Using functional magnetic resonance imaging (fMRI) we investigated the anatomical basis underlying the acquisition of voice representations for unfamiliar speakers independent of speech, and their subsequent recognition among novel voices. Specifically, listeners studied voices of unfamiliar speakers uttering short sentences and subsequently classified studied and novel voices as "old" or "new" in a recognition test. To investigate "pure" voice learning, i.e., independent of sentence meaning, we presented German sentence stimuli to non-German speaking listeners. To disentangle stimulus-invariant and stimulus-dependent learning, during the test phase we contrasted a "same sentence" condition in which listeners heard speakers repeating the sentences from the preceding study phase, with a "different sentence" condition. Voice recognition performance was above chance in both conditions although, as expected, performance was higher for same than for different sentences. During study phases activity in the left inferior frontal gyrus (IFG) was related to subsequent voice recognition performance and same versus different sentence condition, suggesting an involvement of the left IFG in the interactive processing of speaker and speech information during learning. Importantly, at test reduced activation for voices correctly classified as "old" compared to "new" emerged in a network of brain areas including temporal voice areas (TVAs) of the right posterior superior temporal gyrus (pSTG), as well as the right inferior/middle frontal gyrus (IFG/MFG), the right medial frontal gyrus, and the left caudate. This effect of voice novelty did not interact with sentence condition, suggesting a role of temporal voice-selective areas and extra-temporal areas in the explicit recognition of learned voice identity

  2. Emotionally conditioning the target-speech voice enhances recognition of the target speech under "cocktail-party" listening conditions.

    Science.gov (United States)

    Lu, Lingxi; Bao, Xiaohan; Chen, Jing; Qu, Tianshu; Wu, Xihong; Li, Liang

    2018-05-01

    Under a noisy "cocktail-party" listening condition with multiple people talking, listeners can use various perceptual/cognitive unmasking cues to improve recognition of the target speech against informational speech-on-speech masking. One potential unmasking cue is the emotion expressed in a speech voice, by means of certain acoustical features. However, it was unclear whether emotionally conditioning a target-speech voice that has none of the typical acoustical features of emotions (i.e., an emotionally neutral voice) can be used by listeners for enhancing target-speech recognition under speech-on-speech masking conditions. In this study we examined the recognition of target speech against a two-talker speech masker both before and after the emotionally neutral target voice was paired with a loud female screaming sound that has a marked negative emotional valence. The results showed that recognition of the target speech (especially the first keyword in a target sentence) was significantly improved by emotionally conditioning the target speaker's voice. Moreover, the emotional unmasking effect was independent of the unmasking effect of the perceived spatial separation between the target speech and the masker. Also, (skin conductance) electrodermal responses became stronger after emotional learning when the target speech and masker were perceptually co-located, suggesting an increase of listening efforts when the target speech was informationally masked. These results indicate that emotionally conditioning the target speaker's voice does not change the acoustical parameters of the target-speech stimuli, but the emotionally conditioned vocal features can be used as cues for unmasking target speech.

  3. Educational Pedagogy Explored: Attachment, Voice, and Students’ Limited Recognition of the Purpose of Writing

    Directory of Open Access Journals (Sweden)

    Rebecca A. Fairchild

    2013-07-01

    Full Text Available The following teacher research case study involved an exploration of educational pedagogy through work with a freshman composition student at a college university. All data collected for the study were gathered during the 2013 spring semester. The study was driven by an inquiry-based approach in which the researcher determined the center of focus, which arose from an exploration of the student as a writer through a survey, a classroom observation, multiple one-on-one meetings, and email conversations. The focus area that arose was the student's limited recognition of the purpose of writing, namely her view that writing was done solely for school purposes. Related puzzlements stemming from this focus area included the student's lack of attachment and lack of voice in her writing. The conclusive data provided insights into how to educate students in future classrooms regarding how vital it is for students to be able to attach themselves to their work.

  4. Evaluation of Speech Recognition of Cochlear Implant Recipients Using Adaptive, Digital Remote Microphone Technology and a Speech Enhancement Sound Processing Algorithm.

    Science.gov (United States)

    Wolfe, Jace; Morais, Mila; Schafer, Erin; Agrawal, Smita; Koch, Dawn

    2015-05-01

    Cochlear implant recipients often experience difficulty with understanding speech in the presence of noise. Cochlear implant manufacturers have developed sound processing algorithms designed to improve speech recognition in noise, and research has shown these technologies to be effective. Remote microphone technology utilizing adaptive, digital wireless radio transmission has also been shown to provide significant improvement in speech recognition in noise. There are no studies examining the potential improvement in speech recognition in noise when these two technologies are used simultaneously. The goal of this study was to evaluate the potential benefits and limitations associated with the simultaneous use of a sound processing algorithm designed to improve performance in noise (Advanced Bionics ClearVoice) and a remote microphone system that incorporates adaptive, digital wireless radio transmission (Phonak Roger). A two-by-two way repeated measures design was used to examine performance differences obtained without these technologies compared to the use of each technology separately as well as the simultaneous use of both technologies. Eleven Advanced Bionics (AB) cochlear implant recipients, ages 11 to 68 yr. AzBio sentence recognition was measured in quiet and in the presence of classroom noise ranging in level from 50 to 80 dBA in 5-dB steps. Performance was evaluated in four conditions: (1) No ClearVoice and no Roger, (2) ClearVoice enabled without the use of Roger, (3) ClearVoice disabled with Roger enabled, and (4) simultaneous use of ClearVoice and Roger. Speech recognition in quiet was better than speech recognition in noise for all conditions. Use of ClearVoice and Roger each provided significant improvement in speech recognition in noise. The best performance in noise was obtained with the simultaneous use of ClearVoice and Roger. ClearVoice and Roger technology each improves speech recognition in noise, particularly when used at the same time

  5. Voice recognition versus transcriptionist: error rates and productivity in MRI reporting.

    Science.gov (United States)

    Strahan, Rodney H; Schneider-Kolsky, Michal E

    2010-10-01

    Despite the frequent introduction of voice recognition (VR) into radiology departments, little evidence still exists about its impact on workflow, error rates and costs. We designed a study to compare typographical errors, turnaround times (TAT) from reported to verified and productivity for VR-generated reports versus transcriptionist-generated reports in MRI. Fifty MRI reports generated by VR and 50 finalized MRI reports generated by the transcriptionist, of two radiologists, were sampled retrospectively. Two hundred reports were scrutinised for typographical errors and the average TAT from dictated to final approval. To assess productivity, the average MRI reports per hour for one of the radiologists was calculated using data from extra weekend reporting sessions. Forty-two % and 30% of the finalized VR reports for each of the radiologists investigated contained errors. Only 6% and 8% of the transcriptionist-generated reports contained errors. The average TAT for VR was 0 h, and for the transcriptionist reports TAT was 89 and 38.9 h. Productivity was calculated at 8.6 MRI reports per hour using VR and 13.3 MRI reports using the transcriptionist, representing a 55% increase in productivity. Our results demonstrate that VR is not an effective method of generating reports for MRI. Ideally, we would have the report error rate and productivity of a transcriptionist and the TAT of VR. © 2010 The Authors. Journal of Medical Imaging and Radiation Oncology © 2010 The Royal Australian and New Zealand College of Radiologists.

  6. Voice recognition versus transcriptionist: error rates and productivity in MRI reporting

    International Nuclear Information System (INIS)

    Strahan, Rodney H.; Schneider-Kolsky, Michal E.

    2010-01-01

    Full text: Purpose: Despite the frequent introduction of voice recognition (VR) into radiology departments, little evidence still exists about its impact on workflow, error rates and costs. We designed a study to compare typographical errors, turnaround times (TAT) from reported to verified and productivity for VR-generated reports versus transcriptionist-generated reports in MRI. Methods: Fifty MRI reports generated by VR and 50 finalised MRI reports generated by the transcriptionist, of two radiologists, were sampled retrospectively. Two hundred reports were scrutinised for typographical errors and the average TAT from dictated to final approval. To assess productivity, the average MRI reports per hour for one of the radiologists was calculated using data from extra weekend reporting sessions. Results: Forty-two % and 30% of the finalised VR reports for each of the radiologists investigated contained errors. Only 6% and 8% of the transcriptionist-generated reports contained errors. The average TAT for VR was 0 h, and for the transcriptionist reports TAT was 89 and 38.9 h. Productivity was calculated at 8.6 MRI reports per hour using VR and 13.3 MRI reports using the transcriptionist, representing a 55% increase in productivity. Conclusion: Our results demonstrate that VR is not an effective method of generating reports for MRI. Ideally, we would have the report error rate and productivity of a transcriptionist and the TAT of VR.

  7. Automatic Speech Recognition Systems for the Evaluation of Voice and Speech Disorders in Head and Neck Cancer

    Directory of Open Access Journals (Sweden)

    Andreas Maier

    2010-01-01

    Full Text Available In patients suffering from head and neck cancer, speech intelligibility is often restricted. For assessment and outcome measurements, automatic speech recognition systems have previously been shown to be appropriate for objective and quick evaluation of intelligibility. In this study we investigate the applicability of the method to speech disorders caused by head and neck cancer. Intelligibility was quantified by speech recognition on recordings of a standard text read by 41 German laryngectomized patients with cancer of the larynx or hypopharynx and 49 German patients who had suffered from oral cancer. The speech recognition provides the percentage of correctly recognized words of a sequence, that is, the word recognition rate. Automatic evaluation was compared to perceptual ratings by a panel of experts and to an age-matched control group. Both patient groups showed significantly lower word recognition rates than the control group. Automatic speech recognition yielded word recognition rates which agreed significantly with the experts' evaluation of intelligibility. Automatic speech recognition serves as a good, low-effort means to objectify and quantify the most important aspect of pathologic speech: the intelligibility. The system was successfully applied to voice and speech disorders.
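
    The word recognition rate used above is the percentage of correctly recognized words in a sequence; the sketch below computes a closely related alignment-based variant (word accuracy, which also penalizes insertions) for an invented reference/hypothesis pair, and does not reproduce the study's recognizer:

    ```python
    # Word-level recognition accuracy from an edit-distance alignment (illustrative definition).
    def word_recognition_rate(reference, hypothesis):
        ref, hyp = reference.split(), hypothesis.split()
        # dp[i][j] = minimum edits aligning the first i reference words with the first j hypothesis words
        dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            dp[i][0] = i
        for j in range(len(hyp) + 1):
            dp[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                               dp[i][j - 1] + 1,          # insertion
                               dp[i - 1][j - 1] + cost)   # match or substitution
        errors = dp[len(ref)][len(hyp)]
        return max(0.0, 1.0 - errors / max(1, len(ref)))

    # Invented example: reference text vs. recognizer output.
    print(word_recognition_rate("der patient liest den text", "der patient liest text"))
    ```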

  8. Voice user interface design for emerging multilingual markets

    CSIR Research Space (South Africa)

    Van Huyssteen, G

    2012-10-01

    Full Text Available Multilingual emerging markets hold many opportunities for the application of spoken language technologies, such as automatic speech recognition (ASR) or test-to-speech (TTS) technologies in interactive voice response (IVR) systems. However...

  9. Technologies for Self-Determination for Youth with Developmental Disabilities. Technologies for Voice: A Critical Issues Brief

    Science.gov (United States)

    Skouge, James R.; Kelly, Mary L.; Roberts, Kelly D.; Leake, David W.; Stodden, Robert A.

    2007-01-01

    This paper focuses on "technologies for voice" that are related to the self-determination of youth with developmental disabilities. The authors describe a self-determination model that values family-focused, community-referenced pedagogies employing "new media" to give voice to youth and their families. In line with the adage that a picture is…

  10. A self-teaching image processing and voice-recognition-based, intelligent and interactive system to educate visually impaired children

    Science.gov (United States)

    Iqbal, Asim; Farooq, Umar; Mahmood, Hassan; Asad, Muhammad Usman; Khan, Akrama; Atiq, Hafiz Muhammad

    2010-02-01

    A self-teaching image processing and voice-recognition-based system is developed to educate visually impaired children, chiefly in their primary education. The system comprises a computer, a vision camera, an ear speaker and a microphone. The camera, attached to the computer system, is mounted on the ceiling opposite (at the required angle) the desk on which the book is placed. Sample images and voices, in the form of instructions and commands for English and Urdu alphabets, numeric digits, operators and shapes, are stored in a database. A blind child first reads an embossed character (object) with the help of the fingers, then speaks the answer, the name of the character, its shape, etc. into the microphone. When the child's voice command is received by the microphone, an image is captured by the camera and processed by a MATLAB® program developed with the Image Acquisition and Image Processing toolboxes, which generates a response or a required set of instructions for the child via the ear speaker, resulting in the self-education of a visually impaired child. A speech recognition program is also developed in MATLAB® with the help of the Data Acquisition and Signal Processing toolboxes, which records and processes the commands of the blind child.

  11. A memory like a female Fur Seal: long-lasting recognition of pup's voice by mothers

    Directory of Open Access Journals (Sweden)

    Nicolas Mathevon

    2004-06-01

    Full Text Available In colonial mammals like fur seals, mutual vocal recognition between mothers and their pup is of primary importance for breeding success. Females alternate feeding sea-trips with suckling periods on land, and when coming back from the ocean, they have to vocally find their offspring among numerous similar-looking pups. Young fur seals emit a 'mother-attraction call' that presents individual characteristics. In this paper, we review the perceptual process of pup's call recognition by Subantarctic Fur Seal Arctocephalus tropicalis mothers. To identify their progeny, females rely on the frequency modulation pattern and spectral features of this call. As the acoustic characteristics of a pup's call change throughout the lactation period due to the growing process, mothers have thus to refine their memorization of their pup's voice. Field experiments show that female Fur Seals are able to retain all the successive versions of their pup's call.

  12. Self Assistive Technology for Disabled People – Voice Controlled Wheel Chair and Home Automation System

    Directory of Open Access Journals (Sweden)

    R. Puviarasi

    2014-07-01

    Full Text Available This paper describes the design of an innovative and low-cost self-assistive technology that is used to facilitate the control of a wheelchair and home appliances using voice commands from disabled people. The proposed system provides an alternative for physically challenged people with quadriplegia who are permanently unable to move their limbs (but who are able to speak and hear) and for elderly people, allowing them to control the motion of the wheelchair and home appliances using their voices and to lead an independent, confident and enjoyable life. The performance of this microcontroller-based, voice-integrated design is evaluated in terms of accuracy and velocity in various environments. The results show that it could be part of an assistive technology for disabled persons without any third person's assistance.

  13. Speech Recognition Technology for Disabilities Education

    Science.gov (United States)

    Tang, K. Wendy; Kamoua, Ridha; Sutan, Victor; Farooq, Omer; Eng, Gilbert; Chu, Wei Chern; Hou, Guofeng

    2005-01-01

    Speech recognition is an alternative to traditional methods of interacting with a computer, such as textual input through a keyboard. An effective system can replace, or reduce the reliance on, standard keyboard and mouse input. This can especially assist dyslexic students who have problems with character or word use and manipulation in a textual…

  14. Revisiting vocal perception in non-human animals: a review of vowel discrimination, speaker voice recognition, and speaker normalization

    Directory of Open Access Journals (Sweden)

    Buddhamas Kriengwatana

    2015-01-01

    Full Text Available The extent to which human speech perception evolved by taking advantage of predispositions and pre-existing features of vertebrate auditory and cognitive systems remains a central question in the evolution of speech. This paper reviews asymmetries in vowel perception, speaker voice recognition, and speaker normalization in non-human animals – topics that have not been thoroughly discussed in relation to the abilities of non-human animals, but are nonetheless important aspects of vocal perception. Throughout this paper we demonstrate that addressing these issues in non-human animals is relevant and worthwhile because many non-human animals must deal with similar issues in their natural environment. That is, they must also discriminate between similar-sounding vocalizations, determine signaler identity from vocalizations, and resolve signaler-dependent variation in vocalizations from conspecifics. Overall, we find that, although plausible, the current evidence is insufficiently strong to conclude that directional asymmetries in vowel perception are specific to humans, or that non-human animals can use voice characteristics to recognize human individuals. However, we do find some indication that non-human animals can normalize speaker differences. Accordingly, we identify avenues for future research that would greatly improve and advance our understanding of these topics.

  15. An investigation and comparison of speech recognition software for determining if bird song recordings contain legible human voices

    Directory of Open Access Journals (Sweden)

    Tim D. Hunt

    Full Text Available The purpose of this work was to test the effectiveness of using readily available speech recognition API services to determine whether recordings of bird song had inadvertently captured human voices. A mobile phone was used to record a human speaking at increasing distances from the phone in an outdoor setting with bird song occurring in the background. One of the services was trained with sample recordings, and each service was compared for its ability to return recognized words. The services from Google and IBM performed similarly, and the Microsoft service, which allowed training, performed slightly better. However, all three services failed to perform at a level that would enable recordings with recognizable human speech to be deleted in order to maintain full privacy protection.
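
    The general screening pattern (submit a clip to a cloud speech-to-text service and flag it if any words come back) can be sketched with the open-source SpeechRecognition Python package; this is not the authors' code, and the file name and flagging rule are assumptions:

    ```python
    # Screening a recording for legible human speech via a cloud speech-to-text API,
    # using the SpeechRecognition package (illustrative; not the study's implementation).
    import speech_recognition as sr

    def contains_recognizable_speech(wav_path):
        recognizer = sr.Recognizer()
        with sr.AudioFile(wav_path) as source:
            audio = recognizer.record(source)          # read the whole clip
        try:
            text = recognizer.recognize_google(audio)  # free Google Web Speech API endpoint
        except sr.UnknownValueError:
            return False, ""                           # nothing intelligible returned
        except sr.RequestError as err:
            raise RuntimeError(f"service unavailable: {err}")
        return len(text.split()) > 0, text

    # Hypothetical usage: flag clips that should be withheld for privacy review.
    flagged, words = contains_recognizable_speech("birdsong_clip_001.wav")
    print("contains speech:", flagged, "| recognized:", words)
    ```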

  16. Giving Voice to Emotion: Voice Analysis Technology Uncovering Mental States is Playing a Growing Role in Medicine, Business, and Law Enforcement.

    Science.gov (United States)

    Allen, Summer

    2016-01-01

    It's tough to imagine anything more frustrating than interacting with a call center. Generally, people don't reach out to call centers when they're happy; they're usually trying to get help with a problem or gearing up to do battle over a billing error. Add in an automatic phone tree, and you have a recipe for annoyance. But what if that robotic voice offering you a smorgasbord of numbered choices could tell that you were frustrated and then funnel you to an actual human being? This type of voice analysis technology exists, and it's just one example of the many ways that computers can use your voice to extract information about your mental and emotional state, including information you may not think of as being accessible through your voice alone.

  17. Students' Voices about Information and Communication Technology in Upper Secondary Schools

    Science.gov (United States)

    Olofsson, Anders D.; Lindberg, Ola J.; Fransson, Göran

    2018-01-01

    Purpose: The purpose of this paper is to explore upper secondary school students' voices on how information and communication technology (ICT) could structure and support their everyday activities and time at school. Design/methodology/approach: In all, 11 group interviews were conducted with a total of 46 students from three upper secondary…

  18. Growing Misconception of Technology: Investigation of Elementary Students' Recognition of and Reasoning about Technological Artifacts

    Science.gov (United States)

    Firat, Mehmet

    2017-01-01

    Knowledge of technology is an educational goal of science education. A primary way of increasing technology literacy in a society is to develop students' conception of technology starting from their elementary school years. However, there is a lack of research on student recognition of and reasoning about technology and technological artifacts. In…

  19. Three voices: women working in nuclear science and technology

    International Nuclear Information System (INIS)

    1999-01-01

    Nuclear science and technology is a fascinating and growing work area for women. This short video portrays three professional women working within this field for the International Atomic Energy Agency

  20. The Effects of Certain Background Noises on the Performance of a Voice Recognition System.

    Science.gov (United States)

    1980-09-01

  1. Voices Project: Technological Innovations in Social Inclusion of People with Visual Impairment

    Directory of Open Access Journals (Sweden)

    Janaina Cazini

    2013-04-01

    Full Text Available This article aims to analyze how technological innovations are contributing to the inclusion of people with disabilities in society and at work, based on a study of social innovations, assistive technology and digital inclusion presented in a case study of the Voices Project. The project, developed in partnership with the Association of Parents and Friends of the Blind and the Federal Technological University of Paraná in 2008/2009, offered a computer course for people with visual impairments. The theoretical survey and project data confirmed that social innovations are essential tools for the digital inclusion of people with disabilities, thus contributing to their inclusion in the workplace.

  2. Technical Reviews on Pattern Recognition in Process Analytical Technology

    International Nuclear Information System (INIS)

    Kim, Jong Yun; Choi, Yong Suk; Ji, Sun Kyung; Park, Yong Joon; Song, Kyu Seok; Jung, Sung Hee

    2008-12-01

    Pattern recognition is one of the first and most widely adopted chemometric tools among the many active research areas in chemometrics, such as design of experiments (DoE), pattern recognition, multivariate calibration, and signal processing. Pattern recognition has been used to identify the origin of a wine and the time of year that the vine was grown by using chromatography, the cause of a fire by using GC/MS chromatography, the detection of explosives and land mines, cargo and luggage inspection in seaports and airports by using prompt gamma-ray activation analysis, and source apportionment of environmental pollutants by using stable isotope ratio mass spectrometry. Recently, pattern recognition has come to be regarded as a major chemometric tool in so-called 'process analytical technology (PAT)', a newly developed concept in the area of process analytics proposed by the US Food and Drug Administration (FDA). For instance, identification of raw materials by pattern recognition analysis plays an important role in the effective quality control of the production process. Recently, pattern recognition techniques have also been used to identify the spatial distribution and uniformity of the active ingredients present in products such as tablets by transforming the chemical data into visual information

  3. Implementation of a Tour Guide Robot System Using RFID Technology and Viterbi Algorithm-Based HMM for Speech Recognition

    Directory of Open Access Journals (Sweden)

    Neng-Sheng Pai

    2014-01-01

    Full Text Available This paper applied speech recognition and RFID technologies to develop an omni-directional mobile robot into a robot with voice control and guide introduction functions. For speech recognition, the speech signals were captured by short-time processing. The speaker first recorded isolated words for the robot in order to create a speech database of specific speakers. After pre-processing of this speech database, the feature parameters of the cepstrum and delta-cepstrum were obtained using linear predictive coefficients (LPC). Then, a Hidden Markov Model (HMM) was used for model training on the speech database, and the Viterbi algorithm was used to find an optimal state sequence as the reference sample for speech recognition. The trained reference models were loaded onto the industrial computer on the robot platform, and the user entered the isolated words to be tested. After processing with the same feature extraction and comparison against the previously trained reference models, the model whose Viterbi path gave the maximum total probability was taken as the recognition result. Finally, the speech recognition and RFID systems were implemented on the omni-directional mobile robot and tested in an actual environment to demonstrate their feasibility and stability.
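
    The scoring step described above, finding the best HMM state path with the Viterbi algorithm, can be sketched in log-domain NumPy as follows; the transition and emission values are toy placeholders rather than models trained on speech:

    ```python
    # Log-domain Viterbi decoding for a discrete-observation HMM (illustrative; the
    # transition/emission values below are placeholders, not trained speech models).
    import numpy as np

    def viterbi(log_pi, log_A, log_B, observations):
        """Return the best state path and its log probability.
        log_pi: (S,) initial log probs; log_A: (S, S) transitions; log_B: (S, O) emissions."""
        S = log_pi.shape[0]
        T = len(observations)
        delta = np.full((T, S), -np.inf)
        back = np.zeros((T, S), dtype=int)
        delta[0] = log_pi + log_B[:, observations[0]]
        for t in range(1, T):
            scores = delta[t - 1][:, None] + log_A          # score of each state transition
            back[t] = np.argmax(scores, axis=0)             # best predecessor per state
            delta[t] = scores[back[t], np.arange(S)] + log_B[:, observations[t]]
        path = [int(np.argmax(delta[-1]))]
        for t in range(T - 1, 0, -1):
            path.append(int(back[t][path[-1]]))
        return path[::-1], float(np.max(delta[-1]))

    # Toy 3-state, 4-symbol word model and a quantized feature sequence (assumptions).
    log_pi = np.log([0.8, 0.15, 0.05])
    log_A = np.log([[0.6, 0.3, 0.1], [0.1, 0.6, 0.3], [0.05, 0.15, 0.8]])
    log_B = np.log([[0.7, 0.1, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1], [0.1, 0.1, 0.4, 0.4]])
    obs = [0, 0, 1, 2, 3, 3]
    print(viterbi(log_pi, log_A, log_B, obs))
    ```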

  4. Economic Evaluation of Voice Recognition (VR) for the Clinician's Desktop at the Naval Hospital Roosevelt Roads

    National Research Council Canada - National Science Library

    1997-01-01

    This thesis investigates the current status of VR technology, its use in support of Joint Vision 2010, its use in the healthcare environment and provides an analysis of the VR Pilot Project at NHRR...

  5. Early recognition of technological opportunities. Realization and perspectives

    International Nuclear Information System (INIS)

    Stegelmann, H.U.; Peters, H.P.; Stein, G.; Muench, E.

    1988-03-01

    In cooperation with the American consulting company Arthur D. Little, a number of procedures, including the evaluation of literature data banks, expert interviews and expert workshops, were tried. A three-step concept was finally developed, involving the identification of candidate technologies (identification), the collection of information on these candidates (exploration), and ultimately an assessment of the candidate technologies (evaluation). Such a procedure basically enables long-term observation of science in support of policy decisions. This information may serve to identify the deficits and strengths of the German scientific system in comparison to those of other countries. Such a system permits the survey and documentation of scientists' subjective expectations regarding trends in technology development and the associated economic and other social consequences. It became apparent that this concept should not raise expectations too high and that it is not essentially different from the advisory instruments already employed today (advisory councils, expert consultants); rather, these established procedures are merely systematized and supplemented by further information sources (e.g. data banks). In implementing this study, two central sets of problems were identified which must be overcome: the early recognition of opportunities is in the long run based on analysts infiltrating the existing network of specialist scientists and examining the information in circulation there with respect to the aims of early recognition, so that access to this network is a decisive requirement for the institutionalization of early recognition; and incentive systems must be created to motivate scientists to become actively involved in the early recognition of technological opportunities. (orig./HP) [de

  6. Application of Video Recognition Technology in Landslide Monitoring System

    Directory of Open Access Journals (Sweden)

    Qingjia Meng

    2018-01-01

    Full Text Available Video recognition technology is applied to a landslide emergency remote monitoring system, and the trajectories of the landslide are identified by this system. The geological disaster monitoring system is applied comprehensively, combining the analysis of landslide monitoring data with video recognition technology. The landslide video monitoring system transmits video image information, time points, network signal strength and power supply status to the server over a 4G network. The data are comprehensively analysed through the remote man-machine interface, and the front-end video surveillance system is controlled either when a threshold is reached or manually. The system performs intelligent identification on the target landslide video. The algorithm is embedded in the intelligent analysis module, and the video frames are identified, detected, analysed, filtered and morphologically processed. An algorithm based on artificial intelligence and pattern recognition is used to mark the target landslide on the video screen and confirm whether the landslide is normal. The landslide video monitoring system realizes remote monitoring and control from the mobile side and provides a quick and easy monitoring technology.
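
    The record does not specify the recognition algorithm; as a generic illustration of flagging frames with substantial change in a monitoring video (frame differencing, thresholding and morphological cleaning), under an assumed file name and thresholds:

    ```python
    # Generic motion detection over monitoring video via frame differencing, thresholding
    # and morphological filtering (illustrative; not the system's actual algorithm).
    import cv2

    def motion_frames(video_path, diff_thresh=25, min_changed_pixels=5000):
        cap = cv2.VideoCapture(video_path)
        ok, prev = cap.read()
        if not ok:
            raise RuntimeError("cannot read video")
        prev = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
        flagged, index = [], 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            index += 1
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            diff = cv2.absdiff(gray, prev)                          # change vs. previous frame
            _, mask = cv2.threshold(diff, diff_thresh, 255, cv2.THRESH_BINARY)
            mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # remove small noise blobs
            if cv2.countNonZero(mask) > min_changed_pixels:
                flagged.append(index)                               # frame shows substantial change
            prev = gray
        cap.release()
        return flagged

    print(motion_frames("landslide_cam_feed.mp4"))   # placeholder file name
    ```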

  7. Identification of Alfalfa Leaf Diseases Using Image Recognition Technology.

    Directory of Open Access Journals (Sweden)

    Feng Qin

    Full Text Available Common leaf spot (caused by Pseudopeziza medicaginis), rust (caused by Uromyces striatus), Leptosphaerulina leaf spot (caused by Leptosphaerulina briosiana) and Cercospora leaf spot (caused by Cercospora medicaginis) are the four common types of alfalfa leaf diseases. Timely and accurate diagnoses of these diseases are critical for disease management, alfalfa quality control and the healthy development of the alfalfa industry. In this study, the identification and diagnosis of the four types of alfalfa leaf diseases were investigated using pattern recognition algorithms based on image-processing technology. A sub-image with one or multiple typical lesions was obtained by artificial cutting from each acquired digital disease image. Then the sub-images were segmented using twelve lesion segmentation methods integrated with clustering algorithms (including K_means clustering, fuzzy C-means clustering and K_median clustering) and supervised classification algorithms (including logistic regression analysis, Naive Bayes algorithm, classification and regression tree, and linear discriminant analysis). After a comprehensive comparison, the segmentation method integrating the K_median clustering algorithm and linear discriminant analysis was chosen to obtain lesion images. After the lesion segmentation using this method, a total of 129 texture, color and shape features were extracted from the lesion images. Based on the features selected using three methods (ReliefF, 1R and correlation-based feature selection), disease recognition models were built using three supervised learning methods, including the random forest, support vector machine (SVM) and K-nearest neighbor methods. A comparison of the recognition results of the models was conducted. The results showed that when the ReliefF method was used for feature selection, the SVM model built with the most important 45 features (selected from a total of 129 features) was the optimal model. For this SVM model, the

  8. Identification of Alfalfa Leaf Diseases Using Image Recognition Technology

    Science.gov (United States)

    Qin, Feng; Liu, Dongxia; Sun, Bingda; Ruan, Liu; Ma, Zhanhong; Wang, Haiguang

    2016-01-01

    Common leaf spot (caused by Pseudopeziza medicaginis), rust (caused by Uromyces striatus), Leptosphaerulina leaf spot (caused by Leptosphaerulina briosiana) and Cercospora leaf spot (caused by Cercospora medicaginis) are the four common types of alfalfa leaf diseases. Timely and accurate diagnoses of these diseases are critical for disease management, alfalfa quality control and the healthy development of the alfalfa industry. In this study, the identification and diagnosis of the four types of alfalfa leaf diseases were investigated using pattern recognition algorithms based on image-processing technology. A sub-image with one or multiple typical lesions was obtained by artificial cutting from each acquired digital disease image. Then the sub-images were segmented using twelve lesion segmentation methods integrated with clustering algorithms (including K_means clustering, fuzzy C-means clustering and K_median clustering) and supervised classification algorithms (including logistic regression analysis, Naive Bayes algorithm, classification and regression tree, and linear discriminant analysis). After a comprehensive comparison, the segmentation method integrating the K_median clustering algorithm and linear discriminant analysis was chosen to obtain lesion images. After the lesion segmentation using this method, a total of 129 texture, color and shape features were extracted from the lesion images. Based on the features selected using three methods (ReliefF, 1R and correlation-based feature selection), disease recognition models were built using three supervised learning methods, including the random forest, support vector machine (SVM) and K-nearest neighbor methods. A comparison of the recognition results of the models was conducted. The results showed that when the ReliefF method was used for feature selection, the SVM model built with the most important 45 features (selected from a total of 129 features) was the optimal model. For this SVM model, the
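
    Neither record includes code; as a rough illustration of the final recognition stage described above (select the 45 most informative of the 129 lesion features, then train an SVM), here is a minimal scikit-learn sketch. The feature matrix is random placeholder data, and SelectKBest stands in for the ReliefF ranking used in the study.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 129))      # placeholder for the 129 texture/color/shape features
y = rng.integers(0, 4, size=200)     # four disease classes

model = make_pipeline(
    StandardScaler(),
    SelectKBest(f_classif, k=45),    # keep the 45 highest-ranked features (ReliefF in the study)
    SVC(kernel="rbf", C=1.0),
)
print("cross-validated accuracy:", cross_val_score(model, X, y, cv=5).mean())
```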

  9. Intelligent Facial Recognition Systems: Technology advancements for security applications

    Energy Technology Data Exchange (ETDEWEB)

    Beer, C.L.

    1993-07-01

    Insider problems such as theft and sabotage can occur within the security and surveillance realm of operations when unauthorized people obtain access to sensitive areas. A possible solution to these problems is a means to identify individuals (not just credentials or badges) in a given sensitive area and provide full time personnel accountability. One approach desirable at Department of Energy facilities for access control and/or personnel identification is an Intelligent Facial Recognition System (IFRS) that is non-invasive to personnel. Automatic facial recognition does not require the active participation of the enrolled subjects, unlike most other biological measurement (biometric) systems (e.g., fingerprint, hand geometry, or eye retinal scan systems). It is this feature that makes an IFRS attractive for applications other than access control such as emergency evacuation verification, screening, and personnel tracking. This paper discusses current technology that shows promising results for DOE and other security applications. A survey of research and development in facial recognition identified several companies and universities that were interested and/or involved in the area. A few advanced prototype systems were also identified. Sandia National Laboratories is currently evaluating facial recognition systems that are in the advanced prototype stage. The initial application for the evaluation is access control in a controlled environment with a constant background and with cooperative subjects. Further evaluations will be conducted in a less controlled environment, which may include a cluttered background and subjects that are not looking towards the camera. The outcome of the evaluations will help identify areas of facial recognition systems that need further development and will help to determine the effectiveness of the current systems for security applications.

  10. Feasibility of automated speech sample collection with stuttering children using interactive voice response (IVR) technology.

    Science.gov (United States)

    Vogel, Adam P; Block, Susan; Kefalianos, Elaina; Onslow, Mark; Eadie, Patricia; Barth, Ben; Conway, Laura; Mundt, James C; Reilly, Sheena

    2015-04-01

    The aim was to investigate the feasibility of adopting automated interactive voice response (IVR) technology for remotely capturing standardized speech samples from children who stutter. Participants were ten 6-year-old stuttering children. Their parents called a toll-free number from their homes and were prompted to elicit speech from their children using a standard protocol involving conversation, picture description and games. The automated IVR system was implemented using an off-the-shelf telephony software program and delivered by a standard desktop computer; the software infrastructure uses voice over internet protocol. Speech samples were automatically recorded during the calls. Video recordings were simultaneously acquired in the home at the time of the call to evaluate the fidelity of the telephone-collected samples. Key outcome measures included syllables spoken, percentage of syllables stuttered and an overall rating of stuttering severity on a 10-point scale. Data revealed a high level of relative reliability, in terms of intra-class correlation between the video- and telephone-acquired samples, on all outcome measures during the conversation task. Findings were less consistent for speech samples during picture description and games. Results suggest that IVR technology can be used successfully to automate remote capture of child speech samples.
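
    The study quantifies agreement between telephone- and video-acquired measures with an intra-class correlation; as a rough sketch only, the snippet below computes a consistency-type ICC for such paired measures. The specific ICC form and the example numbers are assumptions, not taken from the study.

```python
import numpy as np

def icc_consistency(scores: np.ndarray) -> float:
    """Consistency-type ICC for an (n_subjects x n_methods) score matrix."""
    n, k = scores.shape
    grand = scores.mean()
    ss_subjects = k * ((scores.mean(axis=1) - grand) ** 2).sum()
    ss_methods = n * ((scores.mean(axis=0) - grand) ** 2).sum()
    ss_error = ((scores - grand) ** 2).sum() - ss_subjects - ss_methods
    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)

# percentage of syllables stuttered for ten children, phone vs. video (made-up numbers)
phone = np.array([3.1, 5.2, 1.8, 7.4, 2.0, 4.5, 6.1, 2.9, 3.8, 5.0])
video = np.array([3.4, 5.0, 2.1, 7.1, 1.8, 4.9, 6.4, 2.7, 4.0, 4.8])
print("ICC:", round(icc_consistency(np.column_stack([phone, video])), 3))
```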

  11. Making women's voices heard: technological change and women's employment in Malaysia.

    Science.gov (United States)

    Ng Choon Sim, C

    1999-01-01

    This paper examines the 1994-96 UN University Institute for New Technologies policy research project on technological change and women's employment in Asia. The project was conducted to provide a voice for nongovernmental organizations (NGOs) representing women workers. It focuses on the Malaysian experience of the impact of technology on women's work and employment in the telecommunications and electronics industry. The NGO research revealed that the shift to more technology-intensive production has no uniform impact on women. Although new jobs were created, women's employment status remains vulnerable: female workers fear technological redundancy, casualization of labor, and the health and safety hazards associated with new technology. The situation in Malaysia illustrates the effect of industrialization on women's rights. Although cutting-edge technology, combined with restructuring, has yielded some benefits in terms of a vastly expanded network and services, better performance and economies of scale, the majority of women remain employed in low-skilled or semi-skilled categories. To upgrade women's employment status alongside technological advancement, open communication and cooperation of all kinds are needed to ensure a successful outcome.

  12. Developing and modeling of voice control system for prosthetic robot arm in medical systems

    Directory of Open Access Journals (Sweden)

    Koksal Gundogdu

    2018-04-01

    Full Text Available In parallel with the development of technology, various control methods have also been developed; the voice control system is one of these. In this study, an effective model building on the mathematical models used in the literature is constructed, and a voice control system is developed in order to control prosthetic robot arms. The developed control system has been applied to a four-jointed RRRR robot arm, and implementation tests were performed on the designed system. The tests showed that the technique utilized in our system achieves about 11% more efficient voice recognition than the techniques currently used in the literature. With the improved mathematical modelling, it has been shown that voice commands can be used effectively for controlling the prosthetic robot arm. Keywords: Voice recognition model, Voice control, Prosthetic robot arm, Robotic control, Forward kinematic
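
    The article links recognized voice commands to the forward kinematics of a four-jointed RRRR arm; the snippet below is a minimal sketch of that idea, not the authors' model. The link lengths, command vocabulary and joint targets are illustrative assumptions.

```python
import numpy as np

LINKS = [0.30, 0.25, 0.20, 0.10]        # assumed link lengths of the 4R chain, in metres

def forward_kinematics(joint_angles_deg):
    """Planar forward kinematics: return the (x, y) end-effector position."""
    x = y = 0.0
    cumulative = 0.0
    for length, theta in zip(LINKS, np.radians(joint_angles_deg)):
        cumulative += theta
        x += length * np.cos(cumulative)
        y += length * np.sin(cumulative)
    return x, y

# a tiny, assumed voice-command vocabulary mapped to joint targets (degrees)
COMMANDS = {
    "rest":  [0, 0, 0, 0],
    "reach": [30, 45, 20, 10],
    "grip":  [30, 45, 60, 45],
}

spoken = "reach"                          # would come from the separate voice recognizer
print(spoken, "->", forward_kinematics(COMMANDS[spoken]))
```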

  13. The Army word recognition system

    Science.gov (United States)

    Hadden, David R.; Haratz, David

    1977-01-01

    The application of speech recognition technology in the Army command and control area is presented. The problems associated with this program are described, as well as its relevance in terms of man/machine interaction, voice inflexions, and the amount of training needed to interact with and utilize the automated system.

  14. Innovative Technology for the Assisted Delivery of Intensive Voice Treatment (LSVT[R]LOUD) for Parkinson Disease

    Science.gov (United States)

    Halpern, Angela E.; Ramig, Lorraine O.; Matos, Carlos E. C.; Petska-Cable, Jill A.; Spielman, Jennifer L.; Pogoda, Janice M.; Gilley, Phillip M.; Sapir, Shimon; Bennett, John K.; McFarland, David H.

    2012-01-01

    Purpose: To assess the feasibility and effectiveness of a newly developed assistive technology system, Lee Silverman Voice Treatment Companion (LSVT[R] Companion[TM], hereafter referred to as "Companion"), to support the delivery of LSVT[R]LOUD, an efficacious speech intervention for individuals with Parkinson disease (PD). Method: Sixteen…

  15. Adoption of Speech Recognition Technology in Community Healthcare Nursing.

    Science.gov (United States)

    Al-Masslawi, Dawood; Block, Lori; Ronquillo, Charlene

    2016-01-01

    Adoption of new health information technology is shown to be challenging. However, the degree to which new technology will be adopted can be predicted by measures of usefulness and ease of use. In this work, these key determining factors are the focus of the design of a wound documentation tool. In the context of wound care at home, and consistent with evidence from similar settings in the literature, the use of Speech Recognition Technology (SRT) for patient documentation has shown promise. To achieve a user-centred design, results from ethnographic fieldwork are used to inform SRT features; furthermore, exploratory prototyping is used to collect feedback about the wound documentation tool from home care nurses. During this study, measures developed for healthcare applications of the Technology Acceptance Model will be used to identify SRT features that improve usefulness (e.g. increased accuracy, saving time) or ease of use (e.g. lowering mental/physical effort, easy-to-remember tasks). The identified features will be used to create a low-fidelity prototype that will be evaluated in future experiments.

  16. Voice, Schooling, Inequality, and Scale

    Science.gov (United States)

    Collins, James

    2013-01-01

    The rich studies in this collection show that the investigation of voice requires analysis of "recognition" across layered spatial-temporal and sociolinguistic scales. I argue that the concepts of voice, recognition, and scale provide insight into contemporary educational inequality and that their study benefits, in turn, from paying attention to…

  17. Giving children voice in the design of technology for education in the developing world

    Directory of Open Access Journals (Sweden)

    Helene Gelderblom

    2014-10-01

    Full Text Available Of the numerous projects that involve ICTs to solve the problems of the developing world, many are unsuccessful. Reasons include lack of attention to how the human and social systems need to adapt to the new technologies, problems with the intent of the initiators, and lack of user involvement. Focusing on the design of ICT for education, and acknowledging the range of complex reasons for possible failure, this article concentrates on the lack of involvement of end users (specifically children) in the design and development of ICT solutions. Children in the developing world are not given voice when it comes to the design of technology aimed at providing them with better education. Through examination of the concept of “children’s voice” as well as through discussion of a practical design case to support underprivileged children in South Africa, this article shows that (1) listening to children requires that adult co-designers have the correct attitude towards their child partners and that they are committed to really hearing them; (2) power relations and context play an important role in the contribution children can make; and (3) South African children have the ability to provide essential input into the design of technology for education.

  18. Development of Personalized Urination Recognition Technology Using Smart Bands

    Directory of Open Access Journals (Sweden)

    Sung-Jong Eun

    2017-04-01

    Full Text Available Purpose: This study collected and analyzed activity data sensed through smart bands worn by patients in order to resolve the clinical issues posed by using voiding charts. By developing a smart band-based algorithm for recognizing urination activity in patients, this study aimed to explore the feasibility of urination monitoring systems. Methods: The aim was to develop an algorithm that recognizes urination based on a patient’s posture and changes in posture. Motion data were obtained from a smart band on the arm. An algorithm that recognizes the 3 stages of urination (forward movement, urination, backward movement) was developed based on data collected from a 3-axis accelerometer and from tilt angle data. Real-time data were acquired from the smart band and, for data corresponding to a certain duration, the absolute value of the signals was calculated and then compared with the set threshold value to determine the occurrence of vibration signals. In feature extraction, the most essential information describing each pattern was identified after analyzing the characteristics of the data. The results of the feature extraction process were sorted using a classifier to detect urination. Results: An experiment was carried out to assess the performance of the recognition technology proposed in this study. The final accuracy of the algorithm was calculated based on clinical guidelines for urologists. The experiment showed a high average accuracy of 90.4%, proving the robustness of the proposed algorithm. Conclusions: The proposed urination recognition technology draws on acceleration data and tilt angle data collected via a smart band; these data were then analyzed using a classifier after comparative analyses with standardized feature patterns.
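
    As a simplified sketch of the sensing chain described above (3-axis acceleration and tilt from the band, a magnitude threshold to flag activity, window features, and a classifier), the snippet below uses synthetic data. The sampling rate, window length, threshold and feature set are assumptions, not the study's values.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

FS = 25                    # assumed sampling rate (Hz)
WINDOW = 2 * FS            # 2-second analysis windows
MOTION_THRESHOLD = 1.2     # assumed acceleration-magnitude threshold (g)

def window_features(acc_xyz, tilt_deg):
    """acc_xyz: (n, 3) accelerometer samples; tilt_deg: (n,) tilt angles."""
    magnitude = np.linalg.norm(acc_xyz, axis=1)
    return np.array([
        magnitude.mean(), magnitude.std(), magnitude.max(),
        float((magnitude > MOTION_THRESHOLD).mean()),   # fraction of "active" samples
        tilt_deg.mean(), tilt_deg.std(),
    ])

# placeholder training data: windows labelled with one of the three stages
rng = np.random.default_rng(1)
X = np.vstack([window_features(rng.normal(0.0, 0.5, (WINDOW, 3)) + [0.0, 0.0, 1.0],
                               rng.normal(20.0, 5.0, WINDOW)) for _ in range(300)])
y = rng.integers(0, 3, 300)   # 0 = forward movement, 1 = urination, 2 = backward movement

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print("training accuracy on placeholder data:", clf.score(X, y))
```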

  19. Cost-Sensitive Learning for Emotion Robust Speaker Recognition

    Directory of Open Access Journals (Sweden)

    Dongdong Li

    2014-01-01

    Full Text Available In the field of information security, voice is one of the most important elements of biometrics. In particular, with the development of voice communication through the Internet and telephone systems, huge voice data resources have become accessible. In speaker recognition, the voiceprint can be applied as a unique password for the user to prove his or her identity. However, speech with various emotions can cause an unacceptably high error rate and degrade the performance of a speaker recognition system. This paper deals with this problem by introducing a cost-sensitive learning technology to reweight the probability of test affective utterances at the pitch envelope level, which can effectively enhance the robustness of emotion-dependent speaker recognition. Based on that technology, a new architecture of the recognition system, as well as its components, is proposed in this paper. The experiment conducted on the Mandarin Affective Speech Corpus shows an improvement of 8% in identification rate over traditional speaker recognition.

  20. Cost-sensitive learning for emotion robust speaker recognition.

    Science.gov (United States)

    Li, Dongdong; Yang, Yingchun; Dai, Weihui

    2014-01-01

    In the field of information security, voice is one of the most important elements of biometrics. In particular, with the development of voice communication through the Internet and telephone systems, huge voice data resources have become accessible. In speaker recognition, the voiceprint can be applied as a unique password for the user to prove his or her identity. However, speech with various emotions can cause an unacceptably high error rate and degrade the performance of a speaker recognition system. This paper deals with this problem by introducing a cost-sensitive learning technology to reweight the probability of test affective utterances at the pitch envelope level, which can effectively enhance the robustness of emotion-dependent speaker recognition. Based on that technology, a new architecture of the recognition system, as well as its components, is proposed in this paper. The experiment conducted on the Mandarin Affective Speech Corpus shows an improvement of 8% in identification rate over traditional speaker recognition.
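
    The records above do not include an implementation; the snippet below is a toy illustration of a cost-sensitive decision rule in the same spirit (not the authors' method): speaker posteriors are combined with misclassification costs that can be made emotion-aware, and the decision minimises expected cost. All numbers are invented.

```python
import numpy as np

def min_expected_cost_decision(posteriors, cost_matrix):
    """posteriors: (n_speakers,); cost_matrix[i, j] = cost of deciding j when i is true."""
    expected_cost = posteriors @ cost_matrix      # expected cost of each possible decision
    return int(np.argmin(expected_cost))

posteriors = np.array([0.42, 0.38, 0.20])         # speaker posteriors for one test utterance

# uniform costs reproduce the usual maximum-posterior decision ...
uniform = 1.0 - np.eye(3)
print(min_expected_cost_decision(posteriors, uniform))         # -> speaker 0

# ... while emotion-aware costs (assumed here) penalise confusions that affective
# speech is known to cause, which can change the decision.
emotion_aware = np.array([[0.0, 1.0, 1.0],
                          [3.0, 0.0, 1.0],
                          [1.0, 1.0, 0.0]])
print(min_expected_cost_decision(posteriors, emotion_aware))   # -> speaker 1
```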

  1. Automatic Speech Recognition Systems for the Evaluation of Voice and Speech Disorders in Head and Neck Cancer

    OpenAIRE

    Andreas Maier; Tino Haderlein; Florian Stelzle; Elmar Nöth; Emeka Nkenke; Frank Rosanowski; Anne Schützenberger; Maria Schuster

    2010-01-01

    In patients suffering from head and neck cancer, speech intelligibility is often restricted. For assessment and outcome measurements, automatic speech recognition systems have previously been shown to be appropriate for objective and quick evaluation of intelligibility. In this study we investigate the applicability of the method to speech disorders caused by head and neck cancer. Intelligibility was quantified by speech recognition on recordings of a standard text read by 41 German laryngect...

  2. Emerging technologies with potential for objectively evaluating speech recognition skills.

    Science.gov (United States)

    Rawool, Vishakha Waman

    2016-01-01

    Work-related exposure to noise and other ototoxins can cause damage to the cochlea, synapses between the inner hair cells, the auditory nerve fibers, and higher auditory pathways, leading to difficulties in recognizing speech. Procedures designed to determine speech recognition scores (SRS) in an objective manner can be helpful in disability compensation cases where the worker claims to have poor speech perception due to exposure to noise or ototoxins. Such measures can also be helpful in determining SRS in individuals who cannot provide reliable responses to speech stimuli, including patients with Alzheimer's disease, traumatic brain injuries, and infants with and without hearing loss. Cost-effective neural monitoring hardware and software is being rapidly refined due to the high demand for neurogaming (games involving the use of brain-computer interfaces), health, and other applications. More specifically, two related advances in neuro-technology include relative ease in recording neural activity and availability of sophisticated analysing techniques. These techniques are reviewed in the current article and their applications for developing objective SRS procedures are proposed. Issues related to neuroaudioethics (ethics related to collection of neural data evoked by auditory stimuli including speech) and neurosecurity (preservation of a person's neural mechanisms and free will) are also discussed.

  3. Smart Homes with Voice Activated Systems for Disabled People

    OpenAIRE

    Bekir Busatlic; Nejdet Dogru; Isaac Lera; Enes Sukic

    2017-01-01

    Smart home refers to the application of various technologies to semi-unsupervised home control. It refers to systems that control temperature, lighting, door locks, windows and many other appliances. The aim of this study was to design a system that will use existing technology to showcase how it can benefit people with disabilities. This work uses only off-the-shelf products (smart home devices and controllers), speech recognition technology, open-source code libraries. The Voice Activated Sm...

  4. Smartphone App for Voice Disorders

    Science.gov (United States)

    Feature: Taste, Smell, Hearing, Language, Voice, Balance. Smartphone App for Voice Disorders (Past Issues / Fall 2013). ... developed a mobile monitoring device that relies on smartphone technology to gather a week's worth of talking ...

  5. Low-Cost Implementation of a Named Entity Recognition System for Voice-Activated Human-Appliance Interfaces in a Smart Home

    Directory of Open Access Journals (Sweden)

    Geonwoo Park

    2018-02-01

    Full Text Available When we develop voice-activated human-appliance interface systems in smart homes, named entity recognition (NER) is an essential tool for extracting execution targets from natural language commands. Previous studies on NER systems generally include supervised machine-learning methods that require a substantial amount of human-annotated training corpus. In the smart home environment, categories of named entities should be defined according to voice-activated devices (e.g., food names for refrigerators and song titles for music players). The previous machine-learning methods make it difficult to change categories of named entities because a large amount of the training corpus should be newly constructed by hand. To address this problem, we present a semi-supervised NER system to minimize the time-consuming and labor-intensive task of constructing the training corpus. Our system uses distant supervision methods with two kinds of auto-labeling processes: auto-labeling based on heuristic rules for single-class named entity corpus generation and auto-labeling based on a pre-trained single-class NER model for multi-class named entity corpus generation. Then, our system improves NER accuracy by using a bagging-based active learning method. In our experiments that included a generic domain that featured 11 named entity classes and a context-specific domain about baseball that featured 21 named entity classes, our system demonstrated good performances in both domains, with F1-measures of 0.777 and 0.958, respectively. Since our system was built from a relatively small human-annotated training corpus, we believe it is a viable alternative to current NER systems in smart home environments.
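
    The paper's first auto-labeling process projects known entity names onto unlabelled commands; the snippet below is a small sketch of that distant-supervision idea only, not the authors' pipeline. The gazetteer, class names and sentences are made up.

```python
from typing import List, Tuple

# assumed gazetteer: entity classes and surface forms a smart-home deployment might know
GAZETTEER = {
    "song": ["bohemian rhapsody", "let it be"],
    "device": ["living room light", "air conditioner"],
}

def auto_label(tokens: List[str]) -> List[Tuple[str, str]]:
    """Simple phrase matching that emits BIO tags for known entities."""
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    for label, phrases in GAZETTEER.items():
        for phrase in phrases:
            words = phrase.split()
            for start in range(len(tokens) - len(words) + 1):
                if lowered[start:start + len(words)] == words:
                    tags[start] = f"B-{label}"
                    for k in range(start + 1, start + len(words)):
                        tags[k] = f"I-{label}"
    return list(zip(tokens, tags))

print(auto_label("please play Bohemian Rhapsody in the kitchen".split()))
print(auto_label("turn off the living room light".split()))
```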

  6. Recognition

    DEFF Research Database (Denmark)

    Gimmler, Antje

    2017-01-01

    In this article, I shall examine the cognitive, heuristic and theoretical functions of the concept of recognition. To evaluate both the explanatory power and the limitations of a sociological concept, the theory construction must be analysed and its actual productivity for sociological theory must...

  7. Effect of Acting Experience on Emotion Expression and Recognition in Voice: Non-Actors Provide Better Stimuli than Expected.

    Science.gov (United States)

    Jürgens, Rebecca; Grass, Annika; Drolet, Matthis; Fischer, Julia

    Both in the performative arts and in emotion research, professional actors are assumed to be capable of delivering emotions comparable to spontaneous emotional expressions. This study examines the effects of acting training on vocal emotion depiction and recognition. We predicted that professional actors express emotions in a more realistic fashion than non-professional actors. However, professional acting training may lead to a particular speech pattern; this might account for vocal expressions by actors that are less comparable to authentic samples than the ones by non-professional actors. We compared 80 emotional speech tokens from radio interviews with 80 re-enactments by professional and inexperienced actors, respectively. We analyzed recognition accuracies for emotion and authenticity ratings and compared the acoustic structure of the speech tokens. Both play-acted conditions yielded similar recognition accuracies and possessed more variable pitch contours than the spontaneous recordings. However, professional actors exhibited signs of different articulation patterns compared to non-trained speakers. Our results indicate that for emotion research, emotional expressions by professional actors are not better suited than those from non-actors.

  8. Preliminary Analysis of Automatic Speech Recognition and Synthesis Technology.

    Science.gov (United States)

    1983-05-01

    ... speech. Private industry, which sees a major market for improved speech recognition systems, is attempting to solve the problems involved in ... manufacturer is able to market such a recognition system. A second requirement for the spotting of keywords in distress signals concerns the need for a ...

  9. Digital Technologies for Promoting "Student Voice" and Co-Creating Learning Experience in an Academic Course

    Science.gov (United States)

    Blau, Ina; Shamir-Inbal, Tamar

    2018-01-01

    "Student voice" (SV) refers to listening to and valuing students' views regarding their learning experiences, as well as treating them as equal partners in the evaluation process. This is expected, in turn, to empower students to take a more active role in shaping their learning. This study explores the role played by digital…

  10. Investigation of air transportation technology at Princeton University, 1983

    Science.gov (United States)

    Stengel, Robert F.

    1987-01-01

    Progress is discussed for each of the following areas: voice recognition technology for flight control; guidance and control strategies for penetration of microbursts and wind shear; application of artificial intelligence in flight control systems; and computer-aided aircraft design.

  11. DolphinAttack: Inaudible Voice Commands

    OpenAIRE

    Zhang, Guoming; Yan, Chen; Ji, Xiaoyu; Zhang, Taimin; Zhang, Tianchen; Xu, Wenyuan

    2017-01-01

    Speech recognition (SR) systems such as Siri or Google Now have become an increasingly popular human-computer interaction method, and have turned various systems into voice controllable systems(VCS). Prior work on attacking VCS shows that the hidden voice commands that are incomprehensible to people can control the systems. Hidden voice commands, though hidden, are nonetheless audible. In this work, we design a completely inaudible attack, DolphinAttack, that modulates voice commands on ultra...

  12. The Affordance of Speech Recognition Technology for EFL Learning in an Elementary School Setting

    Science.gov (United States)

    Liaw, Meei-Ling

    2014-01-01

    This study examined the use of speech recognition (SR) technology to support a group of elementary school children's learning of English as a foreign language (EFL). SR technology has been used in various language learning contexts. Its application to EFL teaching and learning is still relatively recent, but a solid understanding of its…

  13. The expression and recognition of emotions in the voice across five nations: A lens model analysis based on acoustic features.

    Science.gov (United States)

    Laukka, Petri; Elfenbein, Hillary Anger; Thingujam, Nutankumar S; Rockstuhl, Thomas; Iraki, Frederick K; Chui, Wanda; Althoff, Jean

    2016-11-01

    This study extends previous work on emotion communication across cultures with a large-scale investigation of the physical expression cues in vocal tone. In doing so, it provides the first direct test of a key proposition of dialect theory, namely that greater accuracy of detecting emotions from one's own cultural group-known as in-group advantage-results from a match between culturally specific schemas in emotional expression style and culturally specific schemas in emotion recognition. Study 1 used stimuli from 100 professional actors from five English-speaking nations vocally conveying 11 emotional states (anger, contempt, fear, happiness, interest, lust, neutral, pride, relief, sadness, and shame) using standard-content sentences. Detailed acoustic analyses showed many similarities across groups, and yet also systematic group differences. This provides evidence for cultural accents in expressive style at the level of acoustic cues. In Study 2, listeners evaluated these expressions in a 5 × 5 design balanced across groups. Cross-cultural accuracy was greater than expected by chance. However, there was also in-group advantage, which varied across emotions. A lens model analysis of fundamental acoustic properties examined patterns in emotional expression and perception within and across groups. Acoustic cues were used relatively similarly across groups both to produce and judge emotions, and yet there were also subtle cultural differences. Speakers appear to have a culturally nuanced schema for enacting vocal tones via acoustic cues, and perceivers have a culturally nuanced schema in judging them. Consistent with dialect theory's prediction, in-group judgments showed a greater match between these schemas used for emotional expression and perception. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  14. Speech recognition technology: an outlook for human-to-machine interaction.

    Science.gov (United States)

    Erdel, T; Crooks, S

    2000-01-01

    Speech recognition, as an enabling technology in healthcare-systems computing, is a topic that has been discussed for quite some time, but is just now coming to fruition. Traditionally, speech-recognition software has been constrained by hardware, but improved processors and increased memory capacities are starting to remove some of these limitations. With these barriers removed, companies that create software for the healthcare setting have the opportunity to write more successful applications. Among the criticisms of speech-recognition applications are the high rates of error and steep training curves. However, even in the face of such negative perceptions, there remains significant opportunities for speech recognition to allow healthcare providers and, more specifically, physicians, to work more efficiently and ultimately spend more time with their patients and less time completing necessary documentation. This article will identify opportunities for inclusion of speech-recognition technology in the healthcare setting and examine major categories of speech-recognition software--continuous speech recognition, command and control, and text-to-speech. We will discuss the advantages and disadvantages of each area, the limitations of the software today, and how future trends might affect them.

  15. Frequency and analysis of non-clinical errors made in radiology reports using the National Integrated Medical Imaging System voice recognition dictation software.

    Science.gov (United States)

    Motyer, R E; Liddy, S; Torreggiani, W C; Buckley, O

    2016-11-01

    Voice recognition (VR) dictation of radiology reports has become the mainstay of reporting in many institutions worldwide. Despite its benefits, such software is not without limitations, and transcription errors have been widely reported. The aim was to evaluate the frequency and nature of non-clinical transcription errors using VR dictation software, through a retrospective audit of 378 finalised radiology reports. Errors were counted and categorised by significance, error type and sub-type. Data regarding imaging modality, report length and dictation time were collected. 67 (17.72%) reports contained ≥1 errors, with 7 (1.85%) containing 'significant' and 9 (2.38%) containing 'very significant' errors. A total of 90 errors were identified from the 378 reports analysed, with 74 (82.22%) classified as 'insignificant', 7 (7.78%) as 'significant' and 9 (10%) as 'very significant'. 68 (75.56%) errors were 'spelling and grammar', 20 (22.22%) 'missense' and 2 (2.22%) 'nonsense'. 'Punctuation' was the most common error sub-type, accounting for 27 errors (30%). Complex imaging modalities had higher error rates per report and per sentence: computed tomography contained 0.040 errors per sentence compared to 0.030 for plain film. Longer reports had a higher error rate, with reports of more than 25 sentences containing an average of 1.23 errors per report compared to 0.09 for reports of 0-5 sentences. These findings highlight the limitations of VR dictation software. While most errors were deemed insignificant, there were occurrences of errors with the potential to alter report interpretation and patient management. Longer reports and reports on more complex imaging had higher error rates, and this should be taken into account by the reporting radiologist.

  16. Automatic speech recognition for report generation in computed tomography

    International Nuclear Information System (INIS)

    Teichgraeber, U.K.M.; Ehrenstein, T.; Lemke, M.; Liebig, T.; Stobbe, H.; Hosten, N.; Keske, U.; Felix, R.

    1999-01-01

    Purpose: A study was performed to compare the performance of automatic speech recognition (ASR) with conventional transcription. Materials and Methods: 100 CT reports were generated by using ASR and 100 CT reports were dictated and written by medical transcriptionists. The time for dictation and correction of errors by the radiologist was assessed and the types of mistakes were analysed. The text recognition rate was calculated in both groups and the average time between completion of the imaging study by the technologist and generation of the written report was assessed. A commercially available speech recognition technology (ASKA Software, IBM Via Voice) running on a personal computer was used. Results: The time for dictation using digital voice recognition was 9.4±2.3 min compared to 4.5±3.6 min with an ordinary Dictaphone. The text recognition rate was 97% with digital voice recognition and 99% with medical transcriptionists. The average time from imaging completion to written report finalisation was reduced from 47.3 hours with medical transcriptionists to 12.7 hours with ASR. The analysis of misspellings demonstrated (ASR vs. medical transcriptionists): 3 vs. 4 syntax errors, 0 vs. 37 orthographic mistakes, 16 vs. 22 mistakes in substance and 47 vs. erroneously applied terms. Conclusions: The use of digital voice recognition as a replacement for medical transcription is recommendable when immediate availability of written reports is necessary. (orig.) [de

  17. Voice application development for Android

    CERN Document Server

    McTear, Michael

    2013-01-01

    This book will give beginners an introduction to building voice-based applications on Android. It will begin by covering the basic concepts and will build up to creating a voice-based personal assistant. By the end of this book, you should be in a position to create your own voice-based applications on Android from scratch in next to no time.Voice Application Development for Android is for all those who are interested in speech technology and for those who, as owners of Android devices, are keen to experiment with developing voice apps for their devices. It will also be useful as a starting po

  18. Contribution to automatic image recognition applied to robot technology

    International Nuclear Information System (INIS)

    Juvin, Didier

    1983-01-01

    This paper describes a method for the analysis and interpretation of the images of objects located in a plain scene which is the environment of a robot. The first part covers the recovery of the contour of objects present in the image, and discusses a novel contour-following technique based on the line arborescence concept in combination with a 'cost function' giving a quantitative assessment of contour quality. We present heuristics for moderate-cost, minimum-time arborescence coverage, which is equivalent to following probable contour lines in the image. A contour segmentation technique, invariant in the translational and rotational modes, is presented next. The second part describes a recognition method based on the above invariant encoding: the algorithm performs a preliminary screening based on coarse data derived from segmentation, followed by a comparison of forms with probable identity through application of a distance specified in terms of the invariant encoding. The last part covers the outcome of the above investigations, which have found an industrial application in the vision system of a range of robots. The system is set up in a 16-bit microprocessor and operates in real time. (author) [fr
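
    The 1983 system predates today's libraries, but as a loose modern analogue of its two stages (contour recovery and recognition via a translation- and rotation-invariant encoding), the sketch below extracts object contours with OpenCV and compares shapes using Hu-moment distances. The synthetic scenes and thresholds are assumptions for illustration.

```python
import cv2
import numpy as np

def largest_contour(gray):
    """Binarise the scene and return the contour of the largest object."""
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return max(contours, key=cv2.contourArea)

# two synthetic scenes containing the same rectangle, translated and rotated
scene_a = np.zeros((200, 200), np.uint8)
cv2.rectangle(scene_a, (40, 60), (120, 100), 255, -1)
scene_b = np.zeros((200, 200), np.uint8)
cv2.rectangle(scene_b, (70, 90), (150, 130), 255, -1)
rotation = cv2.getRotationMatrix2D((100, 100), 35, 1.0)
scene_b = cv2.warpAffine(scene_b, rotation, (200, 200))

contour_a, contour_b = largest_contour(scene_a), largest_contour(scene_b)
distance = cv2.matchShapes(contour_a, contour_b, cv2.CONTOURS_MATCH_I1, 0.0)
print("shape distance (small values suggest probable identity):", round(distance, 4))
```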

  19. Use of Handwriting Recognition Technologies in Tablet-Based Learning Modules for First Grade Education

    Science.gov (United States)

    Yanikoglu, Berrin; Gogus, Aytac; Inal, Emre

    2017-01-01

    Learning through modules on a tablet helps students participate effectively in learning activities in classrooms and provides flexibility in the learning process. This study presents the design and evaluation of an application that is based on handwriting recognition technologies and e-content for the developed learning modules. The application…

  20. Machine Learning for Text-Independent Speaker Verification : How to Teach a Machine to RecognizeHuman Voices

    OpenAIRE

    Imoscopi, Stefano

    2016-01-01

    The aim of speaker recognition and verification is to identify people from the characteristics of their voices (voice biometrics). Traditionally this technology has been employed mostly for security or authentication purposes, identification of employees/customers and criminal investigations. During the last decade, the increasing popularity of hands-free and voice-controlled systems and the massive growth of media content generated on the internet have increased the need for technique...

  1. The use of open and machine vision technologies for development of gesture recognition intelligent systems

    Science.gov (United States)

    Cherkasov, Kirill V.; Gavrilova, Irina V.; Chernova, Elena V.; Dokolin, Andrey S.

    2018-05-01

    The article describes selected aspects of the development of an intelligent gesture recognition system. The distinctive feature of the system is its intelligence block, which is based entirely on open technologies: the OpenCV library and the Microsoft Cognitive Toolkit (CNTK) platform. The article presents the rationale for this choice of tools, as well as the functional scheme of the system and the hierarchy of its modules. Experiments have shown that the system correctly recognizes about 85% of images received from the sensors. The authors anticipate that improving the algorithmic block of the system will increase the accuracy of gesture recognition up to 95%.
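
    The article does not publish code; as an illustrative sketch only, the snippet below shows an OpenCV preprocessing stage that isolates a hand-like region, whose normalised crop would then be passed to a trained classifier (CNTK in the article, any model here). The skin-colour range, kernel size and crop size are assumptions.

```python
import cv2
import numpy as np

def extract_hand_region(frame_bgr):
    """Return a 64x64 binary crop of the largest skin-coloured blob, or None."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (0, 40, 60), (25, 180, 255))   # assumed skin-colour range
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
    return cv2.resize(mask[y:y + h, x:x + w], (64, 64))

frame = np.zeros((240, 320, 3), np.uint8)   # placeholder frame; replace with a camera image
crop = extract_hand_region(frame)
print("no hand-like region found" if crop is None else f"crop shape: {crop.shape}")
```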

  2. Performance Evaluation of Speech Recognition Systems as a Next-Generation Pilot-Vehicle Interface Technology

    Science.gov (United States)

    Arthur, Jarvis J., III; Shelton, Kevin J.; Prinzel, Lawrence J., III; Bailey, Randall E.

    2016-01-01

    During the flight trials known as Gulfstream-V Synthetic Vision Systems Integrated Technology Evaluation (GV-SITE), a Speech Recognition System (SRS) was used by the evaluation pilots. The SRS system was intended to be an intuitive interface for display control (rather than knobs, buttons, etc.). This paper describes the performance of the current "state of the art" Speech Recognition System (SRS). The commercially available technology was evaluated as an application for possible inclusion in commercial aircraft flight decks as a crew-to-vehicle interface. Specifically, the technology is to be used as an interface from aircrew to the onboard displays, controls, and flight management tasks. A flight test of a SRS as well as a laboratory test was conducted.

  3. Audiovisual speech facilitates voice learning.

    Science.gov (United States)

    Sheffert, Sonya M; Olson, Elizabeth

    2004-02-01

    In this research, we investigated the effects of voice and face information on the perceptual learning of talkers and on long-term memory for spoken words. In the first phase, listeners were trained over several days to identify voices from words presented auditorily or audiovisually. The training data showed that visual information about speakers enhanced voice learning, revealing cross-modal connections in talker processing akin to those observed in speech processing. In the second phase, the listeners completed an auditory or audiovisual word recognition memory test in which equal numbers of words were spoken by familiar and unfamiliar talkers. The data showed that words presented by familiar talkers were more likely to be retrieved from episodic memory, regardless of modality. Together, these findings provide new information about the representational code underlying familiar talker recognition and the role of stimulus familiarity in episodic word recognition.

  4. Recognition and development of "educational technology" as a scientific field and school subject

    Directory of Open Access Journals (Sweden)

    Danilović Mirčeta S.

    2004-01-01

    Full Text Available The paper explores the process of development, establishment and recognition of "educational technology" as an independent scientific field and a separate teaching subject at universities. The paper points to: (a) the problems that this field deals with or should deal with, (b) the knowledge needed for the profession of "educational technologist", (c) various scientific institutions across the world involved in educational technology, (d) scientific journals treating issues of modern educational technology, (e) the authors, i.e. psychologists and educators, who developed and formulated the basic principles of this scientific field, and (f) the educational features and potentials of educational technologies. Emphasis is placed on the role and importance of AV technology in the development, establishment and recognition of educational technology, and it is also pointed out that AV technology, i.e. AV teaching aids and the movement for visualization of teaching, were its forerunners and crucial factors in its establishment and development into an independent area of teaching, i.e. a school subject. In summary, it is stressed that educational technology provides for the execution of instruction through the emission, transmission, selection, coding, decoding, reception, memorization and transformation of all types of information in teaching.

  5. Voice stress analysis and evaluation

    Science.gov (United States)

    Haddad, Darren M.; Ratley, Roy J.

    2001-02-01

    Voice Stress Analysis (VSA) systems are marketed as computer-based systems capable of measuring stress in a person's voice as an indicator of deception. They are advertised as being less expensive, easier to use, less invasive in use, and less constrained in their operation than polygraph technology. The National Institute of Justice has asked the Air Force Research Laboratory for assistance in evaluating voice stress analysis technology. Law enforcement officials have also been asking questions about this technology. If VSA technology proves to be effective, its value for military and law enforcement applications is tremendous.

  6. Smart Homes with Voice Activated Systems for Disabled People

    Directory of Open Access Journals (Sweden)

    Bekir Busatlic

    2017-02-01

    Full Text Available Smart home refers to the application of various technologies to semi-unsupervised home control. It refers to systems that control temperature, lighting, door locks, windows and many other appliances. The aim of this study was to design a system that will use existing technology to showcase how it can benefit people with disabilities. This work uses only off-the-shelf products (smart home devices and controllers), speech recognition technology and open-source code libraries. The Voice Activated Smart Home application was developed to demonstrate online grocery shopping and home control using voice commands, and was tested by measuring its effectiveness in performing tasks as well as its efficiency in recognizing user speech input.

  7. Exploitation of Existing Voice Over Internet Protocol Technology for Department of the Navy Application

    National Research Council Canada - National Science Library

    Vegter, Henry

    2002-01-01

    ..., reduced cost associated with toll calls and the merger of the telephone with the desktop will keep adoption of this technology on the path to ubiquitous use. Topics explored in the thesis include...

  8. Wireless Technology Recognition Based on RSSI Distribution at Sub-Nyquist Sampling Rate for Constrained Devices.

    Science.gov (United States)

    Liu, Wei; Kulin, Merima; Kazaz, Tarik; Shahid, Adnan; Moerman, Ingrid; De Poorter, Eli

    2017-09-12

    Driven by the fast growth of wireless communication, the trend of sharing spectrum among heterogeneous technologies has become increasingly dominant. Identifying concurrent technologies is an important step towards efficient spectrum sharing. However, due to the complexity of recognition algorithms and strict sampling-speed requirements, communication systems capable of recognizing signals other than their own type are extremely rare. This work proves that the multi-modal distribution of the received signal strength indicator (RSSI) is related to the signals' modulation schemes and medium access mechanisms, and that RSSI from different technologies may exhibit highly distinctive features. A distinction is made between technologies with a streaming or a non-streaming property, and appropriate feature spaces can be established either by deriving parameters such as packet duration from RSSI or by directly using RSSI's probability distribution. An experimental study shows that even RSSI acquired at a sub-Nyquist sampling rate is able to provide sufficient features to differentiate technologies such as Wi-Fi, Long Term Evolution (LTE), Digital Video Broadcasting-Terrestrial (DVB-T) and Bluetooth. The usage of the RSSI distribution-based feature space is illustrated via a sample algorithm. Experimental evaluation indicates that more than 92% accuracy is achieved with the appropriate configuration. As the analysis of RSSI distribution is straightforward and less demanding in terms of system requirements, we believe it is highly valuable for recognition of wideband technologies on constrained devices in the context of dynamic spectrum access.
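
    As a condensed sketch of the idea (not the authors' algorithm), the snippet below summarises an RSSI stream by its empirical distribution plus crude duty-cycle statistics and trains a classifier on synthetic streaming-like and bursty traces. The signal models, bin edges and thresholds are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)

def rssi_features(rssi_dbm, noise_floor=-90.0):
    """Empirical RSSI distribution plus simple on/off (duty-cycle) statistics."""
    hist, _ = np.histogram(rssi_dbm, bins=16, range=(-100, -20), density=True)
    busy = rssi_dbm > noise_floor + 10.0
    transitions = np.abs(np.diff(busy.astype(int))).sum()
    return np.concatenate([hist, [busy.mean(), transitions / len(rssi_dbm)]])

def synth_trace(kind, n=2000):
    """Very rough stand-ins for a streaming (DVB-T-like) vs. a bursty (Wi-Fi-like) signal."""
    if kind == "streaming":
        return rng.normal(-55.0, 2.0, n)               # channel nearly always occupied
    on = rng.random(n) < 0.3                           # intermittent bursts
    return np.where(on, rng.normal(-50.0, 4.0, n), rng.normal(-95.0, 2.0, n))

X = np.array([rssi_features(synth_trace(k)) for k in ["streaming", "bursty"] * 100])
y = np.array([0, 1] * 100)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print("resubstitution accuracy on synthetic traces:", clf.score(X, y))
```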

  9. [Application of image recognition technology in census of national traditional Chinese medicine resources].

    Science.gov (United States)

    Zhang, Xiao-Bo; Ge, Xiao-Guang; Jin, Yan; Shi, Ting-Ting; Wang, Hui; Li, Meng; Jing, Zhi-Xian; Guo, Lan-Ping; Huang, Lu-Qi

    2017-11-01

    With the development of computer and image processing technology, image recognition technology has been applied at all stages of the national census of traditional Chinese medicine resources. ① In the preparatory work, text recognition of paper materials assists the digitization of the various categories related to Chinese medicine resources, in order to establish a unified resource library; and, to determine the representative survey areas and plots for each census team, remote sensing image classification and other technical methods based on satellite remote sensing images, vegetation maps and other basic data assist in identifying the key investigation areas. ② In the field investigation, decision tree models, spectral features and object-oriented methods assist in regional identification and in accurately estimating the planting area of Chinese herbal medicines. ③ In the data consolidation stage, image recognition based on individual plant photos, specimens and names assists the statistical summary of the types of Chinese medicine resources present in a region. ④ In the application of the results, a Chinese medicine resource identification app and a 3D display system for authentic herbs, built from the pharmaceutical resources and individual samples of medicinal herbs, assist the identification of Chinese medicine resources and of the identifying characteristics of herbs. The introduction of image recognition technology into the census of Chinese medicine resources assists census personnel in carrying out related work; it not only reduces the manual workload and improves work efficiency, but also improves the census results.

  10. Analyzing the mediated voice - a datasession

    DEFF Research Database (Denmark)

    Lawaetz, Anna

    Broadcast voices are technologically manipulated. Paradoxically, in order to achieve a certain authenticity or sound of “reality”, the voices are filtered and trained so as to reach the listeners. This “mise-en-scène” is important knowledge when it comes to the development of a consistent method of analysis of the mediated voice...

  11. Middle Years Science Teachers Voice Their First Experiences with Interactive Whiteboard Technology

    Science.gov (United States)

    Gadbois, Shannon A.; Haverstock, Nicole

    2012-01-01

    Among new technologies, interactive whiteboards (IWBs) particularly seem to engage students and offer entertainment value that may make them highly beneficial for learning. This study examined 10 Grade 6 teachers' initial experiences and uses of IWBs for teaching science. Through interviews, classroom visits, and field notes, the outcomes…

  12. Object/Shape Recognition Technology: An Assessment of the Feasibility of Implementation at Defense Logistics Agency Disposition Services

    Science.gov (United States)

    2015-02-25

    provide efficiency and effectively manufacture or inventory items. The industries that benefit from Cognex technology are automotive, food and beverage ... recognition technology, Technology Readiness Level, Cost Benefit Analysis, Technology Commercialization, Technology Transition ...

  13. Voiced Excitations

    National Research Council Canada - National Science Library

    Holzricher, John

    2004-01-01

    To more easily obtain a voiced excitation function for speech characterization, measurements of skin motion, tracheal tube motion, and vocal fold motion were made and compared to EM sensor-glottal derived...

  14. Sustainable Consumer Voices

    DEFF Research Database (Denmark)

    Klitmøller, Anders; Rask, Morten; Jensen, Nevena

    2011-01-01

    Aiming to explore how user driven innovation can inform high level design strategies, an in-depth empirical study was carried out, based on data from 50 observations of private vehicle users. This paper reports the resulting 5 consumer voices: Technology Enthusiast, Environmentalist, Design Lover...

  15. Taking Care of Your Voice

    Science.gov (United States)

    ... negative effect on voice. Exercise regularly. Exercise increases stamina and muscle tone. This helps provide good posture ... testing man-made and biological materials and stem cell technologies that may eventually be used to engineer ...

  16. Mechanics of human voice production and control.

    Science.gov (United States)

    Zhang, Zhaoyan

    2016-10-01

    As the primary means of communication, voice plays an important role in daily life. Voice also conveys personal information such as social status, personal traits, and the emotional state of the speaker. Mechanically, voice production involves complex fluid-structure interaction within the glottis and its control by laryngeal muscle activation. An important goal of voice research is to establish a causal theory linking voice physiology and biomechanics to how speakers use and control voice to communicate meaning and personal information. Establishing such a causal theory has important implications for clinical voice management, voice training, and many speech technology applications. This paper provides a review of voice physiology and biomechanics, the physics of vocal fold vibration and sound production, and laryngeal muscular control of the fundamental frequency of voice, vocal intensity, and voice quality. Current efforts to develop mechanical and computational models of voice production are also critically reviewed. Finally, issues and future challenges in developing a causal theory of voice production and perception are discussed.
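
    As a small, self-contained illustration of one quantity discussed in this review, the fundamental frequency of voice, the sketch below estimates f0 from a voiced frame by picking the autocorrelation peak. The synthetic glottal-like signal and the search range are assumptions, not material from the article.

```python
import numpy as np

def estimate_f0(frame, fs, fmin=60.0, fmax=400.0):
    """Return an autocorrelation-peak estimate of the fundamental frequency in hertz."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]   # lags 0 .. n-1
    lo, hi = int(fs / fmax), int(fs / fmin)                         # plausible pitch-period range
    lag = lo + int(np.argmax(ac[lo:hi]))
    return fs / lag

fs = 16000
t = np.arange(int(0.04 * fs)) / fs                                  # a 40 ms analysis frame
voiced_like = np.sign(np.sin(2 * np.pi * 120.0 * t))                # crude 120 Hz glottal-like pulse train
voiced_like += 0.1 * np.random.default_rng(3).normal(size=t.size)   # a little noise
print("estimated f0 (true value 120 Hz):", round(estimate_f0(voiced_like, fs), 1))
```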

  17. Drop App: give voice to youngsters through new technologies to fight drop-out

    Directory of Open Access Journals (Sweden)

    Margarita Gandullo Recio

    2016-05-01

    Full Text Available IES Valle de Aller is a partner in a European Erasmus+ KA2 project for cooperation and the innovation of good practices, working on this project together with other educational organizations from different European countries. In 2012, early school leaving (ESL) affected 5,500,000 young Europeans aged between 18 and 24 years, either because they had not completed compulsory secondary education or because they had not gone on to higher education, whether high school or vocational training; they were not in the labor market. The objective of the EU2020 strategy is to reduce the rate of ESL to below 10%. All inquiries so far show that ESL is caused by a mixture of individual circumstances and educational and socioeconomic conditions. In the last five years, many educational interventions have been developed by national systems and regional educational policies to curb the rate of ESL. What is innovative in this project, and what constitutes its added value, is that the approach is focused on meeting the student in his or her own world: young people between 14 and 18 would be much more motivated to continue education programs if these used the languages of the new information and communication technologies (ICT), because they really are the digital natives.

  18. Familiarity and Voice Representation: From Acoustic-Based Representation to Voice Averages

    Directory of Open Access Journals (Sweden)

    Maureen Fontaine

    2017-07-01

    Full Text Available The ability to recognize an individual from their voice is a widespread ability with a long evolutionary history. Yet, the perceptual representation of familiar voices is ill-defined. In two experiments, we explored the neuropsychological processes involved in the perception of voice identity. We specifically explored the hypothesis that familiar voices, whether trained-to-familiar (Experiment 1) or famous (Experiment 2), are represented as a whole complex pattern, well approximated by the average of multiple utterances produced by a single speaker. In Experiment 1, participants learned three voices over several sessions, and performed a three-alternative forced-choice identification task on original voice samples and several “speaker averages”, created by morphing across varying numbers of different vowels (e.g., [a] and [i]) produced by the same speaker. In Experiment 2, the same participants performed the same task on voice samples produced by famous speakers. The two experiments showed that for famous voices, but not for trained-to-familiar voices, identification performance increased and response times decreased as a function of the number of utterances in the averages. This study sheds light on the perceptual representation of familiar voices, and demonstrates the power of averaging in recognizing familiar voices. The speaker average captures the unique characteristics of a speaker, and thus retains the information essential for recognition; it acts as a prototype of the speaker.

  19. Movement recognition technology as a method of assessing spontaneous general movements in high risk infants

    Directory of Open Access Journals (Sweden)

    Claire eMarcroft

    2015-01-01

    Full Text Available Preterm birth is associated with increased risks of neurological and motor impairments such as cerebral palsy. The risks are highest in those born at the lowest gestations. Early identification of those most at risk is challenging, meaning that a critical window of opportunity to improve outcomes through therapy-based interventions may be missed. Clinically, the assessment of spontaneous general movements is an important tool which can be used for the prediction of movement impairments in high risk infants. Movement recognition aims to capture and analyze relevant limb movements through computerized approaches focusing on continuous, objective, and quantitative assessment. Different methods of recording and analyzing infant movements have recently been explored in high risk infants. These range from camera-based solutions to body-worn miniaturized movement sensors used to record continuous time-series data that represent the dynamics of limb movements. Various machine learning methods have been developed and applied to the analysis of the recorded movement data. This analysis has focused on the detection and classification of atypical spontaneous general movements. This paper aims to identify recent translational studies using movement recognition technology as a method of assessing movement in high risk infants. The application of this technology within pediatric practice represents a growing area of inter-disciplinary collaboration which may lead to a greater understanding of the development of the nervous system in infants at high risk of motor impairment.

  20. Citizen Journalism and Digital Voices: Instituting a Collaborative Process between Global Youth, Technology and Media for Positive Social Change

    Science.gov (United States)

    Worley, Robin

    2011-01-01

    Millions of youths in developing countries are described by UNICEF as "invisible and excluded." They live at the margins of society, facing challenges to their daily existence, powerless to make positive changes. But the emergence of citizen journalism and digital storytelling may offer these youths a chance to share their voices and…

  1. Voice over Internet Protocol (VoIP) Technology as a Global Learning Tool: Information Systems Success and Control Belief Perspectives

    Science.gov (United States)

    Chen, Charlie C.; Vannoy, Sandra

    2013-01-01

    Voice over Internet Protocol (VoIP)-enabled online learning service providers are struggling with high attrition rates and low customer loyalty issues despite VoIP's high degree of system fit for online global learning applications. Effective solutions to this prevalent problem rely on the understanding of system quality, information quality, and…

  2. The Voice/Data Communications system in the Health, Education, Telecommunications Experiments. Satellite Technology Demonstration, Technical Report No. 0417.

    Science.gov (United States)

    Janky, James M.; And Others

    The diligent use of two-way voice links via satellites substantially improves the quality and the availability of health care and educational services in remote areas. This improvement was demonstrated in several experiments that were sponsored by the Department of Health, Education, and Welfare and the National Aeronautics and Space…

  3. A rapid automatic analyzer and its methodology for effective bentonite content based on image recognition technology

    Directory of Open Access Journals (Sweden)

    Wei Long

    2016-09-01

    Full Text Available Fast and accurate determination of effective bentonite content in used clay bonded sand is very important for selecting the correct mixing ratio and mixing process to obtain high-performance molding sand. Currently, the effective bentonite content is determined by testing the methylene blue absorbed in used clay bonded sand, which is usually a manual operation with some disadvantages including a complicated process, long testing time and low accuracy. A rapid automatic analyzer of the effective bentonite content in used clay bonded sand was developed based on image recognition technology. The instrument consists of auto stirring, auto liquid removal, auto titration, step-rotation and image acquisition components, and a processor. The principle of the image recognition method is first to decompose the color images into three-channel gray images, based on the difference in photosensitivity of the light blue and dark blue in the red, green and blue channels; then to perform gray-value subtraction and gray-level transformation on the gray images; and finally to extract the outer-circle light blue halo and the inner-circle blue spot and calculate their area ratio. The titration process is judged to have reached the end-point when the area ratio is higher than the set value.
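
    The channel-decomposition and area-ratio step described above can be sketched with standard image-processing primitives. The snippet below is a minimal illustration, not the analyzer's actual code: the file name and the halo/spot threshold values are assumptions chosen only to show the structure of the computation.

```python
import cv2


def halo_spot_area_ratio(image_path, halo_thresh=60, spot_thresh=140):
    """Estimate the area ratio of the outer light-blue halo to the inner
    dark-blue spot in a titration droplet photograph.
    Threshold values are illustrative assumptions, not calibrated constants."""
    img = cv2.imread(image_path)                       # BGR color image
    b, g, r = cv2.split(img)                           # three single-channel gray images
    # Gray-value subtraction: the blue dye absorbs red light strongly, so the
    # difference between the blue and red channels highlights the stain.
    diff = cv2.subtract(b, r)
    # Gray-level transformation (contrast stretch to the full 0-255 range).
    diff = cv2.normalize(diff, None, 0, 255, cv2.NORM_MINMAX)
    # The inner spot is the denser blue (higher difference values),
    # the outer halo is light blue (intermediate difference values).
    spot = cv2.inRange(diff, spot_thresh, 255)
    halo = cv2.inRange(diff, halo_thresh, spot_thresh - 1)
    spot_area = cv2.countNonZero(spot)
    halo_area = cv2.countNonZero(halo)
    return halo_area / spot_area if spot_area else 0.0


# End-point check: titration stops once the ratio exceeds a preset value, e.g.
# if halo_spot_area_ratio("droplet.png") > 1.5: stop_titration()
```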

  4. Human factors issues associated with the use of speech technology in the cockpit

    Science.gov (United States)

    Kersteen, Z. A.; Damos, D.

    1983-01-01

    The human factors issues associated with the use of voice technology in the cockpit are summarized. The formulation of the LHX avionics suite is described and the allocation of tasks to voice in the cockpit is discussed. State-of-the-art speech recognition technology is reviewed. Finally, a questionnaire designed to tap pilot opinions concerning the allocation of tasks to voice input and output in the cockpit is presented. This questionnaire was designed to be administered to operational AH-1G Cobra gunship pilots. Half of the questionnaire deals specifically with the AH-1G cockpit and the types of tasks pilots would like to have performed by voice in this existing rotorcraft. The remaining portion of the questionnaire deals with an undefined rotorcraft of the future and is aimed at determining what types of tasks these pilots would like to have performed by voice technology if anything was possible, i.e. if there were no technological constraints.

  5. Voice search for development

    CSIR Research Space (South Africa)

    Barnard, E

    2010-09-01

    Full Text Available of speech technology development, similar approaches are likely to be applicable in both circumstances. However, within these broad approaches there are details which are specific to certain languages (or language families) that may require solutions... to the modeling of pitch were therefore required. Similarly, it is possible that novel solutions will be required to deal with the click sounds that occur in some Southern Bantu languages, or the voicing…

  6. Digital Technologies for Social Innovation: An Empirical Recognition on the New Enablers

    Directory of Open Access Journals (Sweden)

    Riccardo Maiolini

    2016-12-01

    Full Text Available Even though scholars’ attention has been placed on Social Innovation (SI), little evidence has been provided with regard to which tools are actually used to address social needs and foster Social Innovation initiatives. The purpose of the article is twofold. Firstly, the article offers empirical recognition to SI by investigating, on a large scale, social and innovative activities conducted by start-ups and small and medium-sized enterprises (SMEs) across the world between 2001 and 2014. Secondly, the article intends to capture SI core businesses and underlying complementarities between products, markets, and technologies and show in which way digital media and IT are essentially tracing innovation trajectories over a multitude of industries, leading the current industrial patterns of SI, and continually fostering its cross-industry nature.

  7. Voice Based City Panic Button System

    Science.gov (United States)

    Febriansyah; Zainuddin, Zahir; Bachtiar Nappu, M.

    2018-03-01

    The development of a voice-activated panic button application aims to provide faster early notification of hazardous conditions in the community to the nearest police, using speech as the trigger; current applications still rely on a touch combination on the screen and on orders coordinated from a control center, so early notification takes longer. The method used in this research was voice recognition for detecting the user's voice and the haversine formula for finding the closest distance between the user and the police. The application is also equipped with auto SMS, which sends a notification to the victim's relatives, and is integrated with the Google Maps application (GMaps) to map the route to the victim's location. The results show that voice registration in the application reaches 100%, incident detection using speech recognition while the application is running averages 94.67%, and the auto SMS to the victim's relatives reaches 100%.
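
    The haversine formula mentioned above reduces to a few lines of arithmetic. The sketch below is illustrative only; the station list and coordinates are hypothetical, not data from the paper.

```python
from math import radians, sin, cos, asin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * r * asin(sqrt(a))


def nearest_station(user_pos, stations):
    """Return the police station closest to the user's GPS position."""
    return min(stations, key=lambda s: haversine_km(*user_pos, s["lat"], s["lon"]))


# Hypothetical coordinates, for illustration only.
stations = [
    {"name": "Station A", "lat": -5.135, "lon": 119.423},
    {"name": "Station B", "lat": -5.147, "lon": 119.432},
]
print(nearest_station((-5.140, 119.425), stations)["name"])
```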

  8. Improved sensitivity of wearable nanogenerators made of electrospun Eu3+ doped P(VDF-HFP)/graphene composite nanofibers for self-powered voice recognition

    Science.gov (United States)

    Adhikary, Prakriti; Biswas, Anirban; Mandal, Dipankar

    2016-12-01

    Composite nanofibers of Eu3+ doped poly(vinylidene fluoride-co-hexafluoropropylene) (P(VDF-HFP))/graphene are prepared by the electrospinning technique for the fabrication of ultrasensitive wearable piezoelectric nanogenerators (WPNGs) where the post-poling technique is not necessary. It is found that the complete conversion of the piezoelectric β-phase and the improvement of the degree of crystallinity is governed by the incorporation of Eu3+ and graphene sheets into P(VDF-HFP) nanofibers. The flexible nanocomposite fibers are associated with a hypersensitive electronic transition that results in an intense red light emission, and WPNGs also have the capability of detecting external pressure as low as ~23 Pa with a higher degree of acoustic sensitivity, ~11 V Pa-1, than has ever been previously reported. This means that ultrasensitive WPNGs can be utilized to recognize human voices, which suggests they could be a potential tool in the biomedical and national security sectors. The ability to charge a capacitor from abundant environmental vibrations, such as music, wind and body motion, allows WPNGs to act as a power source for portable electronics. This fact may open up the prospect of using the Eu3+ doped P(VDF-HFP)/graphene composite electrospun nanofibers, with their multifunctional properties such as vibration sensitivity, wearability, red light emission capability and piezoelectric energy harvesting, for various promising applications in portable electronics, health care monitoring, noise detection and security monitoring.

  9. Interfacing COTS Speech Recognition and Synthesis Software to a Lotus Notes Military Command and Control Database

    Science.gov (United States)

    Carr, Oliver

    2002-10-01

    Speech recognition and synthesis technologies have become commercially viable over recent years. Two current market leading products in speech recognition technology are Dragon NaturallySpeaking and IBM ViaVoice. This report describes the development of speech user interfaces incorporating these products with Lotus Notes and Java applications. These interfaces enable data entry using speech recognition and allow warnings and instructions to be issued via speech synthesis. The development of a military vocabulary to improve user interaction is discussed. The report also describes an evaluation in terms of speed of the various speech user interfaces developed using Dragon NaturallySpeaking and IBM ViaVoice with a Lotus Notes Command and Control Support System Log database.

  10. Benefits for Voice Learning Caused by Concurrent Faces Develop over Time.

    Science.gov (United States)

    Zäske, Romi; Mühl, Constanze; Schweinberger, Stefan R

    2015-01-01

    Recognition of personally familiar voices benefits from the concurrent presentation of the corresponding speakers' faces. This effect of audiovisual integration is most pronounced for voices combined with dynamic articulating faces. However, it is unclear if learning unfamiliar voices also benefits from audiovisual face-voice integration or, alternatively, is hampered by attentional capture of faces, i.e., "face-overshadowing". In six study-test cycles we compared the recognition of newly-learned voices following unimodal voice learning vs. bimodal face-voice learning with either static (Exp. 1) or dynamic articulating faces (Exp. 2). Voice recognition accuracies significantly increased for bimodal learning across study-test cycles while remaining stable for unimodal learning, as reflected in numerical costs of bimodal relative to unimodal voice learning in the first two study-test cycles and benefits in the last two cycles. This was independent of whether faces were static images (Exp. 1) or dynamic videos (Exp. 2). In both experiments, slower reaction times to voices previously studied with faces compared to voices only may result from visual search for faces during memory retrieval. A general decrease of reaction times across study-test cycles suggests facilitated recognition with more speaker repetitions. Overall, our data suggest two simultaneous and opposing mechanisms during bimodal face-voice learning: while attentional capture of faces may initially impede voice learning, audiovisual integration may facilitate it thereafter.

  11. Tips for Healthy Voices

    Science.gov (United States)

    ... prevent voice problems and maintain a healthy voice: Drink water (stay well hydrated): Keeping your body well hydrated by drinking plenty of water each day (6-8 glasses) is essential to maintaining a healthy voice. The ...

  12. Probing echoic memory with different voices.

    Science.gov (United States)

    Madden, D J; Bastian, J

    1977-05-01

    Considerable evidence has indicated that some acoustical properties of spoken items are preserved in an "echoic" memory for approximately 2 sec. However, some of this evidence has also shown that changing the voice speaking the stimulus items has a disruptive effect on memory which persists longer than that of other acoustical variables. The present experiment examined the effect of voice changes on response bias as well as on accuracy in a recognition memory task. The task involved judging recognition probes as being present in or absent from sets of dichotically presented digits. Recognition of probes spoken in the same voice as that of the dichotic items was more accurate than recognition of different-voice probes at each of three retention intervals of up to 4 sec. Different-voice probes increased the likelihood of "absent" responses, but only up to a 1.4-sec delay. These shifts in response bias may represent a property of echoic memory which should be investigated further.

  13. Voice over IP Security

    CERN Document Server

    Keromytis, Angelos D

    2011-01-01

    Voice over IP (VoIP) and Internet Multimedia Subsystem technologies (IMS) are rapidly being adopted by consumers, enterprises, governments and militaries. These technologies offer higher flexibility and more features than traditional telephony (PSTN) infrastructures, as well as the potential for lower cost through equipment consolidation and, for the consumer market, new business models. However, VoIP systems also represent a higher complexity in terms of architecture, protocols and implementation, with a corresponding increase in the potential for misuse. In this book, the authors examine the

  14. Nanomechanical recognition of prognostic biomarker suPAR with DVD-ROM optical technology

    International Nuclear Information System (INIS)

    Bache, Michael; Bosco, Filippo G; Brøgger, Anna L; Frøhling, Kasper B; Boisen, Anja; Alstrøm, Tommy Sonne; Hwu, En-Te; Chen, Ching-Hsiu; Hwang, Ing-Shouh; Eugen-Olsen, Jesper

    2013-01-01

    In this work the use of a high-throughput nanomechanical detection system based on a DVD-ROM optical drive and cantilever sensors is presented for the detection of urokinase plasminogen activator receptor inflammatory biomarker (uPAR). Several large scale studies have linked elevated levels of soluble uPAR (suPAR) to infectious diseases, such as HIV, and certain types of cancer. Using hundreds of cantilevers and a DVD-based platform, cantilever deflection response from antibody–antigen recognition is investigated as a function of suPAR concentration. The goal is to provide a cheap and portable detection platform which can carry valuable prognostic information. In order to optimize the cantilever response the antibody immobilization and unspecific binding are initially characterized using quartz crystal microbalance technology. Also, the choice of antibody is explored in order to generate the largest surface stress on the cantilevers, thus increasing the signal. Using optimized experimental conditions the lowest detectable suPAR concentration is currently around 5 nM. The results reveal promising research strategies for the implementation of specific biochemical assays in a portable and high-throughput microsensor-based detection platform. (paper)

  15. Construction site Voice Operated Information System (VOIS) test

    Science.gov (United States)

    Lawrence, Debbie J.; Hettchen, William

    1991-01-01

    The Voice Activated Information System (VAIS), developed by USACERL, allows inspectors to verbally log on-site inspection reports on a hand held tape recorder. The tape is later processed by the VAIS, which enters the information into the system's database and produces a written report. The Voice Operated Information System (VOIS), developed by USACERL and Automated Sciences Group through a USACERL cooperative research and development agreement (CRDA), is an improved voice recognition system based on the concepts and function of the VAIS. To determine the applicability of the VOIS to Corps of Engineers construction projects, Technology Transfer Test Bed (T3B) funds were provided to the Corps of Engineers National Security Agency (NSA) Area Office (Fort Meade) to procure and implement the VOIS, and to train personnel in its use. This report summarizes the NSA application of the VOIS to quality assurance inspection of radio frequency shielding and to progress payment logs, and concludes that the VOIS is an easily implemented system that can offer improvements when applied to repetitive inspection procedures. Use of VOIS can save time during inspection, improve documentation storage, and provide flexible retrieval of stored information.

  16. Effect of Technological Changes in Information Transfer on the Delivery of Pharmacy Services.

    Science.gov (United States)

    Barker, Kenneth N.; And Others

    1989-01-01

    Personal computer technology has arrived in health care. Specific technological advances are optical disc storage, smart cards, voice recognition, and robotics. This paper discusses computers in medicine, in nursing, in conglomerates, and with patients. Future health care will be delivered in primary care centers, medical supermarkets, specialized…

  17. Review of Design of Speech Recognition and Text Analytics based Digital Banking Customer Interface and Future Directions of Technology Adoption

    OpenAIRE

    Saha, Amal K

    2017-01-01

    Banking is one of the most significant adopters of cutting-edge information technologies. Since its modern-era beginning in the form of paper-based accounting maintained in the branch, the adoption of computerized systems made it possible to centralize processing in the data centre and improve customer experience by making the system more available and efficient. The latest twist in this evolution is the adoption of natural language processing and speech recognition in the user interface between the hum...

  18. Multimodal approaches for emotion recognition: a survey

    Science.gov (United States)

    Sebe, Nicu; Cohen, Ira; Gevers, Theo; Huang, Thomas S.

    2005-01-01

    Recent technological advances have enabled human users to interact with computers in ways previously unimaginable. Beyond the confines of the keyboard and mouse, new modalities for human-computer interaction such as voice, gesture, and force-feedback are emerging. Despite important advances, one necessary ingredient for natural interaction is still missing: emotions. Emotions play an important role in human-to-human communication and interaction, allowing people to express themselves beyond the verbal domain. The ability to understand human emotions is desirable for the computer in several applications. This paper explores new ways of human-computer interaction that enable the computer to be more aware of the user's emotional and attentional expressions. We present the basic research in the field and the recent advances into the emotion recognition from facial, voice, and physiological signals, where the different modalities are treated independently. We then describe the challenging problem of multimodal emotion recognition and we advocate the use of probabilistic graphical models when fusing the different modalities. We also discuss the difficult issues of obtaining reliable affective data, obtaining ground truth for emotion recognition, and the use of unlabeled data.

  19. Rapid determination of 239Pu in urine samples using molecular recognition technology product AnaLig® Pu-02 gel

    International Nuclear Information System (INIS)

    Silvia Dulanska; Boris Remenec; Jan Bilohuscin; Miroslav Labaska; Bianka Horvathova; Andrej Matel

    2013-01-01

    This paper describes the use of IBC's AnaLig® Pu-02 molecular recognition technology product to effectively and selectively pre-concentrate, separate and recover plutonium from urine samples. This method uses two-stage column separations consisting of two different commercial products, Eichrom's Pre-filter Material and AnaLig® Pu-02 resin from IBC Advanced Technologies. By eliminating the co-precipitation techniques and the ashing steps to remove residual organics, the analysis time was reduced significantly. The method was successfully tested by adding known activities of reference solutions of 242Pu and 239Pu to urine samples. (author)

  20. Web Surveys to Digital Movies: Technological Tools of the Trade.

    Science.gov (United States)

    Fetterman, David M.

    2002-01-01

    Highlights some of the technological tools used by educational researchers today, focusing on data collection related tools such as Web surveys, digital photography, voice recognition and transcription, file sharing and virtual office, videoconferencing on the Internet, instantaneous chat and chat rooms, reporting and dissemination, and digital…

  1. Voice-associated static face image releases speech from informational masking.

    Science.gov (United States)

    Gao, Yayue; Cao, Shuyang; Qu, Tianshu; Wu, Xihong; Li, Haifeng; Zhang, Jinsheng; Li, Liang

    2014-06-01

    In noisy, multipeople talking environments such as a cocktail party, listeners can use various perceptual and/or cognitive cues to improve recognition of target speech against masking, particularly informational masking. Previous studies have shown that temporally pre-presented voice cues (voice primes) improve recognition of target speech against speech masking but not noise masking. This study investigated whether static face image primes that have become target-voice associated (i.e., facial images linked through associative learning with voices reciting the target speech) can be used by listeners to unmask speech. The results showed that in 32 normal-hearing younger adults, temporally pre-presenting a voice-priming sentence with the same voice reciting the target sentence significantly improved the recognition of target speech that was masked by irrelevant two-talker speech. When a person's face photograph image became associated with the voice reciting the target speech by learning, temporally pre-presenting the target-voice-associated face image significantly improved recognition of target speech against speech masking, particularly for the last two keywords in the target sentence. Moreover, speech-recognition performance under the voice-priming condition was significantly correlated to that under the face-priming condition. The results suggest that learned facial information on talker identity plays an important role in identifying the target-talker's voice and facilitating selective attention to the target-speech stream against the masking-speech stream.

  2. Using Technology to Claim Rights to Free Maternal Health Care: Lessons about Impact from the My Health, My Voice Pilot Project in India.

    Science.gov (United States)

    Dasgupta, Jashodhara; Sandhya, Y K; Lobis, Samantha; Verma, Pravesh; Schaaf, Marta

    2015-12-10

    My Health, My Voice is a human rights-based project that pilots the use of technology to monitor and display online data regarding informal payments for maternal health care in two districts of Uttar Pradesh, India. SAHAYOG, an organization based in Uttar Pradesh, partnered with a grassroots women's forum to inform women about their entitlements, to publicize the project, and to implement a toll-free hotline where women could report health providers' demands for informal payments. Between January 2012 and May 2013, the hotline recorded 873 reports of informal payment demands. Monitoring and evaluation revealed that the project enhanced women's knowledge of their entitlements, as well as their confidence to claim their rights. Anecdotal evidence suggests that health providers' demands for informal payments were reduced in response to the project, although hospital and district officials did not regularly consult the data. The use of technology accorded greater legitimacy among governmental stakeholders. Future research should examine the sustainability of changes, as well as the mechanisms driving health sector responsiveness.

  3. Research of Obstacle Recognition Technology in Cross-Country Environment for Unmanned Ground Vehicle

    Directory of Open Access Journals (Sweden)

    Zhao Yibing

    2014-01-01

    Full Text Available Aimed at the obstacle recognition problem of unmanned ground vehicles in cross-country environments, this paper uses a monocular vision sensor to realize the recognition of typical obstacles. Firstly, a median filtering algorithm is applied during image preprocessing to eliminate noise. Secondly, an image segmentation method based on the Fisher criterion function is used to segment the region of interest. Then, a morphological method is used to process the segmented image in preparation for the subsequent analysis. The next step is to extract the color feature S, the color feature a, and the edge feature “verticality” of the image, based on the HSI color space, the Lab color space, and binary images. Finally, a multifeature fusion algorithm based on Bayes classification theory is used for obstacle recognition. Test results show that the algorithm has good robustness and accuracy.
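
    The pipeline above (median filtering, Fisher-criterion segmentation, morphological cleanup, S/a/verticality features, Bayes fusion) can be approximated with off-the-shelf tools. In the hedged sketch below, Otsu thresholding stands in for the Fisher-criterion segmentation (both maximise between-class separability), and the blur window, kernel size and the final naive-Bayes step are assumptions rather than the authors' exact settings.

```python
import cv2
import numpy as np


def obstacle_features(bgr_image):
    """Extract the three features used for obstacle recognition:
    saturation S (HSI approximated here by HSV), chromatic a (Lab),
    and an edge 'verticality' measure from the binarised region."""
    img = cv2.medianBlur(bgr_image, 5)                        # noise removal
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Otsu thresholding maximises between-class variance, used here as a
    # stand-in for the Fisher-criterion segmentation in the paper.
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)     # morphological cleanup

    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    lab = cv2.cvtColor(img, cv2.COLOR_BGR2Lab)
    s_mean = float(hsv[:, :, 1][mask > 0].mean())             # colour feature S
    a_mean = float(lab[:, :, 1][mask > 0].mean())             # colour feature a

    # 'Verticality': share of gradient energy in the x direction,
    # which is large when the segmented region has mostly vertical edges.
    gx = np.abs(cv2.Sobel(mask, cv2.CV_32F, 1, 0)).sum()
    gy = np.abs(cv2.Sobel(mask, cv2.CV_32F, 0, 1)).sum()
    verticality = float(gx / (gx + gy + 1e-6))
    return [s_mean, a_mean, verticality]


# The fused feature vectors can then feed a Bayes classifier, e.g.
# sklearn.naive_bayes.GaussianNB().fit(feature_rows, obstacle_labels)
```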

  4. Dimensionality in voice quality.

    Science.gov (United States)

    Bele, Irene Velsvik

    2007-05-01

    This study concerns speaking voice quality in a group of male teachers (n = 35) and male actors (n = 36); the purpose was to investigate normal and supranormal voices. The goal was the development of a method of valid perceptual evaluation for normal to supranormal and resonant voices. The voices (text reading at two loudness levels) had been evaluated by 10 listeners for 15 vocal characteristics using VA scales. In this investigation, the results of an exploratory factor analysis of the vocal characteristics used in this method are presented, reflecting four dimensions of major importance for normal and supranormal voices. Special emphasis is placed on the effects on voice quality of a change in the loudness variable, as two loudness levels are studied. Furthermore, the vocal characteristics Sonority and Ringing voice quality are paid special attention, as the essence of the term "resonant voice" was a basic issue throughout a doctoral dissertation where this study was included.

  5. Perceiving a stranger's voice as being one's own: a 'rubber voice' illusion?

    Directory of Open Access Journals (Sweden)

    Zane Z Zheng

    2011-04-01

    Full Text Available We describe an illusion in which a stranger's voice, when presented as the auditory concomitant of a participant's own speech, is perceived as a modified version of their own voice. When the congruence between utterance and feedback breaks down, the illusion is also broken. Compared to a baseline condition in which participants heard their own voice as feedback, hearing a stranger's voice induced robust changes in the fundamental frequency (F0 of their production. Moreover, the shift in F0 appears to be feedback dependent, since shift patterns depended reliably on the relationship between the participant's own F0 and the stranger-voice F0. The shift in F0 was evident both when the illusion was present and after it was broken, suggesting that auditory feedback from production may be used separately for self-recognition and for vocal motor control. Our findings indicate that self-recognition of voices, like other body attributes, is malleable and context dependent.

  6. A preliminary analysis of human factors affecting the recognition accuracy of a discrete word recognizer for C3 systems

    Science.gov (United States)

    Yellen, H. W.

    1983-03-01

    Literature pertaining to Voice Recognition abounds with information relevant to the assessment of transitory speech recognition devices. In the past, engineering requirements have dictated the path this technology followed. But other factors do exist that influence recognition accuracy. This thesis explores the impact of Human Factors on the successful recognition of speech, principally addressing the differences or variability among users. A Threshold Technology T-600 was used with a 100-utterance vocabulary to test 44 subjects. A statistical analysis was conducted on 5 generic categories of Human Factors: Occupational, Operational, Psychological, Physiological and Personal. How the equipment is trained and the experience level of the speaker were found to be key characteristics influencing recognition accuracy. To a lesser extent, computer experience, time of week, accent, vital capacity and rate of air flow, speaker cooperativeness and anxiety were found to affect overall error rates.

  7. Writing with Voice

    Science.gov (United States)

    Kesler, Ted

    2012-01-01

    In this Teaching Tips article, the author argues for a dialogic conception of voice, based in the work of Mikhail Bakhtin. He demonstrates a dialogic view of voice in action, using two writing examples about the same topic from his daughter, a fifth-grade student. He then provides five practical tips for teaching a dialogic conception of voice in…

  8. Marshall’s Voice

    Directory of Open Access Journals (Sweden)

    Halper Thomas

    2017-12-01

    Full Text Available Most judicial opinions, for a variety of reasons, do not speak with the voice of identifiable judges, but an analysis of several of John Marshall’s best known opinions reveals a distinctive voice, with its characteristic language and style of argumentation. The power of this voice helps to account for the influence of his views.

  9. A Wireless LAN and Voice Information System for Underground Coal Mine

    OpenAIRE

    Yu Zhang; Wei Yang; Dongsheng Han; Young-Il Kim

    2014-01-01

    In this paper we constructed a wireless information system, and developed a wireless voice communication subsystem based on Wireless Local Area Networks (WLAN) for underground coal mine, which employs Voice over IP (VoIP) technology and Session Initiation Protocol (SIP) to achieve wireless voice dispatching communications. The master control voice dispatching interface and call terminal software are also developed on the WLAN ground server side to manage and implement the voice dispatching co...

  10. VASIR: An Open-Source Research Platform for Advanced Iris Recognition Technologies.

    Science.gov (United States)

    Lee, Yooyoung; Micheals, Ross J; Filliben, James J; Phillips, P Jonathon

    2013-01-01

    The performance of iris recognition systems is frequently affected by input image quality, which in turn is vulnerable to less-than-optimal conditions due to illuminations, environments, and subject characteristics (e.g., distance, movement, face/body visibility, blinking, etc.). VASIR (Video-based Automatic System for Iris Recognition) is a state-of-the-art NIST-developed iris recognition software platform designed to systematically address these vulnerabilities. We developed VASIR as a research tool that will not only provide a reference (to assess the relative performance of alternative algorithms) for the biometrics community, but will also advance (via this new emerging iris recognition paradigm) NIST's measurement mission. VASIR is designed to accommodate both ideal (e.g., classical still images) and less-than-ideal images (e.g., face-visible videos). VASIR has three primary modules: 1) Image Acquisition 2) Video Processing, and 3) Iris Recognition. Each module consists of several sub-components that have been optimized by use of rigorous orthogonal experiment design and analysis techniques. We evaluated VASIR performance using the MBGC (Multiple Biometric Grand Challenge) NIR (Near-Infrared) face-visible video dataset and the ICE (Iris Challenge Evaluation) 2005 still-based dataset. The results showed that even though VASIR was primarily developed and optimized for the less-constrained video case, it still achieved high verification rates for the traditional still-image case. For this reason, VASIR may be used as an effective baseline for the biometrics community to evaluate their algorithm performance, and thus serves as a valuable research platform.

  11. Benefits and Challenges of Technology in High Schools: A Voice from Educational Leaders with a Freire Echo

    Science.gov (United States)

    Preston, Jane P.; Wiebe, Sean; Gabriel, Martha; McAuley, Alexander; Campbell, Barbara; MacDonald, Ron

    2015-01-01

    The purpose of this study is to document the perceptions of school leaders pertaining to the benefits and challenges of technology in high schools located on Prince Edward Island (PEI) (Canada). For this qualitative study, we interviewed 11 educational leaders representing the PEI Department of Education, principals, vice-principals, and…

  12. Differences in Access to Information and Communication Technologies: Voices of British Muslim Teenage Girls at Islamic Faith Schools

    Science.gov (United States)

    Hardaker, Glenn; Sabki, Aishah; Qazi, Atika; Iqbal, Javed

    2017-01-01

    Purpose: Most research on information and communication technologies (ICT) differences has been related to gender and ethnicity, and to a lesser extent religious affiliation. The purpose of this paper is to contribute to this field of research by situating the discussion in the context of British Muslims and extending current research into ICT…

  13. Using voice input and audio feedback to enhance the reality of a virtual experience

    Energy Technology Data Exchange (ETDEWEB)

    Miner, N.E.

    1994-04-01

    Virtual Reality (VR) is a rapidly emerging technology which allows participants to experience a virtual environment through stimulation of the participant's senses. Intuitive and natural interactions with the virtual world help to create a realistic experience. Typically, a participant is immersed in a virtual environment through the use of a 3-D viewer. Realistic, computer-generated environment models and accurate tracking of a participant's view are important factors for adding realism to a virtual experience. Stimulating a participant's sense of sound and providing a natural form of communication for interacting with the virtual world are equally important. This paper discusses the advantages and importance of incorporating voice recognition and audio feedback capabilities into a virtual world experience. Various approaches and levels of complexity are discussed. Examples of the use of voice and sound are presented through the description of a research application developed in the VR laboratory at Sandia National Laboratories.

  14. Heuristics in primary care for recognition of unreported vision loss in older people: a technology development study.

    Science.gov (United States)

    Wijeyekoon, Skanda; Kharicha, Kalpa; Iliffe, Steve

    2015-09-01

    To evaluate heuristics (rules of thumb) for recognition of undetected vision loss in older patients in primary care. Vision loss is associated with ageing, and its prevalence is increasing. Visual impairment has a broad impact on health, functioning and well-being. Unrecognised vision loss remains common, and screening interventions have yet to reduce its prevalence. An alternative approach is to enhance practitioners' skills in recognising undetected vision loss, by having a more detailed picture of those who are likely not to act on vision changes, report symptoms or have eye tests. This paper describes a qualitative technology development study to evaluate heuristics for recognition of undetected vision loss in older patients in primary care. Using a previous modelling study, two heuristics in the form of mnemonics were developed to aid pattern recognition and allow general practitioners to identify potential cases of unreported vision loss. These heuristics were then analysed with experts. Findings: It was concluded that their implementation in modern general practice was unsuitable and an alternative solution should be sought.

  15. Adherence to self-monitoring via interactive voice response technology in an eHealth intervention targeting weight gain prevention among Black women: randomized controlled trial.

    Science.gov (United States)

    Steinberg, Dori M; Levine, Erica L; Lane, Ilana; Askew, Sandy; Foley, Perry B; Puleo, Elaine; Bennett, Gary G

    2014-04-29

    eHealth interventions are effective for weight control and have the potential for broad reach. Little is known about the use of interactive voice response (IVR) technology for self-monitoring in weight control interventions, particularly among populations disproportionately affected by obesity. This analysis sought to examine patterns and predictors of IVR self-monitoring adherence and the association between adherence and weight change among low-income black women enrolled in a weight gain prevention intervention. The Shape Program was a randomized controlled trial comparing a 12-month eHealth behavioral weight gain prevention intervention to usual care among overweight and obese black women in the primary care setting. Intervention participants (n=91) used IVR technology to self-monitor behavior change goals (eg, no sugary drinks, 10,000 steps per day) via weekly IVR calls. Weight data were collected in clinic at baseline, 6, and 12 months. Self-monitoring data were stored in a study database and adherence was operationalized as the percent of weeks with a successful IVR call. Over 12 months, the average IVR completion rate was 71.6% (SD 28.1) and 52% (47/91) had an IVR completion rate ≥80%. At 12 months, IVR call completion was significantly correlated with weight loss (r =-.22; P=.04) and participants with an IVR completion rate ≥80% had significantly greater weight loss compared to those with an IVR completion rate <80%. Adherence to IVR self-monitoring was high among socioeconomically disadvantaged black women enrolled in a weight gain prevention intervention. Higher adherence to IVR self-monitoring was also associated with greater weight change. IVR is an effective and useful tool to promote self-monitoring and has the potential for widespread use and long-term sustainability. Clinicaltrials.gov NCT00938535; http://www.clinicaltrials.gov/ct2/show/NCT00938535.

  16. Voice Over Internet Protocol Testbed Design for Non-Intrusive, Objective Voice Quality Assessment

    National Research Council Canada - National Science Library

    Manka, David L

    2007-01-01

    Voice over Internet Protocol (VoIP) is an emerging technology with the potential to assist the United States Marine Corps in solving communication challenges stemming from modern operational concepts...

  17. Can You See Me Now Visualizing Battlefield Facial Recognition Technology in 2035

    Science.gov (United States)

    2010-04-01

    this analogy: Assume that a normal individual, Tom, is very good at identifying different types of fruit juice such as orange juice, apple juice... either compositing multiple images together to produce a more complete image or by creating a new algorithm to better deal with these problems... captures multiple frames of video and composites them into an appropriately high-resolution image that can be processed by the facial recognition software

  18. Speaker Recognition

    DEFF Research Database (Denmark)

    Mølgaard, Lasse Lohilahti; Jørgensen, Kasper Winther

    2005-01-01

    Speaker recognition is basically divided into speaker identification and speaker verification. Verification is the task of automatically determining if a person really is the person he or she claims to be. This technology can be used as a biometric feature for verifying the identity of a person...

  19. Analysis of Documentation Speed Using Web-Based Medical Speech Recognition Technology: Randomized Controlled Trial.

    Science.gov (United States)

    Vogel, Markus; Kaisers, Wolfgang; Wassmuth, Ralf; Mayatepek, Ertan

    2015-11-03

    Clinical documentation has undergone a change due to the usage of electronic health records. The core element is to capture clinical findings and document therapy electronically. Health care personnel spend a significant portion of their time on the computer. Alternatives to self-typing, such as speech recognition, are currently believed to increase documentation efficiency and quality, as well as satisfaction of health professionals while accomplishing clinical documentation, but few studies in this area have been published to date. This study describes the effects of using a Web-based medical speech recognition system for clinical documentation in a university hospital on (1) documentation speed, (2) document length, and (3) physician satisfaction. Reports of 28 physicians were randomized to be created with (intervention) or without (control) the assistance of a Web-based system of medical automatic speech recognition (ASR) in the German language. The documentation was entered into a browser's text area and the time to complete the documentation including all necessary corrections, correction effort, number of characters, and mood of participant were stored in a database. The underlying time comprised text entering, text correction, and finalization of the documentation event. Participants self-assessed their moods on a scale of 1-3 (1=good, 2=moderate, 3=bad). Statistical analysis was done using permutation tests. The number of clinical reports eligible for further analysis stood at 1455. Out of 1455 reports, 718 (49.35%) were assisted by ASR and 737 (50.65%) were not assisted by ASR. Average documentation speed without ASR was 173 (SD 101) characters per minute, while it was 217 (SD 120) characters per minute using ASR. The overall increase in documentation speed through Web-based ASR assistance was 26% (P=.04). Participants documented an average of 356 (SD 388) characters per report when not assisted by ASR and 649 (SD 561) characters per report when assisted by ASR.
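
    The permutation test used for the statistical analysis is straightforward to reproduce in outline. The sketch below is a generic two-sample permutation test on synthetic characters-per-minute data; the sample sizes and random draws are illustrative assumptions, not the study's records.

```python
import numpy as np


def permutation_test(a, b, n_perm=10000, seed=0):
    """Two-sample permutation test for a difference in means.
    Returns an approximate two-sided p-value."""
    rng = np.random.default_rng(seed)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                              # random relabelling
        diff = abs(pooled[:len(a)].mean() - pooled[len(a):].mean())
        count += diff >= observed
    return (count + 1) / (n_perm + 1)


# Hypothetical characters-per-minute samples, not the study data.
asr = np.random.default_rng(1).normal(217, 120, 100)
no_asr = np.random.default_rng(2).normal(173, 101, 100)
print(permutation_test(asr, no_asr))
```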

  20. Users’ Perceived Difficulties and Corresponding Reformulation Strategies in Google Voice Search

    Directory of Open Access Journals (Sweden)

    Wei Jeng

    2016-06-01

    Full Text Available In this article, we report users’ perceptions of query input errors and query reformulation strategies in voice search, using data collected through a laboratory user study. Our results reveal that: (1) users’ perceived obstacles during a voice search can be related to speech recognition errors and topic complexity; (2) users naturally develop different strategies to deal with various types of words (e.g., acronyms, single-worded queries, non-English words) with high error rates in speech recognition; and (3) users can have various emotional reactions when encountering voice input errors, and they develop preferred usage occasions for voice search.

  1. Controlling An Electric Car Starter System Through Voice

    Directory of Open Access Journals (Sweden)

    A.B. Muhammad Firdaus

    2015-04-01

    Full Text Available These days the automobile has become one of the most common modes of transportation, as a large number of Malaysians can afford to own a car. There are numerous technology choices available in cars on the market; one of them is the voice-controlled system. Voice recognition is the process of automatically recognizing an utterance spoken by a particular speaker, based on individual information contained in the speech waves. This paper describes a car starter system controlled by the human voice. An essential pre-processing step in voice recognition systems is to detect the presence of noise. Sensitivity to speech variability, insufficient recognition accuracy and vulnerability to mimicry are among the principal technical obstacles that prevent the widespread adoption of speech-based recognition systems. Voice recognition systems work reasonably well under quiet conditions but poorly under noisy conditions or in distorted channels. The key focus of the project is to control an electric car starter system.

  2. METHODS FOR QUALITY ENHANCEMENT OF USER VOICE SIGNAL IN VOICE AUTHENTICATION SYSTEMS

    Directory of Open Access Journals (Sweden)

    O. N. Faizulaieva

    2014-03-01

    Full Text Available The reasonableness of using the computer system user's voice in the authentication process is demonstrated. The scientific task of improving the signal-to-noise ratio of the user's voice signal in the authentication system is considered. The object of study is the process of input and extraction of the voice signal of the authentication system user in computer systems and networks. Methods and means for the input and extraction of the voice signal against external interference signals are researched, and methods for quality enhancement of the user's voice signal in voice authentication systems are suggested. As modern computer facilities, including mobile ones, have a two-channel audio card, the usage of two microphones is proposed in the voice signal input system of the authentication system. Meanwhile, the task of forming a lobe of the microphone array in the desired area of voice signal registration (100 Hz to 8 kHz) is solved. The usage of the directional properties of the proposed microphone array makes the influence of external interference signals two to three times smaller in the frequency range from 4 to 8 kHz. The possibilities for implementing space-time processing of the recorded signals using constant and adaptive weighting factors are investigated. The simulation results of the proposed system for the input and extraction of signals during digital processing of narrowband signals are presented. The proposed solutions make it possible to improve the signal-to-noise ratio of the recorded useful signals by 10 to 20 dB under the influence of external interference signals in the frequency range from 4 to 8 kHz. The results may be useful to specialists working in the field of voice recognition and speaker discrimination.
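
    A two-microphone array with constant weighting factors is essentially a delay-and-sum beamformer. The sketch below illustrates that idea only; the sampling rate, microphone spacing, steering angle and synthetic test signal are assumptions for demonstration, not the authors' hardware parameters.

```python
import numpy as np


def delay_and_sum(ch1, ch2, fs=16000, mic_spacing=0.02, angle_deg=0.0, c=343.0):
    """Minimal two-microphone delay-and-sum beamformer with constant weights.
    Steers the array toward `angle_deg` (0 = broadside), time-aligns the
    channels and averages them, attenuating uncorrelated interference."""
    # Time difference of arrival between the two sensors for the steering angle.
    tau = mic_spacing * np.sin(np.deg2rad(angle_deg)) / c
    shift = int(round(tau * fs))                 # delay expressed in samples
    ch2_aligned = np.roll(ch2, -shift)           # align channel 2 to channel 1
    return 0.5 * (ch1 + ch2_aligned)             # equal (constant) weighting


# Synthetic example: a 1 kHz tone plus uncorrelated noise on each channel.
fs = 16000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)
rng = np.random.default_rng(0)
mic1 = tone + 0.5 * rng.standard_normal(t.size)
mic2 = tone + 0.5 * rng.standard_normal(t.size)
out = delay_and_sum(mic1, mic2)   # the tone is preserved; noise power roughly halves
```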

  3. Use of a voice and video internet technology as an alternative to in-person urgent care clinic visits.

    Science.gov (United States)

    Brunett, Patrick H; DiPiero, Albert; Flores, Christine; Choi, Dongseok; Kum, Hayley; Girard, Donald E

    2015-06-01

    This study aimed to determine the feasibility of patient-initiated online Internet urgent care visits, and to describe patient characteristics, scope of care, provider adherence to protocols, and diagnostic and therapeutic utilization. A total of 456 unique patients were seen via Internet-based technology during the study period, generating 478 consecutive total patient visits. Of the 82 patients referred for an in-person evaluation, 75 patients (91.5%) reported to the clinic as instructed. None of the 82 patients recommended for in-person evaluation required an emergency department referral, hospital admission or urgent consultative referral. We conclude that real-time online primary and urgent care visits are feasible, safe and potentially beneficial in increasing convenient access to urgent and primary care.

  4. Use of iris recognition camera technology for the quantification of corneal opacification in mucopolysaccharidoses.

    Science.gov (United States)

    Aslam, Tariq Mehmood; Shakir, Savana; Wong, James; Au, Leon; Ashworth, Jane

    2012-12-01

    Mucopolysaccharidoses (MPS) can cause corneal opacification that is currently difficult to objectively quantify. With newer treatments for MPS comes an increased need for a more objective, valid and reliable index of disease severity for clinical and research use. Clinical evaluation by slit lamp is very subjective and techniques based on colour photography are difficult to standardise. In this article the authors present evidence for the utility of dedicated image analysis algorithms applied to images obtained by a highly sophisticated iris recognition camera that is small, manoeuvrable and adapted to achieve rapid, reliable and standardised objective imaging in a wide variety of patients while minimising artefactual interference in image quality.

  5. Pipeline Structural Damage Detection Using Self-Sensing Technology and PNN-Based Pattern Recognition

    International Nuclear Information System (INIS)

    Lee, Chang Gil; Park, Woong Ki; Park, Seung Hee

    2011-01-01

    In a structure, damage can occur at several scales from micro-cracking to corrosion or loose bolts. This makes the identification of damage difficult with one mode of sensing. Hence, a multi-mode actuated sensing system is proposed based on a self-sensing circuit using a piezoelectric sensor. In the self sensing-based multi-mode actuated sensing, one mode provides a wide frequency-band structural response from the self-sensed impedance measurement and the other mode provides a specific frequency-induced structural wavelet response from the self-sensed guided wave measurement. In this study, an experimental study on the pipeline system is carried out to verify the effectiveness and the robustness of the proposed structural health monitoring approach. Different types of structural damage are artificially inflicted on the pipeline system. To classify the multiple types of structural damage, a supervised learning-based statistical pattern recognition is implemented by composing a two-dimensional space using the damage indices extracted from the impedance and guided wave features. For more systematic damage classification, several control parameters to determine an optimal decision boundary for the supervised learning-based pattern recognition are optimized. Finally, further research issues will be discussed for real-world implementation of the proposed approach
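
    The probabilistic neural network (PNN) named in the title is, in essence, a Parzen-window classifier over the two-dimensional damage-index space. The sketch below shows that idea with hypothetical damage indices and an assumed smoothing parameter; it is not the authors' trained model.

```python
import numpy as np


def pnn_classify(train_x, train_y, test_x, sigma=0.1):
    """Minimal probabilistic neural network (Parzen-window) classifier.
    train_x: (n, 2) damage indices (e.g., impedance and guided-wave features);
    train_y: class labels; sigma: Gaussian smoothing parameter (assumed)."""
    train_x = np.asarray(train_x, float)
    train_y = np.asarray(train_y)
    classes = np.unique(train_y)
    preds = []
    for x in np.asarray(test_x, float):
        scores = []
        for c in classes:
            pts = train_x[train_y == c]
            d2 = np.sum((pts - x) ** 2, axis=1)
            # Average Gaussian kernel response over this class's training patterns.
            scores.append(np.exp(-d2 / (2 * sigma ** 2)).mean())
        preds.append(classes[int(np.argmax(scores))])
    return preds


# Hypothetical 2-D damage indices for three damage states.
X = [[0.10, 0.20], [0.12, 0.22], [0.50, 0.60], [0.52, 0.58], [0.90, 0.10], [0.88, 0.12]]
y = ["intact", "intact", "crack", "crack", "loose bolt", "loose bolt"]
print(pnn_classify(X, y, [[0.51, 0.59]]))   # -> ['crack']
```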

  6. Singing voice outcomes following singing voice therapy.

    Science.gov (United States)

    Dastolfo-Hromack, Christina; Thomas, Tracey L; Rosen, Clark A; Gartner-Schmidt, Jackie

    2016-11-01

    The objectives of this study were to describe singing voice therapy (SVT), describe referred patient characteristics, and document the outcomes of SVT. Retrospective. Records of patients receiving SVT between June 2008 and June 2013 were reviewed (n = 51). All diagnoses were included. Demographic information, number of SVT sessions, and symptom severity were retrieved from the medical record. Symptom severity was measured via the 10-item Singing Voice Handicap Index (SVHI-10). Treatment outcome was analyzed by diagnosis, history of previous training, and SVHI-10. SVHI-10 scores decreased following SVT (mean change = 11, 40% decrease) (P singing lessons (n = 10) also completed an average of three SVT sessions. Primary muscle tension dysphonia (MTD1) and benign vocal fold lesion (lesion) were the most common diagnoses. Most patients (60%) had previous vocal training. SVHI-10 decrease was not significantly different between MTD and lesion. This is the first outcome-based study of SVT in a disordered population. Diagnosis of MTD or lesion did not influence treatment outcomes. Duration of SVT was short (approximately three sessions). Voice care providers are encouraged to partner with a singing voice therapist to provide optimal care for the singing voice. This study supports the use of SVT as a tool for the treatment of singing voice disorders.

  7. A usability evaluation of an interactive application for halal products using optical character recognition and augmented reality technologies

    Science.gov (United States)

    Lam, Meng Chun; Nizam, Siti Soleha Muhammad; Arshad, Haslina; A'isyah Ahmad Shukri, Saidatul; Hashim, Nurhazarifah Che; Putra, Haekal Mozzia; Abidin, Rimaniza Zainal

    2017-10-01

    This article discusses the usability of an interactive application for halal products using Optical Character Recognition (OCR) and Augmented Reality (AR) technologies. Among the problems identified in this study is that consumers have little knowledge about the E-Code; therefore, users often have doubts about the halal status of a product. Nowadays, the integrity of halal status can be doubtful due to the actions of some irresponsible people spreading false information about a product. Therefore, the application developed in this study, which uses OCR and AR technology, will help users to identify the information content of a product by scanning the E-Code label and to know the halal status of the product by scanning the product's brand. In this application, the E-Code on the label of a product is scanned using OCR technology to display information about the E-Code. The product's brand is scanned using augmented reality technology to display the halal status of the product. The findings reveal that users are satisfied with this application and that it is useful and easy to use.
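
    The OCR step, reading E-codes from a photographed label and looking them up, can be outlined as follows. This is a hedged sketch: it assumes the Tesseract engine via pytesseract, and the small E_CODE_INFO table is a hypothetical stand-in for the application's real additive database.

```python
import re

import pytesseract
from PIL import Image

# Hypothetical lookup table; a real application would use a vetted database.
E_CODE_INFO = {
    "E100": "Curcumin (colouring)",
    "E120": "Cochineal / carminic acid (colouring)",
    "E471": "Mono- and diglycerides of fatty acids (emulsifier)",
}


def scan_e_codes(label_image_path):
    """Run OCR on a photographed ingredient label and return any recognised
    E-codes together with their descriptions."""
    text = pytesseract.image_to_string(Image.open(label_image_path))
    codes = set(re.findall(r"\bE\d{3,4}[a-z]?\b", text))
    return {c: E_CODE_INFO.get(c, "unknown additive") for c in codes}


# Example call (the file name is illustrative):
# print(scan_e_codes("label.jpg"))
```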

  8. A Voice Operated Tour Planning System for Autonomous Mobile Robots

    Directory of Open Access Journals (Sweden)

    Charles V. Smith III

    2010-06-01

    Full Text Available Control systems driven by voice recognition software have been implemented before but lacked the context-driven approach to generate relevant responses and actions. A partially voice-activated control system for mobile robotics is presented that allows an autonomous robot to interact with people and the environment in a meaningful way, while dynamically creating customized tours. Many existing control systems also require substantial training for voice application. The system proposed requires little to no training and is adaptable to chaotic environments. The traversable area is mapped once, and from that map a fully customized route is generated for the user.

  9. Face the voice

    DEFF Research Database (Denmark)

    Lønstrup, Ansa

    2014-01-01

    will be based on a reception aesthetic and phenomenological approach, the latter as presented by Don Ihde in his book Listening and Voice: Phenomenologies of Sound, and my analytical sketches will be related to theoretical statements concerning the understanding of voice and media (Cavarero, Dolar, LaBelle, Neumark). Finally, the article will discuss the specific artistic combination and our auditory experience of mediated human voices and sculpturally projected faces in an art museum context under the general conditions of the societal panophonia of disembodied and mediated voices, as promoted by Steven...

  10. GIS technology in regional recognition of the distribution pattern of multifloral honey: The chemical traits in Serbia

    Directory of Open Access Journals (Sweden)

    Radović D.I.

    2014-01-01

    Full Text Available GIS is a computer-based system to input, store, manipulate, analyze and output spatially referenced data. There is a huge range of GIS applications, which generally set out to fulfill: mapping, measurement, monitoring, modeling and management. In this study, GIS technology was used for the regional recognition of the origin and distribution patterns of the chemical traits of multifloral honey in Serbia. This included organizing and analyzing the spatial and attributive data of 164 honey samples collected from different regions of Serbia during the harvesting season of 2009. Multifloral honey was characterized in regards to mineral composition, sugar content and basic physicochemical properties. The kriging method of Geostatistical Analyst was used for interpolation to predict values of a sampled variable over the whole territory of Serbia. [Project of the Ministry of Science of the Republic of Serbia, No. III 46002, OI 172017 and 451-03-2372-IP Type 1/107]
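
    Ordinary kriging of the kind performed with Geostatistical Analyst can be expressed compactly. The sketch below is a minimal stand-in, assuming an exponential semivariogram with illustrative parameters and hypothetical sample coordinates; it is not the study's GIS workflow.

```python
import numpy as np


def semivariogram(h, sill=1.0, rng=50.0, nugget=0.0):
    """Exponential semivariogram model (parameters are illustrative)."""
    return nugget + sill * (1.0 - np.exp(-h / rng))


def ordinary_kriging(xy, z, target, **vario):
    """Predict the value at `target` from scattered samples (xy, z)
    using ordinary kriging with the semivariogram above."""
    xy = np.asarray(xy, float)
    z = np.asarray(z, float)
    target = np.asarray(target, float)
    n = len(z)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    # Kriging system: sample-to-sample semivariances plus the unbiasedness constraint.
    a = np.ones((n + 1, n + 1))
    a[:n, :n] = semivariogram(d, **vario)
    a[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = semivariogram(np.linalg.norm(xy - target, axis=1), **vario)
    w = np.linalg.solve(a, b)[:n]          # kriging weights (sum to 1)
    return float(w @ z)


# Hypothetical honey-sample coordinates (km) and a measured trait value.
samples = [(0, 0), (10, 5), (25, 30), (40, 12)]
values = [0.31, 0.35, 0.52, 0.44]
print(ordinary_kriging(samples, values, (20, 15)))
```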

  11. Named Entity Recognition for Spanish language and applications in technology forecasting

    Directory of Open Access Journals (Sweden)

    Raúl Gutiérrez

    2015-12-01

    Full Text Available Named Entity Recognition (NER) is a main task in Natural Language Processing. On the one hand, it supports the extraction of information from unstructured data. On the other hand, it can be addressed with a probabilistic graphical model that allows us to represent the conditional independence assumptions in sequential labelling. In this paper, we propose a discriminative graphical model using linear-chain Conditional Random Fields (CRFs). We present experiments based on the CoNLL-2002 shared task and the AnCora corpus, reported according to the following criteria: recall, precision and F-score. Our contributions in this work are the following: first, we tested our baseline on the CoNLL-2002 shared task, obtaining an 80% F1-measure, and a 59% F1-measure on the AnCora corpus. Finally, the application Vigtech allows us to identify information and patterns on the cancer topic; we discuss the results according to the model performance and the usefulness of the information for supporting the forecasting process.
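
    A linear-chain CRF baseline of the kind described can be assembled with the sklearn-crfsuite package. The sketch below is illustrative only: the feature template, hyperparameters and the two toy Spanish sentences are assumptions, not the paper's CoNLL-2002/AnCora setup.

```python
import sklearn_crfsuite
from sklearn_crfsuite import metrics


def token_features(sent, i):
    """Simple per-token features for linear-chain CRF sequence labelling."""
    w = sent[i]
    feats = {
        "word.lower": w.lower(),
        "word.istitle": w.istitle(),
        "word.isupper": w.isupper(),
        "suffix3": w[-3:],
        "BOS": i == 0,
        "EOS": i == len(sent) - 1,
    }
    if i > 0:
        feats["prev.lower"] = sent[i - 1].lower()
    return feats


# Tiny hypothetical Spanish training data (not the CoNLL-2002 corpus).
sents = [["Juan", "vive", "en", "Madrid", "."],
         ["Telefónica", "compró", "Tuenti", "."]]
labels = [["B-PER", "O", "O", "B-LOC", "O"],
          ["B-ORG", "O", "B-ORG", "O"]]

X = [[token_features(s, i) for i in range(len(s))] for s in sents]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X, labels)

pred = crf.predict(X)
print(metrics.flat_f1_score(labels, pred, average="weighted",
                            labels=["B-PER", "B-LOC", "B-ORG"]))
```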

  12. Preventing Wine Counterfeiting by Individual Cork Stopper Recognition Using Image Processing Technologies

    Directory of Open Access Journals (Sweden)

    Valter Costa

    2018-03-01

    Full Text Available Wine counterfeiting is a major problem worldwide. Within this context, an approach to the problem of discerning original wine bottles from forged ones is the use of natural features present in the product, object and/or material (using it “as is”). The proposed application uses the cork stopper as a unique fingerprint, combined with state-of-the-art image processing techniques to achieve individual object recognition and smartphones as the authentication equipment. The anti-counterfeiting scheme is divided into two phases: an enrollment phase, where every bottle is registered in a database using a photo of its cork stopper inside the bottle; and a verification phase, where an end-user/retailer captures a photo of the cork stopper using a regular smartphone, compares the photo with the previously-stored one and retrieves it if the wine bottle was previously registered. To evaluate the performance of the proposed application, two datasets of natural/agglomerate cork stoppers were built, totaling 1000 photos. The worst-case results show a 100% precision ratio, an accuracy of 99.94% and a recall of 94.00%, using different smartphones. The perfect score in precision is a promising result, proving that this system can be applied to the prevention of wine counterfeiting and consumer/retailer security when purchasing a wine bottle.
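
    The paper does not spell out which image processing techniques it uses, so the sketch below shows one plausible realisation of the verification phase: ORB keypoints matched by brute-force Hamming distance in OpenCV. File names, thresholds and the decision rule are assumptions for illustration only.

```python
# Minimal sketch: match an enrollment photo of a cork stopper against a
# verification photo using ORB keypoints and brute-force Hamming matching.
import cv2

enrolled = cv2.imread("cork_enrolled.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder file
query = cv2.imread("cork_query.jpg", cv2.IMREAD_GRAYSCALE)        # placeholder file

orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(enrolled, None)
kp2, des2 = orb.detectAndCompute(query, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

# Simple decision rule: accept when enough low-distance matches survive
good = [m for m in matches if m.distance < 40]
print("match" if len(good) > 50 else "no match", len(good))
```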

  13. Clinical Voices - an update

    DEFF Research Database (Denmark)

    Fusaroli, Riccardo; Weed, Ethan

    Anomalous aspects of speech and voice, including pitch, fluency, and voice quality, are reported to characterise many mental disorders. However, it has proven difficult to quantify and explain this oddness of speech by employing traditional statistical methods. In this talk we will show how...

  14. MEASNET: Quality in Science and Technology by means of Inter-comparison and Mutual Recognition

    International Nuclear Information System (INIS)

    Cuerva, A.

    1998-01-01

    This work presents the interesting relation that exists between Quality Systems, specifically the EN45001 standard, and scientific and technological disciplines. It is described how the general approach of Quality, as a management and continuous-improvement tool, also applies to groups whose activity is purely non-systematic and innovative. (Author) 4 refs

  15. EXPERIMENTAL STUDY OF FIRMWARE FOR INPUT AND EXTRACTION OF USER’S VOICE SIGNAL IN VOICE AUTHENTICATION SYSTEMS

    Directory of Open Access Journals (Sweden)

    O. N. Faizulaieva

    2014-09-01

    Full Text Available The scientific task of improving the signal-to-noise ratio for the user’s voice signal in computer systems and networks during user voice authentication is considered. The object of study is the process of input and extraction of the voice signal of the authentication system user in computer systems and networks. Methods and means for input and extraction of the voice signal against a background of external interference signals are investigated. Ways of improving the quality of the user’s voice signal in voice authentication systems are investigated experimentally. Firmware for an experimental unit for input and extraction of the user’s voice signal under external interference is considered. As modern computer devices, including mobile ones, have two-channel audio cards, two microphones are used for voice signal input. The distance between the acoustic sensors is 20 mm, which forms a single main lobe of the microphone array directivity pattern in the desired area of voice signal registration (from 100 Hz to 8 kHz). According to the results of the experimental studies, the use of the directional properties of the proposed microphone array and space-time processing of the recorded signals with constant and adaptive weighting factors has made it possible to reduce considerably the influence of interference signals. The results of the firmware experimental studies for input and extraction of the user’s voice signal under external interference are shown. The proposed solutions make it possible to improve the signal-to-noise ratio of the recorded useful signals up to 20 dB under the influence of external interference signals in the frequency range from 4 to 8 kHz. The results may be useful to specialists working in the field of voice recognition and speaker discrimination.
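
    A heavily simplified stand-in for the described space-time processing is fixed delay-and-sum combination of the two channels. The sketch below assumes a 20 mm spacing and a 16 kHz sampling rate; the adaptive weighting used in the paper is not reproduced, and the signals are synthetic placeholders.

```python
# Minimal sketch: delay-and-sum combination of two microphone channels
# spaced 20 mm apart, steering the array toward a source at a given angle.
import numpy as np

FS = 16000          # sampling rate, Hz
D = 0.02            # microphone spacing, m
C = 343.0           # speed of sound, m/s

def delay_and_sum(ch1, ch2, steer_deg=0.0, fs=FS):
    """Align channel 2 to channel 1 for a source at steer_deg and average."""
    tau = D * np.sin(np.deg2rad(steer_deg)) / C   # inter-microphone delay, s
    shift = int(round(tau * fs))                   # delay in whole samples
    ch2_aligned = np.roll(ch2, -shift)
    return 0.5 * (ch1 + ch2_aligned)

# Toy two-channel recording: a tone plus independent noise on each channel
t = np.arange(FS) / FS
speech = np.sin(2 * np.pi * 440 * t)
ch1 = speech + 0.5 * np.random.randn(FS)
ch2 = speech + 0.5 * np.random.randn(FS)

out = delay_and_sum(ch1, ch2, steer_deg=0.0)
print("single-channel SNR:", np.var(speech) / np.var(ch1 - speech))
print("combined SNR:      ", np.var(speech) / np.var(out - speech))
```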

  16. Onset and Maturation of Fetal Heart Rate Response to the Mother's Voice over Late Gestation

    Science.gov (United States)

    Kisilevsky, Barbara S.; Hains, Sylvia M. J.

    2011-01-01

    Background: Term fetuses discriminate their mother's voice from a female stranger's, suggesting recognition/learning of some property of her voice. Identification of the onset and maturation of the response would increase our understanding of the influence of environmental sounds on the development of sensory abilities and identify the period when…

  17. The effect of voice onset time differences on lexical access in Dutch

    NARCIS (Netherlands)

    Alphen, P.M. van; McQueen, J.M.

    2006-01-01

    Effects on spoken-word recognition of prevoicing differences in Dutch initial voiced plosives were examined. In 2 cross-modal identity-priming experiments, participants heard prime words and nonwords beginning with voiced plosives with 12, 6, or 0 periods of prevoicing or matched items beginning

  18. Culture/Religion and Identity: Social Justice versus Recognition

    Science.gov (United States)

    Bekerman, Zvi

    2012-01-01

    Recognition is the main word attached to multicultural perspectives. The multicultural call for recognition, the one calling for the recognition of cultural minorities and identities, the one now voiced by liberal states all over and also in Israel was a more difficult one. It took the author some time to realize that calling for the recognition…

  19. SURVEY OF BIOMETRIC SYSTEMS USING IRIS RECOGNITION

    OpenAIRE

    S.PON SANGEETHA; DR.M.KARNAN

    2014-01-01

    Security plays an important role in any type of organization in today’s life. Iris recognition is one of the leading automatic biometric systems in the area of security, used to identify individual persons. Biometric systems include fingerprints, facial features, voice recognition, hand geometry, handwriting, the eye retina and, the most secure one and the one presented in this paper, iris recognition. Biometric systems have become very popular in security systems because it is not possi...

  20. Voice following radiotherapy

    International Nuclear Information System (INIS)

    Stoicheff, M.L.

    1975-01-01

    This study was undertaken to provide information on the voice of patients following radiotherapy for glottic cancer. Part I presents findings from questionnaires returned by 227 of 235 patients successfully irradiated for glottic cancer from 1960 through 1971. Part II presents preliminary findings on the speaking fundamental frequencies of 22 irradiated patients. Normal to near-normal voice was reported by 83 percent of the 227 patients; however, 80 percent did indicate persisting vocal difficulties such as fatiguing of voice with much usage, inability to sing, reduced loudness, hoarse voice quality and inability to shout. Amount of talking during treatments appeared to affect length of time for voice to recover following treatments in those cases where it took from nine to 26 weeks; also, with increasing years since treatment, patients rated their voices more favorably. Smoking habits following treatments improved significantly with only 27 percent smoking heavily as compared with 65 percent prior to radiation therapy. No correlation was found between smoking (during or after treatments) and vocal ratings or between smoking and length of time for voice to recover. There was no relationship found between reported vocal ratings and stage of the disease

  1. Voice Savers for Music Teachers

    Science.gov (United States)

    Cookman, Starr

    2012-01-01

    Music teachers are in a class all their own when it comes to voice use. These elite vocal athletes require stamina, strength, and flexibility from their voices day in, day out for hours at a time. Voice rehabilitation clinics and research show that music education ranks high among the professionals most commonly affected by voice problems.…

  2. Automated technology to speed recognition of signs of illness in older adults.

    Science.gov (United States)

    Rantz, Marilyn J; Skubic, Marjorie; Koopman, Richelle J; Alexander, Gregory L; Phillips, Lorraine; Musterman, Katy; Back, Jessica; Aud, Myra A; Galambos, Colleen; Guevara, Rainer Dane; Miller, Steven J

    2012-04-01

    Our team has developed a technological innovation that detects changes in health status that indicate impending acute illness or exacerbation of chronic illness before usual assessment methods or self-reports of illness. We successfully used this information in a 1-year prospective study to alert health care providers so they could readily assess the situation and initiate early treatment to improve functional independence. Intervention participants showed significant improvements (as compared with the control group) for the Short Physical Performance Battery gait speed score at Quarter 3 (p = 0.03), hand grip-left at Quarter 2 (p = 0.02), hand grip-right at Quarter 4 (p = 0.05), and the GAITRite functional ambulation profile score at Quarter 2 (p = 0.05). Technological methods such as these could be widely adopted in older adult housing, long-term care settings, and in private homes where older adults wish to remain independent for as long as possible. Copyright 2012, SLACK Incorporated.

  3. From Empress to Emmen: Canadian-developed coil tubing technology gains international recognition

    Energy Technology Data Exchange (ETDEWEB)

    Ross, E.

    2004-03-01

    The evolution of the Canadian-developed Drilling Using Coiled Tubing (DUCT) technology is traced from its humble beginnings in the 1970s when it was used as small-diameter tubing for gas lift wells, evolving over time to larger tubing and complex job designs involving sophisticated modelling and planning. Today the technology can be found in underbalanced drilling projects as far east as offshore Holland. Underbalanced drilling can help operators achieve optimum well production by reducing the risk of reservoir damage from the influx of fluids, chemicals and formation solids into a porous formation. Using coiled tubing can ensure a consistent bottomhole pressure with no forced surging on the reservoir because there are no connections to be made and the circulation is continuous. Reservoir temperature and hydrocarbons continue to be the great challenges to the bottomhole assembly (BHA) and the positive displacement motor (PDM), but suppliers have been able to develop PDMs that can handle higher temperatures in the presence of hydrocarbons. With regard to future applications, coiled tubing drilling also appears to have a growing market for offshore re-entries using slim-hole drilling, due primarily to its lower transportation cost. 1 fig.

  4. Gender recognition from vocal source

    Science.gov (United States)

    Sorokin, V. N.; Makarov, I. S.

    2008-07-01

    Efficiency of automatic recognition of male and female voices based on solving the inverse problem for glottis area dynamics and for waveform of the glottal airflow volume velocity pulse is studied. The inverse problem is regularized through the use of analytical models of the voice excitation pulse and of the dynamics of the glottis area, as well as the model of one-dimensional glottal airflow. Parameters of these models and spectral parameters of the volume velocity pulse are considered. The following parameters are found to be most promising: the instant of maximum glottis area, the maximum derivative of the area, the slope of the spectrum of the glottal airflow volume velocity pulse, the amplitude ratios of harmonics of this spectrum, and the pitch. On the plane of the first two main components in the space of these parameters, an almost twofold decrease in the classification error relative to that for the pitch alone is attained. The male voice recognition probability is found to be 94.7%, and the female voice recognition probability is 95.9%.
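
    The classification on the plane of the first two principal components can be sketched as follows, assuming the glottal-source parameters have already been extracted into a feature matrix; the data, labels and classifier choice below are placeholders, not the authors' procedure.

```python
# Minimal sketch: project glottal-source parameters (pitch, spectral slope,
# instant of maximum glottis area, harmonic amplitude ratios, ...) onto the
# first two principal components and classify gender. Data are placeholders.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X = np.random.randn(200, 6)            # 6 source parameters per speaker
y = np.random.randint(0, 2, 200)       # 0 = male, 1 = female (placeholder)

clf = make_pipeline(StandardScaler(), PCA(n_components=2), LogisticRegression())
clf.fit(X, y)
print("training accuracy:", clf.score(X, y))
```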

  5. Voice-to-Phoneme Conversion Algorithms for Voice-Tag Applications in Embedded Platforms

    Directory of Open Access Journals (Sweden)

    Yan Ming Cheng

    2008-08-01

    Full Text Available We describe two voice-to-phoneme conversion algorithms for speaker-independent voice-tag creation specifically targeted at applications on embedded platforms. These algorithms (batch mode and sequential) are compared in speech recognition experiments where they are first applied in a same-language context in which both acoustic model training and voice-tag creation and application are performed on the same language. Then, their performance is tested in a cross-language setting where the acoustic models are trained on a particular source language while the voice-tags are created and applied on a different target language. In the same-language environment, both algorithms either perform comparably to or significantly better than the baseline where utterances are manually transcribed by a phonetician. In the cross-language context, the voice-tag performances vary depending on the source-target language pair, with the variation reflecting predicted phonological similarity between the source and target languages. Among the most similar languages, performance nears that of the native-trained models and surpasses the native reference baseline.

  6. Speech pattern recognition for forensic acoustic purposes

    OpenAIRE

    Herrera Martínez, Marcelo; Aldana Blanco, Andrea Lorena; Guzmán Palacios, Ana María

    2014-01-01

    The present paper describes the development of a software package for the analysis of acoustic voice parameters (APAVOIX), which can be used for forensic acoustic purposes, based on speaker recognition and identification. This software makes it possible to observe clearly the parameters that are sufficient and necessary when performing a comparison between two voice signals, the suspect one and the original one. These parameters are used according to the classic method, generally used by state entit...

  7. Automatic Speech Acquisition and Recognition for Spacesuit Audio Systems

    Science.gov (United States)

    Ye, Sherry

    2015-01-01

    NASA has a widely recognized but unmet need for novel human-machine interface technologies that can facilitate communication during astronaut extravehicular activities (EVAs), when loud noises and strong reverberations inside spacesuits make communication challenging. WeVoice, Inc., has developed a multichannel signal-processing method for speech acquisition in noisy and reverberant environments that enables automatic speech recognition (ASR) technology inside spacesuits. The technology reduces noise by exploiting differences between the statistical nature of signals (i.e., speech) and noise that exists in the spatial and temporal domains. As a result, ASR accuracy can be improved to the level at which crewmembers will find the speech interface useful. System components and features include beam forming/multichannel noise reduction, single-channel noise reduction, speech feature extraction, feature transformation and normalization, feature compression, and ASR decoding. Arithmetic complexity models were developed and will help designers of real-time ASR systems select proper tasks when confronted with constraints in computational resources. In Phase I of the project, WeVoice validated the technology. The company further refined the technology in Phase II and developed a prototype for testing and use by suited astronauts.
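
    Two of the listed components, speech feature extraction and feature transformation/normalization, can be illustrated with a short librosa sketch; the file name, MFCC settings and normalization scheme are assumptions and do not represent WeVoice's implementation.

```python
# Minimal sketch of two listed stages: MFCC feature extraction and
# per-utterance cepstral mean-variance normalization. Illustrative only.
import numpy as np
import librosa

y, sr = librosa.load("utterance.wav", sr=16000)       # placeholder file
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)    # shape (13, n_frames)

# Cepstral mean and variance normalization over the utterance
mfcc_norm = (mfcc - mfcc.mean(axis=1, keepdims=True)) / (
    mfcc.std(axis=1, keepdims=True) + 1e-8)
print(mfcc_norm.shape)
```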

  8. Developing Student Voices on the Internet.

    Science.gov (United States)

    Dresang, Eliza T.

    1997-01-01

    Books and online discussion groups encourage youth to develop strong narrative voices. Includes an annotated bibliography of books and Internet sites dealing with discovering the self and others; exploring race, culture, archeology, technology, war, poverty, gender and urban problems; creating and critiquing stories; and publishing industry…

  9. Voice - How humans communicate?

    Science.gov (United States)

    Tiwari, Manjul; Tiwari, Maneesha

    2012-01-01

    Voices are important things for humans. They are the medium through which we do a lot of communicating with the outside world: our ideas, of course, and also our emotions and our personality. The voice is the very emblem of the speaker, indelibly woven into the fabric of speech. In this sense, each of our utterances of spoken language carries not only its own message but is also, through accent, tone of voice and habitual voice quality, an audible declaration of our membership of particular social and regional groups, of our individual physical and psychological identity, and of our momentary mood. Voices are also one of the media through which we (successfully, most of the time) recognize other humans who are important to us: members of our family, media personalities, our friends, and enemies. Although evidence from DNA analysis is potentially vastly more eloquent in its power than evidence from voices, DNA cannot talk. It cannot be recorded planning, carrying out or confessing to a crime. It cannot be so apparently directly incriminating. As will quickly become evident, voices are extremely complex things, and some of the inherent limitations of the forensic-phonetic method are in part a consequence of the interaction between their complexity and the real world in which they are used. It is one of the aims of this article to explain how this comes about. This subject has unsolved questions, but there is no direct way to present the information that is necessary to understand how voices can be related, or not, to their owners.

  10. [Development of image quality assurance support system using image recognition technology in radiography in lacked images of chest and abdomen].

    Science.gov (United States)

    Shibuya, Toru; Kato, Kyouichi; Eshima, Hidekazu; Sumi, Shinichirou; Kubo, Tadashi; Ishida, Hideki; Nakazawa, Yasuo

    2012-01-01

    In order to provide precise radiography for diagnosis, it is required that we avoid radiographs with defects by performing sufficient evaluation. Conventionally, evaluation was performed only by observation by a radiological technologist (RT). The evaluation support system was developed to provide high quality assurance without depending on RT observation only. The evaluation support system, called the Image Quality Assurance Support System (IQASS), is characterized by the use of "image recognition technology" for diagnostic radiography of the chest and abdomen areas. The technique of the system was evaluated in this study. Of the 259 samples of posterior-anterior (AP) chest, lateral chest, and upright abdominal x-rays, the sensitivity and specificity were 93.1% and 91.8% in the chest AP, 93.3% and 93.6% in the chest lateral, and 95.0% and 93.8% in the upright abdominal x-rays. In the light of these results, it is suggested that IQASS could be applied to practical usage for the RT.

  11. Connections between voice ergonomic risk factors and voice symptoms, voice handicap, and respiratory tract diseases.

    Science.gov (United States)

    Rantala, Leena M; Hakala, Suvi J; Holmqvist, Sofia; Sala, Eeva

    2012-11-01

    The aim of the study was to investigate the connections between voice ergonomic risk factors found in classrooms and voice-related problems in teachers. Voice ergonomic assessment was performed in 39 classrooms in 14 elementary schools by means of a Voice Ergonomic Assessment in Work Environment--Handbook and Checklist. The voice ergonomic risk factors assessed included working culture, noise, indoor air quality, working posture, stress, and access to a sound amplifier. Teachers from the above-mentioned classrooms reported their voice symptoms, respiratory tract diseases, and completed a Voice Handicap Index (VHI). The more voice ergonomic risk factors found in the classroom the higher were the teachers' total scores on voice symptoms and VHI. Stress was the factor that correlated most strongly with voice symptoms. Poor indoor air quality increased the occurrence of laryngitis. Voice ergonomics were poor in the classrooms studied and voice ergonomic risk factors affected the voice. It is important to convey information on voice ergonomics to education administrators and those responsible for school planning and taking care of school buildings. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  12. SPEECH EMOTION RECOGNITION USING MODIFIED QUADRATIC DISCRIMINATION FUNCTION

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    Quadratic Discrimination Function (QDF) is commonly used in speech emotion recognition, which proceeds on the premise that the input data follow a normal distribution. In this paper, we propose a transformation to normalize the emotional features, then derive a Modified QDF (MQDF) for speech emotion recognition. Features based on prosody and voice quality are extracted, and a Principal Component Analysis Neural Network (PCANN) is used to reduce the dimension of the feature vectors. The results show that voice quality features are an effective supplement for recognition, and the method in this paper could improve the recognition ratio effectively.
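
    A rough stand-in for the described pipeline, using plain PCA instead of the PCANN and an ordinary quadratic discriminant in place of the modified QDF, might look as follows; the feature matrix and class labels are random placeholders.

```python
# Rough stand-in: normalize prosody/voice-quality features, reduce their
# dimensionality, then classify emotions with a quadratic discriminant.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

X = np.random.randn(300, 40)           # prosody + voice-quality features
y = np.random.randint(0, 5, 300)       # 5 emotion classes (placeholder)

model = make_pipeline(StandardScaler(), PCA(n_components=10),
                      QuadraticDiscriminantAnalysis())
model.fit(X, y)
print("training accuracy:", model.score(X, y))
```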

  13. Automatic Speaker Recognition for Mobile Forensic Applications

    Directory of Open Access Journals (Sweden)

    Mohammed Algabri

    2017-01-01

    Full Text Available Presently, lawyers, law enforcement agencies, and judges in courts use speech and other biometric features to recognize suspects. In general, speaker recognition is used for discriminating people based on their voices. The process of determining whether a suspected speaker is the source of a trace is called forensic speaker recognition. In such applications, the voice samples are most probably noisy, the recording sessions might not match each other, the sessions might not contain sufficient recording for recognition purposes, and the suspect voices are recorded through a mobile channel. The identification of a person through his voice within a forensic quality context is challenging. In this paper, we propose a method for forensic speaker recognition for the Arabic language; the King Saud University Arabic Speech Database is used for obtaining experimental results. The advantage of this database is that each speaker’s voice is recorded in both clean and noisy environments, through a microphone and a mobile channel. This diversity facilitates its usage in forensic experimentation. Mel-Frequency Cepstral Coefficients are used for feature extraction and the Gaussian mixture model-universal background model is used for speaker modeling. Our approach has shown low equal error rates (EER) within noisy environments and with very short test samples.
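
    The MFCC plus GMM-UBM scoring idea can be sketched with scikit-learn's GaussianMixture. In the sketch the speaker model is trained from scratch rather than MAP-adapted from the UBM, which is a simplification of the usual procedure; all frame matrices are placeholders.

```python
# Compact sketch of GMM-UBM scoring over MFCC frames. A real system
# MAP-adapts the speaker model from the UBM; here it is trained directly.
import numpy as np
from sklearn.mixture import GaussianMixture

# Placeholder MFCC frame matrices (n_frames x n_coefficients)
ubm_frames = np.random.randn(5000, 13)       # pooled background speakers
spk_frames = np.random.randn(800, 13)        # enrollment data for the suspect
test_frames = np.random.randn(300, 13)       # questioned recording

ubm = GaussianMixture(n_components=64, covariance_type="diag").fit(ubm_frames)
spk = GaussianMixture(n_components=64, covariance_type="diag").fit(spk_frames)

# Average log-likelihood ratio: positive values favour the suspect hypothesis
llr = spk.score(test_frames) - ubm.score(test_frames)
print("log-likelihood ratio per frame:", llr)
```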

  14. [Information technology in learning sign language].

    Science.gov (United States)

    Hernández, Cesar; Pulido, Jose L; Arias, Jorge E

    2015-01-01

    To develop a technological tool that improves the initial learning of sign language in hearing-impaired children. The research was conducted in three phases: requirements gathering, design and development of the proposed device, and validation and evaluation of the device. Through the use of information technology and with the advice of special education professionals, we were able to develop an electronic device that facilitates the learning of sign language in deaf children. It consists mainly of a graphic touch screen, a voice synthesizer, and a voice recognition system. Validation was performed with the deaf children of the Filadelfia School in the city of Bogotá. A learning methodology was established that improves learning times through a small, portable, lightweight, and educational technological prototype. Tests showed the effectiveness of this prototype, achieving a 32% reduction in the initial learning time for sign language in deaf children.

  15. Voice loops as coordination aids in space shuttle mission control.

    Science.gov (United States)

    Patterson, E S; Watts-Perotti, J; Woods, D D

    1999-01-01

    Voice loops, an auditory groupware technology, are essential coordination support tools for experienced practitioners in domains such as air traffic management, aircraft carrier operations and space shuttle mission control. They support synchronous communication on multiple channels among groups of people who are spatially distributed. In this paper, we suggest reasons for why the voice loop system is a successful medium for supporting coordination in space shuttle mission control based on over 130 hours of direct observation. Voice loops allow practitioners to listen in on relevant communications without disrupting their own activities or the activities of others. In addition, the voice loop system is structured around the mission control organization, and therefore directly supports the demands of the domain. By understanding how voice loops meet the particular demands of the mission control environment, insight can be gained for the design of groupware tools to support cooperative activity in other event-driven domains.

  16. Technology

    Directory of Open Access Journals (Sweden)

    Xu Jing

    2016-01-01

    Full Text Available The traditional answer card reading method uses OMR (Optical Mark Reader) equipment, which most commonly requires special-purpose answer cards, is less versatile and has a high cost. Aiming at these problems, a method for answer card identification based on pattern recognition is proposed. A Line Segment Detector is used to detect the tilt of the image, any tilt is corrected by rotating the image, and the answers on the answer sheet are then located and detected. Pattern recognition technology enables automatic reading with high accuracy and faster detection
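
    The tilt-detection and rotation-correction step might be sketched as follows with OpenCV; HoughLinesP is used here as a stand-in for the Line Segment Detector, which is not available in every OpenCV build, and the file name and thresholds are assumptions.

```python
# Sketch of tilt detection and rotation correction for a scanned answer sheet.
import cv2
import numpy as np

img = cv2.imread("answer_sheet.png", cv2.IMREAD_GRAYSCALE)   # placeholder file
edges = cv2.Canny(img, 50, 150)
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=100,
                        minLineLength=200, maxLineGap=10)

# Median angle of near-horizontal segments estimates the sheet tilt
angles = []
for x1, y1, x2, y2 in lines[:, 0]:
    ang = np.degrees(np.arctan2(y2 - y1, x2 - x1))
    if abs(ang) < 45:                      # keep roughly horizontal lines
        angles.append(ang)
tilt = float(np.median(angles))

h, w = img.shape
M = cv2.getRotationMatrix2D((w / 2, h / 2), tilt, 1.0)
corrected = cv2.warpAffine(img, M, (w, h), flags=cv2.INTER_LINEAR,
                           borderValue=255)
```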

  17. Automatic speech recognition for radiological reporting

    International Nuclear Information System (INIS)

    Vidal, B.

    1991-01-01

    Large vocabulary speech recognition, its techniques and its software and hardware technology are being developed, aimed at providing the office user with a tool that could significantly improve both the quantity and quality of his work: the dictation machine, which allows memos and documents to be input using voice and a microphone instead of fingers and a keyboard. The IBM Rome Science Center, together with the IBM Research Division, has built a prototype recognizer that accepts sentences in natural language from a 20,000-word Italian vocabulary. The unit runs on a personal computer equipped with special hardware capable of providing all the necessary computing power. The first laboratory experiments yielded very interesting results and pointed out system characteristics that make its use possible in operational environments. To this purpose, the dictation of medical reports was considered a suitable application. In cooperation with the 2nd Radiology Department of S. Maria della Misericordia Hospital (Udine, Italy), the system was tried by radiology department doctors during their everyday work. The doctors were able to dictate their reports directly to the unit. The text appeared immediately on the screen, and any errors could be corrected either by voice or by using the keyboard. At the end of report dictation, the doctors could both print and archive the text. The report could also be forwarded to the hospital information system, when the latter was available. Our results have been very encouraging: the system proved to be robust, simple to use, and accurate (over 95% average recognition rate). The experiment was valuable for suggestions and comments, and its results are useful for system evolution towards improved system management and efficiency

  18. Voice Therapy Practices and Techniques: A Survey of Voice Clinicians.

    Science.gov (United States)

    Mueller, Peter B.; Larson, George W.

    1992-01-01

    Eighty-three voice disorder therapists' ratings of statements regarding voice therapy practices indicated that vocal nodules are the most frequent disorder treated; vocal abuse and hard glottal attack elimination, counseling, and relaxation were preferred treatment approaches; and voice therapy is more effective with adults than with children.…

  19. Effects of Medications on Voice

    Science.gov (United States)

    ... replacement therapy post-menopause may have a variable effect. An inadequate level of thyroid replacement medication in ...

  20. Hearing Voices and Seeing Things

    Science.gov (United States)

    ... Hearing Voices and Seeing Things No. 102; Updated October ... delusions (a fixed, false, and often bizarre belief). Hearing voices or seeing things that are not there ...

  1. Design and realization of intelligent tourism service system based on voice interaction

    Science.gov (United States)

    Hu, Lei-di; Long, Yi; Qian, Cheng-yang; Zhang, Ling; Lv, Guo-nian

    2008-10-01

    Voice technology is one of the important means of improving the intelligence and humanization of a tourism service system. Combining voice technology, the paper concentrates on application needs and the composition of the system to present an overall intelligent tourism service system framework consisting of a presentation layer, a Web services layer, and a tourism application service layer. On this basis, the paper further elaborates the implementation of the system and its key technologies, including intelligent voice interaction technology, seamless integration of multiple data sources, location-aware guide services technology, and tourism safety control technology. Finally, according to the situation of Nanjing tourism, a prototype of the Tourism Services System is realized.

  2. Using Hierarchical Time Series Clustering Algorithm and Wavelet Classifier for Biometric Voice Classification

    Directory of Open Access Journals (Sweden)

    Simon Fong

    2012-01-01

    Full Text Available Voice biometrics has a long history in biosecurity applications such as verification and identification based on characteristics of the human voice. The other application, voice classification, which plays an important role in grouping unlabelled voice samples, has however not been widely studied in research. Lately, voice classification has been found useful in phone monitoring, classifying speakers’ gender, ethnicity and emotion states, and so forth. In this paper, a collection of computational algorithms is proposed to support voice classification; the algorithms are a combination of hierarchical clustering, dynamic time warping, discrete wavelet transform, and decision tree. The proposed algorithms are relatively more transparent and interpretable than the existing ones, though many techniques such as Artificial Neural Networks, Support Vector Machines, and Hidden Markov Models (which inherently function like a black box) have been applied for voice verification and voice identification. Two datasets, one generated synthetically and the other empirically collected from a past voice recognition experiment, are used to verify and demonstrate the effectiveness of our proposed voice classification algorithm.
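
    The unsupervised part of the pipeline, dynamic time warping distances fed to hierarchical clustering, can be sketched as below; the wavelet features and decision tree of the full method are omitted, and the toy sequences stand in for real voice feature contours.

```python
# Compact sketch: DTW distances between feature sequences, then hierarchical
# clustering. The paper's full pipeline adds wavelet features and a decision tree.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def dtw(a, b):
    """Plain dynamic time warping distance between two 1-D series."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Toy unlabelled "voice" sequences (e.g. pitch contours) of varying length
series = [np.sin(np.linspace(0, 4, 50)) + 0.1 * np.random.randn(50),
          np.sin(np.linspace(0, 4, 60)) + 0.1 * np.random.randn(60),
          np.cos(np.linspace(0, 6, 55)) + 0.1 * np.random.randn(55)]

n = len(series)
dist = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        dist[i, j] = dist[j, i] = dtw(series[i], series[j])

Z = linkage(squareform(dist), method="average")
print(fcluster(Z, t=2, criterion="maxclust"))   # cluster label per sample
```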

  3. Experiences with voice to design ceramics

    DEFF Research Database (Denmark)

    Hansen, Flemming Tvede; Jensen, Kristoffer

    2014-01-01

    This article presents SoundShaping, a system to create ceramics from the human voice, and thus how digital technology makes new possibilities in ceramic craft. The article is about how experiential knowledge that the craftsman gains in a direct physical and tactile interaction with a responding material can be transformed and utilised in the use of digital technologies. SoundShaping is based on a generic audio feature extraction system and the principal component analysis to ensure that the pertinent information in the voice is used. Moreover, a 3D shape is created using simple geometric rules. The shape is output to a 3D printer to make ceramic results. The system demonstrates the close connection between digital technology and craft practice. Several experiments and reflections demonstrate the validity of this work.

  4. Experiences with Voice to Design Ceramics

    DEFF Research Database (Denmark)

    Hansen, Flemming Tvede; Jensen, Kristoffer

    2013-01-01

    This article presents SoundShaping, a system to create ceramics from the human voice, and thus how digital technology makes new possibilities in ceramic craft. The article is about how experiential knowledge that the craftsman gains in a direct physical and tactile interaction with a responding material can be transformed and utilized in the use of digital technologies. SoundShaping is based on a generic audio feature extraction system and the principal component analysis to ensure that the pertinent information in the voice is used. Moreover, a 3D shape is created using simple geometric rules. The shape is output to a 3D printer to make ceramic results. The system demonstrates the close connection between digital technology and craft practice. Several experiments and reflections demonstrate the validity of this work.

  5. Using the Voice to Design Ceramics

    DEFF Research Database (Denmark)

    Hansen, Flemming Tvede; Jensen, Kristoffer

    2011-01-01

    Digital technology makes new possibilities in ceramic craft. This project is about how experiential knowledge that the craftsman gains in a direct physical and tactile interaction with a responding material can be transformed and utilized in the use of digital technologies. The project presents SoundShaping, a system to create ceramics from the human voice. Based on a generic audio feature extraction system, and the principal component analysis to ensure that the pertinent information in the voice is used, a 3D shape is created using simple geometric rules. This shape is output to a 3D printer to make ceramic results. The system demonstrates the close connection between digital technology and craft practice.
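
    The described chain, audio features reduced by principal component analysis and mapped through simple geometric rules to a 3D form, might be sketched as follows; the mapping rules, feature choice and output format here are invented for illustration and are not SoundShaping's actual rules.

```python
# Sketch of the chain: audio features -> PCA -> parameters of a simple
# lathe-like solid of revolution, one ring of vertices per analysis frame.
import numpy as np
import librosa
from sklearn.decomposition import PCA

y, sr = librosa.load("voice_take.wav", sr=22050)        # placeholder recording
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T    # one feature row per frame

comps = PCA(n_components=2).fit_transform(mfcc)         # (n_frames, 2)

# Invented mapping: component 1 -> radius profile, component 2 -> twist angle
radius = 1.0 + 0.5 * (comps[:, 0] - comps[:, 0].min()) / np.ptp(comps[:, 0])
twist = np.cumsum(0.05 * comps[:, 1])
heights = np.linspace(0.0, 1.0, len(radius))

theta = np.linspace(0, 2 * np.pi, 32, endpoint=False)
rings = [np.c_[r * np.cos(theta + t), r * np.sin(theta + t),
               np.full_like(theta, h)]
         for r, t, h in zip(radius, twist, heights)]
vertices = np.vstack(rings)                              # ready for meshing/export
print(vertices.shape)
```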

  6. Interprofessional, simulation-based technology-enhanced learning to improve physical health care in psychiatry: The recognition and assessment of medical problems in psychiatric settings course.

    Science.gov (United States)

    Akroyd, Mike; Jordan, Gary; Rowlands, Paul

    2016-06-01

    People with serious mental illness have reduced life expectancy compared with a control population, much of which is accounted for by significant physical comorbidity. Frontline clinical staff in mental health often lack confidence in recognition, assessment and management of such 'medical' problems. Simulation provides one way for staff to practise these skills in a safe setting. We produced a multidisciplinary simulation course around recognition and assessment of medical problems in psychiatric settings. We describe an audit of strategic and design aspects of the recognition and assessment of medical problems in psychiatric settings course, using the Department of Health's 'Framework for Technology Enhanced Learning' as our audit standards. At the same time as highlighting areas where recognition and assessment of medical problems in psychiatric settings adheres to these identified principles, such as the strategic underpinning of the approach, and the means by which information is collected, reviewed and shared, it also helps us to identify areas where we can improve. © The Author(s) 2014.

  7. Automatic Speech Recognition from Neural Signals: A Focused Review

    Directory of Open Access Journals (Sweden)

    Christian Herff

    2016-09-01

    Full Text Available Speech interfaces have become widely accepted and are nowadays integrated in various real-life applications and devices. They have become a part of our daily life. However, speech interfaces presume the ability to produce intelligible speech, which might be impossible due to loud environments, concerns about bothering bystanders, or an inability to produce speech (i.e., patients suffering from locked-in syndrome). For these reasons it would be highly desirable not to speak but to simply envision oneself saying words or sentences. Interfaces based on imagined speech would enable fast and natural communication without the need for audible speech and would give a voice to otherwise mute people. This focused review analyzes the potential of different brain imaging techniques to recognize speech from neural signals by applying Automatic Speech Recognition technology. We argue that modalities based on metabolic processes, such as functional Near Infrared Spectroscopy and functional Magnetic Resonance Imaging, are less suited for Automatic Speech Recognition from neural signals due to low temporal resolution but are very useful for the investigation of the underlying neural mechanisms involved in speech processes. In contrast, electrophysiologic activity is fast enough to capture speech processes and is therefore better suited for ASR. Our experimental results indicate the potential of these signals for speech recognition from neural data, with a focus on invasively measured brain activity (electrocorticography). As a first example of Automatic Speech Recognition techniques applied to neural signals, we discuss the Brain-to-text system.

  8. Aerodynamic and sound intensity measurements in tracheoesophageal voice

    NARCIS (Netherlands)

    Grolman, Wilko; Eerenstein, Simone E. J.; Tan, Frédérique M. L.; Tange, Rinze A.; Schouwenburg, Paul F.

    2007-01-01

    BACKGROUND: In laryngectomized patients, tracheoesophageal voice generally provides a better voice quality than esophageal voice. Understanding the aerodynamics of voice production in patients with a voice prosthesis is important for optimizing prosthetic designs and successful voice rehabilitation.

  9. [Voice disorders in female teachers assessed by Voice Handicap Index].

    Science.gov (United States)

    Niebudek-Bogusz, Ewa; Kuzańska, Anna; Woźnicka, Ewelina; Sliwińska-Kowalska, Mariola

    2007-01-01

    The aim of this study was to assess the application of the Voice Handicap Index (VHI) in the diagnosis of occupational voice disorders in female teachers. The subjective assessment of voice by VHI was performed in fifty subjects with dysphonia diagnosed in laryngovideostroboscopic examination. The control group comprised 30 women whose jobs did not involve vocal effort. The results of the total VHI score and each of its subscales (functional, emotional and physical) were significantly worse in the study group than in controls (p …). Teachers estimated their own voice problems as a moderate disability, while 12% of them reported severe voice disability. However, all non-teachers assessed their voice problems as slight; their results ranged at the lowest level of the VHI score. This study confirmed that VHI as a tool for self-assessment of voice can be a significant contribution to the diagnosis of occupational dysphonia.

  10. Perceptual complexity of faces and voices modulates cross-modal behavioral facilitation effects

    Directory of Open Access Journals (Sweden)

    Frédéric Joassin

    2018-04-01

    Full Text Available Joassin et al. (Neuroscience Letters, 2004, 369, 132-137) observed that the recognition of face-voice associations led to an interference effect, i.e. to decreased performance relative to the recognition of faces presented in isolation. In the present experiment, we tested the hypothesis that this interference effect could be due to the fact that voices were more difficult to recognize than faces. For this purpose, we modified some faces by morphing to make them as difficult to recognize as the voices. Twenty-one healthy volunteers performed a recognition task of previously learned face-voice associations in 5 conditions: voices (A), natural faces (V), morphed faces (V30), voice-natural face associations (AV) and voice-morphed face associations (AV30). As expected, AV led to interference, as it was performed less well and more slowly than V. However, when faces were as difficult to recognize as voices, their simultaneous presentation produced a clear facilitation, AV30 being performed significantly better and faster than A and V30. These results demonstrate that matching or not the perceptual complexity of the unimodal stimuli modulates the potential cross-modal gains of the bimodal situations.

  11. Listen to a voice

    DEFF Research Database (Denmark)

    Hølge-Hazelton, Bibi

    2001-01-01

    Listen to the voice of a young girl Lonnie, who was diagnosed with Type 1 diabetes at 16. Imagine that she is deeply involved in the social security system. She lives with her mother and two siblings in a working class part of a small town. She is at a special school for problematic youth, and her...

  12. Voices of courage

    Directory of Open Access Journals (Sweden)

    Noraida Abdullah Karim

    2007-07-01

    Full Text Available In May 2007 the Women’s Commission for Refugee Women and Children presented its annual Voices of Courage awards to three displaced people who have dedicated their lives to promoting economic opportunities for refugee and displaced women and youth. These are their (edited) testimonies.

  13. What the voice reveals

    NARCIS (Netherlands)

    Ko, Sei Jin

    2007-01-01

    Given that the voice is our main form of communication, we know surprisingly little about how it impacts judgment and behavior. Furthermore, the modern advancement in telecommunication systems, such as cellular phones, has meant that a large proportion of our everyday interactions are conducted

  14. Bodies and Voices

    DEFF Research Database (Denmark)

    A wide-ranging collection of essays centred on readings of the body in contemporary literary and socio-anthropological discourse, from slavery and rape to female genital mutilation, from clothing, ocular pornography, voice, deformation and transmutation to the imprisoned, dismembered, remembered...

  15. Human voice perception.

    Science.gov (United States)

    Latinus, Marianne; Belin, Pascal

    2011-02-22

    We are all voice experts. First and foremost, we can produce and understand speech, and this makes us a unique species. But in addition to speech perception, we routinely extract from voices a wealth of socially-relevant information in what constitutes a more primitive, and probably more universal, non-linguistic mode of communication. Consider the following example: you are sitting in a plane, and you can hear a conversation in a foreign language in the row behind you. You do not see the speakers' faces, and you cannot understand the speech content because you do not know the language. Yet, an amazing amount of information is available to you. You can evaluate the physical characteristics of the different protagonists, including their gender, approximate age and size, and associate an identity to the different voices. You can form a good idea of the different speakers' mood and affective state, as well as more subtle cues such as the perceived attractiveness or dominance of the protagonists. In brief, you can form a fairly detailed picture of the type of social interaction unfolding, which a brief glance backwards can on occasion help refine - sometimes surprisingly so. What are the acoustical cues that carry these different types of vocal information? How does our brain process and analyse this information? Here we briefly review an emerging field and the main tools used in voice perception research. Copyright © 2011 Elsevier Ltd. All rights reserved.

  16. Voice similarity in identical twins.

    Science.gov (United States)

    Van Gysel, W D; Vercammen, J; Debruyne, F

    2001-01-01

    If people are asked to visually discriminate the two individuals of a monozygotic twin (MT) pair, they mostly get into trouble. Does this problem also exist when listening to twin voices? Twenty female and 10 male MT voices were randomly assembled with one "strange" voice to get voice trios. The listeners (10 female students in Speech and Language Pathology) were asked to label the twins (voices 1-2, 1-3 or 2-3) in two conditions: two standard sentences read aloud and a 2.5-second midsection of a sustained /a/. The proportion of correctly labelled twins was, for female voices, 82% and 63% and, for male voices, 74% and 52% for the sentences and the sustained /a/ respectively, both being significantly greater than chance (33%). The acoustic analysis revealed a high intra-twin correlation for the speaking fundamental frequency (SFF) of the sentences and the fundamental frequency (F0) of the sustained /a/. So the voice pitch could have been a useful characteristic in the perceptual identification of the twins. We conclude that there is a greater perceptual resemblance between the voices of identical twins than between voices without genetic relationship. The identification, however, is not perfect. The voice pitch possibly contributes to the correct twin identifications.

  17. Garbage Modeling for On-device Speech Recognition

    NARCIS (Netherlands)

    Van Gysel, C.; Velikovich, L.; McGraw, I.; Beaufays, F.

    2015-01-01

    User interactions with mobile devices increasingly depend on voice as a primary input modality. Due to the disadvantages of sending audio across potentially spotty network connections for speech recognition, in recent years there has been growing attention to performing recognition on-device. The

  18. A pattern recognition mezzanine based on associative memory and FPGA technology for L1 track triggering at HL-LHC

    International Nuclear Information System (INIS)

    Alunni, L.; Biesuz, N.; Bilei, G.M.; Citraro, S.; Crescioli, F.; Fanò, L.; Fedi, G.; Magalotti, D.; Magazzù, G.; Servoli, L.; Storchi, L.; Palla, F.; Placidi, P.; Papi, A.; Piadyk, Y.; Rossi, E.; Spiezia, A.

    2016-01-01

    The increase of luminosity at the HL-LHC will require the introduction of tracker information at the Level-1 trigger system for the experiments to maintain an acceptable trigger rate for selecting interesting events despite the order-of-magnitude increase in minimum bias interactions. To extract the track information within the required latency, dedicated hardware has to be used. We present the tests of a prototype system (Pattern Recognition Mezzanine) as the core of pattern recognition and track fitting for the HL-LHC ATLAS and CMS experiments, combining the power of both an Associative Memory custom ASIC and modern Field Programmable Gate Array (FPGA) devices.

  19. A pattern recognition mezzanine based on associative memory and FPGA technology for L1 track triggering at HL-LHC

    Science.gov (United States)

    Alunni, L.; Biesuz, N.; Bilei, G. M.; Citraro, S.; Crescioli, F.; Fanò, L.; Fedi, G.; Magalotti, D.; Magazzù, G.; Servoli, L.; Storchi, L.; Palla, F.; Placidi, P.; Papi, A.; Piadyk, Y.; Rossi, E.; Spiezia, A.

    2016-07-01

    The increase of luminosity at the HL-LHC will require the introduction of tracker information at the Level-1 trigger system for the experiments to maintain an acceptable trigger rate for selecting interesting events despite the order-of-magnitude increase in minimum bias interactions. To extract the track information within the required latency, dedicated hardware has to be used. We present the tests of a prototype system (Pattern Recognition Mezzanine) as the core of pattern recognition and track fitting for the HL-LHC ATLAS and CMS experiments, combining the power of both an Associative Memory custom ASIC and modern Field Programmable Gate Array (FPGA) devices.

  20. A pattern recognition mezzanine based on associative memory and FPGA technology for L1 track triggering at HL-LHC

    Energy Technology Data Exchange (ETDEWEB)

    Alunni, L. [INFN Sezione di Perugia (Italy); Biesuz, N. [INFN Sezione di Pisa (Italy); Bilei, G.M. [INFN Sezione di Perugia (Italy); Citraro, S. [Università di Pisa, Pisa (Italy); Crescioli, F. [LPNHE, Paris (France); Fanò, L. [INFN Sezione di Perugia (Italy); Fedi, G., E-mail: giacomo.fedi@pi.infn.it [INFN Sezione di Pisa (Italy); Magalotti, D. [INFN Sezione di Perugia (Italy); UNIMORE, Modena (Italy); Magazzù, G. [INFN Sezione di Pisa (Italy); Servoli, L.; Storchi, L. [INFN Sezione di Perugia (Italy); Palla, F. [INFN Sezione di Pisa (Italy); Placidi, P. [INFN Sezione di Perugia (Italy); DIEI, Perugia (Italy); Papi, A. [INFN Sezione di Perugia (Italy); Piadyk, Y. [LPNHE, Paris (France); Rossi, E. [INFN Sezione di Pisa (Italy); Spiezia, A. [IHEP (China)

    2016-07-11

    The increase of luminosity at the HL-LHC will require the introduction of tracker information at the Level-1 trigger system for the experiments to maintain an acceptable trigger rate for selecting interesting events despite the order-of-magnitude increase in minimum bias interactions. To extract the track information within the required latency, dedicated hardware has to be used. We present the tests of a prototype system (Pattern Recognition Mezzanine) as the core of pattern recognition and track fitting for the HL-LHC ATLAS and CMS experiments, combining the power of both an Associative Memory custom ASIC and modern Field Programmable Gate Array (FPGA) devices.

  1. A Spoken English Recognition Expert System.

    Science.gov (United States)

    1983-09-01

    "Speech Recognition by Computer," Scientific American. New York: Scientific American, April 1981: 64-76. 16. Marcus, Mitchell P. A Theory of Syntactic... Possible words for the voice decoder to choose from are: gents dishes issues itches ewes folks foes communications units eunuchs error farce

  2. Pattern recognition

    CERN Document Server

    Theodoridis, Sergios

    2003-01-01

    Pattern recognition is a scientific discipline that is becoming increasingly important in the age of automation and information handling and retrieval. Pattern Recognition, 2e covers the entire spectrum of pattern recognition applications, from image analysis to speech recognition and communications. This book presents cutting-edge material on neural networks - a set of linked microprocessors that can form associations and use pattern recognition to "learn" - and enhances student motivation by approaching pattern recognition from the designer's point of view. A direct result of more than 10

  3. Very low bit rate voice for packetized mobile applications

    International Nuclear Information System (INIS)

    Knittle, C.D.; Malone, K.T.

    1991-01-01

    This paper reports that transmitting digital voice via packetized mobile communications systems that employ relatively short packet lengths and narrow bandwidths often necessitates very low bit rate coding of the voice data. Sandia National Laboratories is currently developing an efficient voice coding system operating at 800 bits per second (bps). The coding scheme is a modified version of the 2400 bps NSA LPC-10e standard. The most significant modification to the LPC-10e scheme is the vector quantization of the line spectrum frequencies associated with the synthesis filters. An outline of a hardware implementation for the 800 bps coder is presented. The speech quality of the coder is generally good, although speaker recognition is not possible. Further research is being conducted to reduce the memory requirements and complexity of the vector quantizer, and to increase the quality of the reconstructed speech. This work may be of use in dealing with nuclear materials
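
    The vector quantization of line spectrum frequencies can be illustrated with a k-means codebook; the codebook size, LSF dimensionality and training data below are placeholders rather than the coder's actual bit allocation.

```python
# Minimal sketch: train a VQ codebook for 10-dimensional line spectrum
# frequency (LSF) vectors with k-means, then code a frame as a single index.
import numpy as np
from sklearn.cluster import KMeans

lsf_train = np.sort(np.random.rand(10000, 10), axis=1)      # placeholder LSF frames
codebook = KMeans(n_clusters=256, n_init=4).fit(lsf_train)  # 256 entries -> 8 bits/frame

lsf_frame = np.sort(np.random.rand(1, 10), axis=1)
index = int(codebook.predict(lsf_frame)[0])                 # transmitted index
reconstructed = codebook.cluster_centers_[index]             # decoder lookup
print(index, reconstructed.round(3))
```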

  4. Double Fourier analysis for Emotion Identification in Voiced Speech

    International Nuclear Information System (INIS)

    Sierra-Sosa, D.; Bastidas, M.; Ortiz P, D.; Quintero, O.L.

    2016-01-01

    We propose a novel analysis alternative, based on two Fourier transforms, for emotion recognition from speech. Fourier analysis allows one to display and synthesize different signals in terms of power spectral density distributions. A spectrogram of the voice signal is obtained by performing a short-time Fourier transform with Gaussian windows; this spectrogram portrays frequency-related features, such as vocal tract resonances and quasi-periodic excitations during voiced sounds. Emotions induce such characteristics in speech, which become apparent in the spectrogram's time-frequency distribution. The signal's time-frequency representation from the spectrogram is then treated as an image and processed through a 2-dimensional Fourier transform in order to perform spatial Fourier analysis on it. Finally, features related to emotions in voiced speech are extracted and presented. (paper)
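
    The two-stage analysis, a Gaussian-windowed short-time Fourier transform followed by a 2-D Fourier transform of the magnitude spectrogram, can be sketched as follows; the window length, overlap and test signal are illustrative choices, not the parameters used in the paper.

```python
# Short sketch: Gaussian-windowed STFT to obtain the spectrogram, then a
# 2-D FFT of the magnitude spectrogram treated as an image.
import numpy as np
from scipy.signal import stft
from scipy.signal.windows import gaussian

fs = 16000
t = np.arange(fs) / fs
# Toy voiced-like signal: a tone with slow frequency modulation
x = np.sin(2 * np.pi * (120 + 20 * np.sin(2 * np.pi * 3 * t)) * t)

win = gaussian(512, std=64)
f, tt, Z = stft(x, fs=fs, window=win, nperseg=512, noverlap=384)

spectrogram = np.abs(Z)                           # time-frequency "image"
double_fourier = np.fft.fftshift(np.fft.fft2(spectrogram))
features = np.abs(double_fourier)                 # basis for emotion-related features
print(spectrogram.shape, features.shape)
```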

  5. Forensic Speaker Recognition Law Enforcement and Counter-Terrorism

    CERN Document Server

    Patil, Hemant

    2012-01-01

    Forensic Speaker Recognition: Law Enforcement and Counter-Terrorism is an anthology of the research findings of 35 speaker recognition experts from around the world. The volume provides a multidimensional view of the complex science involved in determining whether a suspect’s voice truly matches forensic speech samples, collected by law enforcement and counter-terrorism agencies, that are associated with the commission of a terrorist act or other crimes. While addressing such topics as the challenges of forensic case work, handling speech signal degradation, analyzing features of speaker recognition to optimize voice verification system performance, and designing voice applications that meet the practical needs of law enforcement and counter-terrorism agencies, this material all sounds a common theme: how the rigors of forensic utility are demanding new levels of excellence in all aspects of speaker recognition. The contributors are among the most eminent scientists in speech engineering and signal process...

  6. Location-Enhanced Activity Recognition in Indoor Environments Using Off the Shelf Smart Watch Technology and BLE Beacons.

    Science.gov (United States)

    Filippoupolitis, Avgoustinos; Oliff, William; Takand, Babak; Loukas, George

    2017-05-27

    Activity recognition in indoor spaces benefits context awareness and improves the efficiency of applications related to personalised health monitoring, building energy management, security and safety. The majority of activity recognition frameworks, however, employ a network of specialised building sensors or a network of body-worn sensors. As this approach suffers with respect to practicality, we propose the use of commercial off-the-shelf devices. In this work, we design and evaluate an activity recognition system composed of a smart watch, which is enhanced with location information coming from Bluetooth Low Energy (BLE) beacons. We evaluate the performance of this approach for a variety of activities performed in an indoor laboratory environment, using four supervised machine learning algorithms. Our experimental results indicate that our location-enhanced activity recognition system is able to reach a classification accuracy ranging from 92% to 100%, while without location information, classification accuracy can drop to as low as 50% in some cases, depending on the window size chosen for data segmentation.

  7. Location-Enhanced Activity Recognition in Indoor Environments Using Off the Shelf Smart Watch Technology and BLE Beacons

    Directory of Open Access Journals (Sweden)

    Avgoustinos Filippoupolitis

    2017-05-01

    Full Text Available Activity recognition in indoor spaces benefits context awareness and improves the efficiency of applications related to personalised health monitoring, building energy management, security and safety. The majority of activity recognition frameworks, however, employ a network of specialised building sensors or a network of body-worn sensors. As this approach suffers with respect to practicality, we propose the use of commercial off-the-shelf devices. In this work, we design and evaluate an activity recognition system composed of a smart watch, which is enhanced with location information coming from Bluetooth Low Energy (BLE) beacons. We evaluate the performance of this approach for a variety of activities performed in an indoor laboratory environment, using four supervised machine learning algorithms. Our experimental results indicate that our location-enhanced activity recognition system is able to reach a classification accuracy ranging from 92% to 100%, while without location information, classification accuracy can drop to as low as 50% in some cases, depending on the window size chosen for data segmentation.
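
    One way to picture the fusion of watch motion data with BLE-derived location is the sketch below, where per-window accelerometer statistics are concatenated with a one-hot encoding of the strongest beacon and passed to a random forest (one of many possible supervised classifiers); the features and data are invented and do not reflect the authors' exact setup.

```python
# Sketch of the fusion idea: accelerometer window features + nearest BLE
# beacon (room-level location proxy) fed to one supervised classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

n_windows, n_beacons = 500, 4
accel_feats = np.random.randn(n_windows, 6)        # mean/std of x,y,z per window
nearest_beacon = np.random.randint(0, n_beacons, n_windows)
beacon_onehot = np.eye(n_beacons)[nearest_beacon]

X = np.hstack([accel_feats, beacon_onehot])
y = np.random.randint(0, 5, n_windows)              # 5 activity classes (placeholder)

clf = RandomForestClassifier(n_estimators=200).fit(X, y)
print("with location:   ", clf.score(X, y))

# Dropping the beacon columns shows the effect of removing location information
clf_no_loc = RandomForestClassifier(n_estimators=200).fit(accel_feats, y)
print("without location:", clf_no_loc.score(accel_feats, y))
```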

  8. Integrating speech technology to meet crew station design requirements

    Science.gov (United States)

    Simpson, Carol A.; Ruth, John C.; Moore, Carolyn A.

    The last two years have seen improvements in speech generation and speech recognition technology that make speech I/O for crew station controls and displays viable for operational systems. These improvements include increased robustness of algorithm performance in high levels of background noise, increased vocabulary size, improved performance in the connected speech mode, and less speaker dependence. This improved capability makes possible far more sophisticated user interface design than was possible with earlier technology. Engineering, linguistic, and human factors design issues are discussed in the context of current voice I/O technology performance.

  9. A study on the application of voice interaction in automotive human machine interface experience design

    Science.gov (United States)

    Huang, Zhaohui; Huang, Xiemin

    2018-04-01

This paper first introduces the trend towards integrating multi-channel interactions in automotive HMI (Human Machine Interface), starting from the complex information models faced by existing automotive HMI, and describes the various interaction modes. By comparing voice interaction with touch screens, gestures and other interaction modes, the potential and feasibility of voice interaction in automotive HMI experience design are established. Then, the related theories of voice interaction, identification technologies, human beings' cognitive models of voices and voice design methods are further explored, and the research priority of this paper is proposed, i.e. how to design voice interaction to create more humane task-oriented dialogue scenarios that enhance the interactive experience of automotive HMI. The specific scenarios in driving behaviors suitable for the use of voice interaction are studied and classified, and the usability principles and key elements for automotive HMI voice design are proposed according to the scenario features. Then, through a user-participatory usability testing experiment, the dialogue processes of voice interaction in automotive HMI are defined. The logics and grammars in voice interaction are classified according to the experimental results, and the mental models in the interaction processes are analyzed. Finally, a voice interaction design method for creating humane task-oriented dialogue scenarios in the driving environment is proposed.

  10. Hearing the unheard: An interdisciplinary, mixed methodology study of women’s experiences of hearing voices (auditory verbal hallucinations

    Directory of Open Access Journals (Sweden)

    Simon eMcCarthy-Jones

    2015-12-01

Full Text Available This paper explores the experiences of women who ‘hear voices’ (auditory verbal hallucinations). We begin by examining historical understandings of women hearing voices, showing these have been driven by androcentric theories of how women’s bodies functioned, leading to women being viewed as requiring their voices be interpreted by men. We show the twentieth century was associated with recognition that the mental violation of women’s minds (represented by some voice-hearing) was often a consequence of the physical violation of women’s bodies. We next report the results of a qualitative study into voice-hearing women’s experiences (N=8). This found similarities between women’s relationships with their voices and their relationships with others and the wider social context. Finally, we present results from a quantitative study comparing voice-hearing in women (n=65) and men (n=132) in a psychiatric setting. Women were more likely than men to have certain forms of voice-hearing (voices conversing) and to have antecedent events of trauma, physical illness, and relationship problems. Voices identified as female may have more positive affect than male voices. We conclude that women voice-hearers have and continue to face specific challenges necessitating research and activism, and hope this paper will act as a stimulus to such work.

  11. Risk factors for voice problems in teachers.

    NARCIS (Netherlands)

    Kooijman, P.G.C.; Jong, F.I.C.R.S. de; Thomas, G.; Huinck, W.J.; Donders, A.R.T.; Graamans, K.; Schutte, H.K.

    2006-01-01

    In order to identify factors that are associated with voice problems and voice-related absenteeism in teachers, 1,878 questionnaires were analysed. The questionnaires inquired about personal data, voice complaints, voice-related absenteeism from work and conditions that may lead to voice complaints

  12. Risk factors for voice problems in teachers

    NARCIS (Netherlands)

    Kooijman, P. G. C.; de Jong, F. I. C. R. S.; Thomas, G.; Huinck, W.; Donders, R.; Graamans, K.; Schutte, H. K.

    2006-01-01

    In order to identify factors that are associated with voice problems and voice-related absenteeism in teachers, 1,878 questionnaires were analysed. The questionnaires inquired about personal data, voice complaints, voice-related absenteeism from work and conditions that may lead to voice complaints

  13. You're a What? Voice Actor

    Science.gov (United States)

    Liming, Drew

    2009-01-01

    This article talks about voice actors and features Tony Oliver, a professional voice actor. Voice actors help to bring one's favorite cartoon and video game characters to life. They also do voice-overs for radio and television commercials and movie trailers. These actors use the sound of their voice to sell a character's emotions--or an advertised…

  14. Multidimensional assessment of strongly irregular voices such as in substitution voicing and spasmodic dysphonia: a compilation of own research.

    Science.gov (United States)

    Moerman, Mieke; Martens, Jean-Pierre; Dejonckere, Philippe

    2015-04-01

    This article is a compilation of own research performed during the European COoperation in Science and Technology (COST) action 2103: 'Advance Voice Function Assessment', an initiative of voice and speech processing teams consisting of physicists, engineers, and clinicians. This manuscript concerns analyzing largely irregular voicing types, namely substitution voicing (SV) and adductor spasmodic dysphonia (AdSD). A specific perceptual rating scale (IINFVo) was developed, and the Auditory Model Based Pitch Extractor (AMPEX), a piece of software that automatically analyses running speech and generates pitch values in background noise, was applied. The IINFVo perceptual rating scale has been shown to be useful in evaluating SV. The analysis of strongly irregular voices stimulated a modification of the European Laryngological Society's assessment protocol which was originally designed for the common types of (less severe) dysphonia. Acoustic analysis with AMPEX demonstrates that the most informative features are, for SV, the voicing-related acoustic features and, for AdSD, the perturbation measures. Poor correlations between self-assessment and acoustic and perceptual dimensions in the assessment of highly irregular voices argue for a multidimensional approach.

  15. Iris recognition via plenoptic imaging

    Science.gov (United States)

    Santos-Villalobos, Hector J.; Boehnen, Chris Bensing; Bolme, David S.

    2017-11-07

    Iris recognition can be accomplished for a wide variety of eye images by using plenoptic imaging. Using plenoptic technology, it is possible to correct focus after image acquisition. One example technology reconstructs images having different focus depths and stitches them together, resulting in a fully focused image, even in an off-angle gaze scenario. Another example technology determines three-dimensional data for an eye and incorporates it into an eye model used for iris recognition processing. Another example technology detects contact lenses. Application of the technologies can result in improved iris recognition under a wide variety of scenarios.

  16. License plate recognition (phase B).

    Science.gov (United States)

    2010-06-01

    License Plate Recognition (LPR) technology has been used for off-line automobile enforcement purposes. The technology has seen mixed success with correct reading rate as high as 60 to 80% depending on the specific application and environment. This li...

  17. Factors that influence the recognition, reporting and resolution of incidents related to medical devices and other healthcare technologies: a systematic review.

    Science.gov (United States)

    Polisena, Julie; Gagliardi, Anna; Urbach, David; Clifford, Tammy; Fiander, Michelle

    2015-03-29

Medical devices have improved the treatment of many medical conditions. Despite their benefit, the use of devices can lead to unintended incidents, potentially resulting in unnecessary harm, injury or complications to the patient, a complaint, loss or damage. Devices are used in hospitals on a routine basis. Research to date, however, has been primarily limited to describing incident rates, so the optimal design of a hospital-based surveillance system remains unclear. Our research objectives were twofold: i) to explore factors that influence device-related incident recognition, reporting and resolution and ii) to investigate interventions or strategies to improve the recognition, reporting and resolution of medical device-related incidents. We searched the bibliographic databases MEDLINE, Embase, the Cochrane Central Register of Controlled Trials and PsycINFO. Grey literature (literature that is not commercially available) was searched for studies, published from 2003 to 2014, on factors that influence incident recognition, reporting and resolution, and on interventions or strategies for their improvement. Although we focused on medical devices, other health technologies were eligible for inclusion. Thirty studies were included in our systematic review, but most studies were concentrated on other health technologies. The study findings indicate that fear of punishment, uncertainty about what should be reported and how incident reports will be used, and time constraints on incident reporting are common barriers to incident recognition and reporting. Relevant studies on the resolution of medical errors were not found. Strategies to improve error reporting include the use of an electronic error reporting system, increased training and feedback to frontline clinicians about the reported error. The available evidence on factors influencing medical device-related incident recognition, reporting and resolution by healthcare professionals can inform data collection and

  18. Fiscal 1997 report on the introductory study. Human behavior recognition evaluation technology; 1997 nendo sentan kenkyu hokokusho. Ningen kodo ninchi hyoka gijutsu

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1998-03-01

The importance of human behavior recognition technology was highlighted as a way to achieve true safety and comfort in daily life through the adaptability of products and systems to all humans. Against this social and technological background, human behavior recognition and evaluation technology avoids the economic and social losses caused by accidents and troubles and provides a safe living environment for people, including the aged. Further, the gist of the project was to make clear that it can guide the making of things from the viewpoint of developing products fitted to individuals (personal fit), add new value, and at the same time contribute to strengthening international competitiveness. Technology is developed for classifying concrete behavior patterns into types and accumulating them in forms usable at the time of product design and in emergencies, and for measuring information on human behaviors in daily life continuously, on site and without restrictions. Supporting technology is developed for making the most of users' behavior information in product design and in emergencies. The effects of its spread are also estimated. 76 refs., 13 figs., 9 tabs.

  19. Contribution to automatic image recognition. Application to analysis of plain scenes of overlapping parts in robot technology

    International Nuclear Information System (INIS)

    Tan, Shengbiao

    1987-01-01

A method for object modeling and overlapped object automatic recognition is presented. Our work is composed of three essential parts: image processing, object modeling, and evaluation of the implementation of the stated concepts. In the first part, we present a method of edge encoding which is based on a re-sampling of the data encoded according to Freeman; this method generates an isotropic, homogeneous and very precise representation. The second part relates to object modeling. This important step makes the recognition work much easier. The new method proposed characterizes a model with two groups of information: the description group containing the primitives, and the discrimination group containing data packs, called 'transition vectors'. Based on this original method of information organization, a 'relative learning' is able to select, ignore and update the information concerning the objects already learned, according to the new information to be included in the database. The recognition is a two-pass process: the first pass determines very efficiently the presence of objects by making use of each object's particularities, and this hypothesis is either confirmed or rejected by the following fine verification pass. The last part describes the experimentation results in detail. We demonstrate the robustness of the algorithms on images under both poor lighting and object-overlap conditions. The system, named SOFIA, has been installed in an industrial vision system series and works in real time. (author) [fr

  20. Voice and silence in organizations

    Directory of Open Access Journals (Sweden)

    Moaşa, H.

    2011-01-01

Full Text Available Unlike previous research on voice and silence, this article breaks the distance between the two and declines to treat them as opposites. Voice and silence are interrelated and intertwined strategic forms of communication which presuppose each other in such a way that the absence of one would minimize completely the other’s presence. Social actors are not voice, or silence. Social actors can have voice or silence, they can do both because they operate at multiple levels and deal with multiple issues at different moments in time.

  1. AN EXPERIMENT WITH THE VOICE TO DESIGN CERAMICS

    DEFF Research Database (Denmark)

    Hansen, Flemming Tvede

    2013-01-01

    from the human voice and thus how digital technology makes new possibilities in ceramic craft. 3D digital shape is created using simple geometric rules and is output to a 3D printer to make ceramic objects. The system demonstrates the close connection between digital technology and craft practice....

  2. Voice Biometrics for Information Assurance Applications

    National Research Council Canada - National Science Library

    Kang, George

    2002-01-01

    .... The ultimate goal of voice biometrics is to enable the use of voice as a password. Voice biometrics are "man-in-the-loop" systems in which system performance is significantly dependent on human performance...

  3. Speech emotion recognition based on statistical pitch model

    Institute of Scientific and Technical Information of China (English)

    WANG Zhiping; ZHAO Li; ZOU Cairong

    2006-01-01

A modified Parzen-window method, which keeps high resolution at low frequencies and smoothness at high frequencies, is proposed to obtain the statistical model. Then, a gender classification method utilizing the statistical model is proposed, which achieves 98% accuracy in gender classification when long sentences are processed. By separating male and female voices, the mean and standard deviation of speech training samples with different emotions are used to create the corresponding emotion models. Then the Bhattacharyya distance between the test sample and the statistical pitch models is utilized for emotion recognition in speech. The normalization of pitch for male and female voices is also considered, in order to map them into a uniform space. Finally, the speech emotion recognition experiment based on K Nearest Neighbor shows that a correct rate of 81% is achieved, compared with only 73.85% if the traditional parameters are utilized.
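The abstract's core step, comparing the pitch statistics of a test utterance against per-emotion statistical models via the Bhattacharyya distance, can be sketched as follows for the univariate Gaussian case. The per-emotion means and standard deviations are invented placeholders, and a single nearest-model decision stands in for the paper's K Nearest Neighbor stage.

```python
import numpy as np

def bhattacharyya_gaussian(mu1, sd1, mu2, sd2):
    """Bhattacharyya distance between two univariate Gaussians, used here to
    compare the pitch statistics of a test utterance with an emotion model."""
    v1, v2 = sd1 ** 2, sd2 ** 2
    return 0.25 * (mu1 - mu2) ** 2 / (v1 + v2) + 0.5 * np.log((v1 + v2) / (2 * sd1 * sd2))

# Hypothetical per-emotion pitch models (mean, std of F0 in Hz) for one gender.
emotion_models = {"neutral": (120, 15), "anger": (180, 40), "sadness": (105, 10)}

def classify_utterance(f0_values):
    mu, sd = np.mean(f0_values), np.std(f0_values)
    distances = {emotion: bhattacharyya_gaussian(mu, sd, m, s)
                 for emotion, (m, s) in emotion_models.items()}
    return min(distances, key=distances.get)    # nearest model wins

# An utterance whose pitch statistics resemble the "anger" model.
print(classify_utterance(np.random.default_rng(1).normal(175, 35, 200)))
```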

  4. Digital Media Creates Youth Voices Heard

    Directory of Open Access Journals (Sweden)

    Jeff Sallee

    2014-06-01

    Full Text Available Oklahoma 4-H clubs and military service centers partnered with the Adobe Youth Voices (AYV program to give youth opportunities to raise their voices through digital media. This program reached out to underrepresented youth and gave them the tools and technology to effectively express themselves. The intent of this project was for 4-H members to create videos to educate, help and raise awareness in their communities of topics that were important to the youth. These experiences help youth gain knowledge towards helping others solve farm, home, and community problems. Participating youth selected issues that were important to them and created a short video, educating others and sharing their convictions on the topics of horse therapy, citizenship, bullying, and distracted driving.

  5. Objective voice parameters in Colombian school workers with healthy voices

    NARCIS (Netherlands)

    L.C. Cantor Cutiva (Lady Catherine); A. Burdorf (Alex)

    2015-01-01

    textabstractObjectives: To characterize the objective voice parameters among school workers, and to identify associated factors of three objective voice parameters, namely fundamental frequency, sound pressure level and maximum phonation time. Materials and methods: We conducted a cross-sectional

  6. Pedagogic Voice: Student Voice in Teaching and Engagement Pedagogies

    Science.gov (United States)

    Baroutsis, Aspa; McGregor, Glenda; Mills, Martin

    2016-01-01

    In this paper, we are concerned with the notion of "pedagogic voice" as it relates to the presence of student "voice" in teaching, learning and curriculum matters at an alternative, or second chance, school in Australia. This school draws upon many of the principles of democratic schooling via its utilisation of student voice…

  7. Learning Media Application Based On Microcontroller Chip Technology In Early Age

    Science.gov (United States)

    Ika Hidayati, Permata

    2018-04-01

In early childhood, cognitive development requires the right learning media to help a child's cognitive intelligence grow quickly. The purpose of this study is to design a learning medium in the form of a doll that can be used to introduce human anatomy in early childhood. This educational doll uses voice recognition technology based on an EasyVR module to receive commands from the user and introduce the body parts on the doll, with an LED used as an indicator. In addition to introducing human anatomy, the doll lets the user play back voices previously stored in a sound recorder module. The results obtained from this study show that the educational doll can detect voices and spoken commands given in random order. The distance at which the doll can detect sound is up to 2.5 meters.

  8. A Wireless LAN and Voice Information System for Underground Coal Mine

    Directory of Open Access Journals (Sweden)

    Yu Zhang

    2014-06-01

Full Text Available In this paper we constructed a wireless information system, and developed a wireless voice communication subsystem based on Wireless Local Area Networks (WLAN) for underground coal mines, which employs Voice over IP (VoIP) technology and the Session Initiation Protocol (SIP) to achieve wireless voice dispatching communications. The master control voice dispatching interface and call terminal software were also developed on the WLAN ground server side to manage and implement the voice dispatching communication. A testing system for voice communication was constructed in the tunnels of an underground coal mine and used to test the wireless voice communication subsystem via a network analysis tool named Clear Sight Analyzer. In the tests, the actual flow charts of registration, call establishment and call removal were analysed by capturing the call signaling of SIP terminals, and the key performance indicators were evaluated in the coal mine, including the average subjective voice quality score, packet loss rate, delay jitter, out-of-order packet transmission and end-to-end delay. Experimental results and analysis demonstrate that the wireless voice communication subsystem communicates well in the underground coal mine environment, achieving the designed function of voice dispatching communication.
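A minimal sketch of how the key performance indicators named above (packet loss rate, delay jitter, out-of-order packets and end-to-end delay) could be computed from a captured packet trace; the record structure is a stand-in for whatever the analyzer exports, and the jitter estimator follows the RFC 3550 running formula.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    seq: int          # RTP sequence number
    send_ts: float    # sender timestamp (seconds)
    recv_ts: float    # arrival time at the receiver (seconds)

def voip_kpis(packets):
    """Packet loss rate, interarrival jitter (RFC 3550 running estimate),
    out-of-order count and mean end-to-end delay from a packet trace."""
    seqs = [p.seq for p in packets]
    expected = max(seqs) - min(seqs) + 1
    loss_rate = 1 - len(packets) / expected

    jitter, reordered, prev = 0.0, 0, None
    delays = []
    for p in packets:
        delays.append(p.recv_ts - p.send_ts)
        if prev is not None:
            if p.seq < prev.seq:
                reordered += 1
            # difference in transit time between consecutive packets
            d = abs((p.recv_ts - prev.recv_ts) - (p.send_ts - prev.send_ts))
            jitter += (d - jitter) / 16
        prev = p
    return {"loss_rate": loss_rate, "jitter_s": jitter,
            "out_of_order": reordered, "mean_delay_s": sum(delays) / len(delays)}

trace = [Packet(1, 0.00, 0.040), Packet(2, 0.02, 0.063),
         Packet(4, 0.06, 0.101), Packet(3, 0.04, 0.105)]
print(voip_kpis(trace))
```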

  9. Facing Sound - Voicing Art

    DEFF Research Database (Denmark)

    Lønstrup, Ansa

    2013-01-01

This article is based on examples of contemporary audiovisual art, with a special focus on the Tony Oursler exhibition Face to Face at Aarhus Art Museum ARoS in Denmark in March-July 2012. My investigation involves a combination of qualitative interviews with visitors, observations of the audience's...... interactions with the exhibition and the artwork in the museum space and short analyses of individual works of art based on reception aesthetics and phenomenology and inspired by newer writings on sound, voice and listening....

  10. Bodies, Spaces, Voices, Silences

    OpenAIRE

    Donatella Mazzoleni; Pietro Vitiello

    2013-01-01

    A good architecture should not only allow functional, formal and technical quality for urban spaces, but also let the voice of the city be perceived, listened, enjoyed. Every city has got its specific sound identity, or “ISO” (R. O. Benenzon), made up of a complex texture of background noises and fluctuation of sound figures emerging and disappearing in a game of continuous fadings. For instance, the ISO of Naples is characterized by a spread need of hearing the sound return of one’s/others v...

  11. ALPHABET SIGN LANGUAGE RECOGNITION USING LEAP MOTION TECHNOLOGY AND RULE BASED BACKPROPAGATION-GENETIC ALGORITHM NEURAL NETWORK (RBBPGANN

    Directory of Open Access Journals (Sweden)

    Wijayanti Nurul Khotimah

    2017-01-01

Full Text Available Sign language recognition is used to help people with normal hearing communicate effectively with the deaf and hearing-impaired. Based on a survey conducted by a multi-center study in Southeast Asia, Indonesia was in the top four positions in the number of patients with hearing disability (4.6%). Therefore, the existence of sign language recognition is important. Some research has been conducted in this field, and many types of neural networks have been used for recognizing many kinds of sign languages; however, their performance still needs to be improved. This work focuses on the ASL (Alphabet Sign Language) in SIBI (Sign System of Indonesian Language), which uses one hand and 26 gestures. Here, thirty-four features were extracted using Leap Motion. Further, a new method, Rule Based-Backpropagation Genetic Algorithm Neural Network (RB-BPGANN), was used to recognize these sign languages. This method is a combination of rules and a Backpropagation Genetic Algorithm Neural Network (BPGANN). Based on experiments, this proposed application can recognize sign language with up to 93.8% accuracy. It is very good at recognizing large multiclass instances and can be a solution to the overfitting problem in neural network algorithms.
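For orientation only, the sketch below trains a plain backpropagation network on a 34-feature, 26-class dataset of the shape described in the record; it is a generic stand-in and not the authors' rule-based, genetic-algorithm-tuned RB-BPGANN, and the data are synthetic placeholders.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic placeholder: 34 Leap Motion features per sample, 26 letter classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(2600, 34))
y = rng.integers(0, 26, size=2600)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# A plain backpropagation network as a simplified stand-in for the paper's
# rule-based, GA-tuned variant; feature scaling helps the network converge.
model = make_pipeline(StandardScaler(),
                      MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0))
model.fit(X_tr, y_tr)
print("held-out accuracy:", model.score(X_te, y_te))
```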

  12. The Voices of the Documentarist

    Science.gov (United States)

    Utterback, Ann S.

    1977-01-01

Discusses T. S. Eliot's essay, "The Three Voices of Poetry," which conceptualizes the position taken by the poet or creator. Suggests that an examination of documentary film, within the three voices concept, expands the critical framework of the film genre. (MH)

  13. The Role of the Electronic Portfolio in Enhancing Information and Communication Technology and English Language Skills: The Voices of Six Malaysian Undergraduates

    Science.gov (United States)

    Thang, Siew Ming; Lee, Yit Sim; Zulkifli, Nurul Farhana

    2012-01-01

    This study investigated the effects of the construction and development of electronic portfolios (e-portfolios) on a small user population at a public university in Malaysia. The study was based on a three-month Information and Communication Technology (ICT) and language learning course offered to the undergraduates of the university. One of the…

  14. When the Divide Isn't Just Digital: How Technology-Enriched Afterschool Programs Help Immigrant Youth Find a Voice, a Place, and a Future

    Science.gov (United States)

    London, Rebecca A.; Pastor, Manuel, Jr.; Rosner, Rachel

    2008-01-01

    The so-called "digital divide"--unequal access to information technology--is one of many social inequalities faced by individuals who are low-income, ethnic minorities, or immigrants. Surprisingly, the digital divide is even larger for young people than it is for adults, with African-American and Latino young people, as well as…

  15. The effect of transformational leadership and job autonomy on promotive and prohibitive voice

    DEFF Research Database (Denmark)

    Svendsen, Mari; Unterrainer, Christine; Jønsson, Thomas Faurholt

    2018-01-01

    Although there is a vast amount of research on leadership and improvement-oriented voice behavior, the amount of cross-lagged research on leadership that also incorporates more challenging forms of voice is sparse. This paper reports on a two-wave study of white-collar workers in a Norwegian...... medical technology company, investigating the relationship among employees’ perceived transformational leadership behaviors, job autonomy, and promotive and prohibitive voice. Testing our results cross-lagged, we demonstrate that perceived transformational leadership is significantly related...... to prohibitive voice over time, whereas this effect worked in the opposite direction for promotive voice. We also explore the boundary conditions of transformational leadership, demonstrating that perceived job autonomy strengthens the effect of transformational leadership on prohibitive voice. Implications...

  16. ELearning Strategic Planning 2020: The Voice of Future Students as Stakeholders in Higher Education

    Science.gov (United States)

    Finger, Glenn; Smart, Vicky

    2013-01-01

    Most universities are undertaking information technology (IT) strategic planning. The development of those plans often includes the voices of academics and sometimes engages alumni and current students. However, few engage and acknowledge the voice of future students. This paper is situated within the "Griffith University 2020 Strategic…

  17. Speech Recognition

    Directory of Open Access Journals (Sweden)

    Adrian Morariu

    2009-01-01

Full Text Available This paper presents a method of speech recognition by pattern recognition techniques. Learning consists in determining the unique characteristics of a word (cepstral coefficients) by eliminating those characteristics that differ from one word to another. For learning and recognition, the system builds a dictionary of words by determining the characteristics of each word to be used in the recognition. Determining the characteristics of an audio signal consists in the following steps: noise removal, sampling, applying a Hamming window, switching to the frequency domain through the Fourier transform, calculating the magnitude spectrum, filtering the data, and determining the cepstral coefficients.
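The feature-extraction chain listed in this record (windowing with a Hamming window, Fourier transform, magnitude spectrum, cepstral coefficients) can be sketched in a few lines of NumPy; frame length, hop size and the number of coefficients kept are illustrative choices.

```python
import numpy as np

def cepstral_coefficients(signal, frame_len=400, hop=160, n_coeffs=13):
    """Frame the signal, apply a Hamming window, take the magnitude spectrum
    and return the first real-cepstrum coefficients of each frame."""
    window = np.hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len, hop):
        frame = signal[start:start + frame_len] * window
        spectrum = np.abs(np.fft.rfft(frame))        # magnitude spectrum
        log_spectrum = np.log(spectrum + 1e-10)      # avoid log(0)
        cepstrum = np.fft.irfft(log_spectrum)        # back to the "quefrency" domain
        frames.append(cepstrum[:n_coeffs])
    return np.array(frames)

# Hypothetical one-second utterance sampled at 16 kHz.
sr = 16_000
t = np.arange(sr) / sr
utterance = np.sin(2 * np.pi * 220 * t) + 0.01 * np.random.default_rng(0).normal(size=sr)
features = cepstral_coefficients(utterance)
print(features.shape)   # (number of frames, 13): one feature vector per frame
```

Per-word templates for the dictionary could then be built by averaging these frame-level vectors, with recognition performed by comparing a new utterance's features against each template.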

  18. Bodies, Spaces, Voices, Silences

    Directory of Open Access Journals (Sweden)

    Donatella Mazzoleni

    2013-07-01

Full Text Available A good architecture should not only allow functional, formal and technical quality for urban spaces, but also let the voice of the city be perceived, listened, enjoyed. Every city has got its specific sound identity, or “ISO” (R. O. Benenzon), made up of a complex texture of background noises and fluctuation of sound figures emerging and disappearing in a game of continuous fadings. For instance, the ISO of Naples is characterized by a spread need of hearing the sound return of one’s/others voices, by a hate of silence. Cities may fall ill: illness from noise, within super-crowded neighbourhoods, or illness from silence, in the forced isolation of peripheries. The proposal of an urban music therapy denotes an unpublished and innovative enlarged interdisciplinary research path, where architecture, music, medicine, psychology, communication science may converge, in order to work for rebalancing spaces and relation life of the urban collectivity, through the care of body and sound dimensions.

  19. The role of spectral and temporal cues in voice gender discrimination by normal-hearing listeners and cochlear implant users.

    Science.gov (United States)

    Fu, Qian-Jie; Chinchilla, Sherol; Galvin, John J

    2004-09-01

    The present study investigated the relative importance of temporal and spectral cues in voice gender discrimination and vowel recognition by normal-hearing subjects listening to an acoustic simulation of cochlear implant speech processing and by cochlear implant users. In the simulation, the number of speech processing channels ranged from 4 to 32, thereby varying the spectral resolution; the cutoff frequencies of the channels' envelope filters ranged from 20 to 320 Hz, thereby manipulating the available temporal cues. For normal-hearing subjects, results showed that both voice gender discrimination and vowel recognition scores improved as the number of spectral channels was increased. When only 4 spectral channels were available, voice gender discrimination significantly improved as the envelope filter cutoff frequency was increased from 20 to 320 Hz. For all spectral conditions, increasing the amount of temporal information had no significant effect on vowel recognition. Both voice gender discrimination and vowel recognition scores were highly variable among implant users. The performance of cochlear implant listeners was similar to that of normal-hearing subjects listening to comparable speech processing (4-8 spectral channels). The results suggest that both spectral and temporal cues contribute to voice gender discrimination and that temporal cues are especially important for cochlear implant users to identify the voice gender when there is reduced spectral resolution.
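The acoustic simulation referred to above is typically implemented as a noise-excited channel vocoder, in which the number of analysis bands sets the spectral resolution and the envelope low-pass cutoff sets the available temporal cues. The sketch below is one possible implementation under assumed filter orders and band edges, not the exact processing used in the study.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def vocode(signal, sr, n_channels=4, env_cutoff=160, f_lo=200, f_hi=7000):
    """Noise-excited channel vocoder: analyse the signal in n_channels bands,
    low-pass each band's envelope at env_cutoff Hz, and use the envelopes to
    modulate band-limited noise carriers before summing."""
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)   # log-spaced band edges
    env_sos = butter(2, env_cutoff, btype="low", fs=sr, output="sos")
    rng = np.random.default_rng(0)
    out = np.zeros_like(signal)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo, hi], btype="band", fs=sr, output="sos")
        band = sosfiltfilt(band_sos, signal)
        envelope = sosfiltfilt(env_sos, np.abs(band))          # rectify + low-pass
        carrier = sosfiltfilt(band_sos, rng.normal(size=len(signal)))
        out += np.maximum(envelope, 0.0) * carrier             # modulated noise band
    return out

sr = 16_000
t = np.arange(sr) / sr
speech_like = np.sin(2 * np.pi * 150 * t) * (1 + 0.5 * np.sin(2 * np.pi * 3 * t))
simulated = vocode(speech_like, sr, n_channels=4, env_cutoff=160)
```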

  20. FCJ-151 The modulation and ordering of affect: from emotion recognition technology to the critique of class composition

    Directory of Open Access Journals (Sweden)

    Mark Gawne

    2012-01-01

Full Text Available Recent developments in the workplace have seen the intensification of methods to elicit and capture value within and across the affective encounter, notably through the introduction of technologies to monitor and measure the production of affect and emotion in service workers. This paper develops the beginning of a critique of these technologies through a discussion of affective HCI, OKAO Vision and an engagement with the compositionist critique developed in (post-)Operaismo.

  1. Pattern Recognition Methods and Features Selection for Speech Emotion Recognition System.

    Science.gov (United States)

    Partila, Pavol; Voznak, Miroslav; Tovarek, Jaromir

    2015-01-01

The impact of the classification method and features selection on speech emotion recognition accuracy is discussed in this paper. Selecting the correct parameters in combination with the classifier is an important part of reducing the complexity of system computing. This step is necessary especially for systems that will be deployed in real-time applications. The reason for the development and improvement of speech emotion recognition systems is their wide usability in today's automatic voice-controlled systems. The Berlin database of emotional recordings was used in this experiment. Classification accuracy of artificial neural networks, k-nearest neighbours, and Gaussian mixture models is measured considering the selection of prosodic, spectral, and voice quality features. The purpose was to find an optimal combination of methods and group of features for stress detection in human speech. The research contribution lies in the design of the speech emotion recognition system due to its accuracy and efficiency.
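A compact sketch of the kind of comparison reported here: the same feature matrix scored with a neural network, k-nearest neighbours, and per-class Gaussian mixture models. The data are random placeholders rather than the Berlin emotional database, and the hyperparameters are illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))        # placeholder prosodic/spectral/voice-quality features
y = rng.integers(0, 4, size=500)      # placeholder emotion labels
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Discriminative classifiers.
for name, clf in [("ANN", MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)),
                  ("k-NN", KNeighborsClassifier(n_neighbors=5))]:
    print(name, clf.fit(X_tr, y_tr).score(X_te, y_te))

# Generative GMM classifier: one mixture per emotion, pick the best-scoring class.
gmms = {c: GaussianMixture(n_components=2, random_state=0).fit(X_tr[y_tr == c])
        for c in np.unique(y_tr)}
pred = np.array([max(gmms, key=lambda c: gmms[c].score(x.reshape(1, -1))) for x in X_te])
print("GMM", (pred == y_te).mean())
```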

  2. Pattern Recognition Methods and Features Selection for Speech Emotion Recognition System

    Directory of Open Access Journals (Sweden)

    Pavol Partila

    2015-01-01

Full Text Available The impact of the classification method and features selection on speech emotion recognition accuracy is discussed in this paper. Selecting the correct parameters in combination with the classifier is an important part of reducing the complexity of system computing. This step is necessary especially for systems that will be deployed in real-time applications. The reason for the development and improvement of speech emotion recognition systems is their wide usability in today's automatic voice-controlled systems. The Berlin database of emotional recordings was used in this experiment. Classification accuracy of artificial neural networks, k-nearest neighbours, and Gaussian mixture models is measured considering the selection of prosodic, spectral, and voice quality features. The purpose was to find an optimal combination of methods and group of features for stress detection in human speech. The research contribution lies in the design of the speech emotion recognition system due to its accuracy and efficiency.

  3. Crossing Cultures with Multi-Voiced Journals

    Science.gov (United States)

    Styslinger, Mary E.; Whisenant, Alison

    2004-01-01

    In this article, the authors discuss the benefits of using multi-voiced journals as a teaching strategy in reading instruction. Multi-voiced journals, an adaptation of dual-voiced journals, encourage responses to reading in varied, cultured voices of characters. It is similar to reading journals in that they prod students to connect to the lives…

  4. Objective voice and speech analysis of persons with chronic hoarseness by prosodic analysis of speech samples.

    Science.gov (United States)

    Haderlein, Tino; Döllinger, Michael; Matoušek, Václav; Nöth, Elmar

    2016-10-01

    Automatic voice assessment is often performed using sustained vowels. In contrast, speech analysis of read-out texts can be applied to voice and speech assessment. Automatic speech recognition and prosodic analysis were used to find regression formulae between automatic and perceptual assessment of four voice and four speech criteria. The regression was trained with 21 men and 62 women (average age 49.2 years) and tested with another set of 24 men and 49 women (48.3 years), all suffering from chronic hoarseness. They read the text 'Der Nordwind und die Sonne' ('The North Wind and the Sun'). Five voice and speech therapists evaluated the data on 5-point Likert scales. Ten prosodic and recognition accuracy measures (features) were identified which describe all the examined criteria. Inter-rater correlation within the expert group was between r = 0.63 for the criterion 'match of breath and sense units' and r = 0.87 for the overall voice quality. Human-machine correlation was between r = 0.40 for the match of breath and sense units and r = 0.82 for intelligibility. The perceptual ratings of different criteria were highly correlated with each other. Likewise, the feature sets modeling the criteria were very similar. The automatic method is suitable for assessing chronic hoarseness in general and for subgroups of functional and organic dysphonia. In its current version, it is almost as reliable as a randomly picked rater from a group of voice and speech therapists.
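The human-machine comparison described above amounts to fitting a regression from automatic prosodic and recognition-accuracy measures to perceptual ratings, then correlating the predictions with expert scores on held-out speakers. A minimal sketch with placeholder data follows; the feature count, speaker counts and rating scale analogue are assumptions for illustration.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
# Placeholder data: 83 training and 73 test speakers, 10 automatic measures each.
X_train, X_test = rng.normal(size=(83, 10)), rng.normal(size=(73, 10))
w = rng.normal(size=10)
ratings_train = X_train @ w + rng.normal(scale=0.5, size=83)   # mean expert rating per speaker
ratings_test = X_test @ w + rng.normal(scale=0.5, size=73)

# Regression formula mapping automatic measures to one perceptual criterion.
model = LinearRegression().fit(X_train, ratings_train)
predicted = model.predict(X_test)

r, _ = pearsonr(predicted, ratings_test)       # human-machine correlation
print(f"human-machine correlation r = {r:.2f}")
```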

  5. When a new technological product launching fails: A multi-method approach of facial recognition and E-WOM sentiment analysis.

    Science.gov (United States)

    Hernández-Fernández, Dra Asunción; Mora, Elísabet; Vizcaíno Hernández, María Isabel

    2018-04-17

The dual aim of this research is, firstly, to analyze the physiological and unconscious emotional response of consumers to a new technological product and, secondly, to link this emotional response to consumers' conscious verbal reports of positive and negative product perceptions. In order to do this, biometric and self-reported measures of emotional response are combined. On the one hand, a neuromarketing experiment based on the facial recognition of emotions of 10 subjects, exposed to the physical attributes and economic information of a technological product, shows the prevalence of the ambivalent emotion of surprise. On the other hand, a netnographic qualitative approach based on sentiment analysis of online comments from 67 users characterises the valence of this emotion as mainly negative in the case and context studied. Theoretical, practical and methodological contributions are anticipated from this paper. From a theoretical point of view, this proposal contributes valuable information to the product design process, to an effective development of the marketing mix variables of price and promotion, and to a successful selection of the target market. From a practical point of view, the approach employed in the case study on the product Google Glass provides empirical evidence useful in the decision-making process for this and other technological enterprises launching a new product. And from a methodological point of view, the usefulness of integrated neuromarketing-eWOM analysis could contribute to the proliferation of this tandem in marketing research. Copyright © 2018 Elsevier Inc. All rights reserved.
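For the e-WOM side of such a study, comment valence can be scored with an off-the-shelf sentiment analyzer. The sketch below uses NLTK's VADER purely as a generic stand-in for whatever tooling the authors applied (VADER is English-only), and the comments are invented examples.

```python
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
analyzer = SentimentIntensityAnalyzer()

# Invented examples of e-WOM comments about a new technological product.
comments = [
    "The concept is amazing but the price is absurd.",
    "I was surprised how useless it felt in everyday life.",
    "Brilliant hardware, I can't wait for the next version.",
]

for text in comments:
    score = analyzer.polarity_scores(text)["compound"]   # -1 (negative) .. +1 (positive)
    valence = "positive" if score > 0.05 else "negative" if score < -0.05 else "neutral"
    print(f"{valence:8s} {score:+.2f}  {text}")
```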

  6. Voice synthesis application

    Science.gov (United States)

    Lightstone, P. C.; Davidson, W. M.

    1982-04-01

The military detection assessment laboratory houses an experimental field system which assesses different alarm indicators such as fence disturbance sensors, MILES cables, and microwave Racons. A speech synthesis board was purchased which could be interfaced, by means of a computer, to an alarm logger, making verbal acknowledgement of alarms possible. Different products and different types of voice synthesis were analyzed before a linear predictive coding device produced by Telesensory Speech Systems of Palo Alto, California was chosen. This device is called the Speech 1000 Board and has a dedicated 8085 processor. A multiplexer card was designed and the Sp 1000 interfaced through the card into a TMS 990/100M Texas Instruments microcomputer. It was also necessary to design the software with the capability of recognizing and flagging an alarm on any 1 of 32 possible lines. The experimental field system was then packaged with a dc power supply, LED indicators, speakers, and switches, and deployed in the field, performing reliably.

  7. How to help teachers' voices.

    Science.gov (United States)

    Saatweber, Margarete

    2008-01-01

    It has been shown that teachers are at high risk of developing occupational dysphonia, and it has been widely accepted that the vocal characteristics of a speaker play an important role in determining the reactions of listeners. The functions of breathing, breathing movement, breathing tonus, voice vibrations and articulation tonus are transmitted to the listener. So we may conclude that listening to the teacher's voice at school influences children's behavior and the perception of spoken language. This paper presents the concept of Schlaffhorst-Andersen including exercises to help teachers improve their voice, breathing, movement and their posture. Copyright 2008 S. Karger AG, Basel.

  8. Tax and Citizenship Relations : A Critical Approach About Taxes in incidents the New Technologies : no Democratic Participation Services , Citizenship without Voice

    Directory of Open Access Journals (Sweden)

    Nathalia Correia Pompeu

    2016-06-01

Full Text Available Based on the subdivisions that LC No. 116 of 2003 established for computer services, many companies are fined, particularly in cities such as São Paulo, which apply different rates to the services within a single item of the list attached to this Law, for not differentiating correctly the services provided. This is because, as the Law uses vague and imprecise concepts, they are interpreted very differently by the computing market, technical professionals and the municipal tax authorities, causing considerable confusion in the taxation of the ISS. The importance of defining, understanding and delimiting the various applications of this tax's concepts is that it will bring legal certainty to those involved in the process and enable technological development through the correct tax assessment of international contracts.

  9. Voice Habits and Behaviors: Voice Care Among Flamenco Singers.

    Science.gov (United States)

    Garzón García, Marina; Muñoz López, Juana; Y Mendoza Lara, Elvira

    2017-03-01

    The purpose of this study is to analyze the vocal behavior of flamenco singers, as compared with classical music singers, to establish a differential vocal profile of voice habits and behaviors in flamenco music. Bibliographic review was conducted, and the Singer's Vocal Habits Questionnaire, an experimental tool designed by the authors to gather data regarding hygiene behavior, drinking and smoking habits, type of practice, voice care, and symptomatology perceived in both the singing and the speaking voice, was administered. We interviewed 94 singers, divided into two groups: the flamenco experimental group (FEG, n = 48) and the classical control group (CCG, n = 46). Frequency analysis, a Likert scale, and discriminant and exploratory factor analysis were used to obtain a differential profile for each group. The FEG scored higher than the CCG in speaking voice symptomatology. The FEG scored significantly higher than the CCG in use of "inadequate vocal technique" when singing. Regarding voice habits, the FEG scored higher in "lack of practice and warm-up" and "environmental habits." A total of 92.6% of the subjects classified themselves correctly in each group. The Singer's Vocal Habits Questionnaire has proven effective in differentiating flamenco and classical singers. Flamenco singers are exposed to numerous vocal risk factors that make them more prone to vocal fatigue, mucosa dehydration, phonotrauma, and muscle stiffness than classical singers. Further research is needed in voice training in flamenco music, as a means to strengthen the voice and enable it to meet the requirements of this musical genre. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  10. A posteriori error estimates in voice source recovery

    Science.gov (United States)

    Leonov, A. S.; Sorokin, V. N.

    2017-12-01

The inverse problem of voice source pulse recovery from a segment of a speech signal is under consideration. A special mathematical model that relates these quantities is used for the solution. A variational method of solving the inverse problem of voice source recovery for a new parametric class of sources, namely piecewise-linear sources (PWL-sources), is proposed. Also, a technique for a posteriori numerical error estimation of the obtained solutions is presented. A computer study of the adequacy of the adopted speech production model with PWL-sources is performed by solving the inverse problems for various types of voice signals, together with a corresponding study of the a posteriori error estimates. Numerical experiments on speech signals show satisfactory properties of the proposed a posteriori error estimates, which represent upper bounds on the possible errors in solving the inverse problem. The estimate of the most probable error in determining the source-pulse shapes is about 7-8% for the investigated speech material. It is noted that a posteriori error estimates can be used as a criterion of the quality of obtained voice source pulses in application to speaker recognition.

  11. Comparison of the Effects of SMART Board Technology and Flash Card Instruction on Sight Word Recognition and Observational Learning

    Science.gov (United States)

    Mechling, Linda C.; Gast, David L.; Thompson, Kimberly L.

    2009-01-01

    This study compared the effectiveness of SMART Board, interactive whiteboard technology and traditional flash cards in teaching reading in a small-group instructional arrangement. Three students with moderate intellectual disabilities were taught to read grocery store aisle marker words under each condition. Observational learning (students…

  12. High quality voice synthesis middle ware; Kohinshitsu onsei gosei middle war

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2000-03-01

Toshiba Corp. newly developed a natural voice synthesis system, TOS Drive TTS (TOtally speaker Driven Text-To-Speech), in which the naturalness and quality of read-aloud speech are greatly improved, and also developed a voice synthesis middleware as its application. In the newly developed system, using a narrator's voice recorded in advance as a model, a metrical control dictionary is automatically learned that reproduces the characteristics of metrical patterns such as the intonation or rhythm of a human voice, as is a voice base dictionary that reproduces the characteristics of the voice quality, enabling natural voice synthesis that picks up human voice characteristics. The system is high quality and also very compact, and the voice synthesis middleware utilizing this technology is adaptable to various platforms in terms of MPU and OS. The system is very suitable for audio response in the ITS field, which has car navigation systems as its core; in addition, expanded application is expected to audio response systems that used to employ sound recording and reproducing equipment. (translated by NEDO)

  13. A model for treating voice disorders in school-age children within a video gaming environment.

    Science.gov (United States)

    King, Suzanne N; Davis, Larry; Lehman, Jeffrey J; Ruddy, Bari Hoffman

    2012-09-01

Clinicians use a variety of approaches to motivate children with hyperfunctional voice disorders to comply with voice therapy in a therapeutic session and to improve children's motivation to practice home-based exercises. Utilization of current entertainment technology in such approaches may improve participation and motivation in voice therapy. The purpose of this study is to test the feasibility of using an entertainment video game as a therapy device. Prospective cohort and case-control study. Three levels of game testing were conducted on an existing entertainment video game for use as a voice therapy protocol. The game was tested by two computer programmers and five normal participants. The third level of testing was a case study with a child diagnosed with a hyperfunctional voice disorder. Modifications to the game were made after each feasibility test. Errors in the video game performance were corrected, including the addition of a time stamp directory and game controller. Resonance voice exercises were modified to accommodate the gaming environment and the unique competitive situation, including speech rate, acoustic parameters, game speed, and point allocations. The development of video games for voice therapeutic purposes attempts to replicate the high levels of engagement and motivation attained with entertainment video games, stimulating a more productive means of learning while doing. Based on the information obtained, this case study found that a purely entertainment video game can be implemented as a voice therapeutic protocol. Copyright © 2012 The Voice Foundation. All rights reserved.

  14. Voice and choice by delegation.

    Science.gov (United States)

    van de Bovenkamp, Hester; Vollaard, Hans; Trappenburg, Margo; Grit, Kor

    2013-02-01

    In many Western countries, options for citizens to influence public services are increased to improve the quality of services and democratize decision making. Possibilities to influence are often cast into Albert Hirschman's taxonomy of exit (choice), voice, and loyalty. In this article we identify delegation as an important addition to this framework. Delegation gives individuals the chance to practice exit/choice or voice without all the hard work that is usually involved in these options. Empirical research shows that not many people use their individual options of exit and voice, which could lead to inequality between users and nonusers. We identify delegation as a possible solution to this problem, using Dutch health care as a case study to explore this option. Notwithstanding various advantages, we show that voice and choice by delegation also entail problems of inequality and representativeness.

  15. Voice Force tulekul / Tõnu Ojala

    Index Scriptorium Estoniae

    Ojala, Tõnu, 1969-

    2005-01-01

On an event of the jubilee season of the Tallinn University of Technology Academic Male Choir, which is celebrating its 60th anniversary: the a cappella pop-group festival Voice Force (concerts on 12 Nov. at the club Parlament and on 3 Dec. at the Russian Cultural Centre).

  16. The Christian voice in philosophy

    Directory of Open Access Journals (Sweden)

    Stuart Fowler

    1982-03-01

    Full Text Available In this paper the Rev. Stuart Fowler outlines a Christian voice in Philosophy and urges the Christian philosopher to investigate his position and his stance with integrity and honesty.

  17. Mobile user experience for voice services: A theoretical framework

    CSIR Research Space (South Africa)

    Botha, Adèle

    2012-02-01

    Full Text Available The purpose of this paper is to provide a “Mobile User Experience Framework for Voice services.” The rapid spread of mobile cellular technology within Africa has made it a prime vehicle for accessing services and content. The challenge remains...

  18. Digital voice recording: An efficient alternative for data collection

    Science.gov (United States)

    Mark A. Rumble; Thomas M. Juntti; Thomas W. Bonnot; Joshua J. Millspaugh

    2009-01-01

    Study designs are usually constrained by logistical and budgetary considerations that can affect the depth and breadth of the research. Little attention has been paid to increasing the efficiency of data recording. Digital voice recording and translation may offer improved efficiency of field personnel. Using this technology, we increased our data collection by 55...

  19. Image/Music/Voice: Song Dubbing in Hollywood Musicals.

    Science.gov (United States)

    Siefert, Marsha

    1995-01-01

    Uses the practice of song dubbing in the Hollywood film musical to explore the implications and consequences of the singing voice for imaging practices in the 1930s through 1960s. Discusses the ideological, technological, and socioeconomic basis for song dubbing. Discusses gender, race, and ethnicity patterns of image-sound practices. (SR)

  20. Speech Recognition on Mobile Devices

    DEFF Research Database (Denmark)

    Tan, Zheng-Hua; Lindberg, Børge

    2010-01-01

The enthusiasm of deploying automatic speech recognition (ASR) on mobile devices is driven both by remarkable advances in ASR technology and by the demand for efficient user interfaces on such devices as mobile phones and personal digital assistants (PDAs). This chapter presents an overview of ASR in the mobile context covering motivations, challenges, fundamental techniques and applications. Three ASR architectures are introduced: embedded speech recognition, distributed speech recognition and network speech recognition. Their pros and cons and implementation issues are discussed. Applications within......

  1. Memory for faces and voices varies as a function of sex and expressed emotion.

    Science.gov (United States)

    S Cortes, Diana; Laukka, Petri; Lindahl, Christina; Fischer, Håkan

    2017-01-01

    We investigated how memory for faces and voices (presented separately and in combination) varies as a function of sex and emotional expression (anger, disgust, fear, happiness, sadness, and neutral). At encoding, participants judged the expressed emotion of items in forced-choice tasks, followed by incidental Remember/Know recognition tasks. Results from 600 participants showed that accuracy (hits minus false alarms) was consistently higher for neutral compared to emotional items, whereas accuracy for specific emotions varied across the presentation modalities (i.e., faces, voices, and face-voice combinations). For the subjective sense of recollection ("remember" hits), neutral items received the highest hit rates only for faces, whereas for voices and face-voice combinations anger and fear expressions instead received the highest recollection rates. We also observed better accuracy for items by female expressers, and own-sex bias where female participants displayed memory advantage for female faces and face-voice combinations. Results further suggest that own-sex bias can be explained by recollection, rather than familiarity, rates. Overall, results show that memory for faces and voices may be influenced by the expressions that they carry, as well as by the sex of both items and participants. Emotion expressions may also enhance the subjective sense of recollection without enhancing memory accuracy.

  2. Recognizing famous voices: influence of stimulus duration and different types of retrieval cues.

    Science.gov (United States)

    Schweinberger, S R; Herholz, A; Sommer, W

    1997-04-01

The current investigation measured the effects of increasing stimulus duration on listeners' ability to recognize famous voices. In addition, the investigation studied the influence of different types of cues on the naming of voices that could not be named before. Participants were presented with samples of famous and unfamiliar voices and were asked to decide whether or not the samples were spoken by a famous person. The duration of each sample increased in seven steps from 0.25 s up to a maximum of 2 s. Voice recognition improved with stimulus duration according to a growth function. Gains were most rapid within the first second and less pronounced thereafter. When participants were unable to name a famous voice, they were cued with either a second voice sample, the occupation, or the initials of the celebrity. Initials were most effective in eliciting the name only when semantic information about the speaker had been accessed prior to cue presentation. Paralleling previous research on face naming, this may indicate that voice naming is contingent on previous activation of person-specific semantic information.

  3. Memory for faces and voices varies as a function of sex and expressed emotion.

    Directory of Open Access Journals (Sweden)

    Diana S Cortes

Full Text Available We investigated how memory for faces and voices (presented separately and in combination) varies as a function of sex and emotional expression (anger, disgust, fear, happiness, sadness, and neutral). At encoding, participants judged the expressed emotion of items in forced-choice tasks, followed by incidental Remember/Know recognition tasks. Results from 600 participants showed that accuracy (hits minus false alarms) was consistently higher for neutral compared to emotional items, whereas accuracy for specific emotions varied across the presentation modalities (i.e., faces, voices, and face-voice combinations). For the subjective sense of recollection ("remember" hits), neutral items received the highest hit rates only for faces, whereas for voices and face-voice combinations anger and fear expressions instead received the highest recollection rates. We also observed better accuracy for items by female expressers, and own-sex bias where female participants displayed memory advantage for female faces and face-voice combinations. Results further suggest that own-sex bias can be explained by recollection, rather than familiarity, rates. Overall, results show that memory for faces and voices may be influenced by the expressions that they carry, as well as by the sex of both items and participants. Emotion expressions may also enhance the subjective sense of recollection without enhancing memory accuracy.

  4. The voice of emotion across species: how do human listeners recognize animals' affective states?

    Directory of Open Access Journals (Sweden)

    Marina Scheumann

Full Text Available Voice-induced cross-taxa emotional recognition is the ability to understand the emotional state of another species based on its voice. In the past, induced affective states, experience-dependent higher cognitive processes or cross-taxa universal acoustic coding and processing mechanisms have been discussed to underlie this ability in humans. The present study sets out to distinguish the influence of familiarity and phylogeny on voice-induced cross-taxa emotional perception in humans. For the first time, two perspectives are taken into account: the self- (i.e. emotional valence induced in the listener) versus the others-perspective (i.e. correct recognition of the emotional valence of the recording context). Twenty-eight male participants listened to 192 vocalizations of four different species (human infant, dog, chimpanzee and tree shrew). Stimuli were recorded either in an agonistic (negative emotional valence) or affiliative (positive emotional valence) context. Participants rated the emotional valence of the stimuli adopting self- and others-perspective by using a 5-point version of the Self-Assessment Manikin (SAM). Familiarity was assessed based on subjective rating, objective labelling of the respective stimuli and interaction time with the respective species. Participants reliably recognized the emotional valence of human voices, whereas the results for animal voices were mixed. The correct classification of animal voices depended on the listener's familiarity with the species and the call type/recording context, whereas there was less influence of induced emotional states and phylogeny. Our results provide first evidence that explicit voice-induced cross-taxa emotional recognition in humans is shaped more by experience-dependent cognitive mechanisms than by induced affective states or cross-taxa universal acoustic coding and processing mechanisms.

  5. Practical Voice Recognition for the Aircraft Cockpit, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — This proposal responds to the urgent need for improved pilot interfaces in the modern aircraft cockpit. Recent advances in aircraft equipment bring tremendous...

  6. Touchless palmprint recognition systems

    CERN Document Server

    Genovese, Angelo; Scotti, Fabio

    2014-01-01

This book examines the context, motivation and current status of biometric systems based on the palmprint, with a specific focus on touchless and less-constrained systems. It covers new technologies in this rapidly evolving field and is one of the first comprehensive books on palmprint recognition systems. It discusses the research literature and the most relevant industrial applications of palmprint biometrics, including the low-cost solutions based on webcams. The steps of biometric recognition are described in detail, including acquisition setups, algorithms, and evaluation procedures. Const…

  7. Future Educators' Explaining Voices

    Science.gov (United States)

    de Oliveira, Janaina Minelli; Caballero, Pablo Buenestado; Camacho, Mar

    2013-01-01

Teacher education programs must offer pre-service students innovative technology-supported learning environments, guiding them in the revision of their preconceptions about literacy and technology. This paper presents a case study that uses podcasts to inquire into future educators' views on technology and the digital age. Results show future…

  8. Citizen voices performing public participation in science and environment communication

    CERN Document Server

    Carvalho, Anabela; Doyle, Julie

    2012-01-01

    How is "participation" ascribed meaning and practised in science and environment communication? And how are citizen voices articulated, invoked, heard, marginalised or silenced in those processes? Citizen Voices takes its starting point in the so-called dialogic or participatory turn in scientific and environmental governance in which practices claiming to be based on principles of participation, dialogue and citizen involvement have proliferated. The book goes beyond the buzzword of "participation" in order to give empirically rich, theoretically informed and critical accounts of how citizen participation is understood and enacted in mass mediation and public engagement practices. A diverse series of studies across Europe and the US are presented, providing readers with empirical insights into the articulation of citizen voices in different national, cultural and institutional contexts. Building bridges across media and communication studies, science and technology studies, environmental studies and urban pl...

  9. Assessment voice synthesizers for reading in digital books

    Directory of Open Access Journals (Sweden)

    Sérvulo Fernandes da Silva Neto

    2013-07-01

Full Text Available Digital accessibility provides ways of accessing information in digital media that help people with different types of disabilities interact better with the computer, regardless of their limitations. Among these tools are voice synthesizers, which are supposed to simplify access to any recorded knowledge through digital technologies. However, such tools originally emerged in foreign-language countries, which leads to the following research problem: are voice synthesizers appropriate for reading digital books in Portuguese? The objective of this study was to analyze and classify different voice synthesizer tools in combination with digital book reader software to support accessibility to e-books in Portuguese. A literature review identified the voice synthesizer applications that composed the sample analyzed in this work, and a simplified version of the Multiple Criteria Decision Support method (MMDA) was used to assess them. The study considered 12 e-book readers and 11 voice synthesizers, tested with six e-book formats (EPUB, PDF, HTML, DOC, TXT, and Mobi). According to the results, the software Virtual Vision achieved the highest score. Regarding formats, PDF obtained the best score when the results of the three synthesizers were summed. In the universe studied, many synthesizers simply could not be used because they did not support the Portuguese language.
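
As a toy illustration of the kind of simplified multi-criteria scoring the study applies (a generic weighted-sum sketch, not the authors' exact MMDA procedure; the criteria, weights and the second tool below are illustrative placeholders):

```python
# Hypothetical weighted-sum scoring of voice synthesizers against a few criteria.
# Only "Virtual Vision" is named in the study; every other value here is made up for illustration.
criteria_weights = {'portuguese_support': 0.40, 'format_coverage': 0.35, 'ease_of_use': 0.25}

synthesizers = {
    'Virtual Vision': {'portuguese_support': 5, 'format_coverage': 4, 'ease_of_use': 4},
    'Synthesizer B':  {'portuguese_support': 3, 'format_coverage': 5, 'ease_of_use': 3},
}

def score(ratings):
    # Weighted sum of the per-criterion ratings (higher is better).
    return sum(criteria_weights[c] * r for c, r in ratings.items())

for name in sorted(synthesizers, key=lambda n: score(synthesizers[n]), reverse=True):
    print(f'{name}: {score(synthesizers[name]):.2f}')
```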

  10. Understanding the 'Anorexic Voice' in Anorexia Nervosa.

    Science.gov (United States)

    Pugh, Matthew; Waller, Glenn

    2017-05-01

    In common with individuals experiencing a number of disorders, people with anorexia nervosa report experiencing an internal 'voice'. The anorexic voice comments on the individual's eating, weight and shape and instructs the individual to restrict or compensate. However, the core characteristics of the anorexic voice are not known. This study aimed to develop a parsimonious model of the voice characteristics that are related to key features of eating disorder pathology and to determine whether patients with anorexia nervosa fall into groups with different voice experiences. The participants were 49 women with full diagnoses of anorexia nervosa. Each completed validated measures of the power and nature of their voice experience and of their responses to the voice. Different voice characteristics were associated with current body mass index, duration of disorder and eating cognitions. Two subgroups emerged, with 'weaker' and 'stronger' voice experiences. Those with stronger voices were characterized by having more negative eating attitudes, more severe compensatory behaviours, a longer duration of illness and a greater likelihood of having the binge-purge subtype of anorexia nervosa. The findings indicate that the anorexic voice is an important element of the psychopathology of anorexia nervosa. Addressing the anorexic voice might be helpful in enhancing outcomes of treatments for anorexia nervosa, but that conclusion might apply only to patients with more severe eating psychopathology. Copyright © 2016 John Wiley & Sons, Ltd. Experiences of an internal 'anorexic voice' are common in anorexia nervosa. Clinicians should consider the role of the voice when formulating eating pathology in anorexia nervosa, including how individuals perceive and relate to that voice. Addressing the voice may be beneficial, particularly in more severe and enduring forms of anorexia nervosa. When working with the voice, clinicians should aim to address both the content of the voice and how

  11. Anti-voice adaptation suggests prototype-based coding of voice identity

    Directory of Open Access Journals (Sweden)

    Marianne eLatinus

    2011-07-01

Full Text Available We used perceptual aftereffects induced by adaptation with anti-voice stimuli to investigate voice identity representations. Participants learned a set of voices then were tested on a voice identification task with vowel stimuli morphed between identities, after different conditions of adaptation. In Experiment 1, participants chose the identity opposite to the adapting anti-voice significantly more often than the other two identities (e.g., after being adapted to anti-A, they identified the average voice as A). In Experiment 2, participants showed a bias for identities opposite to the adaptor specifically for anti-voice, but not for non-anti-voice adaptors. These results are strikingly similar to adaptation aftereffects observed for facial identity. They are compatible with a representation of individual voice identities in a multidimensional perceptual voice space referenced on a voice prototype.

  12. Optical voice encryption based on digital holography.

    Science.gov (United States)

    Rajput, Sudheesh K; Matoba, Osamu

    2017-11-15

We propose an optical voice encryption scheme based on digital holography (DH). An off-axis DH is employed to acquire voice information by obtaining the phase retardation occurring in the object wave due to sound wave propagation. The acquired hologram, including voice information, is encrypted using optical image encryption. DH reconstruction and decryption with all the correct parameters can retrieve the original voice. The scheme has the capability to record the human voice in holograms and encrypt it directly. These aspects make the scheme suitable for other security applications and help to use the voice as a potential security tool. We present experimental results and partial simulation results.

  13. Indonesian Automatic Speech Recognition For Command Speech Controller Multimedia Player

    Directory of Open Access Journals (Sweden)

    Vivien Arief Wardhany

    2014-12-01

Full Text Available The purpose of multimedia device development is control through voice. Nowadays, voice can typically be recognized only in English. To overcome this issue, recognition is performed using an Indonesian language model, acoustic model, and dictionary. The Automatic Speech Recognizer is built using the CMU Sphinx engine, with the English language database modified for Indonesian, and XBMC is used as the multimedia player. The experiment uses 10 volunteers testing items based on 7 commands. The volunteers are divided by gender, 5 male and 5 female. Ten samples are taken for each command, with each volunteer performing 10 test commands and trying all 7 commands provided. Based on the percentage classification table, the word "Kanan" was recognized most often, with a percentage of 83%, while "pilih" was the lowest. The word misclassified most often was "kembali", at 67%, while "kanan" was the lowest. Looking at the recognition rate (RR) for male speakers, several commands such as "Kembali", "Utama", "Atas" and "Bawah" have a low recognition rate. In particular, "kembali" could not be recognized as a command in female voices, and in male voices that command reached only 4% RR; this is because the command has no similar-sounding English word near "kembali", so the system does not recognize it. Also, the command "Pilih" reached 80% RR with female voices but only 4% RR with male voices. This is mostly because of the different voice characteristics of adult males and females: male voices have lower frequencies (from 85 to 180 Hz) than female voices (165 to 255 Hz). The results of the experiment showed that each speaker had a different recognition rate, caused by differences in tone, pronunciation, and speed of speech. Further work needs to be done to improve the accuracy of the Indonesian Automatic Speech Recognition system.
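
The decoding pipeline the record describes (CMU Sphinx with a swapped-in acoustic model, language model and pronunciation dictionary) can be sketched as follows, assuming the PocketSphinx Python bindings; the Indonesian model paths and the WAV file are placeholders, not the authors' files, and the exact Decoder API differs slightly between package versions.

```python
# Minimal offline decoding sketch with PocketSphinx and a custom (assumed) Indonesian model.
import wave
from pocketsphinx import Decoder

config = Decoder.default_config()
config.set_string('-hmm', 'model/id-id')       # acoustic model directory (placeholder)
config.set_string('-lm', 'model/id.lm')        # language model (placeholder)
config.set_string('-dict', 'model/id.dict')    # pronunciation dictionary (placeholder)
decoder = Decoder(config)

with wave.open('perintah.wav', 'rb') as wav:   # 16 kHz, 16-bit, mono command recording
    audio = wav.readframes(wav.getnframes())

decoder.start_utt()
decoder.process_raw(audio, False, True)        # full utterance in a single call
decoder.end_utt()

hyp = decoder.hyp()
print(hyp.hypstr if hyp else 'no hypothesis')  # e.g. "kanan"
```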

  14. Towards Real-Time Speech Emotion Recognition for Affective E-Learning

    Science.gov (United States)

    Bahreini, Kiavash; Nadolski, Rob; Westera, Wim

    2016-01-01

    This paper presents the voice emotion recognition part of the FILTWAM framework for real-time emotion recognition in affective e-learning settings. FILTWAM (Framework for Improving Learning Through Webcams And Microphones) intends to offer timely and appropriate online feedback based upon learner's vocal intonations and facial expressions in order…

  15. Highly flexible self-powered sensors based on printed circuit board technology for human motion detection and gesture recognition.

    Science.gov (United States)

    Fuh, Yiin-Kuen; Ho, Hsi-Chun

    2016-03-04

    In this paper, we demonstrate a new integration of printed circuit board (PCB) technology-based self-powered sensors (PSSs) and direct-write, near-field electrospinning (NFES) with polyvinylidene fluoride (PVDF) micro/nano fibers (MNFs) as source materials. Integration with PCB technology is highly desirable for affordable mass production. In addition, we systematically investigate the effects of electrodes with intervals in the range of 0.15 mm to 0.40 mm on the resultant PSS output voltage and current. The results show that at a strain of 0.5% and 5 Hz, a PSS with a gap interval 0.15 mm produces a maximum output voltage of 3 V and a maximum output current of 220 nA. Under the same dimensional constraints, the MNFs are massively connected in series (via accumulation of continuous MNFs across the gaps ) and in parallel (via accumulation of parallel MNFs on the same gap) simultaneously. Finally, encapsulation in a flexible polymer with different interval electrodes demonstrated that electrical superposition can be realized by connecting MNFs collectively and effectively in serial/parallel patterns to achieve a high current and high voltage output, respectively. Further improvement in PSSs based on the effect of cooperativity was experimentally realized by rolling-up the device into a cylindrical shape, resulting in a 130% increase in power output due to the cooperative effect. We assembled the piezoelectric MNF sensors on gloves, bandages and stockings to fabricate devices that can detect different types of human motion, including finger motion and various flexing and extensions of an ankle. The firmly glued PSSs were tested on the glove and ankle respectively to detect and harvest the various movements and the output voltage was recorded as ∼1.5 V under jumping movement (one PSS) and ∼4.5 V for the clenched fist with five fingers bent concurrently (five PSSs). This research shows that piezoelectric MNFs not only have a huge impact on harvesting various external

  16. Highly flexible self-powered sensors based on printed circuit board technology for human motion detection and gesture recognition

    Science.gov (United States)

    Fuh, Yiin-Kuen; Ho, Hsi-Chun

    2016-03-01

    In this paper, we demonstrate a new integration of printed circuit board (PCB) technology-based self-powered sensors (PSSs) and direct-write, near-field electrospinning (NFES) with polyvinylidene fluoride (PVDF) micro/nano fibers (MNFs) as source materials. Integration with PCB technology is highly desirable for affordable mass production. In addition, we systematically investigate the effects of electrodes with intervals in the range of 0.15 mm to 0.40 mm on the resultant PSS output voltage and current. The results show that at a strain of 0.5% and 5 Hz, a PSS with a gap interval 0.15 mm produces a maximum output voltage of 3 V and a maximum output current of 220 nA. Under the same dimensional constraints, the MNFs are massively connected in series (via accumulation of continuous MNFs across the gaps ) and in parallel (via accumulation of parallel MNFs on the same gap) simultaneously. Finally, encapsulation in a flexible polymer with different interval electrodes demonstrated that electrical superposition can be realized by connecting MNFs collectively and effectively in serial/parallel patterns to achieve a high current and high voltage output, respectively. Further improvement in PSSs based on the effect of cooperativity was experimentally realized by rolling-up the device into a cylindrical shape, resulting in a 130% increase in power output due to the cooperative effect. We assembled the piezoelectric MNF sensors on gloves, bandages and stockings to fabricate devices that can detect different types of human motion, including finger motion and various flexing and extensions of an ankle. The firmly glued PSSs were tested on the glove and ankle respectively to detect and harvest the various movements and the output voltage was recorded as ∼1.5 V under jumping movement (one PSS) and ∼4.5 V for the clenched fist with five fingers bent concurrently (five PSSs). This research shows that piezoelectric MNFs not only have a huge impact on harvesting various external

  17. A retrospective study of New Zealand case law involving assisted reproduction technology and the social recognition of 'new' family.

    Science.gov (United States)

    Legge, M; Fitzgerald, R; Frank, N

    2007-01-01

    The New Zealand Human Assisted Reproductive Technology (HART) Act became law in 2004. In this article, we provide a retrospective analysis of New Zealand case law from September 1990 to March 2004, leading up to the creation of the HART Act. We examine the new understandings of parenting (developed through the routine use of ART in New Zealand) which the case law attempted to test. We examine these concepts against the previous understandings of family enshrined in the pre-existing legislation, which formed the basis for judicial rulings in the various cases to which we refer. In conclusion, we provide a brief summary of the 2004 HART legislation and draw comparisons between the old and new legislative and bureaucratic frameworks that define and support New Zealand family structure. We suggest that a change in cultural backdrop is occurring from the traditional western ideology of the nuclear family towards the traditional Maori concept of family formation, which includes a well-accepted traditional practice of guardianship and a more open and extended family structure. This 'new' structure reflects the contemporary lived experience of family kinship in western societies as individualized and open to choice.

  18. Mobile Technologies in Schools: The Student Voice

    Science.gov (United States)

    Hodge, Emma-Leigh; Robertson, Neville; Sargisson, Rebecca J.

    2017-01-01

    Intermediate and high school students spend a large amount of time using mobile devices (Lauricella, Cingel, Blackwell, Wartella, & Conway, 2014), and such devices are increasingly being integrated into our school system. We conducted a series of student-led focus groups, with this early adolescent cohort, in order to better understand their…

  19. Similar representations of emotions across faces and voices.

    Science.gov (United States)

    Kuhn, Lisa Katharina; Wydell, Taeko; Lavan, Nadine; McGettigan, Carolyn; Garrido, Lúcia

    2017-09-01

    [Correction Notice: An Erratum for this article was reported in Vol 17(6) of Emotion (see record 2017-18585-001). In the article, the copyright attribution was incorrectly listed and the Creative Commons CC-BY license disclaimer was incorrectly omitted from the author note. The correct copyright is "© 2017 The Author(s)" and the omitted disclaimer is below. All versions of this article have been corrected. "This article has been published under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Copyright for this article is retained by the author(s). Author(s) grant(s) the American Psychological Association the exclusive right to publish the article and identify itself as the original publisher."] Emotions are a vital component of social communication, carried across a range of modalities and via different perceptual signals such as specific muscle contractions in the face and in the upper respiratory system. Previous studies have found that emotion recognition impairments after brain damage depend on the modality of presentation: recognition from faces may be impaired whereas recognition from voices remains preserved, and vice versa. On the other hand, there is also evidence for shared neural activation during emotion processing in both modalities. In a behavioral study, we investigated whether there are shared representations in the recognition of emotions from faces and voices. We used a within-subjects design in which participants rated the intensity of facial expressions and nonverbal vocalizations for each of the 6 basic emotion labels. For each participant and each modality, we then computed a representation matrix with the intensity ratings of each emotion. These matrices allowed us to examine the patterns of confusions between emotions and to characterize the representations

  20. Optical gesture sensing and depth mapping technologies for head-mounted displays: an overview

    Science.gov (United States)

    Kress, Bernard; Lee, Johnny

    2013-05-01

Head-mounted displays (HMDs), and especially see-through HMDs, have gained renewed interest recently, for the first time outside the traditional military and defense realm, as several high-profile consumer electronics companies prepare products for market. Consumer electronics HMDs have quite different requirements and constraints than their military counterparts. Voice commands are the de facto interface for such devices, but when voice recognition does not work (no connection to the cloud, for example), trackpad and gesture-sensing technologies have to be used to communicate information to the device. In this paper we review the various technologies developed today for integrating optical gesture sensing in a small footprint, as well as the related 3D depth-mapping sensors.

  1. Quick Statistics about Voice, Speech, and Language

    Science.gov (United States)

Quick statistics about voice, speech, and language. Cited sources include a National Center for Health Statistics report (… no 205. Hyattsville, MD: National Center for Health Statistics. 2015) and Hoffman HJ, Li C-M, Losonczy K, …

  2. IgE recognition of chimeric isoforms of the honeybee (Apis mellifera) venom allergen Api m 10 evaluated by protein array technology.

    Science.gov (United States)

    Van Vaerenbergh, Matthias; De Smet, Lina; Rafei-Shamsabadi, David; Blank, Simon; Spillner, Edzard; Ebo, Didier G; Devreese, Bart; Jakob, Thilo; de Graaf, Dirk C

    2015-02-01

    Api m 10 has recently been established as novel major allergen that is recognized by more than 60% of honeybee venom (HBV) allergic patients. Previous studies suggest Api m 10 protein heterogeneity which may have implications for diagnosis and immunotherapy of HBV allergy. In the present study, RT-PCR revealed the expression of at least nine additional Api m 10 transcript isoforms by the venom glands. Two distinct mechanisms are responsible for the generation of these isoforms: while the previously known variant 2 is produced by an alternative splicing event, novel identified isoforms are intragenic chimeric transcripts. To the best of our knowledge, this is the first report of the identification of chimeric transcripts generated by the honeybee. By a retrospective proteomic analysis we found evidence for the presence of several of these isoforms in the venom proteome. Additionally, we analyzed IgE reactivity to different isoforms by protein array technology using sera from HBV allergic patients, which revealed that IgE recognition of Api m 10 is both isoform- and patient-specific. While it was previously demonstrated that the majority of HBV allergic patients display IgE reactivity to variant 2, our study also shows that some patients lacking IgE antibodies for variant 2 display IgE reactivity to two of the novel identified Api m 10 variants, i.e. variants 3 and 4. Copyright © 2014 Elsevier Ltd. All rights reserved.

  3. Identification of strong earthquake ground motion by using pattern recognition

    International Nuclear Information System (INIS)

    Suzuki, Kohei; Tozawa, Shoji; Temmyo, Yoshiharu.

    1983-01-01

Many researchers have proposed methods for adequately capturing the technological features of the complex waveforms of earthquake ground motion and using them as input to structural systems, but no unified method for generating artificial earthquake waves for the aseismic design of nuclear facilities has been established. In this research, earthquake ground motion was treated as an irregular process with unsteady amplitude and frequency, and the running power spectral density was expressed as a dark-and-light image on a plane with orthogonal time and frequency axes. A method of classifying this image into a number of technologically important categories by pattern recognition was proposed. The method is based on the compound similarity concept from image technology, entirely different from voice diagnosis, and its identification results can be quantitatively evaluated by correlation analysis of spatial images. Next, a standard pattern model of the simulated running power spectral density corresponding to the representative classification categories was proposed. Finally, a method of generating unsteady simulated earthquake motion was shown. (Kako, I.)
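
As an illustration of the time-frequency "image" the record refers to (a sketch only, not the authors' compound similarity classification), the running power spectral density of an acceleration record can be computed with standard tools; the sampling rate and input file below are assumptions.

```python
# Running power spectral density of a ground-motion record as a time-frequency "image".
import numpy as np
from scipy.signal import spectrogram

fs = 100.0                                   # assumed sampling rate of the accelerogram, Hz
accel = np.loadtxt('record.txt')             # hypothetical 1-D acceleration record

# Rows are frequencies, columns are time windows of the running PSD.
freqs, times, Sxx = spectrogram(accel, fs=fs, nperseg=256, noverlap=192)

image = 10.0 * np.log10(Sxx + 1e-12)         # dB scale: the "dark and light" image
print(image.shape)                           # (n_frequencies, n_time_windows)
```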

  4. English Voicing in Dimensional Theory*

    Science.gov (United States)

    Iverson, Gregory K.; Ahn, Sang-Cheol

    2007-01-01

    Assuming a framework of privative features, this paper interprets two apparently disparate phenomena in English phonology as structurally related: the lexically specific voicing of fricatives in plural nouns like wives or thieves and the prosodically governed “flapping” of medial /t/ (and /d/) in North American varieties, which we claim is itself not a rule per se, but rather a consequence of the laryngeal weakening of fortis /t/ in interaction with speech-rate determined segmental abbreviation. Taking as our point of departure the Dimensional Theory of laryngeal representation developed by Avery & Idsardi (2001), along with their assumption that English marks voiceless obstruents but not voiced ones (Iverson & Salmons 1995), we find that an unexpected connection between fricative voicing and coronal flapping emerges from the interplay of familiar phonemic and phonetic factors in the phonological system. PMID:18496590

  5. Voices Falling Through the Air

    Directory of Open Access Journals (Sweden)

    Paul Elliman

    2012-11-01

Full Text Available Where am I? Or as the young boy in Jules Verne's Journey to the Centre of the Earth calls back to his distant-voiced companions: 'Lost… in the most intense darkness.' 'Then I understood it,' says the boy, Axel, 'To make them hear me, all I had to do was to speak with my mouth close to the wall, which would serve to conduct my voice, as the wire conducts the electric fluid' (Verne 1864). By timing their calls, the group of explorers work out that Axel is separated from them by a distance of four miles, held in a cavernous vertical gallery of smooth rock. Feeling his way down towards the others, the boy ends up falling, along with his voice, through the space. Losing consciousness, he seems to give himself up to the space...

  6. Speaker's voice as a memory cue.

    Science.gov (United States)

    Campeanu, Sandra; Craik, Fergus I M; Alain, Claude

    2015-02-01

Speaker's voice occupies a central role as the cornerstone of auditory social interaction. Here, we review the evidence suggesting that speaker's voice constitutes an integral context cue in auditory memory. Investigation into the nature of voice representation as a memory cue is essential to understanding auditory memory and the neural correlates which underlie it. Evidence from behavioral and electrophysiological studies suggests that while specific voice reinstatement (i.e., same speaker) often appears to facilitate word memory even without attention to voice at study, the presence of a partial benefit of similar voices between study and test is less clear. In terms of explicit memory experiments utilizing unfamiliar voices, encoding methods appear to play a pivotal role. Voice congruency effects have been found when voice is specifically attended at study (i.e., when relatively shallow, perceptual encoding takes place). These behavioral findings coincide with neural indices of memory performance such as the parietal old/new recollection effect and the late right frontal effect. The former distinguishes between correctly identified old words and correctly identified new words, and reflects voice congruency only when voice is attended at study. Characterization of the latter likely depends upon voice memory, rather than word memory. There is also evidence to suggest that voice effects can be found in implicit memory paradigms. However, the presence of voice effects appears to depend greatly on the task employed. Using a word identification task, perceptual similarity between study and test conditions is, like for explicit memory tests, crucial. In addition, the type of noise employed appears to have a differential effect. While voice effects have been observed when white noise is used at both study and test, using multi-talker babble does not confer the same results. In terms of neuroimaging research modulations, characterization of an implicit memory effect

  7. Permanent Quadriplegia Following Replacement of Voice Prosthesis.

    Science.gov (United States)

    Ozturk, Kayhan; Erdur, Omer; Kibar, Ertugrul

    2016-11-01

The authors present a patient with quadriplegia caused by a cervical spine abscess following voice prosthesis replacement; this is the first reported case of permanent quadriplegia caused by voice prosthesis replacement. The authors emphasize that life-threatening complications may be encountered during voice prosthesis replacement. Care should be taken during the procedure, and if problems are encountered, patients must be followed closely.

  8. I like my voice better: self-enhancement bias in perceptions of voice attractiveness.

    Science.gov (United States)

    Hughes, Susan M; Harrison, Marissa A

    2013-01-01

    Previous research shows that the human voice can communicate a wealth of nonsemantic information; preferences for voices can predict health, fertility, and genetic quality of the speaker, and people often use voice attractiveness, in particular, to make these assessments of others. But it is not known what we think of the attractiveness of our own voices as others hear them. In this study eighty men and women rated the attractiveness of an array of voice recordings of different individuals and were not told that their own recorded voices were included in the presentation. Results showed that participants rated their own voices as sounding more attractive than others had rated their voices, and participants also rated their own voices as sounding more attractive than they had rated the voices of others. These findings suggest that people may engage in vocal implicit egotism, a form of self-enhancement.

  9. Analisis dan Perancangan Sistem Interactive Voice Response (IVR Berbasis Openvxi Menggunakan Asterisk Pada Hotel Sahid Jaya

    Directory of Open Access Journals (Sweden)

    Johan Muliadi Kerta

    2010-12-01

Full Text Available The purpose of this study is to provide a deep understanding of Interactive Voice Response (IVR) technology in an OpenVXI-based network using Asterisk, to help the Sahid Jaya hotel streamline communication with guests and improve service and employee effectiveness. The methodology combines analysis, through surveys, observation and interviews with the company, with a design phase in which the system is designed from the needs identified in that analysis. The results show that an OpenVXI-based IVR communication system can be applied to the existing VoIP network without disrupting it, and that several existing services can be supported by the system. The conclusion is that using IVR in a VoIP network is the best solution for the problems and needs of the hospitality business, one of which is the limited information guests can obtain without an operator, and for supporting service improvement.

  10. Voices Not Heard: Voice-Use Profiles of Elementary Music Teachers, the Effects of Voice Amplification on Vocal Load, and Perceptions of Issues Surrounding Voice Use

    Science.gov (United States)

    Morrow, Sharon L.

    2009-01-01

    Teachers represent the largest group of occupational voice users and have voice-related problems at a rate of over twice that found in the general population. Among teachers, music teachers are roughly four times more likely than classroom teachers to develop voice-related problems. Although it has been established that music teachers use their…

  11. Multi-thread Parallel Speech Recognition for Mobile Applications

    Directory of Open Access Journals (Sweden)

    LOJKA Martin

    2014-05-01

Full Text Available In this paper, a server-based solution for a multi-thread, large-vocabulary automatic speech recognition engine is described, along with practical application examples for Android OS and HTML5. The basic idea was to make speech recognition available to the full variety of applications for computers and especially for mobile devices. The speech recognition engine should be independent of commercial products and services (where the dictionary cannot be modified). Using third-party services can also be a security and privacy problem in specific applications, when unsecured audio data cannot be sent to uncontrolled environments (voice data transferred to servers around the globe). Using our experience with speech recognition applications, we have constructed a multi-thread, server-based speech recognition solution with a simple application programming interface (API) to a recognition engine that can be modified to the specific needs of a particular application.
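
The record describes a server-hosted engine behind a simple application interface; a client of such a service might look like the sketch below, where the endpoint URL and the JSON response field are assumptions rather than the authors' actual API.

```python
# Hypothetical client posting a recorded utterance to a self-hosted recognition server.
import requests

SERVER_URL = 'http://asr.example.local:8080/recognize'    # assumed endpoint

with open('utterance.wav', 'rb') as f:
    audio_bytes = f.read()

response = requests.post(
    SERVER_URL,
    data=audio_bytes,
    headers={'Content-Type': 'audio/wav'},
    timeout=10,
)
response.raise_for_status()
print(response.json().get('transcript', ''))               # assumed response: {"transcript": "..."}
```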

  12. Interventions for preventing voice disorders in adults.

    Science.gov (United States)

    Ruotsalainen, J H; Sellman, J; Lehto, L; Jauhiainen, M; Verbeek, J H

    2007-10-17

Poor voice quality due to a voice disorder can lead to a reduced quality of life. In occupations where voice use is substantial it can lead to periods of absence from work. To evaluate the effectiveness of interventions to prevent voice disorders in adults. We searched MEDLINE (PubMed, 1950 to 2006), EMBASE (1974 to 2006), CENTRAL (The Cochrane Library, Issue 2 2006), CINAHL (1983 to 2006), PsychINFO (1967 to 2006), Science Citation Index (1986 to 2006) and the Occupational Health databases OSH-ROM (to 2006). The date of the last search was 05/04/06. Randomised controlled clinical trials (RCTs) of interventions evaluating the effectiveness of treatments to prevent voice disorders in adults. For work-directed interventions interrupted time series and prospective cohort studies were also eligible. Two authors independently extracted data and assessed trial quality. Meta-analysis was performed where appropriate. We identified two randomised controlled trials including a total of 53 participants in intervention groups and 43 controls. One study was conducted with teachers and the other with student teachers. Both trials were of poor quality. Interventions were grouped into 1) direct voice training, 2) indirect voice training and 3) direct and indirect voice training combined. 1) Direct voice training: One study did not find a significant decrease of the Voice Handicap Index for direct voice training compared to no intervention. 2) Indirect voice training: One study did not find a significant decrease of the Voice Handicap Index for indirect voice training when compared to no intervention. 3) Direct and indirect voice training combined: One study did not find a decrease of the Voice Handicap Index for direct and indirect voice training combined when compared to no intervention. The same study did however find an improvement in maximum phonation time (Mean Difference -3.18 sec; 95% CI -4.43 to -1.93) for direct and indirect voice training combined when compared to no

  13. Objective Voice Parameters in Colombian School Workers with Healthy Voices

    Directory of Open Access Journals (Sweden)

    Lady Catherine Cantor Cutiva

    2015-09-01

Full Text Available Objectives: To characterize the objective voice parameters among school workers, and to identify associated factors of three objective voice parameters, namely fundamental frequency, sound pressure level and maximum phonation time. Materials and methods: We conducted a cross-sectional study among 116 Colombian teachers and 20 Colombian non-teachers. After signing the informed consent form, participants filled out a questionnaire. Then, a voice sample was recorded and evaluated perceptually by a speech therapist and by objective voice analysis with the Praat software. Short-term environmental measurements of sound level, temperature, humidity, and reverberation time were conducted during visits at the workplaces, such as classrooms and offices. Linear regression analysis was used to determine associations between individual and work-related factors and objective voice parameters. Results: Compared with men, women had higher fundamental frequency (201 Hz for teachers and 209 Hz for non-teachers vs. 120 Hz for teachers and 127 Hz for non-teachers) and sound pressure level (82 dB vs. 80 dB), and shorter maximum phonation time (around 14 seconds vs. around 16 seconds). Female teachers younger than 50 years of age evidenced a significant tendency to speak with lower fundamental frequency and shorter maximum phonation time compared with female teachers older than 50 years of age. Female teachers had significantly higher fundamental frequency (66 Hz), higher sound pressure level (2 dB) and shorter phonation time (2 seconds) than male teachers. Conclusion: Female teachers younger than 50 years of age had significantly lower F0 and shorter maximum phonation time compared with those older than 50 years of age. The multivariate analysis showed that gender was a much more important determinant of variations in F0, SPL and MPT than age and teaching occupation. Objectively measured temperature also contributed to the changes in SPL among school workers.
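
The kind of objective measurement reported here can be reproduced with Praat scripting or, as sketched below, with the Parselmouth Python interface to Praat; the file name is a placeholder, Praat's intensity track (in uncalibrated dB) stands in for sound pressure level, and maximum phonation time is simply the duration of the longest sustained vowel, so it needs no dedicated call.

```python
# Mean F0 and mean intensity of a voice sample, assuming the Parselmouth interface to Praat.
import parselmouth

snd = parselmouth.Sound('voice_sample.wav')       # placeholder recording

pitch = snd.to_pitch()                            # fundamental frequency track
f0 = pitch.selected_array['frequency']
f0 = f0[f0 > 0]                                   # drop unvoiced frames (reported as 0 Hz)
print('mean F0: %.1f Hz' % f0.mean())

intensity = snd.to_intensity()                    # intensity track in dB
print('mean intensity: %.1f dB' % intensity.values.mean())
```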

  14. Voice Over Internet Protocol (VoIP) in a Control Center Environment

    Science.gov (United States)

    Pirani, Joseph; Calvelage, Steven

    2010-01-01

The technology of transmitting voice over data networks has been available for over 10 years. Mass-market VoIP services for consumers to make and receive standard telephone calls over broadband Internet networks have grown in the last 5 years. While operational costs are lower with VoIP implementations than with time division multiplexing (TDM) based voice switches, is it still advantageous to convert a mission control center's voice system to this newer technology? Marshall Space Flight Center (MSFC) Huntsville Operations Support Center (HOSC) has converted its mission voice services to a commercial product that utilizes VoIP technology. Results from this testing, design, and installation have shown unique considerations that must be addressed before operational use. There are many factors to consider in a control center voice design. Technology advantages and disadvantages were investigated as they relate to cost. There were integration concerns that could lead to complex failure scenarios, but also simpler integration with the mission infrastructure. MSFC HOSC will benefit from this voice conversion through lower product replacement costs, lower operations costs and a more integrated mission services environment.

  15. Can blind persons accurately assess body size from the voice?

    Science.gov (United States)

    Pisanski, Katarzyna; Oleszkiewicz, Anna; Sorokowska, Agnieszka

    2016-04-01

    Vocal tract resonances provide reliable information about a speaker's body size that human listeners use for biosocial judgements as well as speech recognition. Although humans can accurately assess men's relative body size from the voice alone, how this ability is acquired remains unknown. In this study, we test the prediction that accurate voice-based size estimation is possible without prior audiovisual experience linking low frequencies to large bodies. Ninety-one healthy congenitally or early blind, late blind and sighted adults (aged 20-65) participated in the study. On the basis of vowel sounds alone, participants assessed the relative body sizes of male pairs of varying heights. Accuracy of voice-based body size assessments significantly exceeded chance and did not differ among participants who were sighted, or congenitally blind or who had lost their sight later in life. Accuracy increased significantly with relative differences in physical height between men, suggesting that both blind and sighted participants used reliable vocal cues to size (i.e. vocal tract resonances). Our findings demonstrate that prior visual experience is not necessary for accurate body size estimation. This capacity, integral to both nonverbal communication and speech perception, may be present at birth or may generalize from broader cross-modal correspondences. © 2016 The Author(s).

  16. Application Of t-Cherry Junction Trees in Pattern Recognition

    Directory of Open Access Journals (Sweden)

    Edith Kovacs

    2010-06-01

Full Text Available Pattern recognition aims to classify data (patterns) based either on a priori knowledge or on statistical information extracted from the data. In this paper we concentrate on statistical pattern recognition using a new probabilistic approach which makes it possible to select the so-called 'informative' features. We develop a pattern recognition algorithm based on the conditional independence structure underlying the statistical data. Our method was successfully applied to a real problem of recognizing Parkinson's disease on the basis of voice disorders.

  17. Work-related voice disorder

    Directory of Open Access Journals (Sweden)

    Paulo Eduardo Przysiezny

    2015-04-01

Full Text Available INTRODUCTION: Dysphonia is the main symptom of the disorders of oral communication. However, voice disorders also present with other symptoms such as difficulty in maintaining the voice (asthenia), vocal fatigue, variation in habitual vocal fundamental frequency, hoarseness, lack of vocal volume and projection, loss of vocal efficiency, and weakness when speaking. There are several proposals for the etiologic classification of dysphonia: functional, organofunctional, organic, and work-related voice disorder (WRVD). OBJECTIVE: To conduct a literature review on WRVD and on the current Brazilian labor legislation. METHODS: This was a review article with bibliographical research conducted on the PubMed and Bireme databases, using the terms "work-related voice disorder", "occupational dysphonia", "dysphonia and labor legislation", and a review of relevant labor and social security laws. CONCLUSION: WRVD is a situation that is frequently listed as a reason for work absenteeism, functional rehabilitation, or prolonged absence from work. Currently, forensic physicians have no comparative parameters to help with the analysis of vocal disorders. In certain situations WRVD may cause work disability. This disorder may be labor-related, or be an adjuvant factor to work-related diseases.

  18. Playful Interaction with Voice Sensing Modular Robots

    DEFF Research Database (Denmark)

    Heesche, Bjarke; MacDonald, Ewen; Fogh, Rune

    2013-01-01

This paper describes a voice sensor, suitable for modular robotic systems, which estimates the energy and fundamental frequency, F0, of the user’s voice. Through a number of example applications and tests with children, we observe how the voice sensor facilitates playful interaction between children and two different robot configurations. In future work, we will investigate if such a system can motivate children to improve voice control and explore how to extend the sensor to detect emotions in the user’s voice.

  19. Facial recognition in education system

    Science.gov (United States)

    Krithika, L. B.; Venkatesh, K.; Rathore, S.; Kumar, M. Harish

    2017-11-01

Human beings rely extensively on emotions to convey messages and resolve them. Emotion detection and face recognition can provide an interface between individuals and technologies. The most successful application of recognition analysis is the recognition of faces. Many different techniques have been used to recognize facial expressions and to handle emotion detection across varying poses. In this paper, we propose an efficient method for recognizing facial expressions by tracking face points and distances. It can automatically identify observed face movements and facial expressions in an image, capturing different aspects of emotion and facial expression.

  20. VOICE QUALITY BEFORE AND AFTER THYROIDECTOMY

    Directory of Open Access Journals (Sweden)

    Dora CVELBAR

    2016-04-01

Full Text Available Introduction: Voice disorders are a well-known complication often associated with thyroid gland diseases, and because voice is still the basic means of communication it is very important to maintain its quality. Objectives: The aim of this study was to determine whether there is a statistically significant difference between the results of voice self-assessment, perceptual voice assessment and acoustic voice analysis before and after thyroidectomy, and whether there are statistically significant correlations between variables of voice self-assessment, perceptual assessment and acoustic analysis before and after thyroidectomy. Methods: This study included 12 participants aged between 41 and 76. Voice self-assessment was conducted with the help of the Croatian version of the Voice Handicap Index (VHI). Recorded reading samples were used for perceptual assessment and later evaluated by two clinical speech and language therapists. Recorded samples of phonation were used for acoustic analysis, which was conducted with the acoustic program Praat. All of the data were processed through descriptive statistics and nonparametric statistical methods. Results: The results showed that there are statistically significant differences between the results of voice self-assessment and the results of acoustic analysis before and after thyroidectomy. Statistically significant correlations were found between variables of perceptual assessment and acoustic analysis. Conclusion: The obtained results indicate the importance of multidimensional, preoperative and postoperative assessment. This kind of assessment allows the clinician to describe all of the voice features and provides an appropriate recommendation for further rehabilitation to the patient in order to optimize voice outcomes.

  1. Application of computer voice input/output

    International Nuclear Information System (INIS)

    Ford, W.; Shirk, D.G.

    1981-01-01

The advent of microprocessors and other large-scale integration (LSI) circuits is making voice input and output for computers and instruments practical; specialized LSI chips for speech processing are appearing on the market. Voice can be used to input data or to issue instrument commands; this allows the operator to engage in other tasks, move about, and use standard data entry systems. Voice synthesizers can generate audible, easily understood instructions. Using voice characteristics, a control system can verify speaker identity for security purposes. Two simple voice-controlled systems have been designed at Los Alamos for nuclear safeguards applications. Each can easily be expanded as time allows. The first system is for instrument control; it accepts voice commands and issues audible operator prompts. The second system is for access control: the speaker's voice is used to verify his identity and to actuate external devices.

  2. Enhanced Living by Assessing Voice Pathology Using a Co-Occurrence Matrix

    OpenAIRE

    Muhammad, Ghulam; Alhamid, Mohammed F.; Hossain, M. Shamim; Almogren, Ahmad S.; Vasilakos, Athanasios V.

    2017-01-01

A large part of the population around the world suffers from various disabilities. Disabilities affect not only children but also adults of different professions. Smart technology can assist the disabled population and lead to a comfortable life in an enhanced living environment (ELE). In this paper, we propose an effective voice pathology assessment system that works in a smart home framework. The proposed system takes input from various sensors, and processes the acquired voice signals an...
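
As a generic illustration of a co-occurrence matrix computed over a voice signal's time-frequency representation (a sketch under stated assumptions, not the authors' exact pipeline), the quantised spectrogram levels can be paired along the time axis:

```python
# Co-occurrence matrix over a quantised spectrogram of a (mono) voice recording.
import numpy as np
from scipy.io import wavfile
from scipy.signal import spectrogram

rate, signal = wavfile.read('sustained_vowel.wav')              # hypothetical mono recording
_, _, Sxx = spectrogram(signal.astype(float), fs=rate, nperseg=512)

levels = 16                                                     # quantisation levels
S = 10.0 * np.log10(Sxx + 1e-12)
q = np.digitize(S, np.linspace(S.min(), S.max(), levels - 1))   # indices in 0..levels-1

# Horizontal co-occurrence: how often level i is followed by level j in time.
cooc = np.zeros((levels, levels))
for row in q:
    for a, b in zip(row[:-1], row[1:]):
        cooc[a, b] += 1
cooc /= cooc.sum()                                              # normalise to joint probabilities
print(cooc.shape)                                               # (16, 16) feature matrix
```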

  3. The development of the Spanish verb ir into auxiliary of voice

    DEFF Research Database (Denmark)

    Vinther, Thora

    2005-01-01

Spanish, syntax, grammaticalisation, past participle, passive voice, middle voice, language development.

  4. Non Audio-Video gesture recognition system

    DEFF Research Database (Denmark)

    Craciunescu, Razvan; Mihovska, Albena Dimitrova; Kyriazakos, Sofoklis

    2016-01-01

Gesture recognition is a topic in computer science and language technology with the goal of interpreting human gestures via mathematical algorithms. Gestures can originate from any bodily motion or state but commonly originate from the face or hand. Current research focus includes emotion recognition from the face and hand gesture recognition. Gesture recognition enables humans to communicate with the machine and interact naturally without any mechanical devices. This paper investigates the possibility of using non-audio/video sensors in order to design a low-cost gesture recognition device...

  5. Forensic Automatic Speaker Recognition Based on Likelihood Ratio Using Acoustic-phonetic Features Measured Automatically

    Directory of Open Access Journals (Sweden)

    Huapeng Wang

    2015-01-01

Full Text Available Forensic speaker recognition is experiencing a remarkable paradigm shift in terms of the evaluation framework and the presentation of voice evidence. This paper proposes a new method of forensic automatic speaker recognition using the likelihood ratio framework to quantify the strength of voice evidence. The proposed method uses a reference database to calculate the within- and between-speaker variability. Some acoustic-phonetic features are extracted automatically using the software VoiceSauce. The effectiveness of the approach was tested using two Mandarin databases: a mobile telephone database and a landline database. The experimental results indicate that these acoustic-phonetic features do have some discriminating potential and are worth using for discrimination. The automatic acoustic-phonetic features have acceptable discriminative performance and can provide more reliable results in evidence analysis when fused with other kinds of voice features.
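
The likelihood-ratio framework referred to above has a standard form, sketched here: E denotes the measured similarity between the questioned and the known voice samples, H_ss the same-speaker hypothesis and H_ds the different-speaker hypothesis; the within-speaker variability estimated from the reference database informs the numerator model and the between-speaker variability the denominator.

```latex
% Strength of voice evidence as a likelihood ratio, and its role in Bayes' rule.
\[
  \mathrm{LR} \;=\; \frac{p(E \mid H_{\mathrm{ss}})}{p(E \mid H_{\mathrm{ds}})},
  \qquad
  \underbrace{\frac{p(H_{\mathrm{ss}} \mid E)}{p(H_{\mathrm{ds}} \mid E)}}_{\text{posterior odds}}
  \;=\;
  \mathrm{LR} \times
  \underbrace{\frac{p(H_{\mathrm{ss}})}{p(H_{\mathrm{ds}})}}_{\text{prior odds}}
\]
```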

  6. Comparison of the effects of cylindrical correction with and without iris recognition technology in wavefront laser-assisted in situ keratomileusis.

    Science.gov (United States)

    Wang, Tsung-Jen; Lin, Yu-Huang; Chang, David C-K; Chou, Hsiu-Chu; Wang, I-Jong

    2012-04-01

To analyse the magnitude of cylindrical corrections over which cyclotorsion compensation with iris recognition (IR) technology is beneficial during wavefront laser-assisted in situ keratomileusis. A retrospective comparative case series. Fifty-four eyes that underwent wavefront laser-assisted in situ keratomileusis without IR (non-IR group) and 53 eyes that underwent wavefront laser-assisted in situ keratomileusis with IR (IR group) were recruited. Subgroup analyses based on baseline astigmatism were: a low degree of astigmatism (≥1.00 D to <2.00 D), a moderate degree of astigmatism (≥2.00 D to <3.00 D) and a high degree of astigmatism (≥3.00 D). Vector and non-vector analyses were used for comparison. The mean cylinder was -1.89 ± 0.76 D in the non-IR group and -2.00 ± 0.77 D in the IR group. Postoperatively, 38 eyes (74.50%) in the IR group and 31 eyes (57.50%) in the non-IR group were within ± 0.50 D of the target induced astigmatism vector (P = 0.063). The difference vector was 0.49 ± 0.28 in the IR group and 0.63 ± 0.40 in the non-IR group (P = 0.031). In the analysis of subgroups, the magnitude of error was significantly lower in the moderate IR subgroup than in the moderate non-IR subgroup (P = 0.034). Furthermore, the moderate IR subgroup had a lower mean difference vector (P = 0.0078) and a greater surgically induced astigmatism (P = 0.036) than those of the moderate non-IR group. Wavefront laser-assisted in situ keratomileusis for the treatment of astigmatism using IR technology was effective and accurate for the treatment of myopic astigmatism. © 2011 The Authors. Clinical and Experimental Ophthalmology © 2011 Royal Australian and New Zealand College of Ophthalmologists.

  7. Foetal response to music and voice.

    Science.gov (United States)

    Al-Qahtani, Noura H

    2005-10-01

To examine whether prenatal exposure to music and voice alters foetal behaviour and whether the foetal response to music differs from that to the human voice. A prospective observational study was conducted in 20 normal term pregnant mothers. Ten foetuses were exposed to music and voice for 15 s at different sound pressure levels to find out the optimal setting for the auditory stimulation. Music, voice and sham were played to another 10 foetuses via a headphone on the maternal abdomen. The sound pressure level was 105 dB and 94 dB for music and voice, respectively. Computerised assessments of foetal heart rate and activity were recorded. 90 actocardiograms were obtained for the whole group. One-way ANOVA followed by post hoc analysis (Student-Newman-Keuls method) was used to determine whether there was a significant difference in foetal response to music and voice versus sham. Foetuses responded with heart rate acceleration and a motor response to both music and voice. This was statistically significant compared to sham. There was no significant difference between the foetal heart rate acceleration to music and to voice. Prenatal exposure to music and voice alters foetal behaviour. No difference was detected between foetal responses to music and voice.

  8. Multi-modal assessment of on-road demand of voice and manual phone calling and voice navigation entry across two embedded vehicle systems

    Science.gov (United States)

    Mehler, Bruce; Kidd, David; Reimer, Bryan; Reagan, Ian; Dobres, Jonathan; McCartt, Anne

    2016-01-01

One purpose of integrating voice interfaces into embedded vehicle systems is to reduce drivers’ visual and manual distractions with ‘infotainment’ technologies. However, there is scant research on actual benefits in production vehicles or how different interface designs affect attentional demands. Driving performance, visual engagement, and indices of workload (heart rate, skin conductance, subjective ratings) were assessed in 80 drivers randomly assigned to drive a 2013 Chevrolet Equinox or Volvo XC60. The Chevrolet MyLink system allowed completing tasks with one voice command, while the Volvo Sensus required multiple commands to navigate the menu structure. When calling a phone contact, both voice systems reduced visual demand relative to the visual–manual interfaces, with reductions for drivers in the Equinox being greater. The Equinox ‘one-shot’ voice command showed advantages during contact calling but had significantly higher error rates than Sensus during destination address entry. For both secondary tasks, neither voice interface entirely eliminated visual demand. Practitioner Summary: The findings reinforce the observation that most, if not all, automotive auditory–vocal interfaces are multi-modal interfaces in which the full range of potential demands (auditory, vocal, visual, manipulative, cognitive, tactile, etc.) need to be considered in developing optimal implementations and evaluating drivers’ interaction with the systems. Social Media: In-vehicle voice-interfaces can reduce visual demand but do not eliminate it and all types of demand need to be taken into account in a comprehensive evaluation. PMID:26269281

  9. Multi-modal assessment of on-road demand of voice and manual phone calling and voice navigation entry across two embedded vehicle systems.

    Science.gov (United States)

    Mehler, Bruce; Kidd, David; Reimer, Bryan; Reagan, Ian; Dobres, Jonathan; McCartt, Anne

    2016-03-01

    One purpose of integrating voice interfaces into embedded vehicle systems is to reduce drivers' visual and manual distractions with 'infotainment' technologies. However, there is scant research on actual benefits in production vehicles or how different interface designs affect attentional demands. Driving performance, visual engagement, and indices of workload (heart rate, skin conductance, subjective ratings) were assessed in 80 drivers randomly assigned to drive a 2013 Chevrolet Equinox or Volvo XC60. The Chevrolet MyLink system allowed completing tasks with one voice command, while the Volvo Sensus required multiple commands to navigate the menu structure. When calling a phone contact, both voice systems reduced visual demand relative to the visual-manual interfaces, with reductions for drivers in the Equinox being greater. The Equinox 'one-shot' voice command showed advantages during contact calling but had significantly higher error rates than Sensus during destination address entry. For both secondary tasks, neither voice interface entirely eliminated visual demand. Practitioner Summary: The findings reinforce the observation that most, if not all, automotive auditory-vocal interfaces are multi-modal interfaces in which the full range of potential demands (auditory, vocal, visual, manipulative, cognitive, tactile, etc.) need to be considered in developing optimal implementations and evaluating drivers' interaction with the systems. Social Media: In-vehicle voice-interfaces can reduce visual demand but do not eliminate it and all types of demand need to be taken into account in a comprehensive evaluation.

  10. An automatic speech recognition system with speaker-independent identification support

    Science.gov (United States)

    Caranica, Alexandru; Burileanu, Corneliu

    2015-02-01

    The novelty of this work relies on the application of an open source research software toolkit (CMU Sphinx) to train, build and evaluate a speech recognition system, with speaker-independent support, for voice-controlled hardware applications. Moreover, we propose to use the trained acoustic model to successfully decode offline voice commands on embedded hardware, such as an ARMv6 low-cost SoC, Raspberry PI. This type of single-board computer, mainly used for educational and research activities, can serve as a proof-of-concept software and hardware stack for low cost voice automation systems.
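
    The decoding step this abstract describes (an acoustic model trained with CMU Sphinx, used to recognise commands offline on a Raspberry Pi) can be sketched with the speech_recognition Python package, which wraps a PocketSphinx backend. This is a minimal illustration rather than the authors' toolchain: the WAV file name and the small command set are assumptions made for the example.

```python
import speech_recognition as sr

# Hypothetical command vocabulary for a voice-controlled appliance.
COMMANDS = {"lights on", "lights off", "temperature up", "temperature down"}

def decode_command(wav_path):
    """Decode a short utterance offline with the CMU Sphinx (PocketSphinx) engine."""
    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)          # read the whole recording
    try:
        text = recognizer.recognize_sphinx(audio)  # offline decoding, no network needed
    except sr.UnknownValueError:
        return None                                # nothing intelligible was heard
    return text if text in COMMANDS else None

if __name__ == "__main__":
    print(decode_command("command.wav"))           # e.g. "lights on", or None
```

    Because the Sphinx decoder runs entirely on-device, the same call works without a network connection, which is the property the abstract exploits on the Raspberry Pi.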

  11. Voice disorders in mucosal leishmaniasis.

    Directory of Open Access Journals (Sweden)

    Ana Cristina Nunes Ruas

    Full Text Available INTRODUCTION: Leishmaniasis is considered one of the six most important infectious diseases because of its high detection coefficient and ability to produce deformities. In most cases, mucosal leishmaniasis (ML) occurs as a consequence of cutaneous leishmaniasis. If left untreated, mucosal lesions can leave sequelae, interfering with the swallowing, breathing, voice and speech processes and requiring rehabilitation. OBJECTIVE: To describe the anatomical characteristics and voice quality of ML patients. MATERIALS AND METHODS: A descriptive transversal study was conducted in a cohort of ML patients treated at the Laboratory for Leishmaniasis Surveillance of the Evandro Chagas National Institute of Infectious Diseases-Fiocruz, between 2010 and 2013. The patients were submitted to otorhinolaryngologic clinical examination by endoscopy of the upper airways and digestive tract and to speech-language assessment through directed anamnesis, auditory perception, phonation times and vocal acoustic analysis. The variables of interest were epidemiologic (sex and age) and clinical (lesion location, associated symptoms and voice quality). RESULTS: 26 patients under ML treatment and monitored by speech therapists were studied. 21 (81%) were male and five (19%) female, with ages ranging from 15 to 78 years (54.5 ± 15.0 years). The lesions were distributed in the following structures: 88.5% nasal, 38.5% oral, 34.6% pharyngeal and 19.2% laryngeal, with some patients presenting lesions in more than one anatomic site. The main complaint was nasal obstruction (73.1%), followed by dysphonia (38.5%), odynophagia (30.8%) and dysphagia (26.9%). 23 patients (84.6%) presented voice quality perturbations. Dysphonia was significantly associated with lesions in the larynx, pharynx and oral cavity. CONCLUSION: We observed that vocal quality perturbations are frequent in patients with mucosal leishmaniasis, even without laryngeal lesions; they are probably associated with disorders of some

  12. Speech technology and cinema: can they learn from each other?

    Science.gov (United States)

    Pauletto, Sandra

    2013-10-01

    The voice is the most important sound of a film soundtrack. It represents a character and it carries language. There are different types of cinematic voices: dialogue, internal monologues, and voice-overs. Conventionally, two main characteristics differentiate these voices: lip synchronization and the voice's attributes that make it appropriate for the character (for example, a voice that sounds very close to the audience can be appropriate for a narrator, but not for an onscreen character). What happens, then, if a film character can only speak through an asynchronous machine that produces a 'robot-like' voice? This article discusses the sound-related work and experimentation done by the author for the short film Voice by Choice. It also attempts to discover whether speech technology design can learn from its cinematic representation, and if such uncommon film protagonists can contribute creatively to transform the conventions of cinematic voices.

  13. Voices of Romanian scientists

    CERN Multimedia

    Stefania Pandolfi

    2016-01-01

    As Romania has now become a Member State of CERN, Romanian scientists share their thoughts about this new era of partnership for their community.   Members of ATLAS from Romanian institutes at CERN (from left to right): Dan Ciubotaru, Michele Renda, Bogdan Blidaru, Alexandra Tudorache, Marina Rotaru, Ana Dumitriu, Valentina Tudorache, Adam Jinaru, Calin Alexa. On 17 July 2016, Romania became the twenty-second Member State of CERN, 25 years after the first cooperation agreement with the country was signed. “CERN and Romania already have a long history of strong collaboration”, says Emmanuel Tsesmelis, head of Relations with Associate Members and Non-Member States. “We very much look forward to strengthening this collaboration as Romania becomes CERN’s twenty-second Member State, which promises the development of mutual interests in scientific research, related technologies and education,” he affirms. Romania&...

  14. [Assessment of voice acoustic parameters in female teachers with diagnosed occupational voice disorders].

    Science.gov (United States)

    Niebudek-Bogusz, Ewa; Fiszer, Marta; Sliwińska-Kowalska, Mariola

    2005-01-01

    Laryngovideostroboscopy is the method most frequently used in the assessment of voice disorders. However, the employment of quantitative methods, such as voice acoustic analysis, is essential for evaluating the effectiveness of prophylactic and therapeutic activities as well as for objective medical certification of larynx pathologies. The aim of this study was to examine voice acoustic parameters in female teachers with occupational voice diseases. Acoustic analysis (IRIS software) was performed in 66 female teachers, including 35 teachers with occupational voice diseases and 31 with functional dysphonia. The teachers with occupational voice diseases presented a lower average fundamental frequency (193 Hz) compared with the group with functional dysphonia (209 Hz) and with the normative value (236 Hz), whereas the other acoustic parameters did not differ significantly between the groups. Voice acoustic analysis, when applied separately from vocal loading, cannot be used as a testing method to verify the diagnosis of occupational voice disorders.
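
    The comparison above hinges on estimating each speaker's mean fundamental frequency from recordings. The IRIS software's internal algorithm is not described in the abstract; as a hedged illustration of how such an estimate can be obtained, the sketch below uses a plain autocorrelation pitch tracker, with made-up frame sizes and an arbitrary energy gate standing in for a proper voicing decision.

```python
import numpy as np

def estimate_f0(frame, sr, fmin=75.0, fmax=500.0):
    """Estimate F0 of one voiced frame from the autocorrelation peak."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sr / fmax)                       # shortest period considered
    lag_max = min(int(sr / fmin), len(ac) - 1)     # longest period considered
    lag = lag_min + np.argmax(ac[lag_min:lag_max])
    return sr / lag

def mean_f0(signal, sr, frame_ms=40, hop_ms=20, energy_floor=0.01):
    """Average F0 over frames whose energy suggests voicing (crude gate)."""
    frame, hop = int(sr * frame_ms / 1000), int(sr * hop_ms / 1000)
    f0s = []
    for start in range(0, len(signal) - frame, hop):
        chunk = signal[start:start + frame]
        if np.sqrt(np.mean(chunk ** 2)) > energy_floor:
            f0s.append(estimate_f0(chunk, sr))
    return float(np.mean(f0s)) if f0s else float("nan")
```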

  15. Pattern recognition and string matching

    CERN Document Server

    Cheng, Xiuzhen

    2002-01-01

    The research and development of pattern recognition have proven to be of importance in science, technology, and human activity. Many useful concepts and tools from different disciplines have been employed in pattern recognition. Among them is string matching, which receives much theoretical and practical attention. String matching is also an important topic in combinatorial optimization. This book is devoted to recent advances in pattern recognition and string matching. It consists of twenty-eight chapters written by different authors, addressing a broad range of topics such as those from classification, matching, mining, feature selection, and applications. Each chapter is self-contained, and presents either novel methodological approaches or applications of existing theories and techniques. The aim, intent, and motivation for publishing this book is to provide a reference tool for the increasing number of readers who depend upon pattern recognition or string matching in some way. This includes student...

  16. Document recognition serving people with disabilities

    Science.gov (United States)

    Fruchterman, James R.

    2007-01-01

    Document recognition advances have improved the lives of people with print disabilities, by providing accessible documents. This invited paper provides perspectives on the author's career progression from document recognition professional to social entrepreneur applying this technology to help people with disabilities. Starting with initial thoughts about optical character recognition in college, it continues with the creation of accurate omnifont character recognition that did not require training. It was difficult to make a reading machine for the blind in a commercial setting, which led to the creation of a nonprofit social enterprise to deliver these devices around the world. This network of people with disabilities scanning books drove the creation of Bookshare.org, an online library of scanned books. Looking forward, the needs for improved document recognition technology to further lower the barriers to reading are discussed. Document recognition professionals should be proud of the positive impact their work has had on some of society's most disadvantaged communities.

  17. Integrating cues of social interest and voice pitch in men's preferences for women's voices

    OpenAIRE

    Jones, Benedict C; Feinberg, David R; DeBruine, Lisa M; Little, Anthony C; Vukovic, Jovana

    2008-01-01

    Most previous studies of vocal attractiveness have focused on preferences for physical characteristics of voices such as pitch. Here we examine the content of vocalizations in interaction with such physical traits, finding that vocal cues of social interest modulate the strength of men's preferences for raised pitch in women's voices. Men showed stronger preferences for raised pitch when judging the voices of women who appeared interested in the listener than when judging the voices of women ...

  18. The Feasibility and Acceptability of Using Technology-Based Daily Diaries with HIV-Infected Young Men Who have Sex with Men: A Comparison of Internet and Voice Modalities.

    Science.gov (United States)

    Cherenack, Emily M; Wilson, Patrick A; Kreuzman, Andrew M; Price, Georgine N

    2016-08-01

    This study delivered a daily diary to 67 HIV-infected men who have sex with men (MSM) between 16 and 24 years old for 66 days to measure HIV-risk behaviors and other psychosocial variables via two diary modalities: internet (accessible via any web-enabled device) and voice (accessible via telephone). Participants were randomized to complete one diary modality for 33 days before switching to the second modality for 33 days. The study was implemented in three urban HIV health care centers in the United States where participants were receiving services. Through diary data and qualitative interview data, we examined the feasibility and acceptability of the diaries and identified barriers and facilitators of diary compliance. Results show high participant retention in the daily diary (93.4 %) and high compliance for the number of diaries completed (72.4 %). Internet diaries were preferred by 92 % of participants and completed at a significantly higher rate (77.5 %) than voice diaries (67.7 %). Facilitators included opportunities for self-reflection and cathartic sharing, monetary compensation, relationships with study staff, and daily reminders. Barriers included being busy or not having privacy at the time of reminders, forgetting, and falling asleep. Participants also described barriers and facilitators unique to each modality. Overall, both modalities were feasible and acceptable for use with our sample of HIV-infected MSM.

  19. Voice Onset Time in Azerbaijani Consonants

    Directory of Open Access Journals (Sweden)

    Ali Jahan

    2009-10-01

    Full Text Available Objective: Voice onset time is known to be a cue for the distinction between voiced and voiceless stops and it can be used to describe or categorize a range of developmental, neuromotor and linguistic disorders. The aim of this study was to determine standard values of voice onset time for the Azerbaijani language (Tabriz dialect). Materials & Methods: In this descriptive-analytical study, 30 Azeri speakers, selected by convenience sampling, uttered 46 monosyllabic words beginning with the 6 Azerbaijani stops, twice each. Using Praat software, the voice onset time values were analyzed by waveform and wideband spectrogram in milliseconds. The vowel effect, sex differences and the effect of place of articulation on VOT were evaluated and data were analyzed by one-way ANOVA test. Results: There was no significant difference in voice onset time between male and female Azeri speakers (P<0.05). Vowel and place of articulation had significant correlations with voice onset time (P<0.001). Voice onset time values for /b/, /p/, /d/, /t/, /g/, /k/, and the [c], [ɟ] allophones were 10.64, 86.88, 13.35, 87.09, 26.25, 100.62, 131.19 and 63.18 milliseconds, respectively. Conclusion: Voice onset time values are the same for Azerbaijani men and women. However, as in many other languages, back and high vowels and a back place of articulation lengthen VOT. Also, voiceless stops are aspirated in this language and voiced stops have positive VOT values.

  20. Singing Voice Analysis, Synthesis, and Modeling

    Science.gov (United States)

    Kim, Youngmoo E.

    The singing voice is the oldest musical instrument, but its versatility and emotional power are unmatched. Through the combination of music, lyrics, and expression, the voice is able to affect us in ways that no other instrument can. The fact that vocal music is prevalent in almost all cultures is indicative of its innate appeal to the human aesthetic. Singing also permeates most genres of music, attesting to the wide range of sounds the human voice is capable of producing. As listeners we are naturally drawn to the sound of the human voice, and, when present, it immediately becomes the focus of our attention.

  1. "Voice Forum" The Human Voice as Primary Instrument in Music Therapy

    DEFF Research Database (Denmark)

    Pedersen, Inge Nygaard; Storm, Sanne

    2009-01-01

    Aspects will be drawn on the human voice as a tool for embodying our psychological and physiological state and attempting integration of feelings. Presentations and dialogues on different methods and techniques in "Therapy related body- and voice work.", as well as the human voice as a tool for non...

  2. V2S: Voice to Sign Language Translation System for Malaysian Deaf People

    Science.gov (United States)

    Mean Foong, Oi; Low, Tang Jung; La, Wai Wan

    The process of learning and understanding sign language may be cumbersome for some, and therefore, this paper proposes a solution to this problem by providing a voice (English language) to sign language translation system using speech and image processing techniques. Speech processing, which includes speech recognition, is the study of recognizing the words being spoken, regardless of who the speaker is. This project uses template-based recognition as the main approach, in which the V2S system first needs to be trained with speech patterns based on some generic spectral parameter set. These spectral parameter sets are then stored as templates in a database. The system performs the recognition process by matching the parameter set of the input speech with the stored templates, to finally display the sign language in video format. Empirical results show that the system has an 80.3% recognition rate.
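
    A rough sketch of the template-matching scheme the abstract outlines: spectral features of training utterances are stored as templates, and an input utterance is assigned to the closest one. MFCC features and a plain dynamic-time-warping distance stand in here for whatever spectral parameter set and matching score V2S actually uses; the file names are placeholders.

```python
import numpy as np
import librosa

def mfcc_features(path, sr=16000, n_mfcc=13):
    """Spectral parameter set for one utterance: MFCC frames (time x coefficient)."""
    y, sr = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

def dtw_distance(a, b):
    """Plain dynamic-time-warping distance between two feature sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def recognize(utterance_wav, templates):
    """Return the label of the stored template with the smallest DTW distance."""
    feats = mfcc_features(utterance_wav)
    return min(templates, key=lambda label: dtw_distance(feats, templates[label]))

# Training phase (illustrative): templates = {"hello": mfcc_features("hello_template.wav"), ...}
```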

  3. The voice conveys emotion in ten globalized cultures and one remote village in Bhutan.

    Science.gov (United States)

    Cordaro, Daniel T; Keltner, Dacher; Tshering, Sumjay; Wangchuk, Dorji; Flynn, Lisa M

    2016-02-01

    With data from 10 different globalized cultures and 1 remote, isolated village in Bhutan, we examined universals and cultural variations in the recognition of 16 nonverbal emotional vocalizations. College students in 10 nations (Study 1) and villagers in remote Bhutan (Study 2) were asked to match emotional vocalizations to 1-sentence stories of the same valence. Guided by previous conceptualizations of recognition accuracy, across both studies, 7 of the 16 vocal burst stimuli were found to have strong or very strong recognition in all 11 cultures, 6 vocal bursts were found to have moderate recognition, and 4 were not universally recognized. All vocal burst stimuli varied significantly in terms of the degree to which they were recognized across the 11 cultures. Our discussion focuses on the implications of these results for current debates concerning the emotion conveyed in the voice. (c) 2016 APA, all rights reserved).

  4. Digitization of Full-Text Documents Before Publishing on the Internet: A Case Study Reviewing the Latest Optical Character Recognition Technologies.

    Science.gov (United States)

    McClean, Clare M.

    1998-01-01

    Reviews strengths and weaknesses of five optical character recognition (OCR) software packages used to digitize paper documents before publishing on the Internet. Outlines options available and stages of the conversion process. Describes the learning experience of Eurotext, a United Kingdom-based electronic libraries project (eLib). (PEN)

  5. Interactive Voice/Web Response System in clinical research.

    Science.gov (United States)

    Ruikar, Vrishabhsagar

    2016-01-01

    Emerging technologies in the computer and telecommunication industries have eased access to computers through the telephone. An Interactive Voice/Web Response System (IxRS) is one of the user-friendly systems for end users, with complex and tailored programs at its backend. The backend programs are specially tailored for easy understanding by users. The clinical research industry has experienced a revolution in data capture methodologies over time. Over the past couple of decades, different systems have evolved alongside emerging modern technologies and tools, for example, Electronic Data Capture, IxRS, electronic patient-reported outcomes, etc.

  6. Clinical voice analysis of Carnatic singers.

    Science.gov (United States)

    Arunachalam, Ravikumar; Boominathan, Prakash; Mahalingam, Shenbagavalli

    2014-01-01

    Carnatic singing is a classical South Indian style of music that involves rigorous training to produce an "open throated" loud, predominantly low-pitched singing, embedded with vocal nuances in higher pitches. Voice problems in singers are not uncommon. The objective was to report the nature of voice problems and apply a routine protocol to assess the voice. Forty-five trained performing singers (females: 36 and males: 9) who reported to a tertiary care hospital with voice problems underwent voice assessment. The study analyzed their problems and the clinical findings. Voice change, difficulty in singing higher pitches, and voice fatigue were major complaints. Most of the singers suffered laryngopharyngeal reflux that coexisted with muscle tension dysphonia and chronic laryngitis. Speaking voices were rated predominantly as "moderate deviation" on GRBAS (Grade, Rough, Breathy, Asthenia, and Strain). Maximum phonation time ranged from 4 to 29 seconds (females: 10.2, standard deviation [SD]: 5.28 and males: 15.7, SD: 5.79). Singing frequency range was reduced (females: 21.3 Semitones and males: 23.99 Semitones). Dysphonia severity index (DSI) scores ranged from -3.5 to 4.91 (females: 0.075 and males: 0.64). Singing frequency range and DSI did not show significant difference between sex and across clinical diagnosis. Self-perception using voice disorder outcome profile revealed overall severity score of 5.1 (SD: 2.7). Findings are discussed from a clinical intervention perspective. Study highlighted the nature of voice problems (hyperfunctional) and required modifications in assessment protocol for Carnatic singers. Need for regular assessments and vocal hygiene education to maintain good vocal health are emphasized as outcomes. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  7. Associations between the Transsexual Voice Questionnaire (TVQMtF) and self-report of voice femininity and acoustic voice measures.

    Science.gov (United States)

    Dacakis, Georgia; Oates, Jennifer; Douglas, Jacinta

    2017-11-01

    The Transsexual Voice Questionnaire (TVQMtF) was designed to capture the voice-related perceptions of individuals whose gender identity as female is the opposite of their birth-assigned gender (MtF women). Evaluation of the psychometric properties of the TVQMtF is ongoing. To investigate associations between TVQMtF scores and (1) self-perceptions of voice femininity and (2) acoustic parameters of voice pitch and voice quality in order to evaluate further the validity of the TVQMtF. A strong correlation between TVQMtF scores and self-ratings of voice femininity was predicted, but no association between TVQMtF scores and acoustic measures of voice pitch and quality was proposed. Participants were 148 MtF women (mean age 48.14 years) recruited from the La Trobe Communication Clinic and the clinics of three doctors specializing in transgender health. All participants completed the TVQMtF and 34 of these participants also provided a voice sample for acoustic analysis. Pearson product-moment correlation analysis was conducted to examine the associations between TVQMtF scores and (1) self-perceptions of voice femininity and (2) acoustic measures of F0, jitter (%), shimmer (dB) and harmonic-to-noise ratio (HNR). Strong negative correlations between the participants' perceptions of their voice femininity and the TVQMtF scores demonstrated that for this group of MtF women a low self-rating of voice femininity was associated with more frequent negative voice-related experiences. This association was strongest with the vocal-functioning component of the TVQMtF. These strong correlations and high levels of shared variance between the TVQMtF and a measure of a related construct provide evidence for the convergent validity of the TVQMtF. The absence of significant correlations between the TVQMtF and the acoustic data is consistent with the equivocal findings of earlier research. This finding indicates that these two measures assess different aspects of the voice

  8. Sound induced activity in voice sensitive cortex predicts voice memory ability

    Directory of Open Access Journals (Sweden)

    Rebecca eWatson

    2012-04-01

    Full Text Available The ‘temporal voice areas’ (TVAs) (Belin et al., 2000) of the human brain show greater neuronal activity in response to human voices than to other categories of nonvocal sounds. However, a direct link between TVA activity and voice perception behaviour has not yet been established. Here we show that a functional magnetic resonance imaging (fMRI) measure of activity in the TVAs predicts individual performance at a separately administered voice memory test. This relation holds when general sound memory ability is taken into account. These findings provide the first evidence that the TVAs are specifically involved in voice cognition.

  9. Face recognition, a landmarks tale

    NARCIS (Netherlands)

    Beumer, G.M.

    2009-01-01

    Face recognition is a technology that appeals to the imagination of many people. This is particularly reflected in the popularity of science-fiction films and forensic detective series such as CSI, CSI New York, CSI Miami, Bones and NCIS. Although these series tend to be set in the present, their

  10. Speech recognition implementation in radiology

    International Nuclear Information System (INIS)

    White, Keith S.

    2005-01-01

    Continuous speech recognition (SR) is an emerging technology that allows direct digital transcription of dictated radiology reports. The SR systems are being widely deployed in the radiology community. This is a review of technical and practical issues that should be considered when implementing an SR system. (orig.)

  11. Pattern Recognition

    Directory of Open Access Journals (Sweden)

    Aleš Procházka

    2018-05-01

    Full Text Available Multimodal signal analysis based on sophisticated sensors, efficient communication systems and fast parallel processing methods has a rapidly increasing range of multidisciplinary applications. The present paper is devoted to pattern recognition, machine learning, and the analysis of sleep stages in the detection of sleep disorders using polysomnography (PSG) data, including electroencephalography (EEG), breathing (Flow), and electro-oculogram (EOG) signals. The proposed method is based on the classification of selected features by a neural network system with sigmoidal and softmax transfer functions using Bayesian methods for the evaluation of the probabilities of the separate classes. The application is devoted to the analysis of the sleep stages of 184 individuals with different diagnoses, using EEG and further PSG signals. Data analysis points to an average increase of the length of the Wake stage by 2.7% per 10 years and a decrease of the length of the Rapid Eye Movement (REM) stages by 0.8% per 10 years. The mean classification accuracy for given sets of records and single EEG and multimodal features is 88.7% (standard deviation, STD: 2.1) and 89.6% (STD: 1.9), respectively. The proposed methods enable the use of adaptive learning processes for the detection and classification of health disorders based on prior specialist experience and man–machine interaction.
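
    As a minimal illustration of the kind of classifier described (a neural network with sigmoidal hidden units producing softmax-style class probabilities over sleep stages), the sketch below uses scikit-learn with entirely synthetic feature arrays standing in for real EEG/Flow/EOG features.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# Hypothetical feature matrix: one row per 30-s epoch, columns = EEG/Flow/EOG features.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 12))
y = rng.integers(0, 5, size=1000)              # 5 sleep-stage labels (Wake, N1, N2, N3, REM)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(32,),
                    activation="logistic",      # sigmoidal hidden layer
                    max_iter=500, random_state=0)
clf.fit(X_train, y_train)

proba = clf.predict_proba(X_test)               # softmax-style per-class probabilities
print("accuracy:", clf.score(X_test, y_test))   # near chance here, since the data are random
```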

  12. The processing of auditory and visual recognition of self-stimuli.

    Science.gov (United States)

    Hughes, Susan M; Nicholson, Shevon E

    2010-12-01

    This study examined self-recognition processing in both the auditory and visual modalities by determining how comparable hearing a recording of one's own voice was to seeing photograph of one's own face. We also investigated whether the simultaneous presentation of auditory and visual self-stimuli would either facilitate or inhibit self-identification. Ninety-one participants completed reaction-time tasks of self-recognition when presented with their own faces, own voices, and combinations of the two. Reaction time and errors made when responding with both the right and left hand were recorded to determine if there were lateralization effects on these tasks. Our findings showed that visual self-recognition for facial photographs appears to be superior to auditory self-recognition for voice recordings. Furthermore, a combined presentation of one's own face and voice appeared to inhibit rather than facilitate self-recognition and there was a left-hand advantage for reaction time on the combined-presentation tasks. Copyright © 2010 Elsevier Inc. All rights reserved.

  13. Voices from Around the Globe

    Directory of Open Access Journals (Sweden)

    Birgit Schreiber

    2017-07-01

    Full Text Available JSAA has been seeking to provide an opportunity for Student Affairs professionals and higher education scholars from around the globe to share their research and experiences of student services and student affairs programmes from their respective regional and institutional contexts. This has been given a specific platform with the guest-edited issue “Voices from Around the Globe”, which is the result of a collaboration with the International Association of Student Affairs and Services (IASAS), and particularly with the guest editors, Kathleen Callahan and Chinedu Mba.

  14. Voice Disorders: Etiology and Diagnosis.

    Science.gov (United States)

    Martins, Regina Helena Garcia; do Amaral, Henrique Abrantes; Tavares, Elaine Lara Mendes; Martins, Maira Garcia; Gonçalves, Tatiana Maria; Dias, Norimar Hernandes

    2016-11-01

    Voice disorders affect adults and children and have different causes in different age groups. The aim of the study is to present the etiology and diagnosis of dysphonia in a large population of dysphonic patients. We evaluated 2019 patients with dysphonia who attended the Voice Disease ambulatories of a university hospital. Parameters assessed were age, gender, profession, associated symptoms, smoking, and videolaryngoscopy diagnoses. Of the 2019 patients with dysphonia who were included in this study, 786 were male (38.93%) and 1233 were female (61.07%). The age groups were as follows: 1-6 years (n = 100); 7-12 years (n = 187); 13-18 years (n = 92); 19-39 years (n = 494); 41-60 years (n = 811); and >60 years (n = 335). Symptoms associated with dysphonia were vocal overuse (n = 677), gastroesophageal symptoms (n = 535), and nasosinusal symptoms (n = 497). The predominant professions of the patients were domestic workers, students, and teachers. Smoking was reported by 13.6% of patients. With regard to the etiology of dysphonia, in children (1-18 years old), nodules (n = 225; 59.3%), cysts (n = 39; 10.3%), and acute laryngitis (n = 26; 6.8%) prevailed. In adults (19-60 years old), functional dysphonia (n = 268; 20.5%), acid laryngitis (n = 164; 12.5%), and vocal polyps (n = 156; 12%) predominated. In patients older than 60 years, presbyphonia (n = 89; 26.5%), functional dysphonia (n = 59; 17.6%), and Reinke's edema (n = 48; 14%) predominated. In this population of 2019 patients with dysphonia, adults and women were predominant. Dysphonia had different etiologies in the age groups studied. Nodules and cysts were predominant in children, functional dysphonia and reflux in adults, and presbyphonia and Reinke's edema in the elderly. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  15. From Out of Our Voices

    Directory of Open Access Journals (Sweden)

    Evangelia Papanikolaou

    2010-01-01

    Full Text Available Note from the interviewer: Diane Austin's new book “The Theory and Practice of Vocal Psychotherapy: Songs of the Self” (2008 which was published recently, has been an excellent opportunity to learn more about the use of voice in therapy, its clinical applications and its enormous possibilities that offers within a psychotherapeutic setting. This interview focuses on introducing some of these aspects based on Austin’s work, and on exploring her background, motivations and considerations towards this pioneer music-therapeutic approach. The interview has been edited by Diane Austin and Evangelia Papanikolaou and took place via a series of emails, dated from September to December 2009.

  16. Muscular tension and body posture in relation to voice handicap and voice quality in teachers with persistent voice complaints.

    Science.gov (United States)

    Kooijman, P G C; de Jong, F I C R S; Oudes, M J; Huinck, W; van Acht, H; Graamans, K

    2005-01-01

    The aim of this study was to investigate the relationship between extrinsic laryngeal muscular hypertonicity and deviant body posture on the one hand and voice handicap and voice quality on the other hand in teachers with persistent voice complaints and a history of voice-related absenteeism. The study group consisted of 25 female teachers. A voice therapist assessed extrinsic laryngeal muscular tension and a physical therapist assessed body posture. The assessed parameters were clustered in categories. The parameters in the different categories represent the same function. Further, a tension/posture index was created, which is the summation of the different parameters. The different parameters and the index were related to the Voice Handicap Index (VHI) and the Dysphonia Severity Index (DSI). The scores of the VHI and the individual parameters differ significantly except for the posterior weight bearing and tension of the sternocleidomastoid muscle. There was also a significant difference between the individual parameters and the DSI, except for tension of the cricothyroid muscle and posterior weight bearing. The score of the tension/posture index correlates significantly with both the VHI and the DSI. In a linear regression analysis, the combination of hypertonicity of the sternocleidomastoid, the geniohyoid muscles and posterior weight bearing is the most important predictor of a high voice handicap. The combination of hypertonicity of the geniohyoid muscle, posterior weight bearing, high position of the hyoid bone, hypertonicity of the cricothyroid muscle and anteroposition of the head is the most important predictor of a low DSI score. The results of this study show that the higher the score of the index, the higher the voice handicap score and the worse the voice quality. Moreover, the results are indicative of the importance of assessment of muscular tension and body posture in the diagnosis of voice disorders.

  17. The Role of Occupational Voice Demand and Patient-Rated Impairment in Predicting Voice Therapy Adherence.

    Science.gov (United States)

    Ebersole, Barbara; Soni, Resha S; Moran, Kathleen; Lango, Miriam; Devarajan, Karthik; Jamal, Nausheen

    2018-05-01

    Examine the relationship among the severity of patient-perceived voice impairment, perceptual dysphonia severity, occupational voice demand, and voice therapy adherence. Identify clinical predictors of increased risk for therapy nonadherence. A retrospective cohort study of patients presenting with a chief complaint of persistent dysphonia at an interdisciplinary voice center was done. The Voice Handicap Index-10 (VHI-10) and the Voice-Related Quality of Life (V-RQOL) survey scores, clinician rating of dysphonia severity using the Grade score from the Grade, Roughness Breathiness, Asthenia, and Strain scale, occupational voice demand, and patient demographics were tested for associations with therapy adherence, defined as completion of the treatment plan. Classification and Regression Tree (CART) analysis was performed to establish thresholds for nonadherence risk. Of 166 patients evaluated, 111 were recommended for voice therapy. The therapy nonadherence rate was 56%. Occupational voice demand category, VHI-10, and V-RQOL scores were the only factors significantly correlated with therapy adherence (P demand are significantly more likely to be nonadherent with therapy than those with high occupational voice demand (P 40 is a significant cutoff point for predicting therapy nonadherence (P demand and patient perception of impairment are significantly and independently correlated with therapy adherence. A VHI-10 score of ≤9 or a V-RQOL score of >40 is a significant cutoff point for predicting nonadherence risk. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  18. Integrating cues of social interest and voice pitch in men's preferences for women's voices.

    Science.gov (United States)

    Jones, Benedict C; Feinberg, David R; Debruine, Lisa M; Little, Anthony C; Vukovic, Jovana

    2008-04-23

    Most previous studies of vocal attractiveness have focused on preferences for physical characteristics of voices such as pitch. Here we examine the content of vocalizations in interaction with such physical traits, finding that vocal cues of social interest modulate the strength of men's preferences for raised pitch in women's voices. Men showed stronger preferences for raised pitch when judging the voices of women who appeared interested in the listener than when judging the voices of women who appeared relatively disinterested in the listener. These findings show that voice preferences are not determined solely by physical properties of voices and that men integrate information about voice pitch and the degree of social interest expressed by women when forming voice preferences. Women's preferences for raised pitch in women's voices were not modulated by cues of social interest, suggesting that the integration of cues of social interest and voice pitch when men judge the attractiveness of women's voices may reflect adaptations that promote efficient allocation of men's mating effort.

  19. Perception of Paralinguistic Traits in Synthesized Voices

    DEFF Research Database (Denmark)

    Baird, Alice Emily; Hasse Jørgensen, Stina; Parada-Cabaleiro, Emilia

    2017-01-01

    Along with the rise of artificial intelligence and the internet-of-things, synthesized voices are now common in daily–life, providing us with guidance, assistance, and even companionship. From formant to concatenative synthesis, the synthesized voice continues to be defined by the same traits we...

  20. Student Voices in School-Based Assessment

    Science.gov (United States)

    Tong, Siu Yin Annie; Adamson, Bob

    2015-01-01

    The value of student voices in dialogues about learning improvement is acknowledged in the literature. This paper examines how the views of students regarding School-based Assessment (SBA), a significant shift in examination policy and practice in secondary schools in Hong Kong, have largely been ignored. The study captures student voices through…

  1. Analog voicing detector responds to pitch

    Science.gov (United States)

    Abel, R. S.; Watkins, H. E.

    1967-01-01

    Modified electronic voice encoder /Vocoder/ includes an independent analog mode of operation in addition to the conventional digital mode. The Vocoder is a bandwidth compression equipment that permits voice transmission over channels, having only a fraction of the bandwidth required for conventional telephone-quality speech transmission.

  2. The Voice of the Technical Writer.

    Science.gov (United States)

    Euler, James S.

    The author's voice is implicit in all writing, even technical writing. It is the expression of the writer's attitude toward audience, subject matter, and self. Effective use of voice is made possible by recognizing the three roles of the technical writer: transmitter, translator, and author. As a transmitter, the writer must consciously apply an…

  3. Student Voice and the Common Core

    Science.gov (United States)

    Yonezawa, Susan

    2015-01-01

    Common Core proponents and detractors debate its merits, but students have voiced their opinion for years. Using a decade's worth of data gathered through design-research on youth voice, this article discusses what high school students have long described as more ideal learning environments for themselves--and how remarkably similar the Common…

  4. Employee voice and engagement : Connections and consequences

    NARCIS (Netherlands)

    Rees, C.; Alfes, K.; Gatenby, M.

    2013-01-01

    This paper considers the relationship between employee voice and employee engagement. Employee perceptions of voice behaviour aimed at improving the functioning of the work group are found to have both a direct impact and an indirect impact on levels of employee engagement. Analysis of data from two

  5. Speaking with the voice of authority

    CERN Multimedia

    2002-01-01

    GPB Consulting has developed a scientific approach to voice coaching. A digital recording of the voice is sent to a lab in Switzerland and analyzed by a computer programme designed by a doctor of psychology and linguistics and a scientist at CERN (1 page).

  6. Managing dysphonia in occupational voice users.

    Science.gov (United States)

    Behlau, Mara; Zambon, Fabiana; Madazio, Glaucya

    2014-06-01

    Recent advances with regard to occupational voice disorders are highlighted with emphasis on issues warranting consideration when assessing, training, and treating professional voice users. Findings include the many particularities between the various categories of professional voice users, the concept that the environment plays a major role in occupational voice disorders, and that biopsychosocial influences should be analyzed on an individual basis. Assessment via self-evaluation protocols to quantify the impact of these disorders is mandatory as a component of an evaluation and to document treatment outcomes. Discomfort or odynophonia has evolved as a critical symptom in this population. Clinical trials are limited and the complexity of the environment may be a limitation in experiment design. This review reinforced the need for large population studies of professional voice users; new data highlighted important factors specific to each group of voice users. Interventions directed at student teachers are necessary not only to improve the quality of future professionals, but also to avoid the frustration and limitations associated with chronic voice problems. The causative relationship between the work environment and voice disorders has not yet been established. Randomized controlled trials are lacking and must be a focus to enhance treatment paradigms for this population.

  7. Does CPAP treatment affect the voice?

    Science.gov (United States)

    Saylam, Güleser; Şahin, Mustafa; Demiral, Dilek; Bayır, Ömer; Yüceege, Melike Bağnu; Çadallı Tatar, Emel; Korkmaz, Mehmet Hakan

    2016-12-20

    The aim of this study was to investigate alterations in voice parameters among patients using continuous positive airway pressure (CPAP) for the treatment of obstructive sleep apnea syndrome. Patients with an indication for CPAP treatment without any voice problems and with normal laryngeal findings were included and voice parameters were evaluated before and 1 and 6 months after CPAP. Videolaryngostroboscopic findings, a self-rated scale (Voice Handicap Index-10, VHI-10), perceptual voice quality assessment (GRBAS: grade, roughness, breathiness, asthenia, strain), and acoustic parameters were compared. Data from 70 subjects (48 men and 22 women) with a mean age of 44.2 ± 6.0 years were evaluated. When compared with the pre-CPAP treatment period, there was a significant increase in the VHI-10 score after 1 month of treatment and in VHI- 10 and total GRBAS scores, jitter percent (P = 0.01), shimmer percent, noise-to-harmonic ratio, and voice turbulence index after 6 months of treatment. Vague negative effects on voice parameters after the first month of CPAP treatment became more evident after 6 months. We demonstrated nonsevere alterations in the voice quality of patients under CPAP treatment. Given that CPAP is a long-term treatment it is important to keep these alterations in mind.

  8. Occupational risk factors and voice disorders.

    Science.gov (United States)

    Vilkman, E

    1996-01-01

    From the point of view of occupational health, the field of voice disorders is very poorly developed as compared, for instance, to the prevention and diagnostics of occupational hearing disorders. In fact, voice disorders have not even been recognized in the field of occupational medicine. Hence, it is obviously very rare in most countries that the voice disorder of a professional voice user, e.g. a teacher, a singer or an actor, is accepted as an occupational disease by insurance companies. However, occupational voice problems do not lack significance from the point of view of the patient. We also know from questionnaires and clinical studies that voice complaints are very common. Another example of job-related health problems, which has proved more successful in terms of its occupational health status, is the repetition strain injury of the elbow, i.e. the "tennis elbow". Its textbook definition could be used as such to describe an occupational voice disorder ("dysphonia professionalis"). In the present paper the effects of such risk factors as vocal loading itself, background noise and room acoustics and low relative humidity of the air are discussed. Due to individual factors underlying the development of professional voice disorders, recommendations rather than regulations are called for. There are many simple and even relatively low-cost methods available for the prevention of vocal problems as well as for supporting rehabilitation.

  9. Why Is My Voice Changing? (For Teens)

    Science.gov (United States)

    ... enter puberty earlier or later than others. How Deep Will My Voice Get? How deep a guy's voice gets depends on his genes: ...

  10. Stage Voice Training in the London Schools.

    Science.gov (United States)

    Rubin, Lucille S.

    This report is the result of a six-week study in which the voice training offerings at four schools of drama in London were examined using interviews of teachers and directors, observation of voice classes, and attendance at studio presentations and public performances. The report covers such topics as: textbooks and references being used; courses…

  11. Predictors of Choral Directors' Voice Handicap

    Science.gov (United States)

    Schwartz, Sandra

    2013-01-01

    Vocal demands of teaching are considerable and these challenges are greater for choral directors who depend on the voice as a musical and instructive instrument. The purpose of this study was to (1) examine choral directors' vocal condition using a modified Voice Handicap Index (VHI), and (2) determine the extent to which the major variables…

  12. Face Recognition and Tracking in Videos

    Directory of Open Access Journals (Sweden)

    Swapnil Vitthal Tathe

    2017-07-01

    Full Text Available Advancement in computer vision technology and the availability of video capturing devices such as surveillance cameras have evoked new video processing applications. Research in video face recognition is mostly biased towards law enforcement applications. Applications involve human recognition based on face and iris, human-computer interaction, behavior analysis, video surveillance, etc. This paper presents a face tracking framework that is capable of face detection using Haar features, recognition using Gabor feature extraction, matching using a correlation score and tracking using a Kalman filter. The method has a good recognition rate for real-life videos and robust performance to changes due to illumination, environmental factors, scale, pose and orientation.
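
    A compressed sketch of the detect-and-track loop the abstract outlines, using OpenCV's stock frontal-face Haar cascade and a constant-velocity Kalman filter on the face centre. The Gabor-feature recognition and correlation-matching stages are omitted, and the video file name and noise parameters are illustrative assumptions, not values from the paper.

```python
import cv2
import numpy as np

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

# Constant-velocity Kalman filter over state (x, y, dx, dy), measuring (x, y).
kf = cv2.KalmanFilter(4, 2)
kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                [0, 1, 0, 1],
                                [0, 0, 1, 0],
                                [0, 0, 0, 1]], np.float32)
kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                 [0, 1, 0, 0]], np.float32)
kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1

cap = cv2.VideoCapture("input.mp4")                # illustrative file name
while True:
    ok, frame = cap.read()
    if not ok:
        break
    prediction = kf.predict()                      # predicted face centre for this frame
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces):
        x, y, w, h = faces[0]
        centre = np.array([[x + w / 2], [y + h / 2]], np.float32)
        kf.correct(centre)                         # update the track with the detection
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.circle(frame, (int(prediction[0, 0]), int(prediction[1, 0])), 4, (0, 0, 255), -1)
    cv2.imshow("tracking", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```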

  13. The written voice: implicit memory effects of voice characteristics following silent reading and auditory presentation.

    Science.gov (United States)

    Abramson, Marianne

    2007-12-01

    After being familiarized with two voices, either implicit (auditory lexical decision) or explicit memory (auditory recognition) for words from silently read sentences was assessed among 32 men and 32 women volunteers. In the silently read sentences, the sex of speaker was implied in the initial words, e.g., "He said, ..." or "She said...". Tone in question versus statement was also manipulated by appropriate punctuation. Auditory lexical decision priming was found for sex- and tone-consistent items following silent reading, but only up to 5 min. after silent reading. In a second study, similar lexical decision priming was found following listening to the sentences, although these effects remained reliable after a 2-day delay. The effect sizes for lexical decision priming showed that tone-consistency and sex-consistency were strong following both silent reading and listening 5 min. after studying. These results suggest that readers create episodic traces of text from auditory images of silently read sentences as they do during listening.

  14. Voice disorders in teachers. A review.

    Science.gov (United States)

    Martins, Regina Helena Garcia; Pereira, Eny Regina Bóia Neves; Hidalgo, Caio Bosque; Tavares, Elaine Lara Mendes

    2014-11-01

    Voice disorders are very prevalent among teachers and consequences are serious. Although the literature is extensive, there are differences in the concepts and methodology related to voice problems; most studies are restricted to analyzing the responses of teachers to questionnaires and only a few studies include vocal assessments and videolaryngoscopic examinations to obtain a definitive diagnosis. To review demographic studies related to vocal disorders in teachers to analyze the diverse methodologies, the prevalence rates pointed out by the authors, the main risk factors, the most prevalent laryngeal lesions, and the repercussions of dysphonias on professional activities. The available literature (from 1997 to 2013) was narratively reviewed based on Medline, PubMed, Lilacs, SciELO, and Cochrane library databases. Excluded were articles that specifically analyzed treatment modalities and those that did not make their abstracts available in those databases. The keywords included were teacher, dysphonia, voice disorders, professional voice. Copyright © 2014 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  15. Voice pedagogy-what do we need?

    Science.gov (United States)

    Gill, Brian P; Herbst, Christian T

    2016-12-01

    The final keynote panel of the 10th Pan-European Voice Conference (PEVOC) was concerned with the topic 'Voice pedagogy-what do we need?' In this communication the panel discussion is summarized, and the authors provide a deepening discussion on one of the key questions, addressing the roles and tasks of people working with voice students. In particular, a distinction is made between (1) voice building (derived from the German term 'Stimmbildung'), primarily comprising the functional and physiological aspects of singing; (2) coaching, mostly concerned with performance skills; and (3) singing voice rehabilitation. Both public and private educators are encouraged to apply this distinction to their curricula, in order to arrive at more efficient singing teaching and to reduce the risk of vocal injury to the singers concerned.

  16. Voice Quality Estimation in Wireless Networks

    Directory of Open Access Journals (Sweden)

    Petr Zach

    2015-01-01

    Full Text Available This article deals with the impact of Wireless (Wi-Fi) networks on the perceived quality of voice services. The Quality of Service (QoS) metrics must be monitored in the computer network during the voice data transmission to ensure proper voice service quality the end-user has paid for, especially in the wireless networks. In addition to the QoS, a research area called Quality of Experience (QoE) provides metrics and methods for quality evaluation from the end-user’s perspective. This article focuses on a QoE estimation of Voice over IP (VoIP) calls in the wireless networks using a network simulator. Results contribute to voice quality estimation based on characteristics of the wireless network and location of a wireless client.
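
    The article estimates perceived VoIP quality from network conditions. One widely used way to map one-way delay and packet loss onto a MOS estimate is the ITU-T G.107 E-model, shown below in a simplified form; this is offered as background rather than as the method actually used in the cited study, and the codec parameters are the usual G.711-style defaults.

```python
def e_model_mos(one_way_delay_ms, packet_loss_pct, ie=0.0, bpl=25.1):
    """Simplified ITU-T G.107 E-model: R-factor from delay/loss, then MOS.

    ie and bpl are codec-specific equipment-impairment parameters
    (defaults roughly correspond to G.711 with packet-loss concealment).
    """
    d = one_way_delay_ms
    # Delay impairment Id (common simplified approximation).
    id_ = 0.024 * d + 0.11 * (d - 177.3) * (1 if d > 177.3 else 0)
    # Effective equipment impairment grows with packet loss.
    ie_eff = ie + (95.0 - ie) * packet_loss_pct / (packet_loss_pct + bpl)
    r = 93.2 - id_ - ie_eff
    # Map the R-factor onto the 1..4.5 MOS scale.
    if r < 0:
        return 1.0
    if r > 100:
        return 4.5
    return 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)

print(e_model_mos(one_way_delay_ms=150, packet_loss_pct=1.0))  # roughly 4.2: a good link
```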

  17. Effects of Early Smoking Habits on Young Adult Female Voices in Greece.

    Science.gov (United States)

    Tafiadis, Dionysios; Toki, Eugenia I; Miller, Kevin J; Ziavra, Nausica

    2017-11-01

    Cigarette use is a preventable cause of mortality and diseases. The World Health Organization states that Europe and especially Greece has the highest occurrence of smoking among adults. The prevalence of smoking among women in Greece was estimated to be over 30% in 2012. Smoking is a risk factor for many diseases. Studies have demonstrated the association between smoking and laryngeal pathologies as well as changes in voice characteristics. The purpose of this study was to estimate the effect of early smoking habit on young adult female voices and if they perceive any vocal changes using two assessment methods. The Voice Handicap Index and the acoustic analyses of voice measurements were used, with both serving as mini-assessment protocols. Two hundred and ten young females (110 smokers and 100 nonsmokers) attending the Technological Educational Institute of Epirus in the School of Health and Welfare were included. Statistically significant increases for physical and total scores of the Voice Handicap Index were found in the smokers group (P smoking habits. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  18. Assessing the Impact of Voice-Over Screen-Captured Presentations Delivered Online on Dental Students' Learning.

    Science.gov (United States)

    Schönwetter, Dieter J; Gareau-Wilson, Nicole; Cunha, Rodrigo Sanches; Mello, Isabel

    2016-02-01

    The traditional lecturing method is still one of the most common forms of delivering content to students in dental education, but innovative learning technologies have the potential to improve the effectiveness and quality of teaching dental students. What challenges instructors is the extent to which these learning tools have a direct impact on student learning outcomes. The aim of this study was to assess the impact of a voice-over screen-captured learning tool by identifying a positive, nil, or negative impact on student learning as well as student engagement (affective, behavioral, and cognitive) when compared to the traditional face-to-face lecture. Extraneous variables thought to impact student learning were controlled by the use of baseline measures as well as random assignment of second-year dental students to one of two teaching conditions: voice-over screen-captured presentation delivered online and the traditional classroom lecture. A total of 28 students enrolled in the preclinical course in endodontics at a Canadian dental school participated in the study, 14 in each of the two teaching conditions. The results showed that, in most cases, the students who experienced the online lecture had somewhat higher posttest scores and perceived satisfaction levels than those in the face-to-face lecture group, but the differences did not achieve statistical significance except for their long-term recognition test scores. This study found that the students had comparable learning outcomes whether they experienced the face-to-face or the online lecture, but that the online lecture had a more positive impact on their long-term learning. The controls for extraneous variables used in this study suggest ways to improve research into the comparative impact of traditional and innovative teaching methods on student learning outcomes.

  19. Social power and recognition of emotional prosody: High power is associated with lower recognition accuracy than low power.

    Science.gov (United States)

    Uskul, Ayse K; Paulmann, Silke; Weick, Mario

    2016-02-01

    Listeners have to pay close attention to a speaker's tone of voice (prosody) during daily conversations. This is particularly important when trying to infer the emotional state of the speaker. Although a growing body of research has explored how emotions are processed from speech in general, little is known about how psychosocial factors such as social power can shape the perception of vocal emotional attributes. Thus, the present studies explored how social power affects emotional prosody recognition. In a correlational study (Study 1) and an experimental study (Study 2), we show that high power is associated with lower accuracy in emotional prosody recognition than low power. These results, for the first time, suggest that individuals experiencing high or low power perceive emotional tone of voice differently. (c) 2016 APA, all rights reserved).

  20. A Pattern Recognition Mezzanine based on Associative Memory and FPGA technology for Level 1 Track Triggers for the HL-LHC upgrade

    International Nuclear Information System (INIS)

    Magalotti, D.; Alunni, L.; Bilei, G.M.; Fanò, L.; Servoli, L.; Storchi, L.; Placidi, P.; Spiezia, A.; Biesuz, N.; Fedi, G.; Magazzù, G.; Palla, F.; Rossi, E.; Citraro, S.; Crescioli, F.

    2016-01-01

    The increase in luminosity at the HL-LHC will require the introduction of tracker information into the Level-1 trigger system of the experiments in order to maintain an acceptable trigger rate for selecting interesting events, despite the order-of-magnitude increase in minimum-bias interactions. In order to extract the track information within the required latency (∼5–10 μs, depending on the experiment), a dedicated hardware processor needs to be used. We here propose a prototype system (Pattern Recognition Mezzanine) as the core of pattern recognition and track fitting for HL-LHC experiments, combining the power of both an Associative Memory custom ASIC and modern Field Programmable Gate Array (FPGA) devices

  1. Identifying hidden voice and video streams

    Science.gov (United States)

    Fan, Jieyan; Wu, Dapeng; Nucci, Antonio; Keralapura, Ram; Gao, Lixin

    2009-04-01

    Given the rising popularity of voice and video services over the Internet, accurately identifying voice and video traffic that traverses their networks has become a critical task for Internet service providers (ISPs). As the number of proprietary applications that deliver voice and video services to end users increases over time, the search for the one methodology that can accurately detect such services while being application independent still remains open. This problem becomes even more complicated when voice and video service providers like Skype, Microsoft, and Google bundle their voice and video services with other services like file transfer and chat. For example, a bundled Skype session can contain both a voice stream and a file transfer stream in the same layer-3/layer-4 flow. In this context, traditional techniques to identify voice and video streams do not work. In this paper, we propose a novel self-learning classifier, called VVS-I, that detects the presence of voice and video streams in flows with minimum manual intervention. Our classifier works in two phases: a training phase and a detection phase. In the training phase, VVS-I first extracts the relevant features, and subsequently constructs a fingerprint of a flow using power spectral density (PSD) analysis. In the detection phase, it compares the fingerprint of a flow to the existing fingerprints learned during the training phase, and subsequently classifies the flow. Our classifier is not only capable of detecting voice and video streams that are hidden in different flows, but is also capable of detecting different applications (like Skype, MSN, etc.) that generate these voice/video streams. We show that our classifier can achieve close to 100% detection rate while keeping the false positive rate to less than 1%.
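
    A toy version of the two-phase fingerprinting idea described above: a flow is turned into a bytes-per-interval time series, characterised by its power spectral density, and compared against fingerprints learned in training. The Welch parameters, the correlation-based comparison and the synthetic traffic are illustrative stand-ins for the paper's actual features and classifier.

```python
import numpy as np
from scipy.signal import welch

def flow_fingerprint(bytes_per_interval, fs=100.0):
    """PSD fingerprint of a flow: bytes-per-interval series -> normalised Welch PSD."""
    series = np.asarray(bytes_per_interval, dtype=float)
    _, psd = welch(series, fs=fs, nperseg=min(256, len(series)))
    return psd / (np.linalg.norm(psd) + 1e-12)

def classify_flow(psd, known_fingerprints, threshold=0.9):
    """Return the best-matching traffic class if its correlation exceeds the threshold."""
    best_label, best_score = None, -1.0
    for label, ref in known_fingerprints.items():
        n = min(len(psd), len(ref))
        score = float(np.corrcoef(psd[:n], ref[:n])[0, 1])
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else "unknown"

# Training phase: fingerprints of labelled flows (synthetic placeholder traffic).
rng = np.random.default_rng(1)
training = {"voice": flow_fingerprint(rng.normal(200, 20, 2000)),
            "video": flow_fingerprint(rng.normal(1200, 300, 2000))}

# Detection phase: classify a new flow against the learned fingerprints.
print(classify_flow(flow_fingerprint(rng.normal(200, 20, 2000)), training))
```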

  2. Use of speech-to-text technology for documentation by healthcare providers.

    Science.gov (United States)

    Ajami, Sima

    2016-01-01

    Medical records are a critical component of a patient's treatment. However, documentation of patient-related information is considered a secondary activity in the provision of healthcare services, often leading to incomplete medical records and patient data of low quality. Advances in information technology (IT) in the health system and registration of information in electronic health records (EHR) using speech-to-text conversion software have facilitated service delivery. This narrative review is a literature search with the help of libraries, books, conference proceedings, the databases of Science Direct, PubMed, Proquest, Springer, SID (Scientific Information Database), and search engines such as Yahoo and Google. I used the following keywords and their combinations: speech recognition, automatic report documentation, voice to text software, healthcare, information, and voice recognition. Due to lack of knowledge of other languages, I searched all texts in English or Persian with no time limits. Of a total of 70 articles, only 42 were selected. Speech-to-text conversion technology offers opportunities to improve the documentation process of medical records, reduce cost and time of recording information, enhance the quality of documentation, improve the quality of services provided to patients, and support healthcare providers in legal matters. Healthcare providers should recognize the impact of this technology on service delivery.

  3. A voice-actuated wind tunnel model leak checking system

    Science.gov (United States)

    Larson, William E.

    1989-01-01

    A computer program has been developed that improves the efficiency of wind tunnel model leak checking. The program uses a voice recognition unit to relay a technician's commands to the computer. The computer, after receiving a command, can respond to the technician via a voice response unit. Information about the model pressure orifice being checked is displayed on a gas-plasma terminal. On command, the program records up to 30 seconds of pressure data. After the recording is complete, the raw data and a straight line fit of the data are plotted on the terminal. This allows the technician to make a decision on the integrity of the orifice being checked. All results of the leak check program are stored in a database file that can be listed on the line printer for record keeping purposes or displayed on the terminal to help the technician find unchecked orifices. This program allows one technician to check a model for leaks instead of the two or three previously required.
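
    The decision step in the report (record up to 30 seconds of pressure data, fit a straight line, and judge the orifice from the fit) can be sketched as a small routine; the sampling rate and the leak-rate threshold below are assumed values for illustration only.

      import numpy as np

      def check_orifice(pressures, sample_rate_hz=10.0, max_leak_rate=0.05):
          """Fit a straight line to recorded pressures and flag a leak if the
          magnitude of the slope (pressure change per second) exceeds a threshold."""
          t = np.arange(len(pressures)) / sample_rate_hz
          slope, intercept = np.polyfit(t, pressures, deg=1)
          fitted = slope * t + intercept
          leaking = abs(slope) > max_leak_rate
          return slope, fitted, leaking

      # Example: 30 s of simulated data with a slow pressure drop.
      rng = np.random.default_rng(1)
      t = np.arange(0, 30, 0.1)
      data = 101.3 - 0.08 * t + 0.02 * rng.standard_normal(t.size)
      slope, _, leaking = check_orifice(data)
      print(f"slope = {slope:.3f} units/s, leak suspected: {leaking}")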

  4. Current trends in small vocabulary speech recognition for equipment control

    Science.gov (United States)

    Doukas, Nikolaos; Bardis, Nikolaos G.

    2017-09-01

    Speech recognition systems allow human-machine communication to acquire an intuitive nature that approaches the simplicity of inter-human communication. Small vocabulary speech recognition is a subset of the overall speech recognition problem, where only a small number of words need to be recognized. Speaker-independent small vocabulary recognition can find significant applications in field equipment used by military personnel. Such equipment may typically be controlled by a small number of commands that need to be given quickly and accurately, under conditions where delicate manual operations are difficult to achieve. This type of application could hence significantly benefit from the use of robust voice-operated control components, as they would facilitate the interaction with their users and render it much more reliable in times of crisis. This paper presents current challenges involved in attaining efficient and robust small vocabulary speech recognition. These challenges concern feature selection, classification techniques, speaker diversity and noise effects. A state machine approach is presented that facilitates the voice guidance of different equipment in a variety of situations.
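
    The state machine approach mentioned above can be illustrated with a small table-driven finite state machine that consumes recognized command words; the states and the command vocabulary below are hypothetical, since the paper's actual grammar is not given in the abstract.

      # Hypothetical small-vocabulary command grammar driven by a table-based FSM.
      TRANSITIONS = {
          ("idle",            "radio"):  "radio_selected",
          ("idle",            "camera"): "camera_selected",
          ("radio_selected",  "on"):     "radio_on",
          ("radio_selected",  "off"):    "idle",
          ("camera_selected", "zoom"):   "camera_zoom",
          ("camera_zoom",     "stop"):   "camera_selected",
      }

      def run_commands(words, state="idle"):
          """Feed recognized words through the FSM, ignoring out-of-grammar words."""
          for word in words:
              nxt = TRANSITIONS.get((state, word))
              if nxt is None:
                  print(f"[{state}] ignoring unexpected word '{word}'")
                  continue
              print(f"[{state}] --{word}--> [{nxt}]")
              state = nxt
          return state

      # Example sequence as it might come from a small-vocabulary recognizer.
      final_state = run_commands(["radio", "on", "volume", "camera"])
      print("final state:", final_state)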

  5. Is it me? Self-recognition bias across sensory modalities and its relationship to autistic traits.

    Science.gov (United States)

    Chakraborty, Anya; Chakrabarti, Bhismadev

    2015-01-01

    Atypical self-processing is an emerging theme in autism research, suggested by a lower self-reference effect in memory and atypical neural responses to visual self-representations. Most research on physical self-processing in autism uses visual stimuli. However, the self is a multimodal construct, and therefore, it is essential to test self-recognition in other sensory modalities as well. Self-recognition in the auditory modality remains relatively unexplored and has not been tested in relation to autism and related traits. This study investigates self-recognition in the auditory and visual domains in the general population and tests whether it is associated with autistic traits. Thirty-nine neurotypical adults participated in a two-part study. In the first session, each participant's voice was recorded and face photographed, and these were morphed respectively with voices and faces from unfamiliar identities. In the second session, participants performed a 'self-identification' task, classifying each morph as a 'self' voice (or face) or an 'other' voice (or face). All participants also completed the Autism Spectrum Quotient (AQ). For each sensory modality, the slope of the self-recognition curve was used as the individual self-recognition metric. These two self-recognition metrics were tested for association with each other and with autistic traits. The fifty percent 'self' response was reached at a higher percentage of self in the auditory domain than in the visual domain (t = 3.142). Self-recognition bias was not associated across sensory modalities (τ = -0.165, P = 0.204). Higher recognition bias for self-voice was observed in individuals higher in autistic traits (τ AQ = 0.301, P = 0.008). No such correlation was observed between recognition bias for self-face and autistic traits (τ AQ = -0.020, P = 0.438). Our data show that recognition bias for physical self-representation is not related across sensory modalities. Further, individuals with higher autistic traits were better able

  6. Your Cheatin' Voice Will Tell on You: Detection of Past Infidelity from Voice.

    Science.gov (United States)

    Hughes, Susan M; Harrison, Marissa A

    2017-01-01

    Evidence suggests that many physical, behavioral, and trait qualities can be detected solely from the sound of a person's voice, irrespective of the semantic information conveyed through speech. This study examined whether raters could accurately assess the likelihood that a person has cheated on committed, romantic partners simply by hearing the speaker's voice. Independent raters heard voice samples of individuals who self-reported that they either cheated or had never cheated on their romantic partners. To control for aspects that may clue a listener to the speaker's mate value, we used voice samples that did not differ between these groups for voice attractiveness, age, voice pitch, and other acoustic measures. We found that participants indeed rated the voices of those who had a history of cheating as more likely to cheat. Male speakers were given higher ratings for cheating, and female raters were more likely to ascribe the likelihood to cheat to speakers. Additionally, we manipulated the pitch of the voice samples, and for both sexes, the lower-pitched versions were consistently rated to be from those who were more likely to have cheated. Regardless of the pitch manipulation, raters were able to assess speakers' actual history of infidelity; the one exception was that men's accuracy decreased when judging women whose voices were lowered. These findings expand upon the idea that the human voice may be of value as a cheater detection tool and that very thin slices of vocal information are all that is needed to make certain assessments about others.

  7. A pneumatic Bionic Voice prosthesis-Pre-clinical trials of controlling the voice onset and offset.

    Directory of Open Access Journals (Sweden)

    Farzaneh Ahmadi

    Full Text Available Despite emergent progress in many fields of bionics, a functional Bionic Voice prosthesis for laryngectomy patients (larynx amputees) has not yet been achieved, leading to a lifetime of vocal disability for these patients. This study introduces a novel framework of Pneumatic Bionic Voice Prostheses as an electronic adaptation of the Pneumatic Artificial Larynx (PAL) device. The PAL is a non-invasive mechanical voice source, driven exclusively by respiration with an exceptionally high voice quality, comparable to the existing gold standard of Tracheoesophageal (TE) voice prosthesis. Following PAL design closely as the reference, Pneumatic Bionic Voice Prostheses seem to have a strong potential to substitute the existing gold standard by generating a similar voice quality while remaining non-invasive and non-surgical. This paper designs the first Pneumatic Bionic Voice prosthesis and evaluates its onset and offset control against the PAL device through pre-clinical trials on one laryngectomy patient. The evaluation on a database of more than five hours of continuous/isolated speech recordings shows a close match between the onset/offset control of the Pneumatic Bionic Voice and the PAL with an accuracy of 98.45 ±0.54%. When implemented in real-time, the Pneumatic Bionic Voice prosthesis controller has an average onset/offset delay of 10 milliseconds compared to the PAL. Hence it addresses a major disadvantage of previous electronic voice prostheses, including myoelectric Bionic Voice, in meeting the short time-frames of controlling the onset/offset of the voice in continuous speech.

  8. A pneumatic Bionic Voice prosthesis-Pre-clinical trials of controlling the voice onset and offset.

    Science.gov (United States)

    Ahmadi, Farzaneh; Noorian, Farzad; Novakovic, Daniel; van Schaik, André

    2018-01-01

    Despite emergent progress in many fields of bionics, a functional Bionic Voice prosthesis for laryngectomy patients (larynx amputees) has not yet been achieved, leading to a lifetime of vocal disability for these patients. This study introduces a novel framework of Pneumatic Bionic Voice Prostheses as an electronic adaptation of the Pneumatic Artificial Larynx (PAL) device. The PAL is a non-invasive mechanical voice source, driven exclusively by respiration with an exceptionally high voice quality, comparable to the existing gold standard of Tracheoesophageal (TE) voice prosthesis. Following PAL design closely as the reference, Pneumatic Bionic Voice Prostheses seem to have a strong potential to substitute the existing gold standard by generating a similar voice quality while remaining non-invasive and non-surgical. This paper designs the first Pneumatic Bionic Voice prosthesis and evaluates its onset and offset control against the PAL device through pre-clinical trials on one laryngectomy patient. The evaluation on a database of more than five hours of continuous/isolated speech recordings shows a close match between the onset/offset control of the Pneumatic Bionic Voice and the PAL with an accuracy of 98.45 ±0.54%. When implemented in real-time, the Pneumatic Bionic Voice prosthesis controller has an average onset/offset delay of 10 milliseconds compared to the PAL. Hence it addresses a major disadvantage of previous electronic voice prostheses, including myoelectric Bionic Voice, in meeting the short time-frames of controlling the onset/offset of the voice in continuous speech.

  9. A pneumatic Bionic Voice prosthesis—Pre-clinical trials of controlling the voice onset and offset

    Science.gov (United States)

    Noorian, Farzad; Novakovic, Daniel; van Schaik, André

    2018-01-01

    Despite emergent progress in many fields of bionics, a functional Bionic Voice prosthesis for laryngectomy patients (larynx amputees) has not yet been achieved, leading to a lifetime of vocal disability for these patients. This study introduces a novel framework of Pneumatic Bionic Voice Prostheses as an electronic adaptation of the Pneumatic Artificial Larynx (PAL) device. The PAL is a non-invasive mechanical voice source, driven exclusively by respiration with an exceptionally high voice quality, comparable to the existing gold standard of Tracheoesophageal (TE) voice prosthesis. Following PAL design closely as the reference, Pneumatic Bionic Voice Prostheses seem to have a strong potential to substitute the existing gold standard by generating a similar voice quality while remaining non-invasive and non-surgical. This paper designs the first Pneumatic Bionic Voice prosthesis and evaluates its onset and offset control against the PAL device through pre-clinical trials on one laryngectomy patient. The evaluation on a database of more than five hours of continuous/isolated speech recordings shows a close match between the onset/offset control of the Pneumatic Bionic Voice and the PAL with an accuracy of 98.45 ±0.54%. When implemented in real-time, the Pneumatic Bionic Voice prosthesis controller has an average onset/offset delay of 10 milliseconds compared to the PAL. Hence it addresses a major disadvantage of previous electronic voice prostheses, including myoelectric Bionic Voice, in meeting the short time-frames of controlling the onset/offset of the voice in continuous speech. PMID:29466455

  10. Error-dependent modulation of speech-induced auditory suppression for pitch-shifted voice feedback

    Directory of Open Access Journals (Sweden)

    Larson Charles R

    2011-06-01

    Full Text Available Abstract Background The motor-driven predictions about expected sensory feedback (efference copies) have been proposed to play an important role in recognition of sensory consequences of self-produced motor actions. In the auditory system, this effect was suggested to result in suppression of sensory neural responses to self-produced voices that are predicted by the efference copies during vocal production in comparison with passive listening to the playback of the identical self-vocalizations. In the present study, event-related potentials (ERPs) were recorded in response to upward pitch shift stimuli (PSS) with five different magnitudes (0, +50, +100, +200 and +400 cents) at voice onset during active vocal production and passive listening to the playback. Results Results indicated that the suppression of the N1 component during vocal production was largest for unaltered voice feedback (PSS: 0 cents), became smaller as the magnitude of PSS increased to 200 cents, and was almost completely eliminated in response to 400 cents stimuli. Conclusions Findings of the present study suggest that the brain utilizes the motor predictions (efference copies) to determine the source of incoming stimuli and maximally suppresses the auditory responses to unaltered feedback of self-vocalizations. The reduction of suppression for 50, 100 and 200 cents and its elimination for 400 cents pitch-shifted voice auditory feedback support the idea that motor-driven suppression of voice feedback leads to distinctly different sensory neural processing of self vs. non-self vocalizations. This characteristic may enable the audio-vocal system to more effectively detect and correct for unexpected errors in the feedback of self-produced voice pitch compared with externally-generated sounds.

  11. Error-dependent modulation of speech-induced auditory suppression for pitch-shifted voice feedback.

    Science.gov (United States)

    Behroozmand, Roozbeh; Larson, Charles R

    2011-06-06

    The motor-driven predictions about expected sensory feedback (efference copies) have been proposed to play an important role in recognition of sensory consequences of self-produced motor actions. In the auditory system, this effect was suggested to result in suppression of sensory neural responses to self-produced voices that are predicted by the efference copies during vocal production in comparison with passive listening to the playback of the identical self-vocalizations. In the present study, event-related potentials (ERPs) were recorded in response to upward pitch shift stimuli (PSS) with five different magnitudes (0, +50, +100, +200 and +400 cents) at voice onset during active vocal production and passive listening to the playback. Results indicated that the suppression of the N1 component during vocal production was largest for unaltered voice feedback (PSS: 0 cents), became smaller as the magnitude of PSS increased to 200 cents, and was almost completely eliminated in response to 400 cents stimuli. Findings of the present study suggest that the brain utilizes the motor predictions (efference copies) to determine the source of incoming stimuli and maximally suppresses the auditory responses to unaltered feedback of self-vocalizations. The reduction of suppression for 50, 100 and 200 cents and its elimination for 400 cents pitch-shifted voice auditory feedback support the idea that motor-driven suppression of voice feedback leads to distinctly different sensory neural processing of self vs. non-self vocalizations. This characteristic may enable the audio-vocal system to more effectively detect and correct for unexpected errors in the feedback of self-produced voice pitch compared with externally-generated sounds.

  12. Mindfulness of voices, self-compassion, and secure attachment in relation to the experience of hearing voices.

    Science.gov (United States)

    Dudley, James; Eames, Catrin; Mulligan, John; Fisher, Naomi

    2018-03-01

    Developing compassion towards oneself has been linked to improvement in many areas of psychological well-being, including psychosis. Furthermore, developing a non-judgemental, accepting way of relating to voices is associated with lower levels of distress for people who hear voices. These factors have also been associated with secure attachment. This study explores associations between the constructs of mindfulness of voices, self-compassion, and distress from hearing voices, and how secure attachment style related to each of these variables. The design was cross-sectional and online. One hundred and twenty-eight people (73% female; M age = 37.5; 87.5% Caucasian) who currently hear voices completed the Self-Compassion Scale, Southampton Mindfulness of Voices Questionnaire, Relationships Questionnaire, and Hamilton Programme for Schizophrenia Voices Questionnaire. Results showed that mindfulness of voices mediated the relationship between self-compassion and severity of voices, and self-compassion mediated the relationship between mindfulness of voices and severity of voices. Self-compassion and mindfulness of voices were significantly positively correlated with each other and negatively correlated with distress and severity of voices. Mindful relating to voices and self-compassion are associated with reduced distress and severity of voices, which supports the proposed potential benefits of mindful relating to voices and self-compassion as therapeutic skills for people experiencing distress from voice hearing. Greater self-compassion and mindfulness of voices were significantly associated with less distress from voices. These findings support theory underlining compassionate mind training. Mindfulness of voices mediated the relationship between self-compassion and distress from voices, indicating a synergistic relationship between the constructs. Although the current findings do not give a direction of causation, consideration is given to the potential impact of mindful and

  13. Gait recognition based on integral outline

    Science.gov (United States)

    Ming, Guan; Fang, Lv

    2017-02-01

    Biometric identification technology is increasingly replacing traditional security technology, and gait recognition has become a research hot spot because gait features are difficult to imitate or steal. This paper presents a gait recognition system based on the integral outline of the human body. The system has three important aspects: the preprocessing of gait images, feature extraction and classification. Finally, a polling method is used to evaluate the performance of the system, and the problems existing in gait recognition and future directions of development are summarized.

  14. Robustness-related issues in speaker recognition

    CERN Document Server

    Zheng, Thomas Fang

    2017-01-01

    This book presents an overview of speaker recognition technologies with an emphasis on dealing with robustness issues. Firstly, the book gives an overview of speaker recognition, such as the basic system framework, categories under different criteria, performance evaluation and its development history. Secondly, with regard to robustness issues, the book presents three categories, including environment-related issues, speaker-related issues and application-oriented issues. For each category, the book describes the current hot topics, existing technologies, and potential research focuses in the future. The book is a useful reference and self-learning guide for early researchers working in the field of robust speaker recognition.

  15. Exploring multiliteracies, student voice, and scientific practices in two elementary classrooms

    Science.gov (United States)

    Allison, Elizabeth Rowland

    This study explored the voices of children in a changing world with evolving needs and new opportunities. The workplaces of rapidly moving capitalist societies value creativity, collaboration, and critical thinking skills which are of growing importance and manifesting themselves in modern K-12 science classroom cultures (Gee, 2000; New London Group, 2000). This study explored issues of multiliteracies and student voice set within the context of teaching and learning in 4th and 5th grade science classrooms. The purpose of the study was to ascertain what and how multiliteracies and scientific practices (NGSS Lead States, 2013c) are implemented, explore how multiliteracies influence students' voices, and investigate teacher and student perceptions of multiliteracies, student voice, and scientific practices. Grounded in a constructivist framework, a multiple case study was employed in two elementary classrooms. Through observations, student focus groups and interviews, and teacher interviews, a detailed narrative was created to describe a range of multiliteracies, student voice, and scientific practices that occurred with the science classroom context. Using grounded theory analysis, data were coded and analyzed to reveal emergent themes. Data analysis revealed that these two classrooms were enriched with multiliteracies that serve metaphorically as breeding grounds for student voice. In the modern classroom, defined as a space where information is instantly accessible through the Internet, multiliteracies can be developed through inquiry-based, collaborative, and technology-rich experiences. Scientific literacy, cultivated through student communication and collaboration, is arguably a multiliteracy that has not been considered in the literature, and should be, as an integral component of overall individual literacy in the 21st century. Findings revealed four themes. Three themes suggest that teachers address several modes of multiliteracies in science, but identify

  16. Image Quality Enhancement Using the Direction and Thickness of Vein Lines for Finger-Vein Recognition

    OpenAIRE

    Park, Young Ho; Park, Kang Ryoung

    2012-01-01

    On the basis of the increased emphasis placed on the protection of privacy, biometric recognition systems using physical or behavioural characteristics such as fingerprints, facial characteristics, iris and finger‐vein patterns or the voice have been introduced in applications including door access control, personal certification, Internet banking and ATM machines. Among these, finger‐vein recognition is advantageous in that it involves the use of inexpensive and small devices that are diffic...

  17. Authentication: From Passwords to Biometrics: An implementation of a speaker recognition system on Android

    OpenAIRE

    Heimark, Erlend

    2012-01-01

    We implement a biometric authentication system on the Android platform, which is based on text-dependent speaker recognition. The Android version used in the application is Android 4.0. The application makes use of the Modular Audio Recognition Framework, from which many of the algorithms are adapted in the processes of preprocessing and feature extraction. In addition, we employ the Dynamic Time Warping (DTW) algorithm for the comparison of different voice features. A training procedure is i...
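
    The Dynamic Time Warping (DTW) comparison of voice features mentioned above can be sketched as below; the sketch assumes the feature vectors (for example, per-frame MFCCs) have already been extracted in an earlier step, and the toy data is purely illustrative.

      import numpy as np

      def dtw_distance(seq_a, seq_b):
          """Dynamic Time Warping distance between two sequences of feature vectors
          (shape: frames x dims), using Euclidean frame-to-frame cost."""
          n, m = len(seq_a), len(seq_b)
          cost = np.full((n + 1, m + 1), np.inf)
          cost[0, 0] = 0.0
          for i in range(1, n + 1):
              for j in range(1, m + 1):
                  d = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])
                  cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                       cost[i, j - 1],      # deletion
                                       cost[i - 1, j - 1])  # match
          return cost[n, m]

      # Toy example: an enrolment utterance, a noisy repetition, and an impostor.
      rng = np.random.default_rng(2)
      enrol = rng.standard_normal((40, 13))                   # 40 frames of 13-dim features
      probe = enrol + 0.05 * rng.standard_normal((40, 13))
      impostor = rng.standard_normal((35, 13))
      print("genuine :", dtw_distance(enrol, probe))
      print("impostor:", dtw_distance(enrol, impostor))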

  18. Psychological effects of dysphonia in voice professionals.

    Science.gov (United States)

    Salturk, Ziya; Kumral, Tolgar Lutfi; Aydoğdu, Imran; Arslanoğlu, Ahmet; Berkiten, Güler; Yildirim, Güven; Uyar, Yavuz

    2015-08-01

    To evaluate the psychological effects of dysphonia in voice professionals compared to non-voice professionals and in both genders. Cross-sectional analysis. Forty-eight voice professionals and 52 non-voice professionals with dysphonia were included in this study. All participants underwent a complete ear, nose, and throat examination and an evaluation for pathologies that might affect vocal quality. Participants were asked to complete the Turkish versions of the Voice Handicap Index-30 (VHI-30), Perceived Stress Scale (PSS), and the Hospital Anxiety and Depression Scale (HADS). HADS scores were evaluated as HADS-A (anxiety) and HADS-D (depression). Dysphonia status was evaluated perceptually with the grade, roughness, breathiness, asthenia, and strain (GRBAS) scale. The results were compared statistically. Significant differences between the two groups were evident when the VHI-30 and PSS data were compared (P = .00001 and P = .00001, respectively). However, neither HADS score (HADS-A and HADS-D) differed between groups. An analysis of the scores in terms of sex revealed that females had significantly higher PSS scores (P = .006). The GRBAS scale revealed no difference between groups (P = .819, .931, .803, .655, and .803, respectively). No between-sex differences in the VHI-30 or HADS scores were evident. We found that voice professionals and females experienced more stress and were more dissatisfied with their voices. 4. © 2015 The American Laryngological, Rhinological and Otological Society, Inc.

  19. Reliability in perceptual analysis of voice quality.

    Science.gov (United States)

    Bele, Irene Velsvik

    2005-12-01

    This study focuses on speaking voice quality in male teachers (n = 35) and male actors (n = 36), who represent untrained and trained voice users, because we wanted to investigate normal and supranormal voices. In this study, both substantial and methodologic aspects were considered. It includes a method for perceptual voice evaluation, and a basic issue was rater reliability. A listening group of 10 listeners, 7 experienced speech-language therapists and 3 speech-language therapist students, rated the voices on 15 vocal characteristics using visual analogue (VA) scales. Two sets of voice signals were investigated: text reading (2 loudness levels) and sustained vowel (3 levels). The results indicated a high interrater reliability for most perceptual characteristics. Both types of voice signals were evaluated reliably, although connected speech was evaluated somewhat more reliably than vowels, especially at the normal loudness level. Experienced listeners tended to be more consistent in their ratings than did the student raters. Some vocal characteristics achieved acceptable reliability even with a smaller panel of listeners. The perceptual characteristics were grouped into 4 factors reflecting perceptual dimensions.

  20. Muted 'voice': The writing of two groups of postgraduate ...

    African Journals Online (AJOL)

    The purpose of this article is to demonstrate and account for the weak emergence of 'voice' in the writing of students embarking upon their postgraduate studies in Geosciences. The two elements of 'voice' that are emphasised are 'voice' as style of expression and 'voice' as the ability to write distinctly, yet building upon ...

  1. Performance of Phonatory Deviation Diagrams in Synthesized Voice Analysis.

    Science.gov (United States)

    Lopes, Leonardo Wanderley; da Silva, Karoline Evangelista; da Silva Evangelista, Deyverson; Almeida, Anna Alice; Silva, Priscila Oliveira Costa; Lucero, Jorge; Behlau, Mara

    2018-05-02

    To analyze the performance of a phonatory deviation diagram (PDD) in discriminating the presence and severity of voice deviation and the predominant voice quality of synthesized voices. A speech-language pathologist performed the auditory-perceptual analysis of the synthesized voices (n = 871). The PDD distribution of voice signals was analyzed according to area, quadrant, shape, and density. Differences in signal distribution regarding the PDD area and quadrant were detected when differentiating the signals with and without voice deviation and with different predominant voice quality. Differences in signal distribution were found in all PDD parameters as a function of the severity of voice disorder. The PDD area and quadrant can differentiate normal voices from deviant synthesized voices. There are differences in signal distribution in PDD area and quadrant as a function of the severity of voice disorder and the predominant voice quality. However, the PDD area and quadrant do not differentiate the signals as a function of severity of voice disorder and differentiate only the breathy and rough voices from the normal and strained voices. PDD density is able to differentiate only signals with moderate and severe deviation. PDD shape shows differences between signals with different severities of voice deviation. © 2018 S. Karger AG, Basel.

  2. Compact Acoustic Models for Embedded Speech Recognition

    Directory of Open Access Journals (Sweden)

    Lévy Christophe

    2009-01-01

    Full Text Available Speech recognition applications are known to require a significant amount of resources. However, embedded speech recognition allows only a few KB of memory, a few MIPS, and a small amount of training data. In order to fit the resource constraints of embedded applications, an approach based on a semicontinuous HMM system using state-independent acoustic modelling is proposed. A transformation is computed and applied to the global model in order to obtain each HMM state-dependent probability density function, so that only the transformation parameters need to be stored. This approach is evaluated on two tasks: digit and voice-command recognition. A fast adaptation technique of acoustic models is also proposed. In order to significantly reduce computational costs, the adaptation is performed only on the global model (using related speaker recognition adaptation techniques), with no need for state-dependent data. The whole approach results in a relative gain of more than 20% compared to a basic HMM-based system fitting the constraints.
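
    The core idea above, a single state-independent global model plus a stored per-state transformation that yields each state-dependent density, can be illustrated with a small NumPy sketch. The particular transform used here, a per-state shift and scale of the global Gaussian means and variances, is an assumption for illustration; the paper's actual transformation is not reproduced.

      import numpy as np

      rng = np.random.default_rng(3)
      DIM, MIX = 13, 8                      # feature dimension, mixtures in the global model

      # Global (state-independent) GMM, stored once.
      global_means = rng.standard_normal((MIX, DIM))
      global_vars = np.ones((MIX, DIM))
      global_weights = np.full(MIX, 1.0 / MIX)

      # Per-state model = transformation of the global model (here: per-state shift + scale),
      # so each state stores only 2*DIM parameters instead of a full GMM of its own.
      state_transforms = {
          "s1": {"shift": rng.standard_normal(DIM) * 0.5, "scale": np.ones(DIM)},
          "s2": {"shift": rng.standard_normal(DIM) * 0.5, "scale": np.full(DIM, 1.2)},
      }

      def log_gauss_diag(x, mean, var):
          return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

      def state_loglik(x, transform):
          """Log-likelihood of frame x under the transformed global GMM."""
          means = global_means * transform["scale"] + transform["shift"]
          variances = global_vars * transform["scale"] ** 2
          comps = [np.log(w) + log_gauss_diag(x, m, v)
                   for w, m, v in zip(global_weights, means, variances)]
          return np.logaddexp.reduce(comps)

      frame = rng.standard_normal(DIM)
      for state, tr in state_transforms.items():
          print(state, state_loglik(frame, tr))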

  3. A Survey of Equipment in the Singing Voice Studio and Its Perceived Effectiveness by Vocologists and Student Singers.

    Science.gov (United States)

    Gerhard, Julia; Rosow, David E

    2016-05-01

    Speech-language pathologists have long used technology for the clinical measurement of the speaking voice, but present research shows that vocal pedagogues and voice students are becoming more accepting of technology in the studio. As a result, the equipment and technology used in singing voice studios by speech-language pathologists and vocal pedagogues are changing. Although guides exist regarding equipment and technology necessary for developing a voice laboratory and private voice studio, there are no data documenting the current implementation of these items and their perceived effectiveness. This study seeks to document current trends in equipment used in voice laboratories and studios. Two separate surveys were distributed to 60 vocologists and approximately 300 student singers representative of the general singing student population. The surveys contained questions about the inventory of items found in voice studios and the perceived effectiveness of these items. Data were analyzed using descriptive analyses and statistical analyses when applicable. Twenty-six of 60 potential vocologists responded, and 66 student singers responded. The vocologists reported highly uniform inventories and ratings of studio items. There were wide-ranging differences between the inventories reported by the vocologist and student singer groups. Statistically significant differences between ratings of effectiveness of studio items were found for 11 of the 17 items. In all significant cases, vocologists rated usefulness to be higher than student singers. Although the order of rankings of vocologists and student singers was similar, a much higher percentage of vocologists than students reported the items as being efficient and effective. The historically typical studio items, including the keyboard and mirror, were ranked as most effective by both vocologists and student singers. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  4. Voicing children's critique and utopias

    DEFF Research Database (Denmark)

    Husted, Mia; Lind, Unni

    and restrictions, calls for aesthetics and sensuality, longings for home and parents, longings for better social relations. Making children's voices visible allows preschool teachers to reflect children's knowledge and life world in pedagogical practice. Keywords: empowerment and participation, action research...... children to raise and render visible their own critique and wishes related to their everyday life in daycare. Research on how and why to engage children as participants in research and in institutional developments addresses overall interests in democratization and humanization that can be traced back...... to strategies for Nordic welfare developments and the Conventions on Children's Rights. The theoretical and methodological framework follows the lines of Lewin (1948) and Dewey (1916) on how to form and learn democracy. The study is carried out as action research involving 50 children aged three to five

  5. His Master’s Voice?

    DEFF Research Database (Denmark)

    Sörbom, Adrienne; Garsten, Christina

    This paper departs from an interest in the involvement of business leaders in the sphere of politics, in the broad sense. Many global business leaders today do much more than engage narrowly in their own corporation and its search for profit. At a general level, we are seeing a proliferation...... as political. What is the role of business in the World Economic Forum, and how do business corporations advance their interests through the WEF? The results show that corporations find a strategically positioned amplifier for their non-market interests in the WEF. The WEF functions to enhance and gain...... leverage for their ideas and priorities in a highly selective and resourceful environment. In the long run, both the market priorities and the political interests of business may be served by engagement in the WEF. However, the WEF cannot only be conceived as the extended voice of corporations. The WEF...

  6. Giving the Customer a Voice

    DEFF Research Database (Denmark)

    Van der Hoven, Christopher; Michea, Adela; Varnes, Claus

    , for example there are studies that have strongly criticized focus groups, interviews and surveys (e.g. Ulwick, 2002; Goffin et al, 2010; Sandberg, 2002). In particular, a point is made that, “…traditional market research and development approaches proved to be particularly ill-suited to breakthrough products...... the voice of the customer (VoC) through market research is well documented (Davis, 1993; Mullins and Sutherland, 1998; Cooper et al., 2002; Flint, 2002; Davilla et al., 2006; Cooper and Edgett, 2008; Cooper and Dreher, 2010; Goffin and Mitchell, 2010). However, not all research methods are well received......” (Deszca et al, 2010, p613). Therefore, in situations where traditional techniques - interviews and focus groups - are ineffective, the question is which market research techniques are appropriate, particularly for developing breakthrough products? To investigate this, an attempt was made to access...

  7. Dangertalk: Voices of abortion providers.

    Science.gov (United States)

    Martin, Lisa A; Hassinger, Jane A; Debbink, Michelle; Harris, Lisa H

    2017-07-01

    Researchers have described the difficulties of doing abortion work, including the psychosocial costs to individual providers. Some have discussed the self-censorship in which providers engage to protect themselves and the pro-choice movement. However, few have examined the costs of this self-censorship to public discourse and social movements in the US. Using qualitative data collected during abortion providers' discussions of their work, we explore the tensions between their narratives and pro-choice discourse, and examine the types of stories that are routinely silenced - narratives we name "dangertalk". Using these data, we theorize about the ways in which giving voice to these tensions might transform current abortion discourse by disrupting false dichotomies and better reflecting the complex realities of abortion. We present a conceptual model for dangertalk in abortion discourse, connecting it to functions of dangertalk in social movements more broadly. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. Mediatization: a concept, multiple voices

    Directory of Open Access Journals (Sweden)

    Pedro Gilberto GOMES

    2016-12-01

    Full Text Available Mediatization has increasingly become a key, fundamental concept for describing the present and the history of media and the communicative change taking place. It has become part of a whole and can no longer be seen as a separate sphere. In this perspective, mediatization is used as a concept to describe the process of expansion of the different technical media and to consider the interrelationships between communicative change, media and sociocultural change. However, although many researchers use the concept of mediatization, each gives it the meaning that best suits their needs. Thus, the concept of mediatization is treated with multiple voices. This paper discusses this problem and presents a preliminary position on the matter.

  9. Specific features of modern voice protection systems

    Directory of Open Access Journals (Sweden)

    Roman A. Ustinov

    2017-11-01

    Full Text Available Nowadays, speech technologies are among the most vibrant sectors of the world's economy. Of high importance is the problem of ensuring the security of speech information (SI). Here we discuss SI protection systems within a modern communication model. The model is multimodal, multithreaded, and implies a large number of subscribers interacting via several communication lines. With this in mind, we perform a detailed analysis of threats to the confidentiality, integrity and accessibility of SI. Existing methods of counteraction against these threats are discussed, and shown to be insufficient to ensure the safety of voice messages (VM) in full. Meanwhile, there are new threats to the integrity and accessibility of SI, for which solutions either do not exist or are only being developed. We propose our original approach to counter these threats. Steganography methods are the most promising for ensuring the integrity of the VM. In particular, using audiomarkers allows one to reliably trace the speaker's identity throughout the entire communication session. In order to counter the threats to SI availability due to the capacity of the communication channel and the limited volumes of VM data storage, it is necessary to improve existing adaptive speech compression algorithms, along with developing new ones. Furthermore, such algorithms must maintain the specified level of speech intelligibility.

  10. Disability: a voice in Australian bioethics?

    Science.gov (United States)

    Newell, Christopher

    2003-06-01

    The rise of research and advocacy over the years to establish a disability voice in Australia with regard to bioethical issues is explored. This includes an analysis of some of the political processes and engagement in mainstream bioethical debate. An understanding of the politics of rejected knowledge is vital in understanding the muted disability voices in Australian bioethics and public policy. It is also suggested that the voices of those who are marginalised or oppressed in society, such as people with disability, have particular contribution to make in fostering critical bioethics.

  11. Unfamiliar voice identification: Effect of post-event information on accuracy and voice ratings

    Directory of Open Access Journals (Sweden)

    Harriet Mary Jessica Smith

    2014-04-01

    Full Text Available This study addressed the effect of misleading post-event information (PEI on voice ratings, identification accuracy, and confidence, as well as the link between verbal recall and accuracy. Participants listened to a dialogue between male and female targets, then read misleading information about voice pitch. Participants engaged in verbal recall, rated voices on a feature checklist, and made a lineup decision. Accuracy rates were low, especially on target-absent lineups. Confidence and accuracy were unrelated, but the number of facts recalled about the voice predicted later lineup accuracy. There was a main effect of misinformation on ratings of target voice pitch, but there was no effect on identification accuracy or confidence ratings. As voice lineup evidence from earwitnesses is used in courts, the findings have potential applied relevance.

  12. Bringing voice in policy building.

    Science.gov (United States)

    Lotrecchiano, Gaetano R; Kane, Mary; Zocchi, Mark S; Gosa, Jessica; Lazar, Danielle; Pines, Jesse M

    2017-07-03

    Purpose The purpose of this paper is to describe the use of group concept mapping (GCM) as a tool for developing a conceptual model of an episode of acute, unscheduled care from illness or injury to outcomes such as recovery, death and chronic illness. Design/methodology/approach After generating a literature review and drafting an initial conceptual model, GCM software (CS Global MAX™) is used to organize and identify strengths and directionality between concepts generated through feedback about the model from several stakeholder groups: acute care and non-acute care providers, patients, payers and policymakers. Through online and in-person population-specific focus groups, the GCM approach seeks feedback, assigned relationships and articulated priorities from participants to produce an output map describing overarching concepts and relationships within and across subsamples. Findings A clustered concept map made up of relational data points that produced a taxonomy of feedback was used to update the model for use in soliciting additional feedback from two technical expert panels (TEPs), and finally, a public comment exercise was performed. The results were a stakeholder-informed improved model for an acute care episode, identified factors that influence process and outcomes, and policy recommendations, which were delivered to the Department of Health and Human Services' (DHHS) Assistant Secretary for Preparedness and Response. Practical implications This study provides an example of the value of cross-population multi-stakeholder input to increase voice in health stakeholder groups that share a problem. Originality/value This paper provides GCM results and a visual analysis of the relational characteristics both within and across sub-populations involved in the study. It also provides an assessment of observational key factors supporting how different stakeholder voices can be integrated to inform model development and policy recommendations.

  13. Cross-cultural emotional prosody recognition: evidence from Chinese and British listeners.

    Science.gov (United States)

    Paulmann, Silke; Uskul, Ayse K

    2014-01-01

    This cross-cultural study of emotional tone of voice recognition tests the in-group advantage hypothesis (Elfenbein & Ambady, 2002) employing a quasi-balanced design. Individuals of Chinese and British background were asked to recognise pseudosentences produced by Chinese and British native speakers, displaying one of seven emotions (anger, disgust, fear, happiness, neutral tone of voice, sadness, and surprise). Findings reveal that emotional displays were recognised at rates higher than predicted by chance; however, members of each cultural group were more accurate in recognising the displays communicated by a member of their own cultural group than a member of the other cultural group. Moreover, the evaluation of error matrices indicates that both cultural groups relied on similar mechanisms when recognising emotional displays from the voice. Overall, the study reveals evidence for both universal and culture-specific principles in vocal emotion recognition.

  14. Kazakh Traditional Dance Gesture Recognition

    Science.gov (United States)

    Nussipbekov, A. K.; Amirgaliyev, E. N.; Hahn, Minsoo

    2014-04-01

    Full-body gesture recognition is an important, interdisciplinary research field that is widely used in many application areas, including dance gesture recognition. The rapid growth of technology in recent years has contributed a great deal to this domain; however, it is still a challenging task. In this paper we implement Kazakh traditional dance gesture recognition. We use a Microsoft Kinect camera to obtain human skeleton and depth information. We then apply a tree-structured Bayesian network and the Expectation Maximization algorithm with K-means clustering to calculate conditional linear Gaussians for classifying poses, and finally we use a Hidden Markov Model to detect dance gestures. Our main contribution is that we extend the Kinect skeleton by adding headwear as a new skeleton joint, calculated from the depth image. This novelty allows us to significantly improve the accuracy of head gesture recognition for a dancer, which in turn plays a considerable role in whole-body gesture recognition. Experimental results show the efficiency of the proposed method and that its performance is comparable to that of state-of-the-art systems.
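
    The abstract does not describe how the headwear joint is computed from the depth image, so the following sketch shows one plausible approach, stated purely as an assumption: starting from the Kinect head joint's pixel, scan upward in the depth map and take the topmost pixel whose depth stays close to the head depth as the headwear joint.

      import numpy as np

      def headwear_joint(depth_img, head_px, head_depth_mm, depth_tol_mm=150):
          """Scan upward from the head joint pixel and return the topmost pixel whose
          depth stays within depth_tol_mm of the head depth (assumed headwear tip).
          depth_img: HxW depth map in mm; head_px: (row, col) of the Kinect head joint."""
          row, col = head_px
          top = head_px
          for r in range(row, -1, -1):
              d = depth_img[r, col]
              if d == 0 or abs(float(d) - head_depth_mm) > depth_tol_mm:
                  break                       # left the body/headwear surface
              top = (r, col)
          return top

      # Toy depth map: a person at ~2000 mm with a tall hat above the head joint.
      depth = np.zeros((240, 320), dtype=np.uint16)
      depth[60:200, 150:170] = 2000           # body + headwear column
      head = (100, 160)
      print("headwear joint at", headwear_joint(depth, head, depth[head]))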

  15. Multistage Data Selection-based Unsupervised Speaker Adaptation for Personalized Speech Emotion Recognition

    NARCIS (Netherlands)

    Kim, Jaebok; Park, Jeong-Sik

    This paper proposes an efficient speech emotion recognition (SER) approach that utilizes personal voice data accumulated on personal devices. A representative weakness of conventional SER systems is the user-dependent performance induced by the speaker independent (SI) acoustic model framework. But,

  16. Multimodal emotion recognition as assessment for learning in a game-based communication skills training

    NARCIS (Netherlands)

    Nadolski, Rob; Bahreini, Kiavash; Westera, Wim

    2014-01-01

    This paper presentation describes how our FILTWAM software artifacts for face and voice emotion recognition will be used for assessing learners' progress and providing adequate feedback in an online game-based communication skills training. This constitutes an example of in-game assessment for

  17. Multimodal Emotion Recognition for Assessment of Learning in a Game-Based Communication Skills Training

    NARCIS (Netherlands)

    Bahreini, Kiavash; Nadolski, Rob; Westera, Wim

    2015-01-01

    This paper describes how our FILTWAM software artifacts for face and voice emotion recognition will be used for assessing learners' progress and providing adequate feedback in an online game-based communication skills training. This constitutes an example of in-game assessment for mainly formative

  18. Music Information Retrieval from a Singing Voice Using Lyrics and Melody Information

    Directory of Open Access Journals (Sweden)

    Shozo Makino

    2007-01-01

    Full Text Available Recently, several music information retrieval (MIR) systems which retrieve musical pieces by the user's singing voice have been developed. All of these systems use only melody information for retrieval, although lyrics information is also useful for retrieval. In this paper, we propose a new MIR system that uses both lyrics and melody information. First, we propose a new lyrics recognition method. A finite state automaton (FSA) is used as recognition grammar, and about 86% retrieval accuracy was obtained. We also develop an algorithm for verifying a hypothesis output by a lyrics recognizer. Melody information is extracted from an input song using several pieces of information of the hypothesis, and a total score is calculated from the recognition score and the verification score. From the experimental results, 95.0% retrieval accuracy was obtained with a query consisting of five words.
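
    The retrieval scheme described above, a finite state automaton (FSA) used as the recognition grammar plus a total score combining the recognition and melody-verification scores, can be sketched as below. The lyrics, the five-word window grammar, the scores, and the equal weighting are all illustrative assumptions rather than details from the paper.

      # Minimal sketch: accepting paths of the grammar are contiguous 5-word lyric windows;
      # a recognizer hypothesis is looked up, then recognition and verification scores fused.
      LYRICS = {
          "song_a": "twinkle twinkle little star how i wonder what you are".split(),
          "song_b": "row row row your boat gently down the stream".split(),
      }

      def build_grammar(lyrics, n=5):
          """Map every contiguous n-word window of each song to (song, position)."""
          grammar = {}
          for song, words in lyrics.items():
              for i in range(len(words) - n + 1):
                  grammar.setdefault(tuple(words[i:i + n]), []).append((song, i))
          return grammar

      def total_score(rec_score, verif_score, w=0.5):
          """Fuse lyrics-recognition and melody-verification scores (weight is assumed)."""
          return w * rec_score + (1.0 - w) * verif_score

      grammar = build_grammar(LYRICS)
      hypothesis = ("row", "row", "row", "your", "boat")    # output of the lyrics recognizer
      print("grammar matches:", grammar.get(hypothesis, []))
      print("total score:", total_score(rec_score=0.82, verif_score=0.91))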

  19. Music Information Retrieval from a Singing Voice Using Lyrics and Melody Information

    Directory of Open Access Journals (Sweden)

    Suzuki Motoyuki

    2007-01-01

    Full Text Available Recently, several music information retrieval (MIR) systems which retrieve musical pieces by the user's singing voice have been developed. All of these systems use only melody information for retrieval, although lyrics information is also useful for retrieval. In this paper, we propose a new MIR system that uses both lyrics and melody information. First, we propose a new lyrics recognition method. A finite state automaton (FSA) is used as recognition grammar, and about 86% retrieval accuracy was obtained. We also develop an algorithm for verifying a hypothesis output by a lyrics recognizer. Melody information is extracted from an input song using several pieces of information of the hypothesis, and a total score is calculated from the recognition score and the verification score. From the experimental results, 95.0% retrieval accuracy was obtained with a query consisting of five words.

  20. Connections between voice ergonomic risk factors in classrooms and teachers' voice production.

    Science.gov (United States)

    Rantala, Leena M; Hakala, Suvi; Holmqvist, Sofia; Sala, Eeva

    2012-01-01

    The aim of the study was to investigate whether voice ergonomic risk factors in classrooms correlated with acoustic parameters of teachers' voice production. The voice ergonomic risk factors in the fields of working culture, working postures and indoor air quality were assessed in 40 classrooms using the Voice Ergonomic Assessment in Work Environment - Handbook and Checklist. Teachers (32 females, 8 males) from the above-mentioned classrooms recorded text readings before and after a working day. Fundamental frequency, sound pressure level (SPL) and the slope of the spectrum (alpha ratio) were analyzed. The higher the number of risk factors in the classrooms, the higher the SPL the teachers used and the more strained the males' voices were (increased alpha ratio). The SPL was already higher before the working day in the teachers with higher risk than in those with lower risk. In working environments with many voice ergonomic risk factors, speakers increase voice loudness and, in the case of males, use a more strained voice quality. A practical implication of the results is that voice ergonomic assessments are needed in schools. Copyright © 2013 S. Karger AG, Basel.
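
    The acoustic measures named above can be computed with a few lines of code; the sketch below derives a relative SPL and an alpha ratio, taken here (as an assumption) as the level difference between the 1-5 kHz and 50 Hz-1 kHz bands, from a mono signal. The toy signal and the band limits are illustrative, not the study's exact analysis settings.

      import numpy as np

      def band_energy(x, fs, lo, hi):
          """Energy of the signal in a frequency band, from its magnitude spectrum."""
          spec = np.abs(np.fft.rfft(x)) ** 2
          freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
          return spec[(freqs >= lo) & (freqs < hi)].sum()

      def spl_db(x, ref=1.0):
          """Relative sound pressure level (dB re an arbitrary reference amplitude)."""
          rms = np.sqrt(np.mean(x ** 2))
          return 20.0 * np.log10(rms / ref + 1e-12)

      def alpha_ratio_db(x, fs):
          """Alpha ratio: level difference between the 1-5 kHz and 0.05-1 kHz bands."""
          high = band_energy(x, fs, 1000.0, 5000.0)
          low = band_energy(x, fs, 50.0, 1000.0)
          return 10.0 * np.log10(high / (low + 1e-12) + 1e-12)

      # Toy 'voice': a 200 Hz tone with decaying harmonics up to 3 kHz (illustrative only).
      fs = 16000
      t = np.arange(0, 1.0, 1.0 / fs)
      voice = sum((0.6 ** k) * np.sin(2 * np.pi * 200 * k * t) for k in range(1, 16))
      print(f"SPL (rel.) = {spl_db(voice):.1f} dB, alpha ratio = {alpha_ratio_db(voice, fs):.1f} dB")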

  1. [Applicability of Voice Handicap Index to the evaluation of voice therapy effectiveness in teachers].

    Science.gov (United States)

    Niebudek-Bogusz, Ewa; Kuzańska, Anna; Błoch, Piotr; Domańska, Maja; Woźnicka, Ewelina; Politański, Piotr; Sliwińska-Kowalska, Mariola

    2007-01-01

    The aim of this study was to assess the applicability of the Voice Handicap Index (VHI) to the evaluation of the effectiveness of treatment of functional voice disorders in teachers. The subjects were 45 female teachers with functional dysphonia who evaluated their voice problems according to the subjective VHI scale before and after phoniatric management. Group I (29 patients) were subjected to vocal training, whereas group II (16 patients) received only voice hygiene instructions. The results demonstrated that the differences in the mean VHI score before and after phoniatric treatment were significantly higher in group I than in group II, suggesting that the VHI is applicable to evaluating the effectiveness of voice therapy in teachers' dysphonia.

  2. Influence of classroom acoustics on the voice levels of teachers with and without voice problems: a field study

    DEFF Research Database (Denmark)

    Pelegrin Garcia, David; Lyberg-Åhlander, Viveka; Rydell, Roland

    2010-01-01

    of the classroom. The results thus suggest that teachers with voice problems are more aware of classroom acoustic conditions than their healthy colleagues and make use of the more supportive rooms to lower their voice levels. This behavior may result from an adaptation process of the teachers with voice problems...... of the voice problems was made with a questionnaire and a laryngological examination. During teaching, the sound pressure level at the teacher’s position was monitored. The teacher’s voice level and the activity noise level were separated using mixed Gaussians. In addition, objective acoustic parameters...... of Reverberation Time and Voice Support were measured in the 30 empty classrooms of the study. An empirical model shows that the measured voice levels depended on the activity noise levels and the voice support. Teachers with and without voice problems were differently affected by the voice support...
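
    The separation of the teacher's voice level from the activity noise level 'using mixed Gaussians' can be illustrated by fitting a two-component Gaussian mixture to per-frame sound pressure levels; the frame levels and the two-component assumption below are illustrative, not the study's data or exact method.

      import numpy as np
      from sklearn.mixture import GaussianMixture

      # Simulated per-frame SPL values at the teacher's position: a quieter activity-noise
      # mode and a louder voiced-speech mode (values are illustrative, in dB).
      rng = np.random.default_rng(4)
      noise_frames = rng.normal(58.0, 3.0, 600)        # classroom activity noise
      voice_frames = rng.normal(72.0, 4.0, 400)        # teacher speaking
      spl = np.concatenate([noise_frames, voice_frames]).reshape(-1, 1)

      gmm = GaussianMixture(n_components=2, random_state=0).fit(spl)
      means = gmm.means_.ravel()
      noise_level, voice_level = np.sort(means)        # lower mean = noise, higher = voice
      print(f"estimated activity noise level: {noise_level:.1f} dB")
      print(f"estimated teacher voice level:  {voice_level:.1f} dB")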

  3. VoiceThread as a Peer Review and Dissemination Tool for Undergraduate Research

    Science.gov (United States)

    Guertin, L. A.

    2012-12-01

    VoiceThread has been utilized in an undergraduate research methods course for peer review and final research project dissemination. VoiceThread (http://www.voicethread.com) can be considered a social media tool, as it is a web-based technology with the capacity to enable interactive dialogue. VoiceThread is an application that allows a user to place a media collection online containing images, audio, videos, documents, and/or presentations in an interface that facilitates asynchronous communication. Participants in a VoiceThread can be passive viewers of the online content or engaged commenters via text, audio, video, with slide annotations via a doodle tool. The VoiceThread, which runs across browsers and operating systems, can be public or private for viewing and commenting and can be embedded into any website. Although few university students are aware of the VoiceThread platform (only 10% of the students surveyed by Ng (2012)), the 2009 K-12 edition of The Horizon Report (Johnson et al., 2009) lists VoiceThread as a tool to watch because of the opportunities it provides as a collaborative learning environment. In Fall 2011, eleven students enrolled in an undergraduate research methods course at Penn State Brandywine each conducted their own small-scale research project. Upon conclusion of the projects, students were required to create a poster summarizing their work for peer review. To facilitate the peer review process outside of class, each student-created PowerPoint file was placed in a VoiceThread with private access to only the class members and instructor. Each student was assigned to peer review five different student posters (i.e., VoiceThread images) with the audio and doodle tools to comment on formatting, clarity of content, etc. After the peer reviews were complete, the students were allowed to edit their PowerPoint poster files for a new VoiceThread. In the new VoiceThread, students were required to video record themselves describing their research

  4. Former Auctioneer Finds Voice After Aphasia

    Science.gov (United States)

    Speech impairment changed his life.

  5. A model to explain human voice production

    Science.gov (United States)

    Vilas Bôas, C. S. N.; Gobara, S. T.

    2018-05-01

    This article presents a device constructed with low-cost material to demonstrate and explain voice production. It also provides a contextualized, interdisciplinary approach to introduce the study of sound waves.

  6. A lesson in listening: Is the student voice heard in the rush to ...

    African Journals Online (AJOL)

    This is encouraging, as the call to incorporate technology in teaching and learning in higher education is increasing. The student voice in the planning and implementation of blended learning strategies is, however, not adequately addressed in many of the studies to date. Objective. To utilise videos and blogging in a ...

  7. Cognitive processing load during listening is reduced more by decreasing voice similarity than by increasing spatial separation between target and masker speech

    NARCIS (Netherlands)

    Zekveld, A.A.; Rudner, M.; Kramer, S.E.; Lyzenga, J.; Ronnberg, J.

    2014-01-01

    We investigated changes in speech recognition and cognitive processing load due to the masking release attributable to decreasing similarity between target and masker speech. This was achieved by using masker voices with either the same (female) gender as the target speech or different gender (male)

  8. Control of automated system with voice commands

    OpenAIRE

    Švara, Denis

    2012-01-01

    In smart houses, contemporary achievements in the fields of automation, communications, security and artificial intelligence increase comfort and improve the quality of users' lives. For the purpose of this thesis we developed a system for managing a smart house with voice commands via a smartphone. We focused mostly on the voice commands themselves: we want to move from communication with fingers and touches to a more natural, human form of interaction - speech. We developed the entire chain of communication, by which t...

  9. Voice disorders in Nigerian primary school teachers.

    Science.gov (United States)

    Akinbode, R; Lam, K B H; Ayres, J G; Sadhra, S

    2014-07-01

    The prolonged use or abuse of voice may lead to vocal fatigue and vocal fold tissue damage. School teachers routinely use their voices intensively at work and are therefore at a higher risk of dysphonia. To determine the prevalence of voice disorders among primary school teachers in Lagos, Nigeria, and to explore associated risk factors. Teaching and non-teaching staff from 19 public and private primary schools completed a self-administered questionnaire to obtain information on personal lifestyles, work experience and environment, and voice disorder symptoms. Dysphonia was defined as the presence of at least one of the following: hoarseness, repetitive throat clearing, tired voice or straining to speak. A total of 341 teaching and 155 non-teaching staff participated. The prevalence of dysphonia in teachers was 42% compared with 18% in non-teaching staff. A significantly higher proportion of the teachers reported that voice symptoms had affected their ability to communicate effectively. School type (public/private) did not predict the presence of dysphonia. Statistically significant associations were found for regular caffeinated drink intake (odds ratio [OR] = 3.07; 95% confidence interval [CI]: 1.51-6.62), frequent upper respiratory tract infection (OR = 3.60; 95% CI: 1.39-9.33) and raised voice while teaching (OR = 10.1; 95% CI: 5.07-20.2). Nigerian primary school teachers were at risk for dysphonia. Important environment and personal factors were upper respiratory infection, the need to frequently raise the voice when teaching and regular intake of caffeinated drinks. Dysphonia was not associated with age or years of teaching. © The Author 2014. Published by Oxford University Press on behalf of the Society of Occupational Medicine. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  10. Voicing Others’ Voices: Spotlighting the Researcher as Narrator

    Directory of Open Access Journals (Sweden)

    Dan O’SULLIVAN

    2015-12-01

    Full Text Available As qualitative research undertakings are not independent of the researcher, the “indissoluble interrelationship between interpreter and interpretation” (Thomas & James, 2006, p. 782 renders it necessary for researchers to understand that their text is a representation, a version of the truth that is the product of writerly choices, and that it is discursive. Endlessly creative, artistic and political, as there is no single interpretative truth, the interpretative process facilitates the refashioning of representations, the remaking of choices and the probing of discourses. As a consequence of the particularity of any researcher’s account, issues pertaining to researcher identity and authorial stance always remain central to research endeavours (Kamler & Thomson, 2006, p. 68; Denzin & Lincoln 2011, pp. 14-15. Therefore, researchers are encouraged to be reflexive about their analyses and research accounts (Elliott, 2005, p. 152, as reflexivity helps spotlight the role of the researcher as narrator. In turn, spotlighting the researcher as narrator foregrounds a range of complex issues about voice, representation and interpretive authority (Chase, 2005, p. 657; Genishi & Glupczynski, 2006, p. 671; Eisenhart, 2006. In essence, therefore, this paper is reflective of the challenges of “doing” qualitative research in educational settings. Its particular focus-the shaping of beginning primary teachers’ identities, in Ireland, throughout the course of their initial year of occupational experience, post-graduation- endeavours to highlight issues pertaining to the researcher as narrator (O’Sullivan, 2014.

  11. Voicing others’ voices: Spotlighting the researcher as narrator

    Directory of Open Access Journals (Sweden)

    Dan O'Sullivan

    2015-09-01

    Full Text Available As qualitative research undertakings are not independent of the researcher, the “indissoluble interrelationship between interpreter and interpretation” (Thomas & James, 2006, p. 782 renders it necessary for researchers to understand that their text is a representation, a version of the truth that is the product of writerly choices, and that it is discursive. Endlessly creative, artistic and political, as there is no single interpretative truth, the interpretative process facilitates the refashioning of representations, the remaking of choices and the probing of discourses. As a consequence of the particularity of any researcher’s account, issues pertaining to researcher identity and authorial stance always remain central to research endeavours (Kamler & Thomson, 2006, p. 68; Denzin & Lincoln 2011, pp. 14-15. Therefore, researchers are encouraged to be reflexive about their analyses and research accounts (Elliott, 2005, p. 152, as reflexivity helps spotlight the role of the researcher as narrator. In turn, spotlighting the researcher as narrator foregrounds a range of complex issues about voice, representation and interpretive authority (Chase, 2005, p. 657; Genishi & Glupczynski, 2006, p. 671; Eisenhart, 2006. In essence, therefore, this paper is reflective of the challenges of “doing” qualitative research in educational settings. Its particular focus-the shaping of beginning primary teachers’ identities, in Ireland, throughout the course of their initial year of occupational experience, post-graduation- endeavours to highlight issues pertaining to the researcher as narrator (O’Sullivan, 2014.

  12. Voice pitch influences perceptions of sexual infidelity.

    Science.gov (United States)

    O'Connor, Jillian J M; Re, Daniel E; Feinberg, David R

    2011-02-28

    Sexual infidelity can be costly to members of both the extra-pair and the paired couple. Thus, detecting infidelity risk is potentially adaptive if it aids in avoiding cuckoldry or loss of parental and relationship investment. Among men, testosterone is inversely related to voice pitch, relationship and offspring investment, and is positively related to the pursuit of short-term relationships, including extra-pair sex. Among women, estrogen is positively related to voice pitch, attractiveness, and the likelihood of extra-pair involvement. Although prior work has demonstrated a positive relationship between men's testosterone levels and infidelity, this study is the first to investigate attributions of infidelity as a function of sexual dimorphism in male and female voices. We found that men attributed high infidelity risk to feminized women's voices, but not significantly more often than did women. Women attributed high infidelity risk to masculinized men's voices at significantly higher rates than did men. These data suggest that voice pitch is used as an indicator of sexual strategy in addition to underlying mate value. The aforementioned attributions may be adaptive if they prevent cuckoldry and/or loss of parental and relationship investment via avoidance of partners who may be more likely to be unfaithful.

  13. Voice Pitch Influences Perceptions of Sexual Infidelity

    Directory of Open Access Journals (Sweden)

    Jillian J.M. O'Connor

    2011-01-01

    Full Text Available Sexual infidelity can be costly to members of both the extra-pair and the paired couple. Thus, detecting infidelity risk is potentially adaptive if it aids in avoiding cuckoldry or loss of parental and relationship investment. Among men, testosterone is inversely related to voice pitch, relationship and offspring investment, and is positively related to the pursuit of short-term relationships, including extra-pair sex. Among women, estrogen is positively related to voice pitch, attractiveness, and the likelihood of extra-pair involvement. Although prior work has demonstrated a positive relationship between men's testosterone levels and infidelity, this study is the first to investigate attributions of infidelity as a function of sexual dimorphism in male and female voices. We found that men attributed high infidelity risk to feminized women's voices, but not significantly more often than did women. Women attributed high infidelity risk to masculinized men's voices at significantly higher rates than did men. These data suggest that voice pitch is used as an indicator of sexual strategy in addition to underlying mate value. The aforementioned attributions may be adaptive if they prevent cuckoldry and/or loss of parental and relationship investment via avoidance of partners who may be more likely to be unfaithful.

  14. Multivariate sensitivity to voice during auditory categorization.

    Science.gov (United States)

    Lee, Yune Sang; Peelle, Jonathan E; Kraemer, David; Lloyd, Samuel; Granger, Richard

    2015-09-01

    Past neuroimaging studies have documented discrete regions of human temporal cortex that are more strongly activated by conspecific voice sounds than by nonvoice sounds. However, the mechanisms underlying this voice sensitivity remain unclear. In the present functional MRI study, we took a novel approach to examining voice sensitivity, in which we applied a signal detection paradigm to the assessment of multivariate pattern classification among several living and nonliving categories of auditory stimuli. Within this framework, voice sensitivity can be interpreted as a distinct neural representation of brain activity that correctly distinguishes human vocalizations from other auditory object categories. Across a series of auditory categorization tests, we found that bilateral superior and middle temporal cortex consistently exhibited robust sensitivity to human vocal sounds. Although the strongest categorization was in distinguishing human voice from other categories, subsets of these regions were also able to distinguish reliably between nonhuman categories, suggesting a general role in auditory object categorization. Our findings complement the current evidence of cortical sensitivity to human vocal sounds by revealing that the greatest sensitivity during categorization tasks is devoted to distinguishing voice from nonvoice categories within human temporal cortex. Copyright © 2015 the American Physiological Society.

  15. Voice Quality in Mobile Telecommunication System

    Directory of Open Access Journals (Sweden)

    Evaldas Stankevičius

    2013-05-01

    Full Text Available The article deals with methods for measuring the quality of voice transmitted over the mobile network, together with the related problems, algorithms and options. It presents the voice quality measurement system that was created and discusses its adequacy and efficiency. The author also presents the results of applying the system under the optimal hardware configuration. Under almost ideal conditions, the system evaluates voice quality with an average MOS estimate of 3.85, while the standardized TEMS Investigation 9.0 gives an average MOS estimate of 4.05. Next, the article discusses the implementation of a voice quality predictor and investigates nonlinear and linear methods for predicting voice quality from the mobile network settings. Nonlinear prediction using an artificial neural network resulted in a correlation coefficient of 0.62, while linear prediction using least mean squares resulted in a correlation coefficient of 0.57. An analytical expression for voice quality as a function of the three network parameters BER, C/I and RSSI is given as well. Article in Lithuanian
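
    The linear predictor mentioned in this record maps three radio-link parameters (BER, C/I, RSSI) to a MOS estimate. A minimal sketch of such a least-squares fit in Python, with made-up illustrative measurements (the actual coefficients and data behind the reported correlations are not given here):

    ```python
    import numpy as np

    # Hypothetical training data: each row is (BER, C/I in dB, RSSI in dBm),
    # with a corresponding measured MOS score. Values are illustrative only.
    X = np.array([
        [0.002, 18.0, -75.0],
        [0.010, 12.0, -90.0],
        [0.001, 22.0, -65.0],
        [0.020,  9.0, -98.0],
        [0.005, 15.0, -82.0],
    ])
    mos = np.array([4.1, 3.2, 4.3, 2.6, 3.6])

    # Add an intercept column and fit MOS = a0 + a1*BER + a2*(C/I) + a3*RSSI
    # by ordinary least squares (a simple linear predictor).
    A = np.column_stack([np.ones(len(X)), X])
    coeffs, *_ = np.linalg.lstsq(A, mos, rcond=None)

    def predict_mos(ber, c_i, rssi):
        """Predict a MOS estimate from the three network parameters."""
        return coeffs @ np.array([1.0, ber, c_i, rssi])

    print("fitted coefficients:", coeffs)
    print("predicted MOS:", predict_mos(0.004, 16.0, -80.0))
    ```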

  16. Voice Use Among Music Theory Teachers: A Voice Dosimetry and Self-Assessment Study.

    Science.gov (United States)

    Schiller, Isabel S; Morsomme, Dominique; Remacle, Angélique

    2017-07-25

    This study aimed (1) to investigate music theory teachers' professional and extra-professional vocal loading and background noise exposure, (2) to determine the correlation between vocal loading and background noise, and (3) to determine the correlation between vocal loading and self-evaluation data. Using voice dosimetry, 13 music theory teachers were monitored for one workweek. The parameters analyzed were voice sound pressure level (SPL), fundamental frequency (F0), phonation time, vocal loading index (VLI), and noise SPL. Spearman correlation was used to correlate vocal loading parameters (voice SPL, F0, and phonation time) and noise SPL. Each day, the subjects self-assessed their voice using visual analog scales. VLI and self-evaluation data were correlated using Spearman correlation. Vocal loading parameters and noise SPL were significantly higher in the professional than in the extra-professional environment. Voice SPL, phonation time, and female subjects' F0 correlated positively with noise SPL. VLI correlated with self-assessed voice quality, vocal fatigue, and amount of singing and speaking voice produced. Teaching music theory is a profession with high vocal demands. More background noise is associated with increased vocal loading and may indirectly increase the risk for voice disorders. Correlations between VLI and self-assessments suggest that these teachers are well aware of their vocal demands and feel their effect on voice quality and vocal fatigue. Visual analog scales seem to represent a useful tool for subjective vocal loading assessment and associated symptoms in these professional voice users. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  17. Use of digital speech recognition in diagnostics radiology

    International Nuclear Information System (INIS)

    Arndt, H.; Stockheim, D.; Mutze, S.; Petersein, J.; Gregor, P.; Hamm, B.

    1999-01-01

    Purpose: Applicability and benefits of digital speech recognition in diagnostic radiology were tested using the speech recognition system SP 6000. Methods: The speech recognition system SP 6000 was integrated into the network of the institute and connected to the existing Radiological Information System (RIS). Three subjects used this system for writing 2305 findings from dictation. After the recognition process the date, length of dictation, time required for checking/correction, kind of examination and error rate were recorded for every dictation. With the same subjects, a comparison was performed with 625 conventionally written findings. Results: After a 1-hour initial training the average error rates were 8.4 to 13.3%. The first adaptation of the speech recognition system (after nine days) decreased the average error rates to 2.4 to 10.7% due to the ability of the program to learn. The 2nd and 3rd adaptations resulted in only small changes of the error rate. An individual comparison of the error rate development for the same kind of examination showed that the error rate is relatively independent of the individual user. Conclusion: The results show that the speech recognition system SP 6000 can be regarded as an advantageous alternative for quickly recording radiological findings. A comparison between manually typing and dictating the findings confirms individual differences in writing speed and shows the advantage of voice recognition over normal keyboard entry. (orig.) [de
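
    The error rates reported above are obtained by comparing the recognized text against a corrected reference report. A generic way to compute such a rate is a word-level edit distance; the following sketch shows a standard word error rate calculation and is not the SP 6000's own scoring procedure:

    ```python
    def word_error_rate(reference: str, hypothesis: str) -> float:
        """Word error rate = (substitutions + deletions + insertions) / reference length,
        computed with a standard Levenshtein edit distance over words."""
        ref, hyp = reference.split(), hypothesis.split()
        # dp[i][j] = edit distance between ref[:i] and hyp[:j]
        dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            dp[i][0] = i
        for j in range(len(hyp) + 1):
            dp[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                               dp[i][j - 1] + 1,        # insertion
                               dp[i - 1][j - 1] + cost) # substitution or match
        return dp[len(ref)][len(hyp)] / max(len(ref), 1)

    # Example: one deletion and one substitution in a seven-word reference finding.
    print(word_error_rate("no focal lesion in the right lung",
                          "no focal lesion in right lobe"))
    ```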

  18. Automatic determination of pathological voice transformation coefficients for TD-PSOLA using neural network

    International Nuclear Information System (INIS)

    Belgacem, H.; Cherif, A.

    2011-01-01

    One of the biggest challenges in vocal transformation with the TD-PSOLA technique is the selection of modification parameters that will produce a successful speech resynthesis. The best selection methods rely on human raters. This study focuses on the automatic determination of pathological voice transformation coefficients using an artificial neural network, comparing the results to previous manual work. Four characteristic parameters (RATA-PLP, jitter, shimmer and RAP) were chosen. The system, developed with supervised training, consists of a recognition stage (neural network) followed by a synthesis stage (TD-PSOLA). The experimental results show that the parameter sets selected by the proposed system can be successfully used for resynthesis, demonstrating that the system can assist in the transformation of pathological voices.

  19. Emotional voice processing: investigating the role of genetic variation in the serotonin transporter across development.

    Directory of Open Access Journals (Sweden)

    Tobias Grossmann

    Full Text Available The ability to effectively respond to emotional information carried in the human voice plays a pivotal role for social interactions. We examined how genetic factors, especially the serotonin transporter genetic variation (5-HTTLPR, affect the neurodynamics of emotional voice processing in infants and adults by measuring event-related brain potentials (ERPs. The results revealed that infants distinguish between emotions during an early perceptual processing stage, whereas adults recognize and evaluate the meaning of emotions during later semantic processing stages. While infants do discriminate between emotions, only in adults was genetic variation associated with neurophysiological differences in how positive and negative emotions are processed in the brain. This suggests that genetic association with neurocognitive functions emerges during development, emphasizing the role that variation in serotonin plays in the maturation of brain systems involved in emotion recognition.

  20. Face Detection and Recognition

    National Research Council Canada - National Science Library

    Jain, Anil K

    2004-01-01

    This report describes research efforts towards developing algorithms for a robust face recognition system to overcome many of the limitations found in existing two-dimensional facial recognition systems...

  1. Graphical symbol recognition

    OpenAIRE

    K.C. , Santosh; Wendling , Laurent

    2015-01-01

    International audience; The chapter focuses on one of the key issues in document image processing i.e., graphical symbol recognition. Graphical symbol recognition is a sub-field of a larger research domain: pattern recognition. The chapter covers several approaches (i.e., statistical, structural and syntactic) and specially designed symbol recognition techniques inspired by real-world industrial problems. It, in general, contains research problems, state-of-the-art methods that convey basic s...

  2. Updating signal typing in voice: addition of type 4 signals.

    Science.gov (United States)

    Sprecher, Alicia; Olszewski, Aleksandra; Jiang, Jack J; Zhang, Yu

    2010-06-01

    The addition of a fourth type of voice to Titze's voice classification scheme is proposed. This fourth voice type is characterized by primarily stochastic noise behavior and is therefore unsuitable for both perturbation and correlation dimension analysis. Forty voice samples were classified into the proposed four types using narrowband spectrograms. Acoustic, perceptual, and correlation dimension analyses were completed for all voice samples. Perturbation measures tended to increase with voice type. Based on reliability cutoffs, the type 1 and type 2 voices were considered suitable for perturbation analysis. Measures of unreliability were higher for type 3 and 4 voices. Correlation dimension analyses increased significantly with signal type as indicated by a one-way analysis of variance. Notably, correlation dimension analysis could not quantify the type 4 voices. The proposed fourth voice type represents a subset of voices dominated by noise behavior. Current measures capable of evaluating type 4 voices provide only qualitative data (spectrograms, perceptual analysis, and an infinite correlation dimension). Type 4 voices are highly complex and the development of objective measures capable of analyzing these voices remains a topic of future investigation.
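
    The perturbation measures referred to in this record are typically local jitter and shimmer, i.e. cycle-to-cycle variation of period and amplitude. A minimal sketch, assuming the pitch periods and per-cycle peak amplitudes have already been extracted from a type 1 or type 2 signal:

    ```python
    import numpy as np

    def jitter_percent(periods):
        """Mean absolute difference between consecutive pitch periods,
        relative to the mean period (local jitter, in percent)."""
        periods = np.asarray(periods, dtype=float)
        return 100.0 * np.mean(np.abs(np.diff(periods))) / np.mean(periods)

    def shimmer_percent(amplitudes):
        """Mean absolute difference between consecutive cycle peak amplitudes,
        relative to the mean amplitude (local shimmer, in percent)."""
        amplitudes = np.asarray(amplitudes, dtype=float)
        return 100.0 * np.mean(np.abs(np.diff(amplitudes))) / np.mean(amplitudes)

    # Illustrative cycle data (periods in seconds, amplitudes in arbitrary units).
    periods = [0.0050, 0.0051, 0.0049, 0.0052, 0.0050]
    amps = [0.80, 0.78, 0.82, 0.79, 0.81]
    print(f"jitter:  {jitter_percent(periods):.2f}%")
    print(f"shimmer: {shimmer_percent(amps):.2f}%")
    ```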

  3. Altered emotional recognition and expression in patients with Parkinson’s disease

    Directory of Open Access Journals (Sweden)

    Jin Y

    2017-11-01

    Full Text Available Yazhou Jin,* Zhiqi Mao,* Zhipei Ling, Xin Xu, Zhiyuan Zhang, Xinguang Yu Department of Neurosurgery, People’s Liberation Army General Hospital, Beijing, People’s Republic of China *These authors contributed equally to this work Background: Parkinson’s disease (PD patients exhibit deficits in emotional recognition and expression abilities, including emotional faces and voices. The aim of this study was to explore emotional processing in pre-deep brain stimulation (pre-DBS PD patients using two sensory modalities (visual and auditory. Methods: Fifteen PD patients who needed DBS surgery and 15 healthy, age- and gender-matched controls were recruited as participants. All participants were assessed by the Karolinska Directed Emotional Faces database 50 Faces Recognition test. Vocal recognition was evaluated by the Montreal Affective Voices database 50 Voices Recognition test. For emotional facial expression, the participants were asked to imitate five basic emotions (neutral, happiness, anger, fear, and sadness. The subjects were required to express nonverbal vocalizations of the five basic emotions. Fifteen Chinese native speakers were recruited as decoders. We recorded the accuracy of the responses, reaction time, and confidence level. Results: For emotional recognition and expression, the PD group scored lower on both facial and vocal emotional processing than did the healthy control group. There were significant differences between the two groups in both reaction time and confidence level. A significant relationship was also found between emotional recognition and emotional expression when considering all participants between the two groups together. Conclusion: The PD group exhibited poorer performance on both the recognition and expression tasks. Facial emotion deficits and vocal emotion abnormalities were associated with each other. In addition, our data allow us to speculate that emotional recognition and expression may share a common

  4. Testing and Demonstrating Speaker Verification Technology in Iraqi-Arabic as Part of the Iraqi Enrollment Via Voice Authentication Project (IEVAP) in Support of the Global War on Terrorism (GWOT)

    National Research Council Canada - National Science Library

    Withee, Jeffrey W; Pena, Edwin D

    2007-01-01

    This thesis documents the findings of an Iraqi-Arabic language test and concept of operations for speaker verification technology as part of the Iraqi Banking System in support of the Iraqi Enrollment...

  5. Enhanced Living by Assessing Voice Pathology Using a Co-Occurrence Matrix.

    Science.gov (United States)

    Muhammad, Ghulam; Alhamid, Mohammed F; Hossain, M Shamim; Almogren, Ahmad S; Vasilakos, Athanasios V

    2017-01-29

    A large number of the population around the world suffers from various disabilities. Disabilities affect not only children but also adults of different professions. Smart technology can assist the disabled population and lead to a comfortable life in an enhanced living environment (ELE). In this paper, we propose an effective voice pathology assessment system that works in a smart home framework. The proposed system takes input from various sensors, and processes the acquired voice signals and electroglottography (EGG) signals. Co-occurrence matrices in different directions and neighborhoods from the spectrograms of these signals were obtained. Several features such as energy, entropy, contrast, and homogeneity from these matrices were calculated and fed into a Gaussian mixture model-based classifier. Experiments were performed with a publicly available database, namely, the Saarbrucken voice database. The results demonstrate the feasibility of the proposed system in light of its high accuracy and speed. The proposed system can be extended to assess other disabilities in an ELE.
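
    As a rough illustration of the pipeline described above (co-occurrence matrices computed from spectrograms, texture features, and a Gaussian mixture model classifier), the sketch below uses scikit-image and scikit-learn. The quantization, feature set, model settings and random stand-in data are assumptions, not the paper's actual configuration:

    ```python
    import numpy as np
    from skimage.feature import graycomatrix, graycoprops
    from sklearn.mixture import GaussianMixture

    def cooccurrence_features(spectrogram_db, levels=32):
        """Quantize a (dB-scaled) spectrogram to a few gray levels and extract
        energy, contrast, homogeneity and entropy from co-occurrence matrices
        computed in several directions."""
        s = spectrogram_db - spectrogram_db.min()
        img = np.uint8(np.clip(s / (s.max() + 1e-9) * (levels - 1), 0, levels - 1))
        glcm = graycomatrix(img, distances=[1],
                            angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                            levels=levels, symmetric=True, normed=True)
        feats = [graycoprops(glcm, p).mean() for p in ("energy", "contrast", "homogeneity")]
        p = glcm.mean(axis=(2, 3))                    # average matrix over directions
        entropy = -np.sum(p * np.log2(p + 1e-12))
        return np.array(feats + [entropy])

    # One GMM per class (healthy / pathological); classify by higher likelihood.
    # X_healthy / X_path would be stacks of feature vectors from labelled
    # recordings (e.g. a voice database); random data stands in here.
    rng = np.random.default_rng(0)
    X_healthy = rng.normal(0.0, 1.0, size=(50, 4))
    X_path = rng.normal(1.0, 1.0, size=(50, 4))
    gmm_h = GaussianMixture(n_components=2, random_state=0).fit(X_healthy)
    gmm_p = GaussianMixture(n_components=2, random_state=0).fit(X_path)

    def classify(feature_vector):
        x = feature_vector.reshape(1, -1)
        return "pathological" if gmm_p.score(x) > gmm_h.score(x) else "healthy"

    print(classify(rng.normal(1.0, 1.0, size=4)))
    ```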

  6. Value driven innovation in medical device design: a process for balancing stakeholder voices.

    Science.gov (United States)

    de Ana, F J; Umstead, K A; Phillips, G J; Conner, C P

    2013-09-01

    The innovation process has often been represented as a linear process which funnels customer needs through various business and process filters. This method may be appropriate for some consumer products, but in the medical device industry there are some inherent limitations to the traditional innovation funnel approach. In the medical device industry, there are a number of stakeholders who need to have their voices heard throughout the innovation process. Each stakeholder has diverse and unique needs relating to the medical device, the needs of one may highly affect the needs of another, and the relationships between stakeholders may be tenuous. This paper describes the application of a spiral innovation process to the development of a medical device which considers three distinct stakeholder voices: the Voice of the Customer, the Voice of the Business and the Voice of the Technology. The process is presented as a case study focusing on the front-end redesign of a class III medical device for an orthopedics company. Starting from project initiation and scope alignment, the process describes four phases, Discover, Envision, Create, and Refine, and concludes with value assessment of the final design features.

  7. A long distance voice transmission system based on the white light LED

    Science.gov (United States)

    Tian, Chunyu; Wei, Chang; Wang, Yulian; Wang, Dachi; Yu, Benli; Xu, Feng

    2017-10-01

    A long distance voice transmission system based on visible light communication technology (VLCT) is proposed in this paper. The proposed system comprises the transmitter, the receiver and voice signal processing on a single-chip microcomputer. In the compact-sized LED transmitter, on-off keying with non-return-to-zero coding (OOK-NRZ) is used to realize high-speed modulation easily, which reduces system complexity. A voice transmission system with low noise and a wide modulation band is achieved by designing a high-efficiency receiving optical path and using filters to reduce noise from the surrounding light. To improve the speed of signal processing, a single-chip microcomputer is used to code and decode the voice signal. Furthermore, a serial peripheral interface (SPI) is adopted to transmit the voice signal data accurately. The test results of the proposed system show that the transmission distance is more than 100 meters, with a maximum data rate of 1.5 Mbit/s and an SNR of 30 dB. The system has many advantages, such as simple construction, low cost and strong practicality. Therefore, it has extensive application prospects in fields such as emergency communication and indoor wireless communication.
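
    The OOK-NRZ modulation mentioned above simply keys the LED on or off for the duration of each bit. A toy sketch of how digitized voice samples could be serialized into such a waveform and recovered again, ignoring the analog front end, the SPI framing and synchronization:

    ```python
    import numpy as np

    def ook_nrz_waveform(samples_8bit, samples_per_bit=4):
        """Serialize 8-bit audio samples MSB-first and map each bit to an
        on/off (1/0) LED level held for `samples_per_bit` clock ticks (NRZ)."""
        bits = np.unpackbits(np.asarray(samples_8bit, dtype=np.uint8))
        return np.repeat(bits, samples_per_bit)

    def ook_nrz_demodulate(waveform, samples_per_bit=4):
        """Recover bytes by sampling the middle of each bit period."""
        bits = waveform[samples_per_bit // 2::samples_per_bit]
        return np.packbits(bits.astype(np.uint8))

    # Round-trip a few toy "voice" samples through the modulator.
    voice = np.array([12, 200, 127, 3], dtype=np.uint8)
    wave = ook_nrz_waveform(voice)
    print(ook_nrz_demodulate(wave))   # -> [ 12 200 127   3]
    ```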

  8. Diagnostic value of voice acoustic analysis in assessment of occupational voice pathologies in teachers.

    Science.gov (United States)

    Niebudek-Bogusz, Ewa; Fiszer, Marta; Kotylo, Piotr; Sliwinska-Kowalska, Mariola

    2006-01-01

    It has been shown that teachers are at risk of developing occupational dysphonia, which accounts for over 25% of all occupational diseases diagnosed in Poland. The most frequently used method of diagnosing voice diseases is videostroboscopy. However, to facilitate objective evaluation of voice efficiency as well as medical certification of occupational voice disorders, it is crucial to implement quantitative methods of voice assessment, particularly voice acoustic analysis. The aim of the study was to assess the results of acoustic analysis in 66 female teachers (aged 40-64 years), including 35 subjects with occupational voice pathologies (e.g., vocal nodules) and 31 subjects with functional dysphonia. The acoustic analysis was performed using the IRIS software, before and after a 30-minute vocal loading test. All participants were subjected also to laryngological and videostroboscopic examinations. After the vocal effort, the acoustic parameters displayed statistically significant abnormalities, mostly lowered fundamental frequency (Fo) and incorrect values of shimmer and noise to harmonic ratio. To conclude, quantitative voice acoustic analysis using the IRIS software seems to be an effective complement to voice examinations, which is particularly helpful in diagnosing occupational dysphonia.

  9. Analysis of failure of voice production by a sound-producing voice prosthesis

    NARCIS (Netherlands)

    van der Torn, M.; van Gogh, C.D.L.; Verdonck-de Leeuw, I M; Festen, J.M.; Mahieu, H.F.

    OBJECTIVE: To analyse the cause of failing voice production by a sound-producing voice prosthesis (SPVP). METHODS: The functioning of a prototype SPVP is described in a female laryngectomee before and after its sound-producing mechanism was impeded by tracheal phlegm. This assessment included:

  10. Interactive Augmentation of Voice Quality and Reduction of Breath Airflow in the Soprano Voice.

    Science.gov (United States)

    Rothenberg, Martin; Schutte, Harm K

    2016-11-01

    In 1985, at a conference sponsored by the National Institutes of Health, Martin Rothenberg first described a form of nonlinear source-tract acoustic interaction mechanism by which some sopranos, singing in their high range, can use to reduce the total airflow, to allow holding the note longer, and simultaneously enrich the quality of the voice, without straining the voice. (M. Rothenberg, "Source-Tract Acoustic Interaction in the Soprano Voice and Implications for Vocal Efficiency," Fourth International Conference on Vocal Fold Physiology, New Haven, Connecticut, June 3-6, 1985.) In this paper, we describe additional evidence for this type of nonlinear source-tract interaction in some soprano singing and describe an analogous interaction phenomenon in communication engineering. We also present some implications for voice research and pedagogy. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  11. Silent Speech Recognition as an Alternative Communication Device for Persons with Laryngectomy.

    Science.gov (United States)

    Meltzner, Geoffrey S; Heaton, James T; Deng, Yunbin; De Luca, Gianluca; Roy, Serge H; Kline, Joshua C

    2017-12-01

    Each year thousands of individuals require surgical removal of their larynx (voice box) due to trauma or disease, and thereby require an alternative voice source or assistive device to verbally communicate. Although natural voice is lost after laryngectomy, most muscles controlling speech articulation remain intact. Surface electromyographic (sEMG) activity of speech musculature can be recorded from the neck and face, and used for automatic speech recognition to provide speech-to-text or synthesized speech as an alternative means of communication. This is true even when speech is mouthed or spoken in a silent (subvocal) manner, making it an appropriate communication platform after laryngectomy. In this study, 8 individuals at least 6 months after total laryngectomy were recorded using 8 sEMG sensors on their face (4) and neck (4) while reading phrases constructed from a 2,500-word vocabulary. A unique set of phrases were used for training phoneme-based recognition models for each of the 39 commonly used phonemes in English, and the remaining phrases were used for testing word recognition of the models based on phoneme identification from running speech. Word error rates were on average 10.3% for the full 8-sensor set (averaging 9.5% for the top 4 participants), and 13.6% when reducing the sensor set to 4 locations per individual (n=7). This study provides a compelling proof-of-concept for sEMG-based alaryngeal speech recognition, with the strong potential to further improve recognition performance.

  12. FPGA-Based Implementation of Lithuanian Isolated Word Recognition Algorithm

    Directory of Open Access Journals (Sweden)

    Tomyslav Sledevič

    2013-05-01

    Full Text Available The paper describes the FPGA-based implementation of a Lithuanian isolated word recognition algorithm. An FPGA is selected for parallel process implementation using VHDL to ensure fast signal processing at a low-rate clock signal. Cepstrum analysis was applied to feature extraction from the voice signal. The dynamic time warping algorithm was used to compare the vectors of cepstrum coefficients. A library of features for 100 words was created and stored in the internal FPGA BRAM memory. Experimental testing with speaker-dependent records demonstrated a recognition rate of 94%. A recognition rate of 58% was achieved for speaker-independent records. Calculation of the cepstrum coefficients lasted 8.52 ms at a 50 MHz clock, while 100 DTWs took 66.56 ms at a 25 MHz clock. Article in Lithuanian
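
    A software sketch of the same two processing steps (frame-wise cepstrum features followed by a dynamic time warping comparison against a word library) is shown below; the frame size, coefficient count and example word entries are assumptions, random vectors stand in for real recordings, and the FPGA fixed-point details are not reproduced:

    ```python
    import numpy as np

    def cepstral_features(signal, frame_len=256, hop=128, n_coeffs=12):
        """Frame the signal and compute real cepstrum coefficients per frame."""
        frames = []
        for start in range(0, len(signal) - frame_len + 1, hop):
            frame = signal[start:start + frame_len] * np.hamming(frame_len)
            spectrum = np.abs(np.fft.rfft(frame)) + 1e-10
            cepstrum = np.fft.irfft(np.log(spectrum))
            frames.append(cepstrum[1:n_coeffs + 1])   # drop the 0th (energy) bin
        return np.array(frames)

    def dtw_distance(a, b):
        """Dynamic time warping distance between two feature-vector sequences."""
        n, m = len(a), len(b)
        d = np.full((n + 1, m + 1), np.inf)
        d[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = np.linalg.norm(a[i - 1] - b[j - 1])
                d[i, j] = cost + min(d[i - 1, j], d[i, j - 1], d[i - 1, j - 1])
        return d[n, m] / (n + m)

    # Recognition = nearest template in a word library by DTW distance.
    # Random vectors stand in for real recordings of each library word.
    rng = np.random.default_rng(1)
    library = {"labas": cepstral_features(rng.normal(size=8000)),
               "ačiū": cepstral_features(rng.normal(size=8000))}
    query = cepstral_features(rng.normal(size=8000))
    print(min(library, key=lambda w: dtw_distance(query, library[w])))
    ```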

  13. A system of automatic speaker recognition on a minicomputer

    International Nuclear Information System (INIS)

    El Chafei, Cherif

    1978-01-01

    This study describes a system of automatic speaker recognition using the pitch of the voice. The pre-treatment consists in extracting the speakers' discriminating characteristics from the pitch. The recognition programme first performs a preselection and then calculates the distance between the characteristics of the speaker to be recognized and those of the speakers already recorded. A recognition experiment was carried out with 15 speakers and included 566 tests spread over an intermittent period of four months. The discriminating characteristics used offer several interesting qualities. The algorithms concerning the measurement of the characteristics on one hand, and the classification of the speakers on the other, are simple. The results obtained in real time on a minicomputer are satisfactory. Furthermore, they could probably be improved by considering other discriminating characteristics of the speaker, but this was unfortunately not within our possibilities. (author) [fr

  14. Comparison of Forced-Alignment Speech Recognition and Humans for Generating Reference VAD

    DEFF Research Database (Denmark)

    Kraljevski, Ivan; Tan, Zheng-Hua; Paola Bissiri, Maria

    2015-01-01

    This paper aims to answer the question whether forced-alignment speech recognition can be used as an alternative to humans in generating reference Voice Activity Detection (VAD) transcriptions. An investigation of the level of agreement between automatic/manual VAD transcriptions and the reference ones produced by a human expert was carried out. Thereafter, statistical analysis was employed on the automatically produced and the collected manual transcriptions. Experimental results confirmed that forced-alignment speech recognition can provide accurate and consistent VAD labels.
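
    Frame-level agreement between automatic and manual VAD labels can be quantified, for example, by raw percentage agreement and Cohen's kappa; the sketch below illustrates that idea and is not necessarily the statistical analysis used in the paper:

    ```python
    import numpy as np

    def vad_agreement(auto_labels, manual_labels):
        """Frame-level agreement between two boolean VAD label sequences:
        raw percentage agreement and Cohen's kappa."""
        a = np.asarray(auto_labels, dtype=bool)
        m = np.asarray(manual_labels, dtype=bool)
        p_obs = np.mean(a == m)
        # Chance agreement from the marginal speech/non-speech proportions.
        p_chance = a.mean() * m.mean() + (1 - a.mean()) * (1 - m.mean())
        kappa = (p_obs - p_chance) / (1 - p_chance + 1e-12)
        return p_obs, kappa

    auto = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0]
    manual = [1, 1, 0, 0, 0, 1, 1, 1, 0, 0]
    agreement, kappa = vad_agreement(auto, manual)
    print(f"agreement: {agreement:.2f}, kappa: {kappa:.2f}")
    ```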

  15. Recognition and Toleration

    DEFF Research Database (Denmark)

    Lægaard, Sune

    2010-01-01

    Recognition and toleration are ways of relating to the diversity characteristic of multicultural societies. The article concerns the possible meanings of toleration and recognition, and the conflict that is often claimed to exist between these two approaches to diversity. Different forms or interpretations of recognition and toleration are considered, confusing and problematic uses of the terms are noted, and the compatibility of toleration and recognition is discussed. The article argues that there is a range of legitimate and importantly different conceptions of both toleration and recognition...

  16. On-device mobile speech recognition

    OpenAIRE

    Mustafa, MK

    2016-01-01

    Despite many years of research, Speech Recognition remains an active area of research in Artificial Intelligence. Currently, the most common commercial application of this technology on mobile devices uses a wireless client – server approach to meet the computational and memory demands of the speech recognition process. Unfortunately, such an approach is unlikely to remain viable when fully applied over the approximately 7.22 Billion mobile phones currently in circulation. In this thesis we p...

  17. Voice and Narrative in L1 Writing

    DEFF Research Database (Denmark)

    Krogh, Ellen; Piekut, Anke

    2015-01-01

    This paper investigates issues of voice and narrative in L1 writing. Three branches of research are initially discussed: research on narratives as resources for identity work, research on writer identity and voice as an essential aspect of identity, and research on Bildung in L1 writing. Subsequ... ... training of voice and narratives as a resource for academic writing, and that the Bildung potential of L1 writing may be tied to this issue. ... in lower secondary L1, she found that her previous writing strategies were not rewarded in upper secondary school. In the second empirical study, two upper-secondary exam papers are investigated, with a focus on their approaches to exam genres and their use of narrative resources to address issues...

  18. 8 CFR 1292.2 - Organizations qualified for recognition; requests for recognition; withdrawal of recognition...

    Science.gov (United States)

    2010-01-01

    ...; requests for recognition; withdrawal of recognition; accreditation of representatives; roster. (a) Qualifications of organizations. A non-profit religious, charitable, social service, or similar organization...

  19. Adsorption characteristics, recognition properties, and preliminary application of nordihydroguaiaretic acid molecularly imprinted polymers prepared by sol-gel surface imprinting technology

    Science.gov (United States)

    Liao, Sen; Zhang, Wen; Long, Wei; Hou, Dan; Yang, Xuechun; Tan, Ni

    2016-02-01

    In this paper, a new core-shell composite of nordihydroguaiaretic acid (NDGA) molecularly imprinted polymers layer-coated silica gel (MIP@SiO2) was prepared through sol-gel technique and applied as a material for extraction of NDGA from Ephedra. It was synthesized using NDGA as the template molecule, γ-aminopropyltriethoxysilane (APTS) and methyltriethoxysilane (MTEOS) as the functional monomers, tetraethyl orthosilicate (TEOS) as the cross-linker and ethanol as the porogenic solvent in the surface of silica. The non-imprinted polymers layer-coated silica gel (NIP@SiO2) were prepared with the same procedure, but with the absence of template molecule. In addition, the optimum adsorption affinity occurred when the molar ratio of NDGA:APTS:MTEOS:TEOS was 1:6:2:80. The prepared MIP@SiO2 and NIP@SiO2 were analyzed by scanning electron microscopy (SEM), thermogravimetric analysis (TGA), and Fourier transform-infrared spectroscopy (FT-IR). Their affinity properties to NDGA were evaluated through dynamic adsorption, static adsorption, and selective recognition experiments, and the results showed the saturated adsorption capacity of MIP@SiO2 could reach to 5.90 mg g-1, which was two times more than that of NIP@SiO2. High performance liquid chromatography (HPLC) was used to evaluate the extraction of NDGA from the medicinal plant ephedra by the above prepared materials, and the results indicated that the MIP@SiO2 had potential application in separation of the natural active component NDGA from medicinal plants.

  20. The Voice Transcription Technique: Use of Voice Recognition Software to Transcribe Digital Interview Data in Qualitative Research

    Science.gov (United States)

    Matheson, Jennifer L.

    2007-01-01

    Transcribing interview data is a time-consuming task that most qualitative researchers dislike. Transcribing is even more difficult for people with physical limitations because traditional transcribing requires manual dexterity and the ability to sit at a computer for long stretches of time. Researchers have begun to explore using an automated…

  1. Hemispheric association and dissociation of voice and speech information processing in stroke.

    Science.gov (United States)

    Jones, Anna B; Farrall, Andrew J; Belin, Pascal; Pernet, Cyril R

    2015-10-01

    As we listen to someone speaking, we extract both linguistic and non-linguistic information. Knowing how these two sets of information are processed in the brain is fundamental for the general understanding of social communication, speech recognition and the therapy of language impairments. We investigated the pattern of performance in phoneme versus gender categorization in left and right hemisphere stroke patients, and found an anatomo-functional dissociation in the right frontal cortex, establishing a new syndrome in voice discrimination abilities. In addition, phoneme and gender performances were more often associated than dissociated in the left hemisphere patients, suggesting common neural underpinnings. Copyright © 2015 Elsevier Ltd. All rights reserved.

  2. Teachers’ voice use in teaching environment. Aspects on speakers’ comfort

    DEFF Research Database (Denmark)

    Lyberg-Åhlander, Viveka; Rydell, Roland; Löfqvist, Anders

    2015-01-01

    use and prevalence of voice problems in teachers and to explore their ratings of vocally loading aspects of their working environment. Method: A questionnaire survey of 467 teachers, aiming to explore the prevalence of voice problems in teaching staff, identified teachers with voice problems and vocally... in the teaching environment, and aspects of the classroom environment were also measured. Results: Teachers with voice problems were more affected by every loading factor in the work environment and were more perceptive of the room acoustics. Differences between the groups were found during field measurements of the voice, while there were no differences in the findings from the clinical examinations of the larynx and voice. Conclusion: Teachers suffering from voice problems react more strongly to loading factors in the teaching environment. It is in the interplay between the individual and the work environment that voice...

  3. Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

    International Nuclear Information System (INIS)

    Holzrichter, J.F.; Ng, L.C.

    1998-01-01

    The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs

  4. Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

    Science.gov (United States)

    Holzrichter, John F.; Ng, Lawrence C.

    1998-01-01

    The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching.

  5. Optical Pattern Recognition

    Science.gov (United States)

    Yu, Francis T. S.; Jutamulia, Suganda

    2008-10-01

    Contributors; Preface; 1. Pattern recognition with optics Francis T. S. Yu and Don A. Gregory; 2. Hybrid neural networks for nonlinear pattern recognition Taiwei Lu; 3. Wavelets, optics, and pattern recognition Yao Li and Yunglong Sheng; 4. Applications of the fractional Fourier transform to optical pattern recognition David Mendlovic, Zeev Zalesky and Haldum M. Oxaktas; 5. Optical implementation of mathematical morphology Tien-Hsin Chao; 6. Nonlinear optical correlators with improved discrimination capability for object location and recognition Leonid P. Yaroslavsky; 7. Distortion-invariant quadratic filters Gregory Gheen; 8. Composite filter synthesis as applied to pattern recognition Shizhou Yin and Guowen Lu; 9. Iterative procedures in electro-optical pattern recognition Joseph Shamir; 10. Optoelectronic hybrid system for three-dimensional object pattern recognition Guoguang Mu, Mingzhe Lu and Ying Sun; 11. Applications of photorefractive devices in optical pattern recognition Ziangyang Yang; 12. Optical pattern recognition with microlasers Eung-Gi Paek; 13. Optical properties and applications of bacteriorhodopsin Q. Wang Song and Yu-He Zhang; 14. Liquid-crystal spatial light modulators Aris Tanone and Suganda Jutamulia; 15. Representations of fully complex functions on real-time spatial light modulators Robert W. Cohn and Laurence G. Hassbrook; Index.

  6. ACOUSTIC SPEECH RECOGNITION FOR MARATHI LANGUAGE USING SPHINX

    Directory of Open Access Journals (Sweden)

    Aman Ankit

    2016-09-01

    Full Text Available Speech recognition, or speech-to-text processing, is the process of recognizing human speech by a computer and converting it into text. In speech recognition, transcripts are created from recordings of speech audio together with their text transcriptions. Speech-based applications which include Natural Language Processing (NLP) techniques are popular and an active area of research. Input to such applications is in natural language and output is obtained in natural language. Speech recognition mostly revolves around three approaches, namely the acoustic-phonetic approach, the pattern recognition approach and the artificial intelligence approach. Creation of an acoustic model requires a large database of speech and training algorithms. The output of an ASR system is the recognition and translation of spoken language into text by computers and computerized devices. ASR today finds enormous application in tasks that require human-machine interfaces, such as voice dialing. Our key contribution in this paper is to create corpora for the Marathi language and explore the use of the Sphinx engine for automatic speech recognition.
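
    Training a Marathi acoustic model requires a corpus and the SphinxTrain tooling, which is beyond the scope of a short example. The sketch below only illustrates running recognition with CMU Sphinx through the Python speech_recognition wrapper, assuming an already trained acoustic model, language model and dictionary at the (hypothetical) paths shown:

    ```python
    import speech_recognition as sr

    recognizer = sr.Recognizer()

    # Hypothetical paths to a trained Marathi model: acoustic model directory,
    # language model and pronunciation dictionary (e.g. produced with SphinxTrain).
    marathi_model = ("models/mr-in/acoustic",
                     "models/mr-in/marathi.lm.bin",
                     "models/mr-in/marathi.dict")

    with sr.AudioFile("utterance.wav") as source:      # 16 kHz mono WAV assumed
        audio = recognizer.record(source)

    try:
        text = recognizer.recognize_sphinx(audio, language=marathi_model)
        print("Recognized:", text)
    except sr.UnknownValueError:
        print("Sphinx could not understand the audio")
    ```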

  7. Voice and Video Telephony Services in Smartphone

    Directory of Open Access Journals (Sweden)

    2006-01-01

    Full Text Available Multimedia telephony is a delay-sensitive application. Packet losses, relatively less critical than delay, are allowed up to a certain threshold. Together these represent the QoS constraints that have to be respected to guarantee the operation of the telephony service and user satisfaction. In this work we introduce a new smartphone architecture characterized by two processing levels, called the application processor (AP) and the mobile termination (MT), which communicate through a serial channel. Moreover, we focus our attention on two very important UMTS services: voice and video telephony. Through a simulation study, the impact of voice and video telephony on the considered structure is evaluated using the protocols currently known for realizing voice and video telephony.

  8. Voice-activated intelligent radiologic image display

    International Nuclear Information System (INIS)

    Fisher, P.

    1989-01-01

    The authors present a computer-based expert system called Mammo-Icon, which automatically assists the radiologist's case analysis by reviewing the trigger-phrase output of a commercially available voice transcription system in the domain of mammography. A commercially available PC-based voice dictation system is coupled to an expert system implemented on a microcomputer. The software employs the LISP and C computer languages. Mammo-Icon responds to the trigger-phrase output of the voice dictation system with a textual discussion of the potential significance of the findings that have been described and a display of reference images that may help the radiologist confirm a suspected diagnosis or consider additional diagnoses. This results in the automatic availability of potentially useful computer-based expert advice, making such systems much more likely to be used in routine clinical practice.

  9. Effects of Voice on Emotional Arousal

    Directory of Open Access Journals (Sweden)

    Psyche Loui

    2013-10-01

    Full Text Available Music is a powerful medium capable of eliciting a broad range of emotions. Although the relationship between language and music is well documented, relatively little is known about the effects of lyrics and the voice on the emotional processing of music and on listeners’ preferences. In the present study, we investigated the effects of vocals in music on participants’ perceived valence and arousal in songs. Participants (N = 50 made valence and arousal ratings for familiar songs that were presented with and without the voice. We observed robust effects of vocal content on perceived arousal. Furthermore, we found that the effect of the voice on enhancing arousal ratings is independent of familiarity of the song and differs across genders and age: females were more influenced by vocals than males; furthermore these gender effects were enhanced among older adults. Results highlight the effects of gender and aging in emotion perception and are discussed in terms of the social roles of music.

  10. Measurement of Voice Onset Time in Maxillectomy Patients

    OpenAIRE

    Hattori, Mariko; Sumita, Yuka I.; Taniguchi, Hisashi

    2014-01-01

    Objective speech evaluation using acoustic measurement is needed for the proper rehabilitation of maxillectomy patients. For digital evaluation of consonants, measurement of voice onset time is one option. However, voice onset time has not been measured in maxillectomy patients as their consonant sound spectra exhibit unique characteristics that make the measurement of voice onset time challenging. In this study, we established criteria for measuring voice onset time in maxillectomy patients ...
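
    Voice onset time is the interval between the stop burst and the onset of voicing. The following simplified sketch estimates both landmarks from short-time energy and an autocorrelation periodicity check; it is illustrative only and does not implement the measurement criteria established in the study:

    ```python
    import numpy as np

    def voice_onset_time(signal, fs, frame_ms=25.0):
        """Rough VOT estimate: time from the stop burst (largest jump in
        short-time energy) to voicing onset (first clearly periodic frame).
        The 25 ms frame makes the estimate coarse; for illustration only."""
        frame = int(fs * frame_ms / 1000)
        n_frames = len(signal) // frame
        energy = np.array([np.sum(signal[i*frame:(i+1)*frame] ** 2)
                           for i in range(n_frames)])
        burst = int(np.argmax(np.diff(energy))) + 1            # frame after the largest energy rise

        for i in range(burst + 1, n_frames):
            x = signal[i*frame:(i+1)*frame]
            x = x - x.mean()
            ac = np.correlate(x, x, mode="full")[len(x) - 1:]  # autocorrelation, lags >= 0
            lo, hi = int(fs / 400), int(fs / 75)                # plausible F0 range 75-400 Hz
            if ac[0] > 0 and np.max(ac[lo:hi]) / ac[0] > 0.5:   # strong periodicity -> voicing
                return (i - burst) * frame / fs
        return None

    # Synthetic example: a 5 ms noise burst at 50 ms, voicing (120 Hz) from 120 ms.
    fs = 16000
    rng = np.random.default_rng(0)
    t = np.arange(int(0.2 * fs)) / fs
    sig = np.zeros_like(t)
    sig[int(0.05 * fs):int(0.055 * fs)] = rng.normal(size=int(0.005 * fs))
    sig[int(0.12 * fs):] = 0.5 * np.sin(2 * np.pi * 120 * t[int(0.12 * fs):])
    print("estimated VOT (s):", voice_onset_time(sig, fs))
    ```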

  11. Influence of Smartphones and Software on Acoustic Voice Measures.

    OpenAIRE

    Elizabeth U. Grillo; Jenna N. Brosious; Staci L. Sorrell; Supraja Anand

    2016-01-01

    This study assessed the within-subject variability of voice measures captured using different recording devices (i.e., smartphones and head mounted microphone) and software programs (i.e., Analysis of Dysphonia in Speech and Voice (ADSV), Multi-dimensional Voice Program (MDVP), and Praat).  Correlations between the software programs that calculated the voice measures were also analyzed.  Results demonstrated no significant within-subject variability across devices and software and that some o...

  12. Adsorption characteristics, recognition properties, and preliminary application of nordihydroguaiaretic acid molecularly imprinted polymers prepared by sol–gel surface imprinting technology

    Energy Technology Data Exchange (ETDEWEB)

    Liao, Sen; Zhang, Wen; Long, Wei; Hou, Dan; Yang, Xuechun; Tan, Ni, E-mail: tannii@21cn.com

    2016-02-28

    Graphical abstract: - Highlights: • Nordihydroguaiaretic acid imprinted polymer with imprinting factor 2.12 was prepared for the first time through hydrogen bonding and hydrophobic interaction between the template molecules and the bifunctional monomers. • The obtained surface molecularly imprinting polymers exhibited high affinity and selectivity to the template molecules. • The prepared surface molecularly imprinted polymers were used in separation the natural active component nordihydroguaiaretic acid from medicinal plants. - Abstract: In this paper, a new core-shell composite of nordihydroguaiaretic acid (NDGA) molecularly imprinted polymers layer-coated silica gel (MIP@SiO2) was prepared through sol–gel technique and applied as a material for extraction of NDGA from Ephedra. It was synthesized using NDGA as the template molecule, γ-aminopropyltriethoxysilane (APTS) and methyltriethoxysilane (MTEOS) as the functional monomers, tetraethyl orthosilicate (TEOS) as the cross-linker and ethanol as the porogenic solvent in the surface of silica. The non-imprinted polymers layer-coated silica gel (NIP@SiO2) were prepared with the same procedure, but with the absence of template molecule. In addition, the optimum adsorption affinity occurred when the molar ratio of NDGA:APTS:MTEOS:TEOS was 1:6:2:80. The prepared MIP@SiO2 and NIP@SiO2 were analyzed by scanning electron microscopy (SEM), thermogravimetric analysis (TGA), and Fourier transform-infrared spectroscopy (FT-IR). Their affinity properties to NDGA were evaluated through dynamic adsorption, static adsorption, and selective recognition experiments, and the results showed the saturated adsorption capacity of MIP@SiO2 could reach to 5.90 mg g−1, which was two times more than that of NIP@SiO2. High performance liquid chromatography (HPLC) was used to evaluate the extraction of NDGA from the medicinal plant ephedra by the above prepared materials, and the results

  13. Adsorption characteristics, recognition properties, and preliminary application of nordihydroguaiaretic acid molecularly imprinted polymers prepared by sol–gel surface imprinting technology

    International Nuclear Information System (INIS)

    Liao, Sen; Zhang, Wen; Long, Wei; Hou, Dan; Yang, Xuechun; Tan, Ni

    2016-01-01

    Graphical abstract: - Highlights: • Nordihydroguaiaretic acid imprinted polymer with imprinting factor 2.12 was prepared for the first time through hydrogen bonding and hydrophobic interaction between the template molecules and the bifunctional monomers. • The obtained surface molecularly imprinting polymers exhibited high affinity and selectivity to the template molecules. • The prepared surface molecularly imprinted polymers were used in separation the natural active component nordihydroguaiaretic acid from medicinal plants. - Abstract: In this paper, a new core-shell composite of nordihydroguaiaretic acid (NDGA) molecularly imprinted polymers layer-coated silica gel (MIP@SiO2) was prepared through sol–gel technique and applied as a material for extraction of NDGA from Ephedra. It was synthesized using NDGA as the template molecule, γ-aminopropyltriethoxysilane (APTS) and methyltriethoxysilane (MTEOS) as the functional monomers, tetraethyl orthosilicate (TEOS) as the cross-linker and ethanol as the porogenic solvent in the surface of silica. The non-imprinted polymers layer-coated silica gel (NIP@SiO2) were prepared with the same procedure, but with the absence of template molecule. In addition, the optimum adsorption affinity occurred when the molar ratio of NDGA:APTS:MTEOS:TEOS was 1:6:2:80. The prepared MIP@SiO2 and NIP@SiO2 were analyzed by scanning electron microscopy (SEM), thermogravimetric analysis (TGA), and Fourier transform-infrared spectroscopy (FT-IR). Their affinity properties to NDGA were evaluated through dynamic adsorption, static adsorption, and selective recognition experiments, and the results showed the saturated adsorption capacity of MIP@SiO2 could reach to 5.90 mg g−1, which was two times more than that of NIP@SiO2. High performance liquid chromatography (HPLC) was used to evaluate the extraction of NDGA from the medicinal plant ephedra by the above prepared materials, and the results indicated that the MIP@SiO2 had

  14. Method for Improving EEG Based Emotion Recognition by Combining It with Synchronized Biometric and Eye Tracking Technologies in a Non-invasive and Low Cost Way.

    Science.gov (United States)

    López-Gil, Juan-Miguel; Virgili-Gomá, Jordi; Gil, Rosa; García, Roberto

    2016-01-01

    Technical advances, particularly the integration of wearable and embedded sensors, facilitate tracking of physiological responses in a less intrusive way. Currently, there are many devices that allow gathering biometric measurements from human beings, such as EEG headsets or health bracelets. The massive data sets generated by tracking of EEG and physiology may be used, among other things, to infer knowledge about human moods and emotions. Apart from direct biometric signal measurement, eye tracking systems are nowadays capable of determining the point of gaze of users interacting in ICT environments, which provides added value for research in many different areas, such as psychology or marketing. We present a process in which devices for eye tracking, biometric, and EEG signal measurements are used synchronously to study both basic and complex emotions. We selected the least intrusive devices for collecting the different signals, given the study requirements and cost constraints, so that users would behave in the most natural way possible. On the one hand, we have been able to determine the basic emotions participants were experiencing by means of valence and arousal. On the other hand, a complex emotion such as empathy has also been detected. To validate the usefulness of this approach, a study involving forty-four people was carried out, in which they were exposed to a series of affective stimuli while their EEG activity, biometric signals, and eye position were synchronously recorded to detect self-regulation. The hypothesis of the work was that people who self-regulated would show significantly different results when analyzing their EEG data. Participants were divided into two groups depending on whether electrodermal activity (EDA) data indicated they self-regulated or not. The comparison of the results obtained using different machine learning algorithms for emotion recognition shows that using EEG activity alone as a predictor for self-regulation does

  15. Method for Improving EEG Based Emotion Recognition by Combining It with Synchronized Biometric and Eye Tracking Technologies in a Non-invasive and Low Cost Way

    Science.gov (United States)

    López-Gil, Juan-Miguel; Virgili-Gomá, Jordi; Gil, Rosa; Guilera, Teresa; Batalla, Iolanda; Soler-González, Jorge; García, Roberto

    2016-01-01

    Technical advances, particularly the integration of wearable and embedded sensors, facilitate tracking of physiological responses in a less intrusive way. Currently, there are many devices that allow gathering biometric measurements from human beings, such as EEG headsets or health bracelets. The massive data sets generated by tracking of EEG and physiology may be used, among other things, to infer knowledge about human moods and emotions. Apart from direct biometric signal measurement, eye tracking systems are nowadays capable of determining the point of gaze of users interacting in ICT environments, which provides added value for research in many different areas, such as psychology or marketing. We present a process in which devices for eye tracking, biometric, and EEG signal measurements are used synchronously to study both basic and complex emotions. We selected the least intrusive devices for collecting the different signals, given the study requirements and cost constraints, so that users would behave in the most natural way possible. On the one hand, we have been able to determine the basic emotions participants were experiencing by means of valence and arousal. On the other hand, a complex emotion such as empathy has also been detected. To validate the usefulness of this approach, a study involving forty-four people was carried out, in which they were exposed to a series of affective stimuli while their EEG activity, biometric signals, and eye position were synchronously recorded to detect self-regulation. The hypothesis of the work was that people who self-regulated would show significantly different results when analyzing their EEG data. Participants were divided into two groups depending on whether electrodermal activity (EDA) data indicated they self-regulated or not. The comparison of the results obtained using different machine learning algorithms for emotion recognition shows that using EEG activity alone as a predictor for self-regulation does
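
    The final step the abstract describes, comparing different machine learning algorithms for predicting self-regulation from EEG activity, can be prototyped as shown below. This is a minimal sketch rather than the authors' pipeline: it assumes EEG features have already been extracted into a numeric matrix, and the array shapes, classifier choices, and labels are illustrative assumptions.

    ```python
    # Minimal sketch (not the authors' pipeline): compare several classifiers for
    # predicting an EDA-derived self-regulation label from pre-extracted EEG features.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import StratifiedKFold, cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.normal(size=(44, 32))    # placeholder: 44 participants x 32 EEG features
    y = rng.integers(0, 2, size=44)  # placeholder: self-regulated (1) vs. not (0)

    models = {
        "SVM (RBF)": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "Random forest": RandomForestClassifier(n_estimators=200, random_state=0),
        "k-NN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    }

    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
        print(f"{name}: accuracy {scores.mean():.2f} ± {scores.std():.2f}")
    ```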

  16. Speech Acquisition and Automatic Speech Recognition for Integrated Spacesuit Audio Systems

    Science.gov (United States)

    Huang, Yiteng; Chen, Jingdong; Chen, Shaoyan

    2010-01-01

    A voice-command human-machine interface system has been developed for spacesuit extravehicular activity (EVA) missions. A multichannel acoustic signal processing method has been created for distant speech acquisition in noisy and reverberant environments. This technology reduces noise by exploiting differences in the statistical nature of signal (i.e., speech) and noise that exist in the spatial and temporal domains. As a result, automatic speech recognition (ASR) accuracy can be improved to the level at which crewmembers would find the speech interface useful. The developed speech human/machine interface will improve both crewmember usability and operational efficiency: it supports a fast rate of data/text entry, has a small overall size, and is lightweight. In addition, this design frees the hands and eyes of a suited crewmember. The system components and steps include beamforming/multichannel noise reduction, single-channel noise reduction, speech feature extraction, feature transformation and normalization, feature compression, model adaptation, ASR HMM (Hidden Markov Model) training, and ASR decoding. A state-of-the-art phoneme recognizer can obtain an accuracy rate of 65 percent when the training and testing data are free of noise. When it is used in spacesuits, the rate drops to about 33 percent. With the developed microphone array speech-processing technologies, the performance is improved and the phoneme recognition accuracy rate rises to 44 percent. The recognizer can be further improved by combining the microphone array and HMM model adaptation techniques and by using speech samples collected from inside spacesuits. In addition, arithmetic complexity models for the major HMM-based ASR components were developed. They can help real-time ASR system designers select appropriate tasks when facing constraints on computational resources.
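
    The multichannel front end described above relies on microphone-array beamforming before single-channel noise reduction and feature extraction. The sketch below illustrates only the basic delay-and-sum idea behind such a front end; it is not the NASA implementation, and the linear-array geometry, known direction of arrival, and sign convention are simplifying assumptions.

    ```python
    # Minimal delay-and-sum beamformer sketch (illustrative, not the flight system).
    # Aligns each microphone channel toward a known look direction and averages,
    # so the desired speech adds coherently while diffuse noise partially cancels.
    import numpy as np

    def delay_and_sum(signals: np.ndarray, mic_positions_m: np.ndarray,
                      doa_deg: float, fs: int, c: float = 343.0) -> np.ndarray:
        """signals: (n_mics, n_samples); mic_positions_m: positions along the array axis (m)."""
        n_mics, n_samples = signals.shape
        # Relative arrival-time differences for a far-field source at doa_deg from broadside;
        # the sign depends on the chosen geometry convention.
        delays_s = mic_positions_m * np.sin(np.deg2rad(doa_deg)) / c
        freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
        out = np.zeros(n_samples)
        for ch in range(n_mics):
            spectrum = np.fft.rfft(signals[ch])
            # Fractional-sample alignment applied as a linear phase shift in frequency.
            spectrum *= np.exp(-2j * np.pi * freqs * delays_s[ch])
            out += np.fft.irfft(spectrum, n=n_samples)
        return out / n_mics
    ```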

  17. The electronic cry: Voice and gender in electroacoustic music

    NARCIS (Netherlands)

    Bosma, H.M.

    2013-01-01

    The voice provides an entrance to discuss gender and related fundamental issues in electroacoustic music that are relevant as well in other musical genres and outside of music per se: the role of the female voice; the use of language versus non-verbal vocal sounds; the relation of voice, embodiment

  18. Original Knowledge, Gender and the Word's Mythology: Voicing the Doctorate

    Science.gov (United States)

    Carter, Susan

    2012-01-01

    Using mythology as a generative matrix, this article investigates the relationship between knowledge, words, embodiment and gender as they play out in academic writing's voice and, in particular, in doctoral voice. The doctoral thesis is defensive, a performance seeking admittance into discipline scholarship. Yet in finding its scholarly voice,…

  19. The Influence of Sleep Disorders on Voice Quality.

    Science.gov (United States)

    Rocha, Bruna Rainho; Behlau, Mara

    2017-09-19

    To verify the influence of sleep quality on the voice. Descriptive and analytical cross-sectional study. Data were collected by an online or printed survey divided into three parts: (1) demographic data and vocal health aspects; (2) self-assessment of sleep and vocal quality, and of the influence that sleep has on voice; and (3) sleep and voice self-assessment inventories: the Epworth Sleepiness Scale (ESS), the Pittsburgh Sleep Quality Index (PSQI), and the Voice Handicap Index reduced version (VHI-10). A total of 862 people were included (493 women, 369 men), with a mean age of 32 years (range 18 to 79 years). The perception of the influence that sleep has on voice showed a significant difference between groups; the factors that influence a voice handicap are vocal self-assessment, the ESS total score, and self-assessment of the influence that sleep has on voice. The absence of daytime sleepiness is a protective factor (odds ratio [OR] > 1) against perceived voice handicap, whereas the presence of daytime sleepiness is a damaging factor. Sleep influences voice: perceived poor sleep quality is related to perceived poor vocal quality, and individuals with a voice handicap observe a greater influence of sleep on voice than those without. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
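
    The protective and damaging factors mentioned above are expressed as odds ratios. A minimal sketch of how an odds ratio is computed from a 2×2 exposure-outcome table (for example, daytime sleepiness above an ESS cut-off versus perceived voice handicap on the VHI-10) is shown below; the counts are hypothetical placeholders, not data from the study.

    ```python
    # Minimal sketch: odds ratio from a 2x2 table (exposure = daytime sleepiness,
    # outcome = perceived voice handicap). Counts are illustrative placeholders.

    def odds_ratio(exposed_cases: int, exposed_noncases: int,
                   unexposed_cases: int, unexposed_noncases: int) -> float:
        """OR = (a/b) / (c/d) for a 2x2 exposure-outcome table."""
        return (exposed_cases / exposed_noncases) / (unexposed_cases / unexposed_noncases)

    or_sleepiness = odds_ratio(exposed_cases=60, exposed_noncases=140,
                               unexposed_cases=90, unexposed_noncases=572)
    print(f"OR (daytime sleepiness vs. voice handicap) = {or_sleepiness:.2f}")
    # By the usual convention, OR > 1 means higher odds of the outcome among the
    # exposed group and OR < 1 means lower odds.
    ```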

  20. Acoustic Analysis of Voice in Singers: A Systematic Review

    Science.gov (United States)

    Gunjawate, Dhanshree R.; Ravi, Rohit; Bellur, Rajashekhar

    2018-01-01

    Purpose: Singers are vocal athletes having specific demands from their voice and require special consideration during voice evaluation. Presently, there is a lack of standards for acoustic evaluation in them. The aim of the present study was to systematically review the available literature on the acoustic analysis of voice in singers. Method: A…