WorldWideScience

Sample records for voice recognition software

  1. Voice recognition software can be used for scientific articles

    DEFF Research Database (Denmark)

    Pommergaard, Hans-Christian; Huang, Chenxi; Burcharth, Jacob

    2015-01-01

    INTRODUCTION: Dictation of scientific articles has been recognised as an efficient method for producing high-quality first article drafts. However, a standardised transcription service by a secretary may not be available to all researchers, and voice recognition software (VRS) may therefore … with a median score of five (range: 3-9), which improved with the addition of 5,000 words. CONCLUSION: The out-of-the-box performance of VRS was acceptable and improved after additional words were added. Further studies are needed to investigate the effect of additional software accuracy training.

  2. Voice recognition software can be used for scientific articles

    DEFF Research Database (Denmark)

    Pommergaard, Hans-Christian; Huang, Chenxi; Burcharth, Jacob

    2015-01-01

    INTRODUCTION: Dictation of scientific articles has been recognised as an efficient method for producing high-quality first article drafts. However, a standardised transcription service by a secretary may not be available to all researchers, and voice recognition software (VRS) may therefore be an alternative. The purpose of this study was to evaluate the out-of-the-box accuracy of VRS. METHODS: Eleven young researchers without dictation experience dictated the first draft of their own scientific article after thorough preparation according to a pre-defined schedule. The dictation transcribed by VRS …

  3. Voice recognition software can be used for scientific articles.

    Science.gov (United States)

    Pommergaard, Hans-Christian; Huang, Chenxi; Burcharth, Jacob; Rosenberg, Jacob

    2015-02-01

    Dictation of scientific articles has been recognised as an efficient method for producing high-quality first article drafts. However, a standardised transcription service by a secretary may not be available to all researchers, and voice recognition software (VRS) may therefore be an alternative. The purpose of this study was to evaluate the out-of-the-box accuracy of VRS. Eleven young researchers without dictation experience dictated the first draft of their own scientific article after thorough preparation according to a pre-defined schedule. The dictation transcribed by VRS was compared with the same dictation transcribed by an experienced research secretary, and the effect of adding words to the vocabulary of the VRS was investigated. The number of errors per hundred words was used as the outcome measure. Furthermore, three experienced researchers assessed the subjective readability using a Likert scale (0-10). Dragon Nuance Premium version 12.5 was used as VRS. The median number of errors per hundred words was 18 (range: 8.5-24.3), which improved when 15,000 words were added to the vocabulary. Subjective readability assessment showed that the texts were understandable with a median score of five (range: 3-9), which improved with the addition of 5,000 words. The out-of-the-box performance of VRS was acceptable and improved after additional words were added. Further studies are needed to investigate the effect of additional software accuracy training.
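
    The outcome measure above, errors per hundred words, is straightforward to reproduce. A minimal sketch; the counts in the example are hypothetical, not data from the study:

```python
def errors_per_hundred_words(error_count: int, word_count: int) -> float:
    """Transcription errors per 100 dictated words (the study's outcome measure)."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return 100.0 * error_count / word_count

# Hypothetical illustration: 216 errors in a 1,200-word draft gives a rate of
# 18 errors per hundred words, the same value as the reported median.
rate = errors_per_hundred_words(216, 1200)
```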

  4. Use of voice recognition software in an outpatient pediatric specialty practice.

    Science.gov (United States)

    Issenman, Robert M; Jaffer, Iqbal H

    2004-09-01

    Voice recognition software (VRS), with specialized medical vocabulary, is being promoted to enhance physician efficiency, decrease costs, and improve patient safety. This study reports the experience of a pediatric subspecialist (pediatric gastroenterology) physician with the use of Dragon Naturally Speaking (version 6; ScanSoft Inc, Peabody, MA), incorporated for use with a proprietary electronic medical record, in a large university medical center ambulatory care service. After 2 hours of group orientation and 2 hours of individual VRS instruction, the physician trained the software for 1 month (30 letters) during a hospital slowdown. Set-up, dictation, and correction times for the physician and medical transcriptionist were recorded for these training sessions, as well as for 42 subsequently dictated letters. Figures were extrapolated to the yearly clinic volume for the physician, to estimate costs (physician: 110 dollars per hour; transcriptionist: 11 dollars per hour, US dollars). The use of VRS required an additional 200% of physician dictation and correction time (9 minutes vs 3 minutes), compared with the use of electronic signatures for letters typed by an experienced transcriptionist and imported into the electronic medical record. When the cost of the license agreement and the costs of physician and transcriptionist time were included, the use of the software cost 100% more, for the amount of dictation performed annually by the physician. VRS is an intriguing technology. It holds the possibility of streamlining medical practice. However, the learning curve and accuracy of the tested version of the software limit broad physician acceptance at this time.
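
    The cost comparison in this record is simple arithmetic over staff time. A sketch under stated assumptions: the hourly rates and the 9-minute versus 3-minute physician times come from the abstract, while the transcriptionist's typing time per letter is a hypothetical placeholder (the abstract does not state it) and the licence fee is excluded:

```python
PHYSICIAN_RATE = 110.0 / 60        # US$ per minute (from the abstract)
TRANSCRIPTIONIST_RATE = 11.0 / 60  # US$ per minute (from the abstract)

def letter_cost(physician_minutes: float, transcriptionist_minutes: float) -> float:
    """Direct labour cost of producing one clinic letter."""
    return (physician_minutes * PHYSICIAN_RATE
            + transcriptionist_minutes * TRANSCRIPTIONIST_RATE)

# VRS workflow: 9 min of physician dictation and correction, no transcriptionist.
vrs_cost = letter_cost(9, 0)
# Traditional workflow: 3 min of physician time plus an assumed 15 min of
# transcriptionist typing per letter (this typing time is not in the abstract).
traditional_cost = letter_cost(3, 15)
```

    Under these placeholder numbers the VRS letter costs about twice as much in labour alone, before the licence agreement is counted, which is consistent in spirit with the 100% figure reported above.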

  5. Success with voice recognition.

    Science.gov (United States)

    Sferrella, Sheila M

    2003-01-01

    You need a compelling reason to implement voice recognition technology. At my institution, the compelling reason was a turnaround time for Radiology results of more than two days. Only 41 percent of our reports were transcribed and signed within 24 hours. In November 1998, a team from Lehigh Valley Hospital went to RSNA and reviewed every voice system on the market. The evaluation was done with the radiologist workflow in mind, and we came back from the meeting with the vendor selection completed. The next steps included developing a business plan, approval of funds, reference calls to more than 15 sites and contract negotiation, all of which took about six months. The department of Radiology at Lehigh Valley Hospital and Health Network (LVHHN) is a multi-site center that performs over 360,000 procedures annually. The department handles all modalities of radiology: general diagnosis, neuroradiology, ultrasound, CT scan, MRI, interventional radiology, arthrography, myelography, bone densitometry, nuclear medicine, PET imaging, vascular lab and other advanced procedures. The department consists of 200 FTEs and a medical staff of more than 40 radiologists. The budget is in the $10.3 million range. There are three hospital sites and four outpatient imaging center sites where services are provided. At Lehigh Valley Hospital, radiologists are not dedicated to one subspecialty, so implementing a voice system by modality was not an option. Because transcription was so far behind, we needed to eliminate that part of the process. As a result, we decided to deploy the system all at once and with the radiologists as editors. The planning and testing phase took about four months, and the implementation took two weeks. We deployed over 40 workstations and trained close to 50 physicians. The radiologists brought in an extra radiologist from our group for the two weeks of training. That allowed us to train without taking a radiologist out of the department. We trained three to six …

  6. FILTWAM and Voice Emotion Recognition

    NARCIS (Netherlands)

    Bahreini, Kiavash; Nadolski, Rob; Westera, Wim

    2014-01-01

    This paper introduces the voice emotion recognition part of our framework for improving learning through webcams and microphones (FILTWAM). This framework enables multimodal emotion recognition of learners during game-based learning. The main goal of this study is to validate the use of microphone …

  7. Voice congruency facilitates word recognition.

    Directory of Open Access Journals (Sweden)

    Sandra Campeanu

    Behavioral studies of spoken word memory have shown that context congruency facilitates both word and source recognition, though the level at which context exerts its influence remains equivocal. We measured event-related potentials (ERPs) while participants performed both types of recognition task with words spoken in four voices. Two voice parameters (i.e., gender and accent) varied between speakers, with the possibility that none, one or two of these parameters was congruent between study and test. Results indicated that reinstating the study voice at test facilitated both word and source recognition, compared to similar or no context congruency at test. Behavioral effects were paralleled by two ERP modulations. First, in the word recognition test, the left parietal old/new effect showed a positive deflection reflective of context congruency between study and test words. Namely, the same speaker condition provided the most positive deflection of all correctly identified old words. In the source recognition test, a right frontal positivity was found for the same speaker condition compared to the different speaker conditions, regardless of response success. Taken together, the results of this study suggest that the benefit of context congruency is reflected behaviorally and in ERP modulations traditionally associated with recognition memory.

  8. Voice congruency facilitates word recognition.

    Science.gov (United States)

    Campeanu, Sandra; Craik, Fergus I M; Alain, Claude

    2013-01-01

    Behavioral studies of spoken word memory have shown that context congruency facilitates both word and source recognition, though the level at which context exerts its influence remains equivocal. We measured event-related potentials (ERPs) while participants performed both types of recognition task with words spoken in four voices. Two voice parameters (i.e., gender and accent) varied between speakers, with the possibility that none, one or two of these parameters was congruent between study and test. Results indicated that reinstating the study voice at test facilitated both word and source recognition, compared to similar or no context congruency at test. Behavioral effects were paralleled by two ERP modulations. First, in the word recognition test, the left parietal old/new effect showed a positive deflection reflective of context congruency between study and test words. Namely, the same speaker condition provided the most positive deflection of all correctly identified old words. In the source recognition test, a right frontal positivity was found for the same speaker condition compared to the different speaker conditions, regardless of response success. Taken together, the results of this study suggest that the benefit of context congruency is reflected behaviorally and in ERP modulations traditionally associated with recognition memory.

  9. Voice Recognition Interface in the Rehabilitation of Combat Amputees

    National Research Council Canada - National Science Library

    Lenhart, Martha; Yancosek, Kathleen E

    2004-01-01

    The goal of this pilot study is to assess the impact of training on voice recognition software as part of the rehabilitation process that military patients with amputation or peripheral nerve loss …

  10. An investigation and comparison of speech recognition software for determining if bird song recordings contain legible human voices

    Directory of Open Access Journals (Sweden)

    Tim D. Hunt

    The purpose of this work was to test the effectiveness of using readily available speech recognition API services to determine whether recordings of bird song had inadvertently captured human voices. A mobile phone was used to record a human speaking at increasing distances from the phone in an outdoor setting with bird song occurring in the background. One of the services was trained with sample recordings, and the services were compared on their ability to return recognized words. The services from Google and IBM performed similarly, and the Microsoft service, which allowed training, performed slightly better. However, all three services failed to perform at a level that would enable recordings with recognizable human speech to be deleted in order to maintain full privacy protection.
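
    The decision step in such a pipeline (delete any recording for which an ASR service returns legible words) is service-agnostic and easy to sketch. In the sketch below, `min_words` is a hypothetical cutoff and the file names are invented; the paper compares services rather than fixing a threshold:

```python
def contains_legible_speech(recognized_words, min_words=2):
    """True if an ASR service returned at least `min_words` non-empty words,
    i.e. the recording likely captured human speech and should be deleted."""
    return len([w for w in recognized_words if w.strip()]) >= min_words

# Filtering a batch of per-recording ASR results (names are placeholders):
asr_results = {"site1_0600.wav": ["good", "morning"], "site1_0605.wav": []}
to_delete = [name for name, words in asr_results.items()
             if contains_legible_speech(words)]
```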

  11. Frequency and analysis of non-clinical errors made in radiology reports using the National Integrated Medical Imaging System voice recognition dictation software.

    Science.gov (United States)

    Motyer, R E; Liddy, S; Torreggiani, W C; Buckley, O

    2016-11-01

    Voice recognition (VR) dictation of radiology reports has become the mainstay of reporting in many institutions worldwide. Despite its benefits, such software is not without limitations, and transcription errors have been widely reported. The aim was to evaluate the frequency and nature of non-clinical transcription errors using VR dictation software. A retrospective audit of 378 finalised radiology reports was performed. Errors were counted and categorised by significance, error type and sub-type. Data regarding imaging modality, report length and dictation time were collected. 67 (17.72 %) reports contained ≥1 error, with 7 (1.85 %) containing 'significant' and 9 (2.38 %) containing 'very significant' errors. A total of 90 errors were identified from the 378 reports analysed, with 74 (82.22 %) classified as 'insignificant', 7 (7.78 %) as 'significant', 9 (10 %) as 'very significant'. 68 (75.56 %) errors were 'spelling and grammar', 20 (22.22 %) 'missense' and 2 (2.22 %) 'nonsense'. 'Punctuation' was the most common error sub-type, accounting for 27 errors (30 %). Complex imaging modalities had higher error rates per report and sentence. Computed tomography contained 0.040 errors per sentence compared to plain film with 0.030. Longer reports had a higher error rate, with reports >25 sentences containing an average of 1.23 errors per report compared to 0-5 sentences containing 0.09. These findings highlight the limitations of VR dictation software. While most errors were deemed insignificant, there were occurrences with the potential to alter report interpretation and patient management. Longer reports and reports on more complex imaging had higher error rates, and this should be taken into account by the reporting radiologist.

  12. Voice Recognition in Face-Blind Patients

    Science.gov (United States)

    Liu, Ran R.; Pancaroglu, Raika; Hills, Charlotte S.; Duchaine, Brad; Barton, Jason J. S.

    2016-01-01

    Right or bilateral anterior temporal damage can impair face recognition, but whether this is an associative variant of prosopagnosia or part of a multimodal disorder of person recognition is an unsettled question, with implications for cognitive and neuroanatomic models of person recognition. We assessed voice perception and short-term recognition of recently heard voices in 10 subjects with impaired face recognition acquired after cerebral lesions. All 4 subjects with apperceptive prosopagnosia due to lesions limited to fusiform cortex had intact voice discrimination and recognition. One subject with bilateral fusiform and anterior temporal lesions had a combined apperceptive prosopagnosia and apperceptive phonagnosia, the first such described case. Deficits indicating a multimodal syndrome of person recognition were found only in 2 subjects with bilateral anterior temporal lesions. All 3 subjects with right anterior temporal lesions had normal voice perception and recognition, 2 of whom performed normally on perceptual discrimination of faces. This confirms that such lesions can cause a modality-specific associative prosopagnosia. PMID:25349193

  13. Implicit multisensory associations influence voice recognition.

    Directory of Open Access Journals (Sweden)

    Katharina von Kriegstein

    2006-10-01

    Natural objects provide partially redundant information to the brain through different sensory modalities. For example, voices and faces both give information about the speech content, age, and gender of a person. Thanks to this redundancy, multimodal recognition is fast, robust, and automatic. In unimodal perception, however, only part of the information about an object is available. Here, we addressed whether, even under conditions of unimodal sensory input, crossmodal neural circuits that have been shaped by previous associative learning become activated and underpin a performance benefit. We measured brain activity with functional magnetic resonance imaging before, while, and after participants learned to associate either sensory redundant stimuli (i.e., voices and faces) or arbitrary multimodal combinations (i.e., voices and written names, ring tones and cell phones or brand names of these cell phones). After learning, participants were better at recognizing unimodal auditory voices that had been paired with faces than those paired with written names, and association of voices with faces resulted in an increased functional coupling between voice and face areas. No such effects were observed for ring tones that had been paired with cell phones or names. These findings demonstrate that brief exposure to ecologically valid and sensory redundant stimulus pairs, such as voices and faces, induces specific multisensory associations. Consistent with predictive coding theories, associative representations become thereafter available for unimodal perception and facilitate object recognition. These data suggest that for natural objects effective predictive signals can be generated across sensory systems and proceed by optimization of functional connectivity between specialized cortical sensory modules.

  14. Robotics control using isolated word recognition of voice input

    Science.gov (United States)

    Weiner, J. M.

    1977-01-01

    A speech input/output system is presented that can be used to communicate with a task-oriented system. Human speech commands and synthesized voice output extend conventional information exchange capabilities between man and machine by utilizing audio input and output channels. The speech input facility comprises a hardware feature extractor and a microprocessor-implemented isolated word or phrase recognition system. The recognizer offers a medium-sized (100 commands), syntactically constrained vocabulary, and exhibits close to real-time performance. The major portion of the recognition processing required is accomplished through software, minimizing the complexity of the hardware feature extractor.

  15. Robust matching for voice recognition

    Science.gov (United States)

    Higgins, Alan; Bahler, L.; Porter, J.; Blais, P.

    1994-10-01

    This paper describes an automated method of comparing a voice sample of an unknown individual with samples from known speakers in order to establish or verify the individual's identity. The method is based on a statistical pattern matching approach that employs a simple training procedure, requires no human intervention (transcription, word or phonetic marking, etc.), and makes no assumptions regarding the expected form of the statistical distributions of the observations. The content of the speech material (vocabulary, grammar, etc.) is not assumed to be constrained in any way. An algorithm is described which incorporates frame pruning and channel equalization processes designed to achieve robust performance with reasonable computational resources. An experimental implementation demonstrating the feasibility of the concept is described.
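
    The abstract does not specify its equalization algorithm, but a standard way to obtain channel robustness in this setting is cepstral mean subtraction: a stationary channel adds a near-constant offset in the cepstral domain, so removing the per-dimension mean across frames cancels it. A pure-Python sketch of that idea (a stand-in for the paper's unspecified step; the frames in the test are placeholders):

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-dimension mean over all frames from each cepstral
    frame, cancelling a stationary channel's additive cepstral offset.
    `frames` is a list of equal-length feature vectors (lists of floats)."""
    if not frames:
        return []
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / len(frames) for d in range(dims)]
    return [[f[d] - means[d] for d in range(dims)] for f in frames]
```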

  16. Pengembangan Game dengan Menggunakan Teknologi Voice Recognition Berbasis Android [Development of a Game Using Android-Based Voice Recognition Technology]

    Directory of Open Access Journals (Sweden)

    Franky Hadinata Marpaung

    2014-06-01

    The purpose of this research is to create a new kind of game using technology that is rarely used in current games. It was developed as an entertainment medium and also a social medium in which users can play the games together via multiplayer mode. This research uses the Scrum development method, since it suits small-scale development teams and supports software increments throughout development. Using this game application, users can play and watch interesting animations by controlling them with their voice, listen to the character imitating the user's voice, and play various mini-games in either single-player or multiplayer mode via a Bluetooth connection. The conclusion is that the game application My Name is Dug uses voice recognition and inter-device connection as its main features. It also offers various mini-games that support both single-player and multiplayer modes.

  17. Obligatory and facultative brain regions for voice-identity recognition

    Science.gov (United States)

    Roswandowitz, Claudia; Kappes, Claudia; Obrig, Hellmuth; von Kriegstein, Katharina

    2018-01-01

    Recognizing the identity of others by their voice is an important skill for social interactions. To date, it remains controversial which parts of the brain are critical structures for this skill. Based on neuroimaging findings, standard models of person-identity recognition suggest that the right temporal lobe is the hub for voice-identity recognition. Neuropsychological case studies, however, reported selective deficits of voice-identity recognition in patients predominantly with right inferior parietal lobe lesions. Here, our aim was to work towards resolving the discrepancy between neuroimaging studies and neuropsychological case studies to find out which brain structures are critical for voice-identity recognition in humans. We performed a voxel-based lesion-behaviour mapping study in a cohort of patients (n = 58) with unilateral focal brain lesions. The study included a comprehensive behavioural test battery on voice-identity recognition of newly learned (voice-name, voice-face association learning) and familiar voices (famous voice recognition) as well as visual (face-identity recognition) and acoustic control tests (vocal-pitch and vocal-timbre discrimination). The study also comprised clinically established tests (neuropsychological assessment, audiometry) and high-resolution structural brain images. The three key findings were: (i) a strong association between voice-identity recognition performance and right posterior/mid temporal and right inferior parietal lobe lesions; (ii) a selective association between right posterior/mid temporal lobe lesions and voice-identity recognition performance when face-identity recognition performance was factored out; and (iii) an association of right inferior parietal lobe lesions with tasks requiring the association between voices and faces but not voices and names.
    The results imply that the right posterior/mid temporal lobe is an obligatory structure for voice-identity recognition, while the inferior parietal …

  18. Improving Speaker Recognition by Biometric Voice Deconstruction

    Directory of Open Access Journals (Sweden)

    Luis Miguel eMazaira-Fernández

    2015-09-01

    Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g., YouTube) to broadcast its message. In this new scenario, classical identification methods (such as fingerprints or face recognition) have been forcibly replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. Through the present paper, a new methodology to characterize speakers will be shown. This methodology benefits from the advances achieved during the last years in understanding and modelling voice production. The paper hypothesizes that a gender-dependent characterization of speakers, combined with the use of a new set of biometric parameters extracted from the components resulting from the deconstruction of the voice into its glottal source and vocal tract estimates, will enhance recognition rates when compared to classical approaches. A general description of the main hypothesis and the methodology followed to extract the gender-dependent extended biometric parameters is given. Experimental validation is carried out both on a highly controlled acoustic condition database and on a mobile phone network recorded under non-controlled acoustic conditions.

  19. Improving Speaker Recognition by Biometric Voice Deconstruction

    Science.gov (United States)

    Mazaira-Fernandez, Luis Miguel; Álvarez-Marquina, Agustín; Gómez-Vilda, Pedro

    2015-01-01

    Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g., YouTube) to broadcast its message. In this new scenario, classical identification methods (such as fingerprints or face recognition) have been forcibly replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. The present study benefits from the advances achieved during the last years in understanding and modeling voice production. The paper hypothesizes that a gender-dependent characterization of speakers, combined with the use of a set of features derived from the components resulting from the deconstruction of the voice into its glottal source and vocal tract estimates, will enhance recognition rates when compared to classical approaches. A general description of the main hypothesis and the methodology followed to extract the gender-dependent extended biometric parameters is given. Experimental validation is carried out both on a highly controlled acoustic condition database, and on a mobile phone network recorded under non-controlled acoustic conditions. PMID:26442245

  20. Influence of Smartphones and Software on Acoustic Voice Measures.

    OpenAIRE

    Elizabeth U. Grillo; Jenna N. Brosious; Staci L. Sorrell; Supraja Anand

    2016-01-01

    This study assessed the within-subject variability of voice measures captured using different recording devices (i.e., smartphones and head mounted microphone) and software programs (i.e., Analysis of Dysphonia in Speech and Voice (ADSV), Multi-dimensional Voice Program (MDVP), and Praat).  Correlations between the software programs that calculated the voice measures were also analyzed.  Results demonstrated no significant within-subject variability across devices and software and that some o...

  1. Obligatory and facultative brain regions for voice-identity recognition.

    Science.gov (United States)

    Roswandowitz, Claudia; Kappes, Claudia; Obrig, Hellmuth; von Kriegstein, Katharina

    2018-01-01

    Recognizing the identity of others by their voice is an important skill for social interactions. To date, it remains controversial which parts of the brain are critical structures for this skill. Based on neuroimaging findings, standard models of person-identity recognition suggest that the right temporal lobe is the hub for voice-identity recognition. Neuropsychological case studies, however, reported selective deficits of voice-identity recognition in patients predominantly with right inferior parietal lobe lesions. Here, our aim was to work towards resolving the discrepancy between neuroimaging studies and neuropsychological case studies to find out which brain structures are critical for voice-identity recognition in humans. We performed a voxel-based lesion-behaviour mapping study in a cohort of patients (n = 58) with unilateral focal brain lesions. The study included a comprehensive behavioural test battery on voice-identity recognition of newly learned (voice-name, voice-face association learning) and familiar voices (famous voice recognition) as well as visual (face-identity recognition) and acoustic control tests (vocal-pitch and vocal-timbre discrimination). The study also comprised clinically established tests (neuropsychological assessment, audiometry) and high-resolution structural brain images. The three key findings were: (i) a strong association between voice-identity recognition performance and right posterior/mid temporal and right inferior parietal lobe lesions; (ii) a selective association between right posterior/mid temporal lobe lesions and voice-identity recognition performance when face-identity recognition performance was factored out; and (iii) an association of right inferior parietal lobe lesions with tasks requiring the association between voices and faces but not voices and names. 
    The results imply that the right posterior/mid temporal lobe is an obligatory structure for voice-identity recognition, while the inferior parietal lobe is …

  2. Familiar Person Recognition: Is Autonoetic Consciousness More Likely to Accompany Face Recognition Than Voice Recognition?

    Science.gov (United States)

    Barsics, Catherine; Brédart, Serge

    2010-11-01

    Autonoetic consciousness is a fundamental property of human memory, enabling us to experience mental time travel, to recollect past events with a feeling of self-involvement, and to project ourselves into the future. Autonoetic consciousness is a characteristic of episodic memory. By contrast, awareness of the past associated with a mere feeling of familiarity or knowing relies on noetic consciousness, depending on semantic memory integrity. The present research evaluated whether conscious recollection of episodic memories is more likely to occur following the recognition of a familiar face than following the recognition of a familiar voice. Recall of semantic (biographical) information was also assessed. Previous studies that investigated the recall of biographical information following person recognition used faces and voices of famous people as stimuli. In this study, the participants were presented with personally familiar people's voices and faces, thus avoiding the presence of identity cues in the spoken extracts and allowing stricter control of exposure frequency for both types of stimuli (voices and faces). The rate of retrieved episodic memories associated with autonoetic awareness was significantly higher for familiar faces than for familiar voices, even though the level of overall recognition was similar for both stimulus domains. The same pattern was observed for semantic information retrieval. These results and their implications for current Interactive Activation and Competition person recognition models are discussed.

  3. Influence of Smartphones and Software on Acoustic Voice Measures.

    Directory of Open Access Journals (Sweden)

    Elizabeth U. Grillo

    2016-12-01

    This study assessed the within-subject variability of voice measures captured using different recording devices (i.e., smartphones and head mounted microphone) and software programs (i.e., Analysis of Dysphonia in Speech and Voice (ADSV), Multi-dimensional Voice Program (MDVP), and Praat). Correlations between the software programs that calculated the voice measures were also analyzed. Results demonstrated no significant within-subject variability across devices and software and that some of the measures were highly correlated across software programs. The study suggests that certain smartphones may be appropriate to record daily voice measures representing the effects of vocal loading within individuals. In addition, even though different algorithms are used to compute voice measures across software programs, some of the programs and measures share a similar relationship.
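
    The cross-program comparison reduces to correlating paired measurements of the same voice measure produced by two analysis programs. A minimal Pearson correlation sketch; the sample values below are invented placeholders, not the study's data:

```python
def pearson_r(x, y):
    """Pearson correlation between paired measurements, e.g. one voice
    measure computed for the same recordings by two different programs."""
    n = len(x)
    assert n == len(y) and n > 1
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / (sxx * syy) ** 0.5

# Hypothetical paired outputs of one measure from two programs; r near 1
# would mean the programs rank speakers similarly despite different algorithms.
program_a = [12.1, 14.8, 9.3, 11.0]
program_b = [12.4, 15.1, 9.0, 11.2]
r = pearson_r(program_a, program_b)
```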

  4. Electrolarynx Voice Recognition Utilizing Pulse Coupled Neural Network

    Directory of Open Access Journals (Sweden)

    Fatchul Arifin

    2010-08-01

    Patients who have undergone laryngectomy are unable to speak normally because their vocal cords have been removed. The easiest option for such patients to speak again is electrolarynx speech. This tool is placed on the lower chin, and vibration of the neck while speaking is used to produce sound. Meanwhile, voice recognition technology has been growing very rapidly, and it is expected that it can also be used by laryngectomy patients who use an electrolarynx. This paper describes a system for electrolarynx speech recognition. The two main parts of the system are feature extraction and pattern recognition. A Pulse Coupled Neural Network (PCNN) is used to extract the features and characteristics of electrolarynx speech; varying β (one of the PCNN parameters) was also investigated. A multilayer perceptron is used to recognize the sound patterns. Two kinds of recognition are conducted in this paper: speech recognition and speaker recognition. Speech recognition recognizes specific speech from any person, whereas speaker recognition recognizes specific speech from a specific person. The system ran well. Electrolarynx speech recognition was tested by recognizing "A" and "not A" voices, and the results showed 94.4% validation accuracy; electrolarynx speaker recognition was tested by recognizing the word "saya" spoken by several different speakers, with 92.2% validation accuracy. The best β parameter of the PCNN for electrolarynx recognition was 3.
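
    The abstract does not publish the network's configuration, so the following is a generic minimal pulse-coupled neural network in one dimension, included only to show where the linking strength β enters the neuron dynamics. The default β = 3 mirrors the best value reported above; every other constant is an illustrative assumption:

```python
import math

def pcnn_pulses(signal, beta=3.0, steps=20,
                alpha_f=0.1, alpha_l=0.3, alpha_t=0.2, v_l=1.0, v_t=5.0):
    """Minimal 1-D PCNN: one neuron per input sample; each neuron's list of
    firing times is the kind of pulse-train feature a classifier (e.g. the
    multilayer perceptron mentioned above) could consume."""
    n = len(signal)
    F = [0.0] * n          # feeding compartments (driven by the input)
    L = [0.0] * n          # linking compartments (driven by neighbours)
    theta = [1.0] * n      # dynamic thresholds
    Y = [0] * n            # pulse outputs
    pulses = [[] for _ in range(n)]
    for t in range(steps):
        # Linking input: previous-step pulses of the immediate neighbours.
        link = [(Y[i - 1] if i > 0 else 0) + (Y[i + 1] if i < n - 1 else 0)
                for i in range(n)]
        for i in range(n):
            F[i] = math.exp(-alpha_f) * F[i] + signal[i]
            L[i] = math.exp(-alpha_l) * L[i] + v_l * link[i]
            U = F[i] * (1.0 + beta * L[i])   # beta couples linking into feeding
            Y[i] = 1 if U > theta[i] else 0
            theta[i] = math.exp(-alpha_t) * theta[i] + v_t * Y[i]
            if Y[i]:
                pulses[i].append(t)
    return pulses
```

    With a larger β, a neuron with weak input fires sooner once its neighbours pulse, which is the coupling behaviour the varied-β experiments probe.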

  5. Investigations of Hemispheric Specialization of Self-Voice Recognition

    Science.gov (United States)

    Rosa, Christine; Lassonde, Maryse; Pinard, Claudine; Keenan, Julian Paul; Belin, Pascal

    2008-01-01

    Three experiments investigated functional asymmetries related to self-recognition in the domain of voices. In Experiment 1, participants were asked to identify one of three presented voices (self, familiar, or unknown) by responding with either the right or the left hand. In Experiment 2, participants were presented with auditory morphs between the…

  6. When the face fits: recognition of celebrities from matching and mismatching faces and voices.

    Science.gov (United States)

    Stevenage, Sarah V; Neil, Greg J; Hamlin, Iain

    2014-01-01

    The results of two experiments are presented in which participants engaged in a face-recognition or a voice-recognition task. The stimuli were face-voice pairs in which the face and voice were co-presented and were either "matched" (same person), "related" (two highly associated people), or "mismatched" (two unrelated people). Analysis in both experiments confirmed that accuracy and confidence in face recognition were consistently high regardless of the identity of the accompanying voice. However, accuracy of voice recognition was increasingly affected as the relationship between the voice and the accompanying face declined. Moreover, when considering self-reported confidence in voice recognition, confidence remained high for correct responses despite the proportion of these responses declining across conditions. These results converge with existing evidence indicating the vulnerability of voice recognition as a relatively weak signaller of identity, and results are discussed in the context of a person-recognition framework.

  7. Acoustic cues for the recognition of self-voice and other-voice

    Directory of Open Access Journals (Sweden)

    Mingdi eXu

    2013-10-01

    Full Text Available Self-recognition, being indispensable for successful social communication, has become a major focus in current social neuroscience. The physical aspects of the self are most typically manifested in the face and voice. Compared with the wealth of studies on self-face recognition, self-voice recognition (SVR) has not gained much attention. Converging evidence has suggested that the fundamental frequency (F0) and formant structures serve as the key acoustic cues for other-voice recognition (OVR). However, little is known about which, and how, acoustic cues are utilized for SVR as opposed to OVR. To address this question, we independently manipulated the F0 and formant information of recorded voices and investigated their contributions to SVR and OVR. Japanese participants were presented with recorded vocal stimuli and were asked to identify the speaker—either themselves or one of their peers. Six groups of 5 peers of the same sex participated in the study. Under conditions where the formant information was fully preserved and where only the frequencies lower than the third formant (F3) were retained, accuracies of SVR deteriorated significantly with the modulation of the F0, and the results were comparable for OVR. By contrast, under a condition where only the frequencies higher than F3 were retained, the accuracy of SVR was significantly higher than that of OVR throughout the range of F0 modulations, and the F0 scarcely affected the accuracies of SVR and OVR. Our results indicate that while both F0 and formant information are involved in SVR, as well as in OVR, the advantage of SVR is manifested only when major formant information for speech intelligibility is absent. These findings imply the robustness of self-voice representation, possibly by virtue of auditory familiarity and other factors such as its association with motor/articulatory representation.

  8. A Robust Multimodal Biometric Authentication Scheme with Voice and Face Recognition

    International Nuclear Information System (INIS)

    Kasban, H.

    2017-01-01

    This paper proposes a multimodal biometric scheme for human authentication based on the fusion of voice and face recognition. For voice recognition, three categories of features (statistical coefficients, cepstral coefficients, and voice timbre) are used and compared. The voice identification modality is carried out using a Gaussian Mixture Model (GMM). For face recognition, three recognition methods (Eigenface, Linear Discriminant Analysis (LDA), and Gabor filter) are used and compared. The combination of the voice and face biometric systems into a single multimodal biometric system is performed using feature fusion and score fusion. This study shows that the best results are obtained using all the features (cepstral coefficients, statistical coefficients, and voice timbre) for voice recognition, the LDA face recognition method, and score fusion for the multimodal biometric system.
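The score-fusion step the abstract reports as best can be sketched as a weighted sum of min-max-normalized matcher scores. This is a generic illustration of score-level fusion, not the paper's exact scheme; the score ranges and the 0.6 voice weight are assumptions.

```python
def min_max_normalize(score, lo, hi):
    """Map a raw matcher score into [0, 1] given that matcher's score range."""
    return (score - lo) / (hi - lo)

def fuse_scores(voice_score, face_score, voice_range, face_range, w_voice=0.5):
    """Weighted-sum score-level fusion of two normalized matcher scores."""
    v = min_max_normalize(voice_score, *voice_range)
    f = min_max_normalize(face_score, *face_range)
    return w_voice * v + (1.0 - w_voice) * f

# Hypothetical raw scores: a GMM log-likelihood-style voice score and a
# face similarity score already scaled to [0, 1].
fused = fuse_scores(voice_score=-20.0, face_score=0.8,
                    voice_range=(-100.0, 0.0), face_range=(0.0, 1.0),
                    w_voice=0.6)
print(round(fused, 3))  # → 0.8
```

The fused score is then compared against a single decision threshold, so the two modalities can compensate for each other's weak trials.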

  9. Voice reinstatement modulates neural indices of continuous word recognition.

    Science.gov (United States)

    Campeanu, Sandra; Craik, Fergus I M; Backer, Kristina C; Alain, Claude

    2014-09-01

    The present study was designed to examine listeners' ability to use voice information incidentally during spoken word recognition. We recorded event-related brain potentials (ERPs) during a continuous recognition paradigm in which participants indicated on each trial whether the spoken word was "new" or "old." Old items were presented at 2, 8 or 16 words following the first presentation. Context congruency was manipulated by having the same word repeated by either the same speaker or a different speaker. The different speaker could share the gender, accent or neither feature with the word presented the first time. Participants' accuracy was greater when the old word was spoken by the same speaker than by a different speaker. In addition, accuracy decreased with increasing lag. The correct identification of old words was accompanied by an enhanced late positivity over parietal sites, with no difference found between voice congruency conditions. In contrast, an earlier voice reinstatement effect was observed over frontal sites, an index of priming that preceded recollection in this task. Our results provide further evidence that acoustic and semantic information are integrated into a unified trace and that acoustic information facilitates spoken word recollection. Copyright © 2014 Elsevier Ltd. All rights reserved.

  10. The Neuropsychology of Familiar Person Recognition from Face and Voice

    Directory of Open Access Journals (Sweden)

    Guido Gainotti

    2014-05-01

    Full Text Available Prosopagnosia has long been considered the most important and almost exclusive disorder in the recognition of familiar people. In recent years, however, this conviction has been undermined by the description of patients showing a concomitant defect in the recognition of familiar faces and voices as a consequence of lesions encroaching upon the right anterior temporal lobe (ATL). These new data have obliged researchers to reconsider, on the one hand, the construct of ‘associative prosopagnosia’ and, on the other hand, current models of people recognition. A systematic review of the patterns of familiar people recognition disorders observed in patients with right and left ATL lesions has shown that in patients with right ATL lesions face familiarity feelings and the retrieval of person-specific semantic information from faces are selectively affected, whereas in patients with left ATL lesions the defect selectively concerns famous people naming. Furthermore, some patients with right ATL lesions and intact face familiarity feelings show a greater defect in the retrieval of person-specific semantic knowledge from faces than from names. These data are at variance with current models assuming: (a) that familiarity feelings are generated at the level of person identity nodes (PINs), where information processed by various sensory modalities converges, and (b) that PINs provide a modality-free gateway to a single semantic system, where information about people is stored in an amodal format. They suggest, on the contrary: (a) that familiarity feelings are generated at the level of modality-specific recognition units; (b) that face and voice recognition units are represented more in the right than in the left ATL; (c) that the right ATL mainly stores person-specific information based on a convergence of perceptual information, whereas the left ATL represents verbally-mediated person-specific information.

  11. Understanding the mechanisms of familiar voice-identity recognition in the human brain.

    Science.gov (United States)

    Maguinness, Corrina; Roswandowitz, Claudia; von Kriegstein, Katharina

    2018-03-31

    Humans have a remarkable skill for voice-identity recognition: most of us can remember many voices that surround us as 'unique'. In this review, we explore the computational and neural mechanisms which may support our ability to represent and recognise a unique voice-identity. We examine the functional architecture of voice-sensitive regions in the superior temporal gyrus/sulcus, and bring together findings on how these regions may interact with each other, and additional face-sensitive regions, to support voice-identity processing. We also contrast findings from studies on neurotypicals and clinical populations which have examined the processing of familiar and unfamiliar voices. Taken together, the findings suggest that representations of familiar and unfamiliar voices might dissociate in the human brain. Such an observation does not fit well with current models for voice-identity processing, which by-and-large assume a common sequential analysis of the incoming voice signal, regardless of voice familiarity. We provide a revised audio-visual integrative model of voice-identity processing which brings together traditional and prototype models of identity processing. This revised model includes a mechanism of how voice-identity representations are established and provides a novel framework for understanding and examining the potential differences in familiar and unfamiliar voice processing in the human brain. Copyright © 2018 Elsevier Ltd. All rights reserved.

  12. Literature review of voice recognition and generation technology for Army helicopter applications

    Science.gov (United States)

    Christ, K. A.

    1984-08-01

    This report is a literature review on the topics of voice recognition and generation. Areas covered are: manual versus vocal data input, vocabulary, stress and workload, noise, protective masks, feedback, and voice warning systems. Results of the studies presented in this report indicate that voice data entry has less of an impact on a pilot's flight performance, during low-level flying and other difficult missions, than manual data entry. However, the stress resulting from such missions may cause the pilot's voice to change, reducing the recognition accuracy of the system. The noise present in helicopter cockpits also causes the recognition accuracy to decrease. Noise-cancelling devices are being developed and improved upon to increase the recognition performance in noisy environments. Future research in the fields of voice recognition and generation should be conducted in the areas of stress and workload, vocabulary, and the types of voice generation best suited for the helicopter cockpit. Also, specific tasks should be studied to determine whether voice recognition and generation can be effectively applied.

  13. Effect of voice recognition on radiologist reporting time

    International Nuclear Information System (INIS)

    Bhan, S.N.; Coblentz, C.L.; Norman, G.R.; Ali, S.H.

    2008-01-01

    To study the effect that voice recognition (VR) has on radiologist reporting efficiency in a clinical setting and to identify variables associated with faster reporting time. Five radiologists were observed during the routine reporting of 402 plain radiograph studies using either VR (n = 217) or conventional dictation (CD) (n = 185). Two radiologists were observed reporting 66 computed tomography (CT) studies using either VR (n = 39) or CD (n = 27). The time spent per reporting cycle, defined as the radiologist's time spent on a study from report finalization to the subsequent report finalization, was compared. As well, characteristics of the radiologists and their reporting styles were collected and correlated against reporting time. For plain radiographs, radiologists took 134% (P = 0.048) more time to produce reports using VR, but there was significant variability between radiologists. Significant associations with faster reporting times using VR included: English as a first language (r = -0.24), use of a template (r = -0.34), use of a headset microphone (r = -0.46), and increased experience with VR (r = -0.43). Experience as a staff radiologist and having a previous study for comparison did not correlate with reporting time. For CT, there was no significant difference in reporting time identified between VR and CD (P = 0.61). Overall, VR slightly decreases the reporting efficiency of radiologists. However, efficiency may be improved if English is a first language and a headset microphone, macros, and templates are used. (author)

  14. Superior voice recognition in a patient with acquired prosopagnosia and object agnosia.

    Science.gov (United States)

    Hoover, Adria E N; Démonet, Jean-François; Steeves, Jennifer K E

    2010-11-01

    Anecdotally, it has been reported that individuals with acquired prosopagnosia compensate for their inability to recognize faces by using other person identity cues such as hair, gait or the voice. Are they therefore superior at the use of non-face cues, specifically voices, to person identity? Here, we empirically measure person and object identity recognition in a patient with acquired prosopagnosia and object agnosia. We quantify person identity (face and voice) and object identity (car and horn) recognition for visual, auditory, and bimodal (visual and auditory) stimuli. The patient is unable to recognize faces or cars, consistent with his prosopagnosia and object agnosia, respectively. He is perfectly able to recognize people's voices, car horns, and bimodal stimuli. These data show a reverse shift in the typical weighting of visual over auditory information for audiovisual stimuli in a compromised visual recognition system. Moreover, the patient shows selectively superior voice recognition compared to the controls, revealing that two different stimulus domains, persons and objects, may not be equally affected by sensory adaptation effects. This also implies that person and object identity recognition are processed in separate pathways. These data demonstrate that an individual with acquired prosopagnosia and object agnosia can compensate for the visual impairment and become quite skilled at using spared aspects of sensory processing. In the case of acquired prosopagnosia it is advantageous to develop a superior use of voices for person identity recognition in everyday life. Copyright © 2010 Elsevier Ltd. All rights reserved.

  15. Voice recognition for radiology reporting: Is it good enough?

    International Nuclear Information System (INIS)

    Rana, D.S.; Hurst, G.; Shepstone, L.; Pilling, J.; Cockburn, J.; Crawford, M.

    2005-01-01

    AIM: To compare the efficiency and accuracy of radiology reports generated by voice recognition (VR) against the traditional tape dictation-transcription (DT) method. MATERIALS AND METHODS: Two hundred and twenty previously reported computed radiography (CR) and cross-sectional imaging (CSI) examinations were separately entered into the Radiology Information System (RIS) using both VR and DT. The times taken and errors found in the reports were compared using univariate analyses based upon the sign-test, and a general linear model was constructed to examine the mean differences between the two methods. RESULTS: There were significant reductions (p<0.001) in the mean difference in the reporting times using VR compared with DT for the two reporting methods assessed (CR, +67.4; CSI, +122.1 s). There was a significant increase in the mean difference in the actual radiologist times using VR compared with DT in the CSI reports; -14.3 s, p=0.037 (more experienced user); -13.7 s, p=0.014 (less experienced user). There were significantly more total and major errors when using VR compared with DT for CR reports (-0.25 and -0.26, respectively), and in total errors for CSI (-0.75, p<0.001), but no difference in major errors (-0.16, p=0.168). Although there were significantly more errors with VR in the less experienced group of users (mean difference in total errors -0.90, and major errors -0.40, p<0.001), there was no significant difference in the more experienced group (p=0.419 and p=0.814, respectively). CONCLUSIONS: VR is a viable reporting method for experienced users, with a quicker overall report production time (despite an increase in the radiologists' time) and a tendency toward more errors for inexperienced users.

  16. Evolving Spiking Neural Networks for Recognition of Aged Voices.

    Science.gov (United States)

    Silva, Marco; Vellasco, Marley M B R; Cataldo, Edson

    2017-01-01

    The aging of the voice, known as presbyphonia, is a natural process that can cause great change in vocal quality of the individual. This is a relevant problem to those people who use their voices professionally, and its early identification can help determine a suitable treatment to avoid its progress or even to eliminate the problem. This work focuses on the development of a new model for the identification of aging voices (independently of their chronological age), using as input attributes parameters extracted from the voice and glottal signals. The proposed model, named Quantum binary-real evolving Spiking Neural Network (QbrSNN), is based on spiking neural networks (SNNs), with an unsupervised training algorithm, and a Quantum-Inspired Evolutionary Algorithm that automatically determines the most relevant attributes and the optimal parameters that configure the SNN. The QbrSNN model was evaluated in a database composed of 120 records, containing samples from three groups of speakers. The results obtained indicate that the proposed model provides better accuracy than other approaches, with fewer input attributes. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  17. Motorcycle Start-stop System based on Intelligent Biometric Voice Recognition

    Science.gov (United States)

    Winda, A.; E Byan, W. R.; Sofyan; Armansyah; Zariantin, D. L.; Josep, B. G.

    2017-03-01

    The current mechanical key on a motorcycle is prone to burglary, being stolen, or misplaced. Intelligent biometric voice recognition is proposed as an alternative to replace this mechanism. The proposed system decides whether the voice belongs to the user or not, and whether the word uttered by the user is ‘On’ or ‘Off’. The decision is sent to an Arduino in order to start or stop the engine. The recorded voice is processed to obtain features which are later used as input to the proposed system. The Mel-Frequency Cepstral Coefficient (MFCC) is adopted as the feature extraction technique. The extracted features are then used as input to the SVM-based identifier. Experimental results confirm the effectiveness of the proposed intelligent voice recognition and word recognition system. They show that the proposed method produces good training and testing accuracy, 99.31% and 99.43%, respectively. Moreover, the proposed system shows a false rejection rate (FRR) and false acceptance rate (FAR) of 0.18% and 17.58%, respectively. For intelligent word recognition, the training and testing accuracy are 100% and 96.3%, respectively.
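The FRR and FAR figures quoted above are computed from verification trials in a standard way: FAR is the fraction of impostor trials accepted, and FRR is the fraction of genuine trials rejected, at a chosen decision threshold. A minimal sketch, with hypothetical matcher scores:

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """False acceptance rate and false rejection rate at a given threshold.

    A trial is accepted when its score is >= threshold, so:
      FAR = accepted impostor trials / all impostor trials
      FRR = rejected genuine trials / all genuine trials
    """
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    return far, frr

# Hypothetical similarity scores from verification trials.
genuine  = [0.91, 0.85, 0.40, 0.95, 0.88]  # user speaking to their own system
impostor = [0.30, 0.55, 0.20, 0.10]        # other speakers
far, frr = far_frr(genuine, impostor, threshold=0.5)
print(far, frr)  # → 0.25 0.2
```

Sweeping the threshold trades FAR against FRR; a security-oriented system like an ignition lock would normally pick a threshold favouring a low FAR.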

  18. Effects of emotional and perceptual-motor stress on a voice recognition system's accuracy: An applied investigation

    Science.gov (United States)

    Poock, G. K.; Martin, B. J.

    1984-02-01

    This was an applied investigation examining the ability of a speech recognition system to recognize speakers' inputs when the speakers were under different stress levels. Subjects were asked to speak to a voice recognition system under three conditions: (1) normal office environment, (2) emotional stress, and (3) perceptual-motor stress. Results indicate a definite relationship between voice recognition system performance and the type of low stress reference patterns used to achieve recognition.

  19. Methods and Software Architecture for Activity Recognition from Position Data

    DEFF Research Database (Denmark)

    Godsk, Torben

    This thesis describes my studies on the subject of recognizing cow activities from satellite based position data. The studies comprise methods and software architecture for activity recognition from position data, applied to cow activity recognition. The development of methods and software....... The results of these calculations are applied to a given standard machine learning algorithm, and the activity, performed by the cow as the measurements were recorded, is recognized. The software architecture integrates these methods and ensures flexible activity recognition. For instance, it is flexible...... in relation to the use of different sensors modalities and/or within different domains. In addition, the methods and their integration with the software architecture ensures both robust and accurate activity recognition. Utilized, it enables me to classify the five activities robustly and with high success...

  20. Pengoperasian Beban Listrik Fase Tunggal Terkendali Melalui Minimum System Berbasis Mikrokontroler Dan Sensor Voice Recognition (Vr)

    OpenAIRE

    Goeritno, Arief; Ginting, Sandy Ferdiansyah; Yatim, Rakhmad

    2017-01-01

    A microcontroller-based minimum system with a voice recognition (VR) sensor as the actuator controller has been used to operate a single-phase electrical load. The minimum system is assembled in 2 (two) stages, namely (a) the circuit diagram and the physical form of the board, and (b) the integrated wiring of the minimum system on the ATmega16 microcontroller system. The microcontroller system in the minimum system requires an embedded program, created through programming based on the ... language

  1. Impact of a voice recognition system on report cycle time and radiologist reading time

    Science.gov (United States)

    Melson, David L.; Brophy, Robert; Blaine, G. James; Jost, R. Gilbert; Brink, Gary S.

    1998-07-01

    Because of its exciting potential to improve clinical service, as well as reduce costs, a voice recognition system for radiological dictation was recently installed at our institution. This system will be clinically successful if it dramatically reduces radiology report turnaround time without substantially affecting radiologist dictation and editing time. This report summarizes an observer study currently under way in which radiologist reporting times using the traditional transcription system and the voice recognition system are compared. Four radiologists are observed interpreting portable intensive care unit (ICU) chest examinations at a workstation in the chest reading area. Data are recorded with the radiologists using the transcription system and using the voice recognition system. The measurements distinguish between time spent performing clerical tasks and time spent actually dictating the report. Editing time and the number of corrections made are recorded. Additionally, statistics are gathered to assess the voice recognition system's impact on the report cycle time -- the time from report dictation to availability of an edited and finalized report -- and the length of reports.

  2. The recognition of female voice based on voice registers in singing techniques in real-time using hankel transform method and macdonald function

    Science.gov (United States)

    Meiyanti, R.; Subandi, A.; Fuqara, N.; Budiman, M. A.; Siahaan, A. P. U.

    2018-03-01

    A singer doesn’t just recite the lyrics of a song, but also with the use of particular sound techniques to make it more beautiful. In the singing technique, more female have a diverse sound registers than male. There are so many registers of the human voice, but the voice registers used while singing, among others, Chest Voice, Head Voice, Falsetto, and Vocal fry. Research of speech recognition based on the female’s voice registers in singing technique is built using Borland Delphi 7.0. Speech recognition process performed by the input recorded voice samples and also in real time. Voice input will result in weight energy values based on calculations using Hankel Transformation method and Macdonald Functions. The results showed that the accuracy of the system depends on the accuracy of sound engineering that trained and tested, and obtained an average percentage of the successful introduction of the voice registers record reached 48.75 percent, while the average percentage of the successful introduction of the voice registers in real time to reach 57 percent.

  3. Artificially intelligent recognition of Arabic speaker using voice print-based local features

    Science.gov (United States)

    Mahmood, Awais; Alsulaiman, Mansour; Muhammad, Ghulam; Akram, Sheeraz

    2016-11-01

    Local features for any pattern recognition system are based on information extracted locally. In this paper, a local feature extraction technique was developed. The feature was extracted in the time-frequency plane by taking the moving average along the diagonal directions of the time-frequency plane. This feature captured the time-frequency events, producing a unique pattern for each speaker that can be viewed as a voice print of the speaker. Hence, we refer to this technique as a voice print-based local feature. The proposed feature was compared to other features, including the mel-frequency cepstral coefficient (MFCC), for speaker recognition using two different databases. One of the databases used in the comparison is a subset of an LDC database that consists of two short sentences uttered by 182 speakers. The proposed feature attained a 98.35% recognition rate compared to 96.7% for MFCC using the LDC subset.
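The diagonal moving average that produces the "voice print" feature can be sketched as follows. This is an illustrative reading of the description, not the authors' code: it averages along the main-diagonal direction of a small time-frequency matrix, clipping the window at the matrix edges; the real method also covers the other diagonal direction and operates on full spectrograms.

```python
def diagonal_moving_average(tf, win=3):
    """Moving average along the main-diagonal direction of a 2-D
    time-frequency matrix `tf` (rows: frequency bins, cols: time frames).

    For each cell (i, j), the values tf[i+k][j+k] falling inside the
    window are averaged, with the window clipped at the matrix edges.
    """
    rows, cols = len(tf), len(tf[0])
    half = win // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            vals = [tf[i + k][j + k]
                    for k in range(-half, half + 1)
                    if 0 <= i + k < rows and 0 <= j + k < cols]
            out[i][j] = sum(vals) / len(vals)
    return out

# Tiny hypothetical time-frequency matrix.
tf = [[1.0, 2.0, 3.0],
      [4.0, 5.0, 6.0],
      [7.0, 8.0, 9.0]]
smoothed = diagonal_moving_average(tf)
print(smoothed[1][1])  # → 5.0  (average of 1, 5, 9)
```

The smoothed matrix emphasizes energy trajectories that move jointly in time and frequency, which is what makes the pattern speaker specific.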

  4. Voice Activity Detection. Fundamentals and Speech Recognition System Robustness

    OpenAIRE

    Ramirez, J.; Gorriz, J. M.; Segura, J. C.

    2007-01-01

    This chapter presents an overview of the main challenges in robust speech detection and a review of the state of the art and applications. VADs are frequently used in a number of applications including speech coding, speech enhancement, and speech recognition. A precise VAD extracts a set of discriminative speech features from the noisy speech and formulates the decision in terms of a well-defined rule. The chapter summarizes three robust VAD methods that yield high speech/non-speech discri...
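The decision-rule idea (extract a discriminative feature from the noisy signal, then apply a threshold) can be illustrated with the simplest possible VAD, a short-time energy detector. This is far cruder than the statistical VADs the chapter reviews; the frame length and threshold below are arbitrary choices for the synthetic example.

```python
import math

def short_time_energy(signal, frame_len):
    """Per-frame energy of a sampled signal (non-overlapping frames)."""
    return [sum(x * x for x in signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def energy_vad(signal, frame_len=80, threshold=0.5):
    """Mark each frame as speech (True) when its energy exceeds threshold."""
    return [e > threshold for e in short_time_energy(signal, frame_len)]

# Synthetic input: 2 frames of near-silence followed by 2 frames of a
# loud 440 Hz tone sampled at 8 kHz, standing in for speech activity.
silence = [0.001] * 160
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(160)]
decisions = energy_vad(silence + tone, frame_len=80, threshold=0.5)
print(decisions)  # → [False, False, True, True]
```

Real VADs replace raw energy with noise-robust features (e.g. spectral or statistical measures) and adapt the threshold to the estimated noise floor, since fixed-threshold energy detectors fail at low SNR.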

  5. Voice recognition through phonetic features with Punjabi utterances

    Science.gov (United States)

    Kaur, Jasdeep; Juglan, K. C.; Sharma, Vishal; Upadhyay, R. K.

    2017-07-01

    This paper deals with the perception and disorders of speech in view of the Punjabi language. Given the importance of voice identification, various parameters of speaker identification have been studied. The speech material was recorded with a tape recorder in normal and disguised modes of utterance. Out of the recorded speech materials, the utterances free from noise, etc., were selected for auditory and acoustic spectrographic analysis. The comparison of the normal and disguised speech of seven subjects is reported. The fundamental frequency (F0) at similar places, plosive duration at certain phonemes, amplitude ratio (A1:A2), etc., were compared in normal and disguised speech. It was found that the formant frequency of normal and disguised speech remains almost the same only if it is compared at the position of the same vowel quality and quantity. If the vowel is more closed or more open in the disguised utterance, the formant frequency will change in comparison to the normal utterance. The amplitude ratio (A1:A2) is found to be speaker dependent and remains unchanged in the disguised utterance. However, this value may shift in the disguised utterance if cross-sectioning is not done at the same location.

  6. Speech Recognition of Aged Voices in the AAL Context: Detection of Distress Sentences

    OpenAIRE

    Aman , Frédéric; Vacher , Michel; Rossato , Solange; Portet , François

    2013-01-01

    By 2050, about a third of the French population will be over 65. In the context of developing technologies aimed at helping aged people live independently at home, the CIRDO project aims at implementing an ASR system into a social inclusion product designed for elderly people in order to detect distress situations. Speech recognition systems present higher word error rates when speech is uttered by elderly speakers than when non-aged voices are considered. Two...

  7. Evaluating a voice recognition system: finding the right product for your department.

    Science.gov (United States)

    Freeh, M; Dewey, M; Brigham, L

    2001-06-01

    The Department of Radiology at the University of Utah Health Sciences Center has been in the process of transitioning from the traditional film-based department to a digital imaging department for the past 2 years. The department is now transitioning from the traditional method of dictating reports (dictation by radiologist to transcription to review and signing by radiologist) to a voice recognition system. The transition to digital operations will not be complete until we have the ability to directly interface the dictation process with the image review process. Voice recognition technology has advanced to the level where it can and should be an integral part of the new way of working in radiology and is an integral part of an efficient digital imaging department. The transition to voice recognition requires the task of identifying the product and the company that will best meet a department's needs. This report introduces the methods we used to evaluate the vendors and the products available as we made our purchasing decision. We discuss our evaluation method and provide a checklist that can be used by other departments to assist with their evaluation process. The criteria used in the evaluation process fall into the following major categories: user operations, technical infrastructure, medical dictionary, system interfaces, service support, cost, and company strength. Conclusions drawn from our evaluation process will be detailed, with the intention being to shorten the process for others as they embark on a similar venture. As more and more organizations investigate the many products and services that are now being offered to enhance the operations of a radiology department, it becomes increasingly important that solid methods are used to most effectively evaluate the new products. This report should help others complete the task of evaluating a voice recognition system and may be adaptable to other products as well.

  8. The Pandora software development kit for pattern recognition

    Energy Technology Data Exchange (ETDEWEB)

    Marshall, J.S.; Thomson, M.A. [University of Cambridge, Cavendish Laboratory, Cambridge (United Kingdom)

    2015-09-15

    The development of automated solutions to pattern recognition problems is important in many areas of scientific research and human endeavour. This paper describes the implementation of the Pandora software development kit, which aids the process of designing, implementing and running pattern recognition algorithms. The Pandora Application Programming Interfaces ensure simple specification of the building-blocks defining a pattern recognition problem. The logic required to solve the problem is implemented in algorithms. The algorithms request operations to create or modify data structures and the operations are performed by the Pandora framework. This design promotes an approach using many decoupled algorithms, each addressing specific topologies. Details of algorithms addressing two pattern recognition problems in High Energy Physics are presented: reconstruction of events at a high-energy e⁺e⁻ linear collider and reconstruction of cosmic ray or neutrino events in a liquid argon time projection chamber. (orig.)

  9. Actuator prototype system by voice commands using free software

    Directory of Open Access Journals (Sweden)

    Jaime Andrango

    2016-06-01

    Full Text Available This prototype system is a software application that, using digital signal processing techniques, extracts information from the user's speech, which is then used to switch an on/off actuator on a computer peripheral when vowels are pronounced. The method applies spectral differences. The application uses the parallel port as the actuator interface, writing the control information to memory address 378H. This prototype was developed using free software tools for their versatility and dynamism, and to allow other researchers to build on it in further studies.

  10. Behavioral biometrics for verification and recognition of malicious software agents

    Science.gov (United States)

    Yampolskiy, Roman V.; Govindaraju, Venu

    2008-04-01

    Homeland security requires technologies capable of positive and reliable identification of humans for law enforcement, government, and commercial applications. As artificially intelligent agents improve in their abilities and become a part of our everyday life, the possibility of using such programs for undermining homeland security increases. Virtual assistants, shopping bots, and game playing programs are used daily by millions of people. We propose applying statistical behavior modeling techniques developed by us for recognition of humans to the identification and verification of intelligent and potentially malicious software agents. Our experimental results demonstrate feasibility of such methods for both artificial agent verification and even for recognition purposes.

  11. Proactiveness in entrepreneurial software firms: the executives' voice

    Directory of Open Access Journals (Sweden)

    Jean-Pierre Boissin

    2010-12-01

    Full Text Available This article approaches proactiveness in firms, considered one of the dimensions of entrepreneurial orientation. Its goal is to present the results of an exploratory, qualitative study that aimed to characterize proactiveness in entrepreneurial software firms. The theoretical framework reviews the concepts of entrepreneurial firms, entrepreneurial orientation and proactiveness. Data were gathered through in-depth interviews with executives from 13 software firms that stand out in terms of entrepreneurship in Rio Grande do Sul state. The results demonstrate that the firms are proactive and provide a characterization of this behavior, starting from the conceptual base adopted in the present study. Among the proactiveness elements in the researched organizations, those related to environment monitoring and the quest for opportunities are highlighted. The study also consolidates a set of proactiveness components based on the theory and on the organizational practice reported by the executives.

  12. Emotion Recognition From Singing Voices Using Contemporary Commercial Music and Classical Styles.

    Science.gov (United States)

    Hakanpää, Tua; Waaramaa, Teija; Laukkanen, Anne-Maria

    2018-02-22

    This study examines the recognition of emotion in contemporary commercial music (CCM) and classical styles of singing. This information may be useful in improving the training of interpretation in singing. This is an experimental comparative study. Thirteen singers (11 female, 2 male) with a minimum of 3 years' professional-level singing studies (in CCM or classical technique or both) participated. They sang at three pitches (females: a, e1, a1, males: one octave lower) expressing anger, sadness, joy, tenderness, and a neutral state. Twenty-nine listeners listened to 312 short (0.63- to 4.8-second) voice samples, 135 of which were sung using a classical singing technique and 165 of which were sung in a CCM style. The listeners were asked which emotion they heard. Activity and valence were derived from the chosen emotions. The percentage of correct recognitions out of all the answers in the listening test (N = 9048) was 30.2%. The recognition percentage for the CCM-style singing technique was higher (34.5%) than for the classical-style technique (24.5%). Valence and activation were better perceived than the emotions themselves, and activity was better recognized than valence. A higher pitch was more likely to be perceived as joy or anger, and a lower pitch as sorrow. Both valence and activation were better recognized in the female CCM samples than in the other samples. There are statistically significant differences in the recognition of emotions between classical and CCM styles of singing. Furthermore, in the singing voice, pitch affects the perception of emotions, and valence and activity are more easily recognized than emotions. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  13. Cultural in-group advantage: emotion recognition in African American and European American faces and voices.

    Science.gov (United States)

    Wickline, Virginia B; Bailey, Wendy; Nowicki, Stephen

    2009-03-01

    The authors explored whether there were in-group advantages in emotion recognition of faces and voices by culture or geographic region. Participants were 72 African American students (33 men, 39 women), 102 European American students (30 men, 72 women), 30 African international students (16 men, 14 women), and 30 European international students (15 men, 15 women). The participants determined emotions in African American and European American faces and voices. Results showed an in-group advantage, sometimes by culture, less often by race, in recognizing facial and vocal emotional expressions. African international students were generally less accurate at interpreting American nonverbal stimuli than were European American, African American, and European international peers. Results suggest that, although partly universal, emotional expressions have subtle differences across cultures that persons must learn.

  14. Software for roof defects recognition on aerial photographs

    Science.gov (United States)

    Yudin, D.; Naumov, A.; Dolzhenko, A.; Patrakova, E.

    2018-05-01

    The article presents information on software for recognizing roof defects on aerial photographs taken by air drones. An aerial image segmentation mechanism is described. It detects roof defects: unevenness that causes water stagnation after rain. It is shown that the HSV-transformation approach allows quick detection of stagnation areas and of their sizes and perimeters, but is sensitive to shadows and to changes of roofing type. A Deep Fully Convolutional Network software solution eliminates this drawback. The test data set consists of roofing photos with defects and binary masks for them. The FCN approach gave acceptable image segmentation results as measured by the average Dice coefficient. This software can be used to automate the inspection of roof conditions in the production sector and in housing and utilities infrastructure.
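As a rough illustration of the HSV-based step, the sketch below flags dark (low-V) pixels as candidate stagnation areas. The threshold value is hypothetical, and its fragility mirrors the shadow and roofing-type sensitivity noted in the abstract.

```python
import colorsys
import numpy as np

def stagnation_mask(rgb, v_max=0.35):
    """Flag pixels whose HSV value (brightness) falls below v_max.

    rgb: H x W x 3 array of floats in [0, 1]. The threshold is purely
    illustrative; real roofs need per-image tuning, which is exactly the
    shadow sensitivity the article describes."""
    h, w, _ = rgb.shape
    mask = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            _, _, v = colorsys.rgb_to_hsv(*rgb[y, x])
            mask[y, x] = v < v_max
    return mask

# Toy image: bright roof with a dark 2x2 "puddle".
img = np.ones((4, 4, 3))
img[1:3, 1:3] = 0.1
mask = stagnation_mask(img)
print(int(mask.sum()))  # area of the detected region in pixels -> 4
```

From the boolean mask, region sizes and perimeters follow from connected-component labelling, which is the part an FCN replaces end-to-end.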

  15. A memory like a female Fur Seal: long-lasting recognition of pup's voice by mothers.

    Science.gov (United States)

    Mathevon, Nicolas; Charrier, Isabelle; Aubin, Thierry

    2004-06-01

    In colonial mammals like fur seals, mutual vocal recognition between mothers and their pup is of primary importance for breeding success. Females alternate feeding sea-trips with suckling periods on land, and when coming back from the ocean, they have to vocally find their offspring among numerous similar-looking pups. Young fur seals emit a 'mother-attraction call' that presents individual characteristics. In this paper, we review the perceptual process of pup's call recognition by Subantarctic Fur Seal Arctocephalus tropicalis mothers. To identify their progeny, females rely on the frequency modulation pattern and spectral features of this call. As the acoustic characteristics of a pup's call change throughout the lactation period due to the growing process, mothers have thus to refine their memorization of their pup's voice. Field experiments show that female Fur Seals are able to retain all the successive versions of their pup's call.

  16. Analysis And Voice Recognition In Indonesian Language Using MFCC And SVM Method

    Directory of Open Access Journals (Sweden)

    Harvianto Harvianto

    2016-06-01

    Full Text Available Voice recognition technology is one of the biometric technologies. The voice is a unique part of the human being that makes an individual easily distinguishable from another. Voice can also provide information such as the gender, emotion, and identity of the speaker. This research recorded human voices pronouncing the digits 0 to 9, with and without noise. Features of these sound recordings were extracted using Mel Frequency Cepstral Coefficients (MFCC). The mean, standard deviation, max, min, and combinations of them were used to construct the feature vectors. These feature vectors were then classified using a Support Vector Machine (SVM). There were two classification models: the first based on the speaker, and the other based on the digit pronounced. The classification models were then validated by performing 10-fold cross-validation. The best average accuracy of the two classification models was 91.83%. This result was achieved using Mean + Standard deviation + Min + Max as features.
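The feature-vector construction this abstract describes (per-coefficient mean, standard deviation, min and max over time, concatenated) can be sketched as below. Computing the MFCCs themselves and training the SVM are assumed to be handled by a DSP/ML library and are not shown.

```python
import numpy as np

def mfcc_stats(mfcc):
    """Collapse a (frames x coefficients) MFCC matrix into the fixed-length
    feature vector used in the study: mean, standard deviation, min and max
    of each coefficient over time, concatenated."""
    return np.concatenate([mfcc.mean(axis=0), mfcc.std(axis=0),
                           mfcc.min(axis=0), mfcc.max(axis=0)])

# Toy "MFCC matrix": 100 frames, 13 coefficients (values are random here;
# a real pipeline would compute them from the recorded digit utterances).
mfcc = np.random.default_rng(0).normal(size=(100, 13))
vec = mfcc_stats(mfcc)
print(vec.shape)  # (52,) -- 13 coefficients x 4 statistics
```

A fixed-length vector like this is what makes a variable-length utterance usable as a single SVM input.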

  17. Suggestions for Layout and Functional Behavior of Software-Based Voice Switch Keysets

    Science.gov (United States)

    Scott, David W.

    2010-01-01

    Marshall Space Flight Center (MSFC) provides communication services for a number of real-time environments, including Space Shuttle Propulsion support and International Space Station (ISS) payload operations. In such settings, control team members speak with each other via multiple voice circuits or loops. Each loop has a particular purpose and constituency, and users are assigned listen and/or talk capabilities for a given loop based on their role in fulfilling that purpose. A voice switch is a given facility's hardware and software that supports such communication, and may be interconnected with other facilities' switches to create a large network that, from an end-user perspective, acts like a single system. Since users typically monitor and/or respond to several voice loops concurrently for hours on end, and real-time operations can be very dynamic and intense, it's vital that a control panel or keyset for interfacing with the voice switch be a servant that reduces stress, not a master that adds it. Implementing the visual interface on a computer screen provides tremendous flexibility and configurability, but there's a very real risk of overcomplication. (Remember how office automation made life easier, which led to a deluge of documents that made life harder?) This paper a) discusses some basic human factors considerations related to keysets implemented as application software windows, b) suggests what to standardize at the facility level and what to leave to the user's preference, and c) provides screen-shot mockups for a robust but reasonably simple user experience. The concepts apply to keyset needs in almost any type of operations control or support center.

  18. It doesn't matter what you say: FMRI correlates of voice learning and recognition independent of speech content.

    Science.gov (United States)

    Zäske, Romi; Awwad Shiekh Hasan, Bashar; Belin, Pascal

    2017-09-01

    Listeners can recognize newly learned voices from previously unheard utterances, suggesting the acquisition of high-level speech-invariant voice representations during learning. Using functional magnetic resonance imaging (fMRI) we investigated the anatomical basis underlying the acquisition of voice representations for unfamiliar speakers independent of speech, and their subsequent recognition among novel voices. Specifically, listeners studied voices of unfamiliar speakers uttering short sentences and subsequently classified studied and novel voices as "old" or "new" in a recognition test. To investigate "pure" voice learning, i.e., independent of sentence meaning, we presented German sentence stimuli to non-German speaking listeners. To disentangle stimulus-invariant and stimulus-dependent learning, during the test phase we contrasted a "same sentence" condition in which listeners heard speakers repeating the sentences from the preceding study phase, with a "different sentence" condition. Voice recognition performance was above chance in both conditions although, as expected, performance was higher for same than for different sentences. During study phases activity in the left inferior frontal gyrus (IFG) was related to subsequent voice recognition performance and same versus different sentence condition, suggesting an involvement of the left IFG in the interactive processing of speaker and speech information during learning. Importantly, at test reduced activation for voices correctly classified as "old" compared to "new" emerged in a network of brain areas including temporal voice areas (TVAs) of the right posterior superior temporal gyrus (pSTG), as well as the right inferior/middle frontal gyrus (IFG/MFG), the right medial frontal gyrus, and the left caudate. This effect of voice novelty did not interact with sentence condition, suggesting a role of temporal voice-selective areas and extra-temporal areas in the explicit recognition of learned voice identity.

  19. Emotionally conditioning the target-speech voice enhances recognition of the target speech under "cocktail-party" listening conditions.

    Science.gov (United States)

    Lu, Lingxi; Bao, Xiaohan; Chen, Jing; Qu, Tianshu; Wu, Xihong; Li, Liang

    2018-05-01

    Under a noisy "cocktail-party" listening condition with multiple people talking, listeners can use various perceptual/cognitive unmasking cues to improve recognition of the target speech against informational speech-on-speech masking. One potential unmasking cue is the emotion expressed in a speech voice, by means of certain acoustical features. However, it was unclear whether emotionally conditioning a target-speech voice that has none of the typical acoustical features of emotions (i.e., an emotionally neutral voice) can be used by listeners for enhancing target-speech recognition under speech-on-speech masking conditions. In this study we examined the recognition of target speech against a two-talker speech masker both before and after the emotionally neutral target voice was paired with a loud female screaming sound that has a marked negative emotional valence. The results showed that recognition of the target speech (especially the first keyword in a target sentence) was significantly improved by emotionally conditioning the target speaker's voice. Moreover, the emotional unmasking effect was independent of the unmasking effect of the perceived spatial separation between the target speech and the masker. Also, (skin conductance) electrodermal responses became stronger after emotional learning when the target speech and masker were perceptually co-located, suggesting an increase of listening efforts when the target speech was informationally masked. These results indicate that emotionally conditioning the target speaker's voice does not change the acoustical parameters of the target-speech stimuli, but the emotionally conditioned vocal features can be used as cues for unmasking target speech.

  20. Educational Pedagogy Explored: Attachment, Voice, and Students’ Limited Recognition of the Purpose of Writing

    Directory of Open Access Journals (Sweden)

    Rebecca A. Fairchild

    2013-07-01

    Full Text Available The following teacher-research case study involved an exploration of educational pedagogy by working with a freshman composition student at a university. All data for the study were gathered during the 2013 spring semester. The study was driven by an inquiry-based approach in which the researcher determined the center of focus that arose from an exploration of the student as a writer through a survey, a classroom observation, multiple one-on-one meetings, and email conversations. The focus area that arose was the student's limited recognition that writing was done solely for school purposes. Related puzzlements stemming from this focus area included the student's lack of attachment and lack of voice in her writing. The conclusive data provided insights into how to educate students in future classrooms regarding how vital it is for students to be able to attach themselves to their work.

  1. Interfacing COTS Speech Recognition and Synthesis Software to a Lotus Notes Military Command and Control Database

    Science.gov (United States)

    Carr, Oliver

    2002-10-01

    Speech recognition and synthesis technologies have become commercially viable over recent years. Two current market leading products in speech recognition technology are Dragon NaturallySpeaking and IBM ViaVoice. This report describes the development of speech user interfaces incorporating these products with Lotus Notes and Java applications. These interfaces enable data entry using speech recognition and allow warnings and instructions to be issued via speech synthesis. The development of a military vocabulary to improve user interaction is discussed. The report also describes an evaluation in terms of speed of the various speech user interfaces developed using Dragon NaturallySpeaking and IBM ViaVoice with a Lotus Notes Command and Control Support System Log database.

  2. Voice recognition versus transcriptionist: error rates and productivity in MRI reporting.

    Science.gov (United States)

    Strahan, Rodney H; Schneider-Kolsky, Michal E

    2010-10-01

    Despite the frequent introduction of voice recognition (VR) into radiology departments, little evidence still exists about its impact on workflow, error rates and costs. We designed a study to compare typographical errors, turnaround times (TAT) from reported to verified and productivity for VR-generated reports versus transcriptionist-generated reports in MRI. Fifty MRI reports generated by VR and 50 finalized MRI reports generated by the transcriptionist, of two radiologists, were sampled retrospectively. Two hundred reports were scrutinised for typographical errors and the average TAT from dictated to final approval. To assess productivity, the average MRI reports per hour for one of the radiologists was calculated using data from extra weekend reporting sessions. Forty-two % and 30% of the finalized VR reports for each of the radiologists investigated contained errors. Only 6% and 8% of the transcriptionist-generated reports contained errors. The average TAT for VR was 0 h, and for the transcriptionist reports TAT was 89 and 38.9 h. Productivity was calculated at 8.6 MRI reports per hour using VR and 13.3 MRI reports using the transcriptionist, representing a 55% increase in productivity. Our results demonstrate that VR is not an effective method of generating reports for MRI. Ideally, we would have the report error rate and productivity of a transcriptionist and the TAT of VR. © 2010 The Authors. Journal of Medical Imaging and Radiation Oncology © 2010 The Royal Australian and New Zealand College of Radiologists.
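For reference, the reported 55% figure follows directly from the two reports-per-hour values in the abstract:

```python
def pct_increase(old, new):
    """Relative productivity gain, as reported in the study."""
    return 100 * (new - old) / old

vr_reports_per_hour = 8.6
transcriptionist_reports_per_hour = 13.3
gain = pct_increase(vr_reports_per_hour, transcriptionist_reports_per_hour)
print(round(gain))  # -> 55: % more reports per hour with the transcriptionist
```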

  3. Voice recognition versus transcriptionist: error rates and productivity in MRI reporting

    International Nuclear Information System (INIS)

    Strahan, Rodney H.; Schneider-Kolsky, Michal E.

    2010-01-01

    Full text: Purpose: Despite the frequent introduction of voice recognition (VR) into radiology departments, little evidence still exists about its impact on workflow, error rates and costs. We designed a study to compare typographical errors, turnaround times (TAT) from reported to verified and productivity for VR-generated reports versus transcriptionist-generated reports in MRI. Methods: Fifty MRI reports generated by VR and 50 finalised MRI reports generated by the transcriptionist, of two radiologists, were sampled retrospectively. Two hundred reports were scrutinised for typographical errors and the average TAT from dictated to final approval. To assess productivity, the average MRI reports per hour for one of the radiologists was calculated using data from extra weekend reporting sessions. Results: Forty-two % and 30% of the finalised VR reports for each of the radiologists investigated contained errors. Only 6% and 8% of the transcriptionist-generated reports contained errors. The average TAT for VR was 0 h, and for the transcriptionist reports TAT was 89 and 38.9 h. Productivity was calculated at 8.6 MRI reports per hour using VR and 13.3 MRI reports using the transcriptionist, representing a 55% increase in productivity. Conclusion: Our results demonstrate that VR is not an effective method of generating reports for MRI. Ideally, we would have the report error rate and productivity of a transcriptionist and the TAT of VR.

  4. Automatic Speech Recognition Systems for the Evaluation of Voice and Speech Disorders in Head and Neck Cancer

    Directory of Open Access Journals (Sweden)

    Andreas Maier

    2010-01-01

    Full Text Available In patients suffering from head and neck cancer, speech intelligibility is often restricted. For assessment and outcome measurements, automatic speech recognition systems have previously been shown to be appropriate for objective and quick evaluation of intelligibility. In this study we investigate the applicability of the method to speech disorders caused by head and neck cancer. Intelligibility was quantified by speech recognition on recordings of a standard text read by 41 German laryngectomized patients with cancer of the larynx or hypopharynx and 49 German patients who had suffered from oral cancer. The speech recognition provides the percentage of correctly recognized words of a sequence, that is, the word recognition rate. Automatic evaluation was compared to perceptual ratings by a panel of experts and to an age-matched control group. Both patient groups showed significantly lower word recognition rates than the control group. Automatic speech recognition yielded word recognition rates which complied with experts' evaluation of intelligibility on a significant level. Automatic speech recognition serves as a good means with low effort to objectify and quantify the most important aspect of pathologic speech—the intelligibility. The system was successfully applied to voice and speech disorders.
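The word recognition rate is simply the percentage of words in the read text that the recognizer gets right. A simplified, alignment-based sketch is shown below; the study's actual recognizer and scoring procedure are not reproduced here.

```python
from difflib import SequenceMatcher

def word_recognition_rate(reference, hypothesis):
    """Share of reference words matched by the recognizer output, found by
    sequence alignment (a simplified stand-in for the WRR used in the study)."""
    ref, hyp = reference.split(), hypothesis.split()
    blocks = SequenceMatcher(None, ref, hyp).get_matching_blocks()
    matched = sum(b.size for b in blocks)  # words aligned in order
    return 100 * matched / len(ref)

# One substituted word out of four -> 75% recognized.
print(word_recognition_rate("the quick brown fox", "the quick brown box"))  # 75.0
```

Lower intelligibility in pathologic speech shows up directly as a lower rate, which is what lets this single number track the experts' perceptual ratings.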

  5. A self-teaching image processing and voice-recognition-based, intelligent and interactive system to educate visually impaired children

    Science.gov (United States)

    Iqbal, Asim; Farooq, Umar; Mahmood, Hassan; Asad, Muhammad Usman; Khan, Akrama; Atiq, Hafiz Muhammad

    2010-02-01

    A self-teaching image processing and voice recognition based system is developed to educate visually impaired children, chiefly in their primary education. The system comprises a computer, a vision camera, an ear speaker and a microphone. The camera, attached to the computer system, is mounted on the ceiling opposite (at the required angle) to the desk on which the book is placed. Sample images and voices, in the form of instructions and commands for English and Urdu alphabets, numeric digits, operators and shapes, are already stored in the database. A blind child first reads the embossed character (object) with his or her fingers, then speaks the answer (the name of the character, shape, etc.) into the microphone. On the voice command of the blind child received by the microphone, an image is taken by the camera and processed by a MATLAB® program developed with the Image Acquisition and Image Processing toolboxes, which generates a response or the required set of instructions to the child via the ear speaker, resulting in the self-education of a visually impaired child. A speech recognition program was also developed in MATLAB® with the Data Acquisition and Signal Processing toolboxes; it records and processes the commands of the blind child.

  6. A memory like a female Fur Seal: long-lasting recognition of pup's voice by mothers

    Directory of Open Access Journals (Sweden)

    Nicolas Mathevon

    2004-06-01

    Full Text Available In colonial mammals like fur seals, mutual vocal recognition between mothers and their pup is of primary importance for breeding success. Females alternate feeding sea-trips with suckling periods on land, and when coming back from the ocean, they have to vocally find their offspring among numerous similar-looking pups. Young fur seals emit a 'mother-attraction call' that presents individual characteristics. In this paper, we review the perceptual process of pup's call recognition by Subantarctic Fur Seal Arctocephalus tropicalis mothers. To identify their progeny, females rely on the frequency modulation pattern and spectral features of this call. As the acoustic characteristics of a pup's call change throughout the lactation period due to the growing process, mothers have thus to refine their memorization of their pup's voice. Field experiments show that female Fur Seals are able to retain all the successive versions of their pup's call.

  7. Malavefes: A computational voice-enabled malaria fuzzy informatics software for correct dosage prescription of anti-malarial drugs

    Directory of Open Access Journals (Sweden)

    Olugbenga O. Oluwagbemi

    2018-04-01

    Full Text Available Malaria is one of the infectious diseases consistently endemic in many Sub-Saharan African countries. Among the issues of concern are the consequences of wrong diagnosis and wrong dosage administration of anti-malarial drugs to sick patients; these have resulted in various degrees of complications ranging from severe headaches, stomach and body discomfort, blurred vision, dizziness and hallucinations to, in extreme cases, death. Many expert systems have been developed to support the diagnosis of different infectious diseases, but we are not aware of any yet that has been specifically designed as a voice-based application to diagnose and translate malaria patients' symptomatic data for pre-laboratory screening and correct prescription of the proper dosage of the appropriate medication. We developed Malavefes, a malaria voice-enabled computational fuzzy expert system for correct dosage prescription of anti-malarial drugs, using the Visual Basic .NET and Java programming languages. Data collation for this research was conducted through a survey of the existing literature and interviews with public health experts. The database for this malaria drug informatics system was implemented using Microsoft Access. The Root Sum Square (RSS) method was implemented as the inference engine of Malavefes to make inferences from rules, while Centre of Gravity (CoG) was implemented as the defuzzification engine. The drug recommendation module is voice-enabled. Additional anti-malaria drug expiration validation software was developed using the Java programming language. We conducted a user evaluation of the performance and user experience of the Malavefes software. Keywords: Informatics, Bioinformatics, Fuzzy, Anti-malaria, Voice computing, Dosage prescription
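The two numeric steps named in this record, Root Sum Square aggregation of rule strengths and Centre of Gravity defuzzification, can be sketched as follows. The membership functions and dosage scale below are purely illustrative, not Malavefes' own.

```python
import numpy as np

def rss(strengths):
    """Root Sum Square: combine firing strengths of rules sharing one output term."""
    return np.sqrt(np.sum(np.square(strengths)))

def centre_of_gravity(x, mu):
    """Crisp output: universe weighted by the aggregated membership function."""
    return np.sum(x * mu) / np.sum(mu)

x = np.linspace(0, 10, 101)            # hypothetical dosage scale
low = np.clip(1 - x / 5, 0, None)      # triangular 'low dose' output term
high = np.clip((x - 5) / 5, 0, None)   # ramp 'high dose' output term

# Two rules fire for 'low dose' (strengths 0.3, 0.4), one for 'high' (0.6);
# scale each term by its RSS-combined strength and take the pointwise max.
mu = np.maximum(rss([0.3, 0.4]) * low, rss([0.6]) * high)
dose = centre_of_gravity(x, mu)
print(f"crisp dose: {dose:.2f}")
```

RSS rewards agreement among rules (two weak rules together outweigh either alone), and CoG turns the resulting fuzzy surface into the single number a prescription needs.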

  8. SOFTWARE EFFORT ESTIMATION FRAMEWORK TO IMPROVE ORGANIZATION PRODUCTIVITY USING EMOTION RECOGNITION OF SOFTWARE ENGINEERS IN SPONTANEOUS SPEECH

    Directory of Open Access Journals (Sweden)

    B.V.A.N.S.S. Prabhakar Rao

    2015-10-01

    Full Text Available Productivity is a very important part of any organisation in general and of the software industry in particular. Nowadays, software effort estimation is a challenging task, and effort and productivity are inter-related. Every organisation requires emotionally stable employees for seamless and progressive working. In other industries this may be achieved without manpower, but software project development is a labour-intensive activity: each line of code must be delivered by a software engineer, with tools and techniques acting only as aids or supplements. The software industry has long suffered from low success rates, facing many problems in delivering projects on time and within the estimated budget. To estimate the required effort of a project, it is therefore significant to know the emotional state of the team members. The responsibility of ensuring emotional contentment falls on the human resource department, which can deploy a series of systems to carry out its survey. This analysis can be done using a variety of tools; one such tool is the study of emotion recognition. The data needed for this is readily available and collectable and can be an excellent source for feedback systems. The challenge of recognising emotion in speech is convoluted primarily by noisy recording conditions, the variation of sentiment across the sample space, and the exhibition of multiple emotions in a single sentence. Ambiguity in the labels of the training set also increases the complexity of the problem addressed. Existing probabilistic models have dominated the study but present a flaw in scalability due to statistical inefficiency. The problem of sentiment prediction in spontaneous speech can thus be addressed using a hybrid system comprising a Convolution Neural Network and

  9. Revisiting vocal perception in non-human animals: a review of vowel discrimination, speaker voice recognition, and speaker normalization

    Directory of Open Access Journals (Sweden)

    Buddhamas Kriengwatana

    2015-01-01

    Full Text Available The extent to which human speech perception evolved by taking advantage of predispositions and pre-existing features of vertebrate auditory and cognitive systems remains a central question in the evolution of speech. This paper reviews asymmetries in vowel perception, speaker voice recognition, and speaker normalization in non-human animals – topics that have not been thoroughly discussed in relation to the abilities of non-human animals, but are nonetheless important aspects of vocal perception. Throughout this paper we demonstrate that addressing these issues in non-human animals is relevant and worthwhile because many non-human animals must deal with similar issues in their natural environment. That is, they must also discriminate between similar-sounding vocalizations, determine signaler identity from vocalizations, and resolve signaler-dependent variation in vocalizations from conspecifics. Overall, we find that, although plausible, the current evidence is insufficiently strong to conclude that directional asymmetries in vowel perception are specific to humans, or that non-human animals can use voice characteristics to recognize human individuals. However, we do find some indication that non-human animals can normalize speaker differences. Accordingly, we identify avenues for future research that would greatly improve and advance our understanding of these topics.

  10. Modular Algorithm Testbed Suite (MATS): A Software Framework for Automatic Target Recognition

    Science.gov (United States)

    2017-01-01

    NAVAL SURFACE WARFARE CENTER PANAMA CITY DIVISION, PANAMA CITY, FL 32407-7001. Technical report NSWC PCD TR-2017-004, 31-01-2017: Modular Algorithm Testbed Suite (MATS): A Software Framework for Automatic Target Recognition. ... a flexible platform to facilitate the development and testing of ATR algorithms. To that end, NSWC PCD has created the Modular Algorithm Testbed Suite

  11. Computerized literature reference system: use of an optical scanner and optical character recognition software.

    Science.gov (United States)

    Lossef, S V; Schwartz, L H

    1990-09-01

    A computerized reference system for radiology journal articles was developed by using an IBM-compatible personal computer with a hand-held optical scanner and optical character recognition software. This allows direct entry of scanned text from printed material into word processing or database files. Additionally, line diagrams and photographs of radiographs can be incorporated into these files. A text search and retrieval software program enables rapid searching for keywords in scanned documents. The hand scanner and software programs are commercially available, relatively inexpensive, and easily used. This permits construction of a personalized radiology literature file of readily accessible text and images requiring minimal typing or keystroke entry.
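
    The search-and-retrieval step described above is easy to reproduce with stock tooling. Below is a minimal sketch of a keyword search over OCR output, assuming the scanned articles are saved as plain-text files; the directory layout and function name are illustrative, not from the original system:

    ```python
    import os
    import re

    def search_scanned_articles(directory, keywords):
        """Return {filename: set of matched keywords} for plain-text files
        produced by OCR, matching keywords case-insensitively as whole words."""
        patterns = {kw: re.compile(r"\b" + re.escape(kw) + r"\b", re.IGNORECASE)
                    for kw in keywords}
        hits = {}
        for name in os.listdir(directory):
            if not name.endswith(".txt"):
                continue
            with open(os.path.join(directory, name), encoding="utf-8") as fh:
                text = fh.read()
            matched = {kw for kw, pat in patterns.items() if pat.search(text)}
            if matched:
                hits[name] = matched
        return hits
    ```

    Run against a folder of OCR'd articles, this returns only the files mentioning at least one keyword, which is the core of a personal literature index.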

  12. Speech pattern recognition for forensic acoustic purposes

    OpenAIRE

    Herrera Martínez, Marcelo; Aldana Blanco, Andrea Lorena; Guzmán Palacios, Ana María

    2014-01-01

    The present paper describes the development of software for the analysis of acoustic voice parameters (APAVOIX), which can be used for forensic acoustic purposes based on speaker recognition and identification. The software makes it possible to observe clearly the parameters that are sufficient and necessary when comparing two voice signals, the suspect one and the original one. These parameters are used according to the classic method generally used by state entit...

  13. The Voice Transcription Technique: Use of Voice Recognition Software to Transcribe Digital Interview Data in Qualitative Research

    Science.gov (United States)

    Matheson, Jennifer L.

    2007-01-01

    Transcribing interview data is a time-consuming task that most qualitative researchers dislike. Transcribing is even more difficult for people with physical limitations because traditional transcribing requires manual dexterity and the ability to sit at a computer for long stretches of time. Researchers have begun to explore using an automated…

  14. Higher-order neural network software for distortion invariant object recognition

    Science.gov (United States)

    Reid, Max B.; Spirkovska, Lilly

    1991-01-01

    The state-of-the-art in pattern recognition for such applications as automatic target recognition and industrial robotic vision relies on digital image processing. We present a higher-order neural network model and software which performs the complete feature extraction-pattern classification paradigm required for automatic pattern recognition. Using a third-order neural network, we demonstrate complete, 100 percent accurate invariance to distortions of scale, position, and in-plane rotation. In a higher-order neural network, feature extraction is built into the network, and does not have to be learned. Only the relatively simple classification step must be learned. This is key to achieving very rapid training. The training set is much smaller than with standard neural network software because the higher-order network only has to be shown one view of each object to be learned, not every possible view. The software and graphical user interface run on any Sun workstation. Results of the use of the neural software in autonomous robotic vision systems are presented. Such a system could have extensive application in robotic manufacturing.

  15. Acoustic and capacity analysis of voice academic teachers with diagnosed hyperfunctional dysphonia by using DiagnoScope Specialist software.

    Science.gov (United States)

    Zielińska-Bliźniewska, Hanna; Pietkiewicz, Piotr; Miłoński, Jarosław; Urbaniak, Joanna; Olszewski, Jurek

    2013-01-01

    The aim of the study was to assess the acoustic and capacity analyses of voice in academic teachers with hyperfunctional dysphonia using DiagnoScope Specialist software. The study covered 46 female academic teachers aged 34-48 years. The women were diagnosed with hyperfunctional dysphonia (with absence of organic pathologies). Having obtained informed consent, a primary medical history was taken, videolaryngoscopic and stroboscopic examinations were performed, and diagnostic voice acoustic and capacity analyses were carried out using DiagnoScope Specialist software. The acoustic analysis carried out on academic teachers with diagnosed hyperfunctional dysphonia showed increases in the following parameters: fundamental frequency (F0) by 1.2%; relative average perturbation (Jitter by 100.0% and RAP by 81.8%); relative amplitude perturbation quotient (APQ) by 2.9%; non-harmonic to harmonic ratio (U2H) by 16.0%; and noise to harmonic ratio (NHR) by 13.4%. A decrease of 2.5% from normal values was noted in relative amplitude perturbation (Shimmer). Formant frequencies also showed reductions (F1 by 10.7%, F2 by 5.1%, F3 by 2.2%, and F4 by 3.5%). The harmonic perturbation quotient (HPQ) was 0.8% lower and the residual harmonic perturbation quotient (RHPQ) 16.8% lower, with the residual to harmonic ratio (R2H) decreasing by 35.1%, the sub-harmonic to harmonic ratio (S2H) by 2.4%, and the Yanagihara coefficient by 20.2%. The capacity analysis with the DiagnoScope Specialist software showed figures significantly lower than normal values for the following parameters: phonation time, true phonation time, phonation break coefficients, vocal capacity coefficient and mean vocal capacity. Copyright © 2013 Polish Otorhinolaryngology - Head and Neck Surgery Society. Published by Elsevier Urban & Partner Sp. z o.o. All rights reserved.
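
    The perturbation measures reported above (Jitter, RAP) are standard functions of consecutive glottal cycle lengths. The following is a sketch under the common textbook definitions, not DiagnoScope Specialist's own implementation:

    ```python
    def jitter_local(periods):
        """Local jitter: mean absolute difference of consecutive glottal
        periods T[i], divided by the mean period, as a percentage."""
        diffs = [abs(a - b) for a, b in zip(periods[1:], periods[:-1])]
        return 100.0 * (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

    def rap(periods):
        """Relative average perturbation: mean deviation of each period from
        the 3-point moving average of its neighbourhood, relative to the
        mean period, as a percentage."""
        devs = [abs(periods[i] - (periods[i - 1] + periods[i] + periods[i + 1]) / 3.0)
                for i in range(1, len(periods) - 1)]
        return 100.0 * (sum(devs) / len(devs)) / (sum(periods) / len(periods))
    ```

    A perfectly periodic voice gives 0% for both; cycle-to-cycle irregularity raises them, which is why hyperfunctional dysphonia shows elevated Jitter and RAP.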

  16. Batch metadata assignment to archival photograph collections using facial recognition software

    Directory of Open Access Journals (Sweden)

    Kyle Banerjee

    2013-07-01

    Full Text Available Useful metadata is essential to giving individual images meaning and value within the context of a greater image collection, as well as making them more discoverable. However, often little information is available about the photos themselves, so adding consistent metadata to large collections of digital and digitized photographs is a time-consuming process requiring highly experienced staff. By using facial recognition software, staff can identify individuals more quickly and reliably. Knowledge of the individuals in photos helps staff determine when and where photos were taken and also improves understanding of the subject matter. This article demonstrates simple techniques for using facial recognition software and command-line tools to assign, modify, and read metadata for large archival photograph collections.
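
    The matching step can be illustrated in miniature. The sketch below assumes face embeddings have already been extracted by some facial recognition tool and matches each detected face to the nearest known person by Euclidean distance; the names, embeddings, and threshold are invented for illustration:

    ```python
    import math

    def match_face(embedding, known_people, threshold=0.6):
        """Return the name of the closest known embedding, or None if every
        distance exceeds the threshold (i.e. an unknown person)."""
        best_name, best_dist = None, float("inf")
        for name, known in known_people.items():
            dist = math.dist(embedding, known)  # Euclidean distance
            if dist < best_dist:
                best_name, best_dist = name, dist
        return best_name if best_dist <= threshold else None

    def tag_photos(photo_embeddings, known_people):
        """Batch-assign subject metadata: {photo: sorted names of matched people}."""
        return {photo: sorted(n for n in (match_face(e, known_people) for e in faces)
                              if n is not None)
                for photo, faces in photo_embeddings.items()}
    ```

    In a real workflow the returned name lists would then be written into the image files' metadata fields with a command-line tagging tool.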

  17. Syntactic and semantic errors in radiology reports associated with speech recognition software.

    Science.gov (United States)

    Ringler, Michael D; Goss, Brian C; Bartholmai, Brian J

    2017-03-01

    Speech recognition software can increase the frequency of errors in radiology reports, which may affect patient care. We retrieved 213,977 speech recognition software-generated reports from 147 different radiologists and proofread them for errors. Errors were classified as "material" if they were believed to alter interpretation of the report. "Immaterial" errors were subclassified as intrusion/omission or spelling errors. The proportion of errors and error type were compared among individual radiologists, imaging subspecialties, and time periods. In all, 20,759 reports (9.7%) contained errors, of which 3992 (1.9%) were material errors. Among immaterial errors, spelling errors were more common than intrusion/omission errors (p < .001); errors were also more common in reports reinterpreting results of outside examinations and in procedural studies (all p < .001). Error rate decreased over time (p < .001), which suggests that a quality control program with regular feedback may reduce errors.
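
    The proofreading comparison behind such error counts amounts to word-level diffing. Below is a toy illustration using Python's difflib; deciding whether a discrepancy is "material" would still require a human reader:

    ```python
    import difflib

    def word_errors(generated, corrected):
        """Count word-level discrepancies between a speech-recognition report
        and its proofread version: intrusions (extra words the SR added),
        omissions (words the SR dropped), and substitutions."""
        gen, cor = generated.split(), corrected.split()
        counts = {"intrusion": 0, "omission": 0, "substitution": 0}
        matcher = difflib.SequenceMatcher(a=gen, b=cor, autojunk=False)
        for op, i1, i2, j1, j2 in matcher.get_opcodes():
            if op == "delete":
                counts["intrusion"] += i2 - i1      # present in SR text only
            elif op == "insert":
                counts["omission"] += j2 - j1       # present in corrected text only
            elif op == "replace":
                counts["substitution"] += max(i2 - i1, j2 - j1)
        return counts
    ```

    Aggregating these counts over a corpus of report pairs gives the per-radiologist and per-subspecialty error proportions the study compares.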

  18. Hardware/Software Co-Design of a Traffic Sign Recognition System Using Zynq FPGAs

    Directory of Open Access Journals (Sweden)

    Yan Han

    2015-12-01

    Full Text Available Traffic sign recognition (TSR), taken as an important component of an intelligent vehicle system, has been an emerging research topic in recent years. In this paper, a traffic sign detection system based on color segmentation, speeded-up robust features (SURF) detection and the k-nearest neighbor classifier is introduced. The proposed system benefits from the SURF detection algorithm, which achieves invariance to rotated, skewed and occluded signs. In addition to the accuracy and robustness issues, a TSR system should target a real-time implementation on an embedded system. Therefore, a hardware/software co-design architecture for a Zynq-7000 FPGA is presented as a major objective of this work. The sign detection operations are accelerated by programmable hardware logic that searches for potential candidates for sign classification. Sign recognition and classification use a feature extraction and matching algorithm, which is implemented as a software component that runs on the embedded ARM CPU.

  19. The Effects of Certain Background Noises on the Performance of a Voice Recognition System.

    Science.gov (United States)

    1980-09-01


  20. The Voice as Computer Interface: A Look at Tomorrow's Technologies.

    Science.gov (United States)

    Lange, Holley R.

    1991-01-01

    Discussion of voice as the communications device for computer-human interaction focuses on voice recognition systems for use within a library environment. Voice technologies are described, including voice response and voice recognition; examples of voice systems in use in libraries are examined; and further possibilities, including use with…

  1. TreeRipper web application: towards a fully automated optical tree recognition software

    Directory of Open Access Journals (Sweden)

    Hughes Joseph

    2011-05-01

    Full Text Available Abstract Background Relationships between species, genes and genomes have been printed as trees for over a century. Whilst this may have been the best format for exchanging and sharing phylogenetic hypotheses during the 20th century, the worldwide web now provides faster and automated ways of transferring and sharing phylogenetic knowledge. However, novel software is needed to defrost these published phylogenies for the 21st century. Results TreeRipper is a simple website for the fully automated recognition of multifurcating phylogenetic trees (http://linnaeus.zoology.gla.ac.uk/~jhughes/treeripper/). The program accepts a range of input image formats (PNG, JPG/JPEG or GIF). The underlying command-line C++ program follows a number of cleaning steps to detect lines, remove node labels, patch up broken lines and corners, and detect line edges. The edge contour is then determined to detect the branch lengths, tip label positions and the topology of the tree. Optical Character Recognition (OCR) is used to convert the tip labels into text with the freely available tesseract-ocr software. 32% of images meeting the prerequisites for TreeRipper were successfully recognised; the largest tree had 115 leaves. Conclusions Despite the diversity of ways phylogenies have been illustrated, which makes the design of fully automated tree recognition software difficult, TreeRipper is a step towards automating the digitization of past phylogenies. We also provide a dataset of 100 tree images and associated tree files for training and/or benchmarking future software. TreeRipper is an open source project licensed under the GNU General Public Licence v3.

  2. A Voice Operated Tour Planning System for Autonomous Mobile Robots

    Directory of Open Access Journals (Sweden)

    Charles V. Smith III

    2010-06-01

    Full Text Available Control systems driven by voice recognition software have been implemented before, but lacked a context-driven approach to generate relevant responses and actions. A partially voice-activated control system for mobile robotics is presented that allows an autonomous robot to interact with people and the environment in a meaningful way while dynamically creating customized tours. Many existing control systems also require substantial training for voice applications; the proposed system requires little to no training and is adaptable to chaotic environments. The traversable area is mapped once, and from that map a fully customized route is generated for the user

  3. The Usefulness of Automatic Speech Recognition (ASR) Eyespeak Software in Improving Iraqi EFL Students’ Pronunciation

    Directory of Open Access Journals (Sweden)

    Lina Fathi Sidig Sidgi

    2017-02-01

    Full Text Available The present study focuses on determining whether automatic speech recognition (ASR) technology is reliable for improving the English pronunciation of Iraqi EFL students. Non-native learners of English are generally concerned about improving their pronunciation skills, and Iraqi students face difficulties in pronouncing English sounds that are not found in their native language (Arabic). This study is concerned with ASR and its effectiveness in overcoming this difficulty. The data were obtained from twenty participants randomly selected from first-year college students at Al-Turath University College from the Department of English in Baghdad-Iraq. The students had participated in a two-month pronunciation instruction course using ASR Eyespeak software. At the end of the course, the students completed a questionnaire to give their opinions about the usefulness of ASR Eyespeak in improving their pronunciation. The findings of the study revealed that the students found ASR Eyespeak software very useful in improving their pronunciation and helping them realise their pronunciation mistakes. They also reported that learning pronunciation with ASR Eyespeak was enjoyable.

  4. Object and Facial Recognition in Augmented and Virtual Reality: Investigation into Software, Hardware and Potential Uses

    Science.gov (United States)

    Schulte, Erin

    2017-01-01

    As augmented and virtual reality grow in popularity, and more researchers focus on their development, other fields of technology have grown in the hope of integrating with the up-and-coming hardware currently on the market. Namely, there has been a focus on how to make an intuitive, hands-free human-computer interaction (HCI) system utilizing AR and VR that allows users to control their technology with little to no physical interaction with hardware. Computer vision, which is utilized in devices such as the Microsoft Kinect, webcams and other similar hardware, has shown potential in assisting with the development of an HCI system that requires next to no human interaction with computing hardware and software. Object and facial recognition are two subsets of computer vision, both of which can be applied to HCI systems in the fields of medicine, security, industrial development and other similar areas.

  5. Surveillance application using pattern recognition software at the EBR-II Reactor Facility

    International Nuclear Information System (INIS)

    Olson, D.L.

    1992-01-01

    The System State Analyzer (SSA) is a software-based pattern recognition system. For the past several years this system has been used at Argonne National Laboratory's Experimental Breeder Reactor II (EBR-II) for detection of degradation and other abnormalities in plant systems. Currently two versions of the SSA are used at EBR-II. One version is used for daily surveillance and trending of the reactor delta-T and reactor startups. The other version, the QSSA, is used to monitor individual reactor systems such as the Secondary Sodium System, the secondary sodium pumps, and the steam generator. The system has been able to detect problems such as signals being affected by temperature variations due to a failing temperature controller

  6. Monitoring caustic injuries from emergency department databases using automatic keyword recognition software.

    Science.gov (United States)

    Vignally, P; Fondi, G; Taggi, F; Pitidis, A

    2011-03-31

    In Italy the European Union Injury Database reports the involvement of chemical products in 0.9% of home and leisure accidents. The Emergency Department registry on domestic accidents in Italy and the Poison Control Centres record that 90% of cases of exposure to toxic substances occur in the home. It is not rare for the effects of chemical agents to be observed in hospitals, with a high potential risk of damage - the rate of this cause of hospital admission is double the domestic injury average. The aim of this study was to monitor the effects of injuries caused by caustic agents in Italy using automatic free-text recognition in Emergency Department medical databases. We created a Stata software program to automatically identify caustic or corrosive injury cases using an agent-specific list of keywords. We focused attention on the procedure's sensitivity and specificity. Ten hospitals in six regions of Italy participated in the study. The program identified 112 cases of injury by caustic or corrosive agents. Checking the cases by quality controls (based on manual reading of ED reports), we assessed 99 cases as true positives, i.e. 88.4% of the patients were automatically recognized by the software as being affected by caustic substances (99% CI: 80.6%-96.2%), that is to say 0.59% (99% CI: 0.45%-0.76%) of the whole sample of home injuries, a value almost three times as high as that expected (p < 0.0001) from European codified information. False positives were 11.6% of the recognized cases (99% CI: 5.1%-21.5%). Our automatic procedure for caustic agent identification proved to have excellent product recognition capacity with an acceptable level of excess sensitivity. Contrary to our a priori hypothesis, the automatic recognition system provided a level of identification of agents possessing caustic effects that was significantly greater than was predictable on the basis of the values from current codifications reported in the European Database.
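
    The two steps of the procedure, keyword flagging and the quality-control arithmetic, can be sketched as follows (re-cast in Python rather than the study's Stata program; the keyword list and report texts are illustrative):

    ```python
    import re

    def flag_caustic_cases(reports, keywords):
        """Flag ED free-text reports that mention any caustic/corrosive agent
        keyword (case-insensitive whole-word/phrase match)."""
        pattern = re.compile(r"\b(" + "|".join(map(re.escape, keywords)) + r")\b",
                             re.IGNORECASE)
        return [rid for rid, text in reports.items() if pattern.search(text)]

    def positive_predictive_value(flagged, confirmed):
        """Share of automatically flagged cases confirmed by manual review of
        the ED reports (the study's 88.4% figure is this quantity)."""
        flagged = set(flagged)
        return len(flagged & set(confirmed)) / len(flagged)
    ```

    The manual-reading quality control then supplies the confirmed-case list against which the flagged set is scored.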

  7. Voice, Schooling, Inequality, and Scale

    Science.gov (United States)

    Collins, James

    2013-01-01

    The rich studies in this collection show that the investigation of voice requires analysis of "recognition" across layered spatial-temporal and sociolinguistic scales. I argue that the concepts of voice, recognition, and scale provide insight into contemporary educational inequality and that their study benefits, in turn, from paying attention to…

  8. Speech recognition software and electronic psychiatric progress notes: physicians' ratings and preferences

    Directory of Open Access Journals (Sweden)

    Derman Yaron D

    2010-08-01

    Full Text Available Abstract Background The context of the current study was the mandatory adoption of electronic clinical documentation within a large mental health care organization. Psychiatric electronic documentation has unique needs owing to its dense narrative content. Our goal was to determine whether speech recognition (SR) would ease the creation of electronic progress note (ePN) documents by physicians at our institution. Methods Subjects: Twelve physicians had access to SR software on their computers for a period of four weeks to create ePNs. Measurements: We examined SR software in relation to its perceived usability, data entry time savings, impact on the quality of care and quality of documentation, and impact on clinical and administrative workflow, as compared with existing methods of data entry. Data analysis: A series of Wilcoxon signed rank tests was used to compare pre- and post-SR measures; a qualitative study design was used. Results Six of the twelve participants completing the study favoured the use of SR for creating electronic progress notes over their existing mode of data entry (five with SR alone, one with SR via a hand-held digital recorder). There was no clear perceived benefit from SR in terms of data entry time savings, quality of care, quality of documentation, or impact on clinical and administrative workflow. Conclusions Although our findings are mixed, SR may be a technology with some promise for mental health documentation. Future investigations of this nature should use more participants, a broader range of document types, and compare front-end and back-end SR methods.

  9. Exploring expressivity and emotion with artificial voice and speech technologies.

    Science.gov (United States)

    Pauletto, Sandra; Balentine, Bruce; Pidcock, Chris; Jones, Kevin; Bottaci, Leonardo; Aretoulaki, Maria; Wells, Jez; Mundy, Darren P; Balentine, James

    2013-10-01

    Emotion in audio-voice signals, as synthesized by text-to-speech (TTS) technologies, was investigated to formulate a theory of expression for user interface design. Emotional parameters were specified with markup tags, and the resulting audio was further modulated with post-processing techniques. Software was then developed to link a selected TTS synthesizer with an automatic speech recognition (ASR) engine, producing a chatbot that could speak and listen. Using these two artificial voice subsystems, investigators explored both artistic and psychological implications of artificial speech emotion. Goals of the investigation were interdisciplinary, with interest in musical composition, augmentative and alternative communication (AAC), commercial voice announcement applications, human-computer interaction (HCI), and artificial intelligence (AI). The work-in-progress points towards an emerging interdisciplinary ontology for artificial voices. As one study output, HCI tools are proposed for future collaboration.

  10. Human activity recognition from wireless sensor network data: benchmark and software

    NARCIS (Netherlands)

    van Kasteren, T.L.M.; Englebienne, G.; Kröse, B.J.A.; Chen, L.; Nugent, C.; Biswas, J.; Hoey, J.

    2011-01-01

    Although activity recognition is an active area of research no common benchmark for evaluating the performance of activity recognition methods exists. In this chapter we present the state of the art probabilistic models used in activity recognition and show their performance on several real world

  11. Automatic Speech Recognition Systems for the Evaluation of Voice and Speech Disorders in Head and Neck Cancer

    OpenAIRE

    Andreas Maier; Tino Haderlein; Florian Stelzle; Elmar Nöth; Emeka Nkenke; Frank Rosanowski; Anne Schützenberger; Maria Schuster

    2010-01-01

    In patients suffering from head and neck cancer, speech intelligibility is often restricted. For assessment and outcome measurements, automatic speech recognition systems have previously been shown to be appropriate for objective and quick evaluation of intelligibility. In this study we investigate the applicability of the method to speech disorders caused by head and neck cancer. Intelligibility was quantified by speech recognition on recordings of a standard text read by 41 German laryngect...

  12. An automatic speech recognition system with speaker-independent identification support

    Science.gov (United States)

    Caranica, Alexandru; Burileanu, Corneliu

    2015-02-01

    The novelty of this work lies in the application of an open-source research software toolkit (CMU Sphinx) to train, build and evaluate a speech recognition system, with speaker-independent support, for voice-controlled hardware applications. Moreover, we propose to use the trained acoustic model to decode offline voice commands on embedded hardware, such as an ARMv6 low-cost SoC, the Raspberry Pi. This type of single-board computer, mainly used for educational and research activities, can serve as a proof-of-concept software and hardware stack for low-cost voice automation systems.

  13. Low-Cost Implementation of a Named Entity Recognition System for Voice-Activated Human-Appliance Interfaces in a Smart Home

    Directory of Open Access Journals (Sweden)

    Geonwoo Park

    2018-02-01

    Full Text Available When we develop voice-activated human-appliance interface systems in smart homes, named entity recognition (NER) is an essential tool for extracting execution targets from natural language commands. Previous studies on NER systems generally used supervised machine-learning methods that require a substantial amount of human-annotated training corpus. In the smart home environment, categories of named entities should be defined according to the voice-activated devices (e.g., food names for refrigerators and song titles for music players). The previous machine-learning methods make it difficult to change the categories of named entities because a large training corpus must be newly constructed by hand. To address this problem, we present a semi-supervised NER system that minimizes the time-consuming and labor-intensive task of constructing the training corpus. Our system uses distant supervision with two kinds of auto-labeling processes: auto-labeling based on heuristic rules for single-class named entity corpus generation, and auto-labeling based on a pre-trained single-class NER model for multi-class named entity corpus generation. Our system then improves NER accuracy by using a bagging-based active learning method. In our experiments, which included a generic domain featuring 11 named entity classes and a context-specific domain about baseball featuring 21 named entity classes, our system demonstrated good performance in both domains, with F1-measures of 0.777 and 0.958, respectively. Since our system was built from a relatively small human-annotated training corpus, we believe it is a viable alternative to current NER systems in smart home environments.
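
    The first auto-labeling process, heuristic tagging from a device-specific dictionary, might look like the sketch below; the entity classes and gazetteer entries are invented for illustration, and the real system's rules are certainly richer:

    ```python
    def auto_label(tokens, gazetteer):
        """Distant-supervision auto-labeling: tag every token span that
        matches a gazetteer entry with B-/I- labels, everything else with O.
        Longer entries are matched before shorter ones."""
        entries = sorted(((entry.split(), cls) for entry, cls in gazetteer.items()),
                         key=lambda pair: -len(pair[0]))
        labels = ["O"] * len(tokens)
        i = 0
        while i < len(tokens):
            for words, cls in entries:
                if tokens[i:i + len(words)] == words:
                    labels[i] = "B-" + cls
                    for j in range(i + 1, i + len(words)):
                        labels[j] = "I-" + cls
                    i += len(words)
                    break
            else:
                i += 1
        return labels
    ```

    Corpora labeled this way can then bootstrap a single-class NER model, with active learning used to correct the noisiest examples.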

  14. Recognition

    DEFF Research Database (Denmark)

    Gimmler, Antje

    2017-01-01

    In this article, I shall examine the cognitive, heuristic and theoretical functions of the concept of recognition. To evaluate both the explanatory power and the limitations of a sociological concept, the theory construction must be analysed and its actual productivity for sociological theory mus...

  15. Effect of Acting Experience on Emotion Expression and Recognition in Voice: Non-Actors Provide Better Stimuli than Expected.

    Science.gov (United States)

    Jürgens, Rebecca; Grass, Annika; Drolet, Matthis; Fischer, Julia

    Both in the performative arts and in emotion research, professional actors are assumed to be capable of delivering emotions comparable to spontaneous emotional expressions. This study examines the effects of acting training on vocal emotion depiction and recognition. We predicted that professional actors express emotions in a more realistic fashion than non-professional actors. However, professional acting training may lead to a particular speech pattern; this might account for vocal expressions by actors that are less comparable to authentic samples than the ones by non-professional actors. We compared 80 emotional speech tokens from radio interviews with 80 re-enactments by professional and inexperienced actors, respectively. We analyzed recognition accuracies for emotion and authenticity ratings and compared the acoustic structure of the speech tokens. Both play-acted conditions yielded similar recognition accuracies and possessed more variable pitch contours than the spontaneous recordings. However, professional actors exhibited signs of different articulation patterns compared to non-trained speakers. Our results indicate that for emotion research, emotional expressions by professional actors are not better suited than those from non-actors.

  16. Landsat TM band 431 combination in clustering analysis for land use pattern recognition using Idrisi 4.2 software

    International Nuclear Information System (INIS)

    Wiweka, Arief H.; Izzawati, Tjahyaningsih A.

    1997-01-01

    The recognition of patterns of Earth objects recorded in remote sensing digital images can be done by a classification process based on groups of spectral pixel values. The spectral assessment of a spatial area representing the object's characteristics can be aided by supervised or unsupervised methods. In certain cases there are no supporting media, such as maps or aerial photos, no capability for field observation, and no knowledge of the object's location. The classification process can then be done by clustering: pixels are grouped based on the full interval of spectral image values, and classes are then grouped according to the desired accuracy. The clustering methods in the Idrisi 4.2 software are the sequential, statistical, ISODATA, and RGB methods. Clustering can thus aid the pre-processing stage of pattern recognition
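
    The pixel-grouping idea behind such clustering can be illustrated with plain k-means on band vectors. This is a generic sketch, not Idrisi's sequential, statistical, ISODATA, or RGB implementations:

    ```python
    import math

    def kmeans(pixels, k, iters=20):
        """Cluster multispectral pixel vectors (e.g. TM bands 4, 3, 1) with
        plain k-means: assign each pixel to its nearest centroid, then
        recompute centroids as cluster means."""
        centroids = [list(p) for p in pixels[:k]]  # naive initialisation
        for _ in range(iters):
            clusters = [[] for _ in range(k)]
            for p in pixels:
                nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
                clusters[nearest].append(p)
            for c, members in enumerate(clusters):
                if members:
                    centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in pixels]
        return labels, centroids
    ```

    In the unsupervised land-use workflow, each resulting cluster is afterwards interpreted and named (water, vegetation, built-up, etc.) by an analyst.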

  17. Automatic speech recognition for report generation in computed tomography

    International Nuclear Information System (INIS)

    Teichgraeber, U.K.M.; Ehrenstein, T.; Lemke, M.; Liebig, T.; Stobbe, H.; Hosten, N.; Keske, U.; Felix, R.

    1999-01-01

    Purpose: A study was performed to compare the performance of automatic speech recognition (ASR) with conventional transcription. Materials and Methods: 100 CT reports were generated by using ASR and 100 CT reports were dictated and written by medical transcriptionists. The time for dictation and correction of errors by the radiologist was assessed and the types of mistakes were analysed. The text recognition rate was calculated in both groups and the average time between completion of the imaging study by the technologist and generation of the written report was assessed. A commercially available speech recognition technology (ASKA Software, IBM Via Voice) running on a personal computer was used. Results: The time for dictation using digital voice recognition was 9.4±2.3 min compared to 4.5±3.6 min with an ordinary Dictaphone. The text recognition rate was 97% with digital voice recognition and 99% with medical transcriptionists. The average time from imaging completion to written report finalisation was reduced from 47.3 hours with medical transcriptionists to 12.7 hours with ASR. The analysis of misspellings demonstrated (ASR vs. medical transcriptionists): 3 vs. 4 syntax errors, 0 vs. 37 orthographic mistakes, 16 vs. 22 mistakes in substance and 47 vs. erroneously applied terms. Conclusions: The use of digital voice recognition as a replacement for medical transcription is recommendable when immediate availability of written reports is necessary. (orig.) [de]

  18. TU-C-17A-03: An Integrated Contour Evaluation Software Tool Using Supervised Pattern Recognition for Radiotherapy

    Energy Technology Data Exchange (ETDEWEB)

    Chen, H; Tan, J; Kavanaugh, J; Dolly, S; Gay, H; Thorstad, W; Anastasio, M; Altman, M; Mutic, S; Li, H [Washington University School of Medicine, Saint Louis, MO (United States)

    2014-06-15

    Purpose: Radiotherapy (RT) contours delineated either manually or semi-automatically require verification before clinical usage. Manual evaluation is very time consuming. A new integrated software tool using supervised pattern contour recognition was thus developed to facilitate this process. Methods: The contouring tool was developed using an object-oriented programming language, C#, and application programming interfaces, e.g. the visualization toolkit (VTK). The C# language served as the tool design basis. The Accord.Net scientific computing libraries were utilized for the required statistical data processing and pattern recognition, while VTK was used to build and render 3-D mesh models of critical RT structures in real-time, 360° visualization. Principal component analysis (PCA) was used so the system could self-update for geometric variations of normal structures, based on physician-approved RT contours as a training dataset. The in-house supervised PCA-based contour recognition method was used to automatically evaluate contour normality/abnormality. The function for reporting the contour evaluation results was implemented using C# and the Windows Form Designer. Results: The software input was RT simulation images and RT structures from commercial clinical treatment planning systems. Several abilities were demonstrated: automatic assessment of RT contours, file loading/saving of various modality medical images and RT contours, and generation/visualization of 3-D images and anatomical models. Moreover, it supported 360° rendering of the RT structures in a multi-slice view, which allows physicians to visually check and edit abnormally contoured structures. Conclusion: This new software integrates the supervised learning framework with image processing and graphical visualization modules for RT contour verification. This tool has great potential for facilitating treatment planning with the assistance of an automatic contour evaluation module in avoiding
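
    The PCA idea, learning the normal geometric variation of a structure from approved contours and flagging new contours with a large residual, can be sketched without the paper's C#/Accord.Net toolchain. The single-component model and toy feature vectors below are simplifications of the actual method:

    ```python
    import math

    def principal_component(data, iters=100):
        """First principal component of mean-centred feature vectors via
        power iteration on the covariance matrix (no external libraries)."""
        n, d = len(data), len(data[0])
        mean = [sum(row[j] for row in data) / n for j in range(d)]
        centred = [[row[j] - mean[j] for j in range(d)] for row in data]
        v = [1.0] * d
        for _ in range(iters):
            # apply C = (1/n) X^T X to v without forming C explicitly
            proj = [sum(x[j] * v[j] for j in range(d)) for x in centred]
            w = [sum(proj[i] * centred[i][j] for i in range(n)) / n for j in range(d)]
            norm = math.sqrt(sum(c * c for c in w))
            v = [c / norm for c in w]
        return mean, v

    def abnormality(contour, mean, v):
        """Residual norm after projecting onto the one-component PCA model;
        large residuals flag contours unlike the approved training set."""
        x = [contour[j] - mean[j] for j in range(len(mean))]
        t = sum(x[j] * v[j] for j in range(len(v)))
        return math.sqrt(sum((x[j] - t * v[j]) ** 2 for j in range(len(v))))
    ```

    A full system would retain several components and set the flagging threshold from residuals observed on held-out approved contours.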

  19. Software

    Energy Technology Data Exchange (ETDEWEB)

    Macedo, R.; Budd, G.; Ross, E.; Wells, P.

    2010-07-15

    The software section of this journal presented new software programs that have been developed to help in the exploration and development of hydrocarbon resources. Software provider IHS Inc. has made additions to its geological and engineering analysis software tool, IHS PETRA, a product used by geoscientists and engineers to visualize, analyze and manage well production, well log, drilling, reservoir, seismic and other related information. IHS PETRA also includes a directional well module and a decline curve analysis module to improve analysis capabilities in unconventional reservoirs. Petris Technology Inc. has developed software to help manage large volumes of data. PetrisWinds Enterprise (PWE) helps users find and manage wellbore data, including conventional wireline and MWD core data; analysis of core photos and images; waveforms and NMR; and external file documentation. Ottawa-based Ambercore Software Inc. has been collaborating with Nexen on the Petroleum iQ software for steam assisted gravity drainage (SAGD) producers. Petroleum iQ integrates geology and geophysics data with engineering data in 3D and 4D. Calgary-based Envirosoft Corporation has developed software that reduces the costly and time-consuming effort required to comply with Directive 39 of the Alberta Energy Resources Conservation Board. The product includes emissions modelling software. Houston-based Seismic Micro-Technology (SMT) has developed the Kingdom software, which features the latest in seismic interpretation. Holland-based Joa Oil and Gas and Calgary-based Computer Modelling Group have both supplied the petroleum industry with advanced reservoir simulation software that enables reservoir interpretation. The 2010 software survey included a guide to new software applications designed to facilitate petroleum exploration, drilling and production activities. Oil and gas producers can use the products for a range of functions, including reservoir characterization and accounting. In

  20. Pattern-recognition software detecting the onset of failures in complex systems

    International Nuclear Information System (INIS)

    Mott, J.; King, R.

    1987-01-01

    A very general mathematical framework for embodying learned data from a complex system, and combining it with a current observation to estimate the true current state of the system, has been implemented using nearly universal pattern-recognition algorithms and applied to surveillance of the EBR-II power plant. In this application the methodology can provide signal validation and replacement of faulty signals on a near-real-time basis for hundreds of plant parameters. The mathematical framework, the pattern-recognition algorithms, examples of the learning and estimating process, and plant operating decisions made using this methodology are discussed. The entire methodology has been reduced to a set of FORTRAN subroutines that are small, fast, and robust, and executable on a personal computer with a serial link to the system's data acquisition computer, or on the data acquisition computer itself.
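
    The estimator the abstract describes, combining learned system states with a current observation to infer the true state, can be illustrated with a similarity-weighted blend of memorized states. This is a generic kernel-style sketch, not the EBR-II FORTRAN subroutines, and the numbers are invented.

```python
import numpy as np

def estimate_state(observation, memory, bandwidth):
    """Estimate the true plant state as a similarity-weighted blend of
    learned states; correlated healthy channels dominate the estimate."""
    d2 = ((memory - observation) ** 2).sum(axis=1)
    w = np.exp(-d2 / (2 * bandwidth ** 2))
    return (w[:, None] * memory).sum(axis=0) / w.sum()

# Learned library of plant states (rows = snapshots of two signals,
# e.g. flow rate and outlet temperature).
memory = np.array([[500.0, 371.0], [510.0, 376.0], [520.0, 381.0]])

# Channel 1 reads faulty (0.0); the estimate, driven by the healthy
# correlated channel, supplies a plausible replacement value.
obs = np.array([510.0, 0.0])
est = estimate_state(obs, memory, bandwidth=50.0)
residual = obs - est  # a large residual on channel 1 flags the fault
```

    Comparing each observed channel against its estimate is what allows faulty signals to be both detected and replaced on-line.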

  1. EBR-II [Experimental Breeder Reactor-II] system surveillance using pattern recognition software

    International Nuclear Information System (INIS)

    Mott, J.E.; Radtke, W.H.; King, R.W.

    1986-02-01

    The problem of how to most accurately determine the Experimental Breeder Reactor-II (EBR-II) reactor outlet temperature from currently available plant signals is investigated. The reactor outlet pipe was originally instrumented with 8 temperature sensors but, during 22 years of operation, all of these instruments have failed except one remaining thermocouple, whose output has recently become suspect. Using pattern recognition methods to compare values of 129 plant signals for similarities over a 7 month period spanning reconfiguration of the core and recalibration of many plant signals, it was determined that the remaining reactor outlet pipe thermocouple is still useful as an indicator of true mixed mean reactor outlet temperature. Application of this methodology to investigate one specific signal has automatically validated the vast majority of the 129 signals used for pattern recognition and also highlighted a few inconsistent signals for further investigation.

  2. DolphinAttack: Inaudible Voice Commands

    OpenAIRE

    Zhang, Guoming; Yan, Chen; Ji, Xiaoyu; Zhang, Taimin; Zhang, Tianchen; Xu, Wenyuan

    2017-01-01

    Speech recognition (SR) systems such as Siri or Google Now have become an increasingly popular human-computer interaction method, and have turned various systems into voice-controllable systems (VCS). Prior work on attacking VCS shows that hidden voice commands that are incomprehensible to people can control the systems. Hidden voice commands, though hidden, are nonetheless audible. In this work, we design a completely inaudible attack, DolphinAttack, that modulates voice commands on ultra...

  3. The expression and recognition of emotions in the voice across five nations: A lens model analysis based on acoustic features.

    Science.gov (United States)

    Laukka, Petri; Elfenbein, Hillary Anger; Thingujam, Nutankumar S; Rockstuhl, Thomas; Iraki, Frederick K; Chui, Wanda; Althoff, Jean

    2016-11-01

    This study extends previous work on emotion communication across cultures with a large-scale investigation of the physical expression cues in vocal tone. In doing so, it provides the first direct test of a key proposition of dialect theory, namely that greater accuracy of detecting emotions from one's own cultural group-known as in-group advantage-results from a match between culturally specific schemas in emotional expression style and culturally specific schemas in emotion recognition. Study 1 used stimuli from 100 professional actors from five English-speaking nations vocally conveying 11 emotional states (anger, contempt, fear, happiness, interest, lust, neutral, pride, relief, sadness, and shame) using standard-content sentences. Detailed acoustic analyses showed many similarities across groups, and yet also systematic group differences. This provides evidence for cultural accents in expressive style at the level of acoustic cues. In Study 2, listeners evaluated these expressions in a 5 × 5 design balanced across groups. Cross-cultural accuracy was greater than expected by chance. However, there was also in-group advantage, which varied across emotions. A lens model analysis of fundamental acoustic properties examined patterns in emotional expression and perception within and across groups. Acoustic cues were used relatively similarly across groups both to produce and judge emotions, and yet there were also subtle cultural differences. Speakers appear to have a culturally nuanced schema for enacting vocal tones via acoustic cues, and perceivers have a culturally nuanced schema in judging them. Consistent with dialect theory's prediction, in-group judgments showed a greater match between these schemas used for emotional expression and perception. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  4. RangerMaster trademark: Real-time pattern recognition software for in-field analysis of radiation sources

    International Nuclear Information System (INIS)

    Murray, W.S.; Ziemba, F.; Szluk, N.

    1998-01-01

    RangerMaster trademark is the embedded firmware for Quantrad Sensor's integrated nuclear instrument package, the Ranger trademark. The Ranger trademark, which is both a gamma-ray and neutron detection system, was originally developed at Los Alamos National Laboratory for in situ surveys at the Plutonium Facility to confirm the presence of nuclear materials. The new RangerMaster trademark software expands the library of isotopes and simplifies the operation of the instrument by providing an easy mode suitable for untrained operators. The expanded library of the Ranger trademark now includes medical isotopes 99Tc, 201Tl, 111In, 67Ga, 133Xe, 103Pa, and 131I; industrial isotopes 241Am, 57Co, 133Ba, 137Cs, 40K, 60Co, 232Th, 226Ra, and 207Bi; and nuclear materials 235U, 238U, 233U, and 239Pu. To accomplish isotopic identification, a simulated spectrum for each of the isotopes was generated using SYNTH. The SYNTH spectra formed the basis for the knowledge-based expert system and selection of the regions of interest that are used in the pattern recognition system. The knowledge-based pattern recognition system was tested against actual spectra under field conditions

  6. Practical applications of interactive voice technologies: Some accomplishments and prospects

    Science.gov (United States)

    Grady, Michael W.; Hicklin, M. B.; Porter, J. E.

    1977-01-01

    A technology assessment of the application of computers and electronics to complex systems is presented. Three existing systems which utilize voice technology (speech recognition and speech generation) are described. Future directions in voice technology are also described.

  7. Multimodal emotion recognition as assessment for learning in a game-based communication skills training

    NARCIS (Netherlands)

    Nadolski, Rob; Bahreini, Kiavash; Westera, Wim

    2014-01-01

    This paper presentation describes how our FILTWAM software artifacts for face and voice emotion recognition will be used for assessing learners' progress and providing adequate feedback in an online game-based communication skills training. This constitutes an example of in-game assessment for

  8. Multimodal Emotion Recognition for Assessment of Learning in a Game-Based Communication Skills Training

    NARCIS (Netherlands)

    Bahreini, Kiavash; Nadolski, Rob; Westera, Wim

    2015-01-01

    This paper describes how our FILTWAM software artifacts for face and voice emotion recognition will be used for assessing learners' progress and providing adequate feedback in an online game-based communication skills training. This constitutes an example of in-game assessment for mainly formative

  9. Forensic Automatic Speaker Recognition Based on Likelihood Ratio Using Acoustic-phonetic Features Measured Automatically

    Directory of Open Access Journals (Sweden)

    Huapeng Wang

    2015-01-01

    Full Text Available Forensic speaker recognition is experiencing a remarkable paradigm shift in terms of the evaluation framework and presentation of voice evidence. This paper proposes a new method of forensic automatic speaker recognition using the likelihood ratio framework to quantify the strength of voice evidence. The proposed method uses a reference database to calculate the within- and between-speaker variability. Some acoustic-phonetic features are extracted automatically using the software VoiceSauce. The effectiveness of the approach was tested using two Mandarin databases: a mobile telephone database and a landline database. The experimental results indicate that these acoustic-phonetic features have some discriminating potential and are worth pursuing for discrimination. The automatic acoustic-phonetic features have acceptable discriminative performance and can provide more reliable results in evidence analysis when fused with other kinds of voice features.
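
    In score-based form, the likelihood-ratio framework reduces to comparing how probable an observed similarity score is under the same-speaker versus different-speaker hypotheses. The Gaussian score distributions below are hypothetical, not values from the Mandarin databases.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def likelihood_ratio(score, same_mu, same_sd, diff_mu, diff_sd):
    """LR = p(score | same speaker) / p(score | different speakers)."""
    return gaussian_pdf(score, same_mu, same_sd) / gaussian_pdf(score, diff_mu, diff_sd)

# Score distributions estimated from a reference database (hypothetical):
# same-speaker scores cluster high, different-speaker scores low.
same_mu, same_sd = 0.80, 0.10
diff_mu, diff_sd = 0.40, 0.15

lr_value = likelihood_ratio(0.75, same_mu, same_sd, diff_mu, diff_sd)
# lr_value > 1 supports the same-speaker hypothesis; courts are often
# given log10(LR) as the strength of evidence.
```

    The within- and between-speaker variability estimated from the reference database is exactly what fixes the two distributions above.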

  10. Facial recognition software success rates for the identification of 3D surface reconstructed facial images: implications for patient privacy and security.

    Science.gov (United States)

    Mazura, Jan C; Juluru, Krishna; Chen, Joseph J; Morgan, Tara A; John, Majnu; Siegel, Eliot L

    2012-06-01

    Image de-identification has focused on the removal of textual protected health information (PHI). Surface reconstructions of the face have the potential to reveal a subject's identity even when textual PHI is absent. This study assessed the ability of a computer application to match research subjects' 3D facial reconstructions with conventional photographs of their face. In a prospective study, 29 subjects underwent CT scans of the head and had frontal digital photographs of their face taken. Facial reconstructions of each CT dataset were generated on a 3D workstation. In phase 1, photographs of the 29 subjects undergoing CT scans were added to a digital directory and tested for recognition using facial recognition software. In phases 2-4, additional photographs were added in groups of 50 to increase the pool of possible matches and the test for recognition was repeated. As an internal control, photographs of all subjects were tested for recognition against an identical photograph. Of 3D reconstructions, 27.5% were matched correctly to corresponding photographs (95% upper CL, 40.1%). All study subject photographs were matched correctly to identical photographs (95% lower CL, 88.6%). Of 3D reconstructions, 96.6% were recognized simply as a face by the software (95% lower CL, 83.5%). Facial recognition software has the potential to recognize features on 3D CT surface reconstructions and match these with photographs, with implications for PHI.
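
    Assuming the reported 27.5% corresponds to 8 correct matches out of 29 (an inference, since only the percentage is given), a one-sided exact binomial upper confidence limit can be computed with the standard library. The exact method below yields a somewhat higher limit than the paper's 40.1%, which presumably used a different interval construction.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def upper_limit(k, n, alpha=0.05):
    """One-sided exact (Clopper-Pearson) upper confidence limit: the
    largest p such that P(X <= k) >= alpha, found by bisection."""
    lo, hi = k / n, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if binom_cdf(k, n, mid) >= alpha:
            lo = mid  # still plausible at level alpha; push higher
        else:
            hi = mid
    return lo

p_hat = 8 / 29              # observed match rate, about 0.275
p_upper = upper_limit(8, 29)  # how high the true rate could plausibly be
```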

  11. Audiovisual speech facilitates voice learning.

    Science.gov (United States)

    Sheffert, Sonya M; Olson, Elizabeth

    2004-02-01

    In this research, we investigated the effects of voice and face information on the perceptual learning of talkers and on long-term memory for spoken words. In the first phase, listeners were trained over several days to identify voices from words presented auditorily or audiovisually. The training data showed that visual information about speakers enhanced voice learning, revealing cross-modal connections in talker processing akin to those observed in speech processing. In the second phase, the listeners completed an auditory or audiovisual word recognition memory test in which equal numbers of words were spoken by familiar and unfamiliar talkers. The data showed that words presented by familiar talkers were more likely to be retrieved from episodic memory, regardless of modality. Together, these findings provide new information about the representational code underlying familiar talker recognition and the role of stimulus familiarity in episodic word recognition.

  12. Application of neural network and pattern recognition software to the automated analysis of continuous nuclear monitoring of on-load reactors

    Energy Technology Data Exchange (ETDEWEB)

    Howell, J.A.; Eccleston, G.W.; Halbig, J.K.; Klosterbuer, S.F. [Los Alamos National Lab., NM (United States); Larson, T.W. [California Polytechnic State Univ., San Luis Obispo, CA (US)

    1993-08-01

    Automated analysis using pattern recognition and neural network software can help interpret data, call attention to potential anomalies, and improve safeguards effectiveness. Automated software analysis, based on pattern recognition and neural networks, was applied to data collected from a radiation core discharge monitor system located adjacent to an on-load reactor core. Unattended radiation sensors continuously collect data to monitor on-line refueling operations in the reactor. The huge volume of data collected from a number of radiation channels makes it difficult for a safeguards inspector to review it all, check for consistency among the measurement channels, and find anomalies. Pattern recognition and neural network software can analyze large volumes of data from continuous, unattended measurements, thereby improving and automating the detection of anomalies. The authors developed a prototype pattern recognition program that determines the reactor power level and identifies the times when fuel bundles are pushed through the core during on-line refueling. Neural network models were also developed to predict fuel bundle burnup and, from the radiation signals, to identify the region of the on-load reactor face from which fuel bundles were discharged. In the preliminary data set, which was limited and consisted of four distinct burnup regions, the neural network model correctly predicted the burnup region with an accuracy of 92%.
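
    One of the prototype's tasks, identifying the times when fuel bundles are pushed through the core, amounts to event detection on a count-rate channel. A minimal threshold-crossing sketch (not the authors' program; the trace is synthetic):

```python
def detect_pushes(signal, baseline, factor=3.0):
    """Return indices where a radiation channel first rises well above
    its baseline; each rising edge is counted as one bundle push."""
    events, in_event = [], False
    for i, x in enumerate(signal):
        if x > factor * baseline and not in_event:
            events.append(i)   # rising edge = one push
            in_event = True
        elif x <= factor * baseline:
            in_event = False   # back to baseline; ready for next event
    return events

# Synthetic count-rate trace: quiet baseline with two discharge spikes.
trace = [10, 11, 9, 80, 85, 12, 10, 95, 90, 11]
pushes = detect_pushes(trace, baseline=10)  # [3, 7]
```

    Cross-checking the event times found on several channels is one way to automate the consistency review the abstract describes.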

  13. Voiced Excitations

    National Research Council Canada - National Science Library

    Holzricher, John

    2004-01-01

    To more easily obtain a voiced excitation function for speech characterization, measurements of skin, tracheal tube, and vocal fold motions were made and compared to EM sensor-glottal derived...

  14. [Assessment of voice acoustic parameters in female teachers with diagnosed occupational voice disorders].

    Science.gov (United States)

    Niebudek-Bogusz, Ewa; Fiszer, Marta; Sliwińska-Kowalska, Mariola

    2005-01-01

    Laryngovideostroboscopy is the method most frequently used in the assessment of voice disorders. However, the employment of quantitative methods, such as voice acoustic analysis, is essential for evaluating the effectiveness of prophylactic and therapeutic activities, as well as for objective medical certification of larynx pathologies. The aim of this study was to examine voice acoustic parameters in female teachers with occupational voice diseases. Acoustic analysis (IRIS software) was performed in 66 female teachers, including 35 teachers with occupational voice diseases and 31 with functional dysphonia. The teachers with occupational voice diseases presented a lower average fundamental frequency (193 Hz) than the group with functional dysphonia (209 Hz) and the normative value (236 Hz), whereas other acoustic parameters did not differ significantly between the groups. Voice acoustic analysis, when applied separately from vocal loading, cannot be used as a testing method to verify the diagnosis of occupational voice disorders.

  15. Familiarity and Voice Representation: From Acoustic-Based Representation to Voice Averages

    Directory of Open Access Journals (Sweden)

    Maureen Fontaine

    2017-07-01

    Full Text Available The ability to recognize an individual from their voice is widespread and has a long evolutionary history. Yet the perceptual representation of familiar voices is ill-defined. In two experiments, we explored the neuropsychological processes involved in the perception of voice identity. We specifically explored the hypothesis that familiar voices, both trained-to-familiar (Experiment 1) and famous (Experiment 2), are represented as a whole complex pattern, well approximated by the average of multiple utterances produced by a single speaker. In Experiment 1, participants learned three voices over several sessions, and performed a three-alternative forced-choice identification task on original voice samples and several "speaker averages," created by morphing across varying numbers of different vowels (e.g., [a] and [i]) produced by the same speaker. In Experiment 2, the same participants performed the same task on voice samples produced by famous speakers. The two experiments showed that for famous voices, but not for trained-to-familiar voices, identification performance increased and response times decreased as a function of the number of utterances in the averages. This study sheds light on the perceptual representation of familiar voices, and demonstrates the power of averaging in recognizing familiar voices. The speaker average captures the unique characteristics of a speaker, and thus retains the information essential for recognition; it acts as a prototype of the speaker.

  16. The software for automatic creation of the formal grammars used by speech recognition, computer vision, editable text conversion systems, and some new functions

    Science.gov (United States)

    Kardava, Irakli; Tadyszak, Krzysztof; Gulua, Nana; Jurga, Stefan

    2017-02-01

    For more flexible environmental perception by artificial intelligence, supporting software modules are needed that can automate the creation of a specific language syntax and perform further analysis for relevant decisions based on semantic functions. With the proposed approach, pairs of formal rules can be created from given sentences (for natural languages) or statements (for special languages) with the help of computer vision, speech recognition, or an editable-text conversion system, and then automatically refined. In other words, we have developed an approach that can significantly automate the training of an artificial intelligence, which as a result gives it a higher level of self-development, independent of its users. Based on this approach we have developed a software demo version, which includes the algorithm and software code implementing all of the above-mentioned components (computer vision, speech recognition, and editable-text conversion). The program can work in multi-stream mode and simultaneously build a syntax from information received from several sources.

  17. Advances in software development for intelligent interfaces for alarm and emergency management consoles

    International Nuclear Information System (INIS)

    Moseley, M.R.; Olson, C.E.

    1986-01-01

    Recent advances in technology allow features like voice synthesis, voice and speech recognition, image understanding, and intelligent database management to be incorporated in computer-driven alarm and emergency management information systems. New software development environments make it possible to do rapid prototyping of custom applications. Three examples using these technologies are discussed. 1) Maximum use is made of high-speed graphics and voice synthesis to implement a state-of-the-art alarm processing and display system with features that make the operator-machine interface efficient and accurate. 2) An application generator has the capability of "building" a specific alarm processing and display application in a matter of a few hours, using the site definition developed in the security planning phase to produce the custom application. 3) A software tool is described that permits rapid prototyping of human-machine interfaces for a variety of applications, including emergency management, alarm display and process information display.

  18. Voice Based City Panic Button System

    Science.gov (United States)

    Febriansyah; Zainuddin, Zahir; Bachtiar Nappu, M.

    2018-03-01

    The development of a voice-activated panic button application aims to provide faster early notification of hazardous conditions in the community to the nearest police by using speech as the trigger; current applications still rely on touch combinations on the screen and on orders coordinated from a control center, so early notification takes longer. The methods used in this research were voice recognition to detect the user's speech, and the haversine formula to compare distances and find the police closest to the user. The application is equipped with auto-SMS, which sends notifications to the victim's relatives, and is integrated with Google Maps (GMaps) to map the route to the victim's location. The results show that voice registration in the application reaches 100%, incident detection using speech recognition while the application is running averages 94.67%, and the auto-SMS to the victim's relatives reaches 100%.
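
    The closest-distance comparison relies on the haversine formula for great-circle distance. A sketch with hypothetical station names and coordinates (invented for illustration, not from the paper):

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * asin(sqrt(a))  # 6371 km = mean Earth radius

def nearest_station(user, stations):
    """Pick the police station closest to the user's position."""
    return min(stations, key=lambda s: haversine_km(*user, *s["pos"]))

stations = [
    {"name": "Station A", "pos": (-5.135, 119.423)},
    {"name": "Station B", "pos": (-5.180, 119.450)},
]
closest = nearest_station((-5.140, 119.430), stations)  # Station A
```

    Once the nearest station is chosen, its coordinates can be handed to the maps integration for routing.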

  19. Improved sensitivity of wearable nanogenerators made of electrospun Eu3+ doped P(VDF-HFP)/graphene composite nanofibers for self-powered voice recognition

    Science.gov (United States)

    Adhikary, Prakriti; Biswas, Anirban; Mandal, Dipankar

    2016-12-01

    Composite nanofibers of Eu3+ doped poly(vinylidene fluoride-co-hexafluoropropylene) (P(VDF-HFP))/graphene are prepared by the electrospinning technique for the fabrication of ultrasensitive wearable piezoelectric nanogenerators (WPNGs) for which no post-poling step is necessary. It is found that the complete conversion to the piezoelectric β-phase and the improvement in the degree of crystallinity are governed by the incorporation of Eu3+ and graphene sheets into the P(VDF-HFP) nanofibers. The flexible nanocomposite fibers are associated with a hypersensitive electronic transition that results in an intense red light emission, and the WPNGs can detect external pressure as low as ~23 Pa with a higher degree of acoustic sensitivity (~11 V/Pa) than has previously been reported. This means that ultrasensitive WPNGs can be utilized to recognize human voices, suggesting a potential tool in the biomedical and national security sectors. The ability to charge a capacitor from abundant environmental vibrations, such as music, wind and body motion, allows the WPNGs to serve as a power source for portable electronics. This may open up the prospect of using Eu3+ doped P(VDF-HFP)/graphene composite electrospun nanofibers, with their multifunctional properties such as vibration sensitivity, wearability, red light emission capability and piezoelectric energy harvesting, for various promising applications in portable electronics, health care monitoring, noise detection and security monitoring.

  20. Pattern recognition and data mining software based on artificial neural networks applied to proton transfer in aqueous environments

    International Nuclear Information System (INIS)

    Tahat Amani; Marti Jordi; Khwaldeh Ali; Tahat Kaher

    2014-01-01

    In computational physics, proton transfer phenomena can be viewed as pattern classification problems based on a set of input features allowing classification of the proton motion into two categories: transfer 'occurred' and transfer 'not occurred'. The goal of this paper is to evaluate the use of artificial neural networks in the classification of proton transfer events, based on a feed-forward back-propagation neural network used as a classifier to distinguish between the two transfer cases. In this paper, we use a newly developed data mining and pattern recognition tool for automating, controlling, and drawing charts of the output data of an existing Empirical Valence Bond code. The study analyzes the need for pattern recognition in aqueous proton transfer processes and how the learning approach of error back-propagation (multilayer perceptron algorithms) can be satisfactorily employed in the present case. We present a tool for pattern recognition and validate the code on a real physical case study. The results of applying the artificial neural network methodology to transfer patterns based upon selected physical properties (e.g., temperature, density) show the ability of the network to learn proton transfer patterns corresponding to properties of the aqueous environments, which in turn proves to be fully compatible with previous proton transfer studies. (condensed matter: structural, mechanical, and thermal properties)
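
    The classifier the paper describes, a feed-forward network trained by error back-propagation to label each event "transfer occurred" or "not occurred", can be sketched in NumPy. The two input features below are synthetic stand-ins; in the paper they would come from the Empirical Valence Bond output (e.g. temperature, density).

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic training set: label 1 ("transfer occurred") whenever the
# two features sum to a positive value.
X = rng.normal(0, 1, size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)[:, None]

# One hidden layer (a multilayer perceptron), plain back-propagation.
W1 = rng.normal(0, 0.5, size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.5, size=(8, 1)); b2 = np.zeros(1)
lr = 0.5
for _ in range(2000):
    h = sigmoid(X @ W1 + b1)        # forward pass, hidden layer
    p = sigmoid(h @ W2 + b2)        # forward pass, output layer
    d2 = (p - y) / len(X)           # output error (cross-entropy gradient)
    d1 = (d2 @ W2.T) * h * (1 - h)  # error back-propagated to hidden layer
    W2 -= lr * (h.T @ d2); b2 -= lr * d2.sum(axis=0)
    W1 -= lr * (X.T @ d1); b1 -= lr * d1.sum(axis=0)

accuracy = ((p > 0.5) == (y > 0.5)).mean()  # training accuracy
```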

  1. Enhanced Recognition of Written Words and Enjoyment of Reading in Struggling Beginner Readers through Whole-Word Multimedia Software

    Science.gov (United States)

    Karemaker, Arjette; Pitchford, Nicola J.; O'Malley, Claire

    2010-01-01

    The effectiveness of a reading intervention using the whole-word multimedia software "Oxford Reading Tree (ORT) for Clicker" was compared to a reading intervention using traditional ORT Big Books. Developing literacy skills and attitudes towards learning to read were assessed in a group of 17 struggling beginner readers aged 5-6 years. Each child…

  2. Benefits for Voice Learning Caused by Concurrent Faces Develop over Time.

    Science.gov (United States)

    Zäske, Romi; Mühl, Constanze; Schweinberger, Stefan R

    2015-01-01

    Recognition of personally familiar voices benefits from the concurrent presentation of the corresponding speakers' faces. This effect of audiovisual integration is most pronounced for voices combined with dynamic articulating faces. However, it is unclear if learning unfamiliar voices also benefits from audiovisual face-voice integration or, alternatively, is hampered by attentional capture of faces, i.e., "face-overshadowing". In six study-test cycles we compared the recognition of newly-learned voices following unimodal voice learning vs. bimodal face-voice learning with either static (Exp. 1) or dynamic articulating faces (Exp. 2). Voice recognition accuracies significantly increased for bimodal learning across study-test cycles while remaining stable for unimodal learning, as reflected in numerical costs of bimodal relative to unimodal voice learning in the first two study-test cycles and benefits in the last two cycles. This was independent of whether faces were static images (Exp. 1) or dynamic videos (Exp. 2). In both experiments, slower reaction times to voices previously studied with faces compared to voices only may result from visual search for faces during memory retrieval. A general decrease of reaction times across study-test cycles suggests facilitated recognition with more speaker repetitions. Overall, our data suggest two simultaneous and opposing mechanisms during bimodal face-voice learning: while attentional capture of faces may initially impede voice learning, audiovisual integration may facilitate it thereafter.

  3. Tips for Healthy Voices

    Science.gov (United States)

    ... prevent voice problems and maintain a healthy voice: Drink water (stay well hydrated): Keeping your body well hydrated by drinking plenty of water each day (6-8 glasses) is essential to maintaining a healthy voice. The ...

  4. Probing echoic memory with different voices.

    Science.gov (United States)

    Madden, D J; Bastian, J

    1977-05-01

    Considerable evidence has indicated that some acoustical properties of spoken items are preserved in an "echoic" memory for approximately 2 sec. However, some of this evidence has also shown that changing the voice speaking the stimulus items has a disruptive effect on memory which persists longer than that of other acoustical variables. The present experiment examined the effect of voice changes on response bias as well as on accuracy in a recognition memory task. The task involved judging recognition probes as being present in or absent from sets of dichotically presented digits. Recognition of probes spoken in the same voice as that of the dichotic items was more accurate than recognition of different-voice probes at each of three retention intervals of up to 4 sec. Different-voice probes increased the likelihood of "absent" responses, but only up to a 1.4-sec delay. These shifts in response bias may represent a property of echoic memory which should be investigated further.

  5. Voice Onset Time in Azerbaijani Consonants

    Directory of Open Access Journals (Sweden)

    Ali Jahan

    2009-10-01

    Full Text Available Objective: Voice onset time is known to be a cue for the distinction between voiced and voiceless stops, and it can be used to describe or categorize a range of developmental, neuromotor and linguistic disorders. The aim of this study was to determine standard values of voice onset time for the Azerbaijani language (Tabriz dialect). Materials & Methods: In this descriptive-analytical study, 30 Azeri speakers, selected by convenience sampling, twice uttered 46 monosyllabic words beginning with the 6 Azerbaijani stops. Using Praat software, the voice onset time values were analyzed by waveform and wideband spectrogram in milliseconds. The effects of vowel, sex and place of articulation on VOT were evaluated, and data were analyzed with a one-way ANOVA test. Results: There was no significant difference in voice onset time between male and female Azeri speakers (P>0.05). Vowel and place of articulation had significant correlations with voice onset time (P<0.001). Voice onset time values for /b/, /p/, /d/, /t/, /g/, /k/, and the [c], [ɟ] allophones were 10.64, 86.88, 13.35, 87.09, 26.25, 100.62, 131.19 and 63.18 milliseconds, respectively. Conclusion: Voice onset time values are the same for Azerbaijani men and women. However, as in many other languages, back and high vowels and a back place of articulation lengthen VOT. Also, voiceless stops are aspirated in this language and voiced stops have positive VOT values.
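    Once the stop release (burst) and the onset of voicing have been located on the waveform, VOT is just their signed difference. A minimal sketch in Python (the study itself used Praat); the annotation times below are hypothetical illustrations, not values from the study:

```python
def vot_ms(burst_time_s, voicing_onset_s):
    """Voice onset time in milliseconds: positive when voicing begins
    after the stop release (e.g. aspirated voiceless stops), negative
    when voicing precedes the release (prevoiced stops)."""
    return (voicing_onset_s - burst_time_s) * 1000.0

# Hypothetical annotation times for a voiceless aspirated /p/:
print(vot_ms(0.120, 0.207))   # ~87 ms, in the range reported for /p/ above
# A prevoiced stop yields a negative VOT:
print(vot_ms(0.120, 0.105))   # ~-15 ms
```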

  6. Productivity, part 2: cloud storage, remote meeting tools, screencasting, speech recognition software, password managers, and online data backup.

    Science.gov (United States)

    Lackey, Amanda E; Pandey, Tarun; Moshiri, Mariam; Lalwani, Neeraj; Lall, Chandana; Bhargava, Puneet

    2014-06-01

    It is an opportune time for radiologists to focus on personal productivity. The ever increasing reliance on computers and the Internet has significantly changed the way we work. Myriad software applications are available to help us improve our personal efficiency. In this article, the authors discuss some tools that help improve collaboration and personal productivity, maximize e-learning, and protect valuable digital data. Published by Elsevier Inc.

  7. Citation and Recognition of contributions using Semantic Provenance Knowledge Captured in the OPeNDAP Software Framework

    Science.gov (United States)

    West, P.; Michaelis, J.; Lebot, T.; McGuinness, D. L.; Fox, P. A.

    2014-12-01

    Providing proper citation and attribution for published data, derived data products, and the software tools used to generate them has always been an important aspect of scientific research. However, it is often the case that this type of detailed citation and attribution is lacking. This is in part because it often requires manual markup, since dynamic generation of this type of provenance information is not typically done by the tools used to access, manipulate, transform, and visualize data. In addition, the tools themselves lack the information needed to be properly cited. The OPeNDAP Hyrax Software Framework is a tool that provides access to, and the ability to constrain, manipulate, and transform, different types of data from different data formats into a common format, the DAP (Data Access Protocol), in order to derive new data products. A user, or another software client, specifies an HTTP URL in order to access a particular piece of data and appropriately transform it to suit a specific purpose of use. The resulting data products, however, do not contain any information about what data were used to create them, or the software process used to generate them, let alone information that would allow proper citation and attribution by downstream researchers and tool developers. We will present our approach to provenance capture in Hyrax, including a mechanism that can be used to report back to the hosting site any derived products, such as publications and reports, using the W3C PROV recommendation pingback service. We will demonstrate our utilization of Semantic Web and Web standards, the development of an information model that extends the PROV model for provenance capture, and the development of the pingback service. We will present our findings, as well as our practices for providing provenance information, visualization of the provenance information, and the development of pingback services, to better enable scientists and tool developers to be

  8. ATMS software: Fuzzy Hough Transform in a hybrid algorithm for counting the overlapped etched tracks and orientation recognition

    International Nuclear Information System (INIS)

    Khayat, O.; Ghergherehchi, M.; Afarideh, H.; Durrani, S.A.; Pouyan, Ali A.; Kim, Y.S.

    2013-01-01

    A computer program named ATMS, written in MATLAB and running with a friendly interface, has been developed for recognition and parametric measurements of etched tracks in images captured from the surface of Solid State Nuclear Track Detectors. The program, using image analysis tools, counts the number of etched tracks and, depending on the current working mode, classifies them according to their radii (small object removal) or their axes (non-perpendicular or non-circular etched tracks), their mean intensity value and their orientation through the minor and major axes. Images of the detectors' surfaces are input to the code, which generates text and figure files as output, including the number of counted etched tracks with the associated track parameters, histograms and a figure showing the edge and center of detected etched tracks. The ATMS code runs hierarchically in calibration, testing and measurement modes to demonstrate reliability, repeatability and adaptability. The Fuzzy Hough Transform is used for the estimation of the number of etched tracks and their parameters, providing results even in cases where overlapping and orientation occur. The ATMS code is finally converted to a standalone file, which makes it able to run outside the MATLAB environment. - Highlights: ► Presenting a novel code named ATMS for nuclear track measurements. ► Execution in three modes for generality, adaptability and reliability. ► Using Fuzzy Hough Transform for overlapping detection and orientation recognition. ► Using DFT as a filter for noise removal process in track images. ► Processing the noisy track images and demonstration of the presented code
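    The abstract does not give the fuzzy variant's details, but the classical circle Hough transform it builds on can be sketched briefly: each edge pixel votes for every candidate track centre lying one radius away, and overlapping tracks show up as separate accumulator peaks. A minimal NumPy sketch (the ATMS program itself is MATLAB, and this omits the fuzzy membership weighting):

```python
import math
import numpy as np

def hough_circle_centers(edge_points, radius, shape, n_angles=180):
    """Classical circle Hough transform (the crisp core that the fuzzy
    variant relaxes): each edge point votes for every candidate centre
    lying `radius` pixels away from it."""
    acc = np.zeros(shape, dtype=np.int32)
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    for y, x in edge_points:
        cy = np.round(y - radius * np.sin(angles)).astype(int)
        cx = np.round(x - radius * np.cos(angles)).astype(int)
        ok = (cy >= 0) & (cy < shape[0]) & (cx >= 0) & (cx < shape[1])
        np.add.at(acc, (cy[ok], cx[ok]), 1)   # accumulate votes
    return acc

# Synthetic "etched track": edge pixels on a circle of radius 10 at (20, 25)
thetas = [2.0 * math.pi * k / 60 for k in range(60)]
pts = [(round(20 + 10 * math.sin(t)), round(25 + 10 * math.cos(t))) for t in thetas]
acc = hough_circle_centers(pts, 10, (50, 50))
print(np.unravel_index(acc.argmax(), acc.shape))   # peak at or next to (20, 25)
```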

  9. A Wireless LAN and Voice Information System for Underground Coal Mine

    OpenAIRE

    Yu Zhang; Wei Yang; Dongsheng Han; Young-Il Kim

    2014-01-01

    In this paper we constructed a wireless information system, and developed a wireless voice communication subsystem based on Wireless Local Area Networks (WLAN) for underground coal mine, which employs Voice over IP (VoIP) technology and Session Initiation Protocol (SIP) to achieve wireless voice dispatching communications. The master control voice dispatching interface and call terminal software are also developed on the WLAN ground server side to manage and implement the voice dispatching co...

  10. Voice user interface design for emerging multilingual markets

    CSIR Research Space (South Africa)

    Van Huyssteen, G

    2012-10-01

    Full Text Available Multilingual emerging markets hold many opportunities for the application of spoken language technologies, such as automatic speech recognition (ASR) or text-to-speech (TTS) technologies in interactive voice response (IVR) systems. However...

  11. Educational Technology and Student Voice: Examining Teacher Candidates' Perceptions

    Science.gov (United States)

    Byker, Erik Jon; Putman, S. Michael; Handler, Laura; Polly, Drew

    2017-01-01

    Student Voice is a term that honors the participatory roles that students have when they enter learning spaces like classrooms. Student Voice is the recognition of students' choice, creativity, and freedom. Seminal educationists--like Dewey and Montessori--centered the purposes of education in the flourishing and valuing of Student Voice. This…

  12. Simple and efficient method for region of interest value extraction from picture archiving and communication system viewer with optical character recognition software and macro program.

    Science.gov (United States)

    Lee, Young Han; Park, Eun Hae; Suh, Jin-Suck

    2015-01-01

    The objectives are: 1) to introduce a simple and efficient method for extracting region of interest (ROI) values from a Picture Archiving and Communication System (PACS) viewer using optical character recognition (OCR) software and a macro program, and 2) to evaluate the accuracy of this method with a PACS workstation. This module was designed to extract the ROI values from the images of the PACS, and was created as a development tool using open-source OCR software and an open-source macro program. The principal processes are as follows: (1) capture the region of the ROI values as a graphic file for OCR, (2) recognize the text from the captured image with the OCR software, (3) perform error correction, (4) extract the values, including area, average, standard deviation, maximum, and minimum values, from the text, (5) reformat the values into temporary strings with tabs, and (6) paste the temporary strings into the spreadsheet. This principal process was repeated for the number of ROIs. The accuracy of this module was evaluated on 1040 recognitions from 280 randomly selected ROIs of magnetic resonance images. The input times of ROIs were compared between the conventional manual method and this extraction-module-assisted input method. The module for extracting ROI values operated successfully using the OCR and macro programs. The values of the area, average, standard deviation, maximum, and minimum could be recognized and error-corrected with the AutoHotkey-coded module. The average input times using the conventional method and the proposed module-assisted method were 34.97 seconds and 7.87 seconds, respectively. A simple and efficient method for ROI value extraction was developed with open-source OCR and a macro program. Accurate inputs of various numbers from ROIs can be extracted with this module. The proposed module could be applied to the next generation of PACS or to existing PACS that have not yet been upgraded. Copyright © 2015 AUR. Published by Elsevier Inc. All rights reserved.
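    Steps (3)-(5) of the pipeline lend themselves to a compact sketch. The field labels and the OCR-confusion table below are illustrative assumptions, not the paper's actual AutoHotkey implementation:

```python
import re

# Hypothetical OCR-confusion fixes for numeric fields (O->0, l->1, ...)
OCR_FIXES = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", "S": "5"})

FIELDS = ["Area", "Average", "SD", "Max", "Min"]   # assumed viewer labels

def parse_roi_text(ocr_text):
    """Steps (3)-(5) above: error-correct raw OCR output, extract the
    five ROI statistics, and join them into one tab-separated row
    ready to paste into a spreadsheet."""
    values = []
    for field in FIELDS:
        m = re.search(rf"{field}\s*[:=]?\s*([0-9OolIS.+-]+)", ocr_text)
        values.append(m.group(1).translate(OCR_FIXES) if m else "")
    return "\t".join(values)

row = parse_roi_text("Area: 152.4  Average: 87.l  SD: 12.O  Max: 143  Min: 3S")
print(row)   # 152.4  87.1  12.0  143  35 (separated by tabs)
```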

  13. Voice parameters and videonasolaryngoscopy in children with vocal nodules: a longitudinal study, before and after voice therapy.

    Science.gov (United States)

    Valadez, Victor; Ysunza, Antonio; Ocharan-Hernandez, Esther; Garrido-Bustamante, Norma; Sanchez-Valerio, Araceli; Pamplona, Ma C

    2012-09-01

    Vocal nodules (VN) are a functional voice disorder associated with voice misuse and abuse in children. There are few reports addressing vocal parameters in children with VN, especially after a period of vocal rehabilitation. The purpose of this study is to describe measurements of vocal parameters including fundamental frequency (FF), shimmer (S), and jitter (J), videonasolaryngoscopy examination and clinical perceptual assessment, before and after voice therapy in children with VN. Voice therapy was provided using visual support through Speech-Viewer software. Twenty patients with VN were studied. An acoustical analysis of voice was performed and compared with data from subjects from a control group matched by age and gender. Also, clinical perceptual assessment of voice and videonasolaryngoscopy were performed in all patients with VN. After a period of voice therapy, provided with visual support using Speech Viewer-III (SV-III-IBM) software, new acoustical analyses, perceptual assessments and videonasolaryngoscopies were performed. Before the onset of voice therapy, there was a significant difference between patients and controls. After the therapy period, a significant improvement was found, and vocal nodules were no longer discernible on the vocal folds in any of the cases. SV-III software seems to be a safe and reliable method for providing voice therapy in children with VN. Acoustic voice parameters, perceptual data and videonasolaryngoscopy were significantly improved after the speech therapy period was completed. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
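    Jitter and shimmer have standard "local" definitions (mean cycle-to-cycle variation relative to the mean), which a short sketch can make concrete; the period and amplitude values below are illustrative, not the study's data:

```python
def local_jitter(periods_s):
    """Local jitter (%): mean absolute difference between consecutive
    pitch periods divided by the mean period."""
    diffs = [abs(a - b) for a, b in zip(periods_s, periods_s[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(periods_s) / len(periods_s))

def local_shimmer(amps):
    """Local shimmer (%): the same cycle-to-cycle measure applied to
    peak amplitudes instead of periods."""
    diffs = [abs(a - b) for a, b in zip(amps, amps[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(amps) / len(amps))

# A perfectly periodic voice has zero jitter; small cycle-to-cycle
# perturbations (values here are illustrative) give a small percentage:
print(local_jitter([0.005] * 10))                      # 0.0
print(local_jitter([0.0050, 0.0052, 0.0049, 0.0051]))  # a few percent
```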

  14. Diagnostic value of voice acoustic analysis in assessment of occupational voice pathologies in teachers.

    Science.gov (United States)

    Niebudek-Bogusz, Ewa; Fiszer, Marta; Kotylo, Piotr; Sliwinska-Kowalska, Mariola

    2006-01-01

    It has been shown that teachers are at risk of developing occupational dysphonia, which accounts for over 25% of all occupational diseases diagnosed in Poland. The most frequently used method of diagnosing voice diseases is videostroboscopy. However, to facilitate objective evaluation of voice efficiency as well as medical certification of occupational voice disorders, it is crucial to implement quantitative methods of voice assessment, particularly voice acoustic analysis. The aim of the study was to assess the results of acoustic analysis in 66 female teachers (aged 40-64 years), including 35 subjects with occupational voice pathologies (e.g., vocal nodules) and 31 subjects with functional dysphonia. The acoustic analysis was performed using the IRIS software, before and after a 30-minute vocal loading test. All participants were also subjected to laryngological and videostroboscopic examinations. After the vocal effort, the acoustic parameters displayed statistically significant abnormalities, mostly lowered fundamental frequency (F0) and incorrect values of shimmer and noise-to-harmonic ratio. To conclude, quantitative voice acoustic analysis using the IRIS software seems to be an effective complement to voice examinations, which is particularly helpful in diagnosing occupational dysphonia.

  15. Auditory Modeling for Noisy Speech Recognition

    National Research Council Canada - National Science Library

    2000-01-01

    ... digital filtering for noise cancellation which interfaces to speech recognition software. It uses auditory features in speech recognition training, and provides applications to multilingual spoken language translation...

  16. Alexa, Siri, Cortana, and More: An Introduction to Voice Assistants.

    Science.gov (United States)

    Hoy, Matthew B

    2018-01-01

    Voice assistants are software agents that can interpret human speech and respond via synthesized voices. Apple's Siri, Amazon's Alexa, Microsoft's Cortana, and Google's Assistant are the most popular voice assistants and are embedded in smartphones or dedicated home speakers. Users can ask their assistants questions, control home automation devices and media playback via voice, and manage other basic tasks such as email, to-do lists, and calendars with verbal commands. This column will explore the basic workings and common features of today's voice assistants. It will also discuss some of the privacy and security issues inherent to voice assistants and some potential future uses for these devices. As voice assistants become more widely used, librarians will want to be familiar with their operation and perhaps consider them as a means to deliver library services and materials.

  17. Voice synthesis application

    Science.gov (United States)

    Lightstone, P. C.; Davidson, W. M.

    1982-04-01

    The military detection assessment laboratory houses an experimental field system which assesses different alarm indicators such as fence disturbance sensors, MILES cables, and microwave Racons. A speech synthesis board was purchased which could be interfaced, by means of a computer, to an alarm logger, making verbal acknowledgement of alarms possible. Different products and different types of voice synthesis were analyzed before a linear predictive coding device produced by Telesensory Speech Systems of Palo Alto, California was chosen. This device is called the Speech 1000 Board and has a dedicated 8085 processor. A multiplexer card was designed and the Sp 1000 interfaced through the card into a TMS 990/100M Texas Instruments microcomputer. It was also necessary to design the software with the capability of recognizing and flagging an alarm on any 1 of 32 possible lines. The experimental field system was then packaged with a dc power supply, LED indicators, speakers, and switches, and deployed in the field, performing reliably.

  18. Voice-associated static face image releases speech from informational masking.

    Science.gov (United States)

    Gao, Yayue; Cao, Shuyang; Qu, Tianshu; Wu, Xihong; Li, Haifeng; Zhang, Jinsheng; Li, Liang

    2014-06-01

    In noisy, multi-people talking environments such as a cocktail party, listeners can use various perceptual and/or cognitive cues to improve recognition of target speech against masking, particularly informational masking. Previous studies have shown that temporally pre-presented voice cues (voice primes) improve recognition of target speech against speech masking but not noise masking. This study investigated whether static face image primes that have become target-voice associated (i.e., facial images linked through associative learning with voices reciting the target speech) can be used by listeners to unmask speech. The results showed that in 32 normal-hearing younger adults, temporally pre-presenting a voice-priming sentence with the same voice reciting the target sentence significantly improved the recognition of target speech that was masked by irrelevant two-talker speech. When a person's face photograph image became associated with the voice reciting the target speech by learning, temporally pre-presenting the target-voice-associated face image significantly improved recognition of target speech against speech masking, particularly for the last two keywords in the target sentence. Moreover, speech-recognition performance under the voice-priming condition was significantly correlated to that under the face-priming condition. The results suggest that learned facial information on talker identity plays an important role in identifying the target-talker's voice and facilitating selective attention to the target-speech stream against the masking-speech stream. © 2014 The Institute of Psychology, Chinese Academy of Sciences and Wiley Publishing Asia Pty Ltd.

  19. Advances in software development for intelligent interfaces for alarm and emergency management consoles

    International Nuclear Information System (INIS)

    Moseley, M.R.; Olson, C.E.

    1986-01-01

    Recent advances in technology allow features like voice synthesis, voice and speech recognition, image understanding, and intelligent data base management to be incorporated in computer driven alarm and emergency management information systems. New software development environments make it possible to do rapid prototyping of custom applications. Three examples using these technologies are discussed. (1) Maximum use is made of high-speed graphics and voice synthesis to implement a state-of-the-art alarm processing and display system with features that make the operator-machine interface efficient and accurate. Although very functional, this system is not portable or flexible; the software would have to be substantially rewritten for other applications. (2) An application generator which has the capability of ''building'' a specific alarm processing and display application in a matter of a few hours, using the site definition developed in the security planning phase to produce the custom application. This package is based on a standardized choice of hardware, within which it is capable of building a system to order, automatically constructing graphics, data tables, alarm prioritization rules, and interfaces to peripherals. (3) A software tool, the User Interface Management System (UIMS), is described which permits rapid prototyping of human-machine interfaces for a variety of applications including emergency management, alarm display and process information display. The object-oriented software of the UIMS achieves rapid prototyping of a new interface by standardizing to a class library of software objects instead of hardware objects.

  20. DSP Based System for Real time Voice Synthesis Applications Development

    OpenAIRE

    Arsinte, Radu; Ferencz, Attila; Miron, Costin

    2008-01-01

    This paper describes an experimental system designed for development of real time voice synthesis applications. The system is composed from a DSP coprocessor card, equipped with an TMS320C25 or TMS320C50 chip, voice acquisition module (ADDA2),host computer (IBM-PC compatible), software specific tools.

  1. Objective Voice Parameters in Colombian School Workers with Healthy Voices

    Directory of Open Access Journals (Sweden)

    Lady Catherine Cantor Cutiva

    2015-09-01

    Full Text Available Objectives: To characterize the objective voice parameters among school workers, and to identify associated factors of three objective voice parameters, namely fundamental frequency, sound pressure level and maximum phonation time. Materials and methods: We conducted a cross-sectional study among 116 Colombian teachers and 20 Colombian non-teachers. After signing the informed consent form, participants filled out a questionnaire. Then, a voice sample was recorded and evaluated perceptually by a speech therapist and by objective voice analysis with Praat software. Short-term environmental measurements of sound level, temperature, humidity, and reverberation time were conducted during visits at the workplaces, such as classrooms and offices. Linear regression analysis was used to determine associations between individual and work-related factors and objective voice parameters. Results: Compared with men, women had higher fundamental frequency (201 Hz for teachers and 209 Hz for non-teachers vs. 120 Hz for teachers and 127 Hz for non-teachers) and sound pressure level (82 dB vs. 80 dB), and shorter maximum phonation time (around 14 seconds vs. around 16 seconds). Female teachers younger than 50 years of age evidenced a significant tendency to speak with lower fundamental frequency and shorter maximum phonation time compared with female teachers older than 50 years of age. Female teachers had significantly higher fundamental frequency (66 Hz), higher sound pressure level (2 dB) and shorter phonation time (2 seconds) than male teachers. Conclusion: Female teachers younger than 50 years of age had significantly lower F0 and shorter MPT compared with those older than 50 years of age. The multivariate analysis showed that gender was a much more important determinant of variations in F0, SPL and MPT than age and teaching occupation. Objectively measured temperature also contributed to the changes in SPL among school workers.
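    Fundamental frequency, the first of the three parameters, can be estimated from a recording by autocorrelation. The sketch below is a simplified stand-in for Praat's pitch tracker, demonstrated on a synthetic tone rather than real voice data:

```python
import numpy as np

def estimate_f0(samples, sample_rate, fmin=75.0, fmax=500.0):
    """Rough autocorrelation-based F0 estimate: pick the lag with the
    strongest self-similarity inside the plausible pitch range."""
    samples = samples - np.mean(samples)
    ac = np.correlate(samples, samples, mode="full")[len(samples) - 1:]
    lo = int(sample_rate / fmax)            # shortest candidate period
    hi = int(sample_rate / fmin)            # longest candidate period
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sample_rate / lag

sr = 16000
t = np.arange(int(0.1 * sr)) / sr
tone = np.sin(2 * np.pi * 200.0 * t)        # synthetic 200 Hz "voice"
print(round(estimate_f0(tone, sr)))         # 200
```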

  2. Dimensionality in voice quality.

    Science.gov (United States)

    Bele, Irene Velsvik

    2007-05-01

    This study concerns speaking voice quality in a group of male teachers (n = 35) and male actors (n = 36), as the purpose was to investigate normal and supranormal voices. The goal was the development of a method of valid perceptual evaluation for normal to supranormal and resonant voices. The voices (text reading at two loudness levels) had been evaluated by 10 listeners, for 15 vocal characteristics using VA scales. In this investigation, the results of an exploratory factor analysis of the vocal characteristics used in this method are presented, reflecting four dimensions of major importance for normal and supranormal voices. Special emphasis is placed on the effects on voice quality of a change in the loudness variable, as two loudness levels are studied. Furthermore, the vocal characteristics Sonority and Ringing voice quality are paid special attention, as the essence of the term "resonant voice" was a basic issue throughout a doctoral dissertation where this study was included.

  3. Perceiving a stranger's voice as being one's own: a 'rubber voice' illusion?

    Directory of Open Access Journals (Sweden)

    Zane Z Zheng

    2011-04-01

    Full Text Available We describe an illusion in which a stranger's voice, when presented as the auditory concomitant of a participant's own speech, is perceived as a modified version of their own voice. When the congruence between utterance and feedback breaks down, the illusion is also broken. Compared to a baseline condition in which participants heard their own voice as feedback, hearing a stranger's voice induced robust changes in the fundamental frequency (F0) of their production. Moreover, the shift in F0 appears to be feedback dependent, since shift patterns depended reliably on the relationship between the participant's own F0 and the stranger-voice F0. The shift in F0 was evident both when the illusion was present and after it was broken, suggesting that auditory feedback from production may be used separately for self-recognition and for vocal motor control. Our findings indicate that self-recognition of voices, like other body attributes, is malleable and context dependent.

  4. Writing with Voice

    Science.gov (United States)

    Kesler, Ted

    2012-01-01

    In this Teaching Tips article, the author argues for a dialogic conception of voice, based in the work of Mikhail Bakhtin. He demonstrates a dialogic view of voice in action, using two writing examples about the same topic from his daughter, a fifth-grade student. He then provides five practical tips for teaching a dialogic conception of voice in…

  5. Marshall’s Voice

    Directory of Open Access Journals (Sweden)

    Halper Thomas

    2017-12-01

    Full Text Available Most judicial opinions, for a variety of reasons, do not speak with the voice of identifiable judges, but an analysis of several of John Marshall’s best known opinions reveals a distinctive voice, with its characteristic language and style of argumentation. The power of this voice helps to account for the influence of his views.

  6. Users’ Perceived Difficulties and Corresponding Reformulation Strategies in Google Voice Search

    Directory of Open Access Journals (Sweden)

    Wei Jeng

    2016-06-01

    Full Text Available In this article, we report users' perceptions of query input errors and query reformulation strategies in voice search, using data collected through a laboratory user study. Our results reveal that: (1) users' perceived obstacles during a voice search can be related to speech recognition errors and topic complexity; (2) users naturally develop different strategies to deal with various types of words (e.g., acronyms, single-worded queries, non-English words) with high error rates in speech recognition; and (3) users can have various emotional reactions when encountering voice input errors, and they develop preferred usage occasions for voice search.

  7. Controlling An Electric Car Starter System Through Voice

    Directory of Open Access Journals (Sweden)

    A.B. Muhammad Firdaus

    2015-04-01

    Full Text Available These days the automobile has become one of the most common modes of transportation, because many Malaysians can afford to own a car. Many technologies are available in cars on the market; one of them is the voice-controlled system. Voice recognition is the process of automatically recognizing a certain statement spoken by a specific speaker, based on individual information contained in speech waves. This paper describes making a car controlled by the human voice. An essential pre-processing step in voice recognition systems is to detect the presence of noise. Sensitivity to speech variability, insufficient recognition accuracy, and vulnerability to mimicry are among the main technical obstacles that prevent the widespread adoption of speech-based recognition systems. Voice recognition systems work reasonably well in quiet conditions but poorly under noisy conditions or in distorted channels. The key focus of the project is to control an electric car starter system.

  8. METHODS FOR QUALITY ENHANCEMENT OF USER VOICE SIGNAL IN VOICE AUTHENTICATION SYSTEMS

    Directory of Open Access Journals (Sweden)

    O. N. Faizulaieva

    2014-03-01

    Full Text Available The reasonability of using the computer system user's voice in the authentication process is proved. The scientific task of improving the signal/noise ratio of the user's voice signal in the authentication system is considered. The object of study is the process of input and output of the voice signal of the authentication system user in computer systems and networks. Methods and means for the input and extraction of the voice signal against external interference signals are researched. Methods for quality enhancement of the user's voice signal in voice authentication systems are suggested. As modern computer facilities, including mobile ones, have two-channel audio cards, the usage of two microphones is proposed for the voice signal input system of the authentication system. Meanwhile, the task of forming a lobe of the microphone array in the desired area of voice signal registration (100 Hz to 8 kHz) is solved. The directional properties of the proposed microphone array reduce the influence of external interference signals by a factor of two to three in the frequency range from 4 to 8 kHz. The possibilities for implementation of space-time processing of the recorded signals using constant and adaptive weighting factors are investigated. Simulation results of the proposed system for input and extraction of signals during digital processing of narrowband signals are presented. The proposed solutions make it possible to improve the signal/noise ratio of the recorded useful signals by 10 to 20 dB under the influence of external interference signals in the frequency range from 4 to 8 kHz. The results may be useful to specialists working in the field of voice recognition and speaker discrimination.
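    The two-microphone arrangement described here is essentially delay-and-sum beamforming: align the channels on the wanted talker and average them, so that spatially uncorrelated interference partially cancels. A toy simulation (not the authors' space-time processing; the delays and noise levels are made up) shows the effect:

```python
import numpy as np

def delay_and_sum(mic1, mic2, delay_samples):
    """Two-microphone delay-and-sum: advance the second channel by the
    wanted source's inter-microphone delay, then average. The wanted
    signal adds coherently; spatially uncorrelated noise does not."""
    return 0.5 * (mic1 + np.roll(mic2, -delay_samples))

rng = np.random.default_rng(0)
n, d = 4000, 8                           # d: source delay between mics (samples)
speech = rng.standard_normal(n)          # stand-in for the wanted voice signal
mic1 = speech + 0.5 * rng.standard_normal(n)
mic2 = np.roll(speech, d) + 0.5 * rng.standard_normal(n)
out = delay_and_sum(mic1, mic2, d)

gain_db = 10 * np.log10(np.var(mic1 - speech) / np.var(out - speech))
print(round(float(gain_db), 1))          # close to 3 dB for uncorrelated noise
```

    Averaging M microphones with uncorrelated noise gives roughly 10·log10(M) dB of SNR gain, so two channels yield about 3 dB; the larger 10-20 dB figures above come from the array's directionality against off-axis interferers, which this toy model does not capture.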

  9. Singing voice outcomes following singing voice therapy.

    Science.gov (United States)

    Dastolfo-Hromack, Christina; Thomas, Tracey L; Rosen, Clark A; Gartner-Schmidt, Jackie

    2016-11-01

    The objectives of this study were to describe singing voice therapy (SVT), describe referred patient characteristics, and document the outcomes of SVT. Retrospective. Records of patients receiving SVT between June 2008 and June 2013 were reviewed (n = 51). All diagnoses were included. Demographic information, number of SVT sessions, and symptom severity were retrieved from the medical record. Symptom severity was measured via the 10-item Singing Voice Handicap Index (SVHI-10). Treatment outcome was analyzed by diagnosis, history of previous training, and SVHI-10. SVHI-10 scores decreased following SVT (mean change = 11, a 40% decrease). Patients with previous singing lessons (n = 10) also completed an average of three SVT sessions. Primary muscle tension dysphonia (MTD1) and benign vocal fold lesion (lesion) were the most common diagnoses. Most patients (60%) had previous vocal training. The SVHI-10 decrease was not significantly different between MTD and lesion. This is the first outcome-based study of SVT in a disordered population. Diagnosis of MTD or lesion did not influence treatment outcomes. Duration of SVT was short (approximately three sessions). Voice care providers are encouraged to partner with a singing voice therapist to provide optimal care for the singing voice. This study supports the use of SVT as a tool for the treatment of singing voice disorders. 4 Laryngoscope, 126:2546-2551, 2016. © 2016 The American Laryngological, Rhinological and Otological Society, Inc.

  10. Face the voice

    DEFF Research Database (Denmark)

    Lønstrup, Ansa

    2014-01-01

    will be based on a reception aesthetic and phenomenological approach, the latter as presented by Don Ihde in his book Listening and Voice: Phenomenologies of Sound, and my analytical sketches will be related to theoretical statements concerning the understanding of voice and media (Cavarero, Dolar, LaBelle, Neumark). Finally, the article will discuss the specific artistic combination and our auditory experience of mediated human voices and sculpturally projected faces in an art museum context under the general conditions of the societal panophonia of disembodied and mediated voices, as promoted by Steven

  11. The Army word recognition system

    Science.gov (United States)

    Hadden, David R.; Haratz, David

    1977-01-01

    The application of speech recognition technology in the Army command and control area is presented. The problems associated with this program are described, as well as its relevance in terms of the man/machine interactions, voice inflexions, and the amount of training needed to interact with and utilize the automated system.

  12. Developing and modeling of voice control system for prosthetic robot arm in medical systems

    Directory of Open Access Journals (Sweden)

    Koksal Gundogdu

    2018-04-01

    Full Text Available In parallel with the development of technology, various control methods are also developed. Voice control is one of these methods. In this study, an effective model building upon mathematical models used in the literature is constructed, and a voice control system is developed to control prosthetic robot arms. The developed control system has been applied to a four-jointed RRRR robot arm. Implementation tests were performed on the designed system. As a result of the tests, it was observed that the technique used in our system achieves about 11% more efficient voice recognition than techniques currently used in the literature. With the improved mathematical modelling, it has been shown that voice commands can be used effectively to control the prosthetic robot arm. Keywords: Voice recognition model, Voice control, Prosthetic robot arm, Robotic control, Forward kinematic
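
    The forward-kinematics step named in the keywords above can be sketched for a planar four-revolute (RRRR) arm. The link lengths and the command-to-angle mapping below are illustrative assumptions, not values from the paper:

```python
import math

# Illustrative link lengths in metres for a planar RRRR arm (assumed values).
LINKS = [0.30, 0.25, 0.20, 0.10]

def forward_kinematics(joint_angles, links=LINKS):
    """Planar forward kinematics: each revolute joint adds its angle to the
    running orientation, and each link advances the end point along it."""
    x = y = 0.0
    theta = 0.0
    for angle, length in zip(joint_angles, links):
        theta += angle
        x += length * math.cos(theta)
        y += length * math.sin(theta)
    return x, y, theta

# A hypothetical mapping from recognized voice commands to joint-angle
# increments (radians); the study's actual command set is not given here.
COMMANDS = {
    "up": (0.1, 0.0, 0.0, 0.0),
    "down": (-0.1, 0.0, 0.0, 0.0),
}
```

    With all joint angles at zero the end effector lies on the x-axis at the sum of the link lengths, which is a quick sanity check for any such model.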

  13. Voice-activated intelligent radiologic image display

    International Nuclear Information System (INIS)

    Fisher, P.

    1989-01-01

    The authors present a computer-based expert computer system called Mammo-Icon, which automatically assists the radiologist's case analysis by reviewing the trigger phrase output of a commercially available voice transcription system in the domain of mammography. A commercially available PC-based voice dictation system is coupled to an expert system implemented on a microcomputer. Software employs the LISP and C computer languages. Mammo-Icon responds to the trigger phrase output of a voice dictation system with a textual discussion of the potential significance of the findings that have been described and a display of reference images that may help the radiologist to confirm a suspected diagnosis or consider additional diagnoses. This results in automatic availability of potentially useful computer-based expert advice, making such systems much more likely to be used in routine clinical practice.

  14. Voice Response Systems Technology.

    Science.gov (United States)

    Gerald, Jeanette

    1984-01-01

    Examines two methods of generating synthetic speech in voice response systems, which allow computers to communicate in human terms (speech), using human interface devices (ears): phoneme and reconstructed voice systems. Considerations prior to implementation, current and potential applications, glossary, directory, and introduction to Input Output…

  15. Clinical Voices - an update

    DEFF Research Database (Denmark)

    Fusaroli, Riccardo; Weed, Ethan

    Anomalous aspects of speech and voice, including pitch, fluency, and voice quality, are reported to characterise many mental disorders. However, it has proven difficult to quantify and explain this oddness of speech by employing traditional statistical methods. In this talk we will show how...

  16. Automatic speech recognition for radiological reporting

    International Nuclear Information System (INIS)

    Vidal, B.

    1991-01-01

    Large vocabulary speech recognition, its techniques and its software and hardware technology, are being developed, aimed at providing the office user with a tool that could significantly improve both the quantity and quality of his work: the dictation machine, which allows memos and documents to be input using voice and a microphone instead of fingers and a keyboard. The IBM Rome Science Center, together with the IBM Research Division, has built a prototype recognizer that accepts sentences in natural language from a 20,000-word Italian vocabulary. The unit runs on a personal computer equipped with special hardware capable of providing all the necessary computing power. The first laboratory experiments yielded very interesting results and revealed system characteristics that make its use possible in operational environments. To this purpose, the dictation of medical reports was considered a suitable application. In cooperation with the 2nd Radiology Department of S. Maria della Misericordia Hospital (Udine, Italy), the system was tried by radiology department doctors during their everyday work. The doctors were able to dictate their reports directly to the unit. The text appeared immediately on the screen, and any errors could be corrected either by voice or by using the keyboard. At the end of report dictation, the doctors could both print and archive the text. The report could also be forwarded to the hospital information system, when the latter was available. Our results have been very encouraging: the system proved to be robust, simple to use, and accurate (over 95% average recognition rate). The experiment was valuable for suggestions and comments, and its results are useful for system evolution towards improved system management and efficiency.

  17. EXPERIMENTAL STUDY OF FIRMWARE FOR INPUT AND EXTRACTION OF USER’S VOICE SIGNAL IN VOICE AUTHENTICATION SYSTEMS

    Directory of Open Access Journals (Sweden)

    O. N. Faizulaieva

    2014-09-01

    Full Text Available The scientific task of improving the signal-to-noise ratio for the user's voice signal in computer systems and networks during voice authentication is considered. The object of study is the process of input and extraction of the voice signal of an authentication system user in computer systems and networks. Methods and means for input and extraction of the voice signal against background external interference signals are investigated. Ways of improving the quality of the user's voice signal in voice authentication systems are investigated experimentally. Firmware for an experimental unit for input and extraction of the user's voice signal under external interference is considered. As modern computing devices, including mobile ones, have a two-channel audio card, two microphones are used for voice signal input. The distance between the sonic-wave sensors is 20 mm, which provides for forming a single directional lobe of the microphone array in the desired area of voice signal registration (from 100 Hz to 8 kHz). According to the results of the experimental studies, the use of the directional properties of the proposed microphone array and space-time processing of the recorded signals with constant and adaptive weighting factors has made it possible to reduce the influence of interference signals considerably. The results of the firmware experiments on input and extraction of the user's voice signal under external interference are shown. The proposed solutions make it possible to improve the signal-to-noise ratio of the recorded useful signals by up to 20 dB under external interference in the frequency range from 4 to 8 kHz. The results may be useful to specialists working in the fields of voice recognition and speaker discrimination.
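
    The two-microphone space-time processing described above can be illustrated with the simplest fixed-weight case, a delay-and-sum beamformer: aligning the channels on the known arrival delay makes the voice add in amplitude while the independent noise adds only in power. This is a toy sketch with synthetic signals and an assumed delay, not the paper's adaptive firmware:

```python
import math, random

RATE = 16000   # sample rate in Hz (assumed for the illustration)
DELAY = 3      # inter-microphone arrival delay in samples (assumed)

def make_channels(n=4000, seed=7):
    """Two synthetic microphone channels: the same 440 Hz tone arriving
    DELAY samples apart, plus independent Gaussian noise per microphone."""
    rng = random.Random(seed)
    tone = [math.sin(2 * math.pi * 440 * t / RATE) for t in range(n + DELAY)]
    ch0 = [tone[t] + rng.gauss(0.0, 0.5) for t in range(n)]
    ch1 = [tone[t + DELAY] + rng.gauss(0.0, 0.5) for t in range(n)]
    return ch0, ch1, tone

def snr_db(sig, ref):
    """Signal-to-noise ratio of `sig` against the clean reference `ref`."""
    s = sum(r * r for r in ref)
    e = sum((a - b) ** 2 for a, b in zip(sig, ref))
    return 10.0 * math.log10(s / e)

def delay_and_sum(ch0, ch1, delay):
    """Align the channels on the steering delay and average them: the
    coherent voice adds in amplitude, the independent noise in power."""
    return [(ch0[k + delay] + ch1[k]) / 2.0 for k in range(len(ch0) - delay)]

ch0, ch1, tone = make_channels()
ref = [tone[k + DELAY] for k in range(len(ch0) - DELAY)]
single = snr_db(ch1[:len(ref)], ref)
beamed = snr_db(delay_and_sum(ch0, ch1, DELAY), ref)
```

    For two microphones with uncorrelated noise this fixed-weight case yields roughly a 3 dB gain; the adaptive weighting reported in the study is what pushes the improvement further.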

  18. Onset and Maturation of Fetal Heart Rate Response to the Mother's Voice over Late Gestation

    Science.gov (United States)

    Kisilevsky, Barbara S.; Hains, Sylvia M. J.

    2011-01-01

    Background: Term fetuses discriminate their mother's voice from a female stranger's, suggesting recognition/learning of some property of her voice. Identification of the onset and maturation of the response would increase our understanding of the influence of environmental sounds on the development of sensory abilities and identify the period when…

  19. The effect of voice onset time differences on lexical access in Dutch

    NARCIS (Netherlands)

    Alphen, P.M. van; McQueen, J.M.

    2006-01-01

    Effects on spoken-word recognition of prevoicing differences in Dutch initial voiced plosives were examined. In 2 cross-modal identity-priming experiments, participants heard prime words and nonwords beginning with voiced plosives with 12, 6, or 0 periods of prevoicing or matched items beginning

  20. Culture/Religion and Identity: Social Justice versus Recognition

    Science.gov (United States)

    Bekerman, Zvi

    2012-01-01

    Recognition is the main word attached to multicultural perspectives. The multicultural call for recognition, the one calling for the recognition of cultural minorities and identities, the one now voiced by liberal states all over and also in Israel was a more difficult one. It took the author some time to realize that calling for the recognition…

  1. SURVEY OF BIOMETRIC SYSTEMS USING IRIS RECOGNITION

    OpenAIRE

    S.PON SANGEETHA; DR.M.KARNAN

    2014-01-01

    Security plays an important role in any type of organization in today's life. Iris recognition is one of the leading automatic biometric systems in the area of security, used to identify individual persons. Biometric systems include fingerprints, facial features, voice recognition, hand geometry, handwriting, the eye retina, and the most secure one, presented in this paper: iris recognition. Biometric systems have become very popular in security systems because it is not possi...

  2. Assessment voice synthesizers for reading in digital books

    Directory of Open Access Journals (Sweden)

    Sérvulo Fernandes da Silva Neto

    2013-07-01

    Full Text Available Digital accessibility covers ways of accessing information in digital media that help people with different types of disabilities interact better with the computer, regardless of their limitations. Among these tools are voice synthesizers, which are meant to simplify access to any knowledge recorded through digital technologies. However, such tools originally emerged in foreign-language countries, which leads to the following research problem: are voice synthesizers appropriate for reading digital books in Portuguese? The objective of this study was to analyze and rank different voice synthesizer tools in combination with digital book reader software to support accessibility to e-books in Portuguese. Through a literature review, voice synthesizer applications were identified, composing the sample analyzed in this work. A simplified version of the method of Multiple Criteria Decision Support (MMDA) was used to assess them. The research considered 12 e-book readers and 11 voice synthesizers, tested with six e-book formats (E-pub, PDF, HTML, DOC, TXT, and Mobi). According to the results, the software Virtual Vision achieved the highest score. Regarding formats, PDF achieved the best score when the results of the three synthesizers were summed. Within the universe studied, it was found that many synthesizers simply cannot be used because they do not support the Portuguese language.

  3. Using voice to create hospital progress notes: Description of a mobile application and supporting system integrated with a commercial electronic health record.

    Science.gov (United States)

    Payne, Thomas H; Alonso, W David; Markiel, J Andrew; Lybarger, Kevin; White, Andrew A

    2018-01-01

    We describe the development and design of a smartphone app-based system to create inpatient progress notes using voice, commercial automatic speech recognition software, with text processing to recognize spoken voice commands and format the note, and integration with a commercial EHR. This new system fits hospital rounding workflow and was used to support a randomized clinical trial testing whether use of voice to create notes improves timeliness of note availability, note quality, and physician satisfaction with the note creation process. The system was used to create 709 notes which were placed in the corresponding patient's EHR record. The median time from pressing the Send button to appearance of the formatted note in the Inbox was 8.8 min. It was generally very reliable, accepted by physician users, and secure. This approach provides an alternative to use of keyboard and templates to create progress notes and may appeal to physicians who prefer voice to typing. Copyright © 2017 Elsevier Inc. All rights reserved.
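
    The text-processing step described above, recognizing spoken voice commands in the ASR output and formatting the note, can be illustrated with a toy post-processor. The command vocabulary below is hypothetical; the study does not publish its actual command set:

```python
import re

# Hypothetical spoken formatting commands -> layout actions.
COMMANDS = {
    "new paragraph": "\n\n",
    "new line": "\n",
}

def format_note(transcript: str) -> str:
    """Replace spoken formatting commands in an ASR transcript with the
    corresponding layout, tolerating extra whitespace and case."""
    text = transcript
    for spoken, action in COMMANDS.items():
        # Match the command as whole words separated by any whitespace.
        pattern = r"\s*\b" + r"\s+".join(spoken.split()) + r"\b\s*"
        text = re.sub(pattern, action, text, flags=re.IGNORECASE)
    return text.strip()
```

    A real system would also need to handle commands that appear as genuine dictated content, which is one reason such post-processing is paired with careful command phrasing.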

  4. Voice following radiotherapy

    International Nuclear Information System (INIS)

    Stoicheff, M.L.

    1975-01-01

    This study was undertaken to provide information on the voice of patients following radiotherapy for glottic cancer. Part I presents findings from questionnaires returned by 227 of 235 patients successfully irradiated for glottic cancer from 1960 through 1971. Part II presents preliminary findings on the speaking fundamental frequencies of 22 irradiated patients. Normal to near-normal voice was reported by 83 percent of the 227 patients; however, 80 percent did indicate persisting vocal difficulties such as fatiguing of voice with much usage, inability to sing, reduced loudness, hoarse voice quality and inability to shout. Amount of talking during treatments appeared to affect length of time for voice to recover following treatments in those cases where it took from nine to 26 weeks; also, with increasing years since treatment, patients rated their voices more favorably. Smoking habits following treatments improved significantly with only 27 percent smoking heavily as compared with 65 percent prior to radiation therapy. No correlation was found between smoking (during or after treatments) and vocal ratings or between smoking and length of time for voice to recover. There was no relationship found between reported vocal ratings and stage of the disease

  5. Voice Savers for Music Teachers

    Science.gov (United States)

    Cookman, Starr

    2012-01-01

    Music teachers are in a class all their own when it comes to voice use. These elite vocal athletes require stamina, strength, and flexibility from their voices day in, day out for hours at a time. Voice rehabilitation clinics and research show that music education ranks high among the professionals most commonly affected by voice problems.…

  6. Gender recognition from vocal source

    Science.gov (United States)

    Sorokin, V. N.; Makarov, I. S.

    2008-07-01

    Efficiency of automatic recognition of male and female voices based on solving the inverse problem for glottis area dynamics and for waveform of the glottal airflow volume velocity pulse is studied. The inverse problem is regularized through the use of analytical models of the voice excitation pulse and of the dynamics of the glottis area, as well as the model of one-dimensional glottal airflow. Parameters of these models and spectral parameters of the volume velocity pulse are considered. The following parameters are found to be most promising: the instant of maximum glottis area, the maximum derivative of the area, the slope of the spectrum of the glottal airflow volume velocity pulse, the amplitude ratios of harmonics of this spectrum, and the pitch. On the plane of the first two main components in the space of these parameters, an almost twofold decrease in the classification error relative to that for the pitch alone is attained. The male voice recognition probability is found to be 94.7%, and the female voice recognition probability is 95.9%.
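
    Classification on a small set of voice-source parameters, as in the study above, can be sketched with a nearest-centroid classifier over two of the named features, pitch and spectral slope. The numeric values here are illustrative assumptions, not the study's data, and the classifier is a stand-in for whatever discriminant the authors used on their principal components:

```python
import math

# Illustrative (pitch in Hz, spectral slope in dB/octave) training points --
# assumed values, not measurements from the study.
TRAIN = {
    "male":   [(110.0, -12.0), (125.0, -11.0), (140.0, -11.5)],
    "female": [(210.0, -8.0), (225.0, -9.0), (240.0, -8.5)],
}

def centroid(points):
    """Mean point of a list of 2-D feature vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(2))

CENTROIDS = {label: centroid(pts) for label, pts in TRAIN.items()}

def classify(feature_vec, centroids=CENTROIDS):
    """Assign the label whose centroid is nearest in feature space."""
    return min(centroids,
               key=lambda lab: math.dist(feature_vec, centroids[lab]))
```

    In practice the features would be variance-normalized first, since raw pitch dominates the Euclidean distance; the study's point is precisely that adding source parameters beyond pitch roughly halves the classification error.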

  7. Voice-to-Phoneme Conversion Algorithms for Voice-Tag Applications in Embedded Platforms

    Directory of Open Access Journals (Sweden)

    Yan Ming Cheng

    2008-08-01

    Full Text Available We describe two voice-to-phoneme conversion algorithms for speaker-independent voice-tag creation specifically targeted at applications on embedded platforms. These algorithms (batch mode and sequential) are compared in speech recognition experiments where they are first applied in a same-language context in which both acoustic model training and voice-tag creation and application are performed on the same language. Then, their performance is tested in a cross-language setting where the acoustic models are trained on a particular source language while the voice-tags are created and applied on a different target language. In the same-language environment, both algorithms either perform comparably to or significantly better than the baseline where utterances are manually transcribed by a phonetician. In the cross-language context, the voice-tag performances vary depending on the source-target language pair, with the variation reflecting predicted phonological similarity between the source and target languages. Among the most similar languages, performance nears that of the native-trained models and surpasses the native reference baseline.

  8. Voice - How humans communicate?

    Science.gov (United States)

    Tiwari, Manjul; Tiwari, Maneesha

    2012-01-01

    Voices are important things for humans. They are the medium through which we do a lot of communicating with the outside world: our ideas, of course, and also our emotions and our personality. The voice is the very emblem of the speaker, indelibly woven into the fabric of speech. In this sense, each of our utterances of spoken language carries not only its own message but also, through accent, tone of voice and habitual voice quality, an audible declaration of our membership of particular social and regional groups, of our individual physical and psychological identity, and of our momentary mood. Voices are also one of the media through which we (successfully, most of the time) recognize other humans who are important to us: members of our family, media personalities, our friends, and enemies. Although evidence from DNA analysis is potentially vastly more eloquent in its power than evidence from voices, DNA cannot talk. It cannot be recorded planning, carrying out or confessing to a crime. It cannot be so apparently directly incriminating. As will quickly become evident, voices are extremely complex things, and some of the inherent limitations of the forensic-phonetic method are in part a consequence of the interaction between their complexity and the real world in which they are used. It is one of the aims of this article to explain how this comes about. This subject has unsolved questions, but there is no direct way to present the information that is necessary to understand how voices can be related, or not, to their owners.

  9. Voice Quality Measuring Setup with Automatic Voice over IP Call Generator and Lawful Interception Packet Analyzer

    Directory of Open Access Journals (Sweden)

    PLEVA Matus

    Full Text Available This paper describes a packet-measuring laboratory setup that could also be used for lawful interception applications, using a professional packet analyzer, a Voice over IP call generator, a free call server (an Asterisk Linux setup), and the appropriate software and hardware described below. This setup was used for measuring the quality of automatically generated VoIP calls under stressed network conditions, when the call manager server was flooded with high-bandwidth traffic near the bandwidth limit of the connected switch. The call generator realizes 30 simultaneous calls, and the packet capturer and analyzer can decode the VoIP traffic, extract RTP session data, automatically analyze the voice quality using standardized MOS (Mean Opinion Score) values, and identify the source of the voice degradation (jitter, packet loss, codec, delay, etc.).
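
    Deriving a MOS estimate from measured network impairments is commonly done with the ITU-T G.107 E-model: impairments reduce a transmission rating factor R, which is then mapped to MOS. The R-to-MOS mapping below is the standard G.107 formula; the simplified R computation, starting from the default R0 of about 93.2 and subtracting assumed delay and loss impairments, is a sketch, not the analyzer's exact algorithm:

```python
def r_to_mos(r: float) -> float:
    """ITU-T G.107 mapping from transmission rating factor R to MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6

def estimate_r(delay_impairment: float, loss_impairment: float) -> float:
    """Very simplified R estimate: default R0 of ~93.2 minus the delay (Id)
    and equipment/loss (Ie-eff) impairments, in G.107 terms. How those
    impairments are computed from jitter and packet loss is assumed away."""
    return 93.2 - delay_impairment - loss_impairment
```

    With no impairments this yields the familiar ceiling of roughly MOS 4.4 for narrowband telephony, and the score falls off steeply as R drops.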

  10. Connections between voice ergonomic risk factors and voice symptoms, voice handicap, and respiratory tract diseases.

    Science.gov (United States)

    Rantala, Leena M; Hakala, Suvi J; Holmqvist, Sofia; Sala, Eeva

    2012-11-01

    The aim of the study was to investigate the connections between voice ergonomic risk factors found in classrooms and voice-related problems in teachers. Voice ergonomic assessment was performed in 39 classrooms in 14 elementary schools by means of a Voice Ergonomic Assessment in Work Environment--Handbook and Checklist. The voice ergonomic risk factors assessed included working culture, noise, indoor air quality, working posture, stress, and access to a sound amplifier. Teachers from the above-mentioned classrooms reported their voice symptoms, respiratory tract diseases, and completed a Voice Handicap Index (VHI). The more voice ergonomic risk factors found in the classroom the higher were the teachers' total scores on voice symptoms and VHI. Stress was the factor that correlated most strongly with voice symptoms. Poor indoor air quality increased the occurrence of laryngitis. Voice ergonomics were poor in the classrooms studied and voice ergonomic risk factors affected the voice. It is important to convey information on voice ergonomics to education administrators and those responsible for school planning and taking care of school buildings. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  11. SPEECH EMOTION RECOGNITION USING MODIFIED QUADRATIC DISCRIMINATION FUNCTION

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    Quadratic Discrimination Function (QDF) is commonly used in speech emotion recognition, which proceeds on the premise that the input data are normally distributed. In this paper, we propose a transformation to normalize the emotional features, then derive a Modified QDF (MQDF) for speech emotion recognition. Features based on prosody and voice quality are extracted, and a Principal Component Analysis Neural Network (PCANN) is used to reduce the dimension of the feature vectors. The results show that voice quality features are an effective supplement for recognition, and that the method in this paper can improve the recognition rate effectively.

  12. Automatic Speaker Recognition for Mobile Forensic Applications

    Directory of Open Access Journals (Sweden)

    Mohammed Algabri

    2017-01-01

    Full Text Available Presently, lawyers, law enforcement agencies, and judges in courts use speech and other biometric features to recognize suspects. In general, speaker recognition is used for discriminating people based on their voices. The process of determining, if a suspected speaker is the source of trace, is called forensic speaker recognition. In such applications, the voice samples are most probably noisy, the recording sessions might mismatch each other, the sessions might not contain sufficient recording for recognition purposes, and the suspect voices are recorded through mobile channel. The identification of a person through his voice within a forensic quality context is challenging. In this paper, we propose a method for forensic speaker recognition for the Arabic language; the King Saud University Arabic Speech Database is used for obtaining experimental results. The advantage of this database is that each speaker’s voice is recorded in both clean and noisy environments, through a microphone and a mobile channel. This diversity facilitates its usage in forensic experimentations. Mel-Frequency Cepstral Coefficients are used for feature extraction and the Gaussian mixture model-universal background model is used for speaker modeling. Our approach has shown low equal error rates (EER, within noisy environments and with very short test samples.
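
    The scoring idea behind the MFCC plus GMM-UBM pipeline named above is to compare the likelihood of a test utterance's feature frames under a speaker model. As a drastically simplified stand-in, the sketch below fits a single diagonal-covariance Gaussian per speaker (a real GMM-UBM uses a mixture adapted from a background model) over synthetic MFCC-like frames:

```python
import math, random

def fit_diag_gaussian(frames):
    """Fit one diagonal-covariance Gaussian to feature frames -- a
    simplification of the GMM-UBM speaker model used in the paper."""
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / len(frames) for d in range(dim)]
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / len(frames), 1e-6)
           for d in range(dim)]
    return mean, var

def log_likelihood(frames, model):
    """Average per-frame log-likelihood of frames under the model."""
    mean, var = model
    ll = 0.0
    for f in frames:
        for d in range(len(mean)):
            ll += -0.5 * (math.log(2 * math.pi * var[d])
                          + (f[d] - mean[d]) ** 2 / var[d])
    return ll / len(frames)

def sample(rng, mean, n=200):
    """Synthetic 'MFCC-like' frames around a speaker-specific mean."""
    return [[rng.gauss(m, 1.0) for m in mean] for _ in range(n)]

rng = random.Random(0)
spk_a = sample(rng, [0.0, 2.0, -1.0])        # enrollment data, speaker A
spk_b = sample(rng, [3.0, -2.0, 1.0])        # enrollment data, speaker B
model_a = fit_diag_gaussian(spk_a)
test_a = sample(rng, [0.0, 2.0, -1.0], n=50) # new utterance from speaker A
```

    Verification then reduces to a likelihood-ratio style comparison: the genuine speaker's frames should score higher under their own model than an impostor's frames do.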

  13. A Single Case Design Evaluation of a Software and Tutor Intervention Addressing Emotion Recognition and Social Interaction in Four Boys with ASD

    Science.gov (United States)

    Lacava, Paul G.; Rankin, Ana; Mahlios, Emily; Cook, Katie; Simpson, Richard L.

    2010-01-01

    Many students with Autism Spectrum Disorders (ASD) have delays learning to recognize emotions. Social behavior is also challenging, including initiating interactions, responding to others, developing peer relationships, and so forth. In this single case design study we investigated the relationship between use of computer software ("Mind Reading:…

  14. Relationship Between Voice and Motor Disabilities of Parkinson's Disease.

    Science.gov (United States)

    Majdinasab, Fatemeh; Karkheiran, Siamak; Soltani, Majid; Moradi, Negin; Shahidi, Gholamali

    2016-11-01

    To evaluate voice of Iranian patients with Parkinson's disease (PD) and find any relationship between motor disabilities and acoustic voice parameters as speech motor components. We evaluated 27 Farsi-speaking PD patients and 21 age- and sex-matched healthy persons as control. Motor performance was assessed by the Unified Parkinson's Disease Rating Scale part III and Hoehn and Yahr rating scale in the "on" state. Acoustic voice evaluation, including fundamental frequency (f0), standard deviation of f0, minimum of f0, maximum of f0, shimmer, jitter, and harmonic to noise ratio, was done using the Praat software via /a/ prolongation. No difference was seen between the voice of the patients and the voice of the controls. f0 and its variation had a significant correlation with the duration of the disease, but did not have any relationships with the Unified Parkinson's Disease Rating Scale part III. Only limited relationship was observed between voice and motor disabilities. Tremor is an important main feature of PD that affects motor and phonation systems. Females had an older age at onset, more prolonged disease, and more severe motor disabilities (not statistically significant), but phonation disorders were more frequent in males and showed more relationship with severity of motor disabilities. Voice is affected by PD earlier than many other motor components and is more sensitive to disease progression. Tremor is the most effective part of PD that impacts voice. PD has more effect on voice of male versus female patients. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  15. Cost-Sensitive Learning for Emotion Robust Speaker Recognition

    Directory of Open Access Journals (Sweden)

    Dongdong Li

    2014-01-01

    Full Text Available In the field of information security, voice is one of the most important parts in biometrics. Especially, with the development of voice communication through the Internet or telephone system, huge voice data resources are accessed. In speaker recognition, voiceprint can be applied as the unique password for the user to prove his/her identity. However, speech with various emotions can cause an unacceptably high error rate and aggravate the performance of speaker recognition system. This paper deals with this problem by introducing a cost-sensitive learning technology to reweight the probability of test affective utterances in the pitch envelop level, which can enhance the robustness in emotion-dependent speaker recognition effectively. Based on that technology, a new architecture of recognition system as well as its components is proposed in this paper. The experiment conducted on the Mandarin Affective Speech Corpus shows that an improvement of 8% identification rate over the traditional speaker recognition is achieved.

  16. Cost-sensitive learning for emotion robust speaker recognition.

    Science.gov (United States)

    Li, Dongdong; Yang, Yingchun; Dai, Weihui

    2014-01-01

    In the field of information security, voice is one of the most important parts in biometrics. Especially, with the development of voice communication through the Internet or telephone system, huge voice data resources are accessed. In speaker recognition, voiceprint can be applied as the unique password for the user to prove his/her identity. However, speech with various emotions can cause an unacceptably high error rate and aggravate the performance of speaker recognition system. This paper deals with this problem by introducing a cost-sensitive learning technology to reweight the probability of test affective utterances in the pitch envelop level, which can enhance the robustness in emotion-dependent speaker recognition effectively. Based on that technology, a new architecture of recognition system as well as its components is proposed in this paper. The experiment conducted on the Mandarin Affective Speech Corpus shows that an improvement of 8% identification rate over the traditional speaker recognition is achieved.

  17. Voice Therapy Practices and Techniques: A Survey of Voice Clinicians.

    Science.gov (United States)

    Mueller, Peter B.; Larson, George W.

    1992-01-01

    Eighty-three voice disorder therapists' ratings of statements regarding voice therapy practices indicated that vocal nodules are the most frequent disorder treated; vocal abuse and hard glottal attack elimination, counseling, and relaxation were preferred treatment approaches; and voice therapy is more effective with adults than with children.…

  18. Smartphone App for Voice Disorders

    Science.gov (United States)

    Past Issues / Fall 2013 ... developed a mobile monitoring device that relies on smartphone technology to gather a week's worth of talking, ...

  19. Effects of Medications on Voice

    Science.gov (United States)

    ... replacement therapy post-menopause may have a variable effect. An inadequate level of thyroid replacement medication in ...

  20. Hearing Voices and Seeing Things

    Science.gov (United States)

    No. 102; Updated October ... delusions (a fixed, false, and often bizarre belief). Hearing voices or seeing things that are not there ...

  1. Using Hierarchical Time Series Clustering Algorithm and Wavelet Classifier for Biometric Voice Classification

    Directory of Open Access Journals (Sweden)

    Simon Fong

    2012-01-01

    Full Text Available Voice biometrics has a long history in biosecurity applications such as verification and identification based on characteristics of the human voice. The other application, voice classification, which plays an important role in grouping unlabelled voice samples, has not been widely studied in research. Lately, voice classification has been found useful in phone monitoring, classifying speakers’ gender, ethnicity and emotion states, and so forth. In this paper, a collection of computational algorithms is proposed to support voice classification; the algorithms are a combination of hierarchical clustering, dynamic time warp transform, discrete wavelet transform, and decision tree. The proposed algorithms are relatively more transparent and interpretable than the existing ones, though many techniques such as Artificial Neural Networks, Support Vector Machines, and Hidden Markov Models (which inherently function like a black box) have been applied for voice verification and voice identification. Two datasets, one generated synthetically and the other collected empirically from a past voice recognition experiment, are used to verify and demonstrate the effectiveness of our proposed voice classification algorithm.
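
    The dynamic time warp distance at the core of the pipeline above can be written in a few lines. This is the textbook DTW recurrence with absolute difference as local cost, not the authors' exact implementation; in the clustering stage, the pairwise DTW distances would feed the hierarchical linkage:

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric
    sequences, allowing non-linear alignment of their time axes."""
    inf = float("inf")
    n, m = len(a), len(b)
    dp = [[inf] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            dp[i][j] = cost + min(dp[i - 1][j],      # step in a only
                                  dp[i][j - 1],      # step in b only
                                  dp[i - 1][j - 1])  # step in both
    return dp[n][m]
```

    Unlike Euclidean distance, DTW scores a sequence and its time-stretched copy as identical, which is exactly the invariance needed when comparing voice samples of different speaking rates.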

  2. Multidimensional assessment of strongly irregular voices such as in substitution voicing and spasmodic dysphonia: a compilation of own research.

    Science.gov (United States)

    Moerman, Mieke; Martens, Jean-Pierre; Dejonckere, Philippe

    2015-04-01

    This article is a compilation of own research performed during the European COoperation in Science and Technology (COST) action 2103: 'Advance Voice Function Assessment', an initiative of voice and speech processing teams consisting of physicists, engineers, and clinicians. This manuscript concerns analyzing largely irregular voicing types, namely substitution voicing (SV) and adductor spasmodic dysphonia (AdSD). A specific perceptual rating scale (IINFVo) was developed, and the Auditory Model Based Pitch Extractor (AMPEX), a piece of software that automatically analyses running speech and generates pitch values in background noise, was applied. The IINFVo perceptual rating scale has been shown to be useful in evaluating SV. The analysis of strongly irregular voices stimulated a modification of the European Laryngological Society's assessment protocol which was originally designed for the common types of (less severe) dysphonia. Acoustic analysis with AMPEX demonstrates that the most informative features are, for SV, the voicing-related acoustic features and, for AdSD, the perturbation measures. Poor correlations between self-assessment and acoustic and perceptual dimensions in the assessment of highly irregular voices argue for a multidimensional approach.

  3. Speech recognition systems on the Cell Broadband Engine

    Energy Technology Data Exchange (ETDEWEB)

    Liu, Y; Jones, H; Vaidya, S; Perrone, M; Tydlitat, B; Nanda, A

    2007-04-20

    In this paper we describe our design, implementation, and first results of a prototype connected-phoneme-based speech recognition system on the Cell Broadband Engine™ (Cell/B.E.). Automatic speech recognition decodes speech samples into plain text (other representations are possible) and must process samples at real-time rates. Fortunately, the computational tasks involved in this pipeline are highly data-parallel and can receive significant hardware acceleration from vector-streaming architectures such as the Cell/B.E. Identifying and exploiting these parallelism opportunities is challenging, but also critical to improving system performance. We observed, from our initial performance timings, that a single Cell/B.E. processor can recognize speech from thousands of simultaneous voice channels in real time--a channel density that is orders-of-magnitude greater than the capacity of existing software speech recognizers based on CPUs (central processing units). This result emphasizes the potential for Cell/B.E.-based speech recognition and will likely lead to the future development of production speech systems using Cell/B.E. clusters.

  4. Development of Efficient Authoring Software for e-Learning Contents

    Science.gov (United States)

    Kozono, Kazutake; Teramoto, Akemi; Akiyama, Hidenori

    Content creation is an important problem in e-Learning systems. The contents of e-Learning should include figure and voice media to achieve a high-level educational effect. However, the use of figures and voice complicates the operation of authoring software considerably. A new authoring software package, which can build e-Learning contents efficiently, has been developed to solve this problem. This paper reports the development results of the authoring software.

  5. Aerodynamic and sound intensity measurements in tracheoesophageal voice

    NARCIS (Netherlands)

    Grolman, Wilko; Eerenstein, Simone E. J.; Tan, Frédérique M. L.; Tange, Rinze A.; Schouwenburg, Paul F.

    2007-01-01

    BACKGROUND: In laryngectomized patients, tracheoesophageal voice generally provides a better voice quality than esophageal voice. Understanding the aerodynamics of voice production in patients with a voice prosthesis is important for optimizing prosthetic designs and successful voice rehabilitation.

  6. [Voice disorders in female teachers assessed by Voice Handicap Index].

    Science.gov (United States)

    Niebudek-Bogusz, Ewa; Kuzańska, Anna; Woźnicka, Ewelina; Sliwińska-Kowalska, Mariola

    2007-01-01

    The aim of this study was to assess the application of the Voice Handicap Index (VHI) in the diagnosis of occupational voice disorders in female teachers. The subjective assessment of voice by VHI was performed in fifty subjects with dysphonia diagnosed by laryngovideostroboscopic examination. The control group comprised 30 women whose jobs did not involve vocal effort. The results of the total VHI score and each of its subscales (functional, emotional and physical) were significantly worse in the study group than in controls. Teachers estimated their own voice problems as a moderate disability, while 12% of them reported severe voice disability. However, all non-teachers assessed their voice problems as slight; their results ranged at the lowest level of the VHI score. This study confirmed that VHI as a tool for self-assessment of voice can be a significant contribution to the diagnosis of occupational dysphonia.

  7. Perceptual complexity of faces and voices modulates cross-modal behavioral facilitation effects

    Directory of Open Access Journals (Sweden)

    Frédéric Joassin

    2018-04-01

    Full Text Available Joassin et al. (Neuroscience Letters, 2004, 369, 132-137) observed that the recognition of face-voice associations led to an interference effect, i.e. to decreased performance relative to the recognition of faces presented in isolation. In the present experiment, we tested the hypothesis that this interference effect could be due to the fact that voices were more difficult to recognize than faces. For this purpose, we modified some faces by morphing to make them as difficult to recognize as the voices. Twenty-one healthy volunteers performed a recognition task of previously learned face-voice associations in 5 conditions: voices (A), natural faces (V), morphed faces (V30), voice-natural face associations (AV) and voice-morphed face associations (AV30). As expected, AV led to interference, as it was performed less accurately and more slowly than V. However, when faces were as difficult to recognize as voices, their simultaneous presentation produced a clear facilitation, AV30 being performed significantly more accurately and faster than A and V30. These results demonstrate that matching or not the perceptual complexity of the unimodal stimuli modulates the potential cross-modal gains of bimodal situations.

  8. Bringing voice in policy building.

    Science.gov (United States)

    Lotrecchiano, Gaetano R; Kane, Mary; Zocchi, Mark S; Gosa, Jessica; Lazar, Danielle; Pines, Jesse M

    2017-07-03

    Purpose The purpose of this paper is to describe the use of group concept mapping (GCM) as a tool for developing a conceptual model of an episode of acute, unscheduled care from illness or injury to outcomes such as recovery, death and chronic illness. Design/methodology/approach After generating a literature review and drafting an initial conceptual model, GCM software (CS Global MAX TM ) is used to organize and identify strengths and directionality between concepts generated through feedback about the model from several stakeholder groups: acute care and non-acute care providers, patients, payers and policymakers. Through online and in-person population-specific focus groups, the GCM approach seeks feedback, assigned relationships and articulated priorities from participants to produce an output map that describes overarching concepts and relationships within and across subsamples. Findings A clustered concept map made up of relational data points that produced a taxonomy of feedback was used to update the model for use in soliciting additional feedback from two technical expert panels (TEPs), and finally, a public comment exercise was performed. The results were a stakeholder-informed improved model for an acute care episode, identified factors that influence process and outcomes, and policy recommendations, which were delivered to the Department of Health and Human Services' (DHHS) Assistant Secretary for Preparedness and Response. Practical implications This study provides an example of the value of cross-population multi-stakeholder input in increasing voice among health stakeholder groups facing a shared problem. Originality/value This paper provides GCM results and a visual analysis of the relational characteristics both within and across sub-populations involved in the study. It also provides an assessment of observational key factors supporting how different stakeholder voices can be integrated to inform model development and policy recommendations.

  9. Listen to a voice

    DEFF Research Database (Denmark)

    Hølge-Hazelton, Bibi

    2001-01-01

    Listen to the voice of a young girl Lonnie, who was diagnosed with Type 1 diabetes at 16. Imagine that she is deeply involved in the social security system. She lives with her mother and two siblings in a working class part of a small town. She is at a special school for problematic youth, and her...

  10. Sustainable Consumer Voices

    DEFF Research Database (Denmark)

    Klitmøller, Anders; Rask, Morten; Jensen, Nevena

    2011-01-01

    Aiming to explore how user driven innovation can inform high level design strategies, an in-depth empirical study was carried out, based on data from 50 observations of private vehicle users. This paper reports the resulting 5 consumer voices: Technology Enthusiast, Environmentalist, Design Lover...

  11. Voices of courage

    Directory of Open Access Journals (Sweden)

    Noraida Abdullah Karim

    2007-07-01

    Full Text Available In May 2007 the Women’s Commission for Refugee Women and Children presented its annual Voices of Courage awards to three displaced people who have dedicated their lives to promoting economic opportunities for refugee and displaced women and youth. These are their (edited) testimonies.

  12. What the voice reveals

    NARCIS (Netherlands)

    Ko, Sei Jin

    2007-01-01

    Given that the voice is our main form of communication, we know surprisingly little about how it impacts judgment and behavior. Furthermore, the modern advancement in telecommunication systems, such as cellular phones, has meant that a large proportion of our everyday interactions are conducted

  13. Bodies and Voices

    DEFF Research Database (Denmark)

    A wide-ranging collection of essays centred on readings of the body in contemporary literary and socio-anthropological discourse, from slavery and rape to female genital mutilation, from clothing, ocular pornography, voice, deformation and transmutation to the imprisoned, dismembered, remembered...

  14. Human voice perception.

    Science.gov (United States)

    Latinus, Marianne; Belin, Pascal

    2011-02-22

    We are all voice experts. First and foremost, we can produce and understand speech, and this makes us a unique species. But in addition to speech perception, we routinely extract from voices a wealth of socially-relevant information in what constitutes a more primitive, and probably more universal, non-linguistic mode of communication. Consider the following example: you are sitting in a plane, and you can hear a conversation in a foreign language in the row behind you. You do not see the speakers' faces, and you cannot understand the speech content because you do not know the language. Yet, an amazing amount of information is available to you. You can evaluate the physical characteristics of the different protagonists, including their gender, approximate age and size, and associate an identity to the different voices. You can form a good idea of the different speakers' mood and affective state, as well as more subtle cues such as the perceived attractiveness or dominance of the protagonists. In brief, you can form a fairly detailed picture of the type of social interaction unfolding, which a brief glance backwards can on occasion help refine - sometimes surprisingly so. What are the acoustical cues that carry these different types of vocal information? How does our brain process and analyse this information? Here we briefly review an emerging field and the main tools used in voice perception research. Copyright © 2011 Elsevier Ltd. All rights reserved.

  15. Voice application development for Android

    CERN Document Server

    McTear, Michael

    2013-01-01

    This book will give beginners an introduction to building voice-based applications on Android. It will begin by covering the basic concepts and will build up to creating a voice-based personal assistant. By the end of this book, you should be in a position to create your own voice-based applications on Android from scratch in next to no time. Voice Application Development for Android is for all those who are interested in speech technology and for those who, as owners of Android devices, are keen to experiment with developing voice apps for their devices. It will also be useful as a starting po

  16. Voice similarity in identical twins.

    Science.gov (United States)

    Van Gysel, W D; Vercammen, J; Debruyne, F

    2001-01-01

    If people are asked to discriminate visually the two individuals of a monozygotic twin (MT) pair, they mostly get into trouble. Does this problem also exist when listening to twin voices? Twenty female and 10 male MT voices were randomly assembled with one "strange" voice to get voice trios. The listeners (10 female students in Speech and Language Pathology) were asked to label the twins (voices 1-2, 1-3 or 2-3) in two conditions: two standard sentences read aloud and a 2.5-second midsection of a sustained /a/. The proportion of correctly labelled twins was 82% and 63% for female voices, and 74% and 52% for male voices, for the sentences and the sustained /a/ respectively, both being significantly greater than chance (33%). The acoustic analysis revealed a high intra-twin correlation for the speaking fundamental frequency (SFF) of the sentences and the fundamental frequency (F0) of the sustained /a/. So the voice pitch could have been a useful characteristic in the perceptual identification of the twins. We conclude that there is a greater perceptual resemblance between the voices of identical twins than between voices without genetic relationship. The identification however is not perfect. The voice pitch possibly contributes to the correct twin identifications.
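The SFF/F0 comparison reported above relies on pitch estimation. A minimal autocorrelation-based F0 estimator can be sketched as follows; this is an illustrative pure-Python sketch, and the sampling rate, search band and synthetic test tone are assumptions, not values from the study.

```python
import math

def estimate_f0(samples, sr, fmin=80.0, fmax=500.0):
    """Pick the lag with the strongest autocorrelation inside the plausible
    pitch-period band and convert it back to Hz."""
    lag_min = int(sr / fmax)   # shortest period considered
    lag_max = int(sr / fmin)   # longest period considered
    best_lag, best_r = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        r = sum(samples[i] * samples[i + lag]
                for i in range(len(samples) - lag))
        if r > best_r:
            best_r, best_lag = r, lag
    return sr / best_lag

# demo: a synthetic 220 Hz tone sampled at 8 kHz
sr = 8000
tone = [math.sin(2 * math.pi * 220 * t / sr) for t in range(2048)]
f0 = estimate_f0(tone, sr)
```

Real twin comparisons would run such an estimator frame by frame over the sustained /a/ and correlate the resulting contours.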

  17. Garbage Modeling for On-device Speech Recognition

    NARCIS (Netherlands)

    Van Gysel, C.; Velikovich, L.; McGraw, I.; Beaufays, F.

    2015-01-01

    User interactions with mobile devices increasingly depend on voice as a primary input modality. Due to the disadvantages of sending audio across potentially spotty network connections for speech recognition, in recent years there has been growing attention to performing recognition on-device. The

  18. A Spoken English Recognition Expert System.

    Science.gov (United States)

    1983-09-01

    "Speech Recognition by Computer," Scientific American. New York: Scientific American, April 1981: 64-76. 16. Marcus, Mitchell P. A Theory of Syntactic...prob)...) Possible words for voice decoder to choose from are: gents dishes issues itches ewes folks foes communications units eunuchs error * farce

  19. A Wireless LAN and Voice Information System for Underground Coal Mine

    Directory of Open Access Journals (Sweden)

    Yu Zhang

    2014-06-01

    Full Text Available In this paper we constructed a wireless information system, and developed a wireless voice communication subsystem based on Wireless Local Area Network (WLAN) technology for an underground coal mine, which employs Voice over IP (VoIP) technology and the Session Initiation Protocol (SIP) to achieve wireless voice dispatching communications. The master control voice dispatching interface and call terminal software are also developed on the WLAN ground server side to manage and implement the voice dispatching communication. A testing system for voice communication was constructed in tunnels of an underground coal mine, which was used to test the wireless voice communication subsystem via a network analysis tool named Clear Sight Analyzer. In tests, the actual flow charts of registration, call establishment and call removal were analyzed by capturing the call signaling of SIP terminals, and the key performance indicators were evaluated in the coal mine, including average subjective voice quality, packet loss rate, delay jitter, out-of-order packet delivery and end-to-end delay. Experimental results and analysis demonstrate that the wireless voice communication subsystem communicates well in the underground coal mine environment, achieving the designed function of voice dispatching communication.
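Among the indicators listed, delay jitter for VoIP traffic is conventionally computed with the interarrival-jitter estimator of RFC 3550 (the 1/16 smoothing gain is the RFC's). The sketch below uses synthetic timestamps and is an illustration of that standard formula, not the paper's measurement tooling.

```python
def rtp_jitter(send_ts, recv_ts):
    """Interarrival jitter in the style of RFC 3550: a running average of
    transit-time differences, smoothed with gain 1/16. Units follow the
    timestamps (here: milliseconds, synthetic data)."""
    j = 0.0
    prev_transit = None
    for s, r in zip(send_ts, recv_ts):
        transit = r - s
        if prev_transit is not None:
            j += (abs(transit - prev_transit) - j) / 16.0
        prev_transit = transit
    return j

# demo: a perfectly regular stream vs. one with wobbling transit times
steady = rtp_jitter([0, 20, 40, 60], [5, 25, 45, 65])
wobbly = rtp_jitter([0, 20, 40, 60], [5, 26, 45, 66])
```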

  20. Incorporating Speech Recognition into a Natural User Interface

    Science.gov (United States)

    Chapa, Nicholas

    2017-01-01

    The Augmented/Virtual Reality (AVR) Lab has been working to study the applicability of recent virtual and augmented reality hardware and software to KSC operations. This includes the Oculus Rift, HTC Vive, Microsoft HoloLens, and Unity game engine. My project in this lab is to integrate voice recognition and voice commands into an easy-to-modify system that can be added to an existing portion of a Natural User Interface (NUI). A NUI is an intuitive and simple-to-use interface incorporating visual, touch, and speech recognition. The inclusion of speech recognition capability will allow users to perform actions or make inquiries using only their voice. The simplicity of needing only to speak to control an on-screen object or enact some digital action means that any user can quickly become accustomed to using this system. Multiple programs were tested for use in a speech command and recognition system. Sphinx4 translates speech to text using a Hidden Markov Model (HMM) based Language Model, an Acoustic Model, and a word Dictionary running on Java. PocketSphinx had similar functionality to Sphinx4 but instead ran in C. However, neither of these programs was ideal, as building a Java or C wrapper slowed performance. The most suitable speech recognition system tested was the Unity Engine Grammar Recognizer. A Context Free Grammar (CFG) structure is written in an XML file to specify the structure of phrases and words that will be recognized by the Unity Grammar Recognizer. Using Speech Recognition Grammar Specification (SRGS) 1.0 makes modifying the recognized combinations of words and phrases very simple and quick to do. With SRGS 1.0, semantic information can also be added to the XML file, which allows for even more control over how spoken words and phrases are interpreted by Unity. Additionally, using a CFG with SRGS 1.0 produces Finite State Machine (FSM) functionality, limiting the potential for incorrectly heard words or phrases. The purpose of my project was to
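The grammar-constrained recognition described above can be illustrated with a toy rule set. This Python sketch mimics how a CFG restricted to "&lt;verb&gt; &lt;object&gt;" phrases rejects out-of-grammar input; the vocabulary is invented, and this is not the SRGS/Unity implementation itself (which loads the grammar from an XML file).

```python
# Hypothetical command vocabulary; the real system loads an SRGS 1.0 XML file.
GRAMMAR = {
    "verb": ["open", "close", "rotate"],
    "object": ["hatch", "panel", "valve"],
}

def accepts(phrase):
    """True iff the phrase matches the single rule '<verb> <object>'."""
    words = phrase.lower().split()
    return (len(words) == 2
            and words[0] in GRAMMAR["verb"]
            and words[1] in GRAMMAR["object"])
```

Because only listed combinations are accepted, a misheard word cannot yield a valid command, which is the FSM-style robustness the abstract describes.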

  1. Pattern recognition

    CERN Document Server

    Theodoridis, Sergios

    2003-01-01

    Pattern recognition is a scientific discipline that is becoming increasingly important in the age of automation and information handling and retrieval. Pattern Recognition, 2e covers the entire spectrum of pattern recognition applications, from image analysis to speech recognition and communications. This book presents cutting-edge material on neural networks - a set of linked microprocessors that can form associations and use pattern recognition to "learn" - and enhances student motivation by approaching pattern recognition from the designer's point of view. A direct result of more than 10

  2. Very low bit rate voice for packetized mobile applications

    International Nuclear Information System (INIS)

    Knittle, C.D.; Malone, K.T.

    1991-01-01

    This paper reports that transmitting digital voice via packetized mobile communications systems that employ relatively short packet lengths and narrow bandwidths often necessitates very low bit rate coding of the voice data. Sandia National Laboratories is currently developing an efficient voice coding system operating at 800 bits per second (bps). The coding scheme is a modified version of the 2400 bps NSA LPC-10e standard. The most significant modification to the LPC-10e scheme is the vector quantization of the line spectrum frequencies associated with the synthesis filters. An outline of a hardware implementation for the 800 bps coder is presented. The speech quality of the coder is generally good, although speaker recognition is not possible. Further research is being conducted to reduce the memory requirements and complexity of the vector quantizer, and to increase the quality of the reconstructed speech. This work may be of use in dealing with nuclear materials.
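The vector quantization step applied to the line spectrum frequencies can be sketched with a toy k-means codebook. This is an illustrative reconstruction with synthetic 2-D data, not the Sandia coder; real LSF vectors are roughly 10-dimensional and the codebook size fixes the bits per frame.

```python
def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_codebook(vectors, k, iters=10):
    """Toy k-means codebook training (deterministic init from the first k
    training vectors). A real coder trains on line-spectrum-frequency
    vectors and transmits only the winning index per frame."""
    codebook = [list(v) for v in vectors[:k]]
    for _ in range(iters):
        cells = [[] for _ in range(k)]
        for v in vectors:
            cells[quantize(v, codebook)].append(v)
        for c, cell in enumerate(cells):
            if cell:  # keep the old codeword if its cell went empty
                codebook[c] = [sum(x) / len(cell) for x in zip(*cell)]
    return codebook

def quantize(v, codebook):
    """Index of the nearest codeword: the bits actually transmitted."""
    return min(range(len(codebook)), key=lambda c: dist2(v, codebook[c]))

# demo: two well-separated clusters of synthetic 2-D "feature" vectors
train = [[0, 0], [0.1, 0], [0, 0.1], [10, 10], [10.1, 10], [10, 10.1]]
codebook = train_codebook(train, k=2)
```

With a 2-entry codebook each frame costs one bit; the 800 bps coder's savings come from exactly this index-instead-of-vector trade.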

  3. Double Fourier analysis for Emotion Identification in Voiced Speech

    International Nuclear Information System (INIS)

    Sierra-Sosa, D.; Bastidas, M.; Ortiz P, D.; Quintero, O.L.

    2016-01-01

    We propose a novel analysis alternative, based on two Fourier transforms, for emotion recognition from speech. Fourier analysis allows different signals to be displayed and synthesized in terms of power spectral density distributions. A spectrogram of the voice signal is obtained by performing a short-time Fourier transform with Gaussian windows; this spectrogram portrays frequency-related features, such as vocal tract resonances and quasi-periodic excitations during voiced sounds. Emotions induce such characteristics in speech, which become apparent in the spectrogram's time-frequency distribution. The signal's time-frequency representation from the spectrogram is then treated as an image, and processed through a 2-dimensional Fourier transform in order to perform spatial Fourier analysis on it. Finally, features related to emotions in voiced speech are extracted and presented. (paper)
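The "double Fourier" idea (an STFT spectrogram, then a 2-D transform of that spectrogram treated as an image) can be sketched in pure Python with naive O(N²) DFTs. Frame size, hop and the test tone below are invented for illustration; the paper's actual window parameters are not reproduced here.

```python
import cmath
import math

def row_dft(x):
    """Naive 1-D DFT (O(N^2)), enough for a small illustration."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def stft_mag(signal, frame, hop):
    """Magnitude spectrogram: short-time DFT with a Gaussian window."""
    win = [math.exp(-0.5 * ((i - frame / 2) / (frame / 6)) ** 2)
           for i in range(frame)]
    spec = []
    for start in range(0, len(signal) - frame + 1, hop):
        seg = [signal[start + i] * win[i] for i in range(frame)]
        spec.append([abs(v) for v in row_dft(seg)[:frame // 2]])
    return spec  # rows = time frames, columns = frequency bins

def dft2(matrix):
    """2-D DFT of the spectrogram 'image': DFT of rows, then of columns."""
    rows = [row_dft(r) for r in matrix]
    cols = [row_dft(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

# demo: a tone sitting exactly at bin 2 of a 16-sample frame
sig = [math.sin(2 * math.pi * 2 * n / 16) for n in range(64)]
spec = stft_mag(sig, frame=16, hop=8)
spec2d = dft2(spec)
```

Each spectrogram row peaks at the tone's bin; the second transform then summarizes how such ridges vary across time, which is where the emotion-related texture features come from.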

  4. Forensic Speaker Recognition Law Enforcement and Counter-Terrorism

    CERN Document Server

    Patil, Hemant

    2012-01-01

    Forensic Speaker Recognition: Law Enforcement and Counter-Terrorism is an anthology of the research findings of 35 speaker recognition experts from around the world. The volume provides a multidimensional view of the complex science involved in determining whether a suspect’s voice truly matches forensic speech samples, collected by law enforcement and counter-terrorism agencies, that are associated with the commission of a terrorist act or other crimes. While addressing such topics as the challenges of forensic case work, handling speech signal degradation, analyzing features of speaker recognition to optimize voice verification system performance, and designing voice applications that meet the practical needs of law enforcement and counter-terrorism agencies, this material all sounds a common theme: how the rigors of forensic utility are demanding new levels of excellence in all aspects of speaker recognition. The contributors are among the most eminent scientists in speech engineering and signal process...

  5. Hearing the unheard: An interdisciplinary, mixed methodology study of women’s experiences of hearing voices (auditory verbal hallucinations)

    Directory of Open Access Journals (Sweden)

    Simon McCarthy-Jones

    2015-12-01

    Full Text Available This paper explores the experiences of women who ‘hear voices’ (auditory verbal hallucinations). We begin by examining historical understandings of women hearing voices, showing these have been driven by androcentric theories of how women’s bodies functioned, leading to women being viewed as requiring their voices to be interpreted by men. We show the twentieth century was associated with recognition that the mental violation of women’s minds (represented by some voice-hearing) was often a consequence of the physical violation of women’s bodies. We next report the results of a qualitative study into voice-hearing women’s experiences (N=8). This found similarities between women’s relationships with their voices and their relationships with others and the wider social context. Finally, we present results from a quantitative study comparing voice-hearing in women (n=65) and men (n=132) in a psychiatric setting. Women were more likely than men to have certain forms of voice-hearing (voices conversing) and to have antecedent events of trauma, physical illness, and relationship problems. Voices identified as female may have more positive affect than male voices. We conclude that women voice-hearers have faced and continue to face specific challenges necessitating research and activism, and hope this paper will act as a stimulus to such work.

  6. Intra-oral pressure-based voicing control of electrolaryngeal speech with intra-oral vibrator.

    Science.gov (United States)

    Takahashi, Hirokazu; Nakao, Masayuki; Kikuchi, Yataro; Kaga, Kimitaka

    2008-07-01

    In normal speech, coordinated activities of intrinsic laryngeal muscles suspend a glottal sound at utterance of voiceless consonants, automatically realizing voicing control. In electrolaryngeal speech, however, the lack of voicing control is one of the causes of unclear voice, voiceless consonants tending to be misheard as the corresponding voiced consonants. In the present work, we developed an intra-oral vibrator with an intra-oral pressure sensor that detected utterance of voiceless phonemes during intra-oral electrolaryngeal speech, and demonstrated that intra-oral pressure-based voicing control could improve the intelligibility of the speech. The test voices were obtained from one electrolaryngeal speaker and one normal speaker. We first investigated, using speech analysis software, how the voice onset time (VOT) and first formant (F1) transition of the test consonant-vowel syllables contributed to voiceless/voiced contrasts, and developed an adequate voicing control strategy. We then compared the intelligibility of consonant-vowel syllables in intra-oral electrolaryngeal speech with and without online voicing control. The increase of intra-oral pressure, typically with a peak ranging from 10 to 50 gf/cm2, could reliably identify utterance of voiceless consonants. The speech analysis and intelligibility test then demonstrated that a short VOT caused the misidentification of the voiced consonants due to a clear F1 transition. Finally, taking these results together, the online voicing control, which suspended the prosthetic tone while the intra-oral pressure exceeded 2.5 gf/cm2 and during the 35 milliseconds that followed, proved effective in improving the voiceless/voiced contrast.
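The reported control rule (suspend the prosthetic tone while intra-oral pressure exceeds 2.5 gf/cm2 and for the following 35 ms) amounts to a simple hold-off gate. The sketch below runs over a synthetic pressure trace; the 1 kHz sampling rate is an assumption for illustration.

```python
def voicing_gate(pressure, sr, threshold=2.5, hold_ms=35):
    """Per-sample gate for the electrolarynx tone: False (muted) while the
    intra-oral pressure exceeds `threshold` (gf/cm^2) and for `hold_ms`
    milliseconds afterwards, True (sounding) otherwise."""
    hold = int(sr * hold_ms / 1000)
    mute_until = -1
    gate = []
    for i, p in enumerate(pressure):
        if p > threshold:
            mute_until = i + hold  # extend the hold-off window
        gate.append(i > mute_until)
    return gate

# demo: a plosive-like pressure burst at samples 10-14 (1 kHz sampling assumed)
gate = voicing_gate([0.0] * 10 + [5.0] * 5 + [0.0] * 100, sr=1000)
```

The tone is muted from the onset of the burst until 35 ms after it ends, which is exactly the window in which a voiceless consonant would otherwise be voiced by the prosthesis.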

  7. Risk factors for voice problems in teachers.

    NARCIS (Netherlands)

    Kooijman, P.G.C.; Jong, F.I.C.R.S. de; Thomas, G.; Huinck, W.J.; Donders, A.R.T.; Graamans, K.; Schutte, H.K.

    2006-01-01

    In order to identify factors that are associated with voice problems and voice-related absenteeism in teachers, 1,878 questionnaires were analysed. The questionnaires inquired about personal data, voice complaints, voice-related absenteeism from work and conditions that may lead to voice complaints

  8. Risk factors for voice problems in teachers

    NARCIS (Netherlands)

    Kooijman, P. G. C.; de Jong, F. I. C. R. S.; Thomas, G.; Huinck, W.; Donders, R.; Graamans, K.; Schutte, H. K.

    2006-01-01

    In order to identify factors that are associated with voice problems and voice-related absenteeism in teachers, 1,878 questionnaires were analysed. The questionnaires inquired about personal data, voice complaints, voice-related absenteeism from work and conditions that may lead to voice complaints

  9. You're a What? Voice Actor

    Science.gov (United States)

    Liming, Drew

    2009-01-01

    This article talks about voice actors and features Tony Oliver, a professional voice actor. Voice actors help to bring one's favorite cartoon and video game characters to life. They also do voice-overs for radio and television commercials and movie trailers. These actors use the sound of their voice to sell a character's emotions--or an advertised…

  10. Voice search for development

    CSIR Research Space (South Africa)

    Barnard, E

    2010-09-01

    Full Text Available of speech technology development, similar approaches are likely to be applicable in both circumstances. However, within these broad approaches there are details which are specific to certain languages (or language families) that may require solutions... to the modeling of pitch were therefore required. Similarly, it is possible that novel solutions will be required to deal with the click sounds that occur in some Southern Bantu languages, or the voicing

  11. Smart Homes with Voice Activated Systems for Disabled People

    OpenAIRE

    Bekir Busatlic; Nejdet Dogru; Isaac Lera; Enes Sukic

    2017-01-01

    Smart home refers to the application of various technologies to semi-unsupervised home control. It refers to systems that control temperature, lighting, door locks, windows and many other appliances. The aim of this study was to design a system that will use existing technology to showcase how it can benefit people with disabilities. This work uses only off-the-shelf products (smart home devices and controllers), speech recognition technology, and open-source code libraries. The Voice Activated Sm...

  12. Voice and silence in organizations

    Directory of Open Access Journals (Sweden)

    Moaşa, H.

    2011-01-01

    Full Text Available Unlike previous research on voice and silence, this article breaks the distance between the two and declines to treat them as opposites. Voice and silence are interrelated and intertwined strategic forms of communication which presuppose each other in such a way that the absence of one would minimize completely the other’s presence. Social actors are not voice, or silence. Social actors can have voice or silence, they can do both because they operate at multiple levels and deal with multiple issues at different moments in time.

  13. Factors that motivate software developers in Nigerian's software ...

    African Journals Online (AJOL)

    It was also observed that courtesy, good reward systems, regular training, recognition, tolerance of mistakes and good leadership were high motivators of software developers. Keywords: Software developers, information technology, project managers, Nigeria International Journal of Natural and Applied Sciences, 6(4): ...

  14. Voice Biometrics for Information Assurance Applications

    National Research Council Canada - National Science Library

    Kang, George

    2002-01-01

    .... The ultimate goal of voice biometrics is to enable the use of voice as a password. Voice biometrics are "man-in-the-loop" systems in which system performance is significantly dependent on human performance...

  15. Speech emotion recognition based on statistical pitch model

    Institute of Scientific and Technical Information of China (English)

    WANG Zhiping; ZHAO Li; ZOU Cairong

    2006-01-01

    A modified Parzen-window method, which keeps high resolution at low frequencies and smoothness at high frequencies, is proposed to obtain a statistical model. Then, a gender classification method utilizing the statistical model is proposed, which achieves 98% accuracy of gender classification when long sentences are dealt with. By separating the male voices and female voices, the mean and standard deviation of speech training samples with different emotions are used to create the corresponding emotion models. Then the Bhattacharyya distance between the test sample and the statistical models of pitch is utilized for emotion recognition in speech. The normalization of pitch for the male voice and female voice is also considered, in order to map them into a uniform space. Finally, a speech emotion recognition experiment based on K Nearest Neighbour shows that a correct rate of 81% is achieved, whereas it is only 73.85% if the traditional parameters are utilized.
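The per-emotion pitch models (mean and standard deviation) compared via Bhattacharyya distance can be sketched for the 1-D Gaussian case, where the distance has a standard closed form. The emotion models below are invented numbers for illustration, not the paper's trained models.

```python
import math

def bhattacharyya_gauss(m1, s1, m2, s2):
    """Bhattacharyya distance between 1-D Gaussians N(m1, s1^2), N(m2, s2^2):
    a mean-separation term plus a variance-mismatch term."""
    v1, v2 = s1 * s1, s2 * s2
    return (0.25 * (m1 - m2) ** 2 / (v1 + v2)
            + 0.5 * math.log((v1 + v2) / (2.0 * s1 * s2)))

def classify_emotion(mean_f0, std_f0, models):
    """Nearest emotion model in Bhattacharyya distance (1-NN over models)."""
    return min(models, key=lambda e: bhattacharyya_gauss(
        mean_f0, std_f0, models[e][0], models[e][1]))

# demo: invented pitch statistics (Hz) for two emotion classes
MODELS = {"neutral": (120.0, 15.0), "anger": (220.0, 40.0)}
```

A test utterance is summarized by its own pitch mean and spread and assigned to the closest model; the paper's K-Nearest-Neighbour step generalizes this from one model per class to several.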

  16. Objective voice parameters in Colombian school workers with healthy voices

    NARCIS (Netherlands)

    L.C. Cantor Cutiva (Lady Catherine); A. Burdorf (Alex)

    2015-01-01

    Objectives: To characterize the objective voice parameters among school workers, and to identify associated factors of three objective voice parameters, namely fundamental frequency, sound pressure level and maximum phonation time. Materials and methods: We conducted a cross-sectional

  17. Pedagogic Voice: Student Voice in Teaching and Engagement Pedagogies

    Science.gov (United States)

    Baroutsis, Aspa; McGregor, Glenda; Mills, Martin

    2016-01-01

    In this paper, we are concerned with the notion of "pedagogic voice" as it relates to the presence of student "voice" in teaching, learning and curriculum matters at an alternative, or second chance, school in Australia. This school draws upon many of the principles of democratic schooling via its utilisation of student voice…

  18. Facing Sound - Voicing Art

    DEFF Research Database (Denmark)

    Lønstrup, Ansa

    2013-01-01

    This article is based on examples of contemporary audiovisual art, with a special focus on the Tony Oursler exhibition Face to Face at Aarhus Art Museum ARoS in Denmark in March-July 2012. My investigation involves a combination of qualitative interviews with visitors, observations of the audience's interactions with the exhibition and the artwork in the museum space, and short analyses of individual works of art based on reception aesthetics and phenomenology and inspired by newer writings on sound, voice and listening.

  19. Voice over IP Security

    CERN Document Server

    Keromytis, Angelos D

    2011-01-01

    Voice over IP (VoIP) and Internet Multimedia Subsystem technologies (IMS) are rapidly being adopted by consumers, enterprises, governments and militaries. These technologies offer higher flexibility and more features than traditional telephony (PSTN) infrastructures, as well as the potential for lower cost through equipment consolidation and, for the consumer market, new business models. However, VoIP systems also represent a higher complexity in terms of architecture, protocols and implementation, with a corresponding increase in the potential for misuse. In this book, the authors examine the

  20. Bodies, Spaces, Voices, Silences

    OpenAIRE

    Donatella Mazzoleni; Pietro Vitiello

    2013-01-01

    A good architecture should not only provide functional, formal and technical quality for urban spaces, but also let the voice of the city be perceived, listened to, and enjoyed. Every city has its own specific sound identity, or “ISO” (R. O. Benenzon), made up of a complex texture of background noises and fluctuations of sound figures emerging and disappearing in a game of continuous fadings. For instance, the ISO of Naples is characterized by a widespread need of hearing the sound return of one’s/others v...

  1. The Voices of the Documentarist

    Science.gov (United States)

    Utterback, Ann S.

    1977-01-01

    Discusses T. S. Eliot's essay, "The Three Voices of Poetry," which conceptualizes the position taken by the poet or creator. Suggests that an examination of documentary film, within the three voices concept, expands the critical framework of the film genre. (MH)

  2. Development of a voice database to aid children with hearing impairments

    International Nuclear Information System (INIS)

    Kuzman, M G; Agüero, P D; Tulli, J C; Gonzalez, E L; Cervellini, M P; Uriz, A J

    2011-01-01

    In the development of software for voice analysis or training for people with hearing impairments, a database of properly pronounced words is of paramount importance. This paper shows the advantage of building one's own voice database, rather than using databases from other countries, even those in the same language, when developing speech training software aimed at people with hearing impairments. This database will be used by software developers at the School of Engineering of Mar del Plata National University.

  3. Speech Recognition

    Directory of Open Access Journals (Sweden)

    Adrian Morariu

    2009-01-01

    Full Text Available This paper presents a method of speech recognition by pattern recognition techniques. Learning consists in determining the unique characteristics of a word (cepstral coefficients) by eliminating those characteristics that are different from one word to another. For learning and recognition, the system will build a dictionary of words by determining the characteristics of each word to be used in the recognition. Determining the characteristics of an audio signal consists in the following steps: noise removal, sampling, applying a Hamming window, switching to the frequency domain through the Fourier transform, calculating the magnitude spectrum, filtering the data, and determining the cepstral coefficients.
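The chain this record lists (Hamming window, Fourier transform, magnitude spectrum, cepstral coefficients) can be sketched in a few lines. This is a generic real-cepstrum computation, not the paper's implementation; the frame length, sampling rate, and coefficient count are illustrative:

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (fine for a short frame)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def real_cepstrum(frame, n_coeffs=13):
    N = len(frame)
    # Hamming window reduces spectral leakage at the frame edges.
    windowed = [frame[n] * (0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)))
                for n in range(N)]
    spectrum = dft(windowed)
    # Log magnitude spectrum, floored to avoid log(0).
    log_mag = [math.log(max(abs(c), 1e-12)) for c in spectrum]
    # Inverse DFT of the log magnitude spectrum gives the real cepstrum.
    M = len(log_mag)
    return [sum(log_mag[k] * cmath.exp(2j * math.pi * k * n / M)
                for k in range(M)).real / M
            for n in range(n_coeffs)]

# 64-sample frame of a 1 kHz tone sampled at 8 kHz.
frame = [math.sin(2 * math.pi * 1000 * n / 8000) for n in range(64)]
coeffs = real_cepstrum(frame)
print(len(coeffs))  # → 13
```

In practice an FFT library would replace the naive DFT, but the pipeline is the same.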

  4. Acute effects of radioiodine therapy on the voice and larynx of basedow-Graves patients

    International Nuclear Information System (INIS)

    Isolan-Cury, Roberta Werlang; Cury, Adriano Namo; Monte, Osmar; Silva, Marta Assumpcao de Andrada e; Duprat, Andre; Marone, Marilia; Almeida, Renata de; Iglesias, Alexandre

    2008-01-01

    Graves's disease is the most common cause of hyperthyroidism. There are three current therapeutic options: anti-thyroid medication, surgery, and radioactive iodine (I 131). There are few data in the literature regarding the effects of radioiodine therapy on the larynx and voice. The aim of this study was: to assess the effect of radioiodine therapy on the voice of Basedow-Graves patients. Material and method: A prospective study was done. Following the diagnosis of Grave's disease, patients underwent investigation of their voice, measurement of maximum phonatory time (/a/) and the s/z ratio, fundamental frequency analysis (Praat software), laryngoscopy and (perceptive-auditory) analysis in three different conditions: pre-treatment, 4 days, and 20 days post-radioiodine therapy. Conditions are based on the inflammatory pattern of thyroid tissue (Jones et al. 1999). Results: No statistically significant differences were found in voice characteristics in these three conditions. Conclusion: Radioiodine therapy does not affect voice quality. (author)

  5. Bodies, Spaces, Voices, Silences

    Directory of Open Access Journals (Sweden)

    Donatella Mazzoleni

    2013-07-01

    Full Text Available A good architecture should not only provide functional, formal and technical quality for urban spaces, but also let the voice of the city be perceived, listened to, and enjoyed. Every city has its own specific sound identity, or “ISO” (R. O. Benenzon), made up of a complex texture of background noises and fluctuations of sound figures emerging and disappearing in a game of continuous fadings. For instance, the ISO of Naples is characterized by a widespread need to hear the sound return of one’s own and others’ voices, and by a hatred of silence. Cities may fall ill: illness from noise, within super-crowded neighbourhoods, or illness from silence, in the forced isolation of peripheries. The proposal of an urban music therapy points to a novel, enlarged interdisciplinary research path, where architecture, music, medicine, psychology and communication science may converge, in order to work toward rebalancing the spaces and relational life of the urban collectivity, through the care of body and sound dimensions.

  6. Page Recognition: Quantum Leap In Recognition Technology

    Science.gov (United States)

    Miller, Larry

    1989-07-01

    No milestone has proven as elusive as the always-approaching "year of the LAN," but the "year of the scanner" might claim the silver medal. Desktop scanners have been around almost as long as personal computers. And everyone thinks they are used for obvious desktop-publishing and business tasks like scanning business documents, magazine articles and other pages, and translating those words into files your computer understands. But, until now, the reality fell far short of the promise. Because it's true that scanners deliver an accurate image of the page to your computer, but the software to recognize this text has been woefully disappointing. Old optical-character recognition (OCR) software recognized such a limited range of pages as to be virtually useless to real users. (For example, one OCR vendor specified 12-point Courier font from an IBM Selectric typewriter: the same font in 10-point, or from a Diablo printer, was unrecognizable!) Computer dealers have told me the chasm between OCR expectations and reality is so broad and deep that nine out of ten prospects leave their stores in disgust when they learn the limitations. And this is a very important, very unfortunate gap. Because the promise of recognition -- what people want it to do -- carries with it tremendous improvements in our productivity and ability to get tons of written documents into our computers where we can do real work with it. The good news is that a revolutionary new development effort has led to the new technology of "page recognition," which actually does deliver the promise we've always wanted from OCR. I'm sure every reader appreciates the breakthrough represented by the laser printer and page-makeup software, a combination so powerful it created new reasons for buying a computer. A similar breakthrough is happening right now in page recognition: the Macintosh (and, I must admit, other personal computers) equipped with a moderately priced scanner and OmniPage software (from Caere

  7. Initial Progress Toward Development of a Voice-Based Computer-Delivered Motivational Intervention for Heavy Drinking College Students: An Experimental Study

    Science.gov (United States)

    Lechner, William J; MacGlashan, James; Wray, Tyler B; Littman, Michael L

    2017-01-01

    perceived importance of changing drinking behaviors. In comparison to the text-based computer-delivered intervention condition, those assigned to voice-based computer-delivered intervention reported significantly fewer alcohol-related problems at the 1-month follow-up (incident rate ratio 0.60, 95% CI 0.44-0.83, P=.002). The conditions did not differ significantly on perceived importance of changing drinking or on measures of drinking quantity and frequency of heavy drinking. Conclusions Results indicate that it is feasible to construct a series of open-ended questions and a bank of responses and follow-up prompts that can be used in a future fully automated voice-based computer-delivered intervention that may mirror more closely human-delivered motivational interventions to reduce drinking. Such efforts will require using advanced speech recognition capabilities and machine-learning approaches to train a program to mirror the decisions made by human controllers in the voice-based computer-delivered intervention used in this study. In addition, future studies should examine enhancements that can increase the perceived warmth and empathy of voice-based computer-delivered intervention, possibly through greater personalization, improvements in the speech generation software, and embodying the computer-delivered intervention in a physical form. PMID:28659259

  8. Initial Progress Toward Development of a Voice-Based Computer-Delivered Motivational Intervention for Heavy Drinking College Students: An Experimental Study.

    Science.gov (United States)

    Kahler, Christopher W; Lechner, William J; MacGlashan, James; Wray, Tyler B; Littman, Michael L

    2017-06-28

    behaviors. In comparison to the text-based computer-delivered intervention condition, those assigned to voice-based computer-delivered intervention reported significantly fewer alcohol-related problems at the 1-month follow-up (incident rate ratio 0.60, 95% CI 0.44-0.83, P=.002). The conditions did not differ significantly on perceived importance of changing drinking or on measures of drinking quantity and frequency of heavy drinking. Results indicate that it is feasible to construct a series of open-ended questions and a bank of responses and follow-up prompts that can be used in a future fully automated voice-based computer-delivered intervention that may mirror more closely human-delivered motivational interventions to reduce drinking. Such efforts will require using advanced speech recognition capabilities and machine-learning approaches to train a program to mirror the decisions made by human controllers in the voice-based computer-delivered intervention used in this study. In addition, future studies should examine enhancements that can increase the perceived warmth and empathy of voice-based computer-delivered intervention, possibly through greater personalization, improvements in the speech generation software, and embodying the computer-delivered intervention in a physical form. ©Christopher W Kahler, William J Lechner, James MacGlashan, Tyler B Wray, Michael L Littman. Originally published in JMIR Mental Health (http://mental.jmir.org), 28.06.2017.

  9. The role of spectral and temporal cues in voice gender discrimination by normal-hearing listeners and cochlear implant users.

    Science.gov (United States)

    Fu, Qian-Jie; Chinchilla, Sherol; Galvin, John J

    2004-09-01

    The present study investigated the relative importance of temporal and spectral cues in voice gender discrimination and vowel recognition by normal-hearing subjects listening to an acoustic simulation of cochlear implant speech processing and by cochlear implant users. In the simulation, the number of speech processing channels ranged from 4 to 32, thereby varying the spectral resolution; the cutoff frequencies of the channels' envelope filters ranged from 20 to 320 Hz, thereby manipulating the available temporal cues. For normal-hearing subjects, results showed that both voice gender discrimination and vowel recognition scores improved as the number of spectral channels was increased. When only 4 spectral channels were available, voice gender discrimination significantly improved as the envelope filter cutoff frequency was increased from 20 to 320 Hz. For all spectral conditions, increasing the amount of temporal information had no significant effect on vowel recognition. Both voice gender discrimination and vowel recognition scores were highly variable among implant users. The performance of cochlear implant listeners was similar to that of normal-hearing subjects listening to comparable speech processing (4-8 spectral channels). The results suggest that both spectral and temporal cues contribute to voice gender discrimination and that temporal cues are especially important for cochlear implant users to identify the voice gender when there is reduced spectral resolution.
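The temporal cue manipulated in such acoustic simulations is the per-channel amplitude envelope, commonly extracted by rectification followed by low-pass filtering. A rough stand-in sketch, using a moving average rather than the study's actual envelope filters, with made-up signal parameters:

```python
import math

def temporal_envelope(signal, window=32):
    """Full-wave rectify, then smooth with a moving average
    (a crude stand-in for a proper envelope low-pass filter)."""
    rectified = [abs(s) for s in signal]
    half = window // 2
    env = []
    for i in range(len(rectified)):
        lo, hi = max(0, i - half), min(len(rectified), i + half)
        env.append(sum(rectified[lo:hi]) / (hi - lo))
    return env

# A 100 Hz amplitude modulation on a 1 kHz carrier at an 8 kHz sampling rate.
sig = [math.sin(2 * math.pi * 100 * n / 8000) * math.sin(2 * math.pi * 1000 * n / 8000)
       for n in range(800)]
env = temporal_envelope(sig)
print(len(env))  # → 800
```

Lengthening the averaging window plays the same role as lowering the envelope filter cutoff: it discards faster temporal fluctuations.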

  10. Science and Software

    Science.gov (United States)

    Zelt, C. A.

    2017-12-01

    Earth science attempts to understand how the earth works. This research often depends on software for modeling, processing, inverting or imaging. Freely sharing open-source software is essential to prevent reinventing the wheel and allows software to be improved and applied in ways the original author may never have envisioned. For young scientists, releasing software can increase their name ID when applying for jobs and funding, and create opportunities for collaborations when scientists who collect data want the software's creator to be involved in their project. However, we frequently hear scientists say software is a tool, it's not science. Creating software that implements a new or better way of earth modeling or geophysical processing, inverting or imaging should be viewed as earth science. Creating software for things like data visualization, format conversion, storage, or transmission, or programming to enhance computational performance, may be viewed as computer science. The former, ideally with an application to real data, can be published in earth science journals, the latter possibly in computer science journals. Citations in either case should accurately reflect the impact of the software on the community. Funding agencies need to support more software development and open-source releasing, and the community should give more high-profile awards for developing impactful open-source software. Funding support and community recognition for software development can have far reaching benefits when the software is used in foreseen and unforeseen ways, potentially for years after the original investment in the software development. For funding, an open-source release that is well documented should be required, with example input and output files. Appropriate funding will provide the incentive and time to release user-friendly software, and minimize the need for others to duplicate the effort. All funded software should be available through a single web site

  11. Pattern Recognition Methods and Features Selection for Speech Emotion Recognition System.

    Science.gov (United States)

    Partila, Pavol; Voznak, Miroslav; Tovarek, Jaromir

    2015-01-01

    The impact of the classification method and features selection for the speech emotion recognition accuracy is discussed in this paper. Selecting the correct parameters in combination with the classifier is an important part of reducing the complexity of system computing. This step is necessary especially for systems that will be deployed in real-time applications. The reason for the development and improvement of speech emotion recognition systems is wide usability in nowadays automatic voice controlled systems. Berlin database of emotional recordings was used in this experiment. Classification accuracy of artificial neural networks, k-nearest neighbours, and Gaussian mixture model is measured considering the selection of prosodic, spectral, and voice quality features. The purpose was to find an optimal combination of methods and group of features for stress detection in human speech. The research contribution lies in the design of the speech emotion recognition system due to its accuracy and efficiency.

  12. Pattern Recognition Methods and Features Selection for Speech Emotion Recognition System

    Directory of Open Access Journals (Sweden)

    Pavol Partila

    2015-01-01

    Full Text Available The impact of the classification method and features selection for the speech emotion recognition accuracy is discussed in this paper. Selecting the correct parameters in combination with the classifier is an important part of reducing the complexity of system computing. This step is necessary especially for systems that will be deployed in real-time applications. The reason for the development and improvement of speech emotion recognition systems is wide usability in nowadays automatic voice controlled systems. Berlin database of emotional recordings was used in this experiment. Classification accuracy of artificial neural networks, k-nearest neighbours, and Gaussian mixture model is measured considering the selection of prosodic, spectral, and voice quality features. The purpose was to find an optimal combination of methods and group of features for stress detection in human speech. The research contribution lies in the design of the speech emotion recognition system due to its accuracy and efficiency.
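Of the three classifiers compared in this record, k-nearest neighbours is the simplest to illustrate. A toy sketch of the idea; the feature values and labels below are invented and not from the Berlin database:

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Majority vote among the k nearest labelled feature vectors.
    `train` is a list of (feature_vector, label) pairs."""
    neighbours = sorted(train, key=lambda fv: math.dist(fv[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Toy 2-D feature space: (pitch mean in Hz, normalized energy) → emotion label.
train = [((110.0, 0.2), "neutral"), ((115.0, 0.25), "neutral"),
         ((180.0, 0.8), "anger"), ((175.0, 0.7), "anger"),
         ((170.0, 0.75), "anger")]
print(knn_classify(train, (178.0, 0.72)))  # → anger
```

Real systems would first scale each feature dimension, since `math.dist` otherwise lets the largest-valued feature dominate the vote.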

  13. Crossing Cultures with Multi-Voiced Journals

    Science.gov (United States)

    Styslinger, Mary E.; Whisenant, Alison

    2004-01-01

    In this article, the authors discuss the benefits of using multi-voiced journals as a teaching strategy in reading instruction. Multi-voiced journals, an adaptation of dual-voiced journals, encourage responses to reading in varied, cultured voices of characters. It is similar to reading journals in that they prod students to connect to the lives…

  14. Objective voice and speech analysis of persons with chronic hoarseness by prosodic analysis of speech samples.

    Science.gov (United States)

    Haderlein, Tino; Döllinger, Michael; Matoušek, Václav; Nöth, Elmar

    2016-10-01

    Automatic voice assessment is often performed using sustained vowels. In contrast, speech analysis of read-out texts can be applied to voice and speech assessment. Automatic speech recognition and prosodic analysis were used to find regression formulae between automatic and perceptual assessment of four voice and four speech criteria. The regression was trained with 21 men and 62 women (average age 49.2 years) and tested with another set of 24 men and 49 women (48.3 years), all suffering from chronic hoarseness. They read the text 'Der Nordwind und die Sonne' ('The North Wind and the Sun'). Five voice and speech therapists evaluated the data on 5-point Likert scales. Ten prosodic and recognition accuracy measures (features) were identified which describe all the examined criteria. Inter-rater correlation within the expert group was between r = 0.63 for the criterion 'match of breath and sense units' and r = 0.87 for the overall voice quality. Human-machine correlation was between r = 0.40 for the match of breath and sense units and r = 0.82 for intelligibility. The perceptual ratings of different criteria were highly correlated with each other. Likewise, the feature sets modeling the criteria were very similar. The automatic method is suitable for assessing chronic hoarseness in general and for subgroups of functional and organic dysphonia. In its current version, it is almost as reliable as a randomly picked rater from a group of voice and speech therapists.
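The inter-rater and human-machine agreement figures quoted in this record are Pearson correlations, which can be computed directly. A minimal sketch with invented rater scores:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two rating series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Two hypothetical raters scoring five voices on a 5-point Likert scale.
rater_a = [1, 2, 3, 4, 5]
rater_b = [2, 2, 3, 5, 5]
print(round(pearson_r(rater_a, rater_b), 2))  # → 0.94
```

Values near the study's r = 0.87 for overall voice quality indicate that two raters order the voices almost identically.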

  15. Software for radiation protection

    International Nuclear Information System (INIS)

    Graffunder, H.

    2002-01-01

    The software products presented are universally usable programs for radiation protection. The systems were designed to establish a comprehensive database specific to radiation protection and, on this basis, to model radiation protection topics in programs. Development initially focused on the creation of the database. Each software product was to access the same nuclide-specific data; input errors and differences in spelling were to be excluded from the outset. This makes the products more compatible with each other and able to exchange data among each other. The software products are modular in design. Functions recurring in radiation protection are always treated the same way in different programs, and are also represented the same way on the program surface. The recognition effect makes it easy for users to familiarize themselves with the products quickly. All software products are written in German and are tailored to the administrative needs and codes and regulations in Germany and Switzerland. (orig.) [de

  16. How to help teachers' voices.

    Science.gov (United States)

    Saatweber, Margarete

    2008-01-01

    It has been shown that teachers are at high risk of developing occupational dysphonia, and it has been widely accepted that the vocal characteristics of a speaker play an important role in determining the reactions of listeners. The functions of breathing, breathing movement, breathing tonus, voice vibrations and articulation tonus are transmitted to the listener. So we may conclude that listening to the teacher's voice at school influences children's behavior and the perception of spoken language. This paper presents the concept of Schlaffhorst-Andersen including exercises to help teachers improve their voice, breathing, movement and their posture. Copyright 2008 S. Karger AG, Basel.

  17. Voice stress analysis and evaluation

    Science.gov (United States)

    Haddad, Darren M.; Ratley, Roy J.

    2001-02-01

    Voice Stress Analysis (VSA) systems are marketed as computer-based systems capable of measuring stress in a person's voice as an indicator of deception. They are advertised as being less expensive, easier to use, less invasive in use, and less constrained in their operation than polygraph technology. The National Institute of Justice has asked the Air Force Research Laboratory for assistance in evaluating voice stress analysis technology. Law enforcement officials have also been asking questions about this technology. If VSA technology proves to be effective, its value for military and law enforcement applications is tremendous.

  18. Predicting Voice Disorder Status From Smoothed Measures of Cepstral Peak Prominence Using Praat and Analysis of Dysphonia in Speech and Voice (ADSV).

    Science.gov (United States)

    Sauder, Cara; Bretl, Michelle; Eadie, Tanya

    2017-09-01

    The purposes of this study were to (1) determine and compare the diagnostic accuracy of a single acoustic measure, smoothed cepstral peak prominence (CPPS), to predict voice disorder status from connected speech samples using two software systems: Analysis of Dysphonia in Speech and Voice (ADSV) and Praat; and (2) to determine the relationship between measures of CPPS generated from these programs. This is a retrospective cross-sectional study. Measures of CPPS were obtained from connected speech recordings of 100 subjects with voice disorders and 70 nondysphonic subjects without vocal complaints using the commercially available ADSV and freely downloadable Praat software programs. Logistic regression and receiver operating characteristic (ROC) analyses were used to evaluate and compare the diagnostic accuracy of CPPS measures. Relationships between CPPS measures from the programs were determined. Results showed acceptable overall accuracy rates (75% accuracy, ADSV; 82% accuracy, Praat) and areas under the ROC curves (area under the curve [AUC] = 0.81, ADSV; AUC = 0.91, Praat) for predicting voice disorder status, with slight differences in sensitivity and specificity. CPPS measures derived from Praat were uniquely predictive of disorder status above and beyond CPPS measures from ADSV (χ²(1) = 40.71, P disorder status using either program. Clinicians may consider using CPPS to complement clinical voice evaluation and screening protocols. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
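The AUC values reported in this record summarize the ROC analysis; AUC equals the probability that a randomly chosen disordered case receives a higher classifier score than a randomly chosen control (the Mann-Whitney formulation). A sketch with hypothetical scores; these are classifier outputs, not raw CPPS values, which tend to be lower, not higher, in dysphonic voices:

```python
def roc_auc(scores_pos, scores_neg):
    """AUC via the Mann-Whitney U statistic: the fraction of
    (positive, negative) pairs ranked correctly, ties counting half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical classifier scores (higher = more likely disordered).
disordered = [0.9, 0.8, 0.75, 0.6]
controls = [0.4, 0.5, 0.65, 0.3]
print(roc_auc(disordered, controls))  # → 0.9375
```

An AUC of 0.91, as reported for Praat, means a randomly chosen disordered voice outranks a randomly chosen control 91% of the time.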

  19. Voice Habits and Behaviors: Voice Care Among Flamenco Singers.

    Science.gov (United States)

    Garzón García, Marina; Muñoz López, Juana; Y Mendoza Lara, Elvira

    2017-03-01

    The purpose of this study is to analyze the vocal behavior of flamenco singers, as compared with classical music singers, to establish a differential vocal profile of voice habits and behaviors in flamenco music. Bibliographic review was conducted, and the Singer's Vocal Habits Questionnaire, an experimental tool designed by the authors to gather data regarding hygiene behavior, drinking and smoking habits, type of practice, voice care, and symptomatology perceived in both the singing and the speaking voice, was administered. We interviewed 94 singers, divided into two groups: the flamenco experimental group (FEG, n = 48) and the classical control group (CCG, n = 46). Frequency analysis, a Likert scale, and discriminant and exploratory factor analysis were used to obtain a differential profile for each group. The FEG scored higher than the CCG in speaking voice symptomatology. The FEG scored significantly higher than the CCG in use of "inadequate vocal technique" when singing. Regarding voice habits, the FEG scored higher in "lack of practice and warm-up" and "environmental habits." A total of 92.6% of the subjects classified themselves correctly in each group. The Singer's Vocal Habits Questionnaire has proven effective in differentiating flamenco and classical singers. Flamenco singers are exposed to numerous vocal risk factors that make them more prone to vocal fatigue, mucosa dehydration, phonotrauma, and muscle stiffness than classical singers. Further research is needed in voice training in flamenco music, as a means to strengthen the voice and enable it to meet the requirements of this musical genre. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  20. A posteriori error estimates in voice source recovery

    Science.gov (United States)

    Leonov, A. S.; Sorokin, V. N.

    2017-12-01

    The inverse problem of voice source pulse recovery from a segment of a speech signal is considered. A special mathematical model that relates these quantities is used for the solution. A variational method for solving the inverse problem of voice source recovery for a new parametric class of sources, namely piecewise-linear sources (PWL-sources), is proposed. A technique for a posteriori numerical error estimation of the obtained solutions is also presented. A computer study of the adequacy of the adopted speech production model with PWL-sources is performed by solving the inverse problems for various types of voice signals, together with a corresponding study of the a posteriori error estimates. Numerical experiments on speech signals show satisfactory properties of the proposed a posteriori error estimates, which represent upper bounds on the possible errors in solving the inverse problem. The estimate of the most probable error in determining the source-pulse shapes is about 7-8% for the investigated speech material. It is noted that a posteriori error estimates can be used as a quality criterion for the obtained voice source pulses in application to speaker recognition.

  1. Popstjerne af lys, lyd og software

    DEFF Research Database (Denmark)

    Hasse Jørgensen, Stina

    2016-01-01

    Hatsune Miku is a 3D animated hologram, and her voice is a vocaloid. In other words, she is a software application. Nevertheless, she is a world star with stadium concerts and an astronomical number of fans. She is a crowdsourced Internet phenomenon: her fans compose her hits and choreograph her...

  2. Voice and choice by delegation.

    Science.gov (United States)

    van de Bovenkamp, Hester; Vollaard, Hans; Trappenburg, Margo; Grit, Kor

    2013-02-01

    In many Western countries, options for citizens to influence public services are increased to improve the quality of services and democratize decision making. Possibilities to influence are often cast into Albert Hirschman's taxonomy of exit (choice), voice, and loyalty. In this article we identify delegation as an important addition to this framework. Delegation gives individuals the chance to practice exit/choice or voice without all the hard work that is usually involved in these options. Empirical research shows that not many people use their individual options of exit and voice, which could lead to inequality between users and nonusers. We identify delegation as a possible solution to this problem, using Dutch health care as a case study to explore this option. Notwithstanding various advantages, we show that voice and choice by delegation also entail problems of inequality and representativeness.

  3. Voice Force tulekul / Tõnu Ojala

    Index Scriptorium Estoniae

    Ojala, Tõnu, 1969-

    2005-01-01

    On an event in the jubilee season of the Tallinn University of Technology Academic Male Choir, which celebrates its 60th anniversary: the a cappella pop-group festival Voice Force (concerts on 12 November at the club Parlament and on 3 December at the Russian Cultural Centre)

  4. Taking Care of Your Voice

    Science.gov (United States)

    ... negative effect on voice. Exercise regularly. Exercise increases stamina and muscle tone. This helps provide good posture ... testing man-made and biological materials and stem cell technologies that may eventually be used to engineer ...

  5. The Christian voice in philosophy

    Directory of Open Access Journals (Sweden)

    Stuart Fowler

    1982-03-01

    Full Text Available In this paper the Rev. Stuart Fowler outlines a Christian voice in Philosophy and urges the Christian philosopher to investigate his position and his stance with integrity and honesty.

  6. Memory for faces and voices varies as a function of sex and expressed emotion.

    Science.gov (United States)

    S Cortes, Diana; Laukka, Petri; Lindahl, Christina; Fischer, Håkan

    2017-01-01

    We investigated how memory for faces and voices (presented separately and in combination) varies as a function of sex and emotional expression (anger, disgust, fear, happiness, sadness, and neutral). At encoding, participants judged the expressed emotion of items in forced-choice tasks, followed by incidental Remember/Know recognition tasks. Results from 600 participants showed that accuracy (hits minus false alarms) was consistently higher for neutral compared to emotional items, whereas accuracy for specific emotions varied across the presentation modalities (i.e., faces, voices, and face-voice combinations). For the subjective sense of recollection ("remember" hits), neutral items received the highest hit rates only for faces, whereas for voices and face-voice combinations anger and fear expressions instead received the highest recollection rates. We also observed better accuracy for items by female expressers, and own-sex bias where female participants displayed memory advantage for female faces and face-voice combinations. Results further suggest that own-sex bias can be explained by recollection, rather than familiarity, rates. Overall, results show that memory for faces and voices may be influenced by the expressions that they carry, as well as by the sex of both items and participants. Emotion expressions may also enhance the subjective sense of recollection without enhancing memory accuracy.

  7. Recognizing famous voices: influence of stimulus duration and different types of retrieval cues.

    Science.gov (United States)

    Schweinberger, S R; Herholz, A; Sommer, W

    1997-04-01

The current investigation measured the effects of increasing stimulus duration on listeners' ability to recognize famous voices. In addition, the investigation studied the influence of different types of cues on the naming of voices that could not be named before. Participants were presented with samples of famous and unfamiliar voices and were asked to decide whether or not the samples were spoken by a famous person. The duration of each sample increased in seven steps from 0.25 s up to a maximum of 2 s. Voice recognition improved with stimulus duration following a growth function: gains were most rapid within the first second and less pronounced thereafter. When participants were unable to name a famous voice, they were cued with either a second voice sample, the occupation, or the initials of the celebrity. Initials were most effective in eliciting the name only when semantic information about the speaker had been accessed prior to cue presentation. Paralleling previous research on face naming, this may indicate that voice naming is contingent on previous activation of person-specific semantic information.

  8. Memory for faces and voices varies as a function of sex and expressed emotion.

    Directory of Open Access Journals (Sweden)

    Diana S Cortes

Full Text Available We investigated how memory for faces and voices (presented separately and in combination) varies as a function of sex and emotional expression (anger, disgust, fear, happiness, sadness, and neutral). At encoding, participants judged the expressed emotion of items in forced-choice tasks, followed by incidental Remember/Know recognition tasks. Results from 600 participants showed that accuracy (hits minus false alarms) was consistently higher for neutral compared to emotional items, whereas accuracy for specific emotions varied across the presentation modalities (i.e., faces, voices, and face-voice combinations). For the subjective sense of recollection ("remember" hits), neutral items received the highest hit rates only for faces, whereas for voices and face-voice combinations anger and fear expressions instead received the highest recollection rates. We also observed better accuracy for items by female expressers, and own-sex bias where female participants displayed memory advantage for female faces and face-voice combinations. Results further suggest that own-sex bias can be explained by recollection, rather than familiarity, rates. Overall, results show that memory for faces and voices may be influenced by the expressions that they carry, as well as by the sex of both items and participants. Emotion expressions may also enhance the subjective sense of recollection without enhancing memory accuracy.

  9. The voice of emotion across species: how do human listeners recognize animals' affective states?

    Directory of Open Access Journals (Sweden)

    Marina Scheumann

Full Text Available Voice-induced cross-taxa emotional recognition is the ability to understand the emotional state of another species based on its voice. In the past, induced affective states, experience-dependent higher cognitive processes or cross-taxa universal acoustic coding and processing mechanisms have been discussed to underlie this ability in humans. The present study sets out to distinguish the influence of familiarity and phylogeny on voice-induced cross-taxa emotional perception in humans. For the first time, two perspectives are taken into account: the self-perspective (i.e. emotional valence induced in the listener) versus the others-perspective (i.e. correct recognition of the emotional valence of the recording context). Twenty-eight male participants listened to 192 vocalizations of four different species (human infant, dog, chimpanzee and tree shrew). Stimuli were recorded either in an agonistic (negative emotional valence) or affiliative (positive emotional valence) context. Participants rated the emotional valence of the stimuli adopting self- and others-perspective by using a 5-point version of the Self-Assessment Manikin (SAM). Familiarity was assessed based on subjective rating, objective labelling of the respective stimuli and interaction time with the respective species. Participants reliably recognized the emotional valence of human voices, whereas the results for animal voices were mixed. The correct classification of animal voices depended on the listener's familiarity with the species and the call type/recording context, whereas there was less influence of induced emotional states and phylogeny. Our results provide first evidence that explicit voice-induced cross-taxa emotional recognition in humans is shaped more by experience-dependent cognitive mechanisms than by induced affective states or cross-taxa universal acoustic coding and processing mechanisms.

  10. Practical Voice Recognition for the Aircraft Cockpit, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — This proposal responds to the urgent need for improved pilot interfaces in the modern aircraft cockpit. Recent advances in aircraft equipment bring tremendous...

  11. Software engineering

    CERN Document Server

    Sommerville, Ian

    2010-01-01

The ninth edition of Software Engineering presents a broad perspective of software engineering, focusing on the processes and techniques fundamental to the creation of reliable software systems. Increased coverage of agile methods and software reuse, along with coverage of 'traditional' plan-driven software engineering, gives readers the most up-to-date view of the field currently available. Practical case studies, a full set of easy-to-access supplements, and extensive web resources make teaching the course easier than ever.

  12. Voice and gesture-based 3D multimedia presentation tool

    Science.gov (United States)

    Fukutake, Hiromichi; Akazawa, Yoshiaki; Okada, Yoshihiro

    2007-09-01

This paper proposes a 3D multimedia presentation tool that the user can manipulate intuitively through voice and gesture input alone, without using a standard keyboard or a mouse device. The authors developed this system as a presentation tool to be used in a presentation room equipped with a large screen, like an exhibition room in a museum, because in such a presentation environment it is better to use voice commands and gesture pointing input than a keyboard or a mouse device. This system was developed using IntelligentBox, which is a component-based 3D graphics software development system. IntelligentBox already provides various types of 3D visible, reactive functional components called boxes, e.g., a voice input component and various multimedia handling components. IntelligentBox also provides a dynamic data linkage mechanism called slot-connection that allows the user to develop 3D graphics applications by combining existing boxes through direct manipulations on a computer screen. Using IntelligentBox, the 3D multimedia presentation tool proposed in this paper was likewise developed by combining components only through direct manipulations on a computer screen. The authors have already proposed a 3D multimedia presentation tool using a stage metaphor and its voice input interface. This time, we extended the system to make it accept user gesture input besides voice commands. This paper explains details of the proposed 3D multimedia presentation tool and especially describes its component-based voice and gesture input interfaces.
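The slot-connection mechanism described above is a dynamic data linkage between components. The following is a minimal, hypothetical sketch of that idea in Python; the `Box` class and its methods are invented for illustration and are not the actual IntelligentBox API:

```python
class Box:
    """A component with named slots; slot changes propagate to connected boxes."""
    def __init__(self):
        self.slots = {}
        self.connections = []  # (my_slot, other_box, other_slot)

    def connect(self, my_slot, other, other_slot):
        self.connections.append((my_slot, other, other_slot))

    def set_slot(self, name, value):
        self.slots[name] = value
        # Forward the new value along every connection leaving this slot
        for src, other, dst in self.connections:
            if src == name:
                other.set_slot(dst, value)

# A voice-input box forwarding a recognized command to a presentation box
voice, stage = Box(), Box()
voice.connect("command", stage, "action")
voice.set_slot("command", "next slide")
print(stage.slots["action"])  # next slide
```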

  13. Speech to Text Software Evaluation Report

    CERN Document Server

    Martins Santo, Ana Luisa

    2017-01-01

This document compares the out-of-box performance of three commercially available speech recognition systems: Vocapia VoxSigma™, Google Cloud Speech, and Limecraft Transcriber. A set of evaluation criteria and test methods for speech recognition software is defined. The evaluation of these systems in noisy environments is also included for testing purposes. Recognition accuracy was compared across noisy environments and languages. Testing in an "ideal", non-noisy environment (a quiet room) was also performed for comparison.
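Recognition accuracy in such comparisons is commonly reported as word error rate (WER): the word-level edit distance between reference and hypothesis transcripts, divided by the reference length. A self-contained sketch (the sentences are made up):

```python
def wer(reference, hypothesis):
    """Word error rate: (substitutions + insertions + deletions) / len(reference)."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("the quick brown fox", "the quick brow fox"))  # 0.25 (1 error / 4 words)
```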

  14. Smart Homes with Voice Activated Systems for Disabled People

    Directory of Open Access Journals (Sweden)

    Bekir Busatlic

    2017-02-01

Full Text Available Smart home refers to the application of various technologies to semi-unsupervised home control: systems that control temperature, lighting, door locks, windows and many other appliances. The aim of this study was to design a system that uses existing technology to showcase how it can benefit people with disabilities. This work uses only off-the-shelf products (smart home devices and controllers, speech recognition technology, and open-source code libraries). The Voice Activated Smart Home application was developed to demonstrate online grocery shopping and home control using voice commands, and was tested by measuring its effectiveness in performing tasks as well as its efficiency in recognizing user speech input.
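At its core, a voice-activated home application of this kind maps recognized utterances to device actions. A minimal, hypothetical command dispatcher; the phrases and device names below are invented, and a real system would receive the utterance from a speech recognizer:

```python
# Invented command table: recognized phrase -> (device, new state)
COMMANDS = {
    "turn on the lights": ("lights", "on"),
    "turn off the lights": ("lights", "off"),
    "lock the door": ("door", "locked"),
}

def dispatch(utterance, state):
    """Apply a recognized utterance to the simulated home state.

    Returns True if the utterance matched a known command."""
    action = COMMANDS.get(utterance.lower().strip())
    if action is None:
        return False  # not recognized as a command
    device, value = action
    state[device] = value
    return True

home = {"lights": "off", "door": "unlocked"}
dispatch("Turn on the lights", home)
print(home["lights"])  # on
```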

  15. Two-component network model in voice identification technologies

    Directory of Open Access Journals (Sweden)

    Edita K. Kuular

    2018-03-01

Full Text Available Among the most important parameters of biometric systems with voice modalities that determine their effectiveness, along with reliability and noise immunity, is the speed of identification and verification of a person. This parameter is especially sensitive when processing large-scale voice databases in real time. Many research studies in this area aim at developing new, and improving existing, algorithms for representing and processing voice records to ensure high performance of voice biometric systems. Here, a modern approach seems promising: the complex network platform for solving massive problems with a large number of elements while taking their interrelationships into account. Some known works, when solving problems of analysis and recognition of faces from photographs, transform the images into complex networks for subsequent processing by standard techniques. One of the first applications of complex networks to sound series (musical and speech analysis) was the description of frequency characteristics by constructing network models, i.e. converting the series into networks. On this network-ontology platform, a previously proposed technique of audio information representation, aimed at automatic analysis and speaker recognition, has been developed. It implies converting the information into an associative semantic (cognitive) network structure with both amplitude and frequency components. Two speaker exemplars were recorded and transformed into pertinent networks, with subsequent comparison of their topological metrics. The set of topological metrics for each network model (amplitude and frequency) is a vector, and together they combine into a matrix, as a digital "network" voiceprint.
The proposed network approach, with its sensitivity to personal conditions (physiological, psychological, emotional), might be useful not only for person identification
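One well-known way to convert a sound series into a network, sketched below, is the natural visibility graph; this is an illustration of the series-to-network idea, not necessarily the construction the authors use. Two nodes (samples) are linked when the straight line between them clears every intermediate sample, and topological metrics of the resulting graph can then serve as a "network" fingerprint:

```python
def visibility_graph(series):
    """Natural visibility graph: nodes are samples; i and j are linked if the
    straight line from (i, y_i) to (j, y_j) passes above every sample between."""
    n = len(series)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if all(series[k] < series[i] + (series[j] - series[i]) * (k - i) / (j - i)
                   for k in range(i + 1, j)):
                edges.add((i, j))
    return edges

def mean_degree(n, edges):
    """One simple topological metric of the resulting network."""
    return 2 * len(edges) / n

# A short, made-up amplitude series standing in for a voice recording
amplitudes = [0.1, 0.8, 0.3, 0.9, 0.2, 0.7]
g = visibility_graph(amplitudes)
print(len(g))  # 7 edges
```

A voiceprint in this spirit would collect several such metrics (degree, clustering, path lengths) per network model into a vector, and compare vectors across speakers.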

  16. Machine Learning for Text-Independent Speaker Verification: How to Teach a Machine to Recognize Human Voices

    OpenAIRE

    Imoscopi, Stefano

    2016-01-01

    The aim of speaker recognition and verification is to identify people from the characteristics of their voices (voice biometrics). Traditionally this technology has been employed mostly for security or authentication purposes, identification of employees/customers and criminal investigations. During the last decade the increasing popularity of hands-free and voice-controlled systems and the massive growth of media content generated on the internet has increased the need for technique...

  17. Understanding the 'Anorexic Voice' in Anorexia Nervosa.

    Science.gov (United States)

    Pugh, Matthew; Waller, Glenn

    2017-05-01

    In common with individuals experiencing a number of disorders, people with anorexia nervosa report experiencing an internal 'voice'. The anorexic voice comments on the individual's eating, weight and shape and instructs the individual to restrict or compensate. However, the core characteristics of the anorexic voice are not known. This study aimed to develop a parsimonious model of the voice characteristics that are related to key features of eating disorder pathology and to determine whether patients with anorexia nervosa fall into groups with different voice experiences. The participants were 49 women with full diagnoses of anorexia nervosa. Each completed validated measures of the power and nature of their voice experience and of their responses to the voice. Different voice characteristics were associated with current body mass index, duration of disorder and eating cognitions. Two subgroups emerged, with 'weaker' and 'stronger' voice experiences. Those with stronger voices were characterized by having more negative eating attitudes, more severe compensatory behaviours, a longer duration of illness and a greater likelihood of having the binge-purge subtype of anorexia nervosa. The findings indicate that the anorexic voice is an important element of the psychopathology of anorexia nervosa. Addressing the anorexic voice might be helpful in enhancing outcomes of treatments for anorexia nervosa, but that conclusion might apply only to patients with more severe eating psychopathology. Copyright © 2016 John Wiley & Sons, Ltd. Experiences of an internal 'anorexic voice' are common in anorexia nervosa. Clinicians should consider the role of the voice when formulating eating pathology in anorexia nervosa, including how individuals perceive and relate to that voice. Addressing the voice may be beneficial, particularly in more severe and enduring forms of anorexia nervosa. 
When working with the voice, clinicians should aim to address both the content of the voice and how

  18. Anti-voice adaptation suggests prototype-based coding of voice identity

    Directory of Open Access Journals (Sweden)

    Marianne eLatinus

    2011-07-01

Full Text Available We used perceptual aftereffects induced by adaptation with anti-voice stimuli to investigate voice identity representations. Participants learned a set of voices then were tested on a voice identification task with vowel stimuli morphed between identities, after different conditions of adaptation. In Experiment 1, participants chose the identity opposite to the adapting anti-voice significantly more often than the other two identities (e.g., after being adapted to anti-A, they identified the average voice as A). In Experiment 2, participants showed a bias for identities opposite to the adaptor specifically for anti-voice, but not for non-anti-voice, adaptors. These results are strikingly similar to adaptation aftereffects observed for facial identity. They are compatible with a representation of individual voice identities in a multidimensional perceptual voice space referenced on a voice prototype.

  19. Optical voice encryption based on digital holography.

    Science.gov (United States)

    Rajput, Sudheesh K; Matoba, Osamu

    2017-11-15

    We propose an optical voice encryption scheme based on digital holography (DH). An off-axis DH is employed to acquire voice information by obtaining phase retardation occurring in the object wave due to sound wave propagation. The acquired hologram, including voice information, is encrypted using optical image encryption. The DH reconstruction and decryption with all the correct parameters can retrieve an original voice. The scheme has the capability to record the human voice in holograms and encrypt it directly. These aspects make the scheme suitable for other security applications and help to use the voice as a potential security tool. We present experimental and some part of simulation results.

  20. OPEN SOURCE SOFTWARE, FREE SOFTWARE?

    Directory of Open Access Journals (Sweden)

    Nur Aini Rakhmawati

    2006-01-01

    Full Text Available With the enactment of the Intellectual Property Rights Law (HAKI), open source software has emerged as a new alternative. The use of open source software is spreading alongside current global issues in Information and Communication Technology (ICT). Several organizations and companies have begun to take open source software into consideration. There are many conceptions of open source software, ranging from free-of-charge software to unlicensed software. Not all of these notions are accurate, so the concept of open source software needs to be introduced, covering its history, licensing and how to choose a license, as well as considerations in selecting among the available open source software. Keywords: license, open source, intellectual property rights (HAKI)

  1. Indonesian Automatic Speech Recognition For Command Speech Controller Multimedia Player

    Directory of Open Access Journals (Sweden)

    Vivien Arief Wardhany

    2014-12-01

    Full Text Available The purpose of multimedia device development is control through voice. Nowadays, voice commands can be recognized only in English. To overcome this issue, recognition here uses an Indonesian language model, acoustic model and dictionary. The Automatic Speech Recognizer is built using the CMU Sphinx engine, with the English language database modified to Indonesian, and XBMC is used as the multimedia player. The experiment used 10 volunteers testing items based on 7 commands. The volunteers were classified by gender: 5 male and 5 female. 10 samples were taken for each command, and each volunteer performed 10 test commands, trying all 7 commands provided. Based on the classification table, the word "kanan" was recognized most often, with 83%, while "pilih" was recognized least often. The word with the most incorrect classifications was "kembali", with 67%, while "kanan" had the fewest. The recognition rate (RR) results for male speakers show that several commands, such as "kembali", "utama", "atas" and "bawah", have a low recognition rate. In particular, "kembali" could not be recognized at all in the female voices, and in the male voices it reached only 4% RR; this is because the command has no similar-sounding English word near "kembali", so the system fails to recognize it. Likewise, the command "pilih" reached 80% RR for female voices but only 4% for male voices. This is mostly because of the different voice characteristics of adult males and females: males have lower voice frequencies (85 to 180 Hz) than females (165 to 255 Hz). The results of the experiment showed that each speaker had a different recognition rate caused by differences in tone, pronunciation, and speed of speech. Further work is needed to improve the accuracy of the Indonesian Automatic Speech Recognition system.
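The per-command recognition rates reported above reduce to a simple tally over test trials. A sketch with made-up trial counts shaped like the abstract's figures:

```python
from collections import defaultdict

def recognition_rates(trials):
    """trials: list of (spoken_command, recognized_command) pairs.

    Returns the fraction of correctly recognized trials per spoken command."""
    total, correct = defaultdict(int), defaultdict(int)
    for spoken, recognized in trials:
        total[spoken] += 1
        if recognized == spoken:
            correct[spoken] += 1
    return {cmd: correct[cmd] / total[cmd] for cmd in total}

# Illustrative data: 100 trials per command, not the study's raw results
trials = ([("kanan", "kanan")] * 83 + [("kanan", "kiri")] * 17
          + [("kembali", "kembali")] * 4 + [("kembali", "utama")] * 96)
rates = recognition_rates(trials)
print(rates["kanan"], rates["kembali"])  # 0.83 0.04
```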

  2. Towards Real-Time Speech Emotion Recognition for Affective E-Learning

    Science.gov (United States)

    Bahreini, Kiavash; Nadolski, Rob; Westera, Wim

    2016-01-01

    This paper presents the voice emotion recognition part of the FILTWAM framework for real-time emotion recognition in affective e-learning settings. FILTWAM (Framework for Improving Learning Through Webcams And Microphones) intends to offer timely and appropriate online feedback based upon learner's vocal intonations and facial expressions in order…

  3. Software Epistemology

    Science.gov (United States)

    2016-03-01

    ...in-vitro decision to incubate a startup, Lexumo [7], which is developing a commercial Software as a Service (SaaS) vulnerability assessment... Acronyms: LTS, Label Transition System; MUSE, Mining and Understanding Software Enclaves; RTEMS, Real-Time Executive for Multi-processor Systems; SaaS, Software as a Service; SSA, Static Single Assignment; SWE, Software Epistemology; UD/DU, Def-Use/Use-Def Chains (Dataflow Graph)

  4. Mechanics of human voice production and control.

    Science.gov (United States)

    Zhang, Zhaoyan

    2016-10-01

    As the primary means of communication, voice plays an important role in daily life. Voice also conveys personal information such as social status, personal traits, and the emotional state of the speaker. Mechanically, voice production involves complex fluid-structure interaction within the glottis and its control by laryngeal muscle activation. An important goal of voice research is to establish a causal theory linking voice physiology and biomechanics to how speakers use and control voice to communicate meaning and personal information. Establishing such a causal theory has important implications for clinical voice management, voice training, and many speech technology applications. This paper provides a review of voice physiology and biomechanics, the physics of vocal fold vibration and sound production, and laryngeal muscular control of the fundamental frequency of voice, vocal intensity, and voice quality. Current efforts to develop mechanical and computational models of voice production are also critically reviewed. Finally, issues and future challenges in developing a causal theory of voice production and perception are discussed.

  5. Similar representations of emotions across faces and voices.

    Science.gov (United States)

    Kuhn, Lisa Katharina; Wydell, Taeko; Lavan, Nadine; McGettigan, Carolyn; Garrido, Lúcia

    2017-09-01

    [Correction Notice: An Erratum for this article was reported in Vol 17(6) of Emotion (see record 2017-18585-001). In the article, the copyright attribution was incorrectly listed and the Creative Commons CC-BY license disclaimer was incorrectly omitted from the author note. The correct copyright is "© 2017 The Author(s)" and the omitted disclaimer is below. All versions of this article have been corrected. "This article has been published under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Copyright for this article is retained by the author(s). Author(s) grant(s) the American Psychological Association the exclusive right to publish the article and identify itself as the original publisher."] Emotions are a vital component of social communication, carried across a range of modalities and via different perceptual signals such as specific muscle contractions in the face and in the upper respiratory system. Previous studies have found that emotion recognition impairments after brain damage depend on the modality of presentation: recognition from faces may be impaired whereas recognition from voices remains preserved, and vice versa. On the other hand, there is also evidence for shared neural activation during emotion processing in both modalities. In a behavioral study, we investigated whether there are shared representations in the recognition of emotions from faces and voices. We used a within-subjects design in which participants rated the intensity of facial expressions and nonverbal vocalizations for each of the 6 basic emotion labels. For each participant and each modality, we then computed a representation matrix with the intensity ratings of each emotion. These matrices allowed us to examine the patterns of confusions between emotions and to characterize the representations

  6. Software reliability

    CERN Document Server

    Bendell, A

    1986-01-01

    Software Reliability reviews some fundamental issues of software reliability as well as the techniques, models, and metrics used to predict the reliability of software. Topics covered include fault avoidance, fault removal, and fault tolerance, along with statistical methods for the objective assessment of predictive accuracy. Development cost models and life-cycle cost models are also discussed. This book is divided into eight sections and begins with a chapter on adaptive modeling used to predict software reliability, followed by a discussion on failure rate in software reliability growth models.

  7. Quick Statistics about Voice, Speech, and Language

    Science.gov (United States)


  8. English Voicing in Dimensional Theory*

    Science.gov (United States)

    Iverson, Gregory K.; Ahn, Sang-Cheol

    2007-01-01

    Assuming a framework of privative features, this paper interprets two apparently disparate phenomena in English phonology as structurally related: the lexically specific voicing of fricatives in plural nouns like wives or thieves and the prosodically governed “flapping” of medial /t/ (and /d/) in North American varieties, which we claim is itself not a rule per se, but rather a consequence of the laryngeal weakening of fortis /t/ in interaction with speech-rate determined segmental abbreviation. Taking as our point of departure the Dimensional Theory of laryngeal representation developed by Avery & Idsardi (2001), along with their assumption that English marks voiceless obstruents but not voiced ones (Iverson & Salmons 1995), we find that an unexpected connection between fricative voicing and coronal flapping emerges from the interplay of familiar phonemic and phonetic factors in the phonological system. PMID:18496590

  9. Voices Falling Through the Air

    Directory of Open Access Journals (Sweden)

    Paul Elliman

    2012-11-01

    Full Text Available Where am I? Or as the young boy in Jules Verne’s Journey to the Centre of the Earth calls back to his distant-voiced companions: ‘Lost… in the most intense darkness.’ ‘Then I understood it,’ says the boy, Axel, ‘To make them hear me, all I had to do was to speak with my mouth close to the wall, which would serve to conduct my voice, as the wire conducts the electric fluid’ (Verne 1864). By timing their calls, the group of explorers work out that Axel is separated from them by a distance of four miles, held in a cavernous vertical gallery of smooth rock. Feeling his way down towards the others, the boy ends up falling, along with his voice, through the space. Losing consciousness he seems to give himself up to the space...

  10. Computer software.

    Science.gov (United States)

    Rosenthal, L E

    1986-10-01

    Software is the component in a computer system that permits the hardware to perform the various functions that a computer system is capable of doing. The history of software and its development can be traced to the early nineteenth century. All computer systems are designed to utilize the "stored program concept" as first developed by Charles Babbage in the 1850s. The concept was lost until the mid-1940s, when modern computers made their appearance. Today, because of the complex and myriad tasks that a computer system can perform, there has been a differentiation of types of software. There is software designed to perform specific business applications. There is software that controls the overall operation of a computer system. And there is software that is designed to carry out specialized tasks. Regardless of type, software is the most critical component of any computer system. Without it, all one has is a collection of circuits, transistors, and silicon chips.

  11. Speaker Recognition

    DEFF Research Database (Denmark)

    Mølgaard, Lasse Lohilahti; Jørgensen, Kasper Winther

    2005-01-01

    Speaker recognition is basically divided into speaker identification and speaker verification. Verification is the task of automatically determining if a person really is the person he or she claims to be. This technology can be used as a biometric feature for verifying the identity of a person...
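Verification as described reduces to comparing a test sample against the claimed speaker's enrolled voice model and thresholding the similarity score. A minimal cosine-similarity sketch over hypothetical fixed-length voice embeddings; real systems derive such vectors from acoustic features (e.g. MFCCs), and the numbers and threshold here are invented:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verify(enrolled, sample, threshold=0.8):
    """Accept the identity claim if the sample is close enough to the enrolment."""
    return cosine(enrolled, sample) >= threshold

alice = [0.9, 0.1, 0.4]          # enrolled embedding (hypothetical)
claim_true = [0.85, 0.15, 0.45]  # same speaker, slightly different utterance
claim_false = [0.1, 0.9, 0.2]    # impostor
print(verify(alice, claim_true), verify(alice, claim_false))  # True False
```

The threshold trades off false acceptances against false rejections; biometric systems tune it on held-out data.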

  12. Speaker's voice as a memory cue.

    Science.gov (United States)

    Campeanu, Sandra; Craik, Fergus I M; Alain, Claude

    2015-02-01

    Speaker's voice occupies a central role as the cornerstone of auditory social interaction. Here, we review the evidence suggesting that speaker's voice constitutes an integral context cue in auditory memory. Investigation into the nature of voice representation as a memory cue is essential to understanding auditory memory and the neural correlates which underlie it. Evidence from behavioral and electrophysiological studies suggest that while specific voice reinstatement (i.e., same speaker) often appears to facilitate word memory even without attention to voice at study, the presence of a partial benefit of similar voices between study and test is less clear. In terms of explicit memory experiments utilizing unfamiliar voices, encoding methods appear to play a pivotal role. Voice congruency effects have been found when voice is specifically attended at study (i.e., when relatively shallow, perceptual encoding takes place). These behavioral findings coincide with neural indices of memory performance such as the parietal old/new recollection effect and the late right frontal effect. The former distinguishes between correctly identified old words and correctly identified new words, and reflects voice congruency only when voice is attended at study. Characterization of the latter likely depends upon voice memory, rather than word memory. There is also evidence to suggest that voice effects can be found in implicit memory paradigms. However, the presence of voice effects appears to depend greatly on the task employed. Using a word identification task, perceptual similarity between study and test conditions is, like for explicit memory tests, crucial. In addition, the type of noise employed appears to have a differential effect. While voice effects have been observed when white noise is used at both study and test, using multi-talker babble does not confer the same results. In terms of neuroimaging research modulations, characterization of an implicit memory effect

  13. Permanent Quadriplegia Following Replacement of Voice Prosthesis.

    Science.gov (United States)

    Ozturk, Kayhan; Erdur, Omer; Kibar, Ertugrul

    2016-11-01

    The authors present a patient with quadriplegia caused by a cervical spine abscess following voice prosthesis replacement, the first reported case of permanent quadriplegia after this procedure. They emphasize that life-threatening complications may be encountered during replacement of a voice prosthesis. Care should be taken during the replacement, and if problems arise during the procedure, patients must be followed closely.

  14. I like my voice better: self-enhancement bias in perceptions of voice attractiveness.

    Science.gov (United States)

    Hughes, Susan M; Harrison, Marissa A

    2013-01-01

    Previous research shows that the human voice can communicate a wealth of nonsemantic information; preferences for voices can predict health, fertility, and genetic quality of the speaker, and people often use voice attractiveness, in particular, to make these assessments of others. But it is not known what we think of the attractiveness of our own voices as others hear them. In this study eighty men and women rated the attractiveness of an array of voice recordings of different individuals and were not told that their own recorded voices were included in the presentation. Results showed that participants rated their own voices as sounding more attractive than others had rated their voices, and participants also rated their own voices as sounding more attractive than they had rated the voices of others. These findings suggest that people may engage in vocal implicit egotism, a form of self-enhancement.

  15. Analyzing the mediated voice - a datasession

    DEFF Research Database (Denmark)

    Lawaetz, Anna

Broadcast voices are technologically manipulated. Paradoxically, in order to achieve a certain authenticity or sound of "reality", the voices are filtered and trained in order to reach the listeners. This mise-en-scène is important knowledge when it comes to the development of a consistent method of analysis of the mediated voice...

  16. Voices Not Heard: Voice-Use Profiles of Elementary Music Teachers, the Effects of Voice Amplification on Vocal Load, and Perceptions of Issues Surrounding Voice Use

    Science.gov (United States)

    Morrow, Sharon L.

    2009-01-01

    Teachers represent the largest group of occupational voice users and have voice-related problems at a rate of over twice that found in the general population. Among teachers, music teachers are roughly four times more likely than classroom teachers to develop voice-related problems. Although it has been established that music teachers use their…

  17. Multi-thread Parallel Speech Recognition for Mobile Applications

    Directory of Open Access Journals (Sweden)

    LOJKA Martin

    2014-05-01

Full Text Available In this paper, the server-based solution of the multi-thread large-vocabulary automatic speech recognition engine is described, along with practical application examples for Android OS and HTML5. The basic idea was to make speech recognition available for a full variety of applications for computers and especially for mobile devices. The speech recognition engine should be independent of commercial products and services (where the dictionary cannot be modified). Use of third-party services can also pose security and privacy problems in specific applications, when unsecured audio data must not be sent to uncontrolled environments (voice data transferred to servers around the globe). Using our experience with speech recognition applications, we have constructed a multi-thread, server-based speech recognition solution with a simple application programming interface (API) that can be adapted to the specific needs of a particular application.
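The dispatch pattern such an engine implies can be sketched in a few lines. This is a hypothetical illustration, not the paper's implementation: the `recognize` stub stands in for a real decoder, and `serve_requests` shows a pool of worker threads handling independent requests locally, so no audio leaves the controlled environment.

```python
# Sketch of a multi-thread recognition dispatcher (illustrative only).
from concurrent.futures import ThreadPoolExecutor

def recognize(audio_chunk: bytes) -> str:
    # Placeholder decoder: a real engine would run acoustic and
    # language models over the audio here.
    return f"transcript-of-{len(audio_chunk)}-bytes"

def serve_requests(chunks, workers=4):
    # Each client request is decoded on its own thread; the engine
    # stays on local, controlled hardware (no third-party cloud).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(recognize, chunks))

results = serve_requests([b"abc", b"defgh"])
```

A real server would wrap `serve_requests` behind a socket or HTTP API; the threading structure stays the same.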

  18. Interventions for preventing voice disorders in adults.

    Science.gov (United States)

    Ruotsalainen, J H; Sellman, J; Lehto, L; Jauhiainen, M; Verbeek, J H

    2007-10-17

Poor voice quality due to a voice disorder can lead to a reduced quality of life. In occupations where voice use is substantial it can lead to periods of absence from work. To evaluate the effectiveness of interventions to prevent voice disorders in adults, we searched MEDLINE (PubMed, 1950 to 2006), EMBASE (1974 to 2006), CENTRAL (The Cochrane Library, Issue 2, 2006), CINAHL (1983 to 2006), PsycINFO (1967 to 2006), Science Citation Index (1986 to 2006) and the Occupational Health databases OSH-ROM (to 2006). The date of the last search was 05/04/06. We included randomised controlled clinical trials (RCTs) evaluating the effectiveness of interventions to prevent voice disorders in adults; for work-directed interventions, interrupted time series and prospective cohort studies were also eligible. Two authors independently extracted data and assessed trial quality. Meta-analysis was performed where appropriate. We identified two randomised controlled trials including a total of 53 participants in intervention groups and 43 controls. One study was conducted with teachers and the other with student teachers. Both trials were of poor quality. Interventions were grouped into 1) direct voice training, 2) indirect voice training and 3) direct and indirect voice training combined. 1) Direct voice training: one study did not find a significant decrease of the Voice Handicap Index for direct voice training compared to no intervention. 2) Indirect voice training: one study did not find a significant decrease of the Voice Handicap Index for indirect voice training when compared to no intervention. 3) Direct and indirect voice training combined: one study did not find a decrease of the Voice Handicap Index for direct and indirect voice training combined when compared to no intervention. The same study did, however, find an improvement in maximum phonation time (mean difference -3.18 s; 95% CI -4.43 to -1.93) for direct and indirect voice training combined when compared to no

  19. Perceptual-Auditory and Acoustical Analysis of the Voices of Transgender Women.

    Science.gov (United States)

    Schwarz, Karine; Fontanari, Anna Martha Vaitses; Costa, Angelo Brandelli; Soll, Bianca Machado Borba; da Silva, Dhiordan Cardoso; de Sá Villas-Bôas, Anna Paula; Cielo, Carla Aparecida; Bastilha, Gabriele Rodrigues; Ribeiro, Vanessa Veis; Dorfman, Maria Elza Kazumi Yamaguti; Lobato, Maria Inês Rodrigues

    2017-09-28

Voice is an important gender marker in the transition process as a transgender individual accepts a new gender identity. The objectives of this study were to describe and relate aspects of a perceptual-auditory analysis and the fundamental frequency (F0) of male-to-female (MtF) transsexual individuals. A case-control study was carried out with individuals aged 19-52 years who attended the Gender Identity Program of the Hospital de Clínicas of Porto Alegre. Vocal recordings from the MtF transgender and cisgender individuals (vowel /a:/ and six phrases of the Consensus Auditory-Perceptual Evaluation of Voice [CAPE-V]) were edited and randomly coded before storage in a Dropbox folder. The voices (vowel /a:/) were analyzed by consensus on the same day by two speech therapist judges with more than 10 years of experience in the voice area, using the GRBASI perceptual-auditory vocal evaluation scale. Acoustic analysis of the voices was performed using the advanced Multi-Dimensional Voice Program software. The resonance focus and the degrees of masculinity and femininity for each voice recording were determined by the same judges by listening to the CAPE-V phrases. There were significant differences between the groups regarding a greater frequency of subjects with F0 between 80 and 150 Hz (P = 0.003) and a greater frequency of hypernasal resonant focus (P < 0.001) in the MtF cases, and a greater frequency of subjects with absence of roughness (P = 0.031) in the control group. The MtF group of individuals showed altered vertical resonant focus, more masculine voices, and lower fundamental frequencies. The control group showed a significant absence of roughness. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  20. Technical Evaluation Report 37: Assistive Software for Disabled Learners

    Directory of Open Access Journals (Sweden)

    Jon Baggaley

    2004-11-01

    Full Text Available Previous reports in this series (#32 and 36 have discussed online software features of value to disabled learners in distance education. The current report evaluates four specific assistive software products with useful features for visually and hearing impaired learners: ATutor, ACollab, Natural Voice, and Just Vanilla. The evaluative criteria discussed include the purpose, uses, costs, and features of each software product, all considered primarily from the accessibility perspective.

  1. Application of Pattern Recognition Techniques to the Classification of Full-Term and Preterm Infant Cry.

    Science.gov (United States)

    Orlandi, Silvia; Reyes Garcia, Carlos Alberto; Bandini, Andrea; Donzelli, Gianpaolo; Manfredi, Claudia

    2016-11-01

Scientific and clinical advances in perinatology and neonatology have enhanced the chances of survival of preterm and very low weight neonates. Infant cry analysis is a suitable noninvasive complementary tool to assess the neurologic state of infants, particularly important in the case of preterm neonates. This article aims at exploiting differences between full-term and preterm infant cry with robust automatic acoustical analysis and data mining techniques. Twenty-two acoustical parameters are estimated in more than 3000 cry units from cry recordings of 28 full-term and 10 preterm newborns. Feature extraction is performed through the BioVoice dedicated software tool, developed at the Biomedical Engineering Lab, University of Firenze, Italy. Classification and pattern recognition are based on genetic algorithms for the selection of the best attributes. Training is performed comparing four classifiers: Logistic Curve, Multilayer Perceptron, Support Vector Machine, and Random Forest, and three different testing options: full training set, 10-fold cross-validation, and 66% split. Results show that the best feature set is made up of 10 parameters capable of assessing differences between preterm and full-term newborns with about 87% accuracy. Best results are obtained with the Random Forest method (receiver operating characteristic area, 0.94). These 10 cry features might convey important additional information to assist the clinical specialist in the diagnosis and follow-up of possible delays or disorders in the neurologic development due to premature birth in this extremely vulnerable population of patients. The proposed approach is a first step toward an automatic infant cry recognition system for fast and proper identification of risk in preterm babies. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
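The 10-fold cross-validation used to compare the classifiers can be sketched generically. This is an illustration of the evaluation protocol only, not the BioVoice pipeline; the function name and split logic are assumptions:

```python
def kfold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation:
    the data is cut into k roughly equal folds, and each fold serves
    once as the held-out test set."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    idx, start, folds = list(range(n_samples)), 0, []
    for size in fold_sizes:
        folds.append(idx[start:start + size])
        start += size
    for i, test in enumerate(folds):
        train = [j for f_i, f in enumerate(folds) if f_i != i for j in f]
        yield train, test
```

Each classifier would be trained on `train` and scored on `test`, with the k scores averaged.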

  2. Can blind persons accurately assess body size from the voice?

    Science.gov (United States)

    Pisanski, Katarzyna; Oleszkiewicz, Anna; Sorokowska, Agnieszka

    2016-04-01

    Vocal tract resonances provide reliable information about a speaker's body size that human listeners use for biosocial judgements as well as speech recognition. Although humans can accurately assess men's relative body size from the voice alone, how this ability is acquired remains unknown. In this study, we test the prediction that accurate voice-based size estimation is possible without prior audiovisual experience linking low frequencies to large bodies. Ninety-one healthy congenitally or early blind, late blind and sighted adults (aged 20-65) participated in the study. On the basis of vowel sounds alone, participants assessed the relative body sizes of male pairs of varying heights. Accuracy of voice-based body size assessments significantly exceeded chance and did not differ among participants who were sighted, or congenitally blind or who had lost their sight later in life. Accuracy increased significantly with relative differences in physical height between men, suggesting that both blind and sighted participants used reliable vocal cues to size (i.e. vocal tract resonances). Our findings demonstrate that prior visual experience is not necessary for accurate body size estimation. This capacity, integral to both nonverbal communication and speech perception, may be present at birth or may generalize from broader cross-modal correspondences. © 2016 The Author(s).
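The vocal-tract-resonance cue the listeners likely relied on can be made concrete: for a uniform tube, the average spacing between adjacent formants (formant dispersion) is inversely proportional to vocal tract length, ΔF = c/(2L). A minimal sketch under that uniform-tube simplification (the constant and function name are illustrative, not the study's method):

```python
SPEED_OF_SOUND = 35000.0  # cm/s, in warm moist air

def vtl_from_formants(formants_hz):
    # Average spacing between adjacent formants (formant dispersion),
    # then invert delta_F = c / (2 * L) to estimate tract length in cm.
    spacings = [f2 - f1 for f1, f2 in zip(formants_hz, formants_hz[1:])]
    dispersion = sum(spacings) / len(spacings)
    return SPEED_OF_SOUND / (2.0 * dispersion)

# Formants of a neutral adult male vocal tract (~17.5 cm uniform tube):
print(round(vtl_from_formants([500, 1500, 2500, 3500]), 1))  # → 17.5
```

Longer tracts (taller speakers, on average) give more closely spaced formants, which is the acoustic regularity both blind and sighted listeners could exploit.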

  3. Construction site Voice Operated Information System (VOIS) test

    Science.gov (United States)

    Lawrence, Debbie J.; Hettchen, William

    1991-01-01

The Voice Activated Information System (VAIS), developed by USACERL, allows inspectors to verbally log on-site inspection reports on a hand held tape recorder. The tape is later processed by the VAIS, which enters the information into the system's database and produces a written report. The Voice Operated Information System (VOIS), developed by USACERL and Automated Sciences Group through a USACERL cooperative research and development agreement (CRDA), is an improved voice recognition system based on the concepts and function of the VAIS. To determine the applicability of the VOIS to Corps of Engineers construction projects, Technology Transfer Test Bed (T3B) funds were provided to the Corps of Engineers National Security Agency (NSA) Area Office (Fort Meade) to procure and implement the VOIS, and to train personnel in its use. This report summarizes the NSA application of the VOIS to quality assurance inspection of radio frequency shielding and to progress payment logs, and concludes that the VOIS is an easily implemented system that can offer improvements when applied to repetitive inspection procedures. Use of VOIS can save time during inspection, improve documentation storage, and provide flexible retrieval of stored information.

  4. Speech Processing and Recognition (SPaRe)

    Science.gov (United States)

    2011-01-01

computer-based Free-To-Air (FTA) satellite cards. Developed scripts to select channels and stream live video from FTA satellite cards. • Edited 107... eliminate corrupted video. • Investigated use of older ATI Radeon 8500 DV video cards to capture analog Cable Television (CATV) signals. • Setup... a free/open-source software communications platform for the creation of voice- and chat-driven products. FTA: Free-to-air (FTA) describes television

  5. Application Of t-Cherry Junction Trees in Pattern Recognition

    Directory of Open Access Journals (Sweden)

    Edith Kovacs

    2010-06-01

Full Text Available Pattern recognition aims to classify data (patterns) based either on a priori knowledge or on statistical information extracted from the data. In this paper we concentrate on statistical pattern recognition using a new probabilistic approach which makes it possible to select the so-called 'informative' features. We develop a pattern recognition algorithm based on the conditional independence structure underlying the statistical data. Our method was successfully applied to a real problem of recognizing Parkinson's disease on the basis of voice disorders.
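Classification over a conditional-independence structure can be illustrated with its simplest special case, naive Bayes, where all features are independent given the class. The t-cherry junction tree in the paper captures richer dependencies; this sketch (data, names, and add-one smoothing are all mine) only shows the degenerate case:

```python
from collections import Counter, defaultdict

def train_nb(samples, labels):
    """Count class frequencies and per-class feature-value frequencies
    for a discrete naive Bayes classifier."""
    class_counts = Counter(labels)
    feat_counts = defaultdict(Counter)  # (class, feature_idx) -> value counts
    for x, y in zip(samples, labels):
        for i, v in enumerate(x):
            feat_counts[(y, i)][v] += 1
    return class_counts, feat_counts

def classify_nb(model, x):
    # Score each class by prior times smoothed per-feature likelihoods,
    # relying on the conditional-independence factorization.
    class_counts, feat_counts = model
    total = sum(class_counts.values())
    best, best_p = None, -1.0
    for c, n in class_counts.items():
        p = n / total
        for i, v in enumerate(x):
            counts = feat_counts[(c, i)]
            p *= (counts[v] + 1) / (n + len(counts) + 1)  # add-one smoothing
        if p > best_p:
            best, best_p = c, p
    return best
```

A junction-tree model would replace the per-feature factors with clique marginals, but the classify-by-factorized-probability step is the same idea.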

  6. Work-related voice disorder

    Directory of Open Access Journals (Sweden)

    Paulo Eduardo Przysiezny

    2015-04-01

Full Text Available INTRODUCTION: Dysphonia is the main symptom of the disorders of oral communication. However, voice disorders also present with other symptoms such as difficulty in maintaining the voice (asthenia), vocal fatigue, variation in habitual vocal fundamental frequency, hoarseness, lack of vocal volume and projection, loss of vocal efficiency, and weakness when speaking. There are several proposals for the etiologic classification of dysphonia: functional, organofunctional, organic, and work-related voice disorder (WRVD). OBJECTIVE: To conduct a literature review on WRVD and on the current Brazilian labor legislation. METHODS: This was a review article with bibliographical research conducted on the PubMed and Bireme databases, using the terms "work-related voice disorder", "occupational dysphonia", "dysphonia and labor legislation", and a review of relevant labor and social security laws. CONCLUSION: WRVD is a situation frequently listed as a reason for work absenteeism, functional rehabilitation, or prolonged absence from work. Currently, forensic physicians have no comparative parameters to help with the analysis of vocal disorders. In certain situations, WRVD may cause work disability. This disorder may be labor-related, or be an adjuvant factor to work-related diseases.

  7. Playful Interaction with Voice Sensing Modular Robots

    DEFF Research Database (Denmark)

    Heesche, Bjarke; MacDonald, Ewen; Fogh, Rune

    2013-01-01

This paper describes a voice sensor, suitable for modular robotic systems, which estimates the energy and fundamental frequency, F0, of the user's voice. Through a number of example applications and tests with children, we observe how the voice sensor facilitates playful interaction between children and two different robot configurations. In future work, we will investigate whether such a system can motivate children to improve voice control and explore how to extend the sensor to detect emotions in the user's voice.

  8. VOICE QUALITY BEFORE AND AFTER THYROIDECTOMY

    Directory of Open Access Journals (Sweden)

    Dora CVELBAR

    2016-04-01

Full Text Available Introduction: Voice disorders are a well-known complication often associated with thyroid gland diseases, and because voice is still the basic means of communication it is very important to keep its quality healthy. Objectives: The aim of this study was to determine whether there is a statistically significant difference between results of voice self-assessment, perceptual voice assessment and acoustic voice analysis before and after thyroidectomy, and whether there are statistically significant correlations between variables of voice self-assessment, perceptual assessment and acoustic analysis before and after thyroidectomy. Methods: This study included 12 participants aged between 41 and 76. Voice self-assessment was conducted with the Croatian version of the Voice Handicap Index (VHI). Recorded reading samples were used for perceptual assessment and later evaluated by two clinical speech and language therapists. Recorded samples of phonation were used for acoustic analysis, which was conducted with the acoustic program Praat. All of the data was processed through descriptive statistics and nonparametric statistical methods. Results: Results showed that there are statistically significant differences between results of voice self-assessments and results of acoustic analysis before and after thyroidectomy. Statistically significant correlations were found between variables of perceptual assessment and acoustic analysis. Conclusion: The obtained results indicate the importance of multidimensional preoperative and postoperative assessment. This kind of assessment allows the clinician to describe all of the voice features and provides an appropriate recommendation for further rehabilitation to the patient in order to optimize voice outcomes.

  9. Application of computer voice input/output

    International Nuclear Information System (INIS)

    Ford, W.; Shirk, D.G.

    1981-01-01

The advent of microprocessors and other large-scale integration (LSI) circuits is making voice input and output for computers and instruments practical; specialized LSI chips for speech processing are appearing on the market. Voice can be used to input data or to issue instrument commands; this allows the operator to engage in other tasks, move about, and use standard data entry systems. Voice synthesizers can generate audible, easily understood instructions. Using voice characteristics, a control system can verify speaker identity for security purposes. Two simple voice-controlled systems have been designed at Los Alamos for nuclear safeguards applications. Each can easily be expanded as time allows. The first system is for instrument control; it accepts voice commands and issues audible operator prompts. The second system is for access control: the speaker's voice is used to verify his identity and to actuate external devices.

  10. The development of the Spanish verb ir into auxiliary of voice

    DEFF Research Database (Denmark)

    Vinther, Thora

    2005-01-01

spanish, syntax, grammaticalisation, past participle, passive voice, middle voice, language development

  11. Musical training software for children with cochlear implants.

    Science.gov (United States)

    Di Nardo, W; Schinaia, L; Anzivino, R; De Corso, E; Ciacciarelli, A; Paludetti, G

    2015-10-01

Although the voice in a free field is picked up well by a cochlear implant (CI), the situation is different for music, which is a much more complex signal in which pitch discrimination becomes important for appreciation. The aim of this study was to determine the music perception abilities of children with CIs and to verify the benefit of a training period of specific musical frequency discrimination. Our main goals were to prepare a computer tool for pitch discrimination training and to assess musical improvements. Ten children, aged between 5 and 12 years, with optimal phoneme recognition in quiet and with no disabilities associated with deafness, were selected to join the training. Before the training period, each patient received two types of exams: a pitch discrimination test, consisting of judging whether two notes were different or not, and a music test consisting of two identification tasks (melodic and full version) of one music item among 5 popular childhood songs. After assessment, a music training software package was designed and used individually at home for a period of six months. The results following complete training showed significantly higher performance in the task of frequency discrimination, suggesting that pitch discrimination training can yield musical enhancement and improvements in frequency discrimination.
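A pitch-discrimination trainer ultimately manipulates the interval between the two notes presented, and that interval in semitones follows from the ratio of the frequencies. A hypothetical helper (not from the study's software):

```python
import math

def semitones(f1_hz, f2_hz):
    # Musical interval between two pitches in equal temperament:
    # 12 semitones per doubling of frequency (one octave).
    return 12.0 * math.log2(f2_hz / f1_hz)

print(round(semitones(440.0, 880.0), 1))  # → 12.0 (one octave)
```

A training loop could start with large intervals and shrink them as the child's discrimination improves.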

  12. Software Innovation

    DEFF Research Database (Denmark)

    Rose, Jeremy

      Innovation is the forgotten key to modern systems development - the element that defines the enterprising engineer, the thriving software firm and the cutting edge software application.  Traditional forms of technical education pay little attention to creativity - often encouraging overly...

  13. Software engineering

    CERN Document Server

    Sommerville, Ian

    2016-01-01

    For courses in computer science and software engineering The Fundamental Practice of Software Engineering Software Engineering introduces readers to the overwhelmingly important subject of software programming and development. In the past few years, computer systems have come to dominate not just our technological growth, but the foundations of our world's major industries. This text seeks to lay out the fundamental concepts of this huge and continually growing subject area in a clear and comprehensive manner. The Tenth Edition contains new information that highlights various technological updates of recent years, providing readers with highly relevant and current information. Sommerville's experience in system dependability and systems engineering guides the text through a traditional plan-based approach that incorporates some novel agile methods. The text strives to teach the innovators of tomorrow how to create software that will make our world a better, safer, and more advanced place to live.

  14. Foetal response to music and voice.

    Science.gov (United States)

    Al-Qahtani, Noura H

    2005-10-01

To examine whether prenatal exposure to music and voice alters foetal behaviour and whether foetal response to music differs from human voice. A prospective observational study was conducted in 20 normal term pregnant mothers. Ten foetuses were exposed to music and voice for 15 s at different sound pressure levels to find the optimal setting for the auditory stimulation. Music, voice and sham were played to another 10 foetuses via a headphone on the maternal abdomen. The sound pressure level was 105 dB and 94 dB for music and voice, respectively. Computerised assessments of foetal heart rate and activity were recorded. 90 actocardiograms were obtained for the whole group. One-way ANOVA followed by post hoc analysis (Student-Newman-Keuls method) was used to determine whether there was a significant difference in foetal response to music and voice versus sham. Foetuses responded with heart rate acceleration and motor response to both music and voice. This was statistically significant compared to sham. There was no significant difference between the foetal heart rate acceleration to music and voice. Prenatal exposure to music and voice alters foetal behaviour. No difference was detected in foetal response to music and voice.

  15. Software requirements

    CERN Document Server

    Wiegers, Karl E

    2003-01-01

Without formal, verifiable software requirements (and an effective system for managing them), the programs that developers think they've agreed to build often will not be the same products their customers are expecting. In SOFTWARE REQUIREMENTS, Second Edition, requirements engineering authority Karl Wiegers amplifies the best practices presented in his original award-winning text, now a mainstay for anyone participating in the software development process. In this book, you'll discover effective techniques for managing the requirements engineering process all the way through the development cycle.

  16. Automated road marking recognition system

    Science.gov (United States)

    Ziyatdinov, R. R.; Shigabiev, R. R.; Talipov, D. N.

    2017-09-01

Development of automated road marking recognition systems for existing and future vehicle control systems is an urgent task. One way to implement such systems is the use of neural networks. To test this possibility, software based on a single-layer perceptron has been developed. The resulting neural network system successfully coped with the task both when driving in the daytime and at night.
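The single-layer perceptron mentioned above is a weighted sum thresholded at zero, with weights updated only on misclassified samples. A minimal sketch of the learning rule, trained here on a toy linearly separable set rather than road-marking images (all names and data are illustrative):

```python
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    # Classic perceptron rule: on each misclassified sample (label
    # in {-1, +1}), nudge the weights and bias toward that sample.
    w, b = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != y:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
```

For marking recognition, `x` would be a feature vector extracted from an image patch; the perceptron converges whenever the two classes are linearly separable in that feature space.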

  17. Multimodal approaches for emotion recognition: a survey

    Science.gov (United States)

    Sebe, Nicu; Cohen, Ira; Gevers, Theo; Huang, Thomas S.

    2005-01-01

Recent technological advances have enabled human users to interact with computers in ways previously unimaginable. Beyond the confines of the keyboard and mouse, new modalities for human-computer interaction such as voice, gesture, and force-feedback are emerging. Despite important advances, one necessary ingredient for natural interaction is still missing: emotions. Emotions play an important role in human-to-human communication and interaction, allowing people to express themselves beyond the verbal domain. The ability to understand human emotions is desirable for the computer in several applications. This paper explores new ways of human-computer interaction that enable the computer to be more aware of the user's emotional and attentional expressions. We present the basic research in the field and the recent advances into the emotion recognition from facial, voice, and physiological signals, where the different modalities are treated independently. We then describe the challenging problem of multimodal emotion recognition and we advocate the use of probabilistic graphical models when fusing the different modalities. We also discuss the difficult issues of obtaining reliable affective data, obtaining ground truth for emotion recognition, and the use of unlabeled data.
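The late-fusion idea the survey advocates can be illustrated with the simplest probabilistic combiner: multiplying per-modality class posteriors under a conditional-independence assumption (the product rule). This is a sketch, not the graphical models from the paper; the class names and numbers are invented:

```python
def fuse_posteriors(modalities):
    """Product-rule late fusion: multiply each modality's class
    posteriors, then renormalize so the fused scores sum to one.
    Assumes the modalities are conditionally independent given the class."""
    classes = modalities[0].keys()
    scores = {c: 1.0 for c in classes}
    for posterior in modalities:
        for c in classes:
            scores[c] *= posterior[c]
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

face = {"happy": 0.7, "angry": 0.3}   # posterior from facial-expression model
voice = {"happy": 0.6, "angry": 0.4}  # posterior from voice-prosody model
fused = fuse_posteriors([face, voice])
```

A full graphical model would also capture correlations between modalities; the product rule is the degenerate case where none are modeled.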

  18. Voice disorders in mucosal leishmaniasis.

    Directory of Open Access Journals (Sweden)

    Ana Cristina Nunes Ruas

Full Text Available INTRODUCTION: Leishmaniasis is considered one of the six most important infectious diseases because of its high detection coefficient and ability to produce deformities. In most cases, mucosal leishmaniasis (ML) occurs as a consequence of cutaneous leishmaniasis. If left untreated, mucosal lesions can leave sequelae, interfering in the swallowing, breathing, voice and speech processes and requiring rehabilitation. OBJECTIVE: To describe the anatomical characteristics and voice quality of ML patients. MATERIALS AND METHODS: A descriptive transversal study was conducted in a cohort of ML patients treated at the Laboratory for Leishmaniasis Surveillance of the Evandro Chagas National Institute of Infectious Diseases-Fiocruz, between 2010 and 2013. The patients were submitted to otorhinolaryngologic clinical examination by endoscopy of the upper airways and digestive tract and to speech-language assessment through directed anamnesis, auditory perception, phonation times and vocal acoustic analysis. The variables of interest were epidemiologic (sex and age) and clinical (lesion location, associated symptoms and voice quality). RESULTS: 26 patients under ML treatment and monitored by speech therapists were studied; 21 (81%) were male and five (19%) female, with ages ranging from 15 to 78 years (54.5 ± 15.0 years). The lesions were distributed in the following structures: 88.5% nasal, 38.5% oral, 34.6% pharyngeal and 19.2% laryngeal, with some patients presenting lesions in more than one anatomic site. The main complaint was nasal obstruction (73.1%), followed by dysphonia (38.5%), odynophagia (30.8%) and dysphagia (26.9%). 23 patients (84.6%) presented voice quality perturbations. Dysphonia was significantly associated with lesions in the larynx, pharynx and oral cavity. CONCLUSION: We observed that vocal quality perturbations are frequent in patients with mucosal leishmaniasis, even without laryngeal lesions; they are probably associated to disorders of some

  19. Software Reviews.

    Science.gov (United States)

    Dwyer, Donna; And Others

    1989-01-01

    Reviewed are seven software packages for Apple and IBM computers. Included are: "Toxicology"; "Science Corner: Space Probe"; "Alcohol and Pregnancy"; "Science Tool Kit Plus"; Computer Investigations: Plant Growth"; "Climatrolls"; and "Animal Watch: Whales." (CW)

  20. Software Reviews.

    Science.gov (United States)

    Davis, Shelly J., Ed.; Knaupp, Jon, Ed.

    1984-01-01

    Reviewed is computer software on: (1) classification of living things, a tutorial program for grades 5-10; and (2) polynomial practice using tiles, a drill-and-practice program for algebra students. (MNS)

  1. Software Reviews.

    Science.gov (United States)

    Miller, Anne, Ed.; Radziemski, Cathy, Ed.

    1988-01-01

    Three pieces of computer software are described and reviewed: HyperCard, to build and use varied applications; Iggy's Gnees, for problem solving with shapes in grades kindergarten-two; and Algebra Shop, for practicing skills and problem solving. (MNS)

  2. Integrating cues of social interest and voice pitch in men's preferences for women's voices

    OpenAIRE

    Jones, Benedict C; Feinberg, David R; DeBruine, Lisa M; Little, Anthony C; Vukovic, Jovana

    2008-01-01

    Most previous studies of vocal attractiveness have focused on preferences for physical characteristics of voices such as pitch. Here we examine the content of vocalizations in interaction with such physical traits, finding that vocal cues of social interest modulate the strength of men's preferences for raised pitch in women's voices. Men showed stronger preferences for raised pitch when judging the voices of women who appeared interested in the listener than when judging the voices of women ...

  3. Evaluation of voice acoustic parameters related to the vocal-loading test in professionally active teachers with dysphonia.

    Science.gov (United States)

    Niebudek-Bogusz, Ewa; Kotyło, Piotr; Sliwińska-Kowalska, Mariola

    2007-01-01

    Teachers are at risk of developing voice disorders. A clinical battery of vocal function tests should include non-invasive and accurate measurements. The quantitative methods (e.g., voice acoustic analysis) make it possible to objectively evaluate voice efficiency and outcomes of dysphonia treatment. To identify possible signs of vocal fatigue, acoustic waveform perturbations during sustained phonation were measured before and after the vocal-loading test in 51 professionally active female teachers with functional voice disorders, using IRIS software. All the participants were also subjected to laryngological/phoniatric examination involving videostroboscopy combined with self-estimation by voice handicap index (VHI)-based scale. The phoniatric examination revealed glottal insufficiency with bowed vocal folds in 35.2%, soft vocal nodules in 31.4%, and hyperfunctional dysphonia with a tendency towards vestibular phonation in 19.6% of the patients. In the VHI scale, 66% of the female teachers estimated their own voice problems as moderate disability. An acoustic analysis performed after the vocal-loading test showed an increased rate of abnormal frequency perturbation parameters (pitch perturbation quotient (Jitter), relative average perturbation (RAP), and pitch period perturbation quotient (PPQ)) compared to the pre-test outcomes. The same was true of pitch-intensity contour of vowel /a:/, an indication of voice instability during sustained phonation. The recorded impairments of voice acoustic parameters related to vocal loading provide further evidence of dysphonia. The voice acoustic analysis performed before and after the vocal-loading test can significantly contribute to objective voice examinations useful in diagnosis of dysphonia among teachers.
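The perturbation measures reported (Jitter, RAP) can be computed directly from a sequence of glottal pitch periods. A minimal sketch of the standard definitions, not the IRIS implementation (function and variable names are mine):

```python
def jitter_local(periods):
    """Relative jitter: mean absolute difference between consecutive
    pitch periods, divided by the mean period."""
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

def rap(periods):
    """Relative average perturbation: mean deviation of each period from
    the 3-point moving average of its neighbourhood, divided by the
    mean period."""
    devs = [abs(periods[i] - sum(periods[i - 1:i + 2]) / 3.0)
            for i in range(1, len(periods) - 1)]
    return (sum(devs) / len(devs)) / (sum(periods) / len(periods))
```

A perfectly steady phonation gives values near zero; vocal fatigue shows up as rising perturbation after the loading test.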

  4. Use of Splines in Handwritten Character Recognition

    OpenAIRE

    Sunil Kumar; Gopinath S,; Satish Kumar; Rajesh Chhikara

    2010-01-01

    Handwritten character recognition is software used to identify handwritten characters and to receive and interpret intelligible handwritten input from sources such as manuscript documents. Recent years have seen the development of many systems able to simulate human brain actions. Among the many, neural networks and artificial intelligence are the two most important paradigms used. In this paper we propose a new algorithm for recognition of handwritten t...

  5. Activity Recognition for Personal Time Management

    Science.gov (United States)

    Prekopcsák, Zoltán; Soha, Sugárka; Henk, Tamás; Gáspár-Papanek, Csaba

    We describe an accelerometer-based activity recognition system for mobile phones with a special focus on personal time management. We compare several data mining algorithms for the automatic recognition task in single-user and multi-user scenarios, and improve accuracy with heuristics and advanced data mining methods. The results show that daily activities can be recognized with high accuracy and that integration with the RescueTime software can give good insights for personal time management.
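
    The abstract does not spell out the features or miners used, but systems of this kind typically compute windowed statistics over the acceleration signal and feed them to a classifier. The sketch below is illustrative only: simple mean/standard-deviation features and a nearest-centroid rule standing in for the data-mining algorithms the authors compare.

```python
import math

def window_features(signal, size):
    """Split a 1-D acceleration-magnitude signal into non-overlapping
    windows and compute (mean, standard deviation) per window."""
    feats = []
    for i in range(0, len(signal) - size + 1, size):
        w = signal[i:i + size]
        mean = sum(w) / size
        std = math.sqrt(sum((x - mean) ** 2 for x in w) / size)
        feats.append((mean, std))
    return feats

def nearest_centroid(train, query):
    """train maps activity label -> list of feature tuples; classify the
    query window by the closest per-class mean feature vector."""
    centroids = {label: tuple(sum(c) / len(rows) for c in zip(*rows))
                 for label, rows in train.items()}
    return min(centroids, key=lambda l: math.dist(centroids[l], query))
```

    In use, each incoming window is reduced to a feature tuple and labelled with the activity of the nearest centroid; accuracy then depends on window size and feature choice.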

  6. Singing Voice Analysis, Synthesis, and Modeling

    Science.gov (United States)

    Kim, Youngmoo E.

    The singing voice is the oldest musical instrument, but its versatility and emotional power are unmatched. Through the combination of music, lyrics, and expression, the voice is able to affect us in ways that no other instrument can. The fact that vocal music is prevalent in almost all cultures is indicative of its innate appeal to the human aesthetic. Singing also permeates most genres of music, attesting to the wide range of sounds the human voice is capable of producing. As listeners we are naturally drawn to the sound of the human voice, and, when present, it immediately becomes the focus of our attention.

  7. Human-machine interface software package

    International Nuclear Information System (INIS)

    Liu, D.K.; Zhang, C.Z.

    1992-01-01

    The Man-Machine Interface Software Package (MMISP) is designed to configure the console software of the PLS 60 MeV LINAC control system. The control system of the PLS 60 MeV LINAC is a distributed control system which includes the main computer (Intel 310), four local stations, and two sets of industrial-level console computers. The MMISP provides the operator with a display page editor and various I/O configurations such as digital signal In/Out, analog signal In/Out, and waveform TV graphic display, and interacts with the operator through graphic picture displays, voice explanation, and a touch panel. This paper describes its function and application. (author)

  8. "Voice Forum" The Human Voice as Primary Instrument in Music Therapy

    DEFF Research Database (Denmark)

    Pedersen, Inge Nygaard; Storm, Sanne

    2009-01-01

    Aspects will be drawn on the human voice as a tool for embodying our psychological and physiological state and for attempting the integration of feelings. Presentations and dialogues on different methods and techniques in "Therapy-related body- and voice work", as well as the human voice as a tool for non...

  9. V2S: Voice to Sign Language Translation System for Malaysian Deaf People

    Science.gov (United States)

    Mean Foong, Oi; Low, Tang Jung; La, Wai Wan

    The process of learning and understanding sign language may be cumbersome to some, and this paper therefore proposes a solution to this problem by providing a voice (English language) to sign language translation system using speech and image processing techniques. Speech processing, which includes speech recognition, is the study of recognizing the words being spoken, regardless of who the speaker is. This project uses template-based recognition as its main approach, in which the V2S system first needs to be trained with speech patterns based on some generic spectral parameter set. These spectral parameter sets are then stored as templates in a database. The system performs the recognition process by matching the parameter set of the input speech with the stored templates and finally displays the sign language in video format. Empirical results show that the system has an 80.3% recognition rate.
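
    A common way to implement the template-matching step described here is dynamic time warping (DTW), which compares the input's frame-by-frame spectral parameters to each stored template while tolerating tempo differences. The sketch below is illustrative Python, not the V2S implementation, and uses made-up one-dimensional "features" in place of real spectral vectors.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two feature sequences
    (lists of equal-length frame vectors)."""
    INF = float("inf")
    n, m = len(a), len(b)
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean distance between the two frames
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])) ** 0.5
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]

def recognise(templates, utterance):
    """templates maps word -> stored frame sequence; return the word
    whose template is nearest to the input utterance."""
    return min(templates, key=lambda w: dtw_distance(templates[w], utterance))
```

    Recognition then reduces to computing one DTW distance per stored template and picking the minimum, which matches the database-lookup scheme the abstract outlines.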

  10. The voice conveys emotion in ten globalized cultures and one remote village in Bhutan.

    Science.gov (United States)

    Cordaro, Daniel T; Keltner, Dacher; Tshering, Sumjay; Wangchuk, Dorji; Flynn, Lisa M

    2016-02-01

    With data from 10 different globalized cultures and 1 remote, isolated village in Bhutan, we examined universals and cultural variations in the recognition of 16 nonverbal emotional vocalizations. College students in 10 nations (Study 1) and villagers in remote Bhutan (Study 2) were asked to match emotional vocalizations to 1-sentence stories of the same valence. Guided by previous conceptualizations of recognition accuracy, across both studies, 7 of the 16 vocal burst stimuli were found to have strong or very strong recognition in all 11 cultures, 6 vocal bursts were found to have moderate recognition, and 4 were not universally recognized. All vocal burst stimuli varied significantly in terms of the degree to which they were recognized across the 11 cultures. Our discussion focuses on the implications of these results for current debates concerning the emotion conveyed in the voice. (c) 2016 APA, all rights reserved.

  11. Software reengineering

    Science.gov (United States)

    Fridge, Ernest M., III

    1991-01-01

    Today's software systems generally use obsolete technology, are not integrated properly with other software systems, and are difficult and costly to maintain. The discipline of reverse engineering is becoming prominent as organizations try to move their systems up to more modern and maintainable technology in a cost effective manner. JSC created a significant set of tools to develop and maintain FORTRAN and C code during development of the Space Shuttle. This tool set forms the basis for an integrated environment to re-engineer existing code into modern software engineering structures which are then easier and less costly to maintain and which allow a fairly straightforward translation into other target languages. The environment will support these structures and practices even in areas where the language definition and compilers do not enforce good software engineering. The knowledge and data captured using the reverse engineering tools is passed to standard forward engineering tools to redesign or perform major upgrades to software systems in a much more cost effective manner than using older technologies. A beta version of the environment was released in Mar. 1991. The commercial potential for such re-engineering tools is very great. CASE TRENDS magazine reported it to be the primary concern of over four hundred of the top MIS executives.

  12. Software Authentication

    International Nuclear Information System (INIS)

    Wolford, J.K.; Geelhood, B.D.; Hamilton, V.A.; Ingraham, J.; MacArthur, D.W.; Mitchell, D.J.; Mullens, J.A.; Vanier, P. E.; White, G.K.; Whiteson, R.

    2001-01-01

    The effort to define guidance for authentication of software for arms control and nuclear material transparency measurements draws on a variety of disciplines and has involved synthesizing established criteria and practices with newer methods. Challenges include the need to protect classified information that the software manipulates as well as deal with the rapid pace of innovation in the technology of nuclear material monitoring. The resulting guidance will shape the design of future systems and inform the process of authentication of instruments now being developed. This paper explores the technical issues underlying the guidance and presents its major tenets.

  13. Software engineering

    CERN Document Server

    Thorin, Marc

    1985-01-01

    Software Engineering describes the conceptual bases as well as the main methods and rules on computer programming. This book presents software engineering as a coherent and logically built synthesis and makes it possible to properly carry out an application of small or medium difficulty that can later be developed and adapted to more complex cases. This text is comprised of six chapters and begins by introducing the reader to the fundamental notions of entities, actions, and programming. The next two chapters elaborate on the concepts of information and consistency domains and show that a proc

  14. Robotic Software Integration Using MARIE

    Directory of Open Access Journals (Sweden)

    Carle Côté

    2006-03-01

    Full Text Available This paper presents MARIE, a middleware framework oriented towards developing and integrating new and existing software for robotic systems. By using a generic communication framework, MARIE aims to create a flexible distributed component system that allows robotics developers to share software programs and algorithms, and design prototypes rapidly based on their own integration needs. The use of MARIE is illustrated with the design of a socially interactive autonomous mobile robot platform capable of map building, localization, navigation, tasks scheduling, sound source localization, tracking and separation, speech recognition and generation, visual tracking, message reading and graphical interaction using a touch screen interface.

  15. Clinical voice analysis of Carnatic singers.

    Science.gov (United States)

    Arunachalam, Ravikumar; Boominathan, Prakash; Mahalingam, Shenbagavalli

    2014-01-01

    Carnatic singing is a classical South Indian style of music that involves rigorous training to produce an "open throated" loud, predominantly low-pitched singing, embedded with vocal nuances in higher pitches. Voice problems in singers are not uncommon. The objective was to report the nature of voice problems and apply a routine protocol to assess the voice. Forty-five trained performing singers (females: 36 and males: 9) who reported to a tertiary care hospital with voice problems underwent voice assessment. The study analyzed their problems and the clinical findings. Voice change, difficulty in singing higher pitches, and voice fatigue were major complaints. Most of the singers suffered laryngopharyngeal reflux that coexisted with muscle tension dysphonia and chronic laryngitis. Speaking voices were rated predominantly as "moderate deviation" on GRBAS (Grade, Rough, Breathy, Asthenia, and Strain). Maximum phonation time ranged from 4 to 29 seconds (females: 10.2, standard deviation [SD]: 5.28 and males: 15.7, SD: 5.79). Singing frequency range was reduced (females: 21.3 Semitones and males: 23.99 Semitones). Dysphonia severity index (DSI) scores ranged from -3.5 to 4.91 (females: 0.075 and males: 0.64). Singing frequency range and DSI did not show significant difference between sex and across clinical diagnosis. Self-perception using voice disorder outcome profile revealed overall severity score of 5.1 (SD: 2.7). Findings are discussed from a clinical intervention perspective. Study highlighted the nature of voice problems (hyperfunctional) and required modifications in assessment protocol for Carnatic singers. Need for regular assessments and vocal hygiene education to maintain good vocal health are emphasized as outcomes. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
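
    The Dysphonia Severity Index reported here is, in its commonly cited form (Wuyts et al., 2000), a fixed weighted combination of four voice measurements. Assuming that standard published formula, the computation can be sketched as:

```python
def dysphonia_severity_index(mpt_s, f0_high_hz, i_low_db, jitter_pct):
    """DSI per the published Wuyts et al. (2000) weights: maximum
    phonation time (s), highest attainable F0 (Hz), lowest attainable
    intensity (dB), and jitter (%).  Lower (more negative) values
    indicate more severe dysphonia; around +5 is typical of a
    perceptually normal voice."""
    return (0.13 * mpt_s + 0.0053 * f0_high_hz
            - 0.26 * i_low_db - 1.18 * jitter_pct + 12.4)
```

    With this weighting, a singer's reduced phonation time or restricted upper frequency range (as reported above) pulls the DSI down toward the negative scores seen in this cohort.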

  16. Associations between the Transsexual Voice Questionnaire (TVQMtF) and self-report of voice femininity and acoustic voice measures.

    Science.gov (United States)

    Dacakis, Georgia; Oates, Jennifer; Douglas, Jacinta

    2017-11-01

    The Transsexual Voice Questionnaire (TVQMtF) was designed to capture the voice-related perceptions of individuals whose gender identity as female is the opposite of their birth-assigned gender (MtF women). Evaluation of the psychometric properties of the TVQMtF is ongoing. To investigate associations between TVQMtF scores and (1) self-perceptions of voice femininity and (2) acoustic parameters of voice pitch and voice quality in order to evaluate further the validity of the TVQMtF. A strong correlation between TVQMtF scores and self-ratings of voice femininity was predicted, but no association between TVQMtF scores and acoustic measures of voice pitch and quality was proposed. Participants were 148 MtF women (mean age 48.14 years) recruited from the La Trobe Communication Clinic and the clinics of three doctors specializing in transgender health. All participants completed the TVQMtF and 34 of these participants also provided a voice sample for acoustic analysis. Pearson product-moment correlation analysis was conducted to examine the associations between TVQMtF scores and (1) self-perceptions of voice femininity and (2) acoustic measures of F0, jitter (%), shimmer (dB) and harmonic-to-noise ratio (HNR). Strong negative correlations between the participants' perceptions of their voice femininity and the TVQMtF scores demonstrated that for this group of MtF women a low self-rating of voice femininity was associated with more frequent negative voice-related experiences. This association was strongest with the vocal-functioning component of the TVQMtF. These strong correlations and high levels of shared variance between the TVQMtF and a measure of a related construct provide evidence for the convergent validity of the TVQMtF. The absence of significant correlations between the TVQMtF and the acoustic data is consistent with the equivocal findings of earlier research. This finding indicates that these two measures assess different aspects of the voice.
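
    The Pearson product-moment correlation used in this analysis is straightforward to compute from paired scores; a minimal stdlib sketch (illustrative, not the study's statistics package):

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient between two
    paired score lists, e.g. questionnaire totals vs. an acoustic
    measure such as F0, jitter, shimmer, or HNR."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

    Values near -1 correspond to the strong negative associations reported between voice-femininity self-ratings and questionnaire scores, while values near 0 correspond to the absent questionnaire-acoustics associations.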

  17. Sound induced activity in voice sensitive cortex predicts voice memory ability

    Directory of Open Access Journals (Sweden)

    Rebecca eWatson

    2012-04-01

    Full Text Available The ‘temporal voice areas’ (TVAs) (Belin et al., 2000) of the human brain show greater neuronal activity in response to human voices than to other categories of nonvocal sounds. However, a direct link between TVA activity and voice perception behaviour has not yet been established. Here we show that a functional magnetic resonance imaging (fMRI) measure of activity in the TVAs predicts individual performance at a separately administered voice memory test. This relation holds when general sound memory ability is taken into account. These findings provide the first evidence that the TVAs are specifically involved in voice cognition.

  18. Pattern Recognition

    Directory of Open Access Journals (Sweden)

    Aleš Procházka

    2018-05-01

    Full Text Available Multimodal signal analysis based on sophisticated sensors, efficient communication systems and fast parallel processing methods has a rapidly increasing range of multidisciplinary applications. The present paper is devoted to pattern recognition, machine learning, and the analysis of sleep stages in the detection of sleep disorders using polysomnography (PSG) data, including electroencephalography (EEG), breathing (Flow), and electro-oculogram (EOG) signals. The proposed method is based on the classification of selected features by a neural network system with sigmoidal and softmax transfer functions, using Bayesian methods for the evaluation of the probabilities of the separate classes. The application is devoted to the analysis of the sleep stages of 184 individuals with different diagnoses, using EEG and further PSG signals. Data analysis points to an average increase of the length of the Wake stage by 2.7% per 10 years and a decrease of the length of the Rapid Eye Movement (REM) stages by 0.8% per 10 years. The mean classification accuracy for given sets of records and single EEG and multimodal features is 88.7% (standard deviation, STD: 2.1) and 89.6% (STD: 1.9), respectively. The proposed methods enable the use of adaptive learning processes for the detection and classification of health disorders based on prior specialist experience and man–machine interaction.
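
    The softmax transfer function mentioned in this abstract is what converts a network's output-layer activations into per-class probabilities. A minimal, numerically stable sketch, with illustrative sleep-stage labels (the paper's own label set and network are not reproduced here):

```python
import math

def softmax(z):
    """Numerically stable softmax: shift by the maximum before
    exponentiating, then normalize so the outputs sum to 1."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def classify(z, labels):
    """Assign the class (e.g. a sleep stage) with the highest
    softmax probability."""
    probs = softmax(z)
    return labels[probs.index(max(probs))]
```

    Subtracting the maximum activation before exponentiating leaves the probabilities unchanged but avoids overflow for large activations.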

  19. The processing of auditory and visual recognition of self-stimuli.

    Science.gov (United States)

    Hughes, Susan M; Nicholson, Shevon E

    2010-12-01

    This study examined self-recognition processing in both the auditory and visual modalities by determining how comparable hearing a recording of one's own voice was to seeing a photograph of one's own face. We also investigated whether the simultaneous presentation of auditory and visual self-stimuli would either facilitate or inhibit self-identification. Ninety-one participants completed reaction-time tasks of self-recognition when presented with their own faces, own voices, and combinations of the two. Reaction time and errors made when responding with both the right and left hand were recorded to determine if there were lateralization effects on these tasks. Our findings showed that visual self-recognition for facial photographs appears to be superior to auditory self-recognition for voice recordings. Furthermore, a combined presentation of one's own face and voice appeared to inhibit rather than facilitate self-recognition, and there was a left-hand advantage for reaction time on the combined-presentation tasks. Copyright © 2010 Elsevier Inc. All rights reserved.

  20. Voices from Around the Globe

    Directory of Open Access Journals (Sweden)

    Birgit Schreiber

    2017-07-01

    Full Text Available JSAA has been seeking to provide an opportunity for Student Affairs professionals and higher education scholars from around the globe to share their research and experiences of student services and student affairs programmes from their respective regional and institutional contexts. This has been given a specific platform with the guest-edited issue “Voices from Around the Globe”, which is the result of a collaboration with the International Association of Student Affairs and Services (IASAS), and particularly with the guest editors, Kathleen Callahan and Chinedu Mba.

  1. Voice Disorders: Etiology and Diagnosis.

    Science.gov (United States)

    Martins, Regina Helena Garcia; do Amaral, Henrique Abrantes; Tavares, Elaine Lara Mendes; Martins, Maira Garcia; Gonçalves, Tatiana Maria; Dias, Norimar Hernandes

    2016-11-01

    Voice disorders affect adults and children and have different causes in different age groups. The aim of this study is to present the etiology and diagnosis of dysphonia in a large population of dysphonic patients. We evaluated 2019 patients with dysphonia who attended the Voice Disease ambulatories of a university hospital. Parameters assessed were age, gender, profession, associated symptoms, smoking, and videolaryngoscopy diagnoses. Of the 2019 patients with dysphonia who were included in this study, 786 were male (38.93%) and 1233 were female (61.07%). The age groups were as follows: 1-6 years (n = 100); 7-12 years (n = 187); 13-18 years (n = 92); 19-39 years (n = 494); 41-60 years (n = 811); and >60 years (n = 335). Symptoms associated with dysphonia were vocal overuse (n = 677), gastroesophageal symptoms (n = 535), and nasosinusal symptoms (n = 497). The predominant professions of the patients were domestic workers, students, and teachers. Smoking was reported by 13.6% patients. With regard to the etiology of dysphonia, in children (1-18 years old), nodules (n = 225; 59.3%), cysts (n = 39; 10.3%), and acute laryngitis (n = 26; 6.8%) prevailed. In adults (19-60 years old), functional dysphonia (n = 268; 20.5%), acid laryngitis (n = 164; 12.5%), and vocal polyps (n = 156; 12%) predominated. In patients older than 60 years, presbyphonia (n = 89; 26.5%), functional dysphonia (n = 59; 17.6%), and Reinke's edema (n = 48; 14%) predominated. In this population of 2019 patients with dysphonia, adults and women were predominant. Dysphonia had different etiologies in the age groups studied. Nodules and cysts were predominant in children, functional dysphonia and reflux in adults, and presbyphonia and Reinke's edema in the elderly. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  2. From Out of Our Voices

    Directory of Open Access Journals (Sweden)

    Evangelia Papanikolaou

    2010-01-01

    Full Text Available Note from the interviewer: Diane Austin's recently published book “The Theory and Practice of Vocal Psychotherapy: Songs of the Self” (2008) has been an excellent opportunity to learn more about the use of voice in therapy, its clinical applications, and the enormous possibilities it offers within a psychotherapeutic setting. This interview focuses on introducing some of these aspects based on Austin’s work, and on exploring her background, motivations, and considerations regarding this pioneering music-therapeutic approach. The interview was edited by Diane Austin and Evangelia Papanikolaou and took place via a series of emails dated from September to December 2009.

  3. Muscular tension and body posture in relation to voice handicap and voice quality in teachers with persistent voice complaints.

    Science.gov (United States)

    Kooijman, P G C; de Jong, F I C R S; Oudes, M J; Huinck, W; van Acht, H; Graamans, K

    2005-01-01

    The aim of this study was to investigate the relationship between extrinsic laryngeal muscular hypertonicity and deviant body posture on the one hand and voice handicap and voice quality on the other hand in teachers with persistent voice complaints and a history of voice-related absenteeism. The study group consisted of 25 female teachers. A voice therapist assessed extrinsic laryngeal muscular tension and a physical therapist assessed body posture. The assessed parameters were clustered in categories. The parameters in the different categories represent the same function. Further a tension/posture index was created, which is the summation of the different parameters. The different parameters and the index were related to the Voice Handicap Index (VHI) and the Dysphonia Severity Index (DSI). The scores of the VHI and the individual parameters differ significantly except for the posterior weight bearing and tension of the sternocleidomastoid muscle. There was also a significant difference between the individual parameters and the DSI, except for tension of the cricothyroid muscle and posterior weight bearing. The score of the tension/posture index correlates significantly with both the VHI and the DSI. In a linear regression analysis, the combination of hypertonicity of the sternocleidomastoid, the geniohyoid muscles and posterior weight bearing is the most important predictor for a high voice handicap. The combination of hypertonicity of the geniohyoid muscle, posterior weight bearing, high position of the hyoid bone, hypertonicity of the cricothyroid muscle and anteroposition of the head is the most important predictor for a low DSI score. The results of this study show the higher the score of the index, the higher the score of the voice handicap and the worse the voice quality is. Moreover, the results are indicative for the importance of assessment of muscular tension and body posture in the diagnosis of voice disorders.
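
    The tension/posture index is described simply as the summation of the individual ratings, and its relation to the VHI and DSI was examined with linear regression. The sketch below uses hypothetical parameter names (not the study's rating scheme) and a one-predictor ordinary least squares fit for illustration.

```python
def tension_posture_index(ratings):
    """Summation of the individual muscle-tension and posture ratings;
    the keys here are illustrative placeholders, not the study's items."""
    return sum(ratings.values())

def ols_fit(xs, ys):
    """Simple least-squares line y = a + b*x, e.g. regressing VHI
    scores on the tension/posture index.  Returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b
```

    A positive fitted slope corresponds to the study's finding: the higher the index, the higher the voice handicap score.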

  4. The Role of Occupational Voice Demand and Patient-Rated Impairment in Predicting Voice Therapy Adherence.

    Science.gov (United States)

    Ebersole, Barbara; Soni, Resha S; Moran, Kathleen; Lango, Miriam; Devarajan, Karthik; Jamal, Nausheen

    2018-05-01

    Examine the relationship among the severity of patient-perceived voice impairment, perceptual dysphonia severity, occupational voice demand, and voice therapy adherence. Identify clinical predictors of increased risk for therapy nonadherence. A retrospective cohort study of patients presenting with a chief complaint of persistent dysphonia at an interdisciplinary voice center was done. The Voice Handicap Index-10 (VHI-10) and the Voice-Related Quality of Life (V-RQOL) survey scores, clinician rating of dysphonia severity using the Grade score from the Grade, Roughness, Breathiness, Asthenia, and Strain scale, occupational voice demand, and patient demographics were tested for associations with therapy adherence, defined as completion of the treatment plan. Classification and Regression Tree (CART) analysis was performed to establish thresholds for nonadherence risk. Of 166 patients evaluated, 111 were recommended for voice therapy. The therapy nonadherence rate was 56%. Occupational voice demand category, VHI-10, and V-RQOL scores were the only factors significantly correlated with therapy adherence. Patients with low occupational voice demand are significantly more likely to be nonadherent with therapy than those with high occupational voice demand. Occupational voice demand and patient perception of impairment are significantly and independently correlated with therapy adherence. A VHI-10 score of ≤9 or a V-RQOL score of >40 is a significant cutoff point for predicting nonadherence risk. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  5. Integrating cues of social interest and voice pitch in men's preferences for women's voices.

    Science.gov (United States)

    Jones, Benedict C; Feinberg, David R; Debruine, Lisa M; Little, Anthony C; Vukovic, Jovana

    2008-04-23

    Most previous studies of vocal attractiveness have focused on preferences for physical characteristics of voices such as pitch. Here we examine the content of vocalizations in interaction with such physical traits, finding that vocal cues of social interest modulate the strength of men's preferences for raised pitch in women's voices. Men showed stronger preferences for raised pitch when judging the voices of women who appeared interested in the listener than when judging the voices of women who appeared relatively disinterested in the listener. These findings show that voice preferences are not determined solely by physical properties of voices and that men integrate information about voice pitch and the degree of social interest expressed by women when forming voice preferences. Women's preferences for raised pitch in women's voices were not modulated by cues of social interest, suggesting that the integration of cues of social interest and voice pitch when men judge the attractiveness of women's voices may reflect adaptations that promote efficient allocation of men's mating effort.

  6. Reviews, Software.

    Science.gov (United States)

    Science Teacher, 1988

    1988-01-01

    Reviews two computer software packages for use in physical science, physics, and chemistry classes. Includes "Physics of Model Rocketry" for Apple II, and "Black Box" for Apple II and IBM compatible computers. "Black Box" is designed to help students understand the concept of indirect evidence. (CW)

  7. Software Reviews.

    Science.gov (United States)

    Kinnaman, Daniel E.; And Others

    1988-01-01

    Reviews four educational software packages for Apple, IBM, and Tandy computers. Includes "How the West was One + Three x Four,""Mavis Beacon Teaches Typing,""Math and Me," and "Write On." Reviews list hardware requirements, emphasis, levels, publisher, purchase agreements, and price. Discusses the strengths…

  8. Software Review.

    Science.gov (United States)

    McGrath, Diane, Ed.

    1989-01-01

    Reviewed is a computer software package entitled "Audubon Wildlife Adventures: Grizzly Bears" for Apple II and IBM microcomputers. Included are availability, hardware requirements, cost, and a description of the program. The murder-mystery flavor of the program is stressed in this program that focuses on illegal hunting and game…

  9. Software Reviews.

    Science.gov (United States)

    Teles, Elizabeth, Ed.; And Others

    1990-01-01

    Reviewed are two computer software packages for Macintosh microcomputers including "Phase Portraits," an exploratory graphics tool for studying first-order planar systems; and "MacMath," a set of programs for exploring differential equations, linear algebra, and other mathematical topics. Features, ease of use, cost, availability, and hardware…

  10. MIAWARE Software

    DEFF Research Database (Denmark)

    Wilkowski, Bartlomiej; Pereira, Oscar N. M.; Dias, Paulo

    2008-01-01

    is automatically generated. Furthermore, MIAWARE software is accompanied with an intelligent search engine for medical reports, based on the relations between parts of the lungs. A logical structure of the lungs is introduced to the search algorithm through the specially developed ontology. As a result...

  11. Perception of Paralinguistic Traits in Synthesized Voices

    DEFF Research Database (Denmark)

    Baird, Alice Emily; Hasse Jørgensen, Stina; Parada-Cabaleiro, Emilia

    2017-01-01

    Along with the rise of artificial intelligence and the internet-of-things, synthesized voices are now common in daily life, providing us with guidance, assistance, and even companionship. From formant to concatenative synthesis, the synthesized voice continues to be defined by the same traits we...

  12. Student Voices in School-Based Assessment

    Science.gov (United States)

    Tong, Siu Yin Annie; Adamson, Bob

    2015-01-01

    The value of student voices in dialogues about learning improvement is acknowledged in the literature. This paper examines how the views of students regarding School-based Assessment (SBA), a significant shift in examination policy and practice in secondary schools in Hong Kong, have largely been ignored. The study captures student voices through…

  13. Analog voicing detector responds to pitch

    Science.gov (United States)

    Abel, R. S.; Watkins, H. E.

    1967-01-01

    Modified electronic voice encoder /Vocoder/ includes an independent analog mode of operation in addition to the conventional digital mode. The Vocoder is a bandwidth compression equipment that permits voice transmission over channels having only a fraction of the bandwidth required for conventional telephone-quality speech transmission.

  14. The Voice of the Technical Writer.

    Science.gov (United States)

    Euler, James S.

    The author's voice is implicit in all writing, even technical writing. It is the expression of the writer's attitude toward audience, subject matter, and self. Effective use of voice is made possible by recognizing the three roles of the technical writer: transmitter, translator, and author. As a transmitter, the writer must consciously apply an…

  15. Student Voice and the Common Core

    Science.gov (United States)

    Yonezawa, Susan

    2015-01-01

    Common Core proponents and detractors debate its merits, but students have voiced their opinion for years. Using a decade's worth of data gathered through design-research on youth voice, this article discusses what high school students have long described as more ideal learning environments for themselves--and how remarkably similar the Common…

  16. Employee voice and engagement : Connections and consequences

    NARCIS (Netherlands)

    Rees, C.; Alfes, K.; Gatenby, M.

    2013-01-01

    This paper considers the relationship between employee voice and employee engagement. Employee perceptions of voice behaviour aimed at improving the functioning of the work group are found to have both a direct impact and an indirect impact on levels of employee engagement. Analysis of data from two

  17. Speaking with the voice of authority

    CERN Multimedia

    2002-01-01

    GPB Consulting has developed a scientific approach to voice coaching. A digital recording of the voice is sent to a lab in Switzerland and analyzed by a computer programme designed by a doctor of psychology and linguistics and a scientist at CERN (1 page).

  18. Managing dysphonia in occupational voice users.

    Science.gov (United States)

    Behlau, Mara; Zambon, Fabiana; Madazio, Glaucya

    2014-06-01

    Recent advances with regard to occupational voice disorders are highlighted with emphasis on issues warranting consideration when assessing, training, and treating professional voice users. Findings include the many particularities between the various categories of professional voice users, the concept that the environment plays a major role in occupational voice disorders, and that biopsychosocial influences should be analyzed on an individual basis. Assessment via self-evaluation protocols to quantify the impact of these disorders is mandatory as a component of an evaluation and to document treatment outcomes. Discomfort or odynophonia has evolved as a critical symptom in this population. Clinical trials are limited and the complexity of the environment may be a limitation in experiment design. This review reinforced the need for large population studies of professional voice users; new data highlighted important factors specific to each group of voice users. Interventions directed at student teachers are necessary not only to improve the quality of future professionals, but also to avoid the frustration and limitations associated with chronic voice problems. The causative relationship between the work environment and voice disorders has not yet been established. Randomized controlled trials are lacking and must be a focus to enhance treatment paradigms for this population.

  19. Does CPAP treatment affect the voice?

    Science.gov (United States)

    Saylam, Güleser; Şahin, Mustafa; Demiral, Dilek; Bayır, Ömer; Yüceege, Melike Bağnu; Çadallı Tatar, Emel; Korkmaz, Mehmet Hakan

    2016-12-20

    The aim of this study was to investigate alterations in voice parameters among patients using continuous positive airway pressure (CPAP) for the treatment of obstructive sleep apnea syndrome. Patients with an indication for CPAP treatment without any voice problems and with normal laryngeal findings were included and voice parameters were evaluated before and 1 and 6 months after CPAP. Videolaryngostroboscopic findings, a self-rated scale (Voice Handicap Index-10, VHI-10), perceptual voice quality assessment (GRBAS: grade, roughness, breathiness, asthenia, strain), and acoustic parameters were compared. Data from 70 subjects (48 men and 22 women) with a mean age of 44.2 ± 6.0 years were evaluated. When compared with the pre-CPAP treatment period, there was a significant increase in the VHI-10 score after 1 month of treatment and in VHI-10 and total GRBAS scores, jitter percent (P = 0.01), shimmer percent, noise-to-harmonic ratio, and voice turbulence index after 6 months of treatment. Subtle negative effects on voice parameters after the first month of CPAP treatment became more evident after 6 months. We demonstrated nonsevere alterations in the voice quality of patients under CPAP treatment. Given that CPAP is a long-term treatment, it is important to keep these alterations in mind.

  20. Occupational risk factors and voice disorders.

    Science.gov (United States)

    Vilkman, E

    1996-01-01

    From the point of view of occupational health, the field of voice disorders is very poorly developed as compared, for instance, to the prevention and diagnostics of occupational hearing disorders. In fact, voice disorders have not even been recognized in the field of occupational medicine. Hence, it is obviously very rare in most countries that the voice disorder of a professional voice user, e.g. a teacher, a singer or an actor, is accepted as an occupational disease by insurance companies. However, occupational voice problems do not lack significance from the point of view of the patient. We also know from questionnaires and clinical studies that voice complaints are very common. Another example of job-related health problems, which has proved more successful in terms of its occupational health status, is the repetition strain injury of the elbow, i.e. the "tennis elbow". Its textbook definition could be used as such to describe an occupational voice disorder ("dysphonia professionalis"). In the present paper the effects of such risk factors as vocal loading itself, background noise and room acoustics and low relative humidity of the air are discussed. Due to individual factors underlying the development of professional voice disorders, recommendations rather than regulations are called for. There are many simple and even relatively low-cost methods available for the prevention of vocal problems as well as for supporting rehabilitation.

  1. Why Is My Voice Changing? (For Teens)

    Science.gov (United States)

    ... enter puberty earlier or later than others. How Deep Will My Voice Get? How deep a guy's voice gets depends on his genes: ...

  2. Stage Voice Training in the London Schools.

    Science.gov (United States)

    Rubin, Lucille S.

    This report is the result of a six-week study in which the voice training offerings at four schools of drama in London were examined using interviews of teachers and directors, observation of voice classes, and attendance at studio presentations and public performances. The report covers such topics as: textbooks and references being used; courses…

  3. Predictors of Choral Directors' Voice Handicap

    Science.gov (United States)

    Schwartz, Sandra

    2013-01-01

    Vocal demands of teaching are considerable and these challenges are greater for choral directors who depend on the voice as a musical and instructive instrument. The purpose of this study was to (1) examine choral directors' vocal condition using a modified Voice Handicap Index (VHI), and (2) determine the extent to which the major variables…

  4. The written voice: implicit memory effects of voice characteristics following silent reading and auditory presentation.

    Science.gov (United States)

    Abramson, Marianne

    2007-12-01

    After being familiarized with two voices, either implicit (auditory lexical decision) or explicit memory (auditory recognition) for words from silently read sentences was assessed among 32 men and 32 women volunteers. In the silently read sentences, the sex of speaker was implied in the initial words, e.g., "He said, ..." or "She said...". Tone in question versus statement was also manipulated by appropriate punctuation. Auditory lexical decision priming was found for sex- and tone-consistent items following silent reading, but only up to 5 min. after silent reading. In a second study, similar lexical decision priming was found following listening to the sentences, although these effects remained reliable after a 2-day delay. The effect sizes for lexical decision priming showed that tone-consistency and sex-consistency were strong following both silent reading and listening 5 min. after studying. These results suggest that readers create episodic traces of text from auditory images of silently read sentences as they do during listening.

  5. Voice disorders in teachers. A review.

    Science.gov (United States)

    Martins, Regina Helena Garcia; Pereira, Eny Regina Bóia Neves; Hidalgo, Caio Bosque; Tavares, Elaine Lara Mendes

    2014-11-01

    Voice disorders are very prevalent among teachers and consequences are serious. Although the literature is extensive, there are differences in the concepts and methodology related to voice problems; most studies are restricted to analyzing the responses of teachers to questionnaires and only a few studies include vocal assessments and videolaryngoscopic examinations to obtain a definitive diagnosis. To review demographic studies related to vocal disorders in teachers to analyze the diverse methodologies, the prevalence rates pointed out by the authors, the main risk factors, the most prevalent laryngeal lesions, and the repercussions of dysphonias on professional activities. The available literature (from 1997 to 2013) was narratively reviewed based on Medline, PubMed, Lilacs, SciELO, and Cochrane library databases. Excluded were articles that specifically analyzed treatment modalities and those that did not make their abstracts available in those databases. The keywords included were teacher, dysphonia, voice disorders, professional voice. Copyright © 2014 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  6. Voice pedagogy-what do we need?

    Science.gov (United States)

    Gill, Brian P; Herbst, Christian T

    2016-12-01

    The final keynote panel of the 10th Pan-European Voice Conference (PEVOC) was concerned with the topic 'Voice pedagogy-what do we need?' In this communication the panel discussion is summarized, and the authors provide a deepening discussion on one of the key questions, addressing the roles and tasks of people working with voice students. In particular, a distinction is made between (1) voice building (derived from the German term 'Stimmbildung'), primarily comprising the functional and physiological aspects of singing; (2) coaching, mostly concerned with performance skills; and (3) singing voice rehabilitation. Both public and private educators are encouraged to apply this distinction to their curricula, in order to arrive at more efficient singing teaching and to reduce the risk of vocal injury to the singers concerned.

  7. Voice Quality Estimation in Wireless Networks

    Directory of Open Access Journals (Sweden)

    Petr Zach

    2015-01-01

    Full Text Available This article deals with the impact of Wireless (Wi-Fi) networks on the perceived quality of voice services. The Quality of Service (QoS) metrics must be monitored in the computer network during voice data transmission to ensure the voice service quality the end-user has paid for, especially in wireless networks. In addition to QoS, a research area called Quality of Experience (QoE) provides metrics and methods for quality evaluation from the end-user’s perspective. This article focuses on QoE estimation of Voice over IP (VoIP) calls in wireless networks using a network simulator. Results contribute to voice quality estimation based on characteristics of the wireless network and the location of a wireless client.

  8. Social power and recognition of emotional prosody: High power is associated with lower recognition accuracy than low power.

    Science.gov (United States)

    Uskul, Ayse K; Paulmann, Silke; Weick, Mario

    2016-02-01

    Listeners have to pay close attention to a speaker's tone of voice (prosody) during daily conversations. This is particularly important when trying to infer the emotional state of the speaker. Although a growing body of research has explored how emotions are processed from speech in general, little is known about how psychosocial factors such as social power can shape the perception of vocal emotional attributes. Thus, the present studies explored how social power affects emotional prosody recognition. In a correlational study (Study 1) and an experimental study (Study 2), we show that high power is associated with lower accuracy in emotional prosody recognition than low power. These results, for the first time, suggest that individuals experiencing high or low power perceive emotional tone of voice differently. (c) 2016 APA, all rights reserved.

  9. Acute effects of radioiodine therapy on the voice and larynx of basedow-Graves patients

    Energy Technology Data Exchange (ETDEWEB)

    Isolan-Cury, Roberta Werlang; Cury, Adriano Namo [Sao Paulo Santa Casa de Misericordia, SP (Brazil). Medical Science School (FCMSCSP); Monte, Osmar [Sao Paulo Santa Casa de Misericordia, SP (Brazil). Physiology Department; Silva, Marta Assumpcao de Andrada e [Sao Paulo Santa Casa de Misericordia, SP (Brazil). Medical Science School (FCMSCSP). Speech Therapy School; Duprat, Andre [Sao Paulo Santa Casa de Misericordia, SP (Brazil). Medical Science School (FCMSCSP). Otorhinolaryngology Department; Marone, Marilia [Nuclimagem - Irmanity of the Sao Paulo Santa Casa de Misericordia, SP (Brazil). Nuclear Medicine Unit; Almeida, Renata de; Iglesias, Alexandre [Sao Paulo Santa Casa de Misericordia, SP (Brazil). Medical Science School (FCMSCSP). Otorhinolaryngology Department. Endocrinology and Metabology Unit

    2008-07-01

    Graves' disease is the most common cause of hyperthyroidism. There are three current therapeutic options: anti-thyroid medication, surgery, and radioactive iodine (I-131). There are few data in the literature regarding the effects of radioiodine therapy on the larynx and voice. The aim of this study was to assess the effect of radioiodine therapy on the voice of Basedow-Graves patients. Material and method: A prospective study was done. Following the diagnosis of Graves' disease, patients underwent investigation of their voice, measurement of maximum phonatory time (/a/) and the s/z ratio, fundamental frequency analysis (Praat software), laryngoscopy and perceptive-auditory analysis in three different conditions: pre-treatment, 4 days, and 20 days post-radioiodine therapy. Conditions are based on the inflammatory pattern of thyroid tissue (Jones et al. 1999). Results: No statistically significant differences were found in voice characteristics in these three conditions. Conclusion: Radioiodine therapy does not affect voice quality. (author)

  10. Identifying hidden voice and video streams

    Science.gov (United States)

    Fan, Jieyan; Wu, Dapeng; Nucci, Antonio; Keralapura, Ram; Gao, Lixin

    2009-04-01

    Given the rising popularity of voice and video services over the Internet, accurately identifying voice and video traffic that traverse their networks has become a critical task for Internet service providers (ISPs). As the number of proprietary applications that deliver voice and video services to end users increases over time, the search for the one methodology that can accurately detect such services while being application independent still remains open. This problem becomes even more complicated when voice and video service providers like Skype, Microsoft, and Google bundle their voice and video services with other services like file transfer and chat. For example, a bundled Skype session can contain both voice stream and file transfer stream in the same layer-3/layer-4 flow. In this context, traditional techniques to identify voice and video streams do not work. In this paper, we propose a novel self-learning classifier, called VVS-I, that detects the presence of voice and video streams in flows with minimum manual intervention. Our classifier works in two phases: training phase and detection phase. In the training phase, VVS-I first extracts the relevant features, and subsequently constructs a fingerprint of a flow using the power spectral density (PSD) analysis. In the detection phase, it compares the fingerprint of a flow to the existing fingerprints learned during the training phase, and subsequently classifies the flow. Our classifier is not only capable of detecting voice and video streams that are hidden in different flows, but is also capable of detecting different applications (like Skype, MSN, etc.) that generate these voice/video streams. We show that our classifier can achieve close to 100% detection rate while keeping the false positive rate to less than 1%.
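The PSD-based fingerprinting idea can be sketched in a few lines. This is an illustrative reconstruction, not the authors' VVS-I implementation: the binning rate, fingerprint length, and cosine-similarity comparison are all assumptions made here for demonstration.

```python
import numpy as np
from scipy.signal import periodogram

def flow_fingerprint(packet_times, fs=100.0, n_bins=64):
    """Build a coarse spectral fingerprint of a flow from packet timestamps.

    The flow is binned into a packet-rate signal whose power spectral
    density (PSD) captures the periodicity of the traffic; voice codecs
    emit packets at near-constant intervals (e.g. every 20 ms), which
    shows up as a strong spectral line.
    """
    duration = packet_times[-1] - packet_times[0]
    n = max(int(duration * fs), 2)
    counts, _ = np.histogram(packet_times, bins=n)
    freqs, psd = periodogram(counts - counts.mean(), fs=fs)
    # Resample the PSD to a fixed-length, unit-norm fingerprint vector.
    fp = np.interp(np.linspace(freqs[0], freqs[-1], n_bins), freqs, psd)
    norm = np.linalg.norm(fp)
    return fp / norm if norm > 0 else fp

def fingerprint_similarity(fp_a, fp_b):
    """Cosine similarity between two unit-norm fingerprints (1.0 = identical shape)."""
    return float(np.dot(fp_a, fp_b))

# A 50 packets/s "voice-like" flow vs. a bursty random flow.
rng = np.random.default_rng(0)
voice = np.arange(0, 10, 0.02) + rng.normal(0, 1e-3, 500)
bursty = np.sort(rng.uniform(0, 10, 500))
fp_v1 = flow_fingerprint(voice)
fp_v2 = flow_fingerprint(voice + 0.005)  # same flow, time-shifted
fp_b = flow_fingerprint(bursty)
```

A time-shifted copy of the periodic flow matches its own fingerprint far better than the bursty flow does, which is the property the classifier exploits.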

  11. Software Tools for Software Maintenance

    Science.gov (United States)

    1988-10-01

    Army Institute for Research in Management Information, Communications, and Computer Sciences (AIRMICS): Software Tools for Software Maintenance (ASQBG-1-89-001), October 1988. The scanned abstract lists maintenance tools for Cobol and Fortran, including the Cobol Structuring Facility, VS Cobol II, F-Scan, and Fortran static code analyzers.

  12. A voice-actuated wind tunnel model leak checking system

    Science.gov (United States)

    Larson, William E.

    1989-01-01

    A computer program has been developed that improves the efficiency of wind tunnel model leak checking. The program uses a voice recognition unit to relay a technician's commands to the computer. The computer, after receiving a command, can respond to the technician via a voice response unit. Information about the model pressure orifice being checked is displayed on a gas-plasma terminal. On command, the program records up to 30 seconds of pressure data. After the recording is complete, the raw data and a straight line fit of the data are plotted on the terminal. This allows the technician to make a decision on the integrity of the orifice being checked. All results of the leak check program are stored in a database file that can be listed on the line printer for record keeping purposes or displayed on the terminal to help the technician find unchecked orifices. This program allows one technician to check a model for leaks instead of the two or three previously required.

  13. Current trends in small vocabulary speech recognition for equipment control

    Science.gov (United States)

    Doukas, Nikolaos; Bardis, Nikolaos G.

    2017-09-01

    Speech recognition systems allow human - machine communication to acquire an intuitive nature that approaches the simplicity of inter - human communication. Small vocabulary speech recognition is a subset of the overall speech recognition problem, where only a small number of words need to be recognized. Speaker independent small vocabulary recognition can find significant applications in field equipment used by military personnel. Such equipment may typically be controlled by a small number of commands that need to be given quickly and accurately, under conditions where delicate manual operations are difficult to achieve. This type of application could hence significantly benefit by the use of robust voice operated control components, as they would facilitate the interaction with their users and render it much more reliable in times of crisis. This paper presents current challenges involved in attaining efficient and robust small vocabulary speech recognition. These challenges concern feature selection, classification techniques, speaker diversity and noise effects. A state machine approach is presented that facilitates the voice guidance of different equipment in a variety of situations.
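The state-machine approach to voice-guided equipment control can be sketched as follows. The command vocabulary and transitions here are hypothetical, since the paper does not specify its grammar; the point is that only transitions valid in the current state can fire.

```python
# Hypothetical command set and transitions for a piece of field equipment.
TRANSITIONS = {
    ("idle", "power"): "armed",
    ("armed", "start"): "running",
    ("running", "stop"): "armed",
    ("armed", "power"): "idle",
}

def drive(state, recognized_words):
    """Advance the equipment state machine with a sequence of recognized commands.

    Unknown or out-of-context words are ignored, which makes the controller
    robust to recognition errors: a misrecognized word cannot trigger a
    transition that is invalid in the current state.
    """
    for word in recognized_words:
        state = TRANSITIONS.get((state, word), state)
    return state
```

For example, `drive("idle", ["start"])` leaves the machine idle, because "start" is only meaningful once the equipment is armed.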

  14. The Effects of Size and Type of Vocal Fold Polyp on Some Acoustic Voice Parameters

    Directory of Open Access Journals (Sweden)

    Elaheh Akbari

    2018-03-01

    Full Text Available Background: Vocal abuse and misuse can result in vocal fold polyps. Certain features define the extent of vocal fold polyps' effects on acoustic voice parameters. The present study aimed to define the effects of polyp size on acoustic voice parameters, and to compare these parameters in hemorrhagic and non-hemorrhagic polyps. Methods: In the present retrospective study, 28 individuals with hemorrhagic or non-hemorrhagic polyps of the true vocal folds were recruited to investigate acoustic voice parameters of the vowel /æ/ computed by the Praat software. The data were analyzed using the SPSS software, version 17.0. According to the type and size of polyps, mean acoustic differences and correlations were analyzed by the t test and Pearson correlation test, respectively, with a significance level below 0.05. Results: The results indicated that jitter and the harmonics-to-noise ratio had significant positive and negative correlations with polyp size (P=0.01), respectively. In addition, both of these parameters differed significantly between the two types of polyps investigated. Conclusion: Both the type and size of polyps have effects on acoustic voice characteristics. In the present study, a novel method to measure polyp size was introduced. Further confirmation of this method as a tool to compare polyp sizes requires additional investigation.
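To illustrate one of the acoustic parameters mentioned, local jitter can be computed from a sequence of glottal period durations. This is a simplified textbook-style formula, not Praat's exact algorithm, which first extracts the periods from the waveform and applies additional voicing constraints.

```python
def local_jitter_percent(periods):
    """Local jitter: mean absolute difference between consecutive glottal
    periods, expressed as a percentage of the mean period."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

# A steady 5 ms period (200 Hz) vs. the same voice with cycle-to-cycle perturbation.
steady = [5.0, 5.0, 5.0, 5.0, 5.0]
perturbed = [5.0, 5.1, 4.9, 5.1, 4.9]
```

A perfectly periodic signal yields 0% jitter; the perturbed sequence above yields 3.5%, well above the roughly 1% often cited as the upper bound for healthy voices.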

  15. EPIQR software

    Energy Technology Data Exchange (ETDEWEB)

    Flourentzos, F. [Federal Institute of Technology, Lausanne (Switzerland); Droutsa, K. [National Observatory of Athens, Athens (Greece); Wittchen, K.B. [Danish Building Research Institute, Hoersholm (Denmark)

    1999-11-01

    The EPIQR method is supported by a multimedia computer program. Several modules help users of the method to process the data collected during a diagnosis survey, to set up refurbishment scenarios and calculate their cost or energy performance, and finally to visualize the results in a comprehensive way and prepare quality reports. This article presents the structure and the main features of the software. (au)

  16. Software preservation

    Directory of Open Access Journals (Sweden)

    Tadej Vodopivec

    2011-01-01

    Full Text Available Comtrade Ltd. covers a wide range of activities related to information and communication technologies; its deliverables include web applications, locally installed programs, system software, drivers, and embedded software (used e.g. in medical devices, auto parts, and communication switchboards). The company has also acquired extensive knowledge and practical experience of digital long-term preservation technologies. This wide spectrum of activities puts us in a position to discuss an often overlooked aspect of digital preservation - the preservation of software programs. Many resources are dedicated to the digital preservation of data, documents and multimedia records, but not so many to preserving the functionalities and features of computer programs. Exactly these functionalities - dynamic response to inputs - make computer programs rich compared to documents or linear multimedia. The article opens questions that stand at the beginning of the road to permanent digital preservation. The purpose is to find a way in the right direction, where all relevant aspects are covered in proper balance. The following questions are asked: why preserve computer programs permanently at all, who should do this and for whom, when should we think about permanent program preservation, what should be preserved (such as source code, screenshots, documentation, and the social context of the program - e.g. media response to it ...), and where and how? To illustrate the theoretical concepts, the idea of a virtual national museum of electronic banking is also presented.

  17. Is it me? Self-recognition bias across sensory modalities and its relationship to autistic traits.

    Science.gov (United States)

    Chakraborty, Anya; Chakrabarti, Bhismadev

    2015-01-01

    Atypical self-processing is an emerging theme in autism research, suggested by lower self-reference effect in memory, and atypical neural responses to visual self-representations. Most research on physical self-processing in autism uses visual stimuli. However, the self is a multimodal construct, and therefore, it is essential to test self-recognition in other sensory modalities as well. Self-recognition in the auditory modality remains relatively unexplored and has not been tested in relation to autism and related traits. This study investigates self-recognition in auditory and visual domain in the general population and tests if it is associated with autistic traits. Thirty-nine neurotypical adults participated in a two-part study. In the first session, individual participant's voice was recorded and face was photographed and morphed respectively with voices and faces from unfamiliar identities. In the second session, participants performed a 'self-identification' task, classifying each morph as 'self' voice (or face) or an 'other' voice (or face). All participants also completed the Autism Spectrum Quotient (AQ). For each sensory modality, slope of the self-recognition curve was used as individual self-recognition metric. These two self-recognition metrics were tested for association between each other, and with autistic traits. Fifty percent 'self' response was reached for a higher percentage of self in the auditory domain compared to the visual domain (t = 3.142; P self-recognition bias across sensory modalities (τ = -0.165, P = 0.204). Higher recognition bias for self-voice was observed in individuals higher in autistic traits (τ AQ = 0.301, P = 0.008). No such correlation was observed between recognition bias for self-face and autistic traits (τ AQ = -0.020, P = 0.438). Our data shows that recognition bias for physical self-representation is not related across sensory modalities. Further, individuals with higher autistic traits were better able
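The slope-of-the-self-recognition-curve metric described above can be sketched as a psychometric (logistic) curve fit. The study does not report its exact fitting procedure, so the sigmoid parameterization and least-squares fit below are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(x, x0, k):
    """Logistic psychometric function: x0 is the 50% 'self' point, k the slope."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

def self_recognition_slope(percent_self, p_self_response):
    """Fit a logistic curve to the proportion of 'self' responses across morph
    levels and return its slope parameter k; a steeper slope indicates a
    sharper self/other category boundary."""
    (x0, k), _ = curve_fit(sigmoid, percent_self, p_self_response,
                           p0=[50.0, 0.1], maxfev=10000)
    return k

# Morph levels (% self in the morph) and two illustrative response profiles.
levels = np.array([0, 20, 40, 60, 80, 100], dtype=float)
sharp = np.array([0.0, 0.02, 0.1, 0.9, 0.98, 1.0])    # crisp self/other boundary
shallow = np.array([0.1, 0.25, 0.4, 0.6, 0.75, 0.9])  # graded responding
```

Under this formulation, the observer with the crisp boundary gets a larger slope metric than the graded responder.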

  18. Your Cheatin' Voice Will Tell on You: Detection of Past Infidelity from Voice.

    Science.gov (United States)

    Hughes, Susan M; Harrison, Marissa A

    2017-01-01

    Evidence suggests that many physical, behavioral, and trait qualities can be detected solely from the sound of a person's voice, irrespective of the semantic information conveyed through speech. This study examined whether raters could accurately assess the likelihood that a person has cheated on committed, romantic partners simply by hearing the speaker's voice. Independent raters heard voice samples of individuals who self-reported that they either cheated or had never cheated on their romantic partners. To control for aspects that may clue a listener to the speaker's mate value, we used voice samples that did not differ between these groups for voice attractiveness, age, voice pitch, and other acoustic measures. We found that participants indeed rated the voices of those who had a history of cheating as more likely to cheat. Male speakers were given higher cheating ratings overall, and female raters were more likely than male raters to ascribe a likelihood of cheating to speakers. Additionally, we manipulated the pitch of the voice samples, and for both sexes, the lower pitched versions were consistently rated to be from those who were more likely to have cheated. Regardless of the pitch manipulation, listeners were able to assess actual history of infidelity; the one exception was that men's accuracy decreased when judging women whose voices were lowered. These findings expand upon the idea that the human voice may be of value as a cheater detection tool and that very thin slices of vocal information are all that is needed to make certain assessments about others.

  19. A pneumatic Bionic Voice prosthesis-Pre-clinical trials of controlling the voice onset and offset.

    Directory of Open Access Journals (Sweden)

    Farzaneh Ahmadi

    Full Text Available Despite emergent progress in many fields of bionics, a functional Bionic Voice prosthesis for laryngectomy patients (larynx amputees) has not yet been achieved, leading to a lifetime of vocal disability for these patients. This study introduces a novel framework of Pneumatic Bionic Voice Prostheses as an electronic adaptation of the Pneumatic Artificial Larynx (PAL) device. The PAL is a non-invasive mechanical voice source, driven exclusively by respiration with an exceptionally high voice quality, comparable to the existing gold standard of Tracheoesophageal (TE) voice prosthesis. Following PAL design closely as the reference, Pneumatic Bionic Voice Prostheses seem to have a strong potential to substitute the existing gold standard by generating a similar voice quality while remaining non-invasive and non-surgical. This paper designs the first Pneumatic Bionic Voice prosthesis and evaluates its onset and offset control against the PAL device through pre-clinical trials on one laryngectomy patient. The evaluation on a database of more than five hours of continuous/isolated speech recordings shows a close match between the onset/offset control of the Pneumatic Bionic Voice and the PAL with an accuracy of 98.45 ±0.54%. When implemented in real-time, the Pneumatic Bionic Voice prosthesis controller has an average onset/offset delay of 10 milliseconds compared to the PAL. Hence it addresses a major disadvantage of previous electronic voice prostheses, including myoelectric Bionic Voice, in meeting the short time-frames of controlling the onset/offset of the voice in continuous speech.

  20. A pneumatic Bionic Voice prosthesis-Pre-clinical trials of controlling the voice onset and offset.

    Science.gov (United States)

    Ahmadi, Farzaneh; Noorian, Farzad; Novakovic, Daniel; van Schaik, André

    2018-01-01

    Despite emergent progress in many fields of bionics, a functional Bionic Voice prosthesis for laryngectomy patients (larynx amputees) has not yet been achieved, leading to a lifetime of vocal disability for these patients. This study introduces a novel framework of Pneumatic Bionic Voice Prostheses as an electronic adaptation of the Pneumatic Artificial Larynx (PAL) device. The PAL is a non-invasive mechanical voice source, driven exclusively by respiration with an exceptionally high voice quality, comparable to the existing gold standard of Tracheoesophageal (TE) voice prosthesis. Following PAL design closely as the reference, Pneumatic Bionic Voice Prostheses seem to have a strong potential to substitute the existing gold standard by generating a similar voice quality while remaining non-invasive and non-surgical. This paper designs the first Pneumatic Bionic Voice prosthesis and evaluates its onset and offset control against the PAL device through pre-clinical trials on one laryngectomy patient. The evaluation on a database of more than five hours of continuous/isolated speech recordings shows a close match between the onset/offset control of the Pneumatic Bionic Voice and the PAL with an accuracy of 98.45 ±0.54%. When implemented in real-time, the Pneumatic Bionic Voice prosthesis controller has an average onset/offset delay of 10 milliseconds compared to the PAL. Hence it addresses a major disadvantage of previous electronic voice prostheses, including myoelectric Bionic Voice, in meeting the short time-frames of controlling the onset/offset of the voice in continuous speech.

  1. A pneumatic Bionic Voice prosthesis—Pre-clinical trials of controlling the voice onset and offset

    Science.gov (United States)

    Noorian, Farzad; Novakovic, Daniel; van Schaik, André

    2018-01-01

    Despite emergent progress in many fields of bionics, a functional Bionic Voice prosthesis for laryngectomy patients (larynx amputees) has not yet been achieved, leading to a lifetime of vocal disability for these patients. This study introduces a novel framework of Pneumatic Bionic Voice Prostheses as an electronic adaptation of the Pneumatic Artificial Larynx (PAL) device. The PAL is a non-invasive mechanical voice source, driven exclusively by respiration with an exceptionally high voice quality, comparable to the existing gold standard of Tracheoesophageal (TE) voice prosthesis. Following PAL design closely as the reference, Pneumatic Bionic Voice Prostheses seem to have a strong potential to substitute the existing gold standard by generating a similar voice quality while remaining non-invasive and non-surgical. This paper designs the first Pneumatic Bionic Voice prosthesis and evaluates its onset and offset control against the PAL device through pre-clinical trials on one laryngectomy patient. The evaluation on a database of more than five hours of continuous/isolated speech recordings shows a close match between the onset/offset control of the Pneumatic Bionic Voice and the PAL with an accuracy of 98.45 ±0.54%. When implemented in real-time, the Pneumatic Bionic Voice prosthesis controller has an average onset/offset delay of 10 milliseconds compared to the PAL. Hence it addresses a major disadvantage of previous electronic voice prostheses, including myoelectric Bionic Voice, in meeting the short time-frames of controlling the onset/offset of the voice in continuous speech. PMID:29466455

  2. Error-dependent modulation of speech-induced auditory suppression for pitch-shifted voice feedback

    Directory of Open Access Journals (Sweden)

    Larson Charles R

    2011-06-01

    Background: The motor-driven predictions about expected sensory feedback (efference copies) have been proposed to play an important role in recognition of sensory consequences of self-produced motor actions. In the auditory system, this effect was suggested to result in suppression of sensory neural responses to self-produced voices that are predicted by the efference copies during vocal production in comparison with passive listening to the playback of the identical self-vocalizations. In the present study, event-related potentials (ERPs) were recorded in response to upward pitch shift stimuli (PSS) with five different magnitudes (0, +50, +100, +200 and +400 cents) at voice onset during active vocal production and passive listening to the playback. Results: The suppression of the N1 component during vocal production was largest for unaltered voice feedback (PSS: 0 cents), became smaller as the magnitude of PSS increased to 200 cents, and was almost completely eliminated in response to 400 cents stimuli. Conclusions: Findings of the present study suggest that the brain utilizes the motor predictions (efference copies) to determine the source of incoming stimuli and maximally suppresses the auditory responses to unaltered feedback of self-vocalizations. The reduction of suppression for 50, 100 and 200 cents and its elimination for 400 cents pitch-shifted voice auditory feedback support the idea that motor-driven suppression of voice feedback leads to distinctly different sensory neural processing of self vs. non-self vocalizations. This characteristic may enable the audio-vocal system to more effectively detect and correct for unexpected errors in the feedback of self-produced voice pitch compared with externally-generated sounds.

  3. Error-dependent modulation of speech-induced auditory suppression for pitch-shifted voice feedback.

    Science.gov (United States)

    Behroozmand, Roozbeh; Larson, Charles R

    2011-06-06

    The motor-driven predictions about expected sensory feedback (efference copies) have been proposed to play an important role in recognition of sensory consequences of self-produced motor actions. In the auditory system, this effect was suggested to result in suppression of sensory neural responses to self-produced voices that are predicted by the efference copies during vocal production in comparison with passive listening to the playback of the identical self-vocalizations. In the present study, event-related potentials (ERPs) were recorded in response to upward pitch shift stimuli (PSS) with five different magnitudes (0, +50, +100, +200 and +400 cents) at voice onset during active vocal production and passive listening to the playback. Results indicated that the suppression of the N1 component during vocal production was largest for unaltered voice feedback (PSS: 0 cents), became smaller as the magnitude of PSS increased to 200 cents, and was almost completely eliminated in response to 400 cents stimuli. Findings of the present study suggest that the brain utilizes the motor predictions (efference copies) to determine the source of incoming stimuli and maximally suppresses the auditory responses to unaltered feedback of self-vocalizations. The reduction of suppression for 50, 100 and 200 cents and its elimination for 400 cents pitch-shifted voice auditory feedback support the idea that motor-driven suppression of voice feedback leads to distinctly different sensory neural processing of self vs. non-self vocalizations. This characteristic may enable the audio-vocal system to more effectively detect and correct for unexpected errors in the feedback of self-produced voice pitch compared with externally-generated sounds.
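
    The pitch-shift magnitudes above are given in cents, where a shift of c cents multiplies frequency by 2^(c/1200). A small sketch of this conversion (the 200 Hz baseline F0 is illustrative, not taken from the study):

```python
def cents_to_ratio(cents):
    """Convert a pitch shift in cents to a frequency ratio."""
    return 2.0 ** (cents / 1200.0)

base_f0 = 200.0  # Hz, illustrative baseline fundamental frequency
# shifted F0 for each pitch-shift stimulus magnitude used in the study
shifted = {c: base_f0 * cents_to_ratio(c) for c in (0, 50, 100, 200, 400)}
```

    For example, +400 cents corresponds to a ratio of 2^(1/3), i.e. a major third above the baseline.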

  4. Features Speech Signature Image Recognition on Mobile Devices

    Directory of Open Access Journals (Sweden)

    Alexander Mikhailovich Alyushin

    2015-12-01

    Algorithms for recognizing and processing dynamic spectrogram images and the sound speech signature (SS) were developed, and software for mobile phones that can recognize speech signatures was prepared. The SS recognition speed was investigated for different boundary types, and recommendations are given on the choice of boundary type for an optimal ratio of recognition speed to required space.

  5. Mindfulness of voices, self-compassion, and secure attachment in relation to the experience of hearing voices.

    Science.gov (United States)

    Dudley, James; Eames, Catrin; Mulligan, John; Fisher, Naomi

    2018-03-01

    Developing compassion towards oneself has been linked to improvement in many areas of psychological well-being, including psychosis. Furthermore, developing a non-judgemental, accepting way of relating to voices is associated with lower levels of distress for people who hear voices. These factors have also been associated with secure attachment. This study explores associations between the constructs of mindfulness of voices, self-compassion, and distress from hearing voices and how secure attachment style related to each of these variables. A cross-sectional online design was used. One hundred and twenty-eight people (73% female; M age = 37.5; 87.5% Caucasian) who currently hear voices completed the Self-Compassion Scale, Southampton Mindfulness of Voices Questionnaire, Relationships Questionnaire, and Hamilton Programme for Schizophrenia Voices Questionnaire. Results showed that mindfulness of voices mediated the relationship between self-compassion and severity of voices, and self-compassion mediated the relationship between mindfulness of voices and severity of voices. Self-compassion and mindfulness of voices were significantly positively correlated with each other and negatively correlated with distress and severity of voices. Mindful relation to voices and self-compassion are associated with reduced distress and severity of voices, which supports the proposed potential benefits of mindful relating to voices and self-compassion as therapeutic skills for people experiencing distress from voice hearing. Greater self-compassion and mindfulness of voices were significantly associated with less distress from voices. These findings support theory underlining compassionate mind training. Mindfulness of voices mediated the relationship between self-compassion and distress from voices, indicating a synergistic relationship between the constructs. Although the current findings do not give a direction of causation, consideration is given to the potential impact of mindful and
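
    The mediation results reported above rest on standard statistical mediation analysis. As a minimal sketch of the product-of-coefficients approach on synthetic data (variable names and effect sizes are invented; the study itself likely used dedicated mediation software with bootstrapped confidence intervals):

```python
import random

def simple_slope(x, y):
    # OLS slope for y = a*x + const
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def two_var_slopes(x, m, y):
    # OLS for y = b1*m + b2*x + const, solved via 2x2 normal equations
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    xc = [v - mx for v in x]
    mc = [v - mm for v in m]
    yc = [v - my for v in y]
    sxx = sum(v * v for v in xc)
    smm = sum(v * v for v in mc)
    sxm = sum(a * b for a, b in zip(xc, mc))
    sxy = sum(a * b for a, b in zip(xc, yc))
    smy = sum(a * b for a, b in zip(mc, yc))
    det = smm * sxx - sxm * sxm
    b1 = (smy * sxx - sxy * sxm) / det  # slope on the mediator
    b2 = (sxy * smm - smy * sxm) / det  # direct effect of x
    return b1, b2

rng = random.Random(0)
n = 4000
x = [rng.gauss(0, 1) for _ in range(n)]                 # predictor
m = [0.6 * xi + rng.gauss(0, 0.5) for xi in x]          # mediator, a = 0.6
y = [0.5 * mi + 0.2 * xi + rng.gauss(0, 0.5)            # outcome, b = 0.5
     for mi, xi in zip(m, x)]

a_hat = simple_slope(x, m)
b_hat, c_hat = two_var_slopes(x, m, y)
indirect = a_hat * b_hat  # mediated effect; true value 0.6 * 0.5 = 0.30
```

    A nonzero indirect effect a*b, alongside a reduced direct effect, is what "mediation" refers to in the abstract above.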

  6. Establishing software quality assurance

    International Nuclear Information System (INIS)

    Malsbury, J.

    1983-01-01

    This paper is concerned with four questions about establishing software QA: What is software QA. Why have software QA. What is the role of software QA. What is necessary to ensure the success of software QA

  7. Communication Skills Training Exploiting Multimodal Emotion Recognition

    Science.gov (United States)

    Bahreini, Kiavash; Nadolski, Rob; Westera, Wim

    2017-01-01

    The teaching of communication skills is a labour-intensive task because of the detailed feedback that should be given to learners during their prolonged practice. This study investigates to what extent our FILTWAM facial and vocal emotion recognition software can be used for improving a serious game (the Communication Advisor) that delivers a…

  8. Post-editing through Speech Recognition

    DEFF Research Database (Denmark)

    Mesa-Lao, Bartolomé

    (i.e. typing, handwriting and speaking) to improve the efficiency and accuracy of the translation process. However, further studies need to be conducted to build up new knowledge about the way in which state-of-the-art speech recognition software can be applied to the post-editing process...

  9. Image Quality Enhancement Using the Direction and Thickness of Vein Lines for Finger-Vein Recognition

    OpenAIRE

    Park, Young Ho; Park, Kang Ryoung

    2012-01-01

    On the basis of the increased emphasis placed on the protection of privacy, biometric recognition systems using physical or behavioural characteristics such as fingerprints, facial characteristics, iris and finger‐vein patterns or the voice have been introduced in applications including door access control, personal certification, Internet banking and ATM machines. Among these, finger‐vein recognition is advantageous in that it involves the use of inexpensive and small devices that are diffic...

  10. Authentication: From Passwords to Biometrics: An implementation of a speaker recognition system on Android

    OpenAIRE

    Heimark, Erlend

    2012-01-01

    We implement a biometric authentication system on the Android platform, which is based on text-dependent speaker recognition. The Android version used in the application is Android 4.0. The application makes use of the Modular Audio Recognition Framework, from which many of the algorithms are adapted in the processes of preprocessing and feature extraction. In addition, we employ the Dynamic Time Warping (DTW) algorithm for the comparison of different voice features. A training procedure is i...
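
    The record above mentions Dynamic Time Warping for comparing voice features of different lengths. A minimal textbook DTW implementation might look like this (the sequences here are plain numbers; a real speaker-recognition system would compare frame-wise feature vectors):

```python
def dtw_distance(a, b, dist=lambda p, q: abs(p - q)):
    """Classic dynamic-time-warping distance between two sequences.
    D[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j]."""
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(a[i - 1], b[j - 1])
            # step pattern: insertion, deletion, or match
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]
```

    Because the warping path may repeat or skip frames, a slowed-down repetition of the same phrase still yields a small distance.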

  11. Psychological effects of dysphonia in voice professionals.

    Science.gov (United States)

    Salturk, Ziya; Kumral, Tolgar Lutfi; Aydoğdu, Imran; Arslanoğlu, Ahmet; Berkiten, Güler; Yildirim, Güven; Uyar, Yavuz

    2015-08-01

    To evaluate the psychological effects of dysphonia in voice professionals compared to non-voice professionals and in both genders. Cross-sectional analysis. Forty-eight voice professionals and 52 non-voice professionals with dysphonia were included in this study. All participants underwent a complete ear, nose, and throat examination and an evaluation for pathologies that might affect vocal quality. Participants were asked to complete the Turkish versions of the Voice Handicap Index-30 (VHI-30), Perceived Stress Scale (PSS), and the Hospital Anxiety and Depression Scale (HADS). HADS scores were evaluated as HADS-A (anxiety) and HADS-D (depression). Dysphonia status was evaluated perceptually with the grade, roughness, breathiness, asthenia, and strain (GRBAS) scale. The results were compared statistically. Significant differences between the two groups were evident when the VHI-30 and PSS data were compared (P = .00001 and P = .00001, respectively). However, neither HADS score (HADS-A and HADS-D) differed between groups. An analysis of the scores in terms of sex revealed that females had significantly higher PSS scores (P = .006). The GRBAS scale revealed no difference between groups (P = .819, .931, .803, .655, and .803, respectively). No between-sex differences in the VHI-30 or HADS scores were evident. We found that voice professionals and females experienced more stress and were more dissatisfied with their voices. 4. © 2015 The American Laryngological, Rhinological and Otological Society, Inc.

  12. Reliability in perceptual analysis of voice quality.

    Science.gov (United States)

    Bele, Irene Velsvik

    2005-12-01

    This study focuses on speaking voice quality in male teachers (n = 35) and male actors (n = 36), who represent untrained and trained voice users, because we wanted to investigate normal and supranormal voices. In this study, both substantial and methodologic aspects were considered. It includes a method for perceptual voice evaluation, and a basic issue was rater reliability. A listening group of 10 listeners, 7 experienced speech-language therapists, and 3 speech-language therapist students evaluated the voices on 15 vocal characteristics using visual analogue (VA) scales. Two sets of voice signals were investigated: text reading (2 loudness levels) and sustained vowel (3 levels). The results indicated high interrater reliability for most perceptual characteristics. Both types of voice signals were evaluated reliably, although reliability was somewhat higher for connected speech, especially at the normal loudness level, than for sustained vowels. Experienced listeners tended to be more consistent in their ratings than the student raters. Some vocal characteristics achieved acceptable reliability even with a smaller panel of listeners. The perceptual characteristics grouped into 4 factors reflecting perceptual dimensions.

  13. Muted 'voice': The writing of two groups of postgraduate ...

    African Journals Online (AJOL)

    The purpose of this article is to demonstrate and account for the weak emergence of 'voice' in the writing of students embarking upon their postgraduate studies in Geosciences. The two elements of 'voice' that are emphasised are 'voice' as style of expression and 'voice' as the ability to write distinctly, yet building upon ...

  14. Performance of Phonatory Deviation Diagrams in Synthesized Voice Analysis.

    Science.gov (United States)

    Lopes, Leonardo Wanderley; da Silva, Karoline Evangelista; da Silva Evangelista, Deyverson; Almeida, Anna Alice; Silva, Priscila Oliveira Costa; Lucero, Jorge; Behlau, Mara

    2018-05-02

    To analyze the performance of a phonatory deviation diagram (PDD) in discriminating the presence and severity of voice deviation and the predominant voice quality of synthesized voices. A speech-language pathologist performed the auditory-perceptual analysis of the synthesized voice (n = 871). The PDD distribution of voice signals was analyzed according to area, quadrant, shape, and density. Differences in signal distribution regarding the PDD area and quadrant were detected when differentiating the signals with and without voice deviation and with different predominant voice quality. Differences in signal distribution were found in all PDD parameters as a function of the severity of voice disorder. The PDD area and quadrant can differentiate normal voices from deviant synthesized voices. There are differences in signal distribution in PDD area and quadrant as a function of the severity of voice disorder and the predominant voice quality. However, the PDD area and quadrant do not differentiate the signals as a function of severity of voice disorder and differentiated only the breathy and rough voices from the normal and strained voices. PDD density is able to differentiate only signals with moderate and severe deviation. PDD shape shows differences between signals with different severities of voice deviation. © 2018 S. Karger AG, Basel.

  15. Compact Acoustic Models for Embedded Speech Recognition

    Directory of Open Access Journals (Sweden)

    Lévy Christophe

    2009-01-01

    Speech recognition applications are known to require a significant amount of resources. However, embedded speech recognition allows only a few KB of memory, a few MIPS, and a small amount of training data. In order to fit the resource constraints of embedded applications, an approach based on a semicontinuous HMM system using state-independent acoustic modelling is proposed. A transformation is computed and applied to the global model in order to obtain each HMM state-dependent probability density function, so that only the transformation parameters need to be stored. This approach is evaluated on two tasks: digit and voice-command recognition. A fast adaptation technique of acoustic models is also proposed. In order to significantly reduce computational costs, the adaptation is performed only on the global model (using related speaker recognition adaptation techniques) with no need for state-dependent data. The whole approach results in a relative gain of more than 20% compared to a basic HMM-based system fitting the constraints.
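
    The memory saving of the semicontinuous approach comes from sharing one Gaussian codebook across all HMM states, so each state stores only mixture weights (or, as above, transformation parameters). A back-of-envelope parameter count (all sizes illustrative, not taken from the paper) shows the effect:

```python
def continuous_params(n_states, n_mix, dim):
    # fully continuous HMM: per state, n_mix Gaussians,
    # each with a mean and diagonal variance (2*dim values) plus a weight
    return n_states * n_mix * (2 * dim + 1)

def semicontinuous_params(n_states, codebook_size, dim):
    # semicontinuous HMM: one shared codebook of Gaussians,
    # plus only a weight vector per state
    return codebook_size * 2 * dim + n_states * codebook_size

full = continuous_params(n_states=3000, n_mix=8, dim=39)
semi = semicontinuous_params(n_states=3000, codebook_size=256, dim=39)
```

    With these illustrative sizes the semicontinuous model needs well under half the parameters, which is the kind of reduction embedded targets require.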

  16. Voicing children's critique and utopias

    DEFF Research Database (Denmark)

    Husted, Mia; Lind, Unni

    and restrictions, Call for aesthetics and sensuality, Longings for home and parents, Longings for better social relations Making children's voice visible allows preschool teachers to reflect children's knowledge and life world in pedagogical practice. Keywords: empowerment and participation, action research...... children to raise and render visible their own critique and wishes related to their everyday life in daycare. Research on how and why to engage children as participants in research and in institutional developments addresses overall interests in democratization and humanization that can be traced back...... to strategies for Nordic welfare developments and the Conventions on Children's Rights. The theoretical and methodological framework follows the lines of how to form and learn democracy of Lewin (1948) and Dewey (1916). The study is carried out as action research involving 50 children at age three to five...

  17. His Master’s Voice?

    DEFF Research Database (Denmark)

    Sörbom, Adrienne; Garsten, Christina

    This paper departs from an interest in the involvement of business leaders in the sphere of politics, in the broad sense. Many global business leaders today do much more than engage narrowly in their own corporation and its search for profit. At a general level, we are seeing a proliferation...... as political. What is the role of business in the World Economic Forum, and how do business corporations advance their interests through the WEF? The results show that corporations find a strategically positioned amplifier for their non-market interests in the WEF. The WEF functions to enhance and gain...... leverage for their ideas and priorities in a highly selective and resourceful environment. In the long run, both the market priorities and the political interests of business may be served by engagement in the WEF. However, the WEF cannot only be conceived as the extended voice of corporations. The WEF...

  18. Giving the Customer a Voice

    DEFF Research Database (Denmark)

    Van der Hoven, Christopher; Michea, Adela; Varnes, Claus

    , for example there are studies that have strongly criticized focus groups, interviews and surveys (e.g. Ulwick, 2002; Goffin et al, 2010; Sandberg, 2002). In particular, a point is made that, “…traditional market research and development approaches proved to be particularly ill-suited to breakthrough products...... the voice of the customer (VoC) through market research is well documented (Davis, 1993; Mullins and Sutherland, 1998; Cooper et al., 2002; Flint, 2002; Davilla et al., 2006; Cooper and Edgett, 2008; Cooper and Dreher, 2010; Goffin and Mitchell, 2010). However, not all research methods are well received......” (Deszca et al, 2010, p613). Therefore, in situations where traditional techniques - interviews and focus groups - are ineffective, the question is which market research techniques are appropriate, particularly for developing breakthrough products? To investigate this, an attempt was made to access...

  19. Dangertalk: Voices of abortion providers.

    Science.gov (United States)

    Martin, Lisa A; Hassinger, Jane A; Debbink, Michelle; Harris, Lisa H

    2017-07-01

    Researchers have described the difficulties of doing abortion work, including the psychosocial costs to individual providers. Some have discussed the self-censorship in which providers engage to protect themselves and the pro-choice movement. However, few have examined the costs of this self-censorship to public discourse and social movements in the US. Using qualitative data collected during abortion providers' discussions of their work, we explore the tensions between their narratives and pro-choice discourse, and examine the types of stories that are routinely silenced - narratives we name "dangertalk". Using these data, we theorize about the ways in which giving voice to these tensions might transform current abortion discourse by disrupting false dichotomies and better reflecting the complex realities of abortion. We present a conceptual model for dangertalk in abortion discourse, connecting it to functions of dangertalk in social movements more broadly. Copyright © 2017 Elsevier Ltd. All rights reserved.

  20. Mediatization: a concept, multiple voices

    Directory of Open Access Journals (Sweden)

    Pedro Gilberto GOMES

    2016-12-01

    Mediatization has increasingly become a key, fundamental concept for describing the present state and the history of media and the communicative change taking place. Media have become part of a whole; one can no longer see them as a separate sphere. In this perspective, mediatization is used as a concept to describe the process of expansion of the different technical media and to consider the interrelationships between communicative change, media and sociocultural change. However, although many researchers use the concept of mediatization, each gives it the meaning that best suits their needs. Thus, the concept of mediatization is treated with multiple voices. This paper discusses this problem and presents a preliminary position on the matter.

  1. Software Prototyping

    Science.gov (United States)

    Del Fiol, Guilherme; Hanseler, Haley; Crouch, Barbara Insley; Cummins, Mollie R.

    2016-01-01

    Summary Background Health information exchange (HIE) between Poison Control Centers (PCCs) and Emergency Departments (EDs) could improve care of poisoned patients. However, PCC information systems are not designed to facilitate HIE with EDs; therefore, we are developing specialized software to support HIE within the normal workflow of the PCC using user-centered design and rapid prototyping. Objective To describe the design of an HIE dashboard and the refinement of user requirements through rapid prototyping. Methods Using previously elicited user requirements, we designed low-fidelity sketches of designs on paper with iterative refinement. Next, we designed an interactive high-fidelity prototype and conducted scenario-based usability tests with end users. Users were asked to think aloud while accomplishing tasks related to a case vignette. After testing, the users provided feedback and evaluated the prototype using the System Usability Scale (SUS). Results Survey results from three users provided useful feedback that was then incorporated into the design. After achieving a stable design, we used the prototype itself as the specification for development of the actual software. Benefits of prototyping included 1) having subject-matter experts heavily involved with the design; 2) flexibility to make rapid changes; 3) the ability to minimize software development efforts early in the design stage; 4) rapid finalization of requirements; 5) early visualization of designs; and 6) a powerful vehicle for communication of the design to the programmers. Challenges included 1) time and effort to develop the prototypes and case scenarios; 2) no simulation of system performance; 3) not having all proposed functionality available in the final product; and 4) missing needed data elements in the PCC information system. PMID:27081404
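
    The SUS mentioned above has a fixed scoring rule: each of the 10 items is answered on a 1-5 scale, odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is multiplied by 2.5 to yield a 0-100 score. A minimal sketch:

```python
def sus_score(responses):
    """System Usability Scale score from 10 responses, each 1-5.
    Odd items contribute (r - 1), even items (5 - r); total * 2.5 -> 0-100."""
    assert len(responses) == 10
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5
```

    For example, a fully neutral respondent (all 3s) scores 50.0, the scale midpoint.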

  2. Disability: a voice in Australian bioethics?

    Science.gov (United States)

    Newell, Christopher

    2003-06-01

    The rise of research and advocacy over the years to establish a disability voice in Australia with regard to bioethical issues is explored. This includes an analysis of some of the political processes and engagement in mainstream bioethical debate. An understanding of the politics of rejected knowledge is vital in understanding the muted disability voices in Australian bioethics and public policy. It is also suggested that the voices of those who are marginalised or oppressed in society, such as people with disability, have a particular contribution to make in fostering critical bioethics.

  3. Unfamiliar voice identification: Effect of post-event information on accuracy and voice ratings

    Directory of Open Access Journals (Sweden)

    Harriet Mary Jessica Smith

    2014-04-01

    This study addressed the effect of misleading post-event information (PEI) on voice ratings, identification accuracy, and confidence, as well as the link between verbal recall and accuracy. Participants listened to a dialogue between male and female targets, then read misleading information about voice pitch. Participants engaged in verbal recall, rated voices on a feature checklist, and made a lineup decision. Accuracy rates were low, especially on target-absent lineups. Confidence and accuracy were unrelated, but the number of facts recalled about the voice predicted later lineup accuracy. There was a main effect of misinformation on ratings of target voice pitch, but there was no effect on identification accuracy or confidence ratings. As voice lineup evidence from earwitnesses is used in courts, the findings have potential applied relevance.

  4. Cross-cultural emotional prosody recognition: evidence from Chinese and British listeners.

    Science.gov (United States)

    Paulmann, Silke; Uskul, Ayse K

    2014-01-01

    This cross-cultural study of emotional tone of voice recognition tests the in-group advantage hypothesis (Elfenbein & Ambady, 2002) employing a quasi-balanced design. Individuals of Chinese and British background were asked to recognise pseudosentences produced by Chinese and British native speakers, displaying one of seven emotions (anger, disgust, fear, happiness, neutral tone of voice, sadness, and surprise). Findings reveal that emotional displays were recognised at rates higher than predicted by chance; however, members of each cultural group were more accurate in recognising the displays communicated by a member of their own cultural group than a member of the other cultural group. Moreover, the evaluation of error matrices indicates that both culture groups relied on similar mechanisms when recognising emotional displays from the voice. Overall, the study reveals evidence for both universal and culture-specific principles in vocal emotion recognition.
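
    Recognition rates in studies like this are typically read off a confusion (error) matrix and compared against the chance level, which is 1/7 for seven response categories. A minimal sketch with an invented 3-category matrix:

```python
def accuracy(confusion):
    """Overall accuracy from a square confusion matrix
    (rows = true category, columns = recognised category)."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

# toy 3-category confusion matrix, values invented for illustration
conf = [
    [8, 1, 1],
    [2, 7, 1],
    [1, 2, 7],
]
chance = 1 / len(conf)   # chance level with 3 equiprobable categories
acc = accuracy(conf)     # diagonal / total = 22/30
```

    The off-diagonal cells are what the error-matrix comparison in the abstract examines: similar confusion patterns across groups suggest shared recognition mechanisms.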

  5. Multistage Data Selection-based Unsupervised Speaker Adaptation for Personalized Speech Emotion Recognition

    NARCIS (Netherlands)

    Kim, Jaebok; Park, Jeong-Sik

    This paper proposes an efficient speech emotion recognition (SER) approach that utilizes personal voice data accumulated on personal devices. A representative weakness of conventional SER systems is the user-dependent performance induced by the speaker independent (SI) acoustic model framework. But,

  6. Music Information Retrieval from a Singing Voice Using Lyrics and Melody Information

    Directory of Open Access Journals (Sweden)

    Shozo Makino

    2007-01-01

    Recently, several music information retrieval (MIR) systems which retrieve musical pieces by the user's singing voice have been developed. All of these systems use only melody information for retrieval, although lyrics information is also useful for retrieval. In this paper, we propose a new MIR system that uses both lyrics and melody information. First, we propose a new lyrics recognition method. A finite state automaton (FSA) is used as recognition grammar, and about 86% retrieval accuracy was obtained. We also develop an algorithm for verifying a hypothesis output by a lyrics recognizer. Melody information is extracted from an input song using several pieces of information of the hypothesis, and a total score is calculated from the recognition score and the verification score. From the experimental results, 95.0% retrieval accuracy was obtained with a query consisting of five words.
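
    A finite state automaton used as a recognition grammar accepts exactly the word sequences it encodes, which constrains the lyrics recognizer to valid lines. A minimal sketch (the lyric line and transitions are invented for illustration):

```python
class FSA:
    """Deterministic finite state automaton over word tokens."""
    def __init__(self, transitions, start, accept):
        self.t, self.start, self.accept = transitions, start, accept

    def accepts(self, words):
        state = self.start
        for w in words:
            state = self.t.get((state, w))
            if state is None:          # no transition: sequence rejected
                return False
        return state in self.accept

# grammar for the (hypothetical) lyric line "twinkle twinkle little star"
t = {(0, "twinkle"): 1, (1, "twinkle"): 2, (2, "little"): 3, (3, "star"): 4}
fsa = FSA(t, start=0, accept={4})
```

    In a real system each arc would carry an acoustic model for the word, and the decoder would search for the best-scoring accepted path rather than a yes/no answer.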

  7. Music Information Retrieval from a Singing Voice Using Lyrics and Melody Information

    Directory of Open Access Journals (Sweden)

    Suzuki Motoyuki

    2007-01-01

    Recently, several music information retrieval (MIR) systems which retrieve musical pieces by the user's singing voice have been developed. All of these systems use only melody information for retrieval, although lyrics information is also useful for retrieval. In this paper, we propose a new MIR system that uses both lyrics and melody information. First, we propose a new lyrics recognition method. A finite state automaton (FSA) is used as recognition grammar, and about 86% retrieval accuracy was obtained. We also develop an algorithm for verifying a hypothesis output by a lyrics recognizer. Melody information is extracted from an input song using several pieces of information of the hypothesis, and a total score is calculated from the recognition score and the verification score. From the experimental results, 95.0% retrieval accuracy was obtained with a query consisting of five words.

  8. Connections between voice ergonomic risk factors in classrooms and teachers' voice production.

    Science.gov (United States)

    Rantala, Leena M; Hakala, Suvi; Holmqvist, Sofia; Sala, Eeva

    2012-01-01

    The aim of the study was to investigate whether voice ergonomic risk factors in classrooms correlated with acoustic parameters of teachers' voice production. The voice ergonomic risk factors in the fields of working culture, working postures and indoor air quality were assessed in 40 classrooms using the Voice Ergonomic Assessment in Work Environment - Handbook and Checklist. Teachers (32 females, 8 males) from the above-mentioned classrooms recorded text readings before and after a working day. Fundamental frequency, sound pressure level (SPL) and the slope of the spectrum (alpha ratio) were analyzed. The higher the number of risk factors in a classroom, the higher the SPL the teachers used and, for the males, the more strained their voices were (increased alpha ratio). The SPL was already higher before the working day in the teachers with higher risk than in those with lower risk. In working environments with many voice ergonomic risk factors, speakers increase voice loudness and, among males, use a more strained voice quality. A practical implication of the results is that voice ergonomic assessments are needed in schools. Copyright © 2013 S. Karger AG, Basel.
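
    The alpha ratio mentioned above compares spectral energy above and below a cutoff frequency (commonly around 1 kHz); less negative values indicate relatively more high-frequency energy, associated with a more pressed or strained voice. A minimal sketch on a toy spectrum (the cutoff and spectrum values are illustrative):

```python
import math

def alpha_ratio(freqs, power, cutoff=1000.0):
    """Alpha ratio in dB: level of the band above the cutoff
    relative to the band below it."""
    low = sum(p for f, p in zip(freqs, power) if f <= cutoff)
    high = sum(p for f, p in zip(freqs, power) if f > cutoff)
    return 10.0 * math.log10(high / low)

freqs = [250, 500, 750, 1000, 2000, 4000]   # Hz, toy spectrum bins
power = [1.0, 0.8, 0.6, 0.4, 0.1, 0.05]     # linear power per bin
ratio = alpha_ratio(freqs, power)            # negative for normal phonation
```

    An increase in this value across a working day, as reported for the male teachers, means the balance has shifted toward the high-frequency band.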

  9. [Applicability of Voice Handicap Index to the evaluation of voice therapy effectiveness in teachers].

    Science.gov (United States)

    Niebudek-Bogusz, Ewa; Kuzańska, Anna; Błoch, Piotr; Domańska, Maja; Woźnicka, Ewelina; Politański, Piotr; Sliwińska-Kowalska, Mariola

    2007-01-01

    The aim of this study was to assess the applicability of the Voice Handicap Index (VHI) to the evaluation of the effectiveness of functional voice disorder treatment in teachers. The subjects were 45 female teachers with functional dysphonia who evaluated their voice problems according to the subjective VHI scale before and after phoniatric management. Group I (29 patients) were subjected to vocal training, whereas group II (16 patients) received only voice hygiene instructions. The results demonstrated that differences in the mean VHI score before and after phoniatric treatment were significantly higher in group I than in group II (p < 0.05), which suggests that the VHI is applicable to the evaluation of voice therapy effectiveness in teacher's dysphonia.

  10. Influence of classroom acoustics on the voice levels of teachers with and without voice problems: a field study

    DEFF Research Database (Denmark)

    Pelegrin Garcia, David; Lyberg-Åhlander, Viveka; Rydell, Roland

    2010-01-01

    of the classroom. The results thus suggest that teachers with voice problems are more aware of classroom acoustic conditions than their healthy colleagues and make use of the more supportive rooms to lower their voice levels. This behavior may result from an adaptation process of the teachers with voice problems...... of the voice problems was made with a questionnaire and a laryngological examination. During teaching, the sound pressure level at the teacher’s position was monitored. The teacher’s voice level and the activity noise level were separated using mixed Gaussians. In addition, objective acoustic parameters...... of Reverberation Time and Voice Support were measured in the 30 empty classrooms of the study. An empirical model shows that the measured voice levels depended on the activity noise levels and the voice support. Teachers with and without voice problems were differently affected by the voice support...
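
    Separating the teacher's voice level from activity noise "using mixed Gaussians", as described above, can be illustrated with a two-component one-dimensional Gaussian mixture fitted by EM on synthetic dB data (all numbers invented; this is a sketch of the general technique, not the study's exact procedure):

```python
import math, random

def em_two_gaussians(data, iters=50):
    """EM for a 1-D two-component Gaussian mixture: returns means,
    variances and component weights."""
    lo, hi = min(data), max(data)
    mu = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]  # spread-out init
    var = [1.0, 1.0]
    w = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each sample
        resp = []
        for x in data:
            p = [w[k] / math.sqrt(2 * math.pi * var[k]) *
                 math.exp(-(x - mu[k]) ** 2 / (2 * var[k])) for k in (0, 1)]
            s = p[0] + p[1]
            resp.append((p[0] / s, p[1] / s))
        # M-step: re-estimate parameters from weighted samples
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2
                         for r, x in zip(resp, data)) / nk
            w[k] = nk / len(data)
    return mu, var, w

rng = random.Random(1)
noise = [rng.gauss(55, 2) for _ in range(500)]  # activity noise level, dB
voice = [rng.gauss(70, 3) for _ in range(500)]  # teacher's voice level, dB
mu, var, w = em_two_gaussians(noise + voice)
```

    The fitted component means recover the two underlying level distributions even though the input samples are unlabeled, which is what allows the voice and noise levels to be separated from a single sound-level recording.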

  11. Global Software Engineering: A Software Process Approach

    Science.gov (United States)

    Richardson, Ita; Casey, Valentine; Burton, John; McCaffery, Fergal

Our research has shown that many companies are struggling with the successful implementation of global software engineering, due to temporal, cultural and geographical distance, which causes a range of factors to come into play. For example, cultural, project management and communication difficulties continually cause problems for software engineers and project managers. While the implementation of efficient software processes can be used to improve the quality of the software product, published software process models do not cater explicitly for the recent growth in global software engineering. Our thesis is that global software engineering factors should be included in software process models to ensure their continued usefulness in global organisations. Based on extensive global software engineering research, we have developed a software process, Global Teaming, which includes specific practices and sub-practices. The purpose is to ensure that requirements for successful global software engineering are stipulated so that organisations can ensure successful implementation of global software engineering.

  12. Static human face recognition using artificial neural networks

    International Nuclear Information System (INIS)

    Qamar, R.; Shah, S.H.; Javed-ur-Rehman

    2003-01-01

This paper presents a novel method of human face recognition using digital computers. A digital PC camera is used to take the BMP images of the human faces. An artificial neural network using the Back Propagation Algorithm is developed as a recognition engine. The BMP images of the faces serve as the input patterns for this engine. A software application, 'Face Recognition', has been developed to recognize the human faces for which it is trained. Once the neural network is trained on patterns of the faces, the software is able to detect and recognize them with a success rate of about 97%. (author)
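The recognition engine in this record is a back-propagation network. As an illustrative stand-in (the paper's architecture, image preprocessing, and training data are not given in the abstract), here is a minimal back-propagation network trained on XOR patterns rather than BMP face images:

```python
import math
import random

random.seed(42)

# Toy stand-in for the paper's setup: a 2-2-1 network trained with plain
# back-propagation, with XOR patterns in place of face images.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
Y = [0, 1, 1, 0]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Weights: two hidden units (2 inputs + bias each), one output unit (2 hidden + bias)
w_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
w_o = [random.uniform(-1, 1) for _ in range(3)]

def forward(x):
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
    o = sigmoid(w_o[0] * h[0] + w_o[1] * h[1] + w_o[2])
    return h, o

def mse():
    return sum((forward(x)[1] - y) ** 2 for x, y in zip(X, Y)) / len(X)

def train(epochs=10000, lr=0.5):
    for _ in range(epochs):
        for x, y in zip(X, Y):
            h, o = forward(x)
            d_o = (o - y) * o * (1 - o)                        # output-layer delta
            d_h = [d_o * w_o[k] * h[k] * (1 - h[k]) for k in range(2)]
            for k in range(2):                                 # hidden -> output update
                w_o[k] -= lr * d_o * h[k]
            w_o[2] -= lr * d_o
            for k in range(2):                                 # input -> hidden update
                w_h[k][0] -= lr * d_h[k] * x[0]
                w_h[k][1] -= lr * d_h[k] * x[1]
                w_h[k][2] -= lr * d_h[k]

loss_before = mse()
train()
loss_after = mse()
```

The same delta-rule updates scale to image inputs by widening the input and hidden layers; the 97% success rate reported above is a property of the paper's trained system, not of this sketch.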

  13. Software system safety

    Science.gov (United States)

    Uber, James G.

    1988-01-01

    Software itself is not hazardous, but since software and hardware share common interfaces there is an opportunity for software to create hazards. Further, these software systems are complex, and proven methods for the design, analysis, and measurement of software safety are not yet available. Some past software failures, future NASA software trends, software engineering methods, and tools and techniques for various software safety analyses are reviewed. Recommendations to NASA are made based on this review.

  14. Former Auctioneer Finds Voice After Aphasia

    Science.gov (United States)

... Speech impairment changed his life. One unremarkable September ... 10 Tips for Communicating with Someone who has Aphasia: Talk to them in a quiet, calm, relaxed ...

  15. A model to explain human voice production

    Science.gov (United States)

    Vilas Bôas, C. S. N.; Gobara, S. T.

    2018-05-01

    This article presents a device constructed with low-cost material to demonstrate and explain voice production. It also provides a contextualized, interdisciplinary approach to introduce the study of sound waves.

  16. Software engineers and nuclear engineers: teaming up to do testing

    International Nuclear Information System (INIS)

    Kelly, D.; Cote, N.; Shepard, T.

    2007-01-01

    The software engineering community has traditionally paid little attention to the specific needs of engineers and scientists who develop their own software. Recently there has been increased recognition that specific software engineering techniques need to be found for this group of developers. In this case study, a software engineering group teamed with a nuclear engineering group to develop a software testing strategy. This work examines the types of testing that proved to be useful and examines what each discipline brings to the table to improve the quality of the software product. (author)

  17. Cognitive processing load during listening is reduced more by decreasing voice similarity than by increasing spatial separation between target and masker speech

    NARCIS (Netherlands)

    Zekveld, A.A.; Rudner, M.; Kramer, S.E.; Lyzenga, J.; Ronnberg, J.

    2014-01-01

    We investigated changes in speech recognition and cognitive processing load due to the masking release attributable to decreasing similarity between target and masker speech. This was achieved by using masker voices with either the same (female) gender as the target speech or different gender (male)

  18. Control of automated system with voice commands

    OpenAIRE

    Švara, Denis

    2012-01-01

In smart houses, contemporary achievements in the fields of automation, communications, security and artificial intelligence increase comfort and improve the quality of users' lives. For the purpose of this thesis, we developed a system for managing a smart house with voice commands via smart phone. We focused mostly on voice commands: we want to move from communicating with fingers (touch) to a more natural, human form of interaction, speech. We developed the entire chain of communication, by which t...

  19. Voice disorders in Nigerian primary school teachers.

    Science.gov (United States)

    Akinbode, R; Lam, K B H; Ayres, J G; Sadhra, S

    2014-07-01

    The prolonged use or abuse of voice may lead to vocal fatigue and vocal fold tissue damage. School teachers routinely use their voices intensively at work and are therefore at a higher risk of dysphonia. To determine the prevalence of voice disorders among primary school teachers in Lagos, Nigeria, and to explore associated risk factors. Teaching and non-teaching staff from 19 public and private primary schools completed a self-administered questionnaire to obtain information on personal lifestyles, work experience and environment, and voice disorder symptoms. Dysphonia was defined as the presence of at least one of the following: hoarseness, repetitive throat clearing, tired voice or straining to speak. A total of 341 teaching and 155 non-teaching staff participated. The prevalence of dysphonia in teachers was 42% compared with 18% in non-teaching staff. A significantly higher proportion of the teachers reported that voice symptoms had affected their ability to communicate effectively. School type (public/private) did not predict the presence of dysphonia. Statistically significant associations were found for regular caffeinated drink intake (odds ratio [OR] = 3.07; 95% confidence interval [CI]: 1.51-6.62), frequent upper respiratory tract infection (OR = 3.60; 95% CI: 1.39-9.33) and raised voice while teaching (OR = 10.1; 95% CI: 5.07-20.2). Nigerian primary school teachers were at risk for dysphonia. Important environment and personal factors were upper respiratory infection, the need to frequently raise the voice when teaching and regular intake of caffeinated drinks. Dysphonia was not associated with age or years of teaching. © The Author 2014. Published by Oxford University Press on behalf of the Society of Occupational Medicine. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
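The odds ratios and confidence intervals reported in this record follow the standard 2x2-table formulas (OR = ad/bc with a Wald confidence interval on the log scale). A sketch, using hypothetical cell counts since the abstract reports only the ratios:

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI for a 2x2 table:
    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)  # SE of log(OR)
    return or_, math.exp(math.log(or_) - z * se), math.exp(math.log(or_) + z * se)

# Hypothetical counts (not from the paper):
# 10 of 30 exposed vs. 5 of 45 unexposed report dysphonia
or_, lo, hi = odds_ratio_ci(10, 20, 5, 40)
```

An interval that excludes 1.0, as in the caffeinated-drink and raised-voice associations above, is what marks the association as statistically significant.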

  20. [THE APPLICATION OF SHORT-TERM EFFICIENCY ANALYSIS IN DIAGNOSING OCCUPATIONAL VOICE DISORDERS].

    Science.gov (United States)

    Niebudek-Bogusz, Ewa; Just, Marcin; Tyc, Michał; Wiktorowicz, Justyna; Morawska, Joanna; Śliwińska-Kowalska, Mariola

    2015-01-01

An objective determination of the range of vocal efficiency is rather difficult. The aim of the study was to assess the possibility of application of short-term acoustic efficiency analysis in diagnosing occupational voice disorders. The study covered 98 people (87 women and 11 men) diagnosed with occupational dysphonia through videostroboscopic examination. The control group comprised 100 people (81 women and 19 men) with normal voices. The short-term acoustic analysis was carried out by means of DiagnoScope software, including classical parameters (Jitter group, Shimmer group and the assessment of noise degree NHR), as well as new short-term efficiency parameters determined in a short time period during sustained phonation of the vowel "a." The results were then compared. Results: The values of all the examined classical parameters were considerably higher in the study group of pathological voices than in the control group of normal voices (p = 0.00). The aerodynamic parameter, maximum phonation time, was significantly shorter by over 0.5 s in the study group than in the control group. The majority of the acoustic efficiency parameters were also considerably worse in the study group of subjects with occupational dysphonia than in the control group (p = 0.00). Moreover, the correlation between the efficiency parameters and most of the classical acoustic parameters in the study group implies that for the voices with occupational pathology the decreased efficiency of the vocal apparatus is reflected in the acoustic voice structure. Efficiency parameters determined during short-term acoustic analysis can be an objective indicator of the decreased phonatory function of the larynx, useful in diagnosing occupational vocal pathology.

  1. The application of short-term efficiency analysis in diagnosing occupational voice disorders

    Directory of Open Access Journals (Sweden)

    Ewa Niebudek-Bogusz

    2015-06-01

Background: An objective determination of the range of vocal efficiency is rather difficult. The aim of the study was to assess the possibility of application of short-term acoustic efficiency analysis in diagnosing occupational voice disorders. Material and Methods: The study covered 98 people (87 women and 11 men) diagnosed with occupational dysphonia through videostroboscopic examination. The control group comprised 100 people (81 women and 19 men) with normal voices. The short-term acoustic analysis was carried out by means of DiagnoScope software, including classical parameters (Jitter group, Shimmer group and the assessment of noise degree NHR), as well as new short-term efficiency parameters determined in a short time period during sustained phonation of the vowel “a.” The results were then compared. Results: The values of all the examined classical parameters were considerably higher in the study group of pathological voices than in the control group of normal voices (p = 0.00). The aerodynamic parameter, maximum phonation time, was significantly shorter by over 0.5 s in the study group than in the control group. The majority of the acoustic efficiency parameters were also considerably worse in the study group of subjects with occupational dysphonia than in the control group (p = 0.00). Moreover, the correlation between the efficiency parameters and most of the classical acoustic parameters in the study group implies that for the voices with occupational pathology the decreased efficiency of the vocal apparatus is reflected in the acoustic voice structure. Conclusions: Efficiency parameters determined during short-term acoustic analysis can be an objective indicator of the decreased phonatory function of the larynx, useful in diagnosing occupational vocal pathology. Med Pr 2015;66(2):225–234
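The classical Jitter and Shimmer parameter groups used in this study are cycle-to-cycle perturbation measures. A minimal sketch of the common local variants is shown below; DiagnoScope's exact definitions are not given in the abstract, so these formulas are an assumption based on standard usage.

```python
def jitter_local(periods_ms):
    """Local jitter (%): mean absolute difference between consecutive
    glottal periods, relative to the mean period."""
    diffs = [abs(b - a) for a, b in zip(periods_ms, periods_ms[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(periods_ms) / len(periods_ms))

def shimmer_local(peak_amps):
    """Local shimmer (%): the same perturbation measure applied to
    cycle-to-cycle peak amplitudes."""
    diffs = [abs(b - a) for a, b in zip(peak_amps, peak_amps[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(peak_amps) / len(peak_amps))
```

For a sustained "a" vowel, elevated values of these measures relative to a normative range are what the abstract refers to as worse classical parameters in the pathological group.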

  2. The Suitability of Cloud-Based Speech Recognition Engines for Language Learning

    Science.gov (United States)

    Daniels, Paul; Iwago, Koji

    2017-01-01

As online automatic speech recognition (ASR) engines become more accurate and more widely implemented with computer-assisted language learning (CALL) software, it becomes important to evaluate the effectiveness and the accuracy of these recognition engines using authentic speech samples. This study investigates two of the most prominent cloud-based speech recognition engines--Apple's…

  3. Voicing Others’ Voices: Spotlighting the Researcher as Narrator

    Directory of Open Access Journals (Sweden)

    Dan O’SULLIVAN

    2015-12-01

As qualitative research undertakings are not independent of the researcher, the “indissoluble interrelationship between interpreter and interpretation” (Thomas & James, 2006, p. 782) renders it necessary for researchers to understand that their text is a representation, a version of the truth that is the product of writerly choices, and that it is discursive. Endlessly creative, artistic and political, as there is no single interpretative truth, the interpretative process facilitates the refashioning of representations, the remaking of choices and the probing of discourses. As a consequence of the particularity of any researcher’s account, issues pertaining to researcher identity and authorial stance always remain central to research endeavours (Kamler & Thomson, 2006, p. 68; Denzin & Lincoln, 2011, pp. 14-15). Therefore, researchers are encouraged to be reflexive about their analyses and research accounts (Elliott, 2005, p. 152), as reflexivity helps spotlight the role of the researcher as narrator. In turn, spotlighting the researcher as narrator foregrounds a range of complex issues about voice, representation and interpretive authority (Chase, 2005, p. 657; Genishi & Glupczynski, 2006, p. 671; Eisenhart, 2006). In essence, therefore, this paper is reflective of the challenges of “doing” qualitative research in educational settings. Its particular focus, the shaping of beginning primary teachers’ identities in Ireland throughout the course of their initial year of occupational experience, post-graduation, endeavours to highlight issues pertaining to the researcher as narrator (O’Sullivan, 2014).

  4. Voicing others’ voices: Spotlighting the researcher as narrator

    Directory of Open Access Journals (Sweden)

    Dan O'Sullivan

    2015-09-01

As qualitative research undertakings are not independent of the researcher, the “indissoluble interrelationship between interpreter and interpretation” (Thomas & James, 2006, p. 782) renders it necessary for researchers to understand that their text is a representation, a version of the truth that is the product of writerly choices, and that it is discursive. Endlessly creative, artistic and political, as there is no single interpretative truth, the interpretative process facilitates the refashioning of representations, the remaking of choices and the probing of discourses. As a consequence of the particularity of any researcher’s account, issues pertaining to researcher identity and authorial stance always remain central to research endeavours (Kamler & Thomson, 2006, p. 68; Denzin & Lincoln, 2011, pp. 14-15). Therefore, researchers are encouraged to be reflexive about their analyses and research accounts (Elliott, 2005, p. 152), as reflexivity helps spotlight the role of the researcher as narrator. In turn, spotlighting the researcher as narrator foregrounds a range of complex issues about voice, representation and interpretive authority (Chase, 2005, p. 657; Genishi & Glupczynski, 2006, p. 671; Eisenhart, 2006). In essence, therefore, this paper is reflective of the challenges of “doing” qualitative research in educational settings. Its particular focus, the shaping of beginning primary teachers’ identities in Ireland throughout the course of their initial year of occupational experience, post-graduation, endeavours to highlight issues pertaining to the researcher as narrator (O’Sullivan, 2014).

  5. Voice pitch influences perceptions of sexual infidelity.

    Science.gov (United States)

    O'Connor, Jillian J M; Re, Daniel E; Feinberg, David R

    2011-02-28

    Sexual infidelity can be costly to members of both the extra-pair and the paired couple. Thus, detecting infidelity risk is potentially adaptive if it aids in avoiding cuckoldry or loss of parental and relationship investment. Among men, testosterone is inversely related to voice pitch, relationship and offspring investment, and is positively related to the pursuit of short-term relationships, including extra-pair sex. Among women, estrogen is positively related to voice pitch, attractiveness, and the likelihood of extra-pair involvement. Although prior work has demonstrated a positive relationship between men's testosterone levels and infidelity, this study is the first to investigate attributions of infidelity as a function of sexual dimorphism in male and female voices. We found that men attributed high infidelity risk to feminized women's voices, but not significantly more often than did women. Women attributed high infidelity risk to masculinized men's voices at significantly higher rates than did men. These data suggest that voice pitch is used as an indicator of sexual strategy in addition to underlying mate value. The aforementioned attributions may be adaptive if they prevent cuckoldry and/or loss of parental and relationship investment via avoidance of partners who may be more likely to be unfaithful.

  6. Voice Pitch Influences Perceptions of Sexual Infidelity

    Directory of Open Access Journals (Sweden)

    Jillian J.M. O'Connor

    2011-01-01

Sexual infidelity can be costly to members of both the extra-pair and the paired couple. Thus, detecting infidelity risk is potentially adaptive if it aids in avoiding cuckoldry or loss of parental and relationship investment. Among men, testosterone is inversely related to voice pitch, relationship and offspring investment, and is positively related to the pursuit of short-term relationships, including extra-pair sex. Among women, estrogen is positively related to voice pitch, attractiveness, and the likelihood of extra-pair involvement. Although prior work has demonstrated a positive relationship between men's testosterone levels and infidelity, this study is the first to investigate attributions of infidelity as a function of sexual dimorphism in male and female voices. We found that men attributed high infidelity risk to feminized women's voices, but not significantly more often than did women. Women attributed high infidelity risk to masculinized men's voices at significantly higher rates than did men. These data suggest that voice pitch is used as an indicator of sexual strategy in addition to underlying mate value. The aforementioned attributions may be adaptive if they prevent cuckoldry and/or loss of parental and relationship investment via avoidance of partners who may be more likely to be unfaithful.

  7. Multivariate sensitivity to voice during auditory categorization.

    Science.gov (United States)

    Lee, Yune Sang; Peelle, Jonathan E; Kraemer, David; Lloyd, Samuel; Granger, Richard

    2015-09-01

    Past neuroimaging studies have documented discrete regions of human temporal cortex that are more strongly activated by conspecific voice sounds than by nonvoice sounds. However, the mechanisms underlying this voice sensitivity remain unclear. In the present functional MRI study, we took a novel approach to examining voice sensitivity, in which we applied a signal detection paradigm to the assessment of multivariate pattern classification among several living and nonliving categories of auditory stimuli. Within this framework, voice sensitivity can be interpreted as a distinct neural representation of brain activity that correctly distinguishes human vocalizations from other auditory object categories. Across a series of auditory categorization tests, we found that bilateral superior and middle temporal cortex consistently exhibited robust sensitivity to human vocal sounds. Although the strongest categorization was in distinguishing human voice from other categories, subsets of these regions were also able to distinguish reliably between nonhuman categories, suggesting a general role in auditory object categorization. Our findings complement the current evidence of cortical sensitivity to human vocal sounds by revealing that the greatest sensitivity during categorization tasks is devoted to distinguishing voice from nonvoice categories within human temporal cortex. Copyright © 2015 the American Physiological Society.

  8. Voice Quality in Mobile Telecommunication System

    Directory of Open Access Journals (Sweden)

    Evaldas Stankevičius

    2013-05-01

The article deals with methods for measuring the quality of voice transmitted over the mobile network, as well as related problems, algorithms and options. It presents the created voice quality measurement system and discusses its adequacy as well as efficiency. The author also presents the results of system application under the optimal hardware configuration. Under almost ideal conditions, the system evaluates the voice quality with an average MOS estimate of 3.85, while the standardized TEMS Investigation 9.0 has an average MOS estimate of 4.05. Next, the article presents the discussion of voice quality predictor implementation and investigates the predictor using nonlinear and linear methods for predicting voice quality from the mobile network settings. Nonlinear prediction using an artificial neural network resulted in a correlation coefficient of 0.62, while linear prediction using least mean squares resulted in a correlation coefficient of 0.57. The analytical expression of voice quality as a function of the three network parameters (BER, C/I, RSSI) is given as well. Article in Lithuanian.

  9. Vocal parameters and voice-related quality of life in adult women with and without ovarian function.

    Science.gov (United States)

    Ferraz, Pablo Rodrigo Rocha; Bertoldo, Simão Veras; Costa, Luanne Gabrielle Morais; Serra, Emmeliny Cristini Nogueira; Silva, Eduardo Magalhães; Brito, Luciane Maria Oliveira; Chein, Maria Bethânia da Costa

    2013-05-01

To identify the perceptual and acoustic parameters of voice in adult women with and without ovarian function and their impact on voice-related quality of life. Cross-sectional and analytical study with 106 women divided into two groups: G1, with ovarian function (n=43), and G2, without physiological ovarian function (n=63). The women were instructed to sustain the vowel "a" and the sounds /s/ and /z/ at habitual pitch and loudness. They were also asked to classify their voices and answer the voice-related quality of life (V-RQOL) questionnaire. The perceptual analysis of the vocal samples was performed by three speech-language pathologists using the GRBASI (G: grade; R: roughness; B: breathiness; A: asthenia; S: strain; I: instability) scale. The acoustic analysis was carried out with the software VoxMetria 2.7h (CTS Informatica). The data were analyzed using descriptive statistics. In the perceptual analysis, both groups showed a mild deviation for the parameters roughness, strain, and instability, but only G2 showed a mild impact for the overall degree of dysphonia. The mean fundamental frequency was significantly lower in G2, with a difference of 17.41 Hz between the two groups. There was no impact on any of the V-RQOL domains for this group. With menopause, there is a change in women's voices, affecting some voice parameters; however, there is no direct impact on voice-related quality of life. Copyright © 2013 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  10. Voice Use Among Music Theory Teachers: A Voice Dosimetry and Self-Assessment Study.

    Science.gov (United States)

    Schiller, Isabel S; Morsomme, Dominique; Remacle, Angélique

    2017-07-25

    This study aimed (1) to investigate music theory teachers' professional and extra-professional vocal loading and background noise exposure, (2) to determine the correlation between vocal loading and background noise, and (3) to determine the correlation between vocal loading and self-evaluation data. Using voice dosimetry, 13 music theory teachers were monitored for one workweek. The parameters analyzed were voice sound pressure level (SPL), fundamental frequency (F0), phonation time, vocal loading index (VLI), and noise SPL. Spearman correlation was used to correlate vocal loading parameters (voice SPL, F0, and phonation time) and noise SPL. Each day, the subjects self-assessed their voice using visual analog scales. VLI and self-evaluation data were correlated using Spearman correlation. Vocal loading parameters and noise SPL were significantly higher in the professional than in the extra-professional environment. Voice SPL, phonation time, and female subjects' F0 correlated positively with noise SPL. VLI correlated with self-assessed voice quality, vocal fatigue, and amount of singing and speaking voice produced. Teaching music theory is a profession with high vocal demands. More background noise is associated with increased vocal loading and may indirectly increase the risk for voice disorders. Correlations between VLI and self-assessments suggest that these teachers are well aware of their vocal demands and feel their effect on voice quality and vocal fatigue. Visual analog scales seem to represent a useful tool for subjective vocal loading assessment and associated symptoms in these professional voice users. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
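The Spearman correlations this study uses to relate vocal loading, noise, and self-assessment scores rank both variables and then correlate the ranks. A self-contained sketch (average ranks for ties), for illustration only:

```python
def average_ranks(xs):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den
```

Rank-based correlation is the natural choice here because visual-analog-scale self-ratings are ordinal rather than interval data.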

  11. Free Software for Disorders of Human Communication

    Directory of Open Access Journals (Sweden)

    William Ricardo Rodríguez Dueñas

    2015-05-01

Introduction: New technologies are increasingly used by the health sector for implementation in therapeutic interventions. However, many free software-based tools that could support the daily work of speech therapists remain unknown to them. This paper summarizes fourteen free software-based tools that can support interventions in early stimulation, assessment and control of voice and speech, several resources for augmentative and alternative communication, and tools that facilitate access to the computer. Materials and methods: The information presented here is the result of a general review of software-based tools designed to treat human communication disorders. Criteria for inclusion and exclusion were established to select tools, and these were installed and tested. Results: 22 tools were found, of which 14 were selected and classified into these categories: early stimulation and attention capture, acoustic signal processing of voice, speech processing, augmentative and alternative communication, and other; the latter includes tools for access to the computer without the need for advanced computer skills. Discussion: The set of tools discussed in this paper provides therapists with free computer-based tools to support their interventions and, additionally, promotes the improvement of computer skills that are so necessary for today's professionals.

  12. Evaluation of Speech Recognition of Cochlear Implant Recipients Using Adaptive, Digital Remote Microphone Technology and a Speech Enhancement Sound Processing Algorithm.

    Science.gov (United States)

    Wolfe, Jace; Morais, Mila; Schafer, Erin; Agrawal, Smita; Koch, Dawn

    2015-05-01

Cochlear implant recipients often experience difficulty with understanding speech in the presence of noise. Cochlear implant manufacturers have developed sound processing algorithms designed to improve speech recognition in noise, and research has shown these technologies to be effective. Remote microphone technology utilizing adaptive, digital wireless radio transmission has also been shown to provide significant improvement in speech recognition in noise. There are no studies examining the potential improvement in speech recognition in noise when these two technologies are used simultaneously. The goal of this study was to evaluate the potential benefits and limitations associated with the simultaneous use of a sound processing algorithm designed to improve performance in noise (Advanced Bionics ClearVoice) and a remote microphone system that incorporates adaptive, digital wireless radio transmission (Phonak Roger). A two-by-two repeated measures design was used to examine performance differences obtained without these technologies compared to the use of each technology separately as well as the simultaneous use of both technologies. Eleven Advanced Bionics (AB) cochlear implant recipients, ages 11 to 68 yr. AzBio sentence recognition was measured in quiet and in the presence of classroom noise ranging in level from 50 to 80 dBA in 5-dB steps. Performance was evaluated in four conditions: (1) no ClearVoice and no Roger, (2) ClearVoice enabled without the use of Roger, (3) ClearVoice disabled with Roger enabled, and (4) simultaneous use of ClearVoice and Roger. Speech recognition in quiet was better than speech recognition in noise for all conditions. Use of ClearVoice and Roger each provided significant improvement in speech recognition in noise. The best performance in noise was obtained with the simultaneous use of ClearVoice and Roger. ClearVoice and Roger technology each improves speech recognition in noise, particularly when used at the same time.

  13. Use of digital speech recognition in diagnostics radiology

    International Nuclear Information System (INIS)

    Arndt, H.; Stockheim, D.; Mutze, S.; Petersein, J.; Gregor, P.; Hamm, B.

    1999-01-01

Purpose: Applicability and benefits of digital speech recognition in diagnostic radiology were tested using the speech recognition system SP 6000. Methods: The speech recognition system SP 6000 was integrated into the network of the institute and connected to the existing Radiological Information System (RIS). Three subjects used this system for writing 2305 findings from dictation. After the recognition process, the date, length of dictation, time required for checking/correction, kind of examination and error rate were recorded for every dictation. With the same subjects, a correlation was performed with 625 conventionally written findings. Results: After a 1-hour initial training, the average error rates were 8.4 to 13.3%. The first adaptation of the speech recognition system (after nine days) decreased the average error rates to 2.4 to 10.7% owing to the ability of the program to learn. The 2nd and 3rd adaptations resulted in only small changes of the error rate. An individual comparison of error rate developments for the same kind of examination showed that the error rate was relatively independent of the individual user. Conclusion: The results show that the speech recognition system SP 6000 can be regarded as an advantageous alternative for quickly recording radiological findings. A comparison between manually writing and dictating the findings verifies the individual differences in writing speeds and shows the advantage of applying voice recognition compared with normal keyboard performance. (orig.)
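Error rates like those reported in this record (and in the head article's errors per hundred words) can be computed from a word-level edit distance between the recognizer output and a reference transcript. A sketch follows; the studies' exact counting rules are not stated, so standard Levenshtein counting is an assumption here.

```python
def errors_per_hundred_words(reference, hypothesis):
    """Word-level Levenshtein distance (substitutions, insertions,
    deletions) between a reference transcript and recognizer output,
    scaled to errors per hundred reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))       # distance from empty reference
    for i, rw in enumerate(ref, 1):
        cur = [i]
        for j, hw in enumerate(hyp, 1):
            cost = 0 if rw == hw else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution / match
        prev = cur
    return 100.0 * prev[-1] / len(ref)
```

The same quantity is usually reported as word error rate (WER); dividing by the reference length and scaling by 100 gives the per-hundred-words figure used in these studies.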

  14. Automatic determination of pathological voice transformation coefficients for TD-PSOLA using neural network

    International Nuclear Information System (INIS)

    Belgacem, H.; Cherif, A.

    2011-01-01

One of the biggest challenges in vocal transformation with the TD-PSOLA technique is the selection of the modified parameters that will make a successful speech resynthesis. The best selection methods rely on human raters. This study focuses on the automatic determination of pathological voice transformation coefficients using an artificial neural network, comparing the results to the previous manual work. Four characteristic parameters (RATA-PLP, Jitter, Shimmer and RAP) were chosen. The system, developed with supervised training, consists of a recognition stage (neural network) feeding a synthesis stage (TD-PSOLA). The experimental results show that the parameter sets selected by the proposed system can be successfully used for resynthesis, demonstrating that the system can assist in the transformation of pathological voices.

  15. Using voice input and audio feedback to enhance the reality of a virtual experience

    Energy Technology Data Exchange (ETDEWEB)

    Miner, N.E.

    1994-04-01

    Virtual Reality (VR) is a rapidly emerging technology which allows participants to experience a virtual environment through stimulation of the participant's senses. Intuitive and natural interactions with the virtual world help to create a realistic experience. Typically, a participant is immersed in a virtual environment through the use of a 3-D viewer. Realistic, computer-generated environment models and accurate tracking of a participant's view are important factors for adding realism to a virtual experience. Stimulating a participant's sense of sound and providing a natural form of communication for interacting with the virtual world are equally important. This paper discusses the advantages and importance of incorporating voice recognition and audio feedback capabilities into a virtual world experience. Various approaches and levels of complexity are discussed. Examples of the use of voice and sound are presented through the description of a research application developed in the VR laboratory at Sandia National Laboratories.

  16. Emotional voice processing: investigating the role of genetic variation in the serotonin transporter across development.

    Directory of Open Access Journals (Sweden)

    Tobias Grossmann

    The ability to effectively respond to emotional information carried in the human voice plays a pivotal role for social interactions. We examined how genetic factors, especially the serotonin transporter genetic variation (5-HTTLPR, affect the neurodynamics of emotional voice processing in infants and adults by measuring event-related brain potentials (ERPs. The results revealed that infants distinguish between emotions during an early perceptual processing stage, whereas adults recognize and evaluate the meaning of emotions during later semantic processing stages. While infants do discriminate between emotions, only in adults was genetic variation associated with neurophysiological differences in how positive and negative emotions are processed in the brain. This suggests that genetic association with neurocognitive functions emerges during development, emphasizing the role that variation in serotonin plays in the maturation of brain systems involved in emotion recognition.

  17. Face Detection and Recognition

    National Research Council Canada - National Science Library

    Jain, Anil K

    2004-01-01

    This report describes research efforts towards developing algorithms for a robust face recognition system to overcome many of the limitations found in existing two-dimensional facial recognition systems...

  18. Graphical symbol recognition

    OpenAIRE

    K.C. , Santosh; Wendling , Laurent

    2015-01-01

    The chapter focuses on one of the key issues in document image processing i.e., graphical symbol recognition. Graphical symbol recognition is a sub-field of a larger research domain: pattern recognition. The chapter covers several approaches (i.e., statistical, structural and syntactic) and specially designed symbol recognition techniques inspired by real-world industrial problems. It, in general, contains research problems, state-of-the-art methods that convey basic s...

  19. Updating signal typing in voice: addition of type 4 signals.

    Science.gov (United States)

    Sprecher, Alicia; Olszewski, Aleksandra; Jiang, Jack J; Zhang, Yu

    2010-06-01

    The addition of a fourth type of voice to Titze's voice classification scheme is proposed. This fourth voice type is characterized by primarily stochastic noise behavior and is therefore unsuitable for both perturbation and correlation dimension analysis. Forty voice samples were classified into the proposed four types using narrowband spectrograms. Acoustic, perceptual, and correlation dimension analyses were completed for all voice samples. Perturbation measures tended to increase with voice type. Based on reliability cutoffs, the type 1 and type 2 voices were considered suitable for perturbation analysis. Measures of unreliability were higher for type 3 and 4 voices. Correlation dimension analyses increased significantly with signal type as indicated by a one-way analysis of variance. Notably, correlation dimension analysis could not quantify the type 4 voices. The proposed fourth voice type represents a subset of voices dominated by noise behavior. Current measures capable of evaluating type 4 voices provide only qualitative data (spectrograms, perceptual analysis, and an infinite correlation dimension). Type 4 voices are highly complex and the development of objective measures capable of analyzing these voices remains a topic of future investigation.
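The correlation dimension analysis mentioned above is commonly estimated with the Grassberger-Procaccia method: count the fraction of point pairs closer than a radius r and fit the slope of log C(r) versus log r. A minimal sketch, with an illustrative (assumed) choice of radius range:

```python
import numpy as np

def correlation_dimension(points, n_r=10):
    """Grassberger-Procaccia estimate: slope of log C(r) vs log r,
    where C(r) is the fraction of point pairs closer than r."""
    pts = np.asarray(points, dtype=float)
    # all pairwise Euclidean distances (upper triangle, no diagonal)
    diffs = pts[:, None, :] - pts[None, :, :]
    d = np.sqrt((diffs ** 2).sum(axis=-1))
    dist = d[np.triu_indices(len(pts), k=1)]
    # radii spanning the small-distance scaling region (assumed range)
    r = np.geomspace(np.percentile(dist, 2), np.percentile(dist, 30), n_r)
    C = np.array([(dist < ri).mean() for ri in r])
    slope, _ = np.polyfit(np.log(r), np.log(C), 1)
    return slope
```

For a noise-dominated (type 4) signal the estimate fails to converge as the embedding dimension grows, which is exactly the behavior the record describes.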

  20. Altered emotional recognition and expression in patients with Parkinson’s disease

    Directory of Open Access Journals (Sweden)

    Jin Y

    2017-11-01

    Yazhou Jin,* Zhiqi Mao,* Zhipei Ling, Xin Xu, Zhiyuan Zhang, Xinguang Yu Department of Neurosurgery, People’s Liberation Army General Hospital, Beijing, People’s Republic of China *These authors contributed equally to this work Background: Parkinson’s disease (PD) patients exhibit deficits in emotional recognition and expression abilities, including emotional faces and voices. The aim of this study was to explore emotional processing in pre-deep brain stimulation (pre-DBS) PD patients using two sensory modalities (visual and auditory). Methods: Fifteen PD patients who needed DBS surgery and 15 healthy, age- and gender-matched controls were recruited as participants. All participants were assessed by the Karolinska Directed Emotional Faces database 50 Faces Recognition test. Vocal recognition was evaluated by the Montreal Affective Voices database 50 Voices Recognition test. For emotional facial expression, the participants were asked to imitate five basic emotions (neutral, happiness, anger, fear, and sadness). The subjects were also required to express nonverbal vocalizations of the five basic emotions. Fifteen Chinese native speakers were recruited as decoders. We recorded the accuracy of the responses, reaction time, and confidence level. Results: For emotional recognition and expression, the PD group scored lower on both facial and vocal emotional processing than did the healthy control group. There were significant differences between the two groups in both reaction time and confidence level. A significant relationship was also found between emotional recognition and emotional expression when considering all participants of the two groups together. Conclusion: The PD group exhibited poorer performance on both the recognition and expression tasks. Facial emotion deficits and vocal emotion abnormalities were associated with each other. In addition, our data allow us to speculate that emotional recognition and expression may share a common ...

  1. Analysis of failure of voice production by a sound-producing voice prosthesis

    NARCIS (Netherlands)

    van der Torn, M.; van Gogh, C.D.L.; Verdonck-de Leeuw, I M; Festen, J.M.; Mahieu, H.F.

    OBJECTIVE: To analyse the cause of failing voice production by a sound-producing voice prosthesis (SPVP). METHODS: The functioning of a prototype SPVP is described in a female laryngectomee before and after its sound-producing mechanism was impeded by tracheal phlegm. This assessment included:

  2. Sandia software guidelines: Software quality planning

    Energy Technology Data Exchange (ETDEWEB)

    1987-08-01

    This volume is one in a series of Sandia Software Guidelines intended for use in producing quality software within Sandia National Laboratories. In consonance with the IEEE Standard for Software Quality Assurance Plans, this volume identifies procedures to follow in producing a Software Quality Assurance Plan for an organization or a project, and provides an example project SQA plan. 2 figs., 4 tabs.

  3. Avoidable Software Procurements

    Science.gov (United States)

    2012-09-01

    software license, software usage, ELA, Software as a Service, SaaS, Software Asset... PaaS Platform as a Service; SaaS Software as a Service; SAM Software Asset Management; SMS System Management Server; SEWP Solutions for Enterprise Wide... With the delivery of full Cloud Services, we will see the transition of the Cloud Computing service model from IaaS to SaaS, or Software as a Service.

  4. Interactive Augmentation of Voice Quality and Reduction of Breath Airflow in the Soprano Voice.

    Science.gov (United States)

    Rothenberg, Martin; Schutte, Harm K

    2016-11-01

    In 1985, at a conference sponsored by the National Institutes of Health, Martin Rothenberg first described a form of nonlinear source-tract acoustic interaction mechanism that some sopranos, singing in their high range, can use to reduce the total airflow, allowing a note to be held longer while simultaneously enriching the quality of the voice, without straining it. (M. Rothenberg, "Source-Tract Acoustic Interaction in the Soprano Voice and Implications for Vocal Efficiency," Fourth International Conference on Vocal Fold Physiology, New Haven, Connecticut, June 3-6, 1985.) In this paper, we describe additional evidence for this type of nonlinear source-tract interaction in some soprano singing and describe an analogous interaction phenomenon in communication engineering. We also present some implications for voice research and pedagogy. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  5. Silent Speech Recognition as an Alternative Communication Device for Persons with Laryngectomy.

    Science.gov (United States)

    Meltzner, Geoffrey S; Heaton, James T; Deng, Yunbin; De Luca, Gianluca; Roy, Serge H; Kline, Joshua C

    2017-12-01

    Each year thousands of individuals require surgical removal of their larynx (voice box) due to trauma or disease, and thereby require an alternative voice source or assistive device to verbally communicate. Although natural voice is lost after laryngectomy, most muscles controlling speech articulation remain intact. Surface electromyographic (sEMG) activity of speech musculature can be recorded from the neck and face, and used for automatic speech recognition to provide speech-to-text or synthesized speech as an alternative means of communication. This is true even when speech is mouthed or spoken in a silent (subvocal) manner, making it an appropriate communication platform after laryngectomy. In this study, 8 individuals at least 6 months after total laryngectomy were recorded using 8 sEMG sensors on their face (4) and neck (4) while reading phrases constructed from a 2,500-word vocabulary. A unique set of phrases was used for training phoneme-based recognition models for each of the 39 commonly used phonemes in English, and the remaining phrases were used for testing word recognition of the models based on phoneme identification from running speech. Word error rates were on average 10.3% for the full 8-sensor set (averaging 9.5% for the top 4 participants), and 13.6% when reducing the sensor set to 4 locations per individual (n=7). This study provides a compelling proof-of-concept for sEMG-based alaryngeal speech recognition, with the strong potential to further improve recognition performance.
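Word error rates such as those reported above are conventionally computed as the word-level edit distance (substitutions + insertions + deletions) between reference and recognized text, divided by the reference length. A minimal sketch:

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level Levenshtein distance / number of reference words."""
    r, h = reference.split(), hypothesis.split()
    # single-row dynamic programming over the edit-distance table
    d = list(range(len(h) + 1))
    for i, rw in enumerate(r, start=1):
        prev_diag = d[0]
        d[0] = i
        for j, hw in enumerate(h, start=1):
            tmp = d[j]
            d[j] = min(d[j] + 1,                 # deletion
                       d[j - 1] + 1,             # insertion
                       prev_diag + (rw != hw))   # substitution or match
            prev_diag = tmp
    return d[-1] / len(r)
```

For example, a hypothesis with one substitution and one deletion against a four-word reference yields a WER of 0.5.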

  6. Software engineering architecture-driven software development

    CERN Document Server

    Schmidt, Richard F

    2013-01-01

    Software Engineering: Architecture-driven Software Development is the first comprehensive guide to the underlying skills embodied in the IEEE's Software Engineering Body of Knowledge (SWEBOK) standard. Standards expert Richard Schmidt explains the traditional software engineering practices recognized for developing projects for government or corporate systems. Software engineering education often lacks standardization, with many institutions focusing on implementation rather than design as it impacts product architecture. Many graduates join the workforce with incomplete skills...

  7. FPGA-Based Implementation of Lithuanian Isolated Word Recognition Algorithm

    Directory of Open Access Journals (Sweden)

    Tomyslav Sledevič

    2013-05-01

    The paper describes the FPGA-based implementation of a Lithuanian isolated word recognition algorithm. An FPGA is selected for parallel process implementation using VHDL to ensure fast signal processing at a low-rate clock signal. Cepstrum analysis was applied for feature extraction from voice. The dynamic time warping algorithm was used to compare the vectors of cepstrum coefficients. A library of features for 100 words was created and stored in the internal FPGA BRAM memory. Experimental testing with speaker-dependent records demonstrated a recognition rate of 94%. A recognition rate of 58% was achieved for speaker-independent records. Calculation of cepstrum coefficients lasted 8.52 ms at a 50 MHz clock, while 100 DTWs took 66.56 ms at a 25 MHz clock. (Article in Lithuanian)
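The dynamic time warping comparison of cepstrum-coefficient vectors described above can be sketched in a few lines; this is a plain O(nm) software formulation, not the parallel FPGA implementation:

```python
import numpy as np

def dtw_distance(a, b):
    """DTW distance between two feature sequences, where each row is
    one frame of cepstrum coefficients."""
    a, b = np.atleast_2d(a), np.atleast_2d(b)
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible predecessor paths
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Recognition then amounts to picking the stored word template with the smallest DTW distance to the input utterance; DTW lets utterances of different lengths match, since the warping path can absorb stretched or repeated frames.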

  8. A system of automatic speaker recognition on a minicomputer

    International Nuclear Information System (INIS)

    El Chafei, Cherif

    1978-01-01

    This study describes a system of automatic speaker recognition using the pitch of the voice. The pre-processing consists of extracting the speakers' discriminating characteristics from the pitch. The recognition program first performs a preselection and then calculates the distance between the characteristics of the speaker to be recognized and those of the speakers already recorded. A recognition experiment was carried out with 15 speakers and included 566 tests spread over an intermittent period of four months. The discriminating characteristics used offer several interesting qualities. The algorithms for measuring the characteristics on one hand, and for classifying the speakers on the other, are simple. The results obtained in real time on a minicomputer are satisfactory. Furthermore, they could probably be improved by considering other discriminating speaker characteristics, but this was unfortunately not possible within this work. (author)
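The distance-based matching step described in the record can be illustrated as nearest-neighbour classification over pitch-derived feature vectors; the speaker names and the choice of features below (mean F0 and its standard deviation) are hypothetical:

```python
import numpy as np

def identify_speaker(enrolled, probe_features):
    """Return the enrolled speaker whose stored feature vector lies
    closest (Euclidean distance) to the probe's features."""
    names = list(enrolled)
    feats = np.array([enrolled[n] for n in names], dtype=float)
    dists = np.linalg.norm(feats - np.asarray(probe_features, dtype=float), axis=1)
    return names[int(np.argmin(dists))]

# hypothetical pitch statistics per speaker: (mean F0 in Hz, F0 std dev)
enrolled = {"speaker_a": [110.0, 12.0], "speaker_b": [210.0, 25.0]}
```

A preselection stage, as in the record, would simply restrict `enrolled` to candidates whose features fall within a coarse tolerance before the distance computation.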

  9. Comparison of Forced-Alignment Speech Recognition and Humans for Generating Reference VAD

    DEFF Research Database (Denmark)

    Kraljevski, Ivan; Tan, Zheng-Hua; Paola Bissiri, Maria

    2015-01-01

    This present paper aims to answer the question whether forced-alignment speech recognition can be used as an alternative to humans in generating reference Voice Activity Detection (VAD) transcriptions. An investigation of the level of agreement between automatic/manual VAD transcriptions and the reference ones produced by a human expert was carried out. Thereafter, statistical analysis was employed on the automatically produced and the collected manual transcriptions. Experimental results confirmed that forced-alignment speech recognition can provide accurate and consistent VAD labels.
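Agreement between automatic and manual VAD transcriptions is typically quantified frame by frame; a small sketch computing raw agreement and Cohen's kappa over binary speech/non-speech labels (the specific statistic is an assumption, since the record does not name the one used):

```python
def vad_agreement(labels_a, labels_b):
    """Frame-level observed agreement and Cohen's kappa between two
    equally long binary VAD label sequences (1 = speech, 0 = silence)."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    p_obs = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    # chance agreement from each annotator's marginal speech rate
    pa1, pb1 = sum(labels_a) / n, sum(labels_b) / n
    p_exp = pa1 * pb1 + (1 - pa1) * (1 - pb1)
    kappa = 1.0 if p_exp == 1 else (p_obs - p_exp) / (1 - p_exp)
    return p_obs, kappa
```

Kappa corrects for the agreement two labelers would reach by chance, which matters because speech/silence proportions are rarely balanced.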

  10. Recognition and Toleration

    DEFF Research Database (Denmark)

    Lægaard, Sune

    2010-01-01

    Recognition and toleration are ways of relating to the diversity characteristic of multicultural societies. The article concerns the possible meanings of toleration and recognition, and the conflict that is often claimed to exist between these two approaches to diversity. Different forms or interpretations of recognition and toleration are considered, confusing and problematic uses of the terms are noted, and the compatibility of toleration and recognition is discussed. The article argues that there is a range of legitimate and importantly different conceptions of both toleration and recognition...

  11. The software life cycle

    CERN Document Server

    Ince, Darrel

    1990-01-01

    The Software Life Cycle deals with the software lifecycle, that is, what exactly happens when software is developed. Topics covered include aspects of software engineering, structured techniques of software development, and software project management. The use of mathematics to design and develop computer systems is also discussed. This book comprises 20 chapters divided into four sections and begins with an overview of software engineering and software development, paying particular attention to the birth of software engineering and the introduction of formal methods of software development...

  12. Voice and Narrative in L1 Writing

    DEFF Research Database (Denmark)

    Krogh, Ellen; Piekut, Anke

    2015-01-01

    This paper investigates issues of voice and narrative in L1 writing. Three branches of research are initially discussed: research on narratives as resources for identity work, research on writer identity and voice as an essential aspect of identity, and research on Bildung in L1 writing. ... in lower secondary L1, she found that her previous writing strategies were not rewarded in upper secondary school. In the second empirical study, two upper-secondary exam papers are investigated, with a focus on their approaches to exam genres and their use of narrative resources to address issues. ... training of voice and narratives as a resource for academic writing, and that the Bildung potential of L1 writing may be tied to this issue.

  13. 8 CFR 1292.2 - Organizations qualified for recognition; requests for recognition; withdrawal of recognition...

    Science.gov (United States)

    2010-01-01

    Organizations qualified for recognition; requests for recognition; withdrawal of recognition; accreditation of representatives; roster. (a) Qualifications of organizations. A non-profit religious, charitable, social service, or similar organization...

  14. Hemispheric association and dissociation of voice and speech information processing in stroke.

    Science.gov (United States)

    Jones, Anna B; Farrall, Andrew J; Belin, Pascal; Pernet, Cyril R

    2015-10-01

    As we listen to someone speaking, we extract both linguistic and non-linguistic information. Knowing how these two sets of information are processed in the brain is fundamental for the general understanding of social communication, speech recognition and therapy of language impairments. We investigated the pattern of performances in phoneme versus gender categorization in left and right hemisphere stroke patients, and found an anatomo-functional dissociation in the right frontal cortex, establishing a new syndrome in voice discrimination abilities. In addition, phoneme and gender performances were more often associated than dissociated in left hemisphere patients, suggesting common neural underpinnings. Copyright © 2015 Elsevier Ltd. All rights reserved.

  15. Teachers’ voice use in teaching environment. Aspects on speakers’ comfort

    DEFF Research Database (Denmark)

    Lyberg-Åhlander, Viveka; Rydell, Roland; Löfqvist, Anders

    2015-01-01

    ... use and prevalence of voice problems in teachers and to explore their ratings of vocally loading aspects of their working environment. Method: A questionnaire survey of 467 teachers, aiming to explore the prevalence of voice problems in teaching staff, identified teachers with voice problems and vocally ... in the teaching environment, and aspects of the classroom environment were also measured. Results: Teachers with voice problems were more affected by any loading factor in the work environment and were more perceptive of the room acoustics. Differences between the groups were found during field-measurements of the voice, while there were no differences in the findings from the clinical examinations of larynx and voice. Conclusion: Teachers suffering from voice problems react more strongly to loading factors in the teaching environment. It is in the interplay between the individual and the work environment that voice...

  16. Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

    International Nuclear Information System (INIS)

    Holzrichter, J.F.; Ng, L.C.

    1998-01-01

    The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function for each time frame. The formation of feature vectors defining all acoustic speech units over well-defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs
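The deconvolution step, recovering a per-frame transfer function from the excitation and the acoustic output, can be sketched as frequency-domain division; this is a simplified illustration with an assumed regularizing `eps`, not the authors' exact method:

```python
import numpy as np

def transfer_function(excitation, output, eps=1e-8):
    """Estimate the frame's transfer function H(f) = S(f) / E(f),
    where E is the excitation spectrum and S the speech spectrum."""
    E = np.fft.rfft(excitation)
    S = np.fft.rfft(output)
    return S / (E + eps)  # eps guards against near-zero excitation bins
```

Multiplying the estimated H(f) back onto the excitation spectrum and inverse-transforming reconstructs the frame, which is the consistency check used below.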

  17. Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

    Science.gov (United States)

    Holzrichter, John F.; Ng, Lawrence C.

    1998-01-01

    The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function for each time frame. The formation of feature vectors defining all acoustic speech units over well-defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching.

  18. Optical Pattern Recognition

    Science.gov (United States)

    Yu, Francis T. S.; Jutamulia, Suganda

    2008-10-01

    Contributors; Preface; 1. Pattern recognition with optics Francis T. S. Yu and Don A. Gregory; 2. Hybrid neural networks for nonlinear pattern recognition Taiwei Lu; 3. Wavelets, optics, and pattern recognition Yao Li and Yunglong Sheng; 4. Applications of the fractional Fourier transform to optical pattern recognition David Mendlovic, Zeev Zalesky and Haldum M. Oxaktas; 5. Optical implementation of mathematical morphology Tien-Hsin Chao; 6. Nonlinear optical correlators with improved discrimination capability for object location and recognition Leonid P. Yaroslavsky; 7. Distortion-invariant quadratic filters Gregory Gheen; 8. Composite filter synthesis as applied to pattern recognition Shizhou Yin and Guowen Lu; 9. Iterative procedures in electro-optical pattern recognition Joseph Shamir; 10. Optoelectronic hybrid system for three-dimensional object pattern recognition Guoguang Mu, Mingzhe Lu and Ying Sun; 11. Applications of photorefractive devices in optical pattern recognition Ziangyang Yang; 12. Optical pattern recognition with microlasers Eung-Gi Paek; 13. Optical properties and applications of bacteriorhodopsin Q. Wang Song and Yu-He Zhang; 14. Liquid-crystal spatial light modulators Aris Tanone and Suganda Jutamulia; 15. Representations of fully complex functions on real-time spatial light modulators Robert W. Cohn and Laurence G. Hassbrook; Index.

  19. ACOUSTIC SPEECH RECOGNITION FOR MARATHI LANGUAGE USING SPHINX

    Directory of Open Access Journals (Sweden)

    Aman Ankit

    2016-09-01

    Speech recognition, or speech-to-text processing, is the process of recognizing human speech by computer and converting it into text. In speech recognition, transcripts are created by taking recordings of speech as audio together with their text transcriptions. Speech-based applications which include Natural Language Processing (NLP) techniques are popular and an active area of research. Input to such applications is in natural language, and output is obtained in natural language. Speech recognition mostly revolves around three approaches, namely the acoustic-phonetic approach, the pattern recognition approach and the artificial intelligence approach. Creation of an acoustic model requires a large database of speech and training algorithms. The output of an ASR system is the recognition and translation of spoken language into text by computers and computerized devices. ASR today finds enormous application in tasks that require human-machine interfaces, such as voice dialing. Our key contribution in this paper is to create corpora for the Marathi language and explore the use of the Sphinx engine for automatic speech recognition.
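Corpus creation for Sphinx-style training pairs each audio file ID with its transcription. A minimal sketch of the two control files commonly used by CMU Sphinx acoustic-model training; the file names, utterance IDs and words below are hypothetical placeholders:

```python
# Write the .fileids and .transcription control files in the layout
# used by CMU Sphinx acoustic-model training/adaptation.
utterances = {
    "utt_0001": "namaskar",
    "utt_0002": "dhanyavad",
}

with open("corpus.fileids", "w", encoding="utf-8") as f:
    for utt_id in utterances:
        f.write(utt_id + "\n")  # one audio file ID per line

with open("corpus.transcription", "w", encoding="utf-8") as f:
    for utt_id, text in utterances.items():
        # each line: <s> transcription </s> (file_id)
        f.write(f"<s> {text} </s> ({utt_id})\n")
```

Alongside these, training also needs the audio files themselves, a phonetic dictionary and a language model; those are omitted here.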

  20. Voice and Video Telephony Services in Smartphone

    Directory of Open Access Journals (Sweden)

    2006-01-01

    Multimedia telephony is a delay-sensitive application. Packet losses, relatively less critical than delay, are allowed up to a certain threshold. They represent the QoS constraints that have to be respected to guarantee the operation of the telephony service and user satisfaction. In this work we introduce a new smartphone architecture characterized by two process levels called application processor (AP) and mobile termination (MT), which communicate through a serial channel. Moreover, we focus our attention on two very important UMTS services: voice and video telephony. Through a simulation study, the impact of voice and video telephony on the considered architecture is evaluated using the protocols currently available for realizing these services.

  1. Effects of Voice on Emotional Arousal

    Directory of Open Access Journals (Sweden)

    Psyche Loui

    2013-10-01

    Music is a powerful medium capable of eliciting a broad range of emotions. Although the relationship between language and music is well documented, relatively little is known about the effects of lyrics and the voice on the emotional processing of music and on listeners’ preferences. In the present study, we investigated the effects of vocals in music on participants’ perceived valence and arousal in songs. Participants (N = 50) made valence and arousal ratings for familiar songs that were presented with and without the voice. We observed robust effects of vocal content on perceived arousal. Furthermore, we found that the effect of the voice on enhancing arousal ratings is independent of familiarity of the song and differs across genders and age: females were more influenced by vocals than males, and these gender effects were enhanced among older adults. Results highlight the effects of gender and aging in emotion perception and are discussed in terms of the social roles of music.

  2. A Voice Processing Technology for Rural Specific Context

    Science.gov (United States)

    He, Zhiyong; Zhang, Zhengguang; Zhao, Chunshen

    During the promotion and application of rural informatization, voice interaction across different regional dialects is a very complex issue. Through in-depth analysis of TTS core technologies, this paper presents methods for intelligent segmentation, a word segmentation algorithm, and intelligent voice thesaurus construction in different dialect contexts, and then a COM-based development methodology and programming method for implementing a voice processing system for this specific context. The method provides a useful reference for rural dialect and voice processing applications.

  3. Measurement of Voice Onset Time in Maxillectomy Patients

    OpenAIRE

    Hattori, Mariko; Sumita, Yuka I.; Taniguchi, Hisashi

    2014-01-01

    Objective speech evaluation using acoustic measurement is needed for the proper rehabilitation of maxillectomy patients. For digital evaluation of consonants, measurement of voice onset time is one option. However, voice onset time has not been measured in maxillectomy patients as their consonant sound spectra exhibit unique characteristics that make the measurement of voice onset time challenging. In this study, we established criteria for measuring voice onset time in maxillectomy patients ...

  4. ASERA: A Spectrum Eye Recognition Assistant

    Science.gov (United States)

    Yuan, Hailong; Zhang, Haotong; Zhang, Yanxia; Lei, Yajuan; Dong, Yiqiao; Zhao, Yongheng

    2018-04-01

    ASERA, A Spectrum Eye Recognition Assistant, aids in quasar spectral recognition and redshift measurement and can also be used to recognize various types of spectra of stars, galaxies and AGNs (Active Galactic Nuclei). This interactive software allows users to visualize observed spectra, superimpose template spectra from the Sloan Digital Sky Survey (SDSS), and interactively access related spectral line information. ASERA is an efficient and user-friendly semi-automated toolkit for the accurate classification of spectra observed by LAMOST (the Large Sky Area Multi-object Fiber Spectroscopic Telescope) and is available as a standalone Java application and as a Java applet. The software offers several functions, including wavelength and flux scale settings, zoom in and out, redshift estimation, and spectral line identification.
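The redshift estimation that ASERA supports interactively reduces, for a single matched spectral line, to comparing the observed and rest-frame wavelengths; a one-line sketch:

```python
def redshift(observed_wavelength, rest_wavelength):
    """z such that lambda_obs = (1 + z) * lambda_rest."""
    return observed_wavelength / rest_wavelength - 1.0
```

For example, Lyman-alpha (rest wavelength 1215.67 Å) observed at 2431.34 Å gives z = 1; template matching generalizes this by shifting a whole template spectrum until many lines align at once.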

  5. Miniature EVA Software Defined Radio

    Science.gov (United States)

    Pozhidaev, Aleksey

    2012-01-01

    As NASA embarks upon developing the Next-Generation Extra Vehicular Activity (EVA) Radio for deep space exploration, the demands on EVA battery life will substantially increase. The number of modes and frequency bands required will continue to grow in order to enable efficient and complex multi-mode operations including communications, navigation, and tracking applications. Whether supporting astronaut excursions, communications with soldiers, or first responders responding to emergency hazards, NASA has developed an innovative, affordable, miniaturized, power-efficient software defined radio that offers unprecedented flexibility. This lightweight, programmable, S-band, multi-service, frequency-agile EVA software defined radio (SDR) supports data, telemetry, voice, and both standard and high-definition video. Features include a modular design and an easily scalable architecture, and the EVA SDR allows for both stationary and mobile battery-powered handheld operations. Currently, the radio is equipped with an S-band RF section. However, its scalable architecture can accommodate multiple RF sections simultaneously to cover multiple frequency bands. The EVA SDR also supports multiple network protocols. It currently implements a Hybrid Mesh Network based on the 802.11s open standard protocol. The radio targets RF channel data rates up to 20 Mbps and can be equipped with a real-time operating system (RTOS) that can be switched off for power-aware applications. The EVA SDR's modular design permits implementation of the same hardware at all network nodes. This approach assures the portability of the same software into any radio in the system. It also brings several benefits to the entire system, including reduced system maintenance, system complexity, and development cost.

  6. ESTSC - Software Best Practices

    Science.gov (United States)

    DOE Scientific and Technical Software Best Practices December 2010 Table of Contents 1.0 Introduction 2.0 Responsibilities 2.1 OSTI/ESTSC 2.2 SIACs 2.3 Software Submitting Sites/Creators 2.4 Software Sensitivity Review 3.0 Software Announcement and Submission 3.1 STI Software Appropriate for Announcement 3.2

  7. Software Assurance Competency Model

    Science.gov (United States)

    2013-03-01

    COTS) software, and software as a service (SaaS). L2: Define and analyze risks in the acquisition of contracted software, COTS software, and SaaS...[2010a]: Application of technologies and processes to achieve a required level of confidence that software systems and services function in the...

  8. The electronic cry: Voice and gender in electroacoustic music

    NARCIS (Netherlands)

    Bosma, H.M.

    2013-01-01

    The voice provides an entrance to discuss gender and related fundamental issues in electroacoustic music that are relevant as well in other musical genres and outside of music per se: the role of the female voice; the use of language versus non-verbal vocal sounds; the relation of voice, embodiment

  9. Original Knowledge, Gender and the Word's Mythology: Voicing the Doctorate

    Science.gov (United States)

    Carter, Susan

    2012-01-01

    Using mythology as a generative matrix, this article investigates the relationship between knowledge, words, embodiment and gender as they play out in academic writing's voice and, in particular, in doctoral voice. The doctoral thesis is defensive, a performance seeking admittance into discipline scholarship. Yet in finding its scholarly voice,…

  10. The Influence of Sleep Disorders on Voice Quality.

    Science.gov (United States)

    Rocha, Bruna Rainho; Behlau, Mara

    2017-09-19

    To verify the influence of sleep quality on the voice. Descriptive and analytical cross-sectional study. Data were collected by an online or printed survey divided into three parts: (1) demographic data and vocal health aspects; (2) self-assessment of sleep and vocal quality, and the influence that sleep has on voice; and (3) sleep and voice self-assessment inventories - the Epworth Sleepiness Scale (ESS), the Pittsburgh Sleep Quality Index (PSQI), and the Voice Handicap Index reduced version (VHI-10). A total of 862 people were included (493 women, 369 men), with a mean age of 32 years (maximum age of 79 and minimum age of 18 years). The perception of the influence that sleep has on voice differed between groups. The factors that influence a voice handicap are vocal self-assessment, ESS total score, and self-assessment of the influence that sleep has on voice. The absence of daytime sleepiness is a protective factor against perceived voice handicap; the presence of daytime sleepiness is a damaging factor (odds ratio [OR] > 1). Sleep influences voice. Perceived poor sleep quality is related to perceived poor vocal quality. Individuals with a voice handicap observe a greater influence of sleep on voice than those without. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  11. Acoustic Analysis of Voice in Singers: A Systematic Review

    Science.gov (United States)

    Gunjawate, Dhanshree R.; Ravi, Rohit; Bellur, Rajashekhar

    2018-01-01

    Purpose: Singers are vocal athletes having specific demands from their voice and require special consideration during voice evaluation. Presently, there is a lack of standards for acoustic evaluation in them. The aim of the present study was to systematically review the available literature on the acoustic analysis of voice in singers. Method: A…

  12. Voice Disorders in Occupations with Vocal Load in Slovenia.

    Science.gov (United States)

    Boltežar, Lučka; Šereg Bahar, Maja

    2014-12-01

    The aim of this paper is to compare the prevalence of voice disorders and the risk factors for them in different occupations with a vocal load in Slovenia. A meta-analysis of six different Slovenian studies involving teachers, physicians, salespeople, catholic priests, nurses and speech-and-language therapists (SLTs) was performed. In all six studies, similar questions about the prevalence of voice disorders and the causes for them were included. The comparison of the six studies showed that more than 82% of the 2347 included subjects had voice problems at some time during their career. The teachers were the most affected by voice problems. The prevalent cause of voice problems was the vocal load in teachers and salespeople and respiratory-tract infections in all the other occupational groups. When the occupational groups were compared, the teachers had more voice problems and showed less care for their voices than the priests, while the physicians had more voice problems and showed better consideration of vocal hygiene rules than the SLTs. The majority of all the included subjects did not receive instructions about voice care during education. In order to decrease the prevalence of voice disorders in vocal professionals, a screening program is recommended before the beginning of their studies. Regular courses on voice care and proper vocal technique should be obligatory for all professional voice users during their career. The inclusion of dysphonia in the list of occupational diseases should be considered in Slovenia as it is in some European countries.

  13. The Voice Pump: an Affectively Engaging Interface for Changing Attachments

    DEFF Research Database (Denmark)

    Fritsch, Jonas; Jacobsen, Mogens

    2017-01-01

    In this paper, we present the preliminary results from an ongoing interaction design experiment, the Voice Pump. The Voice Pump is an affectively engaging air-based interface for attuning to the differential qualities of voices in order to change attachments between native Danish speakers and non-native...

  14. Software attribute visualization for high integrity software

    Energy Technology Data Exchange (ETDEWEB)

    Pollock, G.M.

    1998-03-01

    This report documents a prototype tool developed to investigate the use of visualization and virtual reality technologies for improving software surety confidence. The tool is utilized within the execution phase of the software life cycle. It provides a capability to monitor an executing program against prespecified requirements constraints provided in a program written in the requirements specification language SAGE. The resulting Software Attribute Visual Analysis Tool (SAVAnT) also provides a technique to assess the completeness of a software specification.

  15. Characterization of the voice of children with mouth breathing caused by four different etiologies using perceptual and acoustic analyses

    Directory of Open Access Journals (Sweden)

    Rosana Tiepo Arévalo

    2005-09-01

    Full Text Available Objective: To describe vocal characteristics in children aged five to twelve years with mouth breathing caused by four etiologies: chronic rhinitis, hypertrophy, hypertrophy + chronic rhinitis and functional condition, using perceptual evaluation and acoustic analysis. Methods: Voice recordings of 120 mouth breathers were judged by four speech pathologists using the software Multi-Speech. Results: The perceptual evaluation of the voice revealed a high incidence of breathy and hoarse voices, especially in the rhinitis group. Most cases were moderate, with low pitch and normal loudness. Hyponasality was found in over 50% of the sample, as expected, but we also found a high occurrence of laryngeal resonance, especially in the rhinitis group. Mean fundamental frequency was 24.81 Hz, SD = 15.02; jitter = 2.17; shimmer = 0.44, and HNR = 2.11. Values did not show a statistically significant difference among the groups. Conclusion: Perceptual evaluation of the voice revealed that most mouth breathers presented hoarse and breathy voice, low pitch, normal loudness, and hyponasal and laryngeal resonance. However, the acoustic analysis did not result in any significant findings.
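    The jitter, shimmer, and HNR values reported above are standard cycle-to-cycle perturbation measures. As a rough illustration only (simplified textbook-style formulas, not the Multi-Speech or MDVP implementations), local jitter and shimmer can be sketched from a sequence of pitch periods and peak amplitudes:

    ```python
    import math

    def jitter_percent(periods):
        """Local jitter (%): mean absolute difference between consecutive
        pitch periods, relative to the mean period."""
        diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
        mean_period = sum(periods) / len(periods)
        return 100.0 * (sum(diffs) / len(diffs)) / mean_period

    def shimmer_db(amplitudes):
        """Shimmer (dB): mean absolute 20*log10 ratio of consecutive
        cycle peak amplitudes."""
        ratios = [abs(20.0 * math.log10(a / b))
                  for a, b in zip(amplitudes, amplitudes[1:])]
        return sum(ratios) / len(ratios)

    # A perfectly periodic voice signal has zero jitter and zero shimmer.
    print(jitter_percent([0.005] * 10))   # 0.0
    print(shimmer_db([1.0] * 10))         # 0.0
    ```

    Commercial analysis packages apply additional smoothing and voicing checks, so their numbers will differ from this sketch.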

  16. Pattern recognition & machine learning

    CERN Document Server

    Anzai, Y

    1992-01-01

    This is the first text to provide a unified and self-contained introduction to visual pattern recognition and machine learning. It is useful as a general introduction to artificial intelligence and knowledge engineering, and no previous knowledge of pattern recognition or machine learning is necessary. It provides the basics for various pattern recognition and machine learning methods. Translated from Japanese, the book also features chapter exercises, keywords, and summaries.

  17. The Study of Application System for Small and Medium CTI Based on Voice Card

    Directory of Open Access Journals (Sweden)

    Zhong Dong

    2016-01-01

    Full Text Available With the rapid development of computer telecommunications integration (CTI) technology, application systems for small and medium CTI are constantly being updated, but their study still lacks a stable and unified model. In this paper, the author analyzes a unified structure platform for small and medium CTI application systems based on voice cards. The author then introduces a suitable software architecture model and a general procedural framework for such systems using the idea of hierarchical design, which demonstrates the versatility of the architecture. This provides an efficient channel for the development of small and medium CTI.

  18. Neurocognition and symptoms identify links between facial recognition and emotion processing in schizophrenia: meta-analytic findings.

    Science.gov (United States)

    Ventura, Joseph; Wood, Rachel C; Jimenez, Amy M; Hellemann, Gerhard S

    2013-12-01

    In schizophrenia patients, one of the most commonly studied deficits of social cognition is emotion processing (EP), which has documented links to facial recognition (FR). But, how are deficits in facial recognition linked to emotion processing deficits? Can neurocognitive and symptom correlates of FR and EP help differentiate the unique contribution of FR to the domain of social cognition? A meta-analysis of 102 studies (combined n=4826) in schizophrenia patients was conducted to determine the magnitude and pattern of relationships between facial recognition, emotion processing, neurocognition, and type of symptom. Meta-analytic results indicated that facial recognition and emotion processing are strongly interrelated (r=.51). In addition, the relationship between FR and EP through voice prosody (r=.58) is as strong as the relationship between FR and EP based on facial stimuli (r=.53). Further, the relationship between emotion recognition, neurocognition, and symptoms is independent of the emotion processing modality - facial stimuli and voice prosody. The association between FR and EP that occurs through voice prosody suggests that FR is a fundamental cognitive process. The observed links between FR and EP might be due to bottom-up associations between neurocognition and EP, and not simply because most emotion recognition tasks use visual facial stimuli. In addition, links with symptoms, especially negative symptoms and disorganization, suggest possible symptom mechanisms that contribute to FR and EP deficits. © 2013 Elsevier B.V. All rights reserved.
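    The pooled correlations reported above (e.g., r = .51 between FR and EP) come from aggregating per-study effect sizes. A minimal sketch of fixed-effect pooling via Fisher's z transform, using hypothetical study values rather than the paper's actual data:

    ```python
    import math

    def pooled_r(studies):
        """Fixed-effect pooled correlation: transform each r to Fisher's z,
        average with weights n - 3 (inverse variance of z), transform back.
        `studies` is an iterable of (r, n) pairs."""
        num = sum((n - 3) * math.atanh(r) for r, n in studies)
        den = sum(n - 3 for r, n in studies)
        return math.tanh(num / den)

    # Hypothetical (correlation, sample size) pairs, not from the meta-analysis:
    print(round(pooled_r([(0.53, 120), (0.58, 80), (0.45, 200)]), 3))
    ```

    Random-effects models used in practice add a between-study variance term, but the z-transform-and-weight step is the same.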

  19. Statistical Pattern Recognition

    CERN Document Server

    Webb, Andrew R

    2011-01-01

    Statistical pattern recognition relates to the use of statistical techniques for analysing data measurements in order to extract information and make justified decisions.  It is a very active area of study and research, which has seen many advances in recent years. Applications such as data mining, web searching, multimedia data retrieval, face recognition, and cursive handwriting recognition, all require robust and efficient pattern recognition techniques. This third edition provides an introduction to statistical pattern theory and techniques, with material drawn from a wide range of fields,

  20. The relation of vocal fold lesions and voice quality to voice handicap and psychosomatic well-being

    NARCIS (Netherlands)

    Smits, R.; Marres, H.A.; de Jong, F.

    2012-01-01

    BACKGROUND: Voice disorders have a multifactorial genesis and may be present in various ways. They can cause a significant communication handicap and impaired quality of life. OBJECTIVE: To assess the effect of vocal fold lesions and voice quality on voice handicap and psychosomatic well-being.

  1. Reliability of software

    International Nuclear Information System (INIS)

    Kopetz, H.

    1980-01-01

    Common factors and differences in the reliability of hardware and software; increasing reliability by means of software redundancy; maintenance of software for long-term operating behavior. (HP)

  2. Voice Activated Cockpit Management Systems: Voice-Flight NexGen, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — Speaking to the cockpit as a method of system management in flight can become an effective interaction method, since voice communication is very efficient. Automated...

  3. Voice Over Internet Protocol Testbed Design for Non-Intrusive, Objective Voice Quality Assessment

    National Research Council Canada - National Science Library

    Manka, David L

    2007-01-01

    Voice over Internet Protocol (VoIP) is an emerging technology with the potential to assist the United States Marine Corps in solving communication challenges stemming from modern operational concepts...

  4. Epidemiology of Voice Disorders in Latvian School Teachers.

    Science.gov (United States)

    Trinite, Baiba

    2017-07-01

    The prevalence of voice disorders in the teacher population in Latvia has not been studied so far and this is the first epidemiological study whose goal is to investigate the prevalence of voice disorders and their risk factors in this professional group. A wide cross-sectional study using stratified sampling methodology was implemented in the general education schools of Latvia. The self-administered voice risk factor questionnaire and the Voice Handicap Index were completed by 522 teachers. Two teacher groups were formed: the voice disorders group, which included 235 teachers with actual voice problems or problems during the last 9 months; and the control group, which included 174 teachers without voice disorders. Sixty-six percent of teachers gave a positive answer to the following question: Have you ever had problems with your voice? Voice problems are more often found in female than male teachers (68.2% vs 48.8%). Music teachers suffer from voice disorders more often than teachers of other subjects. Eighty-two percent of teachers first faced voice problems in their professional career. The odds of voice disorders increase if the following risk factors exist: extra vocal load, shouting, throat clearing, neglecting of personal health, background noise, chronic illnesses of the upper respiratory tract, allergy, job dissatisfaction, and regular stress in the working place. The study findings indicated a high risk of voice disorders among Latvian teachers. The study confirmed data concerning the multifactorial etiology of voice disorders. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  5. One Voice Too Many: Echoes of Irony and Trauma in Oedipus the King

    Directory of Open Access Journals (Sweden)

    Joshua Waggoner

    2017-11-01

    Full Text Available Sophocles’ Oedipus the King has often inspired concurrent interpretations examining the tragic irony of the play and the traumatic neurosis of its protagonist. The Theban king epitomizes a man who knows everything but himself, and Sophocles’ use of irony allows Oedipus to discover the truth in a manner that Freud viewed in The Interpretation of Dreams as “comparable to the work of a psychoanalysis.” Psychoanalytical readings of Oedipus at times depend greatly on his role as a doubled figure, but this article specifically investigates his doubled voice in order to demonstrate the interrelated, chiasmic relationship between Oedipus’ trauma and the trope of irony. It argues, in fact, that irony serves as the language, so to speak, of the traumatic experiences haunting the king and his city, but it also posits that this doubled voice compounds the irony of the play and its hero. In other words, in addition to the Sophoclean irony that dominates the work, the doubling of the king’s voice reveals a modified form of Socratic irony that contributes to the tragedy’s power. Consequently, even after the king’s recognition of the truth ultimately resolves the work’s tragic irony, Oedipus remains divided by a state of simultaneous knowledge and ignorance.

  6. Space Flight Software Development Software for Intelligent System Health Management

    Science.gov (United States)

    Trevino, Luis C.; Crumbley, Tim

    2004-01-01

    The slide presentation examines the Marshall Space Flight Center Flight Software Branch, including software development projects, mission critical space flight software development, software technical insight, advanced software development technologies, and continuous improvement in the software development processes and methods.

  7. Software Engineering Guidebook

    Science.gov (United States)

    Connell, John; Wenneson, Greg

    1993-01-01

    The Software Engineering Guidebook describes SEPG (Software Engineering Process Group) supported processes and techniques for engineering quality software in NASA environments. Three process models are supported: structured, object-oriented, and evolutionary rapid-prototyping. The guidebook covers software life-cycles, engineering, assurance, and configuration management. The guidebook is written for managers and engineers who manage, develop, enhance, and/or maintain software under the Computer Software Services Contract.

  8. Engaging retailers: giving them voice or controlling their voice, a supplier's perspective

    OpenAIRE

    Jackson, Keith; Jackson, Jacqui; Hopkinson, Gillian

    2013-01-01

    This full paper from the Marketing and Retail track of BAM 2013 investigates the relationships between suppliers and retailers in the UK convenience store sector in terms of Hirschman's model whereby members of a group can influence it by either expressing their opinions (voice) or leaving it in protest (exit). Suppliers may create loyalty among retailers by raising exit costs and/or allowing them to express their voices. The investigation was carried out using the recorded turnover of the to...

  9. Voice Onset Time in Parkinson Disease

    Science.gov (United States)

    Fischer, Emily; Goberman, Alexander M.

    2010-01-01

    Research has found that speaking rate has an effect on voice onset time (VOT). Given that Parkinson disease (PD) affects speaking rate, the purpose of this study was to examine VOT with the effect of rate removed (VOT ratio), along with the traditional VOT measure, in individuals with PD. VOT and VOT ratio were examined in 9 individuals with PD…

  10. Classroom Noise and Teachers' Voice Production

    Science.gov (United States)

    Rantala, Leena M.; Hakala, Suvi; Holmqvist, Sofia; Sala, Eeva

    2015-01-01

    Purpose: The aim of this study was to research the associations between noise (ambient and activity noise) and objective metrics of teachers' voices in real working environments (i.e., classrooms). Method: Thirty-two female and 8 male teachers from 14 elementary schools were randomly selected for the study. Ambient noise was measured during breaks…

  11. Student Voice Initiative: Exploring Implementation Strategies

    Science.gov (United States)

    Alexander, Blaine G.

    2017-01-01

    Student voice is the process of allowing students to work collaboratively with adults to produce a learning culture that is conducive for optimum growth in every student. In a traditional setting, the adults make the decisions and the students are passive observers in the learning process. Data has shown that this traditional culture is not…

  12. Developing Student Voices on the Internet.

    Science.gov (United States)

    Dresang, Eliza T.

    1997-01-01

    Books and online discussion groups encourage youth to develop strong narrative voices. Includes an annotated bibliography of books and Internet sites dealing with discovering the self and others; exploring race, culture, archeology, technology, war, poverty, gender and urban problems; creating and critiquing stories; and publishing industry…

  13. Speech masking and cancelling and voice obscuration

    Science.gov (United States)

    Holzrichter, John F.

    2013-09-10

    A non-acoustic sensor is used to measure a user's speech and then broadcasts an obscuring acoustic signal diminishing the user's vocal acoustic output intensity and/or distorting the voice sounds making them unintelligible to persons nearby. The non-acoustic sensor is positioned proximate or contacting a user's neck or head skin tissue for sensing speech production information.

  14. Web life: Voices of the Manhattan Project

    Science.gov (United States)

    2014-02-01

    Voices of the Manhattan Project was launched in October 2012 with the aim of preserving the memories and experiences of scientists and other workers who participated in the US-led effort to build an atomic bomb during the Second World War.

  15. Ventriloquising the Voice: Writing in the University

    Science.gov (United States)

    Fulford, Amanda

    2009-01-01

    In this paper I consider one aspect of how student writing is supported in the university. I focus on the use of the "writing frame", questioning its status as a vehicle for facilitating student voice, and in the process questioning how that notion is itself understood. I illustrate this by using examples from the story of the 1944 Hollywood film…

  16. Changes after voice therapy in objective and subjective voice measurements of pediatric patients with vocal nodules.

    Science.gov (United States)

    Tezcaner, Ciler Zahide; Karatayli Ozgursoy, Selmin; Ozgursoy, Selmin Karatayli; Sati, Isil; Dursun, Gursel

    2009-12-01

    The aim of this study was to analyze the efficiency of voice therapy in children with vocal nodules by using acoustic analysis and subjective assessment. Thirty-nine patients with vocal fold nodules, aged between 7 and 14, were included in the study. Each subject had voice therapy led by an experienced voice therapist once a week. All diagnostic and follow-up work-ups were performed before the voice therapy and after the third or the sixth month. Transoral and/or transnasal videostroboscopic examination and acoustic analysis were performed using the Multi-Dimensional Voice Program (MDVP), and subjective analysis with the GRBAS scale. As for the perceptual assessment, the difference was significant for four parameters out of five. A significant improvement was found in the acoustic analysis parameters of jitter, shimmer, and noise-to-harmonic ratio. The voice therapy, which was planned according to patients' needs, age, compliance and response to therapy, had positive effects on pediatric patients with vocal nodules. Acoustic analysis and GRBAS may be used successfully in the follow-up of pediatric vocal nodule treatment.

  17. The Sound of Voice: Voice-Based Categorization of Speakers' Sexual Orientation within and across Languages.

    Directory of Open Access Journals (Sweden)

    Simone Sulpizio

    Full Text Available Empirical research initially showed that English listeners are able to identify the speakers' sexual orientation based on voice cues alone. However, the accuracy of this voice-based categorization, as well as its generalizability to other languages (language-dependency and to non-native speakers (language-specificity, has been questioned recently. Consequently, we address these open issues in 5 experiments: First, we tested whether Italian and German listeners are able to correctly identify sexual orientation of same-language male speakers. Then, participants of both nationalities listened to voice samples and rated the sexual orientation of both Italian and German male speakers. We found that listeners were unable to identify the speakers' sexual orientation correctly. However, speakers were consistently categorized as either heterosexual or gay on the basis of how they sounded. Moreover, a similar pattern of results emerged when listeners judged the sexual orientation of speakers of their own and of the foreign language. Overall, this research suggests that voice-based categorization of sexual orientation reflects the listeners' expectations of how gay voices sound rather than being an accurate detector of the speakers' actual sexual identity. Results are discussed with regard to accuracy, acoustic features of voices, language dependency and language specificity.

  18. A methodology of error detection: Improving speech recognition in radiology

    OpenAIRE

    Voll, Kimberly Dawn

    2006-01-01

    Automated speech recognition (ASR) in radiology report dictation demands highly accurate and robust recognition software. Despite vendor claims, current implementations are suboptimal, leading to poor accuracy, and time and money wasted on proofreading. Thus, other methods must be considered for increasing the reliability and performance of ASR before it is a viable alternative to human transcription. One such method is post-ASR error detection, used to recover from the inaccuracy of speech r...

  19. Paradigms in object recognition

    International Nuclear Information System (INIS)

    Mutihac, R.; Mutihac, R.C.

    1999-09-01

    A broad range of approaches has been proposed and applied for the complex and rather difficult task of object recognition, which involves the determination of object characteristics and object classification into one of many a priori object types. Our paper briefly reviews the three main paradigms in pattern recognition, namely Bayesian statistics, neural networks, and expert systems. (author)

  20. Infant Visual Recognition Memory

    Science.gov (United States)

    Rose, Susan A.; Feldman, Judith F.; Jankowski, Jeffery J.

    2004-01-01

    Visual recognition memory is a robust form of memory that is evident from early infancy, shows pronounced developmental change, and is influenced by many of the same factors that affect adult memory; it is surprisingly resistant to decay and interference. Infant visual recognition memory shows (a) modest reliability, (b) good discriminant…

  1. Recognition and Toleration

    DEFF Research Database (Denmark)

    Lægaard, Sune

    2010-01-01

    Recognition and toleration are ways of relating to the diversity characteristic of multicultural societies. The article concerns the possible meanings of toleration and recognition, and the conflict that is often claimed to exist between these two approaches to diversity. Different forms or inter...

  2. Self-Reported Acute and Chronic Voice Disorders in Teachers.

    Science.gov (United States)

    Rossi-Barbosa, Luiza Augusta Rosa; Barbosa, Mirna Rossi; Morais, Renata Martins; de Sousa, Kamilla Ferreira; Silveira, Marise Fagundes; Gama, Ana Cristina Côrtes; Caldeira, Antônio Prates

    2016-11-01

    The present study aimed to identify factors associated with self-reported acute and chronic voice disorders among municipal elementary school teachers in the city of Montes Claros, in the State of Minas Gerais, Brazil. The dependent variable, self-reported dysphonia, was determined via a single question, "Have you noticed changes in your voice quality?" and if so, a follow-up question queried the duration of this change, acute or chronic. The independent variables were dichotomized and divided into five categories: sociodemographic and economic data; lifestyle; organizational and environmental data; health-disease processes; and voice. Analyses of associated factors were performed via a hierarchical multiple logistic regression model. The present study included 226 teachers, of whom 38.9% reported no voice disorders, 35.4% reported an acute disorder, and 25.7% reported a chronic disorder. Excessive voice use daily, consuming more than one alcoholic drink per time, and seeking medical treatment because of voice disorders were associated factors for acute and chronic voice disorders. Consuming up to three glasses of water per day was associated with acute voice disorders. Among teachers who reported chronic voice disorders, teaching for over 15 years and the perception of disturbing or unbearable noise outside the school were both associated factors. Identification of organizational, environmental, and predisposing risk factors for voice disorders is critical, and furthermore, a vocal health promotion program may address these issues. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  3. The effect of network degradation on speech recognition

    CSIR Research Space (South Africa)

    Joubert, G

    2005-11-01

    Full Text Available ...become increasingly popular, VoIP (Voice over Internet Protocol) is predicted to become the standard means of spoken telecommunication. As a consequence, a significant amount of research has been undertaken on the effect of various packet... to measure the effect of network traffic degradation during a VoIP transmission on speech-recognition accuracy. Sentences from the TIMIT database [2] were selected as the basis for comparison. The open-source toolkit SOX [3] was used to code the samples...
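    The kind of packet-level degradation this study measures can be simulated directly on a PCM sample stream. A toy sketch of silence-substitution packet loss (illustrative only; the study's actual pipeline used SOX and TIMIT recordings, and real VoIP receivers apply concealment rather than plain silence):

    ```python
    import random

    def drop_packets(samples, packet_len=160, loss_rate=0.1, seed=0):
        """Simulate random packet loss on a PCM sample stream: split into
        fixed-size packets (160 samples = 20 ms at 8 kHz) and zero out
        lost packets, mimicking silence substitution in a receiver."""
        rng = random.Random(seed)
        out = []
        for i in range(0, len(samples), packet_len):
            packet = samples[i:i + packet_len]
            if rng.random() < loss_rate:
                packet = [0] * len(packet)   # lost packet -> silence
            out.extend(packet)
        return out

    degraded = drop_packets(list(range(1600)), loss_rate=0.2)
    print(len(degraded))   # stream length is preserved; content is degraded
    ```

    Feeding the degraded stream to a recognizer at several loss rates would reproduce the shape of the experiment, if not its exact conditions.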

  4. Reproducibility of Automated Voice Range Profiles, a Systematic Literature Review

    DEFF Research Database (Denmark)

    Printz, Trine; Rosenberg, Tine; Godballe, Christian

    2018-01-01

    Objective: Reliable voice range profiles are of great importance when measuring effects and side effects from surgery affecting voice capacity. Automated recording systems are increasingly used, but the reproducibility of results is uncertain. Our objective was to identify and review the existing literature on test-retest accuracy of the automated voice range profile assessment. Study design: Systematic review. Data sources: PubMed, Scopus, Cochrane Library, ComDisDome, Embase, and CINAHL (EBSCO). Methods: We conducted a systematic literature search of six databases from 1983 to 2016. The following keywords were used: phonetogram, voice range profile, and acoustic voice analysis. Inclusion criteria were automated recording procedure, healthy voices, and no intervention between test and retest. Test-retest values concerning fundamental frequency and voice intensity were reviewed. Results: Of 483...
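    A voice range profile (phonetogram) of the kind reviewed above maps fundamental frequency against the softest and loudest intensities a speaker produces. A hypothetical sketch of the tabulation step (the function name and semitone binning are assumptions for illustration, not taken from any reviewed system):

    ```python
    def voice_range_profile(measurements):
        """Tabulate a voice range profile: for each semitone bin of
        fundamental frequency, keep the minimum and maximum sound
        pressure level observed.
        `measurements` is an iterable of (f0_semitone, spl_db) pairs."""
        profile = {}
        for st, spl in measurements:
            lo, hi = profile.get(st, (spl, spl))
            profile[st] = (min(lo, spl), max(hi, spl))
        return profile

    # Hypothetical phonation samples: (semitone bin, SPL in dB)
    data = [(45, 55.0), (45, 78.0), (46, 60.0), (45, 52.5)]
    print(voice_range_profile(data))   # {45: (52.5, 78.0), 46: (60.0, 60.0)}
    ```

    Test-retest comparison would then amount to comparing the per-bin extremes between two recording sessions.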

  5. Challenging ocular image recognition

    Science.gov (United States)

    Pauca, V. Paúl; Forkin, Michael; Xu, Xiao; Plemmons, Robert; Ross, Arun A.

    2011-06-01

    Ocular recognition is a new area of biometric investigation targeted at overcoming the limitations of iris recognition performance in the presence of non-ideal data. There are several advantages to extending the region of interest beyond the iris, yet there are also key issues that must be addressed, such as the size of the ocular region, factors affecting performance, and appropriate corpora to study these factors in isolation. In this paper, we explore and identify some of these issues with the goal of better defining parameters for ocular recognition. An empirical study is performed where iris recognition methods are contrasted with texture and point operators on existing iris and face datasets. The experimental results show a dramatic recognition performance gain when additional features are considered in the presence of poor quality iris data, offering strong evidence for extending interest beyond the iris. The experiments also highlight the need for the direct collection of additional ocular imagery.

  6. Designing a Voice Controlled Interface For Radio : Guidelines for The First Generation of Voice Controlled Public Radio

    OpenAIRE

    Päärni, Anna

    2017-01-01

    From being a fictional element in sci-fi, voice control has become a reality, with inventions such as Apple's Siri, and interactive voice response (IVR) when calling your doctor's office. The combination of radio's strength as a hands-free medium, public radio's mission to reach across all platforms, and the rise of voice makes up a relevant intersection: voice-controlled public radio in Sweden. This thesis has aimed to investigate how radio listeners wish to interact using voice control to li...

  7. Bihippocampal damage with emotional dysfunction: impaired auditory recognition of fear.

    Science.gov (United States)

    Ghika-Schmid, F; Ghika, J; Vuilleumier, P; Assal, G; Vuadens, P; Scherer, K; Maeder, P; Uske, A; Bogousslavsky, J

    1997-01-01

    A right-handed man developed a sudden transient, amnestic syndrome associated with bilateral hemorrhage of the hippocampi, probably due to Urbach-Wiethe disease. In the 3rd month, despite significant hippocampal structural damage on imaging, only a milder degree of retrograde and anterograde amnesia persisted on detailed neuropsychological examination. On systematic testing of recognition of facial and vocal expression of emotion, we found an impairment of the vocal perception of fear, but not of other emotions such as joy, sadness and anger. Such selective impairment of fear perception was not present in the recognition of facial expression of emotion. Thus emotional perception varies according to the different aspects of emotions and the modality of presentation (faces versus voices). This is consistent with the idea that there may be multiple emotion systems. The study of emotional perception in this unique case of bilateral involvement of the hippocampus suggests that this structure may play a critical role in the recognition of fear in vocal expression, possibly dissociated from that of other emotions and from that of fear in facial expression. In light of recent data suggesting that the amygdala plays a role in the recognition of fear in the auditory as well as the visual modality, this could indicate that the hippocampus is part of the auditory pathway of fear recognition.

  8. Recognition in Programmes for Children with Special Needs

    Directory of Open Access Journals (Sweden)

    Marjeta Šmid

    2016-09-01

    The purpose of this article is to examine the factors that affect the inclusion of pupils in programmes for children with special needs from the perspective of the theory of recognition. The concept of recognition, which includes three aspects of social justice (economic, cultural and political), argues that the institutional arrangements that prevent 'parity of participation' in the school social life of children with special needs are affected not only by economic distribution but also by patterns of cultural values. A review of the literature shows that the education of children with special needs is shaped primarily by cultural value patterns of capability and inferiority, as well as by stereotypical images of children with special needs. Because of the heavy emphasis on learning skills for academic knowledge and grades, less attention is dedicated to factors of recognition and representation, making it impossible to improve some meaningful elements of inclusion: the participation of pupils in activities, the voices of the children, the visibility of the children through achievements, and the arbitrariness of the boundaries drawn between programmes. Drawing on these theories, actions that could contribute to better inclusion are then reviewed. An effective approach to change would be to create transformative conditions for recognition and to balance redistribution, recognition, and representation.

  9. The medical software quality deployment method.

    Science.gov (United States)

    Hallberg, N; Timpka, T; Eriksson, H

    1999-03-01

    The objective of this study was to develop a Quality Function Deployment (QFD) model for the design of information systems in health-care environments. Consecutive blocked-subject case studies were conducted, based on action research methods. Starting from a QFD model for software development, a model for information system design, the Medical Software Quality Deployment (MSQD) model, was developed. The MSQD model was divided into the pre-study phase, in which the customer categories and their power to influence the design are determined; the data collection phase, in which the voice of the customer (VoC) is identified by observations and interviews and quantified by Critical Incident questionnaires; the need specification phase, where the VoC is specified into ranked customer needs; and the design phase, where the customer needs are transformed stepwise into technical requirements and design attributes. QFD proved useful for integrating the values of different customer categories in software development for health-care settings. In the later design phases, other quality methods should be used for software implementation and testing.
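    The stepwise transformation from ranked customer needs to technical requirements is, at its core, the weighted-matrix arithmetic of a QFD "house of quality": each technical requirement's priority is the sum, over all customer needs, of the need's weight times the strength of its relationship to that requirement. A minimal sketch in Python (the example needs, requirements, and weights are invented for illustration and are not taken from the MSQD study):

```python
# Minimal sketch of the core QFD "house of quality" computation:
# technical-requirement priority = sum over customer needs of
# (need weight) x (need-to-requirement relationship strength).
# All names and numbers below are illustrative, not from the MSQD study.

# Ranked customer needs (VoC) with importance weights, e.g. on a 1-5 scale.
needs = {"fast record lookup": 5, "audit trail": 3, "simple data entry": 4}

# Relationship strengths (conventional QFD values: 9 strong, 3 medium, 1 weak).
relationships = {
    "response time < 2 s":  {"fast record lookup": 9, "simple data entry": 3},
    "event logging":        {"audit trail": 9},
    "form auto-completion": {"simple data entry": 9, "fast record lookup": 1},
}

def prioritize(needs, relationships):
    """Return technical requirements ranked by weighted importance."""
    scores = {
        req: sum(needs[n] * strength for n, strength in rel.items())
        for req, rel in relationships.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

for req, score in prioritize(needs, relationships):
    print(f"{score:4d}  {req}")
```

    With these illustrative weights, "response time < 2 s" scores 5×9 + 4×3 = 57 and ranks first, which is the kind of ranked output the design phase would then carry forward into design attributes.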

  10. Image processing and analysis software development

    International Nuclear Information System (INIS)

    Shahnaz, R.

    1999-01-01

    The work presented in this project is aimed at developing a software, 'IMAGE GALLERY', to investigate various image processing and analysis techniques. The work was divided into two parts, namely image processing techniques and pattern recognition, the latter comprising character and face recognition. Various image enhancement techniques, including negative imaging, contrast stretching, compression of dynamic range, neon, diffuse, and emboss effects, have been studied. Segmentation techniques including point detection, line detection, and edge detection have been studied, and some smoothing and sharpening filters have been investigated. All these imaging techniques have been implemented in a window-based computer program written in Visual Basic. Neural network techniques based on the Perceptron model have been applied for face and character recognition. (author)
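    Two of the enhancement techniques named above have simple per-pixel definitions: a negative maps each intensity v to (max − v), and linear contrast stretching rescales the occupied intensity range onto the full [0, 255] range. A brief Python sketch on a grayscale image stored as a list of row lists (the project's Visual Basic implementation is not available, so this is an illustrative restatement, not the project's code):

```python
# Per-pixel image enhancement sketches for an 8-bit grayscale image
# represented as a list of row lists. Illustrative only; the original
# project implemented these techniques in Visual Basic.

def negative(image, max_val=255):
    """Negative imaging: invert each intensity, v -> max_val - v."""
    return [[max_val - v for v in row] for row in image]

def contrast_stretch(image, out_min=0, out_max=255):
    """Linear contrast stretching: map the occupied intensity range
    [lo, hi] onto the full output range [out_min, out_max]."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:                       # flat image: nothing to stretch
        return [row[:] for row in image]
    scale = (out_max - out_min) / (hi - lo)
    return [[round(out_min + (v - lo) * scale) for v in row]
            for row in image]

img = [[50, 100],
       [150, 200]]
print(negative(img))          # → [[205, 155], [105, 55]]
print(contrast_stretch(img))  # → [[0, 85], [170, 255]]
```

    The stretched image uses the full 0-255 range, which is exactly the visual effect contrast stretching is meant to produce on a low-contrast input.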

  11. Ensuring Software IP Cleanliness

    Directory of Open Access Journals (Sweden)

    Mahshad Koohgoli

    2007-12-01

    At many points in the life of a software enterprise, determination of intellectual property (IP) cleanliness becomes critical. The value of an enterprise that develops and sells software may depend on how clean the software is from the IP perspective. This article examines various methods of ensuring software IP cleanliness and discusses some of the benefits and shortcomings of current solutions.

  12. Commercial Literacy Software.

    Science.gov (United States)

    Balajthy, Ernest

    1997-01-01

    Presents the first year's results of a continuing project to monitor the availability of software of relevance for literacy education purposes. Concludes there is an enormous amount of software available for use by teachers of reading and literacy--whereas drill-and-practice software is the largest category of software available, large numbers of…

  13. Ensuring Software IP Cleanliness

    OpenAIRE

    Mahshad Koohgoli; Richard Mayer

    2007-01-01

    At many points in the life of a software enterprise, determination of intellectual property (IP) cleanliness becomes critical. The value of an enterprise that develops and sells software may depend on how clean the software is from the IP perspective. This article examines various methods of ensuring software IP cleanliness and discusses some of the benefits and shortcomings of current solutions.

  14. Statistical Software Engineering

    Science.gov (United States)

    1998-04-13

    ...multiversion software subject to coincident errors. IEEE Trans. Software Eng. SE-11:1511-1517. Eckhardt, D.E., A.K. Caglayan, J.C. Knight, L.D. Lee, D.F... J.C. and N.G. Leveson. 1986. Experimental evaluation of the assumption of independence in multiversion software. IEEE Trans. Software...

  15. Agile Software Development

    Science.gov (United States)

    Biju, Soly Mathew

    2008-01-01

    Many software development firms are now adopting the agile software development method. This method involves the customer at every level of software development, thus reducing the impact of change in the requirement at a later stage. In this article, the principles of the agile method for software development are explored and there is a focus on…

  16. Improving Software Developer's Competence

    DEFF Research Database (Denmark)

    Abrahamsson, Pekka; Kautz, Karlheinz; Sieppi, Heikki

    2002-01-01

    Emerging agile software development methods are people oriented development approaches to be used by the software industry. The personal software process (PSP) is an accepted method for improving the capabilities of a single software engineer. Five original hypotheses regarding the impact...

  17. Software - Naval Oceanography Portal

    Science.gov (United States)

    USNO Earth Orientation software page: auxiliary and supporting software downloads and the Earth Orientation Matrix Calculator, alongside GPS-based and VLBI-based Earth Orientation products and publications.

  18. Software Engineering Education Directory

    Science.gov (United States)

    1990-04-01

    ...and Engineering (CMSC 735) Codes: GPEV2 * Textbooks: IEEE Tutorial on Models and Metrics for Software Management and Engineering by Basili, Victor R... Software Engineering (Comp 227) Codes: GPRY5 Textbooks: IEEE Tutorial on Software Design Techniques by Freeman, Peter and Wasserman, Anthony I. Software...

  19. Great software debates

    CERN Document Server

    Davis, A

    2004-01-01

    The industry’s most outspoken and insightful critic explains how the software industry REALLY works. In Great Software Debates, Al Davis shares what he has learned about the difference between the theory and the realities of business and encourages you to question and think about software engineering in ways that will help you succeed where others fail. In short, provocative essays, Davis fearlessly reveals the truth about process improvement, productivity, software quality, metrics, agile development, requirements documentation, modeling, software marketing and sales, empiricism, start-up financing, software research, requirements triage, software estimation, and entrepreneurship.

  20. Clinical Features of Psychogenic Voice Disorder and the Efficiency of Voice Therapy and Psychological Evaluation.

    Science.gov (United States)

    Tezcaner, Zahide Çiler; Gökmen, Muhammed Fatih; Yıldırım, Sibel; Dursun, Gürsel

    2017-11-06

    The aim of this study was to define the clinical features of psychogenic voice disorder (PVD) and explore the treatment efficiency of voice therapy and psychological evaluation. Fifty-eight patients who received treatment following a PVD diagnosis and had no organic or other functional voice disorders were assessed retrospectively based on laryngoscopic examinations and subjective and objective assessments. Epidemiological characteristics, accompanying organic and psychological disorders, preferred methods of treatment, and previous treatment outcomes were examined for each patient. Voice disorders and responses to treatment were compared between patients who received psychological evaluation and those who did not. Participants comprised 58 patients, 10 male and 48 female. Voice therapy was applied in all patients, 54 (93.1%) of whom had improvement in their voice. Although all patients were advised to undergo psychological assessment, only 60.3% (35/58) did so. No statistically significant difference in treatment response was found between patients who received psychological support and those who did not. Relapse occurred in 14.7% (5/34) of the patients who applied for psychological assessment and in 50% (10/20) of those who did not, a statistically significant difference in relapse rates, which were higher among patients who did not receive psychological support. Voice therapy is an efficient treatment method for PVD. However, in long-term follow-up, relapse is observed more often among patients who fail to follow up on the recommendation for psychological assessment. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
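    The relapse comparison above forms a 2×2 contingency table (5/34 relapses with psychological assessment vs. 10/20 without), so its significance can be checked with an ordinary chi-squared test of independence. A self-contained Python sketch (the abstract does not state which statistical test the authors used, so the choice of chi-squared here is an assumption for illustration):

```python
# Chi-squared test of independence for the relapse figures quoted in
# the abstract: 5/34 relapses with psychological assessment vs. 10/20
# without. The abstract does not name the authors' test; chi-squared
# is used here purely as an illustration.

def chi_squared_2x2(table):
    """Pearson chi-squared statistic for a 2x2 table [[a, b], [c, d]]."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat

#            relapse  no relapse
table = [[5, 29],   # underwent psychological assessment
         [10, 10]]  # did not
stat = chi_squared_2x2(table)
print(f"chi-squared = {stat:.2f}")  # → chi-squared = 7.82
```

    The statistic (about 7.82 with 1 degree of freedom) exceeds the 3.84 critical value at alpha = 0.05, consistent with the abstract's report of a statistically significant difference in relapse rates.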