Goeritno, Arief; Ginting, Sandy Ferdiansyah; Yatim, Rakhmad
A minimum system based on a microcontroller and a voice recognition (VR) sensor has been used as an actuator controller for operating single-phase electrical loads. The minimum system is built in two stages: (a) the circuit diagram and physical form of the board, and (b) integrated wiring of the minimum system to the ATmega16 microcontroller system. The microcontroller system in the minimum system requires an embedded program, created through programming based on the language ...
This thesis investigates the current status of VR technology, its use in support of Joint Vision 2010, and its use in the healthcare environment, and provides an analysis of the VR Pilot Project at NHRR...
Sferrella, Sheila M
You need a compelling reason to implement voice recognition technology. At my institution, the compelling reason was a turnaround time for Radiology results of more than two days. Only 41 percent of our reports were transcribed and signed within 24 hours. In November 1998, a team from Lehigh Valley Hospital went to RSNA and reviewed every voice system on the market. The evaluation was done with the radiologist workflow in mind, and we came back from the meeting with the vendor selection completed. The next steps included developing a business plan, approval of funds, reference calls to more than 15 sites and contract negotiation, all of which took about six months. The department of Radiology at Lehigh Valley Hospital and Health Network (LVHHN) is a multi-site center that performs over 360,000 procedures annually. The department handles all modalities of radiology: general diagnosis, neuroradiology, ultrasound, CT scan, MRI, interventional radiology, arthrography, myelography, bone densitometry, nuclear medicine, PET imaging, vascular lab and other advanced procedures. The department consists of 200 FTEs and a medical staff of more than 40 radiologists. The budget is in the $10.3 million range. There are three hospital sites and four outpatient imaging center sites where services are provided. At Lehigh Valley Hospital, radiologists are not dedicated to one subspecialty, so implementing a voice system by modality was not an option. Because transcription was so far behind, we needed to eliminate that part of the process. As a result, we decided to deploy the system all at once and with the radiologists as editors. The planning and testing phase took about four months, and the implementation took two weeks. We deployed over 40 workstations and trained close to 50 physicians. The radiologists brought in an extra radiologist from our group for the two weeks of training. That allowed us to train without taking a radiologist out of the department. We trained three to six
Bahreini, Kiavash; Nadolski, Rob; Westera, Wim
This paper introduces the voice emotion recognition part of our framework for improving learning through webcams and microphones (FILTWAM). This framework enables multimodal emotion recognition of learners during game-based learning. The main goal of this study is to validate the use of microphone
Bhan, S.N.; Coblentz, C.L.; Norman, G.R.; Ali, S.H.
To study the effect that voice recognition (VR) has on radiologist reporting efficiency in a clinical setting and to identify variables associated with faster reporting time. Five radiologists were observed during the routine reporting of 402 plain radiograph studies using either VR (n = 217) or conventional dictation (CD) (n = 185). Two radiologists were observed reporting 66 computed tomography (CT) studies using either VR (n = 39) or CD (n = 27). The time spent per reporting cycle, defined as the radiologist's time spent on a study from report finalization to the subsequent report finalization, was compared. As well, characteristics about the radiologist and their reporting style were collected and correlated against reporting time. For plain radiographs, radiologists took 134% (P = 0.048) more time to produce reports using VR, but there was significant variability between radiologists. Significant associations with faster reporting times using VR included: English as a first language (r = -0.24), use of a template (r = -0.34), use of a headset microphone (r = -0.46), and increased experience with VR (r = -0.43). Experience as a staff radiologist and having a previous study for comparison did not correlate with reporting time. For CT, there was no significant difference in reporting time identified between VR and CD (P = 0.61). Overall, VR slightly decreases the reporting efficiency of radiologists. However, efficiency may be improved if English is a first language and if a headset microphone, macros and templates are used.
Behavioral studies of spoken word memory have shown that context congruency facilitates both word and source recognition, though the level at which context exerts its influence remains equivocal. We measured event-related potentials (ERPs) while participants performed both types of recognition task with words spoken in four voices. Two voice parameters (i.e., gender and accent) varied between speakers, with the possibility that none, one or two of these parameters was congruent between study and test. Results indicated that reinstating the study voice at test facilitated both word and source recognition, compared to similar or no context congruency at test. Behavioral effects were paralleled by two ERP modulations. First, in the word recognition test, the left parietal old/new effect showed a positive deflection reflective of context congruency between study and test words. Namely, the same speaker condition provided the most positive deflection of all correctly identified old words. In the source recognition test, a right frontal positivity was found for the same speaker condition compared to the different speaker conditions, regardless of response success. Taken together, the results of this study suggest that the benefit of context congruency is reflected behaviorally and in ERP modulations traditionally associated with recognition memory.
Campeanu, Sandra; Craik, Fergus I M; Alain, Claude
Behavioral studies of spoken word memory have shown that context congruency facilitates both word and source recognition, though the level at which context exerts its influence remains equivocal. We measured event-related potentials (ERPs) while participants performed both types of recognition task with words spoken in four voices. Two voice parameters (i.e., gender and accent) varied between speakers, with the possibility that none, one or two of these parameters was congruent between study and test. Results indicated that reinstating the study voice at test facilitated both word and source recognition, compared to similar or no context congruency at test. Behavioral effects were paralleled by two ERP modulations. First, in the word recognition test, the left parietal old/new effect showed a positive deflection reflective of context congruency between study and test words. Namely, the same speaker condition provided the most positive deflection of all correctly identified old words. In the source recognition test, a right frontal positivity was found for the same speaker condition compared to the different speaker conditions, regardless of response success. Taken together, the results of this study suggest that the benefit of context congruency is reflected behaviorally and in ERP modulations traditionally associated with recognition memory.
Rana, D.S.; Hurst, G.; Shepstone, L.; Pilling, J.; Cockburn, J.; Crawford, M.
AIM: To compare the efficiency and accuracy of radiology reports generated by voice recognition (VR) against the traditional tape dictation-transcription (DT) method. MATERIALS AND METHODS: Two hundred and twenty previously reported computed radiography (CR) and cross-sectional imaging (CSI) examinations were separately entered into the Radiology Information System (RIS) using both VR and DT. The times taken and errors found in the reports were compared using univariate analyses based upon the sign-test, and a general linear model constructed to examine the mean differences between the two methods. RESULTS: There were significant reductions (p<0.001) in the mean difference in the reporting times using VR compared with DT for the two reporting methods assessed (CR, +67.4; CSI, +122.1 s). There was a significant increase in the mean difference in the actual radiologist times using VR compared with DT in the CSI reports; -14.3 s, p=0.037 (more experienced user); -13.7 s, p=0.014 (less experienced user). There were significantly more total and major errors when using VR compared with DT for CR reports (-0.25 and -0.26, respectively), and in total errors for CSI (-0.75, p<0.001), but no difference in major errors (-0.16, p=0.168). Although there were significantly more errors with VR in the less experienced group of users (mean difference in total errors -0.90, and major errors -0.40, p<0.001), there was no significant difference in the more experienced (p=0.419 and p=0.814, respectively). CONCLUSIONS: VR is a viable reporting method for experienced users, with a quicker overall report production time (despite an increase in the radiologists' time) and a tendency to more errors for inexperienced users
Liu, Ran R.; Pancaroglu, Raika; Hills, Charlotte S.; Duchaine, Brad; Barton, Jason J. S.
Right or bilateral anterior temporal damage can impair face recognition, but whether this is an associative variant of prosopagnosia or part of a multimodal disorder of person recognition is an unsettled question, with implications for cognitive and neuroanatomic models of person recognition. We assessed voice perception and short-term recognition of recently heard voices in 10 subjects with impaired face recognition acquired after cerebral lesions. All 4 subjects with apperceptive prosopagnosia due to lesions limited to fusiform cortex had intact voice discrimination and recognition. One subject with bilateral fusiform and anterior temporal lesions had a combined apperceptive prosopagnosia and apperceptive phonagnosia, the first such described case. Deficits indicating a multimodal syndrome of person recognition were found only in 2 subjects with bilateral anterior temporal lesions. All 3 subjects with right anterior temporal lesions had normal voice perception and recognition, 2 of whom performed normally on perceptual discrimination of faces. This confirms that such lesions can cause a modality-specific associative prosopagnosia. PMID:25349193
Strahan, Rodney H; Schneider-Kolsky, Michal E
Despite the frequent introduction of voice recognition (VR) into radiology departments, little evidence still exists about its impact on workflow, error rates and costs. We designed a study to compare typographical errors, turnaround times (TAT) from reported to verified and productivity for VR-generated reports versus transcriptionist-generated reports in MRI. Fifty MRI reports generated by VR and 50 finalized MRI reports generated by the transcriptionist, of two radiologists, were sampled retrospectively. Two hundred reports were scrutinised for typographical errors and the average TAT from dictated to final approval. To assess productivity, the average MRI reports per hour for one of the radiologists was calculated using data from extra weekend reporting sessions. Forty-two percent and 30% of the finalized VR reports for each of the radiologists investigated contained errors. Only 6% and 8% of the transcriptionist-generated reports contained errors. The average TAT for VR was 0 h, and for the transcriptionist reports TAT was 89 and 38.9 h. Productivity was calculated at 8.6 MRI reports per hour using VR and 13.3 MRI reports per hour using the transcriptionist, representing a 55% increase in productivity. Our results demonstrate that VR is not an effective method of generating reports for MRI. Ideally, we would have the report error rate and productivity of a transcriptionist and the TAT of VR. © 2010 The Authors. Journal of Medical Imaging and Radiation Oncology © 2010 The Royal Australian and New Zealand College of Radiologists.
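The 55% figure can be checked from the two reported rates. The short sketch below assumes the percentage is the relative difference of the transcriptionist rate over the VR rate; that reading is an interpretation, not something the abstract states, and the function name `relative_change` is introduced here only for illustration.

```python
# Reported productivity figures (MRI reports per hour) from the abstract.
vr_rate = 8.6
transcriptionist_rate = 13.3


def relative_change(baseline, other):
    """Percentage change of `other` relative to `baseline`."""
    return (other - baseline) / baseline * 100.0


# Assumed reading: transcriptionist productivity relative to VR.
increase = relative_change(vr_rate, transcriptionist_rate)

# The same gap expressed the other way round: VR relative to transcriptionist.
decrease = relative_change(transcriptionist_rate, vr_rate)

print(f"transcriptionist vs VR: {increase:+.1f}%")
print(f"VR vs transcriptionist: {decrease:+.1f}%")
```

Under this reading the increase rounds to 55%, which matches the abstract; the same gap expressed from the transcriptionist baseline is a smaller percentage, so the direction of comparison matters when quoting the figure.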
Strahan, Rodney H.; Schneider-Kolsky, Michal E.
Purpose: Despite the frequent introduction of voice recognition (VR) into radiology departments, little evidence still exists about its impact on workflow, error rates and costs. We designed a study to compare typographical errors, turnaround times (TAT) from reported to verified and productivity for VR-generated reports versus transcriptionist-generated reports in MRI. Methods: Fifty MRI reports generated by VR and 50 finalised MRI reports generated by the transcriptionist, of two radiologists, were sampled retrospectively. Two hundred reports were scrutinised for typographical errors and the average TAT from dictated to final approval. To assess productivity, the average MRI reports per hour for one of the radiologists was calculated using data from extra weekend reporting sessions. Results: Forty-two percent and 30% of the finalised VR reports for each of the radiologists investigated contained errors. Only 6% and 8% of the transcriptionist-generated reports contained errors. The average TAT for VR was 0 h, and for the transcriptionist reports TAT was 89 and 38.9 h. Productivity was calculated at 8.6 MRI reports per hour using VR and 13.3 MRI reports per hour using the transcriptionist, representing a 55% increase in productivity. Conclusion: Our results demonstrate that VR is not an effective method of generating reports for MRI. Ideally, we would have the report error rate and productivity of a transcriptionist and the TAT of VR.
Katharina von Kriegstein
Natural objects provide partially redundant information to the brain through different sensory modalities. For example, voices and faces both give information about the speech content, age, and gender of a person. Thanks to this redundancy, multimodal recognition is fast, robust, and automatic. In unimodal perception, however, only part of the information about an object is available. Here, we addressed whether, even under conditions of unimodal sensory input, crossmodal neural circuits that have been shaped by previous associative learning become activated and underpin a performance benefit. We measured brain activity with functional magnetic resonance imaging before, during, and after participants learned to associate either sensory redundant stimuli, i.e. voices and faces, or arbitrary multimodal combinations, i.e. voices and written names, ring tones, and cell phones or brand names of these cell phones. After learning, participants were better at recognizing unimodal auditory voices that had been paired with faces than those paired with written names, and association of voices with faces resulted in an increased functional coupling between voice and face areas. No such effects were observed for ring tones that had been paired with cell phones or names. These findings demonstrate that brief exposure to ecologically valid and sensory redundant stimulus pairs, such as voices and faces, induces specific multisensory associations. Consistent with predictive coding theories, associative representations become thereafter available for unimodal perception and facilitate object recognition. These data suggest that for natural objects effective predictive signals can be generated across sensory systems and proceed by optimization of functional connectivity between specialized cortical sensory modules.
Higgins, Alan; Bahler, L.; Porter, J.; Blais, P.
This paper describes an automated method of comparing a voice sample of an unknown individual with samples from known speakers in order to establish or verify the individual's identity. The method is based on a statistical pattern matching approach that employs a simple training procedure, requires no human intervention (transcription, word or phonetic marking, etc.), and makes no assumptions regarding the expected form of the statistical distributions of the observations. The content of the speech material (vocabulary, grammar, etc.) is not assumed to be constrained in any way. An algorithm is described which incorporates frame pruning and channel equalization processes designed to achieve robust performance with reasonable computational resources. An experimental implementation demonstrating the feasibility of the concept is described.
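The abstract does not give the algorithm itself, so the following is only a rough sketch of how distribution-free, text-independent matching of this kind could look. Everything in it is an assumption for illustration: channel equalization is approximated by per-dimension mean subtraction, frame pruning by discarding the worst-matching frames, and the names `equalize`, `match_score`, `identify` and the toy feature vectors are hypothetical.

```python
import math


def equalize(frames):
    """Subtract the per-dimension mean from every frame.

    Assumption: a crude stand-in for channel equalization, removing a
    constant offset that a recording channel adds to every frame.
    """
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / len(frames) for d in range(dims)]
    return [[f[d] - means[d] for d in range(dims)] for f in frames]


def match_score(unknown, reference, keep_fraction=0.8):
    """Average nearest-neighbour distance from unknown frames to reference frames.

    No distribution is fitted; frames are compared directly. The worst-matching
    frames are dropped (assumed form of frame pruning). Lower is better.
    """
    u = equalize(unknown)
    r = equalize(reference)
    distances = sorted(min(math.dist(f, ref) for ref in r) for f in u)
    kept = distances[: max(1, int(len(distances) * keep_fraction))]
    return sum(kept) / len(kept)


def identify(unknown, enrolled):
    """Return the enrolled speaker whose frames match the unknown sample best."""
    return min(enrolled, key=lambda name: match_score(unknown, enrolled[name]))


# Toy 2-D "feature frames" (hypothetical values, not real speech features).
enrolled = {
    "speaker_a": [[0.0, 0.0], [2.0, 0.0], [4.0, 0.0], [6.0, 0.0]],
    "speaker_b": [[0.0, 0.0], [0.0, 2.0], [0.0, 4.0], [0.0, 6.0]],
}
# Same pattern as speaker_a, shifted by a constant channel offset.
unknown = [[10.1, 5.0], [12.0, 5.1], [13.9, 5.0], [16.0, 4.9]]

print(identify(unknown, enrolled))
```

Because the constant offset is removed before matching, the unknown sample is attributed to the speaker whose frame pattern it shares even though the raw values differ; verification rather than identification would compare `match_score` against a threshold instead of taking the minimum.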
Roswandowitz, Claudia; Kappes, Claudia; Obrig, Hellmuth; von Kriegstein, Katharina
Recognizing the identity of others by their voice is an important skill for social interactions. To date, it remains controversial which parts of the brain are critical structures for this skill. Based on neuroimaging findings, standard models of person-identity recognition suggest that the right temporal lobe is the hub for voice-identity recognition. Neuropsychological case studies, however, reported selective deficits of voice-identity recognition in patients predominantly with right inferior parietal lobe lesions. Here, our aim was to work towards resolving the discrepancy between neuroimaging studies and neuropsychological case studies to find out which brain structures are critical for voice-identity recognition in humans. We performed a voxel-based lesion-behaviour mapping study in a cohort of patients (n = 58) with unilateral focal brain lesions. The study included a comprehensive behavioural test battery on voice-identity recognition of newly learned (voice-name, voice-face association learning) and familiar voices (famous voice recognition) as well as visual (face-identity recognition) and acoustic control tests (vocal-pitch and vocal-timbre discrimination). The study also comprised clinically established tests (neuropsychological assessment, audiometry) and high-resolution structural brain images. The three key findings were: (i) a strong association between voice-identity recognition performance and right posterior/mid temporal and right inferior parietal lobe lesions; (ii) a selective association between right posterior/mid temporal lobe lesions and voice-identity recognition performance when face-identity recognition performance was factored out; and (iii) an association of right inferior parietal lobe lesions with tasks requiring the association between voices and faces but not voices and names. The results imply that the right posterior/mid temporal lobe is an obligatory structure for voice-identity recognition, while the inferior parietal
Luis Miguel Mazaira-Fernández
Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g. YouTube) to broadcast its message. In this new scenario, classical identification methods (such as fingerprints or face recognition) have been forcedly replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. This paper presents a new methodology to characterize speakers. The methodology benefits from the advances achieved in recent years in understanding and modelling voice production. The paper hypothesizes that a gender-dependent characterization of speakers, combined with the use of a new set of biometric parameters extracted from the components resulting from the deconstruction of the voice into its glottal source and vocal tract estimates, will enhance recognition rates when compared to classical approaches. A general description of the main hypothesis and the methodology followed to extract gender-dependent extended biometric parameters is given. Experimental validation is carried out both on a highly controlled acoustic condition database, and on a mobile phone network recorded under non-controlled acoustic conditions.
Mazaira-Fernandez, Luis Miguel; Álvarez-Marquina, Agustín; Gómez-Vilda, Pedro
Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g., YouTube) to broadcast its message. In this new scenario, classical identification methods (such as fingerprints or face recognition) have been forcedly replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. The present study benefits from the advances achieved in recent years in understanding and modeling voice production. The paper hypothesizes that a gender-dependent characterization of speakers, combined with the use of a set of features derived from the components resulting from the deconstruction of the voice into its glottal source and vocal tract estimates, will enhance recognition rates when compared to classical approaches. A general description of the main hypothesis and the methodology followed to extract the gender-dependent extended biometric parameters is given. Experimental validation is carried out both on a highly controlled acoustic condition database, and on a mobile phone network recorded under non-controlled acoustic conditions. PMID:26442245
Roswandowitz, Claudia; Kappes, Claudia; Obrig, Hellmuth; von Kriegstein, Katharina
Recognizing the identity of others by their voice is an important skill for social interactions. To date, it remains controversial which parts of the brain are critical structures for this skill. Based on neuroimaging findings, standard models of person-identity recognition suggest that the right temporal lobe is the hub for voice-identity recognition. Neuropsychological case studies, however, reported selective deficits of voice-identity recognition in patients predominantly with right inferior parietal lobe lesions. Here, our aim was to work towards resolving the discrepancy between neuroimaging studies and neuropsychological case studies to find out which brain structures are critical for voice-identity recognition in humans. We performed a voxel-based lesion-behaviour mapping study in a cohort of patients (n = 58) with unilateral focal brain lesions. The study included a comprehensive behavioural test battery on voice-identity recognition of newly learned (voice-name, voice-face association learning) and familiar voices (famous voice recognition) as well as visual (face-identity recognition) and acoustic control tests (vocal-pitch and vocal-timbre discrimination). The study also comprised clinically established tests (neuropsychological assessment, audiometry) and high-resolution structural brain images. The three key findings were: (i) a strong association between voice-identity recognition performance and right posterior/mid temporal and right inferior parietal lobe lesions; (ii) a selective association between right posterior/mid temporal lobe lesions and voice-identity recognition performance when face-identity recognition performance was factored out; and (iii) an association of right inferior parietal lobe lesions with tasks requiring the association between voices and faces but not voices and names. The results imply that the right posterior/mid temporal lobe is an obligatory structure for voice-identity recognition, while the inferior parietal lobe is
Barsics, Catherine; Brédart, Serge
Autonoetic consciousness is a fundamental property of human memory, enabling us to experience mental time travel, to recollect past events with a feeling of self-involvement, and to project ourselves in the future. Autonoetic consciousness is a characteristic of episodic memory. By contrast, awareness of the past associated with a mere feeling of familiarity or knowing relies on noetic consciousness, depending on semantic memory integrity. The present research was aimed at evaluating whether conscious recollection of episodic memories is more likely to occur following the recognition of a familiar face than following the recognition of a familiar voice. Recall of semantic information (biographical information) was also assessed. Previous studies that investigated the recall of biographical information following person recognition used faces and voices of famous people as stimuli. In this study, the participants were presented with personally familiar people's voices and faces, thus avoiding the presence of identity cues in the spoken extracts and allowing a stricter control of frequency of exposure with both types of stimuli (voices and faces). In the present study, the rate of retrieved episodic memories, associated with autonoetic awareness, was significantly higher from familiar faces than familiar voices even though the level of overall recognition was similar for both stimulus domains. The same pattern was observed regarding semantic information retrieval. These results and their implications for current Interactive Activation and Competition person recognition models are discussed.
Laryngectomized patients cannot speak normally because their vocal cords have been removed. The easiest option for such patients to speak again is electrolarynx speech. The device is placed on the lower chin, and vibration of the neck while speaking is used to produce sound. Meanwhile, voice recognition technology has been growing very rapidly, and it is expected that this technology can also be used by laryngectomized patients who use an electrolarynx. This paper describes a system for electrolarynx speech recognition. The two main parts of the system are feature extraction and pattern recognition. A Pulse Coupled Neural Network (PCNN) is used to extract the features and characteristics of electrolarynx speech. The parameter β (one of the PCNN parameters) was also varied. A multilayer perceptron is used to recognize the sound patterns. Two kinds of recognition are conducted in this paper: speech recognition and speaker recognition. Speech recognition recognizes a specific utterance from any speaker, whereas speaker recognition recognizes a specific utterance from a specific person. The system ran well. The electrolarynx speech recognition was tested by recognizing "A" and "not A" voices; the results showed that the system had 94.4% validation. The electrolarynx speaker recognition was tested by recognizing the "saya" voice from several different speakers; the results showed that the system had 92.2% validation. The best β parameter of the PCNN for electrolarynx recognition is 3.
Rosa, Christine; Lassonde, Maryse; Pinard, Claudine; Keenan, Julian Paul; Belin, Pascal
Three experiments investigated functional asymmetries related to self-recognition in the domain of voices. In Experiment 1, participants were asked to identify one of three presented voices (self, familiar or unknown) by responding with either the right or the left-hand. In Experiment 2, participants were presented with auditory morphs between the…
Lenhart, Martha; Yancosek, Kathleen E
The goal of this pilot study is to assess the impact of training on voice recognition software as part of the rehabilitation process that Military patients with amputation, or peripheral nerve loss...
Zeng, Xia; Sang, Xinzhu; Chen, Duo; Wang, Peng; Guo, Nan; Yan, Binbin; Wang, Kuiru
Most current virtual reality (VR) interactions are realized with a hand-held input device, which leads to a low degree of presence. There are other solutions that use sensors such as Leap Motion to recognize the gestures of users in order to interact in a more natural way, but navigation in these systems is still a problem, because they fail to map actual walking to virtual walking when only part of the user's body is represented in the synthetic environment. Therefore, we propose a system in which users can walk around in the virtual environment as a humanoid model, selecting menu items and manipulating virtual objects using natural hand gestures. With a Kinect depth camera, the system tracks the joints of the user, mapping them to a full virtual body which follows the movement of the tracked user. The movements of the feet can be detected to determine whether the user is in a walking state, so that the walking of the model in the virtual world can be activated and stopped by means of animation control in the Unity engine. This method frees the hands of users compared with the traditional navigation method using a hand-held device. We use the point cloud data obtained from the Kinect depth camera to recognize the gestures of users, such as swiping, pressing and manipulating virtual objects. Combining full body tracking and gesture recognition using Kinect, we achieve our interactive VR system in the Unity engine with a high degree of presence.
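The abstract says only that foot movements are used to decide whether the user is walking. One way such a decision could be made is sketched below; the height-difference rule, the threshold, the window length and the class name `WalkDetector` are all assumptions for illustration, not the authors' method, and the sketch is in Python rather than in the Unity engine the system uses.

```python
from collections import deque


class WalkDetector:
    """Flags a walking state from tracked foot-joint heights.

    Assumption: a user is treated as walking if, within the last `window`
    frames, one foot was lifted above the other by more than `lift_threshold`
    (in metres). Both parameters are hypothetical tuning values.
    """

    def __init__(self, lift_threshold=0.05, window=10):
        self.lift_threshold = lift_threshold
        self.recent = deque(maxlen=window)  # per-frame "foot lifted" flags

    def update(self, left_foot_y, right_foot_y):
        """Feed one frame of foot heights; return True while walking."""
        lifted = abs(left_foot_y - right_foot_y) > self.lift_threshold
        self.recent.append(lifted)
        return any(self.recent)


# Toy sequence of (left, right) foot heights: standing, one step, standing.
frames = [(0.0, 0.0), (0.0, 0.01), (0.12, 0.0), (0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
detector = WalkDetector(lift_threshold=0.05, window=3)
states = [detector.update(left, right) for left, right in frames]
print(states)
```

The sliding window keeps the walking state active between individual steps, so an animation controller driven by this flag would not restart on every footfall; the flag drops only after a full window without a lifted foot.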
Stevenage, Sarah V; Neil, Greg J; Hamlin, Iain
The results of two experiments are presented in which participants engaged in a face-recognition or a voice-recognition task. The stimuli were face-voice pairs in which the face and voice were co-presented and were either "matched" (same person), "related" (two highly associated people), or "mismatched" (two unrelated people). Analysis in both experiments confirmed that accuracy and confidence in face recognition was consistently high regardless of the identity of the accompanying voice. However accuracy of voice recognition was increasingly affected as the relationship between voice and accompanying face declined. Moreover, when considering self-reported confidence in voice recognition, confidence remained high for correct responses despite the proportion of these responses declining across conditions. These results converged with existing evidence indicating the vulnerability of voice recognition as a relatively weak signaller of identity, and results are discussed in the context of a person-recognition framework.
Franky Hadinata Marpaung
The purpose of this research is to create a new kind of game by using technology that is rarely used in current games. It is developed as an entertainment medium and also as a social medium in which users can play the games together via multiplayer mode. This research uses the Scrum development method, since it supports small-scale development teams and incremental software delivery during development. Using this game application, users can play and watch interesting animations by controlling them with their voice, listen to the character imitating the user's voice, and play various mini games in either single player or multiplayer mode via a Bluetooth connection. The conclusion is that the game application My Name is Dug uses voice recognition and inter-device connection as its main features. It also has various mini games that support both single player and multiplayer modes.
Self-recognition, being indispensable for successful social communication, has become a major focus in current social neuroscience. The physical aspects of the self are most typically manifested in the face and voice. Compared with the wealth of studies on self-face recognition, self-voice recognition (SVR) has not gained much attention. Converging evidence has suggested that the fundamental frequency (F0) and formant structures serve as the key acoustic cues for other-voice recognition (OVR). However, little is known about which, and how, acoustic cues are utilized for SVR as opposed to OVR. To address this question, we independently manipulated the F0 and formant information of recorded voices and investigated their contributions to SVR and OVR. Japanese participants were presented with recorded vocal stimuli and were asked to identify the speaker, either themselves or one of their peers. Six groups of 5 peers of the same sex participated in the study. Under conditions where the formant information was fully preserved and where only the frequencies lower than the third formant (F3) were retained, accuracies of SVR deteriorated significantly with the modulation of the F0, and the results were comparable for OVR. By contrast, under a condition where only the frequencies higher than F3 were retained, the accuracy of SVR was significantly higher than that of OVR throughout the range of F0 modulations, and the F0 scarcely affected the accuracies of SVR and OVR. Our results indicate that while both F0 and formant information are involved in SVR, as well as in OVR, the advantage of SVR is manifested only when major formant information for speech intelligibility is absent. These findings imply the robustness of self-voice representation, possibly by virtue of auditory familiarity and other factors such as its association with motor/articulatory representation.
This paper proposes a multimodal biometric scheme for human authentication based on fusion of voice and face recognition. For voice recognition, three categories of features (statistical coefficients, cepstral coefficients and voice timbre) are used and compared. The voice identification modality is carried out using a Gaussian Mixture Model (GMM). For face recognition, three recognition methods (Eigenface, Linear Discriminant Analysis (LDA), and Gabor filter) are used and compared. The combination of the voice and face biometric systems into a single multimodal biometric system is performed using feature fusion and score fusion. This study shows that the best results are obtained using all the features (cepstral coefficients, statistical coefficients and voice timbre features) for voice recognition, the LDA face recognition method, and score fusion for the multimodal biometric system.
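The abstract above does not say how the voice and face scores are combined. As an illustration only, the following sketch assumes a common score-level fusion scheme: each matcher's scores are min-max normalised and then combined with a weighted sum. The function names and the `voice_weight` parameter are hypothetical and are not taken from the paper.

```python
def min_max_normalize(scores):
    """Rescale a list of raw matcher scores to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores identical: no identity is preferred by this matcher.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(voice_scores, face_scores, voice_weight=0.5):
    """Weighted-sum fusion of per-identity scores from two matchers.

    Assumption: higher scores mean a better match in both modalities,
    and both lists score the same enrolled identities in the same order.
    """
    if len(voice_scores) != len(face_scores):
        raise ValueError("both matchers must score the same identities")
    v = min_max_normalize(voice_scores)
    f = min_max_normalize(face_scores)
    return [voice_weight * a + (1.0 - voice_weight) * b for a, b in zip(v, f)]


def identify(voice_scores, face_scores, voice_weight=0.5):
    """Return the index of the identity with the highest fused score."""
    fused = fuse_scores(voice_scores, face_scores, voice_weight)
    return max(range(len(fused)), key=fused.__getitem__)
```

Normalising before fusing matters because the two matchers typically produce scores on different scales (for example, GMM log-likelihoods versus face-matcher similarities).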
Weiner, J. M.
A speech input/output system is presented that can be used to communicate with a task oriented system. Human speech commands and synthesized voice output extend conventional information exchange capabilities between man and machine by utilizing audio input and output channels. The speech input facility is comprised of a hardware feature extractor and a microprocessor implemented isolated word or phrase recognition system. The recognizer offers a medium sized (100 commands), syntactically constrained vocabulary, and exhibits close to real time performance. The major portion of the recognition processing required is accomplished through software, minimizing the complexity of the hardware feature extractor.
Campeanu, Sandra; Craik, Fergus I M; Backer, Kristina C; Alain, Claude
The present study was designed to examine listeners' ability to use voice information incidentally during spoken word recognition. We recorded event-related brain potentials (ERPs) during a continuous recognition paradigm in which participants indicated on each trial whether the spoken word was "new" or "old." Old items were presented at 2, 8 or 16 words following the first presentation. Context congruency was manipulated by having the same word repeated by either the same speaker or a different speaker. The different speaker could share the gender, accent or neither feature with the word presented the first time. Participants' accuracy was greater when the old word was spoken by the same speaker than by a different speaker. In addition, accuracy decreased with increasing lag. The correct identification of old words was accompanied by an enhanced late positivity over parietal sites, with no difference found between voice congruency conditions. In contrast, an earlier voice reinstatement effect was observed over frontal sites, an index of priming that preceded recollection in this task. Our results provide further evidence that acoustic and semantic information are integrated into a unified trace and that acoustic information facilitates spoken word recollection. Copyright © 2014 Elsevier Ltd. All rights reserved.
Prosopagnosia has been considered for a long period of time as the most important and almost exclusive disorder in the recognition of familiar people. In recent years, however, this conviction has been undermined by the description of patients showing a concomitant defect in the recognition of familiar faces and voices as a consequence of lesions encroaching upon the right anterior temporal lobe (ATL). These new data have obliged researchers to reconsider on one hand the construct of ‘associative prosopagnosia’ and on the other hand current models of people recognition. A systematic review of the patterns of familiar people recognition disorders observed in patients with right and left ATL lesions has shown that in patients with right ATL lesions face familiarity feelings and the retrieval of person-specific semantic information from faces are selectively affected, whereas in patients with left ATL lesions the defect selectively concerns famous people naming. Furthermore, some patients with right ATL lesions and intact face familiarity feelings show a defect in the retrieval of person-specific semantic knowledge greater from face than from name. These data are at variance with current models assuming: (a) that familiarity feelings are generated at the level of person identity nodes (PINs) where information processed by various sensory modalities converge, and (b) that PINs provide a modality-free gateway to a single semantic system, where information about people is stored in an amodal format. They suggest, on the contrary: (a) that familiarity feelings are generated at the level of modality-specific recognition units; (b) that face and voice recognition units are represented more in the right than in the left ATLs; (c) that in the right ATL are mainly stored person-specific information based on a convergence of perceptual information, whereas in the left ATLs are represented verbally-mediated person-specific information.
Maguinness, Corrina; Roswandowitz, Claudia; von Kriegstein, Katharina
Humans have a remarkable skill for voice-identity recognition: most of us can remember many voices that surround us as 'unique'. In this review, we explore the computational and neural mechanisms which may support our ability to represent and recognise a unique voice-identity. We examine the functional architecture of voice-sensitive regions in the superior temporal gyrus/sulcus, and bring together findings on how these regions may interact with each other, and additional face-sensitive regions, to support voice-identity processing. We also contrast findings from studies on neurotypicals and clinical populations which have examined the processing of familiar and unfamiliar voices. Taken together, the findings suggest that representations of familiar and unfamiliar voices might dissociate in the human brain. Such an observation does not fit well with current models for voice-identity processing, which by-and-large assume a common sequential analysis of the incoming voice signal, regardless of voice familiarity. We provide a revised audio-visual integrative model of voice-identity processing which brings together traditional and prototype models of identity processing. This revised model includes a mechanism of how voice-identity representations are established and provides a novel framework for understanding and examining the potential differences in familiar and unfamiliar voice processing in the human brain. Copyright © 2018 Elsevier Ltd. All rights reserved.
Christ, K. A.
This report is a literature review on the topics of voice recognition and generation. Areas covered are: manual versus vocal data input, vocabulary, stress and workload, noise, protective masks, feedback, and voice warning systems. Results of the studies presented in this report indicate that voice data entry has less of an impact on a pilot's flight performance, during low-level flying and other difficult missions, than manual data entry. However, the stress resulting from such missions may cause the pilot's voice to change, reducing the recognition accuracy of the system. The noise present in helicopter cockpits also causes the recognition accuracy to decrease. Noise-cancelling devices are being developed and improved upon to increase the recognition performance in noisy environments. Future research in the fields of voice recognition and generation should be conducted in the areas of stress and workload, vocabulary, and the types of voice generation best suited for the helicopter cockpit. Also, specific tasks should be studied to determine whether voice recognition and generation can be effectively applied.
Pommergaard, Hans-Christian; Huang, Chenxi; Burcharth, Jacob; Rosenberg, Jacob
Dictation of scientific articles has been recognised as an efficient method for producing high-quality, first article drafts. However, standardised transcription service by a secretary may not be available for all researchers and voice recognition software (VRS) may therefore be an alternative. The purpose of this study was to evaluate the out-of-the-box accuracy of VRS. Eleven young researchers without dictation experience dictated the first draft of their own scientific article after thorough preparation according to a pre-defined schedule. The dictate transcribed by VRS was compared with the same dictate transcribed by an experienced research secretary, and the effect of adding words to the vocabulary of the VRS was investigated. The number of errors per hundred words was used as outcome. Furthermore, three experienced researchers assessed the subjective readability using a Likert scale (0-10). Dragon Nuance Premium version 12.5 was used as VRS. The median number of errors per hundred words was 18 (range: 8.5-24.3), which improved when 15,000 words were added to the vocabulary. Subjective readability assessment showed that the texts were understandable with a median score of five (range: 3-9), which was improved with the addition of 5,000 words. The out-of-the-box performance of VRS was acceptable and improved after additional words were added. Further studies are needed to investigate the effect of additional software accuracy training.
Hoover, Adria E N; Démonet, Jean-François; Steeves, Jennifer K E
Anecdotally, it has been reported that individuals with acquired prosopagnosia compensate for their inability to recognize faces by using other person identity cues such as hair, gait or the voice. Are they therefore superior at the use of non-face cues, specifically voices, to person identity? Here, we empirically measure person and object identity recognition in a patient with acquired prosopagnosia and object agnosia. We quantify person identity (face and voice) and object identity (car and horn) recognition for visual, auditory, and bimodal (visual and auditory) stimuli. The patient is unable to recognize faces or cars, consistent with his prosopagnosia and object agnosia, respectively. He is perfectly able to recognize people's voices and car horns and bimodal stimuli. These data show a reverse shift in the typical weighting of visual over auditory information for audiovisual stimuli in a compromised visual recognition system. Moreover, the patient shows selectively superior voice recognition compared to the controls revealing that two different stimulus domains, persons and objects, may not be equally affected by sensory adaptation effects. This also implies that person and object identity recognition are processed in separate pathways. These data demonstrate that an individual with acquired prosopagnosia and object agnosia can compensate for the visual impairment and become quite skilled at using spared aspects of sensory processing. In the case of acquired prosopagnosia it is advantageous to develop a superior use of voices for person identity recognition in everyday life. Copyright © 2010 Elsevier Ltd. All rights reserved.
Silva, Marco; Vellasco, Marley M B R; Cataldo, Edson
The aging of the voice, known as presbyphonia, is a natural process that can cause great change in vocal quality of the individual. This is a relevant problem to those people who use their voices professionally, and its early identification can help determine a suitable treatment to avoid its progress or even to eliminate the problem. This work focuses on the development of a new model for the identification of aging voices (independently of their chronological age), using as input attributes parameters extracted from the voice and glottal signals. The proposed model, named Quantum binary-real evolving Spiking Neural Network (QbrSNN), is based on spiking neural networks (SNNs), with an unsupervised training algorithm, and a Quantum-Inspired Evolutionary Algorithm that automatically determines the most relevant attributes and the optimal parameters that configure the SNN. The QbrSNN model was evaluated in a database composed of 120 records, containing samples from three groups of speakers. The results obtained indicate that the proposed model provides better accuracy than other approaches, with fewer input attributes. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Winda, A.; E Byan, W. R.; Sofyan; Armansyah; Zariantin, D. L.; Josep, B. G.
The mechanical key currently used in motorcycles is prone to burglary, theft, or being misplaced. Intelligent biometric voice recognition is proposed as an alternative to this mechanism. The proposed system decides whether the voice belongs to the user and whether the word uttered by the user is ‘On’ or ‘Off’. The decision is sent to an Arduino in order to start or stop the engine. The recorded voice is processed to obtain features that are later used as input to the proposed system. The Mel-Frequency Cepstral Coefficient (MFCC) is adopted as the feature extraction technique. The extracted features are then used as input to the SVM-based identifier. Experimental results confirm the effectiveness of the proposed intelligent voice recognition and word recognition system. The proposed method produces good training and testing accuracy, 99.31% and 99.43%, respectively. Moreover, the proposed system shows a false rejection rate (FRR) and false acceptance rate (FAR) of 0.18% and 17.58%, respectively. For intelligent word recognition, the training and testing accuracies are 100% and 96.3%, respectively.
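For readers unfamiliar with the two error rates quoted above, the following sketch shows the standard definitions: FAR is the fraction of impostor attempts that are wrongly accepted, and FRR is the fraction of genuine attempts that are wrongly rejected. The function name and the trial format are hypothetical and the data in the usage note are made up for illustration; they are not the paper's results.

```python
def far_frr(trials):
    """Compute (FAR, FRR) from verification trials.

    trials: iterable of (is_genuine, accepted) boolean pairs, where
    is_genuine says whether the speaker really is the enrolled user and
    accepted is the system's decision.
    """
    genuine_decisions = [acc for is_gen, acc in trials if is_gen]
    impostor_decisions = [acc for is_gen, acc in trials if not is_gen]
    if not genuine_decisions or not impostor_decisions:
        raise ValueError("need at least one genuine and one impostor trial")
    # False acceptance: an impostor attempt that the system accepted.
    far = sum(impostor_decisions) / len(impostor_decisions)
    # False rejection: a genuine attempt that the system rejected.
    frr = sum(not acc for acc in genuine_decisions) / len(genuine_decisions)
    return far, frr
```

With four genuine attempts of which one is rejected, and five impostor attempts of which one is accepted, `far_frr` returns a FAR of 0.2 and an FRR of 0.25. For a vehicle lock, a high FAR is usually the more serious of the two errors, since it means an unauthorised voice starts the engine.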
Poock, G. K.; Martin, B. J.
This was an applied investigation examining the ability of a speech recognition system to recognize speakers' inputs when the speakers were under different stress levels. Subjects were asked to speak to a voice recognition system under three conditions: (1) normal office environment, (2) emotional stress, and (3) perceptual-motor stress. Results indicate a definite relationship between voice recognition system performance and the type of low stress reference patterns used to achieve recognition.
Melson, David L.; Brophy, Robert; Blaine, G. James; Jost, R. Gilbert; Brink, Gary S.
Because of its exciting potential to improve clinical service, as well as reduce costs, a voice recognition system for radiological dictation was recently installed at our institution. This system will be clinically successful if it dramatically reduces radiology report turnaround time without substantially affecting radiologist dictation and editing time. This report summarizes an observer study currently under way in which radiologist reporting times using the traditional transcription system and the voice recognition system are compared. Four radiologists are observed interpreting portable intensive care unit (ICU) chest examinations at a workstation in the chest reading area. Data are recorded with the radiologists using the transcription system and using the voice recognition system. The measurements distinguish between time spent performing clerical tasks and time spent actually dictating the report. Editing time and the number of corrections made are recorded. Additionally, statistics are gathered to assess the voice recognition system's impact on the report cycle time -- the time from report dictation to availability of an edited and finalized report -- and the length of reports.
Meiyanti, R.; Subandi, A.; Fuqara, N.; Budiman, M. A.; Siahaan, A. P. U.
A singer does not just recite the lyrics of a song but also uses particular sound techniques to make it more beautiful. In singing technique, females tend to have more diverse voice registers than males. The human voice has many registers, but those used while singing include chest voice, head voice, falsetto, and vocal fry. A speech recognition system based on female voice registers in singing technique was built using Borland Delphi 7.0. The recognition process is performed on recorded voice samples and also in real time. Voice input yields weighted energy values calculated using the Hankel transformation method and Macdonald functions. The results showed that the accuracy of the system depends on the accuracy of the sound techniques that are trained and tested. The average success rate for recognizing voice registers from recordings was 48.75 percent, while the average success rate in real time was 57 percent.
Mahmood, Awais; Alsulaiman, Mansour; Muhammad, Ghulam; Akram, Sheeraz
Local features for any pattern recognition system are based on the information extracted locally. In this paper, a local feature extraction technique was developed. This feature was extracted in the time-frequency plane by taking the moving average along the diagonal directions of the time-frequency plane. This feature captured the time-frequency events, producing a unique pattern for each speaker that can be viewed as a voice print of the speaker. Hence, we referred to this technique as the voice print-based local feature. The proposed feature was compared to other features, including the mel-frequency cepstral coefficient (MFCC), for speaker recognition using two different databases. One of the databases used in the comparison is a subset of an LDC database that consisted of two short sentences uttered by 182 speakers. The proposed feature attained a 98.35% recognition rate compared to 96.7% for MFCC using the LDC subset.
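The abstract gives no details of the diagonal moving average beyond the sentence above. The sketch below is one plausible reading, offered only as an illustration: each cell of a time-frequency matrix is replaced by the mean of `window` cells along either the main diagonal or the anti-diagonal. The function name, the window size, and the edge handling are all assumptions, not the authors' definition.

```python
def diagonal_moving_average(tf, window=3, direction=1):
    """Average `window` cells along one diagonal of a time-frequency matrix.

    tf is a list of rows (frequency bins), each of equal length (time frames).
    direction=+1 follows the main diagonal (frequency and time both increase);
    direction=-1 follows the anti-diagonal (frequency increases, time decreases).
    Cells that fall outside the matrix are skipped, so values near the edges
    are averaged over fewer samples.
    """
    if window < 1 or direction not in (1, -1):
        raise ValueError("window must be >= 1 and direction must be +1 or -1")
    n_rows, n_cols = len(tf), len(tf[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            total, count = 0.0, 0
            for d in range(window):
                r, c = i + d, j + direction * d
                if 0 <= r < n_rows and 0 <= c < n_cols:
                    total += tf[r][c]
                    count += 1
            out[i][j] = total / count  # count >= 1 because d = 0 is in range
    return out
```

Averaging along both diagonals emphasises energy ridges that rise or fall in frequency over time, which is the kind of time-frequency event the abstract describes.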
Ramirez, J.; Gorriz, J. M.; Segura, J. C.
This chapter has presented an overview of the main challenges in robust speech detection and a review of the state of the art and applications. VADs are frequently used in a number of applications including speech coding, speech enhancement and speech recognition. A precise VAD extracts a set of discriminative speech features from the noisy speech and formulates the decision in terms of a well-defined rule. The chapter has summarized three robust VAD methods that yield high speech/non-speech discri...
Kaur, Jasdeep; Juglan, K. C.; Sharma, Vishal; Upadhyay, R. K.
This paper deals with the perception and disorders of speech with respect to the Punjabi language. In view of the importance of voice identification, various parameters of speaker identification have been studied. The speech material was recorded with a tape recorder in the speakers' normal and disguised modes of utterance. From the recorded speech material, the utterances free from noise, etc., were selected for auditory and acoustic spectrographic analysis. The comparison of normal and disguised speech of seven subjects is reported. The fundamental frequency (F0) at similar places, plosive duration at certain phonemes, amplitude ratio (A1:A2), etc., were compared in normal and disguised speech. It was found that the formant frequencies of normal and disguised speech remain almost the same only if they are compared at positions of the same vowel quality and quantity. If the vowel is more closed or more open in the disguised utterance, the formant frequency changes relative to the normal utterance. The amplitude ratio (A1:A2) is found to be speaker dependent and remains unchanged in the disguised utterance. However, this value may shift in the disguised utterance if cross sectioning is not done at the same location.
Aman, Frédéric; Vacher, Michel; Rossato, Solange; Portet, François
By 2050, about a third of the French population will be over 65. In the context of developing technologies that help elderly people live independently at home, the CIRDO project aims to implement an ASR system in a social inclusion product designed for elderly people in order to detect distress situations. Speech recognition systems show a higher word error rate when speech is uttered by elderly speakers than when non-aged voices are considered. Two...
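The word error rate mentioned above is conventionally the word-level edit distance (substitutions, deletions, insertions) between the recogniser output and a reference transcript, divided by the number of reference words. The sketch below implements that standard definition; the function name and the example sentences are illustrative and do not come from the CIRDO project.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

For example, scoring the hypothesis "please call my doctor" against the reference "please call my daughter now" gives one substitution and one deletion over five reference words, a WER of 0.4.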
Freeh, M; Dewey, M; Brigham, L
The Department of Radiology at the University of Utah Health Sciences Center has been in the process of transitioning from the traditional film-based department to a digital imaging department for the past 2 years. The department is now transitioning from the traditional method of dictating reports (dictation by radiologist to transcription to review and signing by radiologist) to a voice recognition system. The transition to digital operations will not be complete until we have the ability to directly interface the dictation process with the image review process. Voice recognition technology has advanced to the level where it can and should be an integral part of the new way of working in radiology and is an integral part of an efficient digital imaging department. The transition to voice recognition requires the task of identifying the product and the company that will best meet a department's needs. This report introduces the methods we used to evaluate the vendors and the products available as we made our purchasing decision. We discuss our evaluation method and provide a checklist that can be used by other departments to assist with their evaluation process. The criteria used in the evaluation process fall into the following major categories: user operations, technical infrastructure, medical dictionary, system interfaces, service support, cost, and company strength. Conclusions drawn from our evaluation process will be detailed, with the intention being to shorten the process for others as they embark on a similar venture. As more and more organizations investigate the many products and services that are now being offered to enhance the operations of a radiology department, it becomes increasingly important that solid methods are used to most effectively evaluate the new products. This report should help others complete the task of evaluating a voice recognition system and may be adaptable to other products as well.
Motyer, R E; Liddy, S; Torreggiani, W C; Buckley, O
Voice recognition (VR) dictation of radiology reports has become the mainstay of reporting in many institutions worldwide. Despite its benefits, such software is not without limitations, and transcription errors have been widely reported. The aim was to evaluate the frequency and nature of non-clinical transcription errors using VR dictation software. A retrospective audit of 378 finalised radiology reports was performed. Errors were counted and categorised by significance, error type and sub-type. Data regarding imaging modality, report length and dictation time were collected. 67 (17.72 %) reports contained ≥1 errors, with 7 (1.85 %) containing 'significant' and 9 (2.38 %) containing 'very significant' errors. A total of 90 errors were identified from the 378 reports analysed, with 74 (82.22 %) classified as 'insignificant', 7 (7.78 %) as 'significant', and 9 (10 %) as 'very significant'. 68 (75.56 %) errors were 'spelling and grammar', 20 (22.22 %) 'missense' and 2 (2.22 %) 'nonsense'. 'Punctuation' error was the most common sub-type, accounting for 27 errors (30 %). Complex imaging modalities had higher error rates per report and sentence. Computed tomography contained 0.040 errors per sentence compared to plain film with 0.030. Longer reports had a higher error rate, with reports >25 sentences containing an average of 1.23 errors per report compared to 0-5 sentences containing 0.09. These findings highlight the limitations of VR dictation software. While most errors were deemed insignificant, there were occurrences of error with the potential to alter report interpretation and patient management. Longer reports and reports on more complex imaging had higher error rates, and this should be taken into account by the reporting radiologist.
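The percentages in this abstract use two different denominators: the 378 reports audited and the 90 errors found. The short check below recomputes each figure from the counts quoted above. The variable and function names are illustrative; the counts are taken directly from the abstract.

```python
TOTAL_REPORTS = 378  # finalised reports audited
TOTAL_ERRORS = 90    # errors found across all reports


def pct(count, total):
    """Percentage of `total` represented by `count`, to two decimals."""
    return round(100.0 * count / total, 2)


# Report-level figures (denominator: reports).
reports = {"with_any_error": 67, "significant": 7, "very_significant": 9}
report_pct = {name: pct(n, TOTAL_REPORTS) for name, n in reports.items()}

# Error-level figures (denominator: errors).
errors_by_significance = {"insignificant": 74, "significant": 7, "very_significant": 9}
errors_by_type = {"spelling_and_grammar": 68, "missense": 20, "nonsense": 2}
significance_pct = {name: pct(n, TOTAL_ERRORS) for name, n in errors_by_significance.items()}
type_pct = {name: pct(n, TOTAL_ERRORS) for name, n in errors_by_type.items()}
punctuation_pct = pct(27, TOTAL_ERRORS)

# Both breakdowns should account for every error exactly once.
assert sum(errors_by_significance.values()) == TOTAL_ERRORS
assert sum(errors_by_type.values()) == TOTAL_ERRORS
```

All recomputed values agree with the published percentages, and both breakdowns sum to the 90 errors reported.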
Hakanpää, Tua; Waaramaa, Teija; Laukkanen, Anne-Maria
This study examines the recognition of emotion in contemporary commercial music (CCM) and classical styles of singing. This information may be useful in improving the training of interpretation in singing. This is an experimental comparative study. Thirteen singers (11 female, 2 male) with a minimum of 3 years' professional-level singing studies (in CCM or classical technique or both) participated. They sang at three pitches (females: a, e1, a1, males: one octave lower) expressing anger, sadness, joy, tenderness, and a neutral state. Twenty-nine listeners listened to 312 short (0.63- to 4.8-second) voice samples, 135 of which were sung using a classical singing technique and 165 of which were sung in a CCM style. The listeners were asked which emotion they heard. Activity and valence were derived from the chosen emotions. The percentage of correct recognitions out of all the answers in the listening test (N = 9048) was 30.2%. The recognition percentage for the CCM-style singing technique was higher (34.5%) than for the classical-style technique (24.5%). Valence and activation were better perceived than the emotions themselves, and activity was better recognized than valence. A higher pitch was more likely to be perceived as joy or anger, and a lower pitch as sorrow. Both valence and activation were better recognized in the female CCM samples than in the other samples. There are statistically significant differences in the recognition of emotions between classical and CCM styles of singing. Furthermore, in the singing voice, pitch affects the perception of emotions, and valence and activity are more easily recognized than emotions. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Wickline, Virginia B; Bailey, Wendy; Nowicki, Stephen
The authors explored whether there were in-group advantages in emotion recognition of faces and voices by culture or geographic region. Participants were 72 African American students (33 men, 39 women), 102 European American students (30 men, 72 women), 30 African international students (16 men, 14 women), and 30 European international students (15 men, 15 women). The participants determined emotions in African American and European American faces and voices. Results showed an in-group advantage (sometimes by culture, less often by race) in recognizing facial and vocal emotional expressions. African international students were generally less accurate at interpreting American nonverbal stimuli than were European American, African American, and European international peers. Results suggest that, although partly universal, emotional expressions have subtle differences across cultures that persons must learn.
Mathevon, Nicolas; Charrier, Isabelle; Aubin, Thierry
In colonial mammals like fur seals, mutual vocal recognition between mothers and their pup is of primary importance for breeding success. Females alternate feeding sea-trips with suckling periods on land, and when coming back from the ocean, they have to vocally find their offspring among numerous similar-looking pups. Young fur seals emit a 'mother-attraction call' that presents individual characteristics. In this paper, we review the perceptual process of pup's call recognition by Subantarctic Fur Seal Arctocephalus tropicalis mothers. To identify their progeny, females rely on the frequency modulation pattern and spectral features of this call. As the acoustic characteristics of a pup's call change throughout the lactation period due to the growing process, mothers have thus to refine their memorization of their pup's voice. Field experiments show that female Fur Seals are able to retain all the successive versions of their pup's call.
Voice recognition is one type of biometric technology. The voice is a unique part of a human being that makes individuals easy to distinguish from one another. Voice can also provide information such as the gender, emotion, and identity of the speaker. This research records human voices pronouncing the digits 0 to 9, with and without noise. Features of these recordings are extracted using Mel Frequency Cepstral Coefficients (MFCC). The mean, standard deviation, maximum, minimum, and combinations of them are used to construct the feature vectors. These feature vectors are then classified using a Support Vector Machine (SVM). There are two classification models: the first is based on the speaker and the other on the digit pronounced. The classification models are then validated with 10-fold cross-validation. The best average accuracy of the two classification models is 91.83%, achieved using mean + standard deviation + min + max as features.
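The two steps around the classifier can be sketched without any audio library: collapsing a variable-length sequence of MFCC frames into one fixed-length vector of summary statistics, and splitting the samples into 10 folds for validation. This is a minimal illustration under stated assumptions. The function names are hypothetical, the population standard deviation is an arbitrary choice, the folds are contiguous rather than shuffled or stratified, and MFCC extraction and the SVM itself are left out.

```python
import statistics


def summary_features(mfcc_frames):
    """Collapse MFCC frames into [means, stds, mins, maxs] per coefficient.

    mfcc_frames: list of frames, each a list of coefficients of equal length.
    """
    columns = list(zip(*mfcc_frames))  # one tuple per coefficient
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]  # population std (assumption)
    mins = [min(c) for c in columns]
    maxs = [max(c) for c in columns]
    return means + stds + mins + maxs


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size
```

A recording with 13 MFCC coefficients per frame therefore yields a 52-dimensional vector regardless of its duration, which is what lets recordings of different lengths share one classifier. In practice the sample order would be shuffled before splitting so that each fold contains a mix of speakers and digits.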
Zäske, Romi; Awwad Shiekh Hasan, Bashar; Belin, Pascal
Listeners can recognize newly learned voices from previously unheard utterances, suggesting the acquisition of high-level speech-invariant voice representations during learning. Using functional magnetic resonance imaging (fMRI) we investigated the anatomical basis underlying the acquisition of voice representations for unfamiliar speakers independent of speech, and their subsequent recognition among novel voices. Specifically, listeners studied voices of unfamiliar speakers uttering short sentences and subsequently classified studied and novel voices as "old" or "new" in a recognition test. To investigate "pure" voice learning, i.e., independent of sentence meaning, we presented German sentence stimuli to non-German speaking listeners. To disentangle stimulus-invariant and stimulus-dependent learning, during the test phase we contrasted a "same sentence" condition in which listeners heard speakers repeating the sentences from the preceding study phase, with a "different sentence" condition. Voice recognition performance was above chance in both conditions although, as expected, performance was higher for same than for different sentences. During study phases activity in the left inferior frontal gyrus (IFG) was related to subsequent voice recognition performance and same versus different sentence condition, suggesting an involvement of the left IFG in the interactive processing of speaker and speech information during learning. Importantly, at test reduced activation for voices correctly classified as "old" compared to "new" emerged in a network of brain areas including temporal voice areas (TVAs) of the right posterior superior temporal gyrus (pSTG), as well as the right inferior/middle frontal gyrus (IFG/MFG), the right medial frontal gyrus, and the left caudate. This effect of voice novelty did not interact with sentence condition, suggesting a role of temporal voice-selective areas and extra-temporal areas in the explicit recognition of learned voice identity
Lu, Lingxi; Bao, Xiaohan; Chen, Jing; Qu, Tianshu; Wu, Xihong; Li, Liang
Under a noisy "cocktail-party" listening condition with multiple people talking, listeners can use various perceptual/cognitive unmasking cues to improve recognition of the target speech against informational speech-on-speech masking. One potential unmasking cue is the emotion expressed in a speech voice, by means of certain acoustical features. However, it was unclear whether emotionally conditioning a target-speech voice that has none of the typical acoustical features of emotions (i.e., an emotionally neutral voice) can be used by listeners for enhancing target-speech recognition under speech-on-speech masking conditions. In this study we examined the recognition of target speech against a two-talker speech masker both before and after the emotionally neutral target voice was paired with a loud female screaming sound that has a marked negative emotional valence. The results showed that recognition of the target speech (especially the first keyword in a target sentence) was significantly improved by emotionally conditioning the target speaker's voice. Moreover, the emotional unmasking effect was independent of the unmasking effect of the perceived spatial separation between the target speech and the masker. Also, (skin conductance) electrodermal responses became stronger after emotional learning when the target speech and masker were perceptually co-located, suggesting an increase of listening efforts when the target speech was informationally masked. These results indicate that emotionally conditioning the target speaker's voice does not change the acoustical parameters of the target-speech stimuli, but the emotionally conditioned vocal features can be used as cues for unmasking target speech.
Rebecca A. Fairchild
The following teacher research case study involved an exploration of educational pedagogy by working with a freshman composition student at a college university. All data for the study were gathered during the 2013 spring semester. The study was driven by an inquiry-based approach in which the researcher determined the center of focus that arose from an exploration of the student as a writer through a survey, a classroom observation, multiple one-on-one meetings, and email conversations. The focus area that arose was the student’s limited perception that writing was done solely for school purposes. Related puzzlements stemming from this focus area included the student’s lack of attachment and lack of voice in her writing. The conclusive data provided insights into how to teach students in future classrooms that it is vital for them to be able to attach themselves to their work.
Issenman, Robert M; Jaffer, Iqbal H
Voice recognition software (VRS), with specialized medical vocabulary, is being promoted to enhance physician efficiency, decrease costs, and improve patient safety. This study reports the experience of a pediatric subspecialist (pediatric gastroenterology) physician with the use of Dragon Naturally Speaking (version 6; ScanSoft Inc, Peabody, MA), incorporated for use with a proprietary electronic medical record, in a large university medical center ambulatory care service. After 2 hours of group orientation and 2 hours of individual VRS instruction, the physician trained the software for 1 month (30 letters) during a hospital slowdown. Set-up, dictation, and correction times for the physician and medical transcriptionist were recorded for these training sessions, as well as for 42 subsequently dictated letters. Figures were extrapolated to the yearly clinic volume for the physician, to estimate costs (physician: 110 dollars per hour; transcriptionist: 11 dollars per hour, US dollars). The use of VRS required an additional 200% of physician dictation and correction time (9 minutes vs 3 minutes), compared with the use of electronic signatures for letters typed by an experienced transcriptionist and imported into the electronic medical record. When the cost of the license agreement and the costs of physician and transcriptionist time were included, the use of the software cost 100% more, for the amount of dictation performed annually by the physician. VRS is an intriguing technology. It holds the possibility of streamlining medical practice. However, the learning curve and accuracy of the tested version of the software limit broad physician acceptance at this time.
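The time figures in the abstract above can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes the 9-minute and 3-minute figures are per-letter times. It does not reproduce the study's full cost model, which also included license and transcriptionist costs.

```python
def percent_increase(baseline, observed):
    """Return how much larger `observed` is than `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (observed - baseline) / baseline * 100.0


def cost_of_minutes(minutes, hourly_rate):
    """Return the cost of `minutes` of work at `hourly_rate` dollars per hour."""
    return minutes * hourly_rate / 60.0


# Figures reported in the abstract (assumed here to be per letter).
VRS_MINUTES = 9          # physician time with voice recognition software
TRANSCRIBED_MINUTES = 3  # physician time with a transcriptionist
PHYSICIAN_RATE = 110     # US dollars per hour

extra_time_pct = percent_increase(TRANSCRIBED_MINUTES, VRS_MINUTES)
extra_cost = cost_of_minutes(VRS_MINUTES - TRANSCRIBED_MINUTES, PHYSICIAN_RATE)

print(f"Additional physician time: {extra_time_pct:.0f}%")
print(f"Additional physician cost per letter: ${extra_cost:.2f}")
```

Going from 3 to 9 minutes is an increase of 6 minutes on a 3-minute baseline, which matches the "additional 200%" stated in the abstract.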
In patients suffering from head and neck cancer, speech intelligibility is often restricted. For assessment and outcome measurements, automatic speech recognition systems have previously been shown to be appropriate for objective and quick evaluation of intelligibility. In this study we investigate the applicability of the method to speech disorders caused by head and neck cancer. Intelligibility was quantified by speech recognition on recordings of a standard text read by 41 German laryngectomized patients with cancer of the larynx or hypopharynx and 49 German patients who had suffered from oral cancer. The speech recognition provides the percentage of correctly recognized words of a sequence, that is, the word recognition rate. Automatic evaluation was compared to perceptual ratings by a panel of experts and to an age-matched control group. Both patient groups showed significantly lower word recognition rates than the control group. Automatic speech recognition yielded word recognition rates which complied with experts' evaluation of intelligibility on a significant level. Automatic speech recognition serves as a good means with low effort to objectify and quantify the most important aspect of pathologic speech: intelligibility. The system was successfully applied to voice and speech disorders.
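The word recognition rate described above, the percentage of correctly recognized words of a sequence, can be sketched as follows. This is a minimal illustration, not the study's implementation: the function name and example sentences are hypothetical, and the alignment uses Python's standard `difflib.SequenceMatcher` as a stand-in for whatever alignment the recognizer's scoring tool performs.

```python
from difflib import SequenceMatcher


def word_recognition_rate(reference, hypothesis):
    """Percentage of reference words that the recognizer output matches in order.

    Assumption: words are whitespace-separated and compared case-insensitively.
    """
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference text must contain at least one word")

    # Align the two word sequences and count the words that match in order.
    matcher = SequenceMatcher(a=ref_words, b=hyp_words, autojunk=False)
    correct = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * correct / len(ref_words)


# Hypothetical reference text and recognizer output.
reference = "the north wind and the sun"
recognized = "the north wind and a sun"
print(f"{word_recognition_rate(reference, recognized):.1f}%")
```

In this example five of the six reference words are matched, so a lower rate would indicate speech that the recognizer finds less intelligible.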
Iqbal, Asim; Farooq, Umar; Mahmood, Hassan; Asad, Muhammad Usman; Khan, Akrama; Atiq, Hafiz Muhammad
A self-teaching system based on image processing and voice recognition is developed to educate visually impaired children, chiefly in their primary education. The system comprises a computer, a vision camera, an ear speaker and a microphone. The camera, attached to the computer system, is mounted on the ceiling opposite (at the required angle to) the desk on which the book is placed. Sample images and voices in the form of instructions and commands for English and Urdu alphabets, numeric digits, operators and shapes are already stored in the database. A blind child first reads the embossed character (object) with the fingers and then speaks the answer (the name of the character, shape, etc.) into the microphone. When the microphone receives the child's voice command, the camera takes an image, which is processed by a MATLAB® program developed with the Image Acquisition and Image Processing toolboxes; the program generates a response or the required set of instructions for the child via the ear speaker, resulting in self-education of a visually impaired child. A speech recognition program, which records and processes the child's command, is also developed in MATLAB® with the Data Acquisition and Signal Processing toolboxes.
In colonial mammals like fur seals, mutual vocal recognition between mothers and their pup is of primary importance for breeding success. Females alternate feeding sea-trips with suckling periods on land, and when coming back from the ocean, they have to vocally find their offspring among numerous similar-looking pups. Young fur seals emit a 'mother-attraction call' that presents individual characteristics. In this paper, we review the perceptual process of pup's call recognition by Subantarctic Fur Seal Arctocephalus tropicalis mothers. To identify their progeny, females rely on the frequency modulation pattern and spectral features of this call. As the acoustic characteristics of a pup's call change throughout the lactation period due to the growing process, mothers have thus to refine their memorization of their pup's voice. Field experiments show that female Fur Seals are able to retain all the successive versions of their pup's call.
van der Sluis, Frans; van den Broek, Egon; Stam, Liesbeth M.; Abrahamse, E.L.; Luursema, J.M.
This deliverable serves to reinstate a broad view on Virtual Reality (VR), capturing all its constituting disciplines. The core target of this report is to establish a foundation for an educational program where all disciplines subordinate to VR technology will converge. Over the past decade(s) the
The extent to which human speech perception evolved by taking advantage of predispositions and pre-existing features of vertebrate auditory and cognitive systems remains a central question in the evolution of speech. This paper reviews asymmetries in vowel perception, speaker voice recognition, and speaker normalization in non-human animals – topics that have not been thoroughly discussed in relation to the abilities of non-human animals, but are nonetheless important aspects of vocal perception. Throughout this paper we demonstrate that addressing these issues in non-human animals is relevant and worthwhile because many non-human animals must deal with similar issues in their natural environment. That is, they must also discriminate between similar-sounding vocalizations, determine signaler identity from vocalizations, and resolve signaler-dependent variation in vocalizations from conspecifics. Overall, we find that, although plausible, the current evidence is insufficiently strong to conclude that directional asymmetries in vowel perception are specific to humans, or that non-human animals can use voice characteristics to recognize human individuals. However, we do find some indication that non-human animals can normalize speaker differences. Accordingly, we identify avenues for future research that would greatly improve and advance our understanding of these topics.
Tim D. Hunt
The purpose of this work was to test the effectiveness of using readily available speech recognition API services to determine whether recordings of bird song had inadvertently captured human voices. A mobile phone was used to record a human speaking at increasing distances from the phone in an outside setting with bird song occurring in the background. One of the services was trained with sample recordings, and the services were compared on their ability to return recognized words. The services from Google and IBM performed similarly, and the Microsoft service, which allowed training, performed slightly better. However, all three services failed to perform at a level that would enable recordings with recognizable human speech to be deleted in order to maintain full privacy protection.
Virtual Reality (VR) is a rapidly emerging technology which allows participants to experience a virtual environment through stimulation of the participant's senses. Intuitive and natural interactions with the virtual world help to create a realistic experience. Typically, a participant is immersed in a virtual environment through the use of a 3-D viewer. Realistic, computer-generated environment models and accurate tracking of a participant's view are important factors for adding realism to a virtual experience. Stimulating a participant's sense of sound and providing a natural form of communication for interacting with the virtual world are equally important. This paper discusses the advantages and importance of incorporating voice recognition and audio feedback capabilities into a virtual world experience. Various approaches and levels of complexity are discussed. Examples of the use of voice and sound are presented through the description of a research application developed in the VR laboratory at Sandia National Laboratories.
Principles in Experimental Design. New York: McGraw-Hill, 1962. Woodworth, R.S. and H. Schlosberg, Experimental Psychology, (Revised edition), New...collection sheet APPENDIX II EXPERIMENTAL PROTOCOL AND SUBJECTS' INSTRUCTIONS THIS IS AN EXPERIMENT DESIGNED TO EVALUATE SOME VOICE RECOGNITION EQUIPMENT. I...37. CDR Paul Chatelier OUSD R&E Room 3D129 Pentagon Washington, D.C. 20301 38. Ralph Cleveland NFMSO Code 9333 Mechanicsburg, PA 17055 39. Clay Coler
Lange, Holley R.
Discussion of voice as the communications device for computer-human interaction focuses on voice recognition systems for use within a library environment. Voice technologies are described, including voice response and voice recognition; examples of voice systems in use in libraries are examined; and further possibilities, including use with…
Cinematic virtual reality is a new and relatively unexplored area in academia. While research in guiding the spectator's attention in this new medium has been conducted for some time, a focus on editing in conjunction with spectator orientation is only currently emerging. In this paper, we consid...... in rhythm perception, and complement it with applications in traditional editing. Through the notion of multimodal listening we provide guidelines that can be used in rhythmic and sonic interaction design in VR....
The rich studies in this collection show that the investigation of voice requires analysis of "recognition" across layered spatial-temporal and sociolinguistic scales. I argue that the concepts of voice, recognition, and scale provide insight into contemporary educational inequality and that their study benefits, in turn, from paying attention to…
Lee, Yong Bum; Cho, Jai Wan; Lee, Nam Ho; Choi, Young Soo; Park, Soon Yong
By surveying the current state of virtual reality (VR) research in the nuclear power industry, both abroad and domestically, we confirm that VR can be put to practical use in that industry. As fundamental research for applications of VR in the nuclear power industry, studies were carried out on gesture recognition for remote work and on a VR training system for work under severe conditions. 1. A study on gesture recognition for remote work: hand gesture recognition using visual signals and a tactile magnetic sensor was studied as a basis for introducing task commands and communication. 2. A study on the construction of a virtual environment training system for tasks under severe conditions: a VR training system for tasks in severe working conditions was implemented. The system is intended to improve the efficiency of actual tasks by letting personnel practice in advance the motion procedures to be performed in severe working conditions that are difficult for personnel to access. Motion information from sensors attached to the trainee's body was used to construct the virtual environment through computer graphics procedures. The VR training system has many advantages over the conventional training method, which uses a mock-up made to the same size and shape as the real component in the nuclear power plant. (author). 27 refs., 21 tabs., 51 figs
Andreas Maier; Tino Haderlein; Florian Stelzle; Elmar Nöth; Emeka Nkenke; Frank Rosanowski; Anne Schützenberger; Maria Schuster
In patients suffering from head and neck cancer, speech intelligibility is often restricted. For assessment and outcome measurements, automatic speech recognition systems have previously been shown to be appropriate for objective and quick evaluation of intelligibility. In this study we investigate the applicability of the method to speech disorders caused by head and neck cancer. Intelligibility was quantified by speech recognition on recordings of a standard text read by 41 German laryngect...
When we develop voice-activated human-appliance interface systems in smart homes, named entity recognition (NER) is an essential tool for extracting execution targets from natural language commands. Previous studies on NER systems generally include supervised machine-learning methods that require a substantial amount of human-annotated training corpus. In the smart home environment, categories of named entities should be defined according to voice-activated devices (e.g., food names for refrigerators and song titles for music players). The previous machine-learning methods make it difficult to change categories of named entities because a large amount of the training corpus should be newly constructed by hand. To address this problem, we present a semi-supervised NER system to minimize the time-consuming and labor-intensive task of constructing the training corpus. Our system uses distant supervision methods with two kinds of auto-labeling processes: auto-labeling based on heuristic rules for single-class named entity corpus generation and auto-labeling based on a pre-trained single-class NER model for multi-class named entity corpus generation. Then, our system improves NER accuracy by using a bagging-based active learning method. In our experiments that included a generic domain that featured 11 named entity classes and a context-specific domain about baseball that featured 21 named entity classes, our system demonstrated good performances in both domains, with F1-measures of 0.777 and 0.958, respectively. Since our system was built from a relatively small human-annotated training corpus, we believe it is a viable alternative to current NER systems in smart home environments.
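For readers unfamiliar with the F1-measure reported above, it is the harmonic mean of precision and recall. The sketch below shows the standard formula; the function name and the entity counts are hypothetical and are not taken from the paper's experiments.

```python
def f1_measure(true_positives, false_positives, false_negatives):
    """Return the F1-measure, the harmonic mean of precision and recall."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0 or true_positives == 0:
        return 0.0
    precision = true_positives / predicted  # share of predicted entities that are right
    recall = true_positives / actual        # share of real entities that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 entities found correctly, 2 spurious, 2 missed.
print(f"F1 = {f1_measure(8, 2, 2):.3f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, a system only scores a high F1 when both precision and recall are high.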
In this article, I shall examine the cognitive, heuristic and theoretical functions of the concept of recognition. To evaluate both the explanatory power and the limitations of a sociological concept, the theory construction must be analysed and its actual productivity for sociological theory mus...
Jürgens, Rebecca; Grass, Annika; Drolet, Matthis; Fischer, Julia
Both in the performative arts and in emotion research, professional actors are assumed to be capable of delivering emotions comparable to spontaneous emotional expressions. This study examines the effects of acting training on vocal emotion depiction and recognition. We predicted that professional actors express emotions in a more realistic fashion than non-professional actors. However, professional acting training may lead to a particular speech pattern; this might account for vocal expressions by actors that are less comparable to authentic samples than the ones by non-professional actors. We compared 80 emotional speech tokens from radio interviews with 80 re-enactments by professional and inexperienced actors, respectively. We analyzed recognition accuracies for emotion and authenticity ratings and compared the acoustic structure of the speech tokens. Both play-acted conditions yielded similar recognition accuracies and possessed more variable pitch contours than the spontaneous recordings. However, professional actors exhibited signs of different articulation patterns compared to non-trained speakers. Our results indicate that for emotion research, emotional expressions by professional actors are not better suited than those from non-actors.
Zhang, Guoming; Yan, Chen; Ji, Xiaoyu; Zhang, Taimin; Zhang, Tianchen; Xu, Wenyuan
Speech recognition (SR) systems such as Siri or Google Now have become an increasingly popular human-computer interaction method, and have turned various systems into voice controllable systems (VCS). Prior work on attacking VCS shows that hidden voice commands that are incomprehensible to people can control the systems. Hidden voice commands, though hidden, are nonetheless audible. In this work, we design a completely inaudible attack, DolphinAttack, that modulates voice commands on ultra...
Laukka, Petri; Elfenbein, Hillary Anger; Thingujam, Nutankumar S; Rockstuhl, Thomas; Iraki, Frederick K; Chui, Wanda; Althoff, Jean
This study extends previous work on emotion communication across cultures with a large-scale investigation of the physical expression cues in vocal tone. In doing so, it provides the first direct test of a key proposition of dialect theory, namely that greater accuracy of detecting emotions from one's own cultural group, known as in-group advantage, results from a match between culturally specific schemas in emotional expression style and culturally specific schemas in emotion recognition. Study 1 used stimuli from 100 professional actors from five English-speaking nations vocally conveying 11 emotional states (anger, contempt, fear, happiness, interest, lust, neutral, pride, relief, sadness, and shame) using standard-content sentences. Detailed acoustic analyses showed many similarities across groups, and yet also systematic group differences. This provides evidence for cultural accents in expressive style at the level of acoustic cues. In Study 2, listeners evaluated these expressions in a 5 × 5 design balanced across groups. Cross-cultural accuracy was greater than expected by chance. However, there was also in-group advantage, which varied across emotions. A lens model analysis of fundamental acoustic properties examined patterns in emotional expression and perception within and across groups. Acoustic cues were used relatively similarly across groups both to produce and judge emotions, and yet there were also subtle cultural differences. Speakers appear to have a culturally nuanced schema for enacting vocal tones via acoustic cues, and perceivers have a culturally nuanced schema in judging them. Consistent with dialect theory's prediction, in-group judgments showed a greater match between these schemas used for emotional expression and perception.
Grady, Michael W.; Hicklin, M. B.; Porter, J. E.
A technology assessment of the application of computers and electronics to complex systems is presented. Three existing systems which utilize voice technology (speech recognition and speech generation) are described. Future directions in voice technology are also described.
Gunkel, S.; Prins, M.J.; Stokking, H.M.; Niamut, O.A.
Virtual Reality (VR) and 360-degree video are reshaping the media landscape, creating a fertile business environment. During 2016 new 360-degree cameras and VR headsets entered the consumer market, distribution platforms are being established and new production studios are emerging. VR is evermore
Katoh, Toshisada; Tanaka, Kazuo; Kasai, Yasusuke; Kimura, Katsumi; Nakakoshi, Tetsuhiro
Virtual reality (V.R.) technology is expected to improve the human-computer interface through realism and ease of use. Applying V.R. technology to nuclear power plants will bring widespread use of computers in fields such as plant planning, design, training, and operation. Combining 3D-CAD plant data with V.R. technologies is an easy route to these applications because the 3D-CAD data for a nuclear plant are constructed at the design stage. The prototype system investigates the feasibility of V.R. technologies in the nuclear plant. A stereoscopic device and a voice processing device were integrated into the 3D-CAD system by 1992. We confirmed that these devices improve the interface between human and computer. (orig.)
Sheffert, Sonya M; Olson, Elizabeth
In this research, we investigated the effects of voice and face information on the perceptual learning of talkers and on long-term memory for spoken words. In the first phase, listeners were trained over several days to identify voices from words presented auditorily or audiovisually. The training data showed that visual information about speakers enhanced voice learning, revealing cross-modal connections in talker processing akin to those observed in speech processing. In the second phase, the listeners completed an auditory or audiovisual word recognition memory test in which equal numbers of words were spoken by familiar and unfamiliar talkers. The data showed that words presented by familiar talkers were more likely to be retrieved from episodic memory, regardless of modality. Together, these findings provide new information about the representational code underlying familiar talker recognition and the role of stimulus familiarity in episodic word recognition.
Allen, Catherine; O'Toole, R. B. (Robert Bernard)
An initial report on VR-enhanced seminars with staff and students. Held over 2 days in May 2017, these workshops provided a diverse group of staff and students at the University of Warwick with a valuable opportunity to experience and think about virtual reality. The VR phenomenon is at the top of its hype cycle (again), with significant breakthroughs having been made in technology and in the design of VR content. However, not many people in higher education have experienced what can be achie...
Song, Peng; Xu, Shuhong; Fong, Wee Teck; Chin, Ching Ling; Chua, Gim Guan; Huang, Zhiyong
The development of new technologies has undoubtedly promoted the advances of modern education, among which Virtual Reality (VR) technologies have made the education more visually accessible for students. However, classroom education has been the focus of VR applications whereas not much research has been done in promoting sports education using VR technologies. In this paper, an immersive VR system is designed and implemented to create a more intuitive and visual way of teaching tennis. A scalable system architecture is proposed in addition to the hardware setup layout, which can be used for various immersive interactive applications such as architecture walkthroughs, military training simulations, other sports game simulations, interactive theaters, and telepresent exhibitions. Realistic interaction experience is achieved through accurate and robust hybrid tracking technology, while the virtual human opponent is animated in real time using shader-based skin deformation. Potential future extensions are also discussed to improve the teaching/learning experience.
Grave, Luís; Escaleira, Cristina; Marcos, Adérito
The Virtual Reality (VR) field can provide a wide variety of industrial applications. We can find several examples in the automobile industry, where VR is used for tasks like design, wind tunnel simulators, assemble/disassemble, etc. However, all these applications are designed to be used by VR experts,or well trained personnel. This happens because the VR devices and the VR interaction metaphors are not yet well developed to fulfil the needs of an inexperienced user, like robustness, fail...
To more easily obtain a voiced excitation function for speech characterization, measurements of skin, tracheal tube, and vocal fold motions were made and compared to EM sensor-glottal derived...
Olson, D. M.; Zaman, C. H.; Sutherland, A.
Virtual reality holds great potential for science communication, education, and research. However, interfaces for manipulating data and environments in virtual worlds are limited and idiosyncratic. Furthermore, speech and vision are the primary modalities by which humans collect information about the world, but the linking of visual and natural language domains is a relatively new pursuit in computer vision. Machine learning techniques have been shown to be effective at image and speech classification, as well as at describing images with language (Karpathy 2016), but have not yet been used to describe potential actions. We propose a technique for creating a library of possible context-specific actions associated with 3D objects in immersive virtual worlds based on a novel dataset generated natively in virtual reality containing speech, image, gaze, and acceleration data. We will discuss the design and execution of a user study in virtual reality that enabled the collection and the development of this dataset. We will also discuss the development of a hybrid machine learning algorithm linking vision data with environmental affordances in natural language. Our findings demonstrate that it is possible to develop a model which can generate interpretable verbal descriptions of possible actions associated with recognized 3D objects within immersive VR environments. This suggests promising applications for more intuitive user interfaces through voice interaction within 3D environments. It also demonstrates the potential to apply vast bodies of embodied and semantic knowledge to enrich user interaction within VR environments. This technology would allow for applications such as expert knowledge annotation of 3D environments, complex verbal data querying and object manipulation in virtual spaces, and computer-generated, dynamic 3D object affordances and functionality during simulations.
Karel, Matejka; Lubomir, Sklenka
This paper describes one of the main uses of the VR-1 training reactor: an extensive educational programme. The programme is intended for training university students (from all technical universities in the Czech Republic) and selected nuclear power plant personnel. At present, students can complete more than 20 different experimental exercises. An attractive programme including a demonstration of reactor operation has also been prepared for high school students. Research and development work and information programmes are carried out at the VR-1 reactor as well.
Nicola, Stelian; Virag, Ioan; Stoicu-Tivadar, Lăcrămioara
New virtual reality based medical applications are providing a better understanding of healthcare-related subjects for both medical students and physicians. The work presented in this paper underlines gamification as a concept and uses VR as a new modality to study the human skeleton. The team proposes a mobile Android platform application based on the Unity 5.4 editor and the Google VR SDK. The results confirmed that the approach provides a more intuitive user experience during the learning process, concluding that the gamification of classical medical software provides an increased interactivity level for medical students during the study of the human skeleton.
Habgood, Jacob; Wilson, David; Moore, David; Alapont, Sergio
PlayStation VR has quickly built up a significant user-base of over a million headsets and its own ecosystem of games across a variety of genres. These games form part of a rapidly evolving testing ground for design solutions which can usefully inform HCI design for virtual reality. This paper reviews every PlayStation VR title released in the first three months of its lifecycle in order to identify emerging themes for locomotion. These themes are discussed with respect to the lessons learned...
The ability to recognize an individual from their voice is a widespread ability with a long evolutionary history. Yet, the perceptual representation of familiar voices is ill-defined. In two experiments, we explored the neuropsychological processes involved in the perception of voice identity. We specifically explored the hypothesis that familiar voices (trained-to-familiar voices in Experiment 1 and famous voices in Experiment 2) are represented as a whole complex pattern, well approximated by the average of multiple utterances produced by a single speaker. In Experiment 1, participants learned three voices over several sessions, and performed a three-alternative forced-choice identification task on original voice samples and several “speaker averages,” created by morphing across varying numbers of different vowels (e.g., [a] and [i]) produced by the same speaker. In Experiment 2, the same participants performed the same task on voice samples produced by familiar speakers. The two experiments showed that for famous voices, but not for trained-to-familiar voices, identification performance increased and response times decreased as a function of the number of utterances in the averages. This study sheds light on the perceptual representation of familiar voices, and demonstrates the power of averaging in recognizing familiar voices. The speaker average captures the unique characteristics of a speaker, and thus retains the information essential for recognition; it acts as a prototype of the speaker.
Wyatt, R. J.
For more than 15 years, fulldome video technology has transformed planetariums worldwide, using data-driven visualizations to support science storytelling. Fulldome video shares significant technical infrastructure with emerging VR headset technologies, and these personalized VR experiences allow for new audiences and new experiences of an existing library of content, as well as affording new opportunities for fulldome producers to explore. At the California Academy of Sciences, we are translating assets for our planetarium shows into immersive experiences for a variety of VR headsets. We have adapted scenes from our four award-winning features, Fragile Planet (2008), Life: A Cosmic Story (2010), Earthquake: Evidence of a Restless Planet (2012), and Habitat Earth (2015), to place viewers inside a virtual planetarium viewing the shows. Similarly, we have released two creative-commons mini-shows on various VR outlets. This presentation will also highlight content the Academy will make available from our upcoming 2016 planetarium show about asteroids, comets, and solar system origins, some of which has been formatted for a full four-pi-steradian perspective. The shared immersive environment of digital planetariums offers significant opportunities for education and affective engagement of STEM-hungry audiences, including students, families, and adults. With the advent of VR technologies, we can leverage the experience of fulldome producers and planetarium professionals to create personalized home experiences that allow new ways to experience their content.
Morie, Jacquelyn F.
Immersive Virtual Reality (VR) technology, while popular in the late part of the 20th Century, seemed to disappear from public view as social media took its place and captured the attention of millions. Now that a new generation of entrepreneurs and crowd-sourced funding campaigns have arrived, perhaps virtual reality is poised for a resurgence.
Djajadiningrat, J.P.; Gribnau, M.W.
In this month's final episode of our 'Cubby: Multiscreen Desktop VR' trilogy we explain how you read the InputSprocket driver from part II, how you use it as input for the cameras from part I and how you calibrate the input device so that it leads to the correct head position.
Gribnau, M.W.; Djajadiningrat, J.P.
In this second part of our 'Cubby: Multiscreen Desktop VR' trilogy, we will introduce you to the art of creating a driver to read an Origin Instruments Dynasight input device. With the Dynasight, the position of the head of the user is established so that Cubby can display the correct images on its
This guide, written for vocational rehabilitation (VR) agency policymakers and staff alike, deals with the concept of marketing from a VR perspective. Covered in the individual chapters of the guide are the meaning of the term marketing; a conceptual framework for marketing in a VR agency (product definition, target group definition, differential…
Bekele, E; Bian, D; Peterman, J; Park, S; Sarkar, N
Schizophrenia is a life-long, debilitating psychotic disorder with poor outcome that affects about 1% of the population. Although pharmacotherapy can alleviate some of the acute psychotic symptoms, residual social impairments present a significant barrier that prevents successful rehabilitation. With limited resources and access to social skills training opportunities, innovative technology has emerged as a potentially powerful tool for intervention. In this paper, we present a novel virtual reality (VR)-based system for understanding facial emotion processing impairments that may lead to poor social outcome in schizophrenia. We henceforth call it a VR System for Affect Analysis in Facial Expressions (VR-SAAFE). This system integrates a VR-based task presentation platform, which can minutely control the facial expressions of an avatar with or without accompanying verbal interaction, with an eye-tracker that quantitatively measures a participant's real-time gaze and a set of physiological sensors that infer his or her affective states. Together these allow an in-depth, quantitative understanding of the emotion recognition mechanism of patients with schizophrenia. A usability study with 12 patients with schizophrenia and 12 healthy controls was conducted to examine processing of the emotional faces. Preliminary results indicated that there were significant differences in the way patients with schizophrenia processed and responded towards the emotional faces presented in the VR environment compared with healthy control participants. The preliminary results underscore the utility of such a VR-based system that enables precise and quantitative assessment of social skill deficits in patients with schizophrenia.
Febriansyah; Zainuddin, Zahir; Bachtiar Nappu, M.
The voice-activated panic button application was developed to give the nearest police faster early notification of hazardous conditions in the community, using speech as the trigger. The current application still relies on touch combinations on the screen and on orders coordinated from a control center, so early notification takes longer. The method used in this research combined voice recognition, to detect the user's voice, with the haversine formula, to find the closest distance between the user and the police. The application also sends an automatic SMS notification to the victim’s relatives and is integrated with the Google Maps application (GMaps) to map the victim’s location. The results show that voice registration in the application reaches 100%, incident detection using speech recognition while the application is running averages 94.67%, and the automatic SMS to the victim's relatives reaches 100%.
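The haversine step mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the station dictionary, and the mean Earth radius of 6371 km are all assumptions introduced here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, not taken from the paper


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against tiny floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def nearest_station(user, stations):
    """Return (name, distance_km) of the station closest to `user`.

    `user` is a (lat, lon) tuple; `stations` maps a hypothetical station
    name to its (lat, lon) tuple.
    """
    name = min(stations, key=lambda k: haversine_km(*user, *stations[k]))
    return name, haversine_km(*user, *stations[name])
```

A caller would pass the user's GPS fix and a table of police post coordinates, then notify the returned station. For the short distances involved in a city, the spherical-Earth approximation behind the formula is usually adequate.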
Adhikary, Prakriti; Biswas, Anirban; Mandal, Dipankar
Composite nanofibers of Eu³⁺ doped poly(vinylidene fluoride-co-hexafluoropropylene) (P(VDF-HFP))/graphene are prepared by the electrospinning technique for the fabrication of ultrasensitive wearable piezoelectric nanogenerators (WPNGs) where the post-poling technique is not necessary. It is found that the complete conversion of the piezoelectric β-phase and the improvement of the degree of crystallinity is governed by the incorporation of Eu³⁺ and graphene sheets into P(VDF-HFP) nanofibers. The flexible nanocomposite fibers are associated with a hypersensitive electronic transition that results in an intense red light emission, and WPNGs also have the capability of detecting external pressure as low as ~23 Pa with a higher degree of acoustic sensitivity, ~11 V Pa⁻¹, than has ever been previously reported. This means that ultrasensitive WPNGs can be utilized to recognize human voices, which suggests they could be a potential tool in the biomedical and national security sectors. The capacitor’s ability to charge from abundant environmental vibrations, such as music, wind, body motion, etc., drives WPNGs as a power source for portable electronics. This fact may open up the prospect of using the Eu³⁺ doped P(VDF-HFP)/graphene composite electrospun nanofibers, with their multifunctional properties such as vibration sensitivity, wearability, red light emission capability and piezoelectric energy harvesting, for various promising applications in portable electronics, health care monitoring, noise detection and security monitoring.
Zäske, Romi; Mühl, Constanze; Schweinberger, Stefan R
Recognition of personally familiar voices benefits from the concurrent presentation of the corresponding speakers' faces. This effect of audiovisual integration is most pronounced for voices combined with dynamic articulating faces. However, it is unclear if learning unfamiliar voices also benefits from audiovisual face-voice integration or, alternatively, is hampered by attentional capture of faces, i.e., "face-overshadowing". In six study-test cycles we compared the recognition of newly-learned voices following unimodal voice learning vs. bimodal face-voice learning with either static (Exp. 1) or dynamic articulating faces (Exp. 2). Voice recognition accuracies significantly increased for bimodal learning across study-test cycles while remaining stable for unimodal learning, as reflected in numerical costs of bimodal relative to unimodal voice learning in the first two study-test cycles and benefits in the last two cycles. This was independent of whether faces were static images (Exp. 1) or dynamic videos (Exp. 2). In both experiments, slower reaction times to voices previously studied with faces compared to voices only may result from visual search for faces during memory retrieval. A general decrease of reaction times across study-test cycles suggests facilitated recognition with more speaker repetitions. Overall, our data suggest two simultaneous and opposing mechanisms during bimodal face-voice learning: while attentional capture of faces may initially impede voice learning, audiovisual integration may facilitate it thereafter.
... prevent voice problems and maintain a healthy voice: Drink water (stay well hydrated): Keeping your body well hydrated by drinking plenty of water each day (6-8 glasses) is essential to maintaining a healthy voice. The ...
Madden, D J; Bastian, J
Considerable evidence has indicated that some acoustical properties of spoken items are preserved in an "echoic" memory for approximately 2 sec. However, some of this evidence has also shown that changing the voice speaking the stimulus items has a disruptive effect on memory which persists longer than that of other acoustical variables. The present experiment examined the effect of voice changes on response bias as well as on accuracy in a recognition memory task. The task involved judging recognition probes as being present in or absent from sets of dichotically presented digits. Recognition of probes spoken in the same voice as that of the dichotic items was more accurate than recognition of different-voice probes at each of three retention intervals of up to 4 sec. Different-voice probes increased the likelihood of "absent" responses, but only up to a 1.4-sec delay. These shifts in response bias may represent a property of echoic memory which should be investigated further.
Troya Moreno, Jorge
Software development for Virtual Reality (VR) became popular in 2016, alongside products such as Unity 3D and Oculus, especially in fields such as video games, tourism, media and marketing. But software development for VR is complex because it adds requirements that software does not normally have. Newcomers to the Decoroso Crespo Laboratory, who join new groups to develop VR software using Unity 3D as a development platform, find it difficult to integrate some of...
Van Huyssteen, G
Multilingual emerging markets hold many opportunities for the application of spoken language technologies, such as automatic speech recognition (ASR) or text-to-speech (TTS) technologies in interactive voice response (IVR) systems. However...
Sklenka, L.; Kropik, M.
This paper describes one of the main purposes of the VR-1 training reactor: its extensive educational program. The educational program is intended for the training of university students and selected nuclear power plant personnel. The training courses provide them with experience in reactor and neutron physics, dosimetry, nuclear safety and operation of nuclear facilities. At present, the training course participants can go through more than 20 standard experimental exercises; particular exercises for special training can be prepared. Approximately 200 university students become familiar with the reactor (lectures, experiments, experimental and diploma works, etc.) every year. About 12 different faculties from Czech universities use the reactor. International co-operation with European universities in Germany, Hungary, Austria, Slovakia, Holland and UK is frequent. The VR-1 reactor also takes part in the Eugene Wigner Course on Reactor Physics Experiments in the framework of the European Nuclear Educational Network (ENEN) association. Recently, training courses for Bulgarian research reactor specialists supported by IAEA were carried out. An attractive program including a demonstration of reactor operation is also prepared for high school students. Every year, more than 1500 high school students come to visit the reactor, as do many foreign visitors. (author)
There are three nuclear research reactors in operation in the Czech Republic now: the light water reactor LVR-15, maximum reactor power 10 MWt, owned and operated by the Nuclear Research Institute Rez; the light water zero power reactor LR-0, maximum reactor power 5 kWt, owned and operated by the Nuclear Research Institute Rez; and the training reactor VR-1 Sparrow, maximum reactor power 5 kWt, owned and operated by the Faculty of Nuclear Sciences and Physical Engineering, CTU in Prague. The training reactor VR-1 Vrabec ('Sparrow'), operated at the Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University in Prague, was started up on December 3, 1990. It is designed particularly for training students of Czech universities and preparing experts for the Czech nuclear programme, as well as for certain research work and for information programmes in the sphere of nuclear energy use (public relations). (author)
Champel, Mary-Luc; Doré, Renaud; Mollet, Nicolas
For many years, Virtual Reality has been presented as a promising technology that could deliver a truly new experience to users. The media and entertainment industry is now investigating the possibility to offer a video-based VR 360 experience. Nevertheless, there is a substantial risk that VR 360 could have the same fate as 3DTV if it cannot offer more than just being the next fad. The present paper aims at presenting the various quality factors required for a high-quality VR experience. More specifically, this paper will focus on the main three VR quality pillars: visual, audio and immersion.
Byker, Erik Jon; Putman, S. Michael; Handler, Laura; Polly, Drew
Student Voice is a term that honors the participatory roles that students have when they enter learning spaces like classrooms. Student Voice is the recognition of students' choice, creativity, and freedom. Seminal educationists--like Dewey and Montessori--centered the purposes of education in the flourishing and valuing of Student Voice. This…
Matejka, Karel; Sklenka, Lubomir
Full text: The training reactor VR-1 Vrabec ('Sparrow'), operated at the Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University in Prague, was started up on December 3, 1990. It is designed and operated particularly for training students from Czech universities and preparing experts for the Czech nuclear programme, as well as for certain research and development work, and for information programmes in the sphere of non-military nuclear energy use (public relations). The VR-1 training reactor is a pool-type light-water reactor based on enriched uranium, with a maximum thermal power of 1 kWth and up to 5 kWth for short periods. The neutron moderator is light demineralized water (H₂O), which is also used as a reflector, biological shielding, and coolant. Heat is removed from the core by natural convection. The reactor core contains 14 to 18 IRT-3M fuel assemblies, depending on the geometric arrangement and the kind of experiments to be performed in the reactor. The core is accommodated in a cylindrical stainless steel vessel (the pool), which is filled with water. UR-70 control rods are used for reactor control and safe shutdown. Training at the VR-1 reactor provides students with experience in reactor and neutron physics, dosimetry, nuclear safety, and nuclear installation operation. Students from technical universities and from natural sciences universities come to the reactor for training. Approximately 200 university students are introduced to the reactor (lectures, experiments, experimental and diploma works, etc.) every year. About 12 different faculties from Czech universities use the reactor. International co-operation with European universities in Germany, Hungary, Austria, Slovakia, Holland and UK is frequent. A Practical Course on Reactor Physics in the framework of the European Nuclear Engineering Network has been newly introduced. Currently, students can try out more than 20 experimental exercises. Further training courses have been included
Coomans, M.K.D.; Timmermans, H.J.P.
We present the design of a Virtual Reality based user interface (VR-UI). It is the interface for the VR-DIS system, a design application for the Building and Construction industry (VRDIS stands for Virtual Reality - Design Information System). The interface is characterised by a mixed representation
This paper explores the development of virtual reality (VR) use in education and the emergence of mobile VR based content creation and sharing as a platform for enabling learner-generated content and learner-generated contexts. The author argues that an ecology of resources that maps the user content creation and sharing affordances of mobile…
Gao, Yayue; Cao, Shuyang; Qu, Tianshu; Wu, Xihong; Li, Haifeng; Zhang, Jinsheng; Li, Liang
In noisy, multipeople talking environments such as a cocktail party, listeners can use various perceptual and/or cognitive cues to improve recognition of target speech against masking, particularly informational masking. Previous studies have shown that temporally prepresented voice cues (voice primes) improve recognition of target speech against speech masking but not noise masking. This study investigated whether static face image primes that have become target-voice associated (i.e., facial images linked through associative learning with voices reciting the target speech) can be used by listeners to unmask speech. The results showed that in 32 normal-hearing younger adults, temporally prepresenting a voice-priming sentence with the same voice reciting the target sentence significantly improved the recognition of target speech that was masked by irrelevant two-talker speech. When a person's face photograph image became associated with the voice reciting the target speech by learning, temporally prepresenting the target-voice-associated face image significantly improved recognition of target speech against speech masking, particularly for the last two keywords in the target sentence. Moreover, speech-recognition performance under the voice-priming condition was significantly correlated to that under the face-priming condition. The results suggest that learned facial information on talker identity plays an important role in identifying the target-talker's voice and facilitating selective attention to the target-speech stream against the masking-speech stream. © 2014 The Institute of Psychology, Chinese Academy of Sciences and Wiley Publishing Asia Pty Ltd.
Bele, Irene Velsvik
This study concerns speaking voice quality in a group of male teachers (n = 35) and male actors (n = 36), as the purpose was to investigate normal and supranormal voices. The goal was the development of a method of valid perceptual evaluation for normal to supranormal and resonant voices. The voices (text reading at two loudness levels) had been evaluated by 10 listeners, for 15 vocal characteristics using VA scales. In this investigation, the results of an exploratory factor analysis of the vocal characteristics used in this method are presented, reflecting four dimensions of major importance for normal and supranormal voices. Special emphasis is placed on the effects on voice quality of a change in the loudness variable, as two loudness levels are studied. Furthermore, the vocal characteristics Sonority and Ringing voice quality are paid special attention, as the essence of the term "resonant voice" was a basic issue throughout a doctoral dissertation where this study was included.
Zane Z Zheng
We describe an illusion in which a stranger's voice, when presented as the auditory concomitant of a participant's own speech, is perceived as a modified version of their own voice. When the congruence between utterance and feedback breaks down, the illusion is also broken. Compared to a baseline condition in which participants heard their own voice as feedback, hearing a stranger's voice induced robust changes in the fundamental frequency (F0) of their production. Moreover, the shift in F0 appears to be feedback dependent, since shift patterns depended reliably on the relationship between the participant's own F0 and the stranger-voice F0. The shift in F0 was evident both when the illusion was present and after it was broken, suggesting that auditory feedback from production may be used separately for self-recognition and for vocal motor control. Our findings indicate that self-recognition of voices, like other body attributes, is malleable and context dependent.
In this Teaching Tips article, the author argues for a dialogic conception of voice, based in the work of Mikhail Bakhtin. He demonstrates a dialogic view of voice in action, using two writing examples about the same topic from his daughter, a fifth-grade student. He then provides five practical tips for teaching a dialogic conception of voice in…
Most judicial opinions, for a variety of reasons, do not speak with the voice of identifiable judges, but an analysis of several of John Marshall’s best known opinions reveals a distinctive voice, with its characteristic language and style of argumentation. The power of this voice helps to account for the influence of his views.
PROF. OLIVER OSUAGWA
Dec 1, 2015 ... this technology and presented how to integrat VR with traditional instructional ... training has forced organizations to adopt new .... skills in a safe, controlled environment ... phone charger battery pack [B] connected to.
In this article, we report users’ perceptions of query input errors and query reformulation strategies in voice search using data collected through a laboratory user study. Our results reveal that: (1) users’ perceived obstacles during a voice search can be related to speech recognition errors and topic complexity; (2) users naturally develop different strategies to deal with various types of words (e.g., acronyms, single-worded queries, non-English words) with high error rates in speech recognition; and (3) users can have various emotional reactions when they encounter voice input errors, and they develop preferred usage occasions for voice search.
A.B. Muhammad Firdaus
These days, the automobile has become one of the most popular modes of transportation because a large number of Malaysians can afford to own a car. Many in-car technologies are available on the market. One of them is the voice-controlled system. Voice recognition is the process of automatically recognizing a certain word spoken by a particular speaker based on individual information included in speech waves. This paper aims to build a car controlled by the human voice. An essential pre-processing step in voice recognition systems is to detect the presence of noise. Sensitivity to speech variability, insufficient recognition accuracy, and susceptibility to impersonation are among the main technical obstacles that prevent the widespread adoption of speech-based recognition systems. Voice recognition systems work reasonably well in quiet conditions but poorly in noisy conditions or over distorted channels. The key focus of the project is to control an electric car starter system.
O. N. Faizulaieva
The advisability of using the voice of a computer system user in the authentication process is substantiated. The scientific task of improving the signal/noise ratio of the user's voice signal in the authentication system is considered. The object of study is the process of input and output of the voice signal of an authentication system user in computer systems and networks. Methods and means for the input and extraction of the voice signal against external interference signals are researched. Methods for enhancing the quality of the user's voice signal in voice authentication systems are suggested. As modern computer facilities, including mobile ones, have a two-channel audio card, the use of two microphones is proposed in the voice signal input system of the authentication system. In addition, the task of forming a lobe of the microphone array in the desired area of voice signal registration (100 Hz to 8 kHz) is solved. The directional properties of the proposed microphone array make it possible to reduce the influence of external interference signals by a factor of two to three in the frequency range from 4 to 8 kHz. The possibilities for implementing space-time processing of the recorded signals using constant and adaptive weighting factors are investigated. The simulation results of the proposed system for input and extraction of signals during digital processing of narrowband signals are presented. The proposed solutions make it possible to improve the signal/noise ratio of the recorded useful signals by 10 to 20 dB under the influence of external interference signals in the frequency range from 4 to 8 kHz. The results may be useful to specialists working in the field of voice recognition and speaker discrimination.
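The constant-weight case of the two-microphone processing described above can be illustrated with a toy sketch. It is not the authors' system: it assumes the target voice reaches both microphones in phase, that the noise at each microphone is independent and of equal power, and it uses a 1 kHz tone as a stand-in for speech. All names and parameter values are hypothetical.

```python
import math
import random


def snr_db(clean, noisy):
    """SNR in dB of `noisy`, measured against the known clean signal."""
    p_signal = sum(s * s for s in clean)
    p_noise = sum((x - s) ** 2 for s, x in zip(clean, noisy))
    return 10 * math.log10(p_signal / p_noise)


def weighted_sum(ch1, ch2, w1=0.5, w2=0.5):
    """Combine two channels with constant weights (a two-element sum beamformer)."""
    return [w1 * a + w2 * b for a, b in zip(ch1, ch2)]


rng = random.Random(0)   # fixed seed so the run is repeatable
fs = 16000               # assumed sample rate in Hz
n = 16000                # one second of samples
clean = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(n)]

# Each microphone hears the same target plus its own independent noise.
mic1 = [s + rng.gauss(0, 0.5) for s in clean]
mic2 = [s + rng.gauss(0, 0.5) for s in clean]

combined = weighted_sum(mic1, mic2)
gain_db = snr_db(clean, combined) - snr_db(clean, mic1)
print(f"SNR gain from summing two microphones: {gain_db:.2f} dB")
```

Under these assumptions the theoretical gain is 10·log10(2), about 3 dB, because the in-phase target adds coherently while the independent noise does not. The larger improvements reported in the abstract depend on the array's directional properties and on adaptive weighting against directional interference, which this sketch does not model.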
Dastolfo-Hromack, Christina; Thomas, Tracey L; Rosen, Clark A; Gartner-Schmidt, Jackie
The objectives of this study were to describe singing voice therapy (SVT), describe referred patient characteristics, and document the outcomes of SVT. Retrospective. Records of patients receiving SVT between June 2008 and June 2013 were reviewed (n = 51). All diagnoses were included. Demographic information, number of SVT sessions, and symptom severity were retrieved from the medical record. Symptom severity was measured via the 10-item Singing Voice Handicap Index (SVHI-10). Treatment outcome was analyzed by diagnosis, history of previous training, and SVHI-10. SVHI-10 scores decreased following SVT (mean change = 11, 40% decrease) (P singing lessons (n = 10) also completed an average of three SVT sessions. Primary muscle tension dysphonia (MTD1) and benign vocal fold lesion (lesion) were the most common diagnoses. Most patients (60%) had previous vocal training. SVHI-10 decrease was not significantly different between MTD and lesion. This is the first outcome-based study of SVT in a disordered population. Diagnosis of MTD or lesion did not influence treatment outcomes. Duration of SVT was short (approximately three sessions). Voice care providers are encouraged to partner with a singing voice therapist to provide optimal care for the singing voice. This study supports the use of SVT as a tool for the treatment of singing voice disorders. 4 Laryngoscope, 126:2546-2551, 2016. © 2016 The American Laryngological, Rhinological and Otological Society, Inc.
Charles V. Smith III
Control systems driven by voice recognition software have been implemented before but lacked the context driven approach to generate relevant responses and actions. A partially voice activated control system for mobile robotics is presented that allows an autonomous robot to interact with people and the environment in a meaningful way, while dynamically creating customized tours. Many existing control systems also require substantial training for voice application. The system proposed requires little to no training and is adaptable to chaotic environments. The traversable area is mapped once and from that map a fully customized route is generated to the user
will be based on a reception aesthetic and phenomenological approach, the latter as presented by Don Ihde in his book Listening and Voice: Phenomenologies of Sound, and my analytical sketches will be related to theoretical statements concerning the understanding of voice and media (Cavarero, Dolar, LaBelle, Neumark). Finally, the article will discuss the specific artistic combination and our auditory experience of mediated human voices and sculpturally projected faces in an art museum context under the general conditions of the societal panophonia of disembodied and mediated voices, as promoted by Steven...
Hadden, David R.; Haratz, David
The application of speech recognition technology in the Army command and control area is presented. The problems associated with this program are described, as well as its relevance in terms of man/machine interactions, voice inflexions, and the amount of training needed to interact with and utilize the automated system.
In parallel with the development of technology, various control methods are also being developed. Voice control is one of these methods. In this study, an effective model is built on the mathematical models used in the literature, and a voice control system is developed to control prosthetic robot arms. The developed control system has been applied to a four-jointed RRRR robot arm. Implementation tests were performed on the designed system. The tests showed that the technique used in our system achieves about 11% more efficient voice recognition than techniques currently used in the literature. With the improved mathematical modelling, it has been shown that voice commands can be used effectively to control the prosthetic robot arm. Keywords: Voice recognition model, Voice control, Prosthetic robot arm, Robotic control, Forward kinematic
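The forward kinematics named in the keywords above can be sketched for a four-revolute-joint (RRRR) chain. The abstract gives neither the arm's geometry nor its link lengths, so this sketch assumes a planar arm with hypothetical unit-length links; it illustrates the general technique, not the authors' model.

```python
import math


def forward_kinematics_planar(joint_angles, link_lengths):
    """End-effector pose (x, y, heading) of a planar serial revolute chain.

    joint_angles: joint rotations in radians, each relative to the previous link.
    link_lengths: link lengths in the same order (hypothetical values).
    """
    if len(joint_angles) != len(link_lengths):
        raise ValueError("need one link length per joint")
    x = y = heading = 0.0
    for q, length in zip(joint_angles, link_lengths):
        heading += q                      # accumulate the absolute link angle
        x += length * math.cos(heading)   # advance along the current link
        y += length * math.sin(heading)
    return x, y, heading


# Example: a recognized voice command could map to a joint-angle preset.
presets = {"rest": [0.0, 0.0, 0.0, 0.0],
           "raise": [math.pi / 2, 0.0, 0.0, 0.0]}
links = [1.0, 1.0, 1.0, 1.0]
pose = forward_kinematics_planar(presets["raise"], links)
```

Mapping spoken commands to joint presets, as in the `presets` dictionary, is one simple way a recognizer's output could drive such an arm; a spatial (non-planar) arm would need full homogeneous transforms instead.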
Huang, Ta-Ko; Yang, Chi-Hsun; Hsieh, Yu-Hsin; Wang, Jen-Chyan; Hung, Chun-Cheng
The OSCE is a reliable method for evaluating the preclinical skills of dental students. The ideal OSCE assessment uses an augmented reality simulator. This literature review surveys recent developments in virtual reality (VR) and augmented reality (AR), from the history of dentistry to the development of dental skills. Because of technological limitations, surgery has had to depend on other devices to increase the success rate and decrease the risk. The development of tracking units has changed surgical and educational practice. Clinical surgery is based on mature education. VR and AR have simultaneously affected skills training and navigation systems. More broadly, VR and AR are applied not only in dental training and surgery but have also improved many other fields of daily life. Copyright © 2018. Published by Elsevier Taiwan.
Hsieh, Min-Chai; Lin, Yu-Hsuan
As technology advances, mobile devices have gradually turned into wearable devices. Furthermore, virtual reality (VR), augmented reality (AR), and mixed reality (MR) are being increasingly applied in medical fields such as medical education and training, surgical simulation, neurological rehabilitation, psychotherapy, and telemedicine. Research results demonstrate the ability of VR, AR, and MR to ameliorate the inconveniences that are often associated with traditional medical care, reduce incidents of medical malpractice caused by unskilled operations, and reduce the cost of medical education and training. What is more, the application of these technologies has enhanced the effectiveness of medical education and training, raised the level of diagnosis and treatment, improved the doctor-patient relationship, and boosted the efficiency of medical execution. The present study introduces VR, AR, and MR applications in medical practice and education with the aim of helping health professionals better understand the applications and use these technologies to improve the quality of medical care.
Kreylos, O.; Kellogg, L. H.
Immersive visualization using virtual reality (VR) display technology offers tremendous benefits for the visual analysis of complex three-dimensional data like those commonly obtained from geophysical and geological observations and models. Unlike "traditional" visualization, which has to project 3D data onto a 2D screen for display, VR can side-step this projection and display 3D data directly, in a pseudo-holographic (head-tracked stereoscopic) form, and does therefore not suffer the distortions of relative positions, sizes, distances, and angles that are inherent in 2D projection. As a result, researchers can apply their spatial reasoning skills to virtual data in the same way they can to real objects or environments. The UC Davis W.M. Keck Center for Active Visualization in the Earth Sciences (KeckCAVES, http://keckcaves.org) has been developing VR methods for data analysis since 2005, but the high cost of VR displays has been preventing large-scale deployment and adoption of KeckCAVES technology. The recent emergence of high-quality commodity VR, spearheaded by the Oculus Rift and HTC Vive, has fundamentally changed the field. With KeckCAVES' foundational VR operating system, Vrui, now running natively on the HTC Vive, all KeckCAVES visualization software, including 3D Visualizer, LiDAR Viewer, Crusta, Nanotech Construction Kit, and ProtoShop, are now available to small labs, single researchers, and even home users. LiDAR Viewer and Crusta have been used for rapid response to geologic events including earthquakes and landslides, to visualize the impacts of sealevel rise, to investigate reconstructed paleooceanographic masses, and for exploration of the surface of Mars. The Nanotech Construction Kit is being used to explore the phases of carbon in Earth's deep interior, while ProtoShop can be used to construct and investigate protein structures.
Kosztyła-Hojna, Bożena; Kuryliszyn-Moskal, Anna; Rogowski, Marek; Moskal, Diana; Dakowicz, Agnieszka; Falkowski, Dawid; Kasperuk, Joanna
Hyperfunctional dysphonia is the most frequent type of occupational functional dysphonia. Pharmacotherapy, physiotherapy and psychotherapy are used in the treatment of occupational dysphonia. Vibratory massage of the laryngeal region relaxes the external muscles of the neck, which has an indirect impact on the tension of the vocal folds. The aim of the study is to assess the impact of vibratory stimulation therapy on voice quality in patients with hyperfunctional occupational dysphonia treated pharmacologically. Forty patients with hyperfunctional occupational dysphonia treated phoniatrically in the Phoniatric Outpatient Clinic were included in the study. Patients were divided into two groups. Group I consisted of 20 patients treated pharmacologically. In group II, which included 20 patients, vibratory stimulation therapy with a VR-type device (CyberBioMed LLC) was used in addition to pharmacotherapy. Voice quality was analyzed by evaluating vocal fold vibration with videolaryngostroboscopy and by acoustic assessment of the voice. The perceptual assessment of voice, the visualization of vocal fold vibration in stroboscopic examination of the larynx and the acoustic assessment of voice enable appropriate diagnosis of the clinical type and voice quality in hyperfunctional dysphonia. The tension of the superficial and deep muscles of the neck has an impact on the phonatory function of the larynx. Pharmacological treatment improves voice quality in hyperfunctional occupational dysphonia. Pharmacological treatment combined with relaxation of the neck muscles using the VR-type device significantly improves voice quality in hyperfunctional occupational dysphonia. Copyright © 2012 Polish Otolaryngology Society. Published by Elsevier Urban & Partner Sp. z.o.o. All rights reserved.
Lee, Il S.; Yoon, Sang H.; Shim, Kyu W.; Yu, Yong H.; Suh, Kune Y.
There continues to be an increasing demand for electricity around the globe to fuel industrial growth and to promote human welfare. Economic activities have enriched our material and cultural lives, and in that process electric power has been at the heart of the versatile energy sources. In order to respond in a timely and competitive manner to the rapidly changing energy environment of the twenty-first century, there is a growing need to build advanced nuclear power plants in the unlimited workspace of virtual reality (VR) prior to commissioning. One can then realistically evaluate their construction time and cost for the varying methods and options available from leading-edge technology. In particular, a great deal of effort has yet to be made on time- and cost-dependent plant simulation and dynamically coupled database construction in the VR space. Operator training and personnel education may also benefit from VR technology. The present work is proposed in three-dimensional space and time plus cost coordinates, i.e. four plus dimensional (4 + D) coordinates. The 4 + D VR application will enable the nuclear industry to narrow the technological gap with other leading industries that have long been employing VR engineering. The 4 + D technology will help nurture public understanding of the special discipline of nuclear power plants. The technology will also facilitate public access to knowledge of nuclear science and engineering, which has so far been monopolized by academia, national laboratories and heavy industry. The 4 + D virtual design and construction will open up a new horizon for revitalization of the nuclear industry around the globe in the foreseeable future. Considering the long construction and operation time for the nuclear power plants, the preliminary VR simulation capability for the plants will supply the vital information not only for the actual design and construction of the
Examines two methods of generating synthetic speech in voice response systems, which allow computers to communicate in human terms (speech), using human interface devices (ears): phoneme and reconstructed voice systems. Considerations prior to implementation, current and potential applications, glossary, directory, and introduction to Input Output…
Fusaroli, Riccardo; Weed, Ethan
Anomalous aspects of speech and voice, including pitch, fluency, and voice quality, are reported to characterise many mental disorders. However, it has proven difficult to quantify and explain this oddness of speech by employing traditional statistical methods. In this talk we will show how...
O. N. Faizulaieva
The scientific task of improving the signal-to-noise ratio of the user's voice signal in computer systems and networks during voice authentication is considered. The object of study is the process of input and extraction of the voice signal of an authentication system user in computer systems and networks. Methods and means for input and extraction of the voice signal against a background of external interference signals are investigated. Ways of improving the quality of the user's voice signal in voice authentication systems are investigated experimentally. Firmware means for an experimental unit for input and extraction of the user's voice signal under external interference are considered. As modern computers, including mobile ones, have a two-channel audio card, two microphones are used for voice signal input. The distance between the sonic-wave sensors is 20 mm, which forms a single directional-pattern lobe of the microphone array in the desired area of voice signal registration (from 100 Hz to 8 kHz). According to the results of experimental studies, using the directional properties of the proposed microphone array and space-time processing of the recorded signals with constant and adaptive weighting factors has made it possible to reduce considerably the influence of interference signals. The results of firmware experimental studies for input and extraction of the user's voice signal under external interference are shown. The proposed solutions make it possible to improve the signal-to-noise ratio of the recorded useful signals by up to 20 dB under the influence of external interference signals in the frequency range from 4 to 8 kHz. The results may be useful to specialists working in the field of voice recognition and speaker discrimination.
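The relation between the 20 mm spacing and the 100 Hz to 8 kHz band can be checked with a short calculation. The following is a minimal sketch, not the authors' processing chain: it assumes a speed of sound of 343 m/s, plane-wave arrival, and constant equal weights only (the adaptive weighting mentioned in the abstract is not modeled). The function names are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value (air at roughly 20 degrees C)
MIC_SPACING = 0.020     # m, the 20 mm spacing stated in the abstract


def array_gain(freq_hz, angle_deg, spacing=MIC_SPACING, c=SPEED_OF_SOUND):
    """Magnitude response of an equal-weight sum of two microphones.

    angle_deg is measured from broadside: 0 means the source is straight
    ahead, 90 means it lies along the line joining the two microphones.
    """
    # Extra travel time to the second microphone for a plane wave.
    delay = spacing * math.sin(math.radians(angle_deg)) / c
    phase = 2.0 * math.pi * freq_hz * delay
    # Average of the two channels, one of them phase-shifted by the delay.
    return abs(1.0 + cmath.exp(-1j * phase)) / 2.0


def max_unaliased_freq(spacing=MIC_SPACING, c=SPEED_OF_SOUND):
    """Highest frequency at which the spacing is at most half a wavelength."""
    return c / (2.0 * spacing)


if __name__ == "__main__":
    print(round(max_unaliased_freq()))       # half-wavelength limit in Hz
    for f in (100.0, 4000.0, 8000.0):
        print(f, round(array_gain(f, 90.0), 3))  # gain for a source on the mic axis
```

Under these assumptions the half-wavelength limit is about 8.6 kHz, just above the 8 kHz upper band edge, which is consistent with a single main lobe across the stated band. The sketch also shows that a fixed two-microphone sum attenuates off-axis sound mainly at the higher frequencies and barely at all near 100 Hz.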
Kisilevsky, Barbara S.; Hains, Sylvia M. J.
Background: Term fetuses discriminate their mother's voice from a female stranger's, suggesting recognition/learning of some property of her voice. Identification of the onset and maturation of the response would increase our understanding of the influence of environmental sounds on the development of sensory abilities and identify the period when…
Alphen, P.M. van; McQueen, J.M.
Effects on spoken-word recognition of prevoicing differences in Dutch initial voiced plosives were examined. In 2 cross-modal identity-priming experiments, participants heard prime words and nonwords beginning with voiced plosives with 12, 6, or 0 periods of prevoicing or matched items beginning
Recognition is the main word attached to multicultural perspectives. The multicultural call for recognition, the one calling for the recognition of cultural minorities and identities, the one now voiced by liberal states all over and also in Israel was a more difficult one. It took the author some time to realize that calling for the recognition…
S.PON SANGEETHA; DR.M.KARNAN
Security plays an important role in any type of organization today. Iris recognition is one of the leading automatic biometric systems in the area of security and is used to identify individuals. Biometric systems include fingerprints, facial features, voice recognition, hand geometry, handwriting, the eye retina and the most secure one, presented in this paper, iris recognition. Biometric systems have become very popular in security systems because it is not possi...
Gallo, Luigi; Pietro, Giuseppe De
Virtual Reality (VR) technologies make it possible to reproduce faithfully real life events in computer-generated scenarios. This approach has the potential to simplify the way people solve problems, since they can take advantage of their real life experiences while interacting in synthetic worlds.
Most technical programs in Nigeria's tertiary institutions lack the laboratories needed to impart technical skills to students. This has led to the production of pseudo-illiterates as graduates and helps explain why many employers say Nigerian graduates are not employable. Virtual Reality (VR) can ...
Lee, Il-Suk; Yoon, Sang-Hyuk; Suh, Kune Y.
For a timely and competitive response to the rapidly changing energy environment of the twenty-first century, there is a growing need to build advanced nuclear power plants in the unlimited workspace of virtual reality (VR) prior to commissioning. One can then realistically evaluate their construction time and cost for the varying methods and options available from leading-edge technology. In particular, a great deal of effort has yet to be made on time- and cost-dependent plant simulation and dynamically coupled database construction in the VR space. The present work is proposed in three-dimensional space and time plus cost coordinates, i.e. four plus dimensional (4 + D) coordinates. The 4 + D VR technology™ will help the preliminary VR simulation capability for the plants will supply the vital information not only for the actual design and construction of the engineered structures but also for the on-line design modification. Quite a few companies and research institutions have supplied various information services to the nuclear market. A great deal of the information exists in the form of reports, articles and books, which are merely simple text and graphic images. But if methods for transferring very large and important bodies of information are developed for nuclear plants by means of the 4 + D technology database, they will greatly benefit designers, manufacturers, users and even the public. Moreover, one can clearly understand the total structure of a nuclear plant if the 4 + D VR technology™ database operates together with the transient analysis simulator. This technique should be available for public information about the nuclear industry as well as nuclear plant structure and components. By using the 4 + D VR technology™ one can supply users with information that could not have been expressed with existing technology. Users can not only spin or observe closely the structural elements by simple mouse control, but also know
This study was undertaken to provide information on the voice of patients following radiotherapy for glottic cancer. Part I presents findings from questionnaires returned by 227 of 235 patients successfully irradiated for glottic cancer from 1960 through 1971. Part II presents preliminary findings on the speaking fundamental frequencies of 22 irradiated patients. Normal to near-normal voice was reported by 83 percent of the 227 patients; however, 80 percent did indicate persisting vocal difficulties such as fatiguing of voice with much usage, inability to sing, reduced loudness, hoarse voice quality and inability to shout. Amount of talking during treatments appeared to affect length of time for voice to recover following treatments in those cases where it took from nine to 26 weeks; also, with increasing years since treatment, patients rated their voices more favorably. Smoking habits following treatments improved significantly with only 27 percent smoking heavily as compared with 65 percent prior to radiation therapy. No correlation was found between smoking (during or after treatments) and vocal ratings or between smoking and length of time for voice to recover. There was no relationship found between reported vocal ratings and stage of the disease
Music teachers are in a class all their own when it comes to voice use. These elite vocal athletes require stamina, strength, and flexibility from their voices day in, day out for hours at a time. Voice rehabilitation clinics and research show that music education ranks high among the professionals most commonly affected by voice problems.…
Sorokin, V. N.; Makarov, I. S.
Efficiency of automatic recognition of male and female voices based on solving the inverse problem for glottis area dynamics and for waveform of the glottal airflow volume velocity pulse is studied. The inverse problem is regularized through the use of analytical models of the voice excitation pulse and of the dynamics of the glottis area, as well as the model of one-dimensional glottal airflow. Parameters of these models and spectral parameters of the volume velocity pulse are considered. The following parameters are found to be most promising: the instant of maximum glottis area, the maximum derivative of the area, the slope of the spectrum of the glottal airflow volume velocity pulse, the amplitude ratios of harmonics of this spectrum, and the pitch. On the plane of the first two main components in the space of these parameters, an almost twofold decrease in the classification error relative to that for the pitch alone is attained. The male voice recognition probability is found to be 94.7%, and the female voice recognition probability is 95.9%.
Ebbesen, Marius; Ahsan, Sabeel
Recent technological development has led virtual reality (VR) head mounted displays (HMD) to become commercially available to the mass market. Consumers have started to adopt the technology quickly, and forecasts for the VR industry are very promising for the upcoming years. However, little research has been conducted on the effects of exposure to immersive VR video through HMDs. Our aim for this thesis has been to investigate the effects of exposure to VR video and uncover the underlying mec...
Yan Ming Cheng
We describe two voice-to-phoneme conversion algorithms for speaker-independent voice-tag creation specifically targeted at applications on embedded platforms. These algorithms (batch mode and sequential) are compared in speech recognition experiments where they are first applied in a same-language context in which both acoustic model training and voice-tag creation and application are performed on the same language. Then, their performance is tested in a cross-language setting where the acoustic models are trained on a particular source language while the voice-tags are created and applied on a different target language. In the same-language environment, both algorithms either perform comparably to or significantly better than the baseline where utterances are manually transcribed by a phonetician. In the cross-language context, the voice-tag performances vary depending on the source-target language pair, with the variation reflecting predicted phonological similarity between the source and target languages. Among the most similar languages, performance nears that of the native-trained models and surpasses the native reference baseline.
Herrera Martínez, Marcelo; Aldana Blanco, Andrea Lorena; Guzmán Palacios, Ana María
The present paper describes the development of software for the analysis of acoustic voice parameters (APAVOIX), which can be used for forensic acoustic purposes based on speaker recognition and identification. This software makes it possible to observe clearly the parameters that are necessary and sufficient when comparing two voice signals, the suspect one and the original one. These parameters are used according to the classic method, generally used by state entit...
Gunkel, S.N.B.; Stokking, H.M.; Prins, M.J.; Stap, N. van der; Haar, F.B. ter; Niamut, O.A.
Virtual Reality (VR) and 360-degree video are set to become part of the future social environment, enriching and enhancing the way we share experiences and collaborate remotely. While Social VR applications are getting more momentum, most services regarding Social VR focus on animated avatars. In
Boutin, Daniel L.; Wilson, Keith B.
The relationship of receiving college and university training within the state vocational rehabilitation (VR) program to pre-VR consumer characteristics was investigated with a multiple direct logistic regression technique. A model containing 11 pre-VR characteristics predicted the reception of college and university training for a multidisability…
Pauletto, Sandra; Balentine, Bruce; Pidcock, Chris; Jones, Kevin; Bottaci, Leonardo; Aretoulaki, Maria; Wells, Jez; Mundy, Darren P; Balentine, James
Emotion in audio-voice signals, as synthesized by text-to-speech (TTS) technologies, was investigated to formulate a theory of expression for user interface design. Emotional parameters were specified with markup tags, and the resulting audio was further modulated with post-processing techniques. Software was then developed to link a selected TTS synthesizer with an automatic speech recognition (ASR) engine, producing a chatbot that could speak and listen. Using these two artificial voice subsystems, investigators explored both artistic and psychological implications of artificial speech emotion. Goals of the investigation were interdisciplinary, with interest in musical composition, augmentative and alternative communication (AAC), commercial voice announcement applications, human-computer interaction (HCI), and artificial intelligence (AI). The work-in-progress points towards an emerging interdisciplinary ontology for artificial voices. As one study output, HCI tools are proposed for future collaboration.
Tiwari, Manjul; Tiwari, Maneesha
Voices are important to humans. They are the medium through which we do much of our communicating with the outside world: our ideas, of course, and also our emotions and our personality. The voice is the very emblem of the speaker, indelibly woven into the fabric of speech. In this sense, each of our utterances of spoken language carries not only its own message; through accent, tone of voice and habitual voice quality, it is at the same time an audible declaration of our membership of particular social and regional groups, of our individual physical and psychological identity, and of our momentary mood. Voices are also one of the media through which we (successfully, most of the time) recognize other humans who are important to us: members of our family, media personalities, our friends, and enemies. Although evidence from DNA analysis is potentially vastly more eloquent in its power than evidence from voices, DNA cannot talk. It cannot be recorded planning, carrying out or confessing to a crime. It cannot be so apparently directly incriminating. As will quickly become evident, voices are extremely complex things, and some of the inherent limitations of the forensic-phonetic method are in part a consequence of the interaction between their complexity and the real world in which they are used. It is one of the aims of this article to explain how this comes about. This subject has unsolved questions, but there is no direct way to present the information that is necessary to understand how voices can be related, or not, to their owners.
Rantala, Leena M; Hakala, Suvi J; Holmqvist, Sofia; Sala, Eeva
The aim of the study was to investigate the connections between voice ergonomic risk factors found in classrooms and voice-related problems in teachers. Voice ergonomic assessment was performed in 39 classrooms in 14 elementary schools by means of a Voice Ergonomic Assessment in Work Environment--Handbook and Checklist. The voice ergonomic risk factors assessed included working culture, noise, indoor air quality, working posture, stress, and access to a sound amplifier. Teachers from the above-mentioned classrooms reported their voice symptoms, respiratory tract diseases, and completed a Voice Handicap Index (VHI). The more voice ergonomic risk factors found in the classroom the higher were the teachers' total scores on voice symptoms and VHI. Stress was the factor that correlated most strongly with voice symptoms. Poor indoor air quality increased the occurrence of laryngitis. Voice ergonomics were poor in the classrooms studied and voice ergonomic risk factors affected the voice. It is important to convey information on voice ergonomics to education administrators and those responsible for school planning and taking care of school buildings. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Huang, Shuping; Ran, Feng; Ji, Yuan; Chen, Wendong
Silicon-based OLED (Organic Light Emitting Display) microdisplay technology is beginning to attract attention in emerging VR and AR devices. A high display frame refresh rate is an important way to alleviate dizziness in VR applications. Traditional display circuit drivers use the analog method or the digital PWM method, both of which follow a serial scan order from the first pixel to the last pixel by using shift registers. This paper proposes a novel atomized scan strategy based on the digital fractal scan strategy using a pseudo-random scan order. It can be used to realize a high frame refresh rate with a moderate pixel clock frequency in a high-definition OLED microdisplay. The linearity of the gray level is also improved compared with the Z fractal scan strategy.
The Quadratic Discrimination Function (QDF) is commonly used in speech emotion recognition and proceeds on the premise that the input data are normally distributed. In this paper, we propose a transformation to normalize the emotional features, then derive a Modified QDF (MQDF) for speech emotion recognition. Features based on prosody and voice quality are extracted, and a Principal Component Analysis Neural Network (PCANN) is used to reduce the dimension of the feature vectors. The results show that voice quality features are an effective supplement for recognition and that the method in this paper improves the recognition rate effectively.
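For reference, the standard textbook form of the quadratic discriminant under the normality premise is sketched below. This is the generic QDF, not the modified MQDF proposed in the abstract, whose exact form is not given here. The symbols are assumptions introduced for illustration: $\mathbf{x}$ is a feature vector, and $\boldsymbol{\mu}_i$, $\boldsymbol{\Sigma}_i$ and $P(\omega_i)$ are the mean, covariance matrix and prior probability of emotion class $\omega_i$.

```latex
% Discriminant for class \omega_i, assuming x | \omega_i ~ N(\mu_i, \Sigma_i);
% terms that are identical for every class have been dropped.
g_i(\mathbf{x}) = -\tfrac{1}{2}\,(\mathbf{x}-\boldsymbol{\mu}_i)^{\mathsf T}
                  \boldsymbol{\Sigma}_i^{-1}(\mathbf{x}-\boldsymbol{\mu}_i)
                  \;-\; \tfrac{1}{2}\ln\lvert\boldsymbol{\Sigma}_i\rvert
                  \;+\; \ln P(\omega_i)

% Decision rule: assign x to the class with the largest discriminant.
\hat{\omega}(\mathbf{x}) = \arg\max_i \; g_i(\mathbf{x})
```

Because $g_i$ is derived from the Gaussian log-density, features that depart strongly from normality can degrade the classifier, which is the motivation the abstract gives for normalizing the emotional features first.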
Presently, lawyers, law enforcement agencies, and judges in courts use speech and other biometric features to recognize suspects. In general, speaker recognition is used for discriminating people based on their voices. The process of determining whether a suspected speaker is the source of a trace is called forensic speaker recognition. In such applications, the voice samples are most probably noisy, the recording sessions might mismatch each other, the sessions might not contain sufficient recording for recognition purposes, and the suspect voices are recorded through a mobile channel. The identification of a person through his voice within a forensic quality context is challenging. In this paper, we propose a method for forensic speaker recognition for the Arabic language; the King Saud University Arabic Speech Database is used for obtaining experimental results. The advantage of this database is that each speaker’s voice is recorded in both clean and noisy environments, through a microphone and a mobile channel. This diversity facilitates its usage in forensic experimentation. Mel-Frequency Cepstral Coefficients are used for feature extraction and the Gaussian mixture model-universal background model is used for speaker modeling. Our approach has shown low equal error rates (EER) within noisy environments and with very short test samples.
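The equal error rate quoted above is the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how an EER can be estimated from verification scores. It is not the authors' evaluation code: the function name and the toy scores are hypothetical, and a higher score is assumed to mean a better match.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from two lists of match scores.

    genuine  -- scores from trials where the speaker is the claimed one
    impostor -- scores from trials where the speaker is someone else
    Returns (eer, threshold) at the threshold where the false acceptance
    rate (FAR) and false rejection rate (FRR) are closest.
    """
    best = None
    # Every observed score is a candidate decision threshold.
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # impostors accepted
        frr = sum(s < t for s in genuine) / len(genuine)     # genuine rejected
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]


# Toy, made-up scores for illustration only.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(genuine_scores, impostor_scores))  # → (0.25, 0.6)
```

With finite score lists the two error curves rarely cross exactly, so the sketch reports the midpoint of FAR and FRR at the closest threshold. Evaluation toolkits often interpolate between thresholds instead.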
Ren, Yilong; Duan, Xitong; Wu, Lei; He, Jin; Xu, Wu
With the development of the “VR+” era, the traditional virtual assembly system for power equipment can no longer satisfy growing needs. Based on an analysis of the traditional virtual assembly system for electric power equipment and of the application of VR technology to such systems in our country, this paper puts forward a scheme for establishing a virtual assembly system for power equipment. First, the information on the power equipment is obtained; then OpenGL and multi-texture technology are used to build a 3D solid graphics library. After three-dimensional modeling is complete, the three-dimensional solid graphics generation program is packaged as a dynamic link library (DLL) to modularize the power equipment model library and to hide the algorithm that generates it. Once the 3D power equipment model database has been established, the virtual assembly system for 3D power equipment is set up so that the assembly operation is separated from the physical space. At the same time, to address the deficiencies of the traditional gesture recognition algorithm, we propose a data-glove gesture recognition algorithm based on a BP neural network optimized by an improved PSO algorithm. Finally, the virtual assembly system for power equipment can achieve true multi-channel interaction.
Ta-Ko Huang; Chi-Hsun Yang; Yu-Hsin Hsieh; Jen-Chyan Wang; Chun-Cheng Hung
The OSCE is a reliable method for evaluating the preclinical performance of dental students. Ideally, an OSCE assessment would use an augmented reality simulator. This literature review surveys recent developments in virtual reality (VR) and augmented reality (AR), from the history of their use in dentistry to progress in dental skills. Because the technology is still lacking, it needs to depend on other devices to increase the success rate and decrease the risk of th...
Richir, Simon; Fuchs, Philippe; Lourdeaux, Domitile; Buche, Cédric; Querrec, Ronan
The convergence of technologies currently observed in the fields of VR, AR, robotics and consumer electronics reinforces the trend of new applications appearing every day. But when transferring knowledge acquired from research to businesses, research laboratories are often at a loss because they lack knowledge of the design and integration processes involved in creating an industrial-scale product. In fact, the innovation approaches that take a good idea from the laboratory to a successful industrial product are often little known to researchers. The objective of this paper is to present the results of the work of several research teams that have finalized a working method for researchers and manufacturers that allows them to design virtual or augmented reality systems and enables their users to enjoy "a compelling VR experience". That approach, called "the I2I method", presents 11 phases from "Establishing technological and competitive intelligence and industrial property" to "Improvements" through the "Definition of the Behavioral Interface, Virtual Environment and Behavioral Software Assistance". As a result of the experience gained by various research teams, this design approach benefits from contributions from current VR and AR research. Our objective is to validate and continuously move such multidisciplinary design team methods forward.
Cuquel, A-C; Dorandeu, F; Ceppa, F; Renard, C; Burnat, P
A product of the arms race during the Cold War, the Russian VX, or VR, is an organophosphorus compound that is a structural isomer of the western VX compound (or A4), with which it shares a very high toxicity. It is much less studied and known than VX because knowledge of its existence is relatively recent. Its very low volatility and high resistance in the environment make it a persistent agent. Poisoning occurs mainly following penetration through skin and mucosa, but vapour inhalation is a credible risk in some circumstances. The clinical presentation may be delayed by several hours, and despite the absence of signs and symptoms, the casualty should not be considered free of contamination or intoxication. This agent has a long residence time in blood, a characteristic that clearly differentiates it from other compounds such as sarin. The protocols for antidote administration may thus have to be changed accordingly. The facts that VR-poisoned individuals respond less well to 2-PAM, the oxime therapy currently used in France, and that VR represents a higher threat than VX, probably being possessed by some proliferating states, justify the interest in this toxic product. Copyright © 2014 Elsevier Masson SAS. All rights reserved.
Ainsworth, Richard A.
Virtual reality systems are an excellent environment for stereo panorama displays. The acquisition and display methods described here combine high-resolution photography with surround vision and full stereo view in an immersive environment. This combination provides photographic stereo-panoramas for a variety of VR displays, including the StarCAVE, NexCAVE, and CORNEA. The zero parallax point used in conventional panorama photography is also the center of horizontal and vertical rotation when creating photographs for stereo panoramas. The two photographically created images are displayed on a cylinder or a sphere. The radius from the viewer to the image is set at approximately 20 feet, or at the object of major interest. A full stereo view is presented in all directions. The interocular distance, as seen from the viewer's perspective, displaces the two spherical images horizontally. This presents correct stereo separation in whatever direction the viewer is looking, even up and down. Objects at infinity will move with the viewer, contributing to an immersive experience. Stereo panoramas created with this acquisition and display technique can be applied without modification to a large array of VR devices having different screen arrangements and different VR libraries.
Barsoum, Emad; Kuester, Falko
The pervasive nature of web-based content has led to the development of applications and user interfaces that port between a broad range of operating systems and databases, while providing intuitive access to static and time-varying information. However, the integration of this vast resource into virtual environments has remained elusive. In this paper we present an implementation of a 3D Web Browser (WebVR) that enables the user to search the internet for arbitrary information and to seamlessly augment this information into virtual environments. WebVR provides access to the standard data input and query mechanisms offered by conventional web browsers, with the difference that it generates active texture-skins of the web contents that can be mapped onto arbitrary surfaces within the environment. Once mapped, the corresponding texture functions as a fully integrated web-browser that will respond to traditional events such as the selection of links or text input. As a result, any surface within the environment can be turned into a web-enabled resource that provides access to user-definable data. In order to benefit from the continuous advancement of browser technology and to support both static and streamed content, WebVR uses ActiveX controls to extract the desired texture skin from industry-strength browsers, providing a unique mechanism for data fusion and extensibility.
Suh, Kune Y.; Ryu, Joong W.; Kang, Myung G.; Kim, H. Y.; Cho, J. G.; Kim, D. H.; Park, J. W.
Currently, there are many ongoing efforts to shorten plant refueling and maintenance outage durations, and these are expected to intensify over time. An improved training and education system is required for personnel to perform efficient, timely inspections. This work is focused on establishing a virtual Nuclear Power Plant system that will help train personnel to understand the system characteristics of the plant by creating navigation-enabled 3D plant mockups. Furthermore, this project is aimed at constructing an information management system covering the whole plant area, by integrating safety-related data and combining it with web-based GUI technology, to make search and management activities easy. This project spans three years. The first year was spent on 3D mockup modeling of most of the plant and on prototyping the web-based VR plant digital information system. The plant environment, buildings, reactor structure, steam generator, pressurizer, fuel assemblies, pressurizer safety valve, main steamline safety valve, reactor coolant system, main steamline system, and auxiliary and main coolant supply system were modeled as 3D mockups. Control functions such as magnification, rotation, movement, transparency, location detection, cross-cut view, full screen toggle and screen capture were implemented to facilitate manipulation of and navigation through the VR mockups. It is expected that the VR plant will serve as an effective support system for power plant regulation and inspection
Balogh, Tibor; Kara, Peter A.
The evolution of 3D technologies shows a cyclical learning curve with a series of hypes and dead ends, with mistakes and consequences. 3D images contain significantly more information than the corresponding 2D ones, so 3D display systems should be built on more pixels or higher-speed components. For true 3D, this factor is on the order of 100x, which is a real technological challenge. If it is not met, the capabilities of 3D systems are compromised: headgear is needed, viewers must be positioned or tracked, devices are single-user, parallax is lacking, cues are missing, and so on. The temptation is always there: why provide all the information rather than just what the person absorbs at that moment (subjective or objective visualization)? Virtual Reality (VR) glasses have been around for more than two decades. With the latest technical improvements, VR became the next hype. 3D immersion was added as a new phenomenon; however, VR represents an isolated experience, and still requires headgear and a controlled environment. Augmented Reality (AR) in this sense is different. Will the VR/AR hype with headgear be a dead end? While VR headsets may sell better than smart glasses or 3D TV glasses, consider also that using the technology may require a set of behavioral changes that the majority of people do not want to make. Displays and technologies that restrict viewers or cause any discomfort will not be accepted in the long term. The next wave of 3D is forecast for 2018-2020, answering the need for an unaided, limitation-free 3D experience. Light Field (LF) systems represent the next generation in 3D. The HoloVizio system, having a capacity on the order of 100x, offers a natural, restriction-free 3D experience over a full field of view, enabling collaborative use for an unlimited number of viewers, even in a wider, immersive space. As a scalable technology, the display range goes from monitor-style units, through automotive 3D HUDs and screen-less solutions, up to cinema systems
Li, Dongdong; Yang, Yingchun; Dai, Weihui
In the field of information security, voice is one of the most important parts of biometrics. In particular, with the development of voice communication through the Internet or telephone systems, huge voice data resources are accessed. In speaker recognition, a voiceprint can be applied as the unique password for the user to prove his/her identity. However, speech with various emotions can cause an unacceptably high error rate and degrade the performance of a speaker recognition system. This paper deals with this problem by introducing a cost-sensitive learning technology to reweight the probability of test affective utterances at the pitch envelope level, which can effectively enhance the robustness of emotion-dependent speaker recognition. Based on that technology, a new architecture of the recognition system as well as its components is proposed in this paper. The experiment conducted on the Mandarin Affective Speech Corpus shows that an 8% improvement in identification rate over traditional speaker recognition is achieved.
Mueller, Peter B.; Larson, George W.
Eighty-three voice disorder therapists' ratings of statements regarding voice therapy practices indicated that vocal nodules are the most frequent disorder treated; vocal abuse and hard glottal attack elimination, counseling, and relaxation were preferred treatment approaches; and voice therapy is more effective with adults than with children.…
Smartphone App for Voice Disorders (Fall 2013) ... developed a mobile monitoring device that relies on smartphone technology to gather a week's worth of talking, ...
Effects of Medications on Voice ... replacement therapy post-menopause may have a variable effect. An inadequate level of thyroid replacement medication in ...
Hearing Voices and Seeing Things, Facts for Families No. 102; Updated October ... delusions (a fixed, false, and often bizarre belief). Hearing voices or seeing things that are not there ...
Voice biometrics has a long history in biosecurity applications such as verification and identification based on characteristics of the human voice. Another application, voice classification, plays an important role in grouping unlabelled voice samples but has not been widely studied. Lately voice classification has been found useful in phone monitoring, classifying speakers’ gender, ethnicity and emotion states, and so forth. In this paper, a collection of computational algorithms is proposed to support voice classification; the algorithms are a combination of hierarchical clustering, dynamic time warp transform, discrete wavelet transform, and decision tree. The proposed algorithms are relatively more transparent and interpretable than the existing ones, though many techniques such as Artificial Neural Networks, Support Vector Machine, and Hidden Markov Model (which inherently function like a black box) have been applied for voice verification and voice identification. Two datasets, one generated synthetically and the other collected empirically from a past voice recognition experiment, are used to verify and demonstrate the effectiveness of our proposed voice classification algorithm.
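The abstract above does not spell out how its "dynamic time" component works. If it refers to standard dynamic time warping (an assumption, not something the abstract confirms), the core idea can be sketched as follows: two feature sequences of different lengths are aligned by a dynamic program that allows samples to be stretched or repeated. The function name `dtw_distance` and the toy sequences are illustrative only and are not taken from the paper.

```python
from math import inf

def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D sequences.

    Illustrative sketch only: uses absolute difference as the local cost
    and no window constraint. Not the cited authors' implementation.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] aligned to an already-used b sample
                cost[i][j - 1],      # b[j-1] aligned to an already-used a sample
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]

# Hypothetical pitch-like contours: the second repeats one sample.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
offset = dtw_distance([0, 0], [1, 1])
```

Because the second contour only repeats a value, warping absorbs the difference and the distance is zero, whereas a constant offset cannot be warped away. A distance of this kind could feed the hierarchical clustering step the abstract mentions, although how the authors actually combine the components is not stated.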
Vinson E Claire
Background: In molecular applications, virtual reality (VR) and immersive virtual environments have generally been used and valued for the visual and interactive experience – to enhance intuition and communicate excitement – rather than as part of the actual research process. In contrast, this work develops a software infrastructure for research use and illustrates such use on a specific case. Methods: The Syzygy open-source toolkit for VR software was used to write the KinImmerse program, which translates the molecular capabilities of the kinemage graphics format into software for display and manipulation in the DiVE (Duke immersive Virtual Environment) or other VR system. KinImmerse is supported by the flexible display construction and editing features in the KiNG kinemage viewer and it implements new forms of user interaction in the DiVE. Results: In addition to molecular visualizations and navigation, KinImmerse provides a set of research tools for manipulation, identification, co-centering of multiple models, free-form 3D annotation, and output of results. The molecular research test case analyzes the local neighborhood around an individual atom within an ensemble of nuclear magnetic resonance (NMR) models, enabling immersive visual comparison of the local conformation with the local NMR experimental data, including target curves for residual dipolar couplings (RDCs). Conclusion: The promise of KinImmerse for production-level molecular research in the DiVE is shown by the locally co-centered RDC visualization developed there, which gave new insights now being pursued in wider data analysis.
Grolman, Wilko; Eerenstein, Simone E. J.; Tan, Frédérique M. L.; Tange, Rinze A.; Schouwenburg, Paul F.
BACKGROUND: In laryngectomized patients, tracheoesophageal voice generally provides a better voice quality than esophageal voice. Understanding the aerodynamics of voice production in patients with a voice prosthesis is important for optimizing prosthetic designs and successful voice rehabilitation.
Niebudek-Bogusz, Ewa; Kuzańska, Anna; Woźnicka, Ewelina; Sliwińska-Kowalska, Mariola
The aim of this study was to assess the application of the Voice Handicap Index (VHI) in the diagnosis of occupational voice disorders in female teachers. The subjective assessment of voice by VHI was performed in fifty subjects with dysphonia diagnosed in laryngovideostroboscopic examination. The control group comprised 30 women whose jobs did not involve vocal effort. The total VHI score and the scores on each of its subscales (functional, emotional and physical) were significantly worse in the study group than in controls (p ... teachers estimated their own voice problems as a moderate disability, while 12% of them reported severe voice disability. However, all non-teachers assessed their voice problems as slight; their results fell at the lowest level of the VHI score. This study confirmed that the VHI, as a tool for self-assessment of voice, can make a significant contribution to the diagnosis of occupational dysphonia.
Joassin et al. (Neuroscience Letters, 2004, 369, 132-137) observed that the recognition of face-voice associations led to an interference effect, i.e. to decreased performance relative to the recognition of faces presented in isolation. In the present experiment, we tested the hypothesis that this interference effect could be due to the fact that voices were more difficult to recognize than faces. For this purpose, we modified some faces by morphing to make them as difficult to recognize as the voices. Twenty-one healthy volunteers performed a recognition task of previously learned face-voice associations in 5 conditions: voices (A), natural faces (V), morphed faces (V30), voice-natural face associations (AV) and voice-morphed face associations (AV30). As expected, AV led to interference, as it was performed less well and more slowly than V. However, when faces were as difficult to recognize as voices, their simultaneous presentation produced a clear facilitation, AV30 being performed significantly better and faster than A and V30. These results demonstrate that matching or not the perceptual complexity of the unimodal stimuli modulates the potential cross-modal gains of the bimodal situations.
Listen to the voice of a young girl Lonnie, who was diagnosed with Type 1 diabetes at 16. Imagine that she is deeply involved in the social security system. She lives with her mother and two siblings in a working class part of a small town. She is at a special school for problematic youth, and her...
Klitmøller, Anders; Rask, Morten; Jensen, Nevena
Aiming to explore how user driven innovation can inform high level design strategies, an in-depth empirical study was carried out, based on data from 50 observations of private vehicle users. This paper reports the resulting 5 consumer voices: Technology Enthusiast, Environmentalist, Design Lover...
Noraida Abdullah Karim
In May 2007 the Women’s Commission for Refugee Women and Children presented its annual Voices of Courage awards to three displaced people who have dedicated their lives to promoting economic opportunities for refugee and displaced women and youth. These are their (edited) testimonies.
Ko, Sei Jin
Given that the voice is our main form of communication, we know surprisingly little about how it impacts judgment and behavior. Furthermore, the modern advancement in telecommunication systems, such as cellular phones, has meant that a large proportion of our everyday interactions are conducted
A wide-ranging collection of essays centred on readings of the body in contemporary literary and socio-anthropological discourse, from slavery and rape to female genital mutilation, from clothing, ocular pornography, voice, deformation and transmutation to the imprisoned, dismembered, remembered...
Latinus, Marianne; Belin, Pascal
We are all voice experts. First and foremost, we can produce and understand speech, and this makes us a unique species. But in addition to speech perception, we routinely extract from voices a wealth of socially-relevant information in what constitutes a more primitive, and probably more universal, non-linguistic mode of communication. Consider the following example: you are sitting in a plane, and you can hear a conversation in a foreign language in the row behind you. You do not see the speakers' faces, and you cannot understand the speech content because you do not know the language. Yet, an amazing amount of information is available to you. You can evaluate the physical characteristics of the different protagonists, including their gender, approximate age and size, and associate an identity with the different voices. You can form a good idea of the different speakers' mood and affective state, as well as more subtle cues such as the perceived attractiveness or dominance of the protagonists. In brief, you can form a fairly detailed picture of the type of social interaction unfolding, which a brief glance backwards can on occasion help refine - sometimes surprisingly so. What are the acoustical cues that carry these different types of vocal information? How does our brain process and analyse this information? Here we briefly review an emerging field and the main tools used in voice perception research. Copyright © 2011 Elsevier Ltd. All rights reserved.
This book will give beginners an introduction to building voice-based applications on Android. It will begin by covering the basic concepts and will build up to creating a voice-based personal assistant. By the end of this book, you should be in a position to create your own voice-based applications on Android from scratch in next to no time. Voice Application Development for Android is for all those who are interested in speech technology and for those who, as owners of Android devices, are keen to experiment with developing voice apps for their devices. It will also be useful as a starting po
Gugenheimer, Jan; Wolf, Dennis; Eiríksson, Eyþór Rúnar
We present GyroVR, head-worn flywheels designed to render inertia in Virtual Reality (VR). Motions such as flying, diving or floating in outer space generate kinesthetic forces on our body which impede movement and are currently not represented in VR. We simulate those kinesthetic forces by attaching flywheels to the user's head, leveraging the gyroscopic effect of resistance when changing the spinning axis of rotation. GyroVR is an ungrounded, wireless and self-contained device allowing the user to move freely inside the virtual environment. Its generic shape allows it to be attached to different positions on the user's body. We evaluated the impact of GyroVR at different mounting positions on the head (back and front) in terms of immersion, enjoyment and simulator sickness. Our results show that attaching GyroVR to the user's head (front of the Head Mounted Display (HMD)) resulted in the highest...
Van Gysel, W D; Vercammen, J; Debruyne, F
If people are asked to visually discriminate between the two individuals of a monozygotic twin (MT) pair, they usually have difficulty. Does this problem also exist when listening to twin voices? Twenty female and 10 male MT voices were randomly assembled with one "strange" voice to get voice trios. The listeners (10 female students in Speech and Language Pathology) were asked to label the twins (voices 1-2, 1-3 or 2-3) in two conditions: two standard sentences read aloud and a 2.5-second midsection of a sustained /a/. The proportion of correctly labelled twins was 82% and 63% for female voices and 74% and 52% for male voices, for the sentences and the sustained /a/ respectively, all being significantly greater than chance (33%). The acoustic analysis revealed a high intra-twin correlation for the speaking fundamental frequency (SFF) of the sentences and the fundamental frequency (F0) of the sustained /a/. So the voice pitch could have been a useful characteristic in the perceptual identification of the twins. We conclude that there is a greater perceptual resemblance between the voices of identical twins than between voices without genetic relationship. The identification, however, is not perfect. The voice pitch possibly contributes to the correct twin identifications.
Van Gysel, C.; Velikovich, L.; McGraw, I.; Beaufays, F.
User interactions with mobile devices increasingly depend on voice as a primary input modality. Due to the disadvantages of sending audio across potentially spotty network connections for speech recognition, in recent years there has been growing attention to performing recognition on-device. The
34. "Speech Recognition by Computer," Scientific American. New York: Scientific American, April 1981: 64-76. 16. Marcus, Mitchell P. A Theo of Syntactic...prob)...) Possible words for voice decoder to choose from are: gents dishes issues itches ewes folks foes comunications units eunichs error * farce
Kaimaris, D.; Stylianidis, E.; Karanikolas, N.
Virtual reality (VR) is extensively used in various applications in industry, academia, and business, and is becoming more and more affordable for end users. At the same time, more and more applications are being developed in academia and higher education, for example in medicine and engineering, and students want to be well prepared for their professional life after their educational life cycle. Moreover, VR offers the possibility to improve skills and also to understand space. This paper presents the methodology used during a course, namely "Geoinformatics applications" at the School of Spatial Planning and Development (Eng.), Aristotle University of Thessaloniki, to create a virtual School space. The course design focuses on the methods and techniques to be used in order to develop the virtual environment. In addition, the project aspires to become more and more effective for the students and to provide a real virtual environment with useful information not only for the students but also for any citizen interested in academic life at the School.
Kropik, M.; Jurickova, M.
The contribution describes the new measuring and protection system of the VR-1 training reactor. The measuring and protection system upgrade is an integral part of the reactor I&C upgrade. The new measuring and protection system of the VR-1 reactor consists of the operational power measuring system and the independent power protection system. Both systems measure the reactor power and power rate, initiate safety action if safety limits are exceeded and send data (power, power rate, status, etc.) to the reactor control system. The operational power measuring system is a full power range system that receives a signal from a fission chamber. The signal is evaluated according to the reactor power either in the pulse or current mode. The current mode utilizes the DC current and Campbell techniques. The new independent power protection system operates in the two highest reactor power decades. It receives signals from a boron chamber and evaluates them in the pulse mode. Both systems are computer based. The operational power measuring and independent power protection systems are diverse: different types and locations of chambers, completely different hardware, software algorithms for the power and power rate calculations, software development tools and teams for the software manufacturing. (author)
The VR-1 training reactor is a light water reactor of the pool type using enriched uranium as the fuel. The moderator is demineralized light water, which also serves as the neutron reflector, biological shielding, and coolant. Heat evolved during the fission process is removed by natural convection. The reactor is used in the education of students in the field of reactor and neutron physics, dosimetry, nuclear safety, and instrumentation and control systems for nuclear facilities. Although primarily intended for students in various branches of technology (power engineering, nuclear engineering, physical engineering), this specialized facility is also used by students of faculties educating future natural scientists and teachers. Typical tasks trained at the VR-1 reactor include: measurement of delayed neutrons; examination of the effect of various materials on the reactivity of the reactor; measurement of the neutron flux density by various procedures; measurement of reactivity by various procedures; calibration of reactor control rods by various procedures; approaching the critical state; investigation of nuclear reactor dynamics; start-up, control and operation of a nuclear reactor; and investigation of the effect of a simulated nucleate boil on reactivity. In addition to the education of university-level students, training courses are also organized for specialists in the Czech nuclear programme
Matejka, K.; Kolros, A.; Polach, S.; Sklenka, L.
The first 3 years of operation of the VR-1 training reactor are reviewed. This period includes its physical start-up (preparation, implementation, results) and operation development as far as the current operating configuration of the reactor core. The physical start-up was commenced using a reactor core referred to as AZ A1, whose physical parameters had been verified by calculation and whose configuration was based on data tested experimentally on the SR-0 reactor at Vochov. The next operating core, labelled AZ A2, was already prepared during the test operation of the VR-1 reactor. Its configuration was such that both of the main horizontal channels, radial and tangential, could be employed. The configuration that followed, AZ A3, was an intermediate step before testing the graphite side reflector. The current reactor core, labelled AZ A3 G, was obtained by supplementing the previous core with a one-sided graphite side reflector. (Z.S.). 2 tabs., 11 figs., 2 refs
Schulze, Jürgen P.; Acevedo-Feliz, Daniel; Mangan, John; Prudhomme, Andrew; Nguyen, Phi Khanh; Weber, Philip P.
We present a new approach for how multiple users' views can be rendered in a surround virtual environment without using special multi-view hardware. It is based on the idea that different parts of the screen are often viewed by different users, so that they can be rendered from their own view point, or at least from a point closer to their view point than traditionally expected. The vast majority of 3D virtual reality systems are designed for one head-tracked user, and a number of passive viewers. Only the head tracked user gets to see the correct view of the scene, everybody else sees a distorted image. We reduce this problem by algorithmically democratizing the rendering view point among all tracked users. Researchers have proposed solutions for multiple tracked users, but most of them require major changes to the display hardware of the VR system, such as additional projectors or custom VR glasses. Our approach does not require additional hardware, except the ability to track each participating user. We propose three versions of our multi-viewer algorithm. Each of them balances image distortion and frame rate in different ways, making them more or less suitable for certain application scenarios. Our most sophisticated algorithm renders each pixel from its own, optimized camera perspective, which depends on all tracked users' head positions and orientations. © 2012 IEEE.
Pattern recognition is a scientific discipline that is becoming increasingly important in the age of automation and information handling and retrieval. Pattern Recognition, 2e covers the entire spectrum of pattern recognition applications, from image analysis to speech recognition and communications. This book presents cutting-edge material on neural networks (a set of linked microprocessors that can form associations and uses pattern recognition to "learn") and enhances student motivation by approaching pattern recognition from the designer's point of view. A direct result of more than 10
Knittle, C.D.; Malone, K.T.
This paper reports that transmitting digital voice via packetized mobile communications systems that employ relatively short packet lengths and narrow bandwidths often necessitates very low bit rate coding of the voice data. Sandia National Laboratories is currently developing an efficient voice coding system operating at 800 bits per second (bps). The coding scheme is a modified version of the 2400 bps NSA LPC-10e standard. The most significant modification to the LPC-10e scheme is the vector quantization of the line spectrum frequencies associated with the synthesis filters. An outline of a hardware implementation for the 800 bps coder is presented. The speech quality of the coder is generally good, although speaker recognition is not possible. Further research is being conducted to reduce the memory requirements and complexity of the vector quantizer, and to increase the quality of the reconstructed speech. This work may be of use in dealing with nuclear materials
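To make the bit-rate argument in the abstract above concrete, the sketch below shows generic vector quantization: a whole vector of line spectrum frequencies is replaced by the index of its nearest codebook entry, so only a few bits per frame are transmitted. The codebook values, the function names, and the assumption that the 800 bps coder keeps the 22.5 ms frame of standard LPC-10 are all hypothetical; the abstract does not give the coder's frame length or codebook design.

```python
import math

def quantize(vector, codebook):
    """Return the index of the codeword nearest to `vector` (squared Euclidean)."""
    best_index, best_dist = 0, math.inf
    for index, codeword in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index

def index_bits(codebook):
    """Bits needed to transmit one codebook index."""
    return (len(codebook) - 1).bit_length()

def bits_per_frame(bit_rate_bps, frame_ms):
    """Bit budget available for one frame at a given bit rate."""
    return bit_rate_bps * frame_ms / 1000

# Hypothetical 4-entry codebook of 3-dimensional LSF vectors (Hz, made up).
CODEBOOK = [
    [250, 500, 1000],
    [300, 600, 1200],
    [350, 800, 1500],
    [400, 900, 1800],
]

chosen = quantize([310, 620, 1150], CODEBOOK)  # nearest entry is index 1
lpc10_budget = bits_per_frame(2400, 22.5)      # standard LPC-10 frame: 54 bits
low_rate_budget = bits_per_frame(800, 22.5)    # 18 bits, only if the frame length is unchanged
```

With a 22.5 ms frame, 2400 bps leaves 54 bits per frame while 800 bps would leave 18, which is why coding the spectral parameters jointly rather than one scalar at a time becomes attractive. A realistic codebook would be far larger and trained on speech data, which is the source of the memory and complexity concerns the abstract mentions.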
Sierra-Sosa, D.; Bastidas, M.; Ortiz P, D.; Quintero, O.L.
We propose a novel analysis alternative for emotion recognition from speech, based on two Fourier transforms. Fourier analysis allows different signals to be displayed and synthesized in terms of power spectral density distributions. A spectrogram of the voice signal is obtained by performing a short-time Fourier transform with Gaussian windows; this spectrogram portrays frequency-related features, such as vocal tract resonances and quasi-periodic excitations during voiced sounds. Emotions induce such characteristics in speech, which become apparent in spectrogram time-frequency distributions. The time-frequency representation given by the spectrogram is then treated as an image and processed through a 2-dimensional Fourier transform in order to perform a spatial Fourier analysis of it. Finally, features related to emotions in voiced speech are extracted and presented. (paper)
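The two-stage pipeline described in the abstract above (a Gaussian-windowed short-time Fourier transform followed by a 2-D Fourier transform of the resulting spectrogram) can be sketched with NumPy as follows. The frame length, hop size, window width, and the synthetic 500 Hz test tone are assumptions chosen for illustration; the abstract does not state the authors' parameters or which features they extract from the 2-D spectrum.

```python
import numpy as np

def gaussian_window(length, sigma):
    """Gaussian window centred on the frame; sigma is in samples."""
    n = np.arange(length) - (length - 1) / 2.0
    return np.exp(-0.5 * (n / sigma) ** 2)

def spectrogram(signal, frame_len=256, hop=128, sigma=32.0):
    """Power spectrogram from a short-time Fourier transform with Gaussian windows."""
    window = gaussian_window(frame_len, sigma)
    starts = range(0, len(signal) - frame_len + 1, hop)
    frames = np.stack([signal[s:s + frame_len] * window for s in starts])
    # One row per time frame, one column per frequency bin
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def spectrogram_2d_spectrum(spec):
    """Treat the spectrogram as an image and take its 2-D Fourier magnitude."""
    return np.abs(np.fft.fftshift(np.fft.fft2(spec)))

# Synthetic stand-in for a voiced sound: 1 s of a 500 Hz tone sampled at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 500 * t)

spec = spectrogram(tone)                  # shape: (time frames, frequency bins)
features = spectrogram_2d_spectrum(spec)  # same shape, spatial-frequency domain
peak_bin = int(spec.mean(axis=0).argmax())
```

For this steady tone the energy concentrates in the frequency bin at 500 Hz (bin 16 with a 31.25 Hz bin spacing). Real speech would show time-varying structure, and it is that variation across both axes that the second, 2-D transform summarises.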
Forensic Speaker Recognition: Law Enforcement and Counter-Terrorism is an anthology of the research findings of 35 speaker recognition experts from around the world. The volume provides a multidimensional view of the complex science involved in determining whether a suspect’s voice truly matches forensic speech samples, collected by law enforcement and counter-terrorism agencies, that are associated with the commission of a terrorist act or other crimes. While addressing such topics as the challenges of forensic case work, handling speech signal degradation, analyzing features of speaker recognition to optimize voice verification system performance, and designing voice applications that meet the practical needs of law enforcement and counter-terrorism agencies, this material all sounds a common theme: how the rigors of forensic utility are demanding new levels of excellence in all aspects of speaker recognition. The contributors are among the most eminent scientists in speech engineering and signal process...
This paper explores the experiences of women who ‘hear voices’ (auditory verbal hallucinations). We begin by examining historical understandings of women hearing voices, showing these have been driven by androcentric theories of how women’s bodies functioned, leading to women being viewed as requiring their voices be interpreted by men. We show the twentieth-century was associated with recognition that the mental violation of women’s minds (represented by some voice-hearing) was often a consequence of the physical violation of women’s bodies. We next report the results of a qualitative study into voice-hearing women’s experiences (N=8). This found similarities between women’s relationships with their voices and their relationships with others and the wider social context. Finally, we present results from a quantitative study comparing voice-hearing in women (n=65) and men (n=132) in a psychiatric setting. Women were more likely than men to have certain forms of voice-hearing (voices conversing) and to have antecedent events of trauma, physical illness, and relationship problems. Voices identified as female may have more positive affect than male voices. We conclude that women voice-hearers have and continue to face specific challenges necessitating research and activism, and hope this paper will act as a stimulus to such work.
Kooijman, P.G.C.; Jong, F.I.C.R.S. de; Thomas, G.; Huinck, W.J.; Donders, A.R.T.; Graamans, K.; Schutte, H.K.
In order to identify factors that are associated with voice problems and voice-related absenteeism in teachers, 1,878 questionnaires were analysed. The questionnaires inquired about personal data, voice complaints, voice-related absenteeism from work and conditions that may lead to voice complaints
This article talks about voice actors and features Tony Oliver, a professional voice actor. Voice actors help to bring one's favorite cartoon and video game characters to life. They also do voice-overs for radio and television commercials and movie trailers. These actors use the sound of their voice to sell a character's emotions--or an advertised…
of speech technology development, similar approaches are likely to be applicable in both circumstances. However, within these broad approaches there are details which are specific to certain languages (or language families) that may require solutions... to the modeling of pitch were therefore required. Similarly, it is possible that novel solutions will be required to deal with the click sounds that occur in some Southern Bantu languages, or the voicing... Copyright 2010 ISCA, 26-30 September 2010, Makuhari...
Teichgraeber, U.K.M.; Ehrenstein, T.; Lemke, M.; Liebig, T.; Stobbe, H.; Hosten, N.; Keske, U.; Felix, R.
Purpose: A study was performed to compare the performance of automatic speech recognition (ASR) with conventional transcription. Materials and Methods: 100 CT reports were generated by using ASR and 100 CT reports were dictated and written by medical transcriptionists. The time for dictation and correction of errors by the radiologist was assessed and the type of mistakes was analysed. The text recognition rate was calculated in both groups and the average time between completion of the imaging study by the technologist and generation of the written report was assessed. A commercially available speech recognition technology (ASKA Software, IBM Via Voice) running on a personal computer was used. Results: The time for the dictation using digital voice recognition was 9.4±2.3 min compared to 4.5±3.6 min with an ordinary Dictaphone. The text recognition rate was 97% with digital voice recognition and 99% with medical transcriptionists. The average time from imaging completion to written report finalisation was reduced from 47.3 hours with medical transcriptionists to 12.7 hours with ASR. The analysis of misspellings demonstrated (ASR vs. medical transcriptionists): 3 vs. 4 for syntax errors, 0 vs. 37 orthographic mistakes, 16 vs. 22 mistakes in substance and 47 vs. erroneously applied terms. Conclusions: The use of digital voice recognition as a replacement for medical transcription is recommendable when an immediate availability of written reports is necessary. (orig.)
Noer, Vibeke Røn; Nielsen, Cathrine Sand
. The education lasts for 3.5 years and the landmark of the educational model is the continuous shift between teaching in the classroom and teaching in clinical practice. Clinical teaching takes place at approved clinical placement institutions in hospitals and in the social and health care services outside...... Universities, University Colleges and Higher Professional Colleges. In spite of the changes, the literature still suggests the need for radical transformation within nursing education and points to the overall vision of transforming the education in order to strengthen and connect education and health......, Aarhus N, Klim, s. 87-110. Kirketerp, AL. 2012 ”Foretagsomhedspædagogik og SKUB-metoden”, Kognition og pædagogik, årg. 22, nr. 83, s. 66-86 Nielsen, Carsten and others. 2011 "Projektevaluering 1. Studieår, Ekstra Klasse 092E" VIA University College, Sygeplejerskeuddannelsen i Aarhus, Aarhus. Noer, VR...
Takeda, Yasuhiro; Ueshima, Yutaka
In visualizing simulation data across advanced research fields, reports and presentations of research results still rely mostly on still pictures or 2-dimensional animation, despite the recent abundance of visualization software. With recent progress in computing environments, however, more complicated phenomena can be computed so easily that the results need to be made more comprehensible and intelligible. This calls for animation rather than still pictures, and for 3-dimensional display (virtual reality) rather than 2-dimensional display. In this report, two visualization tools, 3DAVS and the Polarization-Type VR system, are described as methods of presenting data after visualization processing. (author)
Matejka, K.; Kolros, A.; Krops, S.; Polach, S.; Sklenka, L.
An overview is presented of the extent and ways of using the VR-1 training reactor, which is operated by the Faculty of Nuclear Science and Physical Engineering, Czech Technical University in Prague. A list and the characteristics of 16 problems developed for teaching purposes is given, and the 14 faculties and 2 research institutes participating in the teaching activities are listed. The reactor is used in the education and training of nuclear scientists and engineers. The instrumentation, experimental, handling and operating tools, as well as documentation and texts relating to the reactor are described. The following examples of the teaching activities are included: a guided visit to the operating reactor site, reactor dynamics study and delayed neutron measurement, training course, and the basic criticality experiment. Nuclear safety aspects (hypothetical accidents, quality control and system qualification demonstration, safety culture) are stressed during the education. The reactor department is involved in international cooperation projects. (J.B.). 3 refs
Kropik, M.; Matejka, K.; Sklenka, L.; Chab, V.
The contribution describes a new human machine interface that was installed at the VR-1 training reactor. The human machine interface update was completed in the summer of 2001. The human machine interface makes it possible to operate the training reactor. The interface was designed with respect to functional, ergonomic and aesthetic requirements. The interface is based on a personal computer equipped with two displays. One display enables alphanumeric communication between a reactor operator and the control and safety system of the nuclear reactor. Messages appear from the control system, and the operator can write commands and send them there. The second display is a graphical one. It can represent the status of the reactor, principal parameters (such as power and period), control rod positions, and the course of the reactor power. Furthermore, it is possible to set parameters, to show the active core configuration, to perform reactivity calculations, etc. The software for the new human machine interface was produced in the InTouch development environment of the WonderWare Company. The language of the interface can be switched between Czech and English because of the many foreign students and visitors at the reactor. The former operator's desk was completely removed and replaced with a new one. Besides the computer and the two displays, it holds control buttons, indicators and individual numerical displays of instrumentation. The components used guarantee the high quality of the new equipment. Microcomputer-based communication units with appropriate software were developed to connect the existing control and safety system with the personal computer of the human machine interface and the individual displays. The new human machine interface at the VR-1 training reactor improves the safety and comfort of reactor utilisation, facilitates experiments and training, and provides better support for foreign visitors. (author)
Sutton, C; McCloy, R; Middlebrook, A; Chater, P; Wilson, M; Stone, R
The key bimanual instrument tasks involved in laparoscopic surgery have been abstracted for use in a virtual reality surgical skills evaluator and trainer. The trainer uses two laparoscopic instruments mounted on a frame with position sensors which provide instrument movement data that is translated into interactive real time graphics on a PC (P133, 16 Mb RAM, graphics acceleration card). An accurately scaled operating volume of 10 cm3 is represented by a 3D cube on the computer screen. "Camera" position and size of target objects can be varied for different skill levels. Targets appear randomly within the operating volume according to the skill task and can be grasped and manipulated with the instruments. Accuracy and errors during the tasks and time to completion are logged. Mist VR has tutorial, training, examination, analysis and configuration modes. Six tasks have been selected and include combinations of instrument approach, target acquisition, target manipulation and placement, transfer between instruments, target contact with optional diathermy, and controlled instrument withdrawal/replacement. Tasks can be configured for varying degrees of difficulty and the configurations saved to a library for reuse. Specific task configurations can be assigned to individual students. In the examination mode the supervisor can select the tasks, repetitions and order and save to a specific file for that trainee. Progress can be assessed and there is the option for playback of the training session or examination. Data analyses permit overall, including task, and right or left hand performances to be quantified. Mist VR represents a significant advance over the subjective assessment of training performances with existing "plastic box" basic trainers.
Nissim, Yonit; Weissblueth, Eyal
The current study sought to explore the experiences of pre-service student teachers in a teaching unit in VR within a special course framework which was intended to enhance student-teacher's 21st century skills and growth processes. In particular, how their experiences working with VR affected their self-efficacy. The research population comprised…
Piccolo, Lidia Del; Finset, Arnstein; Mellblom, Anneli V; Figueiredo-Braga, Margarida; Korsvold, Live; Zhou, Yuefang; Zimmermann, Christa; Humphris, Gerald
To discuss the theoretical and empirical framework of VR-CoDES and potential future direction in research based on the coding system. The paper is based on selective review of papers relevant to the construction and application of VR-CoDES. VR-CoDES system is rooted in patient-centered and biopsychosocial model of healthcare consultations and on a functional approach to emotion theory. According to the VR-CoDES, emotional interaction is studied in terms of sequences consisting of an eliciting event, an emotional expression by the patient and the immediate response by the clinician. The rationale for the emphasis on sequences, on detailed classification of cues and concerns, and on the choices of explicit vs. non-explicit responses and providing vs. reducing room for further disclosure, as basic categories of the clinician responses, is described. Results from research on VR-CoDES may help raise awareness of emotional sequences. Future directions in applying VR-CoDES in research may include studies on predicting patient and clinician behavior within the consultation, qualitative analyses of longer sequences including several VR-CoDES triads, and studies of effects of emotional communication on health outcomes. VR-CoDES may be applied to develop interventions to promote good handling of patients' emotions in healthcare encounters. Copyright © 2017 Elsevier B.V. All rights reserved.
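The sequence structure described in this abstract (an eliciting event, the patient's emotional expression, and the clinician's immediate response, classified as explicit vs. non-explicit and as providing vs. reducing room) can be pictured as a small data structure. The sketch below is only an illustration: the class names, field names and category labels are hypothetical and are not taken from the VR-CoDES manual.

```python
from dataclasses import dataclass
from enum import Enum


class Expression(Enum):
    """The two kinds of patient expression named in the abstract."""
    CUE = "cue"
    CONCERN = "concern"


@dataclass(frozen=True)
class Response:
    """Clinician response along the two axes mentioned in the abstract."""
    explicit: bool        # refers explicitly to the cue or concern?
    provides_room: bool   # gives room for further disclosure?

    def category(self) -> str:
        # Hypothetical labels, chosen here for readability only.
        reference = "explicit" if self.explicit else "non-explicit"
        room = "providing room" if self.provides_room else "reducing room"
        return f"{reference}, {room}"


@dataclass(frozen=True)
class Triad:
    """One sequence: eliciting event -> expression -> immediate response."""
    eliciting_event: str
    expression: Expression
    response: Response


def count_categories(triads):
    """Tally how often each response category occurs in a consultation."""
    counts = {}
    for triad in triads:
        key = triad.response.category()
        counts[key] = counts.get(key, 0) + 1
    return counts


consultation = [
    Triad("clinician question", Expression.CUE, Response(False, True)),
    Triad("patient initiated", Expression.CONCERN, Response(True, True)),
    Triad("clinician question", Expression.CUE, Response(False, True)),
]
summary = count_categories(consultation)
```

A tally like `summary` is the kind of input that the studies of longer sequences mentioned in the abstract would start from.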
Bekir Busatlic; Nejdet Dogru; Isaac Lera; Enes Sukic
Smart home refers to the application of various technologies to semi-unsupervised home control. It refers to systems that control temperature, lighting, door locks, windows and many other appliances. The aim of this study was to design a system that will use existing technology to showcase how it can benefit people with disabilities. This work uses only off-the-shelf products (smart home devices and controllers), speech recognition technology, and open-source code libraries. The Voice Activated Sm...
Unlike previous research on voice and silence, this article breaks the distance between the two and declines to treat them as opposites. Voice and silence are interrelated and intertwined strategic forms of communication which presuppose each other in such a way that the absence of one would minimize completely the other’s presence. Social actors are not voice, or silence. Social actors can have voice or silence, they can do both because they operate at multiple levels and deal with multiple issues at different moments in time.
Menin, Aline; Torchelsen, Rafael; Nedel, Luciana
Using virtual environments (VEs) is a safer and cost-effective alternative to executing dangerous tasks, such as training firefighters and industrial operators. Immersive virtual reality (VR) combined with game aspects has the potential to improve the user experience in the VE by increasing realism, engagement, and motivation. This article investigates the impact of VR technology on 46 immersive gamified simulations with serious purposes and classifies it according to a taxonomy. Our findings suggest that immersive VR improves simulation outcomes, such as increasing learning gain and knowledge retention and improving clinical outcomes for rehabilitation. However, it also has limitations such as motion sickness and restricted access to VR hardware. Our contributions are to provide a better understanding of the benefits and limitations of using VR in immersive simulations with serious purposes, to propose a taxonomy that classifies them, and to discuss whether methods and participants' profiles influence results.
.... The ultimate goal of voice biometrics is to enable the use of voice as a password. Voice biometrics are "man-in-the-loop" systems in which system performance is significantly dependent on human performance...
WANG Zhiping; ZHAO Li; ZOU Cairong
A modified Parzen-window method, which keeps high resolution at low frequencies and smoothness at high frequencies, is proposed to obtain a statistical model. A gender classification method utilizing the statistical model is then proposed, which achieves 98% accuracy of gender classification when long sentences are processed. After separating male and female voices, the mean and standard deviation of speech training samples with different emotions are used to create the corresponding emotion models. The Bhattacharyya distances between the test sample and the statistical models of pitch are then utilized for emotion recognition in speech. The normalization of pitch for male and female voices is also considered, in order to map them into a uniform space. Finally, a speech emotion recognition experiment based on K Nearest Neighbor shows that a correct rate of 81% is achieved, compared with only 73.85% if the traditional parameters are utilized.
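To make the distance step concrete, here is a minimal sketch of the Bhattacharyya distance between two univariate Gaussians, used to pick the closest emotion model from a mean and standard deviation of pitch. It assumes each model is a single Gaussian; the model values, the emotion labels and the function names are invented for illustration and are not the authors' data or code.

```python
import math


def bhattacharyya_gaussian(mu1, sigma1, mu2, sigma2):
    """Bhattacharyya distance between N(mu1, sigma1^2) and N(mu2, sigma2^2)."""
    var1, var2 = sigma1 ** 2, sigma2 ** 2
    # Term driven by the difference of the means.
    mean_term = 0.25 * (mu1 - mu2) ** 2 / (var1 + var2)
    # Term driven by the difference of the spreads.
    spread_term = 0.5 * math.log((var1 + var2) / (2.0 * sigma1 * sigma2))
    return mean_term + spread_term


def closest_emotion(sample_mu, sample_sigma, models):
    """Return the label whose pitch model is nearest to the test sample."""
    return min(
        models,
        key=lambda label: bhattacharyya_gaussian(
            sample_mu, sample_sigma, *models[label]
        ),
    )


# Hypothetical pitch models: label -> (mean pitch in Hz, standard deviation).
pitch_models = {"neutral": (180.0, 15.0), "anger": (260.0, 40.0)}
predicted = closest_emotion(250.0, 35.0, pitch_models)
```

The abstract applies such distances per gender after pitch normalization and combines them with a K Nearest Neighbor experiment; this sketch covers only the distance itself.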
L.C. Cantor Cutiva (Lady Catherine); A. Burdorf (Alex)
Objectives: To characterize the objective voice parameters among school workers, and to identify associated factors of three objective voice parameters, namely fundamental frequency, sound pressure level and maximum phonation time. Materials and methods: We conducted a cross-sectional
Baroutsis, Aspa; McGregor, Glenda; Mills, Martin
In this paper, we are concerned with the notion of "pedagogic voice" as it relates to the presence of student "voice" in teaching, learning and curriculum matters at an alternative, or second chance, school in Australia. This school draws upon many of the principles of democratic schooling via its utilisation of student voice…
This article is based on examples of contemporary audiovisual art, with a special focus on the Tony Oursler exhibition Face to Face at Aarhus Art Museum ARoS in Denmark in March-July 2012. My investigation involves a combination of qualitative interviews with visitors, observations of the audience´s...... interactions with the exhibition and the artwork in the museum space and short analyses of individual works of art based on reception aesthetics and phenomenology and inspired by newer writings on sound, voice and listening....
Keromytis, Angelos D
Voice over IP (VoIP) and Internet Multimedia Subsystem technologies (IMS) are rapidly being adopted by consumers, enterprises, governments and militaries. These technologies offer higher flexibility and more features than traditional telephony (PSTN) infrastructures, as well as the potential for lower cost through equipment consolidation and, for the consumer market, new business models. However, VoIP systems also represent a higher complexity in terms of architecture, protocols and implementation, with a corresponding increase in the potential for misuse. In this book, the authors examine the
Donatella Mazzoleni; Pietro Vitiello
A good architecture should not only allow functional, formal and technical quality for urban spaces, but also let the voice of the city be perceived, listened, enjoyed. Every city has got its specific sound identity, or “ISO” (R. O. Benenzon), made up of a complex texture of background noises and fluctuation of sound figures emerging and disappearing in a game of continuous fadings. For instance, the ISO of Naples is characterized by a spread need of hearing the sound return of one’s/others v...
Utterback, Ann S.
Discusses T. S. Eliot's essay, "The Three Voices of Poetry", which conceptualizes the position taken by the poet or creator. Suggests that an examination of documentary film, within the three voices concept, expands the critical framework of the film genre. (MH)
Kolros, Antonin; Huml, Ondrej; Kos, Josef
Full text: The VR-1 Sparrow training reactor is an experimental nuclear facility employed especially for the education and teaching of students from different technical universities in the Czech Republic and other countries. Since 2005 the uniform all-purpose EMK310 devices have been used for measurement in the reactor laboratory with different types of gas-filled neutron detectors. The neutron detection systems are employed for reactivity measurement, control rod calibration, critical experiments, the study of delayed neutrons, the study of nuclear reactor dynamics and the study of detection system dead time. Small isotropic detectors are used especially for measuring the thermal neutron flux distribution inside the reactor core. The EMK-310 is a high-performance, portable, three-channel fast amplitude analyzer designed for counting applications. It was developed for nuclear applications and made in close co-operation with the firm TEMA Ltd. The precise rack eliminates electromagnetic disturbance and contains the control unit and four modules. The modules of high voltage supply and amplifier for gas-filled detectors or scintillation probes are used in the basic configuration. The software is tailored specifically to reactor measurement and allows full online control. For applications involving the study of signals that may vary with time, for example the study of delayed neutrons or nuclear reactor dynamics, the EMK-310 provides a Multichannel Scaling (MCS) acquisition mode. The MCS dwell time can be set from 2 ms. Now, the new generation of digital multichannel analyzers, the DA310, is being introduced. They have similar attributes to the EMK310, but the output information on unipolar signals from the detector is more complete. A pipeline A/D converter with a field programmable gate array (FPGA) is the heart of the DA310 device. The resolution is 12 bits (4096 channels); the sampling frequency is 80 MHz. Application to neutron noise analysis is envisaged. The correction method for non linearity
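The idea behind a Multichannel Scaling acquisition can be sketched in a few lines: detector pulses are counted into consecutive time channels, each one dwell time wide. The function below is a software illustration only, with hypothetical names and made-up pulse times; it does not describe the EMK-310 firmware or its software interface.

```python
def multichannel_scaling(pulse_times_s, dwell_time_s, n_channels):
    """Count pulses into consecutive time channels of width dwell_time_s."""
    counts = [0] * n_channels
    for t in pulse_times_s:
        channel = int(t // dwell_time_s)
        # Pulses before the start or after the last channel are ignored.
        if 0 <= channel < n_channels:
            counts[channel] += 1
    return counts


# Made-up pulse arrival times in seconds, binned with the 2 ms minimum
# dwell time quoted in the abstract.
pulses = [0.0005, 0.0015, 0.0025, 0.0071, 0.0500]
histogram = multichannel_scaling(pulses, dwell_time_s=0.002, n_channels=4)

# A 12-bit converter distinguishes 2**12 amplitude levels, which matches
# the 4096 channels quoted for the DA310.
adc_levels = 2 ** 12
```

Plotting `histogram` against channel number gives the count rate as a function of time, which is the quantity of interest in delayed-neutron and reactor-dynamics measurements.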
This paper presents a method of speech recognition by pattern recognition techniques. Learning consists in determining the unique characteristics of a word (cepstral coefficients) by eliminating those characteristics that are different from one word to another. For learning and recognition, the system will build a dictionary of words by determining the characteristics of each word to be used in the recognition. Determining the characteristics of an audio signal consists of the following steps: noise removal, sampling, applying a Hamming window, switching to the frequency domain through the Fourier transform, calculating the magnitude spectrum, filtering the data, and determining the cepstral coefficients.
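The core of the feature pipeline listed above (Hamming window, Fourier transform, magnitude spectrum, cepstral coefficients) can be sketched with the standard library alone. This is a plain real-cepstrum sketch under stated assumptions: noise removal and the filtering step are omitted because the abstract does not specify them, a slow direct DFT stands in for an FFT, and all function names are hypothetical.

```python
import cmath
import math


def hamming_window(n):
    """Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def dft(samples):
    """Direct discrete Fourier transform (O(n^2), fine for short frames)."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def real_cepstrum(frame, n_coeffs, floor=1e-10):
    """First n_coeffs real cepstral coefficients of one audio frame."""
    n = len(frame)
    windowed = [s * w for s, w in zip(frame, hamming_window(n))]
    magnitude = [abs(c) for c in dft(windowed)]
    # The floor avoids log(0) for empty frequency bins.
    log_magnitude = [math.log(max(m, floor)) for m in magnitude]
    # Inverse DFT of the log magnitude spectrum gives the cepstrum.
    return [
        sum(
            log_magnitude[k] * cmath.exp(2j * math.pi * k * q / n)
            for k in range(n)
        ).real / n
        for q in range(n_coeffs)
    ]


# A synthetic 64-sample frame standing in for a slice of a spoken word.
frame = [math.sin(2.0 * math.pi * 5.0 * i / 64) for i in range(64)]
coefficients = real_cepstrum(frame, n_coeffs=12)
```

A word dictionary in the sense of the abstract would store such coefficient vectors per word and compare a new utterance against them.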
A good architecture should not only allow functional, formal and technical quality for urban spaces, but also let the voice of the city be perceived, listened, enjoyed. Every city has got its specific sound identity, or “ISO” (R. O. Benenzon), made up of a complex texture of background noises and fluctuation of sound figures emerging and disappearing in a game of continuous fadings. For instance, the ISO of Naples is characterized by a spread need of hearing the sound return of one’s/others voices, by a hate of silence. Cities may fall ill: illness from noise, within super-crowded neighbourhoods, or illness from silence, in the forced isolation of peripheries. The proposal of an urban music therapy denotes an unpublished and innovative enlarged interdisciplinary research path, where architecture, music, medicine, psychology, communication science may converge, in order to work for rebalancing spaces and relation life of the urban collectivity, through the care of body and sound dimensions.
McGill, M.; Boland, Daniel; Murray-Smith, Roderick; Brewster, Stephen
We identify usability challenges facing consumers adopting Virtual Reality (VR) head-mounted displays (HMDs) in a survey of 108 VR HMD users. Users reported significant issues in interacting with, and being aware of their real-world context when using an HMD. Building upon existing work on blending real and virtual environments, we performed three design studies to address these usability concerns. In a typing study, we show that augmenting VR with a view of reality sig...
Botella, Cristina; Villa, Helena; García Palacios, Azucena; Quero, Soledad; Baños, Rosa M; Alcaniz, Mariano
Panic disorder with agoraphobia (PDA) is considered an important public health problem. The efficacy of cognitive-behavioral therapy (CBT) for PDA has been widely demonstrated. The American National Institute of Health recommended cognitive-behavioral programs as the treatment of choice for this disorder. This institution also recommended that researchers develop treatments whose mode of delivery increases the availability of these programs. Virtual Reality (VR) based treatments can help to achieve this goal. VR has several advantages compared with conventional techniques. One of the essential components in treating these disorders is exposure. In VR the therapist can control the feared situations at will and with a high degree of safety for the patient, as it is easier to grade the feared situations. Another advantage is that VR is more confidential because treatment takes place in the therapist's office. It is also less time consuming, for the same reason. Considering the wide range of situations and activities that agoraphobic patients tend to avoid, VR can save significant time and money. Another advantage of treating PDA with VR is the possibility of doing interoceptive exposure in VR. VR could be a more natural setting for interoceptive exposure than the consultation room because bodily sensations can be elicited while the patient is immersed in VR agoraphobic situations. Finally, we think that VR exposure can be a useful intermediate step for those patients who refuse in vivo exposure because the idea of facing the real agoraphobic situations is too aversive for them. In this chapter we present the work done by our research team in the VEPSY-UPDATED project. We describe the VR program we have developed for the treatment of PDA and we summarize the efficacy and effectiveness data of a study where we compare a cognitive-behavioral program including VR for the exposure component with a standard cognitive-behavioral program including in vivo exposure and with a
This chapter describes a Virtual Reality (VR) based innovative model of evaluation of the performance and potentiality of young mentally/psychically disabled subjects with learning difficulties. Using an immersive PC-based VR system, the study investigated the characteristics of 150 disabled subjects in the EU funded project "Horizon O.D.A.--Catania-1998--2000". The result is the definition of an individual neuropsychological "Integrated Profile", based on VR performance, that allows an objective functional benchmark between different subjects. This model can be used to investigate the possibility of job integration for mentally/psychically disabled subjects.
Soto, R.; Harkins, R. (HPD, Inc., Naperville, IL)
The need for cost effective alternatives to treat large volumes of liquid radwaste has never been more evident. As part of a continuing effort to introduce such alternatives, HPD, Inc., and Chem-Nuclear Systems, Inc., have integrated two proven state-of-the-art technologies to offer a mobile liquid volume reduction system that satisfies nuclear industry requirements, with respect to liquid radwaste handling. This system optimizes proven technology by employing a crystallizer unit to concentrate the waste liquids to 50 weight percent solids, thereby reducing the volume to be solidified by factors of 40, while using only 20 percent of the energy required by conventional evaporative systems. In addition, the system employs a field proven cement solidification process which has been accepted in a Topical Report by the US NRC and which offers the highest waste to container volume ratios for stable waste forms in the industry. This volume reduction-solidification system is able to reduce over 7000 gallons of liquid waste per day to less than 30 cubic feet of 10CFR61 certified stable solidified waste for ultimate disposal or on-site storage. This document describes the GEODE System; its applicability; economics; volume reduction; scope of responsibility and experience. Major benefits include higher VR factors; assurance of continual regulatory compliance; and no capital investment
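The overall volume reduction implied by the daily throughput figures can be checked with a short calculation. The sketch assumes that "gallons" means US gallons (231 cubic inches each) and treats the 30 cubic feet as an upper bound on the product volume; the function name is hypothetical.

```python
# One US gallon is 231 cubic inches; one cubic foot is 1728 cubic inches.
US_GALLON_IN_CUBIC_FEET = 231 / 1728


def volume_reduction_factor(feed_gallons, product_cubic_feet):
    """Ratio of liquid feed volume to solidified product volume."""
    feed_cubic_feet = feed_gallons * US_GALLON_IN_CUBIC_FEET
    return feed_cubic_feet / product_cubic_feet


# 7000 gallons of liquid waste per day reduced to at most 30 cubic feet.
overall_factor = volume_reduction_factor(7000, 30)
```

Under these assumptions 7000 gallons is roughly 936 cubic feet, so the overall factor is a little above 31. That is lower than the factor of 40 quoted for the crystallizer stage alone, which would be plausible if the solidified product also includes the volume of the cement.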
Fu, Qian-Jie; Chinchilla, Sherol; Galvin, John J
The present study investigated the relative importance of temporal and spectral cues in voice gender discrimination and vowel recognition by normal-hearing subjects listening to an acoustic simulation of cochlear implant speech processing and by cochlear implant users. In the simulation, the number of speech processing channels ranged from 4 to 32, thereby varying the spectral resolution; the cutoff frequencies of the channels' envelope filters ranged from 20 to 320 Hz, thereby manipulating the available temporal cues. For normal-hearing subjects, results showed that both voice gender discrimination and vowel recognition scores improved as the number of spectral channels was increased. When only 4 spectral channels were available, voice gender discrimination significantly improved as the envelope filter cutoff frequency was increased from 20 to 320 Hz. For all spectral conditions, increasing the amount of temporal information had no significant effect on vowel recognition. Both voice gender discrimination and vowel recognition scores were highly variable among implant users. The performance of cochlear implant listeners was similar to that of normal-hearing subjects listening to comparable speech processing (4-8 spectral channels). The results suggest that both spectral and temporal cues contribute to voice gender discrimination and that temporal cues are especially important for cochlear implant users to identify the voice gender when there is reduced spectral resolution.
Partila, Pavol; Voznak, Miroslav; Tovarek, Jaromir
The impact of the classification method and feature selection on speech emotion recognition accuracy is discussed in this paper. Selecting the correct parameters in combination with the classifier is an important part of reducing the computational complexity of the system. This step is especially necessary for systems that will be deployed in real-time applications. The reason for the development and improvement of speech emotion recognition systems is their wide usability in today's automatic voice-controlled systems. The Berlin database of emotional recordings was used in this experiment. The classification accuracy of artificial neural networks, k-nearest neighbours, and Gaussian mixture models is measured considering the selection of prosodic, spectral, and voice quality features. The purpose was to find an optimal combination of methods and group of features for stress detection in human speech. The research contribution lies in the design of a speech emotion recognition system with regard to its accuracy and efficiency.
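Of the three classifiers compared, k-nearest neighbours is the simplest to sketch. The example below is a toy illustration, not the authors' system: the two features (mean pitch in Hz and a relative energy value), the training points and the labels are all invented, and a real system would scale the features before measuring distances.

```python
import math
from collections import Counter


def knn_predict(training_set, query, k=3):
    """Label the query by majority vote among its k nearest training points."""
    nearest = sorted(
        training_set, key=lambda item: math.dist(item[0], query)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Invented training data: ((mean pitch in Hz, relative energy), label).
training_set = [
    ((120.0, 0.20), "neutral"),
    ((125.0, 0.25), "neutral"),
    ((118.0, 0.22), "neutral"),
    ((240.0, 0.80), "stress"),
    ((250.0, 0.90), "stress"),
    ((245.0, 0.85), "stress"),
]
label = knn_predict(training_set, (238.0, 0.70), k=3)
```

Because pitch is numerically far larger than the energy value here, it dominates the Euclidean distance, which is one reason feature selection and scaling matter as much as the choice of classifier.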
Bekele, Esubalew; Crittendon, Julie; Zheng, Zhi; Swanson, Amy; Weitlauf, Amy; Warren, Zachary; Sarkar, Nilanjan
Teenagers with autism spectrum disorder (ASD) and age-matched controls participated in a dynamic facial affect recognition task within a virtual reality (VR) environment. Participants identified the emotion of a facial expression displayed at varied levels of intensity by a computer generated avatar. The system assessed performance (i.e.,…
Styslinger, Mary E.; Whisenant, Alison
In this article, the authors discuss the benefits of using multi-voiced journals as a teaching strategy in reading instruction. Multi-voiced journals, an adaptation of dual-voiced journals, encourage responses to reading in varied, cultured voices of characters. It is similar to reading journals in that they prod students to connect to the lives…
Haderlein, Tino; Döllinger, Michael; Matoušek, Václav; Nöth, Elmar
Automatic voice assessment is often performed using sustained vowels. In contrast, speech analysis of read-out texts can be applied to voice and speech assessment. Automatic speech recognition and prosodic analysis were used to find regression formulae between automatic and perceptual assessment of four voice and four speech criteria. The regression was trained with 21 men and 62 women (average age 49.2 years) and tested with another set of 24 men and 49 women (48.3 years), all suffering from chronic hoarseness. They read the text 'Der Nordwind und die Sonne' ('The North Wind and the Sun'). Five voice and speech therapists evaluated the data on 5-point Likert scales. Ten prosodic and recognition accuracy measures (features) were identified which describe all the examined criteria. Inter-rater correlation within the expert group was between r = 0.63 for the criterion 'match of breath and sense units' and r = 0.87 for the overall voice quality. Human-machine correlation was between r = 0.40 for the match of breath and sense units and r = 0.82 for intelligibility. The perceptual ratings of different criteria were highly correlated with each other. Likewise, the feature sets modeling the criteria were very similar. The automatic method is suitable for assessing chronic hoarseness in general and for subgroups of functional and organic dysphonia. In its current version, it is almost as reliable as a randomly picked rater from a group of voice and speech therapists.
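The human-machine agreement reported above is a Pearson correlation between automatic scores and perceptual ratings. A minimal sketch of that computation follows; the rating values are invented for illustration and the function name is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(spread_x * spread_y)


# Invented example: mean expert ratings on a 5-point scale and the
# corresponding automatic scores for six speakers.
expert_ratings = [1.2, 2.0, 2.8, 3.4, 4.1, 4.6]
automatic_scores = [1.5, 1.9, 3.1, 3.0, 4.4, 4.2]
agreement = pearson_r(expert_ratings, automatic_scores)
```

The same function applied to pairs of raters gives the inter-rater correlation that the abstract uses as the benchmark for the automatic method.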
Lightstone, P. C.; Davidson, W. M.
The military detection assessment laboratory houses an experimental field system which assesses different alarm indicators such as fence disturbance sensors, MILES cables, and microwave Racons. A speech synthesis board which could be interfaced, by means of a computer, to an alarm logger making verbal acknowledgement of alarms possible was purchased. Different products and different types of voice synthesis were analyzed before a linear predictive code device produced by Telesensory Speech Systems of Palo Alto, California was chosen. This device is called the Speech 1000 Board and has a dedicated 8085 processor. A multiplexer card was designed and the Sp 1000 interfaced through the card into a TMS 990/100M Texas Instrument microcomputer. It was also necessary to design the software with the capability of recognizing and flagging an alarm on any 1 of 32 possible lines. The experimental field system was then packaged with a dc power supply, LED indicators, speakers, and switches, and deployed in the field performing reliably.
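The requirement of recognizing and flagging an alarm on any 1 of 32 lines can be illustrated with a bit mask. This is a modern sketch under an assumption the abstract does not state, namely that the 32 line states arrive packed into one 32-bit status word; the function names and the announcement wording are hypothetical and unrelated to the original TMS 990/100M software.

```python
N_LINES = 32


def flagged_lines(status_word):
    """Return the indices (0..31) of alarm lines whose bit is set."""
    return [line for line in range(N_LINES) if (status_word >> line) & 1]


def announcements(status_word):
    """Build the text a speech synthesizer could be asked to speak."""
    return [f"Alarm on line {line + 1}" for line in flagged_lines(status_word)]


# Example: lines with index 0 and 2 are in alarm.
messages = announcements(0b101)
```

Numbering the spoken lines from 1 while indexing bits from 0 is a design choice made here for readability of the announcements.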
It has been shown that teachers are at high risk of developing occupational dysphonia, and it has been widely accepted that the vocal characteristics of a speaker play an important role in determining the reactions of listeners. The functions of breathing, breathing movement, breathing tonus, voice vibrations and articulation tonus are transmitted to the listener. So we may conclude that listening to the teacher's voice at school influences children's behavior and the perception of spoken language. This paper presents the concept of Schlaffhorst-Andersen including exercises to help teachers improve their voice, breathing, movement and their posture. Copyright 2008 S. Karger AG, Basel.
Haddad, Darren M.; Ratley, Roy J.
Voice Stress Analysis (VSA) systems are marketed as computer-based systems capable of measuring stress in a person's voice as an indicator of deception. They are advertised as being less expensive, easier to use, less invasive in use, and less constrained in their operation than polygraph technology. The National Institute of Justice has asked the Air Force Research Laboratory for assistance in evaluating voice stress analysis technology. Law enforcement officials have also been asking questions about this technology. If VSA technology proves to be effective, its value for military and law enforcement application is tremendous.
The purpose of these case reports is to evaluate the role of ST elevation in lead aVR and to compare the two cases. Some atypical electrocardiogram (ECG) presentations, such as ST elevation in lead aVR, need prompt management in patients with ischemic clinical manifestations. In this case study, we report a 68-year-old woman with chief symptoms of shortness of breath and chest discomfort. She was diagnosed with cardiogenic shock, with Killip class IV and a TIMI score of 8. The second case is a 57-year-old man with typical chest pain at rest which could not be relieved with nitrate treatment. He was diagnosed with ST elevation in the inferior and aVR leads and occlusion of the left circumflex artery (LCX). Both patients underwent primary percutaneous coronary intervention (PPCI). Subsequently, both cases presented remarkable clinical improvements and improved ST elevation myocardial infarction (STEMI) in the aVR lead.
Kühl, Jørgen Tobias; Berg, Ronan M G
BACKGROUND: Lead aVR is a neglected, however, potentially useful tool in electrocardiography. Our aim was to evaluate its value in clinical practice, by reviewing existing literature regarding its utility for identifying the culprit lesion in acute myocardial infarction (AMI). METHODS: Based on a systematic search strategy, 16 studies were assessed with the intent to pool data; diagnostic test rates were calculated as key results. RESULTS: Five studies investigated if ST-segment elevation (STE) in aVR is valuable for the diagnosis of left main stem stenosis (LMS) in non-ST-segment AMI (NSTEMI). The studies were too heterogeneous to pool, but the individual studies all showed that STE in aVR has a high negative predictive value (NPV) for LMS. Six studies evaluated if STE in aVR is valuable for distinguishing proximal from distal lesions in the left anterior descending artery (LAD) in anterior ST...
Wang Yong; Yu Xiao
This paper studies a new method of design verification using a VR plant, in order to verify and validate that the plant design conforms to the requirements of accident emergency response. The dynamic VR plant is built from a 3D design model and digital maps, composed of a GIS system and indoor maps, and is driven by the analysis data of the design analyzer. The VR plant can present the operating conditions and accident conditions of the power plant. Based on the dynamic VR plant, this paper simulates the execution of accident procedures, the development of accidents, the evacuation planning of people and so on, to ensure that the plant design will not have adverse effects. Besides design verification, the simulation results can also be used for optimizing the accident emergency plan, for training on the accident plan, and for emergency accident treatment. (author)
Garzón García, Marina; Muñoz López, Juana; Y Mendoza Lara, Elvira
The purpose of this study is to analyze the vocal behavior of flamenco singers, as compared with classical music singers, to establish a differential vocal profile of voice habits and behaviors in flamenco music. Bibliographic review was conducted, and the Singer's Vocal Habits Questionnaire, an experimental tool designed by the authors to gather data regarding hygiene behavior, drinking and smoking habits, type of practice, voice care, and symptomatology perceived in both the singing and the speaking voice, was administered. We interviewed 94 singers, divided into two groups: the flamenco experimental group (FEG, n = 48) and the classical control group (CCG, n = 46). Frequency analysis, a Likert scale, and discriminant and exploratory factor analysis were used to obtain a differential profile for each group. The FEG scored higher than the CCG in speaking voice symptomatology. The FEG scored significantly higher than the CCG in use of "inadequate vocal technique" when singing. Regarding voice habits, the FEG scored higher in "lack of practice and warm-up" and "environmental habits." A total of 92.6% of the subjects classified themselves correctly in each group. The Singer's Vocal Habits Questionnaire has proven effective in differentiating flamenco and classical singers. Flamenco singers are exposed to numerous vocal risk factors that make them more prone to vocal fatigue, mucosa dehydration, phonotrauma, and muscle stiffness than classical singers. Further research is needed in voice training in flamenco music, as a means to strengthen the voice and enable it to meet the requirements of this musical genre. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Neill, Johanne; Harbinson, Mark; Shannon, Heather J.; Morton, Amanda; Muir, Alison R.; Adgey, Jennifer A.
To evaluate, in patients with chest pain, the diagnostic value of ST elevation (STE) in lead aVR during stress testing prior to 99m Tc-sestamibi scanning correlating ischaemic territory with angiographic findings. Consecutive patients attending for 99m Tc-sestamibi myocardial perfusion imaging (MPI) completed a treadmill protocol. Peak exercise ECGs were coded. STE ≥0.05 mV in lead aVR was considered significant. Gated perfusion images and findings at angiography were assessed. STE in lead aVR occurred in 25% (138/557) of the patients. More patients with STE in aVR had a reversible defect on imaging compared with those who had no STE in aVR (41%, 56/138 vs 27%, 114/419, p=0.003). Defects indicating a left anterior descending artery (LAD) culprit lesion were more common in the STE in aVR group (20%, 27/138 vs 9%, 39/419, p=0.001). There was a trend towards coronary artery stenosis (>70%) in a double vessel distribution involving the LAD in those patients who had STE in aVR compared with those who did not (22%, 8/37 vs 5%, 4/77, p=0.06). Logistic regression analysis demonstrated that STE in aVR (OR 1.36, p=0.233) is not an independent predictor of inducible abnormality when adjusted for STD >0.1 mV (OR 1.69, p=0.026). However, using anterior wall defect as an end-point, STE in aVR (OR 2.77, p=0.008) was a predictor even after adjustment for STD (OR 1.43, p=0.281). STE in lead aVR during exercise does not diagnose more inducible abnormalities than STD alone. However, unlike STD, which is not predictive of a territory of ischaemia, STE in aVR may indicate an anterior wall defect. (orig.)
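The percentages quoted in this abstract can be re-derived from the patient counts given alongside them. The short check below uses only the counts stated in the abstract; the dictionary labels are shorthand chosen for this sketch, and no p-values or odds ratios are recomputed.

```python
# (numerator, denominator, percentage quoted in the abstract)
reported = {
    "STE in aVR, all patients": (138, 557, 25),
    "reversible defect, STE group": (56, 138, 41),
    "reversible defect, no-STE group": (114, 419, 27),
    "LAD-territory defect, STE group": (27, 138, 20),
    "LAD-territory defect, no-STE group": (39, 419, 9),
    "double-vessel stenosis incl. LAD, STE group": (8, 37, 22),
    "double-vessel stenosis incl. LAD, no-STE group": (4, 77, 5),
}


def percent(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


for label, (num, den, quoted) in reported.items():
    print(f"{label}: {num}/{den} = {percent(num, den)}% (quoted {quoted}%)")

# The STE and no-STE groups should add up to the full cohort.
print(138 + 419 == 557)
```

Each recomputed value agrees with the quoted figure after rounding, and the two groups sum to the 557 patients reported.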
Leonov, A. S.; Sorokin, V. N.
The inverse problem of recovering the voice source pulse from a segment of a speech signal is considered. A special mathematical model relating these quantities is used for the solution. A variational method is proposed for solving the inverse problem of voice source recovery for a new parametric class of sources, namely piecewise-linear sources (PWL-sources). A technique for a posteriori numerical error estimation of the obtained solutions is also presented. A computer study of the adequacy of the adopted speech production model with PWL-sources is performed by solving the inverse problems for various types of voice signals, together with a corresponding study of the a posteriori error estimates. Numerical experiments on speech signals show satisfactory properties of the proposed a posteriori error estimates, which represent upper bounds on the possible errors in solving the inverse problem. The estimate of the most probable error in determining the source-pulse shapes is about 7-8% for the investigated speech material. It is noted that the a posteriori error estimates can be used as a quality criterion for the obtained voice source pulses in application to speaker recognition.
Good public speaking skills are essential in many professions as well as everyday life, but speech anxiety is a common problem. While it is established that public speaking training in virtual reality (VR) is effective, comprehensive studies on the underlying factors that contribute to this success are rare. The “quality evaluation of user-system interaction in virtual reality” framework for evaluation of VR applications is presented that includes system features, user factors, and moderating...
Riva, Giuseppe; Gaggioli, Andrea; Villani, Daniela; Preziosa, Alessandra; Morganti, Francesca; Corsi, Riccardo; Faletti, Gianluca; Vezzadini, Luca
In the past decade, the use of virtual reality for clinical and research applications has become more widespread. However, the diffusion of this approach is still limited by three main issues: poor usability, lack of technical expertise among clinical professionals, and high costs. To address these challenges, we introduce NeuroVR (http://www.neurovr.org--http://www.neurotiv.org), a cost-free virtual reality platform based on open-source software, that allows non-expert users to adapt the content of a pre-designed virtual environment to meet the specific needs of the clinical or experimental setting. Using the NeuroVR Editor, the user can choose the appropriate psychological stimuli/stressors from a database of objects (both 2D and 3D) and videos, and easily place them into the virtual environment. The edited scene can then be visualized in the NeuroVR Player using either immersive or non-immersive displays. Currently, the NeuroVR library includes different virtual scenes (apartment, office, square, supermarket, park, classroom, etc.), covering two of the most studied clinical applications of VR: specific phobias and eating disorders. The NeuroVR Editor is based on Blender (http://www.blender.org), the open source, cross-platform suite of tools for 3D creation, and is available as a completely free resource. An interesting feature of the NeuroVR Editor is the possibility to add new objects to the database. This feature allows the therapist to enhance the patient's feeling of familiarity and intimacy with the virtual scene, i.e., by using photos or movies of objects/people that are part of the patient's daily life, thereby improving the efficacy of the exposure. The NeuroVR platform runs on standard personal computers with Microsoft Windows; the only requirement for the hardware is related to the graphics card, which must support OpenGL.
Kropik, M.; Matejka, K.; Chab, V.
The contribution describes the upgrade of the VR-1 training reactor I and C (Instrumentation and Control). The reactor was put into operation in 1990, and its I and C now seems obsolete. The new I and C utilises the same digital technology as the old one. The upgrade has been done gradually during holidays in order not to disturb the reactor utilisation during teaching and training. The first stage, in 2001, consisted of the human-machine interface and control room upgrade. A new operator's desk, displays, indicators and buttons were installed. Completely new software and a communication interface to the present I and C were developed. During the second stage, in 2002, new control rod drivers and safety circuits were installed. The rod motors were replaced, and the mechanical changes to the control rod mechanism required by the new motor were made. The new safety circuits utilise high quality relays with forced contacts to guarantee high reliability of their operation. The third stage, the control system upgrade, is being carried out now. The new control system is based on an industrial PC mounted in a 19 inch crate. The operating system of the PC is Microsoft Windows XP with the RTX real-time support from VentureCom. A large amount of work has been devoted to the software requirements to specify all dependencies, modes and permitted actions, safety measures, etc. The Department took an active part in setting the software requirements and later in verification and validation of the software and the whole control system. Finally, a new protection system consisting of power measuring and power protection channels will be installed in 2004 or 2005. (author)
Network virtualization technology is regarded as one of the gradual schemes for network architecture evolution. With the development of network functions virtualization, operators have made considerable efforts to achieve router virtualization using general-purpose servers. To ensure high performance, a virtual router platform usually adopts a cluster of general-purpose servers, which can also be regarded as a special cloud computing environment. However, frequent creation and deletion of router instances may generate considerable resource fragmentation, which prevents the platform from establishing new router instances. To solve this “resource fragmentation problem,” we first propose VR-Cluster, which introduces two extra function planes: a switching plane and a resource management plane. The switching plane is mainly used to support seamless migration of router instances without packet loss; the resource management plane can dynamically move router instances from one server to another using VR-mapping algorithms. In addition, three VR-mapping algorithms based on VR-Cluster are proposed: a first-fit mapping algorithm, a best-fit mapping algorithm, and a worst-fit mapping algorithm. Finally, we build a VR-Cluster prototype system using general x86 servers, evaluate its migration time, and further analyze the advantages and disadvantages of the proposed VR-mapping algorithms in solving the resource fragmentation problem.
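The three mapping strategies named above share their names with the classic first-fit, best-fit and worst-fit placement heuristics. The sketch below shows those textbook heuristics applied to a single scalar resource; it is not the authors' VR-mapping implementation, and the capacity model, function names and example numbers are assumptions made for illustration.

```python
from typing import Callable, Optional

Strategy = Callable[[list[int], int], Optional[int]]


def first_fit(free: list[int], demand: int) -> Optional[int]:
    """Pick the first server with enough free capacity."""
    for index, capacity in enumerate(free):
        if capacity >= demand:
            return index
    return None


def best_fit(free: list[int], demand: int) -> Optional[int]:
    """Pick the server that leaves the least capacity unused."""
    fits = [(cap - demand, i) for i, cap in enumerate(free) if cap >= demand]
    return min(fits)[1] if fits else None


def worst_fit(free: list[int], demand: int) -> Optional[int]:
    """Pick the server that leaves the most capacity unused."""
    # Negating the index makes ties resolve to the lowest server index.
    fits = [(cap - demand, -i) for i, cap in enumerate(free) if cap >= demand]
    return -max(fits)[1] if fits else None


def place(free: list[int], demand: int, strategy: Strategy) -> Optional[int]:
    """Place one router instance and update the free-capacity list."""
    index = strategy(free, demand)
    if index is not None:
        free[index] -= demand
    return index


servers = [4, 10, 6]  # hypothetical free capacity per server
for strategy in (first_fit, best_fit, worst_fit):
    print(strategy.__name__, strategy(servers, 5))
```

With free capacities 4, 10 and 6 and a demand of 5, first-fit and worst-fit both choose the second server, while best-fit chooses the third because it leaves the smallest remainder. Best-fit tends to preserve large free blocks for later instances, whereas worst-fit spreads load more evenly.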
Agrawal, Sonia; Arze, Cesar; Adkins, Ricky S; Crabtree, Jonathan; Riley, David; Vangala, Mahesh; Galens, Kevin; Fraser, Claire M; Tettelin, Hervé; White, Owen; Angiuoli, Samuel V; Mahurkar, Anup; Fricke, W Florian
The benefit of increasing genomic sequence data to the scientific community depends on easy-to-use, scalable bioinformatics support. CloVR-Comparative combines commonly used bioinformatics tools into an intuitive, automated, and cloud-enabled analysis pipeline for comparative microbial genomics. CloVR-Comparative runs on annotated complete or draft genome sequences that are uploaded by the user or selected via a taxonomic tree-based user interface and downloaded from NCBI. CloVR-Comparative runs reference-free multiple whole-genome alignments to determine unique, shared and core coding sequences (CDSs) and single nucleotide polymorphisms (SNPs). Output includes short summary reports and detailed text-based results files, graphical visualizations (phylogenetic trees, circular figures), and a database file linked to the Sybil comparative genome browser. Data up- and download, pipeline configuration and monitoring, and access to Sybil are managed through CloVR-Comparative web interface. CloVR-Comparative and Sybil are distributed as part of the CloVR virtual appliance, which runs on local computers or the Amazon EC2 cloud. Representative datasets (e.g. 40 draft and complete Escherichia coli genomes) are processed in genomics projects, while eliminating the need for on-site computational resources and expertise.
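The distinction between unique, shared and core coding sequences can be made concrete with set operations. The sketch below assumes that CDSs have already been grouped into clusters of equivalent genes and that, for each cluster, the set of genomes containing it is known; the cluster and genome names are hypothetical, and the definitions used (core = present in every genome, unique = present in exactly one, shared = everything in between) are a common convention rather than a description of the CloVR-Comparative code.

```python
def classify_clusters(presence: dict[str, set[str]]) -> dict[str, set[str]]:
    """Split gene clusters into core, shared and unique groups.

    presence maps a cluster ID to the set of genomes that contain it.
    """
    genomes: set[str] = set()
    for members in presence.values():
        genomes |= members

    groups: dict[str, set[str]] = {"core": set(), "shared": set(), "unique": set()}
    for cluster, members in presence.items():
        if not members:
            continue  # a cluster with no genomes carries no information
        if members == genomes:
            groups["core"].add(cluster)
        elif len(members) == 1:
            groups["unique"].add(cluster)
        else:
            groups["shared"].add(cluster)
    return groups


example = {
    "cluster_1": {"genome_A", "genome_B", "genome_C"},
    "cluster_2": {"genome_A", "genome_B"},
    "cluster_3": {"genome_C"},
}
print(classify_clusters(example))
```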
Gaffary, Yoren; Le Gouis, Benoit; Marchal, Maud; Argelaguet, Ferran; Arnaldi, Bruno; Lecuyer, Anatole
Does it feel the same when you touch an object in Augmented Reality (AR) or in Virtual Reality (VR)? In this paper we study and compare the haptic perception of stiffness of a virtual object in two situations: (1) a purely virtual environment versus (2) a real and augmented environment. We have designed an experimental setup based on a Microsoft HoloLens and a haptic force-feedback device, enabling users to press a virtual piston and compare its stiffness successively in either Augmented Reality (the virtual piston is surrounded by several real objects all located inside a cardboard box) or in Virtual Reality (the same virtual piston is displayed in a fully virtual scene composed of the same other objects). We have conducted a psychophysical experiment with 12 participants. Our results show a surprising bias in perception between the two conditions. The virtual piston is on average perceived as stiffer in the VR condition than in the AR condition. For instance, when the piston had the same stiffness in AR and VR, participants would select the VR piston as the stiffer one in 60% of cases. This suggests a psychological effect, as if objects in AR would feel "softer" than in pure VR. Taken together, our results open new perspectives on perception in AR versus VR, and pave the way to future studies aiming at characterizing potential perceptual biases.
van de Bovenkamp, Hester; Vollaard, Hans; Trappenburg, Margo; Grit, Kor
In many Western countries, options for citizens to influence public services are increased to improve the quality of services and democratize decision making. Possibilities to influence are often cast into Albert Hirschman's taxonomy of exit (choice), voice, and loyalty. In this article we identify delegation as an important addition to this framework. Delegation gives individuals the chance to practice exit/choice or voice without all the hard work that is usually involved in these options. Empirical research shows that not many people use their individual options of exit and voice, which could lead to inequality between users and nonusers. We identify delegation as a possible solution to this problem, using Dutch health care as a case study to explore this option. Notwithstanding various advantages, we show that voice and choice by delegation also entail problems of inequality and representativeness.
Ojala, Tõnu, 1969-
On an event of the jubilee season of the Tallinn University of Technology Academic Male Choir, which is celebrating its 60th birthday: the a cappella pop-group festival Voice Force (concerts on 12 Nov. at the club Parlament and on 3 Dec. at the Russian Cultural Centre)
... negative effect on voice. Exercise regularly. Exercise increases stamina and muscle tone. This helps provide good posture ... testing man-made and biological materials and stem cell technologies that may eventually be used to engineer ...
In this paper the Rev. Stuart Fowler outlines a Christian voice in Philosophy and urges the Christian philosopher to investigate his position and his stance with integrity and honesty.
Cortes, Diana S; Laukka, Petri; Lindahl, Christina; Fischer, Håkan
We investigated how memory for faces and voices (presented separately and in combination) varies as a function of sex and emotional expression (anger, disgust, fear, happiness, sadness, and neutral). At encoding, participants judged the expressed emotion of items in forced-choice tasks, followed by incidental Remember/Know recognition tasks. Results from 600 participants showed that accuracy (hits minus false alarms) was consistently higher for neutral compared to emotional items, whereas accuracy for specific emotions varied across the presentation modalities (i.e., faces, voices, and face-voice combinations). For the subjective sense of recollection ("remember" hits), neutral items received the highest hit rates only for faces, whereas for voices and face-voice combinations anger and fear expressions instead received the highest recollection rates. We also observed better accuracy for items by female expressers, and own-sex bias where female participants displayed memory advantage for female faces and face-voice combinations. Results further suggest that own-sex bias can be explained by recollection, rather than familiarity, rates. Overall, results show that memory for faces and voices may be influenced by the expressions that they carry, as well as by the sex of both items and participants. Emotion expressions may also enhance the subjective sense of recollection without enhancing memory accuracy.
Schweinberger, S R; Herholz, A; Sommer, W
The current investigation measured the effects of increasing stimulus duration on listeners' ability to recognize famous voices. In addition, the investigation studied the influence of different types of cues on the naming of voices that could not be named before. Participants were presented with samples of famous and unfamiliar voices and were asked to decide whether or not the samples were spoken by a famous person. The duration of each sample increased in seven steps from 0.25 s up to a maximum of 2 s. Improvements in voice recognition with stimulus duration followed a growth function. Gains were most rapid within the first second and less pronounced thereafter. When participants were unable to name a famous voice, they were cued with either a second voice sample, the occupation, or the initials of the celebrity. Initials were most effective in eliciting the name only when semantic information about the speaker had been accessed prior to cue presentation. Paralleling previous research on face naming, this may indicate that voice naming is contingent on previous activation of person-specific semantic information.
Voice-induced cross-taxa emotional recognition is the ability to understand the emotional state of another species based on its voice. In the past, induced affective states, experience-dependent higher cognitive processes or cross-taxa universal acoustic coding and processing mechanisms have been discussed to underlie this ability in humans. The present study sets out to distinguish the influence of familiarity and phylogeny on voice-induced cross-taxa emotional perception in humans. For the first time, two perspectives are taken into account: the self- (i.e. emotional valence induced in the listener) versus the others-perspective (i.e. correct recognition of the emotional valence of the recording context). Twenty-eight male participants listened to 192 vocalizations of four different species (human infant, dog, chimpanzee and tree shrew). Stimuli were recorded either in an agonistic (negative emotional valence) or affiliative (positive emotional valence) context. Participants rated the emotional valence of the stimuli adopting self- and others-perspective by using a 5-point version of the Self-Assessment Manikin (SAM). Familiarity was assessed based on subjective rating, objective labelling of the respective stimuli and interaction time with the respective species. Participants reliably recognized the emotional valence of human voices, whereas the results for animal voices were mixed. The correct classification of animal voices depended on the listener's familiarity with the species and the call type/recording context, whereas there was less influence of induced emotional states and phylogeny. Our results provide first evidence that explicit voice-induced cross-taxa emotional recognition in humans is shaped more by experience-dependent cognitive mechanisms than by induced affective states or cross-taxa universal acoustic coding and processing mechanisms.
National Aeronautics and Space Administration — This proposal responds to the urgent need for improved pilot interfaces in the modern aircraft cockpit. Recent advances in aircraft equipment bring tremendous...
Smart home refers to the application of various technologies to semi-unsupervised home control. It refers to systems that control temperature, lighting, door locks, windows and many other appliances. The aim of this study was to design a system that uses existing technology to showcase how it can benefit people with disabilities. This work uses only off-the-shelf products (smart home devices and controllers, speech recognition technology, open-source code libraries). The Voice Activated Smart Home application was developed to demonstrate online grocery shopping and home control using voice commands, and was tested by measuring its effectiveness in performing tasks as well as its efficiency in recognizing user speech input.
Edita K. Kuular
Among the most important parameters that determine the effectiveness of biometric systems with voice modalities, along with reliability and noise immunity, the speed of identification and verification of a person has been emphasized. This parameter is especially sensitive when processing large-scale voice databases in real time. Many research studies in this area aim to develop new algorithms, and improve existing ones, for representing and processing voice records so as to ensure high performance of voice biometric systems. Here it seems promising to apply a modern approach based on the complex network platform, for solving complex massive problems with a large number of elements while taking their interrelationships into account. For example, some known works on the analysis and recognition of faces from photographs transform images into complex networks for subsequent processing by standard techniques. One of the first applications of complex networks to the analysis of sound series (musical and speech) is the description of frequency characteristics by constructing network models, that is, converting the series into networks. On the network ontology platform, a previously proposed technique for representing audio information, aimed at its automatic analysis and speaker recognition, has been developed further. It involves converting the information into an associative semantic (cognitive) network structure with both amplitude and frequency components. Two speaker exemplars were recorded and transformed into the corresponding networks, and their topological metrics were then compared. The set of topological metrics for each network model (amplitude and frequency) is a vector, and together these form a matrix that serves as a digital "network" voiceprint. The proposed network approach, with its sensitivity to personal conditions (physiological, psychological, emotional), might be useful not only for person identification
The aim of speaker recognition and verification is to identify people's identity from the characteristics of their voices (voice biometrics). Traditionally this technology has been employed mostly for security or authentication purposes, identification of employees/customers and criminal investigations. During the last decade the increasing popularity of hands-free and voice-controlled systems and the massive growth of media content generated on the internet has increased the need for technique...
Pugh, Matthew; Waller, Glenn
In common with individuals experiencing a number of disorders, people with anorexia nervosa report experiencing an internal 'voice'. The anorexic voice comments on the individual's eating, weight and shape and instructs the individual to restrict or compensate. However, the core characteristics of the anorexic voice are not known. This study aimed to develop a parsimonious model of the voice characteristics that are related to key features of eating disorder pathology and to determine whether patients with anorexia nervosa fall into groups with different voice experiences. The participants were 49 women with full diagnoses of anorexia nervosa. Each completed validated measures of the power and nature of their voice experience and of their responses to the voice. Different voice characteristics were associated with current body mass index, duration of disorder and eating cognitions. Two subgroups emerged, with 'weaker' and 'stronger' voice experiences. Those with stronger voices were characterized by having more negative eating attitudes, more severe compensatory behaviours, a longer duration of illness and a greater likelihood of having the binge-purge subtype of anorexia nervosa. The findings indicate that the anorexic voice is an important element of the psychopathology of anorexia nervosa. Addressing the anorexic voice might be helpful in enhancing outcomes of treatments for anorexia nervosa, but that conclusion might apply only to patients with more severe eating psychopathology. Copyright © 2016 John Wiley & Sons, Ltd. Experiences of an internal 'anorexic voice' are common in anorexia nervosa. Clinicians should consider the role of the voice when formulating eating pathology in anorexia nervosa, including how individuals perceive and relate to that voice. Addressing the voice may be beneficial, particularly in more severe and enduring forms of anorexia nervosa. When working with the voice, clinicians should aim to address both the content of the voice and how
We used perceptual aftereffects induced by adaptation with anti-voice stimuli to investigate voice identity representations. Participants learned a set of voices then were tested on a voice identification task with vowel stimuli morphed between identities, after different conditions of adaptation. In Experiment 1, participants chose the identity opposite to the adapting anti-voice significantly more often than the other two identities (e.g., after being adapted to anti-A, they identified the average voice as A). In Experiment 2, participants showed a bias for identities opposite to the adaptor specifically for anti-voice, but not for non anti-voice adaptors. These results are strikingly similar to adaptation aftereffects observed for facial identity. They are compatible with a representation of individual voice identities in a multidimensional perceptual voice space referenced on a voice prototype.
Rajput, Sudheesh K; Matoba, Osamu
We propose an optical voice encryption scheme based on digital holography (DH). An off-axis DH is employed to acquire voice information by obtaining the phase retardation that occurs in the object wave due to sound wave propagation. The acquired hologram, including the voice information, is encrypted using optical image encryption. DH reconstruction and decryption with all the correct parameters can retrieve the original voice. The scheme can record the human voice in holograms and encrypt it directly. These aspects make the scheme suitable for other security applications and help to use the voice as a potential security tool. We present experimental results and some simulation results.
Vivien Arief Wardhany
The purpose of this multimedia device development is control through voice. Currently, voice commands can be recognized only in English. To overcome this issue, recognition was implemented using an Indonesian language model, acoustic model, and dictionary. The automatic speech recognizer was built using the CMU Sphinx engine, with its English-language database modified for Indonesian, and XBMC was used as the multimedia player. The experiment involved 10 volunteers, who tested 7 commands. The volunteers were classified by gender: 5 male and 5 female. Ten samples were taken for each command, after which each volunteer performed 10 test commands. Each volunteer also had to try all 7 commands provided. Based on the percentage classification table, the word “kanan” was recognized most often, at 83%, while “pilih” was recognized least often. The word most often misclassified was “kembali”, at 67%, while “kanan” was misclassified least often. In the recognition rate (RR) results for male voices, several commands, such as “kembali”, “utama”, “atas” and “bawah”, had a low recognition rate. In particular, “kembali” could not be recognized as a command in female voices, whereas in male voices it had an RR of 4%; this is because there is no English word similar to “kembali”, so the system fails to recognize the command. The command “pilih” had an RR of 80% with female voices but only 4% with male voices. This problem is mostly due to the different voice characteristics of adult males and females: males have lower voice frequencies (85 to 180 Hz) than females (165 to 255 Hz). The results of the experiment showed that each man had a different recognition rate, caused by differences in tone, pronunciation, and speed of speech. Further work is needed to improve the accuracy of the Indonesian automatic speech recognition system.
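The recognition rate (RR) quoted in this abstract is presumably the share of test utterances of a command that the recognizer labels correctly; the abstract does not give the formula, so that definition is an assumption. A minimal Python sketch under that assumption follows. The function name and the trial list are hypothetical placeholders, not data from the study.

```python
from collections import defaultdict

def recognition_rates(trials):
    """Return {command: percent of utterances recognized correctly}.

    trials: iterable of (spoken_command, recognized_command) pairs.
    Assumes RR = 100 * correct / total for each spoken command.
    """
    total = defaultdict(int)
    correct = defaultdict(int)
    for spoken, recognized in trials:
        total[spoken] += 1
        if spoken == recognized:
            correct[spoken] += 1
    return {cmd: 100.0 * correct[cmd] / total[cmd] for cmd in total}

# Hypothetical trials for illustration only (not the paper's data).
example_trials = [
    ("kanan", "kanan"), ("kanan", "kanan"), ("kanan", "atas"), ("kanan", "kanan"),
    ("kembali", "atas"), ("kembali", "kembali"),
]
rates = recognition_rates(example_trials)
```

Splitting the trial list by speaker gender before calling the function would give per-gender rates of the kind the abstract compares.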
Bahreini, Kiavash; Nadolski, Rob; Westera, Wim
This paper presents the voice emotion recognition part of the FILTWAM framework for real-time emotion recognition in affective e-learning settings. FILTWAM (Framework for Improving Learning Through Webcams And Microphones) intends to offer timely and appropriate online feedback based upon learner's vocal intonations and facial expressions in order…
As the primary means of communication, voice plays an important role in daily life. Voice also conveys personal information such as social status, personal traits, and the emotional state of the speaker. Mechanically, voice production involves complex fluid-structure interaction within the glottis and its control by laryngeal muscle activation. An important goal of voice research is to establish a causal theory linking voice physiology and biomechanics to how speakers use and control voice to communicate meaning and personal information. Establishing such a causal theory has important implications for clinical voice management, voice training, and many speech technology applications. This paper provides a review of voice physiology and biomechanics, the physics of vocal fold vibration and sound production, and laryngeal muscular control of the fundamental frequency of voice, vocal intensity, and voice quality. Current efforts to develop mechanical and computational models of voice production are also critically reviewed. Finally, issues and future challenges in developing a causal theory of voice production and perception are discussed.
Harm, D. L.; Taylor, L. C.
Two critical and unresolved human factors issues in VR systems are: 1) potential "cybersickness", a form of motion sickness which is experienced in virtual worlds, and 2) maladaptive sensorimotor performance following exposure to VR systems. Interestingly, these aftereffects are often quite similar to adaptive sensorimotor responses observed in astronauts during and/or following space flight. Most astronauts and cosmonauts experience perceptual and sensorimotor disturbances during and following space flight. All astronauts exhibit decrements in postural control following space flight. It has been suggested that training in virtual reality (VR) may be an effective countermeasure for minimizing perceptual and/or sensorimotor disturbances. People adapt to consistent, sustained alterations of sensory input such as those produced by microgravity, and experimentally-produced stimulus rearrangements (e.g., reversing prisms, magnifying lenses, flight simulators, and VR systems). Adaptation is revealed by aftereffects including perceptual disturbances and sensorimotor control disturbances. The purpose of the current study was to compare disturbances in postural control produced by dome and head-mounted virtual environment displays. Individuals recovered from motion sickness and the detrimental effects of exposure to virtual reality on postural control within one hour. Sickness severity and initial decrements in postural equilibrium decreased over days, which suggests that subjects become dual-adapted over time. These findings provide some direction for developing training schedules for VR users that facilitate adaptation, and address safety concerns about aftereffects.
Yamaguchi, Yasuo; Hanai, Tasuku
The Integrated Support Center for Nuclear Nonproliferation and Nuclear Security (ISCN) under the Japan Atomic Energy Agency (JAEA) began developing a Virtual Reality (VR) training system for teaching trainees nuclear security. ISCN set up two VR training courses by 2013. One teaches the nuclear security system of nuclear plants. The VR training system allows trainees to have the virtual experience of visiting a nuclear plant. Through this experience, trainees are able to learn how physical protection systems work in the plant. The course focuses on fundamental knowledge and is suitable for trainees with little experience in the field of nuclear security. The other teaches fundamental skills corresponding to a contingency plan in the Central Alarm Station (CAS) of a nuclear power plant. Computers of the VR training system deploy an intrusion scenario in a virtual space. Trainees sit as a group in front of 3-D screens and play a role-playing game in a virtual CAS. Through the exercise, trainees are able to learn the skills needed for a contingency at a nuclear plant. In my presentation, I will introduce the two training courses, the advantages and disadvantages of the VR training system, the reactions of trainees, and future plans. (author)
Jang, Seong-wook; Ko, Junho; Yoo, Yon-sik; Kim, Yoonsang
Recent medical virtual reality (VR) applications to minimize re-operations are being studied for improvements in surgical efficiency and reduction of operation error. The CT image acquisition method considering three-dimensional (3D) modeling for medical VR applications is important, because the realistic model is required for the actual human organ. However, the research for medical VR applications has focused on 3D modeling techniques and utilized 3D models. In addition, research on a CT image acquisition method considering 3D modeling has never been reported. The conventional CT image acquisition method involves scanning a limited area of the lesion for the diagnosis of doctors once or twice. However, the medical VR application is required to acquire the CT image considering patients' various postures and a wider area than the lesion. A wider area than the lesion is required because of the necessary process of comparing bilateral sides for dyskinesia diagnosis of the shoulder, pelvis, and leg. Moreover, patients' various postures are required due to the different effects on the musculoskeletal system. Therefore, in this paper, we perform a comparative experiment on the acquired CT images considering image area (unilateral/bilateral) and patients' postures (neutral/abducted). CT images are acquired from 10 patients for the experiments, and the acquired CT images are evaluated based on the length per pixel and the morphological deviation. Finally, by comparing the experiment results, we evaluate the CT image acquisition method for medical VR applications.
Lin, Chia-Hui; Peng, Po-Hsin; Ko, Chia-Yun; Markhart, Albert H; Lin, Tsai-Yun
A novel dehydrin gene (VrDhn1) was isolated from an embryo cDNA library of Vigna radiata (L.) Wilczek (mungbean) variety VC1973A. The intronless VrDhn1 gene encodes a protein belonging to the Y(2)K-type dehydrin family. VrDhn1 protein accumulated in embryos and cotyledons during seed maturation and disappeared 2 days after seed imbibition (DAI). The expression of VrDhn1 mRNA and accumulation of VrDhn1 protein were at high levels in mature seeds, but neither mRNA nor protein was detected in mungbean vegetative tissues under normal growth conditions. The VrDhn1 mRNA level was extremely high in mature seeds and decreased to ∼30% at 1 DAI, and was not detectable at ~7 DAI. Tissue dehydration, salinity and exogenous ABA markedly induced VrDhn1 transcripts in plants as measured by quantitative real-time reverse transcription-PCR (qRT-PCR). VrDhn1 protein was not detected using immunoblots in seedlings under stress treatments. In mature seeds or 1 DAI seedlings, VrDhn1 proteins were immunolocalized in the nucleus and cytoplasm. VrDhn1 exhibited low affinity for non-specific interaction with DNA using electrophoretic mobility shift assays (EMSAs), and the exogenous addition of Zn(2+) or Ni(2+) stimulated interaction. The His-tagged VrDhn1 (30.17 kDa) protein showed a molecular mass of 63.1 kDa on gel filtration, suggesting a dimer form. This is the first report showing that a Y(2)K-type VrDhn1 enters the nucleus and interacts with DNA during seed maturation.
Kuhn, Lisa Katharina; Wydell, Taeko; Lavan, Nadine; McGettigan, Carolyn; Garrido, Lúcia
[Correction Notice: An Erratum for this article was reported in Vol 17(6) of Emotion (see record 2017-18585-001). In the article, the copyright attribution was incorrectly listed and the Creative Commons CC-BY license disclaimer was incorrectly omitted from the author note. The correct copyright is "© 2017 The Author(s)" and the omitted disclaimer is below. All versions of this article have been corrected. "This article has been published under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Copyright for this article is retained by the author(s). Author(s) grant(s) the American Psychological Association the exclusive right to publish the article and identify itself as the original publisher."] Emotions are a vital component of social communication, carried across a range of modalities and via different perceptual signals such as specific muscle contractions in the face and in the upper respiratory system. Previous studies have found that emotion recognition impairments after brain damage depend on the modality of presentation: recognition from faces may be impaired whereas recognition from voices remains preserved, and vice versa. On the other hand, there is also evidence for shared neural activation during emotion processing in both modalities. In a behavioral study, we investigated whether there are shared representations in the recognition of emotions from faces and voices. We used a within-subjects design in which participants rated the intensity of facial expressions and nonverbal vocalizations for each of the 6 basic emotion labels. For each participant and each modality, we then computed a representation matrix with the intensity ratings of each emotion. These matrices allowed us to examine the patterns of confusions between emotions and to characterize the representations
Quick Statistics About Voice, Speech, Language. Voice, Speech, Language, and ... no 205. Hyattsville, MD: National Center for Health Statistics. 2015. Hoffman HJ, Li C-M, Losonczy K, ...
Iverson, Gregory K.; Ahn, Sang-Cheol
Assuming a framework of privative features, this paper interprets two apparently disparate phenomena in English phonology as structurally related: the lexically specific voicing of fricatives in plural nouns like wives or thieves and the prosodically governed “flapping” of medial /t/ (and /d/) in North American varieties, which we claim is itself not a rule per se, but rather a consequence of the laryngeal weakening of fortis /t/ in interaction with speech-rate determined segmental abbreviation. Taking as our point of departure the Dimensional Theory of laryngeal representation developed by Avery & Idsardi (2001), along with their assumption that English marks voiceless obstruents but not voiced ones (Iverson & Salmons 1995), we find that an unexpected connection between fricative voicing and coronal flapping emerges from the interplay of familiar phonemic and phonetic factors in the phonological system. PMID:18496590
Where am I? Or as the young boy in Jules Verne’s Journey to the Centre of the Earth calls back to his distant-voiced companions: ‘Lost… in the most intense darkness.’ ‘Then I understood it,’ says the boy, Axel, ‘To make them hear me, all I had to do was to speak with my mouth close to the wall, which would serve to conduct my voice, as the wire conducts the electric fluid’ (Verne 1864). By timing their calls, the group of explorers work out that Axel is separated from them by a distance of four miles, held in a cavernous vertical gallery of smooth rock. Feeling his way down towards the others, the boy ends up falling, along with his voice, through the space. Losing consciousness he seems to give himself up to the space...
Mølgaard, Lasse Lohilahti; Jørgensen, Kasper Winther
Speaker recognition is basically divided into speaker identification and speaker verification. Verification is the task of automatically determining if a person really is the person he or she claims to be. This technology can be used as a biometric feature for verifying the identity of a person...
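The distinction drawn above between verification (a one-to-one check against a claimed identity) and identification (a one-to-many search) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not describe a scoring method, so the fixed-length feature vectors, the cosine-similarity score, the 0.8 threshold, and the speaker names are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verify(claimed_id, sample, enrolled, threshold=0.8):
    """Verification (1:1): accept only if the sample is close enough
    to the template of the speaker the person claims to be."""
    return cosine_similarity(sample, enrolled[claimed_id]) >= threshold

def identify(sample, enrolled):
    """Identification (1:N): return the enrolled speaker whose
    template scores highest against the sample."""
    return max(enrolled, key=lambda sid: cosine_similarity(sample, enrolled[sid]))

# Hypothetical enrolled templates and test sample (illustrative values).
enrolled = {"alice": [1.0, 0.0, 0.2], "bob": [0.0, 1.0, 0.1]}
sample = [0.9, 0.1, 0.2]
accepted_as_alice = verify("alice", sample, enrolled)
accepted_as_bob = verify("bob", sample, enrolled)
best_match = identify(sample, enrolled)
```

Verification needs a threshold because it must be able to reject an impostor, whereas this simple identification routine always returns some enrolled speaker.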
Campeanu, Sandra; Craik, Fergus I M; Alain, Claude
Speaker's voice occupies a central role as the cornerstone of auditory social interaction. Here, we review the evidence suggesting that speaker's voice constitutes an integral context cue in auditory memory. Investigation into the nature of voice representation as a memory cue is essential to understanding auditory memory and the neural correlates which underlie it. Evidence from behavioral and electrophysiological studies suggests that while specific voice reinstatement (i.e., same speaker) often appears to facilitate word memory even without attention to voice at study, the presence of a partial benefit of similar voices between study and test is less clear. In terms of explicit memory experiments utilizing unfamiliar voices, encoding methods appear to play a pivotal role. Voice congruency effects have been found when voice is specifically attended at study (i.e., when relatively shallow, perceptual encoding takes place). These behavioral findings coincide with neural indices of memory performance such as the parietal old/new recollection effect and the late right frontal effect. The former distinguishes between correctly identified old words and correctly identified new words, and reflects voice congruency only when voice is attended at study. Characterization of the latter likely depends upon voice memory, rather than word memory. There is also evidence to suggest that voice effects can be found in implicit memory paradigms. However, the presence of voice effects appears to depend greatly on the task employed. Using a word identification task, perceptual similarity between study and test conditions is, as for explicit memory tests, crucial. In addition, the type of noise employed appears to have a differential effect. While voice effects have been observed when white noise is used at both study and test, using multi-talker babble does not confer the same results. In terms of neuroimaging research modulations, characterization of an implicit memory effect
Ozturk, Kayhan; Erdur, Omer; Kibar, Ertugrul
The authors present a patient with quadriplegia caused by a cervical spine abscess following voice prosthesis replacement. This is the first reported case of permanent quadriplegia caused by voice prosthesis replacement. The authors emphasize that life-threatening complications may arise during the replacement of a voice prosthesis. Care should be taken during the replacement, and if problems are encountered during the procedure, patients must be followed closely.
Gutiérrez-Maldonado, J; Rus-Calafell, M; Márquez-Rejón, S; Ribas-Sabaté, J
Emotion recognition is known to be impaired in schizophrenia patients. Although cognitive deficits and symptomatology have been associated with this impairment there are other patient characteristics, such as alexithymia, which have not been widely explored. Emotion recognition is normally assessed by means of photographs, although they do not reproduce the dynamism of human expressions. Our group has designed and validated a virtual reality (VR) task to assess and subsequently train schizophrenia patients. The present study uses this VR task to evaluate the impaired recognition of facial affect in patients with schizophrenia and to examine its association with cognitive deficit and the patients' inability to express feelings. Thirty clinically stabilized outpatients with a well-established diagnosis of schizophrenia or schizoaffective disorder were assessed in neuropsychological, symptomatic and affective domains. They then performed the facial emotion recognition task. Statistical analyses revealed no significant differences between the two presentation conditions (photographs and VR) in terms of overall errors made. However, anger and fear were easier to recognize in VR than in photographs. Moreover, strong correlations were found between psychopathology and the errors made.
Hughes, Susan M; Harrison, Marissa A
Previous research shows that the human voice can communicate a wealth of nonsemantic information; preferences for voices can predict health, fertility, and genetic quality of the speaker, and people often use voice attractiveness, in particular, to make these assessments of others. But it is not known what we think of the attractiveness of our own voices as others hear them. In this study eighty men and women rated the attractiveness of an array of voice recordings of different individuals and were not told that their own recorded voices were included in the presentation. Results showed that participants rated their own voices as sounding more attractive than others had rated their voices, and participants also rated their own voices as sounding more attractive than they had rated the voices of others. These findings suggest that people may engage in vocal implicit egotism, a form of self-enhancement.
Sheikh, S.A.; Ahmed, I.; Unar, M.A.
The primary means of generator reactive power control is generator excitation control, using an Automatic Voltage Regulator (AVR). The role of the AVR is to hold the terminal voltage magnitude of a synchronous generator at a specified level. This paper presents the design of a proportional-integral-derivative (PID) controller as an AVR. The PID controller has been tuned by various tuning methods. In each method, the PID parameters are computed through various techniques, i.e. process-reaction curve, closed-loop system, and open-loop system gain-margin and phase-margin specifications. From these methods, it has been found that the Zhuang-Atherton method and the Ho, Hang and Cao method are much superior to the conventional Ziegler-Nichols rules. The performance of the controller has been evaluated through simulation studies in the MATLAB environment. It has been demonstrated that the PID controller, tuned with the said methods, yields highly satisfactory closed-loop performance. (author)
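As a point of reference for the tuning comparison above, the conventional Ziegler-Nichols closed-loop rule sets the PID gains from the ultimate gain Ku and ultimate period Tu. The Python sketch below shows that baseline rule together with a toy discrete PID loop holding an output at a setpoint. It is a sketch under stated assumptions, not the paper's MATLAB study: the first-order plant, the hand-picked gains used in the simulation, and all function names are hypothetical, and the Zhuang-Atherton and Ho-Hang-Cao formulas are not reproduced here.

```python
def ziegler_nichols_pid(ku, tu):
    """Classic Ziegler-Nichols closed-loop PID rule.

    ku: ultimate gain, tu: ultimate oscillation period (seconds).
    Returns (kp, ki, kd) for the parallel form u = kp*e + ki*int(e) + kd*de/dt.
    """
    kp = 0.6 * ku
    ki = kp / (tu / 2.0)  # integral time Ti = Tu / 2
    kd = kp * (tu / 8.0)  # derivative time Td = Tu / 8
    return kp, ki, kd

def simulate_pid(kp, ki, kd, setpoint=1.0, tau=1.0, gain=1.0, dt=0.01, t_end=20.0):
    """Euler simulation of a PID loop around an assumed first-order plant
    tau * dy/dt = -y + gain * u (a stand-in for the generator, not a real AVR model).
    Returns the output after t_end seconds."""
    y = 0.0
    prev_y = 0.0
    integral = 0.0
    for _ in range(round(t_end / dt)):
        error = setpoint - y
        integral += error * dt
        derivative = -(y - prev_y) / dt  # derivative on measurement avoids setpoint kick
        u = kp * error + ki * integral + kd * derivative
        prev_y = y
        y += dt * (-y + gain * u) / tau
    return y

# Ziegler-Nichols gains for hypothetical ultimate values Ku = 10, Tu = 2 s.
zn_gains = ziegler_nichols_pid(10.0, 2.0)

# Hand-picked illustrative gains (not Ziegler-Nichols values) for the toy plant.
final_output = simulate_pid(kp=2.0, ki=1.0, kd=0.1)
```

The integral term is what removes the steady-state error, so the simulated output settles at the setpoint; this mirrors the AVR's task of holding terminal voltage at a specified level.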
Advanced management techniques and a Decision Support System (DSS) are needed to solve the problems of nuclear reactor decommissioning decision-making. In this study, a new kind of DSS technique for nuclear reactor decommissioning is introduced. It is based on Virtual Reality (VR) and a Geographic Information System (GIS), combined with scientific management methods, operational research, cybernetics and behavioral science. The proposed DDSS (Decommissioning Decision Support System) can provide the decision-maker with a real-time 3-D virtual environment, GIS information and background material on the reactor being decommissioned, and can help to ascertain the decision-making target, modify the decision module and optimize the dismantling plan. The data from three modules (VR Environment Module, VR-DOSE Management Module and Route Layout GIS Module) are used to continuously update and display the statistics at the same time, and the final advice is given to the decision-maker. (authors)
Virtual reality (VR) applications are transforming the way architecture is conceived and produced. By introducing an open and inclusive approach, they encourage a creative dialogue with the users of residential schemes and other buildings and allow competition juries a more thorough understanding...... of architectural concepts. Architects need to heed the dynamics set in motion by these technologies and especially of how laypersons interpret building forms and their simulations in interactive VR environments. The article presents a study which compares aspects of spatial perception in a physical environment...... contextual experience of the viewer, and that spatial ability is an important contributing factor. Results in the two virtual environments tested show consistent differences in how depth and shape are perceived, indicating that VR context is a significant variable in spatial representation. It is asserted...
Broadcast voices are technologically manipulated. In order to achieve a certain authenticity or sound of “reality”, the voices are, paradoxically, filtered and trained in order to reach the listeners. This “mise-en-scène” is important knowledge when it comes to the development of a consistent method o...... of analysis of the mediated voice...
Morrow, Sharon L.
Teachers represent the largest group of occupational voice users and have voice-related problems at a rate of over twice that found in the general population. Among teachers, music teachers are roughly four times more likely than classroom teachers to develop voice-related problems. Although it has been established that music teachers use their…
Bigley, Andrew N; Mabanglo, Mark F; Harvey, Steven P; Raushel, Frank M
The V-type organophosphorus nerve agents are among the most hazardous compounds known. Previous efforts to evolve the bacterial enzyme phosphotriesterase (PTE) for the hydrolytic decontamination of VX resulted in the identification of the variant L7ep-3a, which has a kcat value more than 2 orders of magnitude higher than that of wild-type PTE for the hydrolysis of VX. Because of the relatively small size of the O-ethyl, methylphosphonate center in VX, stereoselectivity is not a major concern. However, the Russian V-agent, VR, contains a larger O-isobutyl, methylphosphonate center, making stereoselectivity a significant issue since the SP-enantiomer is expected to be significantly more toxic than the RP-enantiomer. The three-dimensional structure of the L7ep-3a variant was determined to a resolution of 2.01 Å (PDB id: 4ZST ). The active site of the L7ep-3a mutant has revealed a network of hydrogen bonding interactions between Asp-301, Tyr-257, Gln-254, and the hydroxide that bridges the two metal ions. A series of new analogues that mimic VX and VR has helped to identify critical structural features for the development of new enzyme variants that are further enhanced for the catalytic detoxification of VR and VX. The best of these mutants has been shown to have a reversed stereochemical preference for the hydrolysis of VR-chiral center analogues. This mutant hydrolyzes the two enantiomers of VR 160- and 600-fold faster than wild-type PTE hydrolyzes the SP-enantiomer of VR.
In this paper, a server-based solution for a multi-thread, large-vocabulary automatic speech recognition engine is described, along with practical application examples for Android OS and HTML5. The basic idea was to make speech recognition available to a full variety of applications for computers and especially for mobile devices. The speech recognition engine should be independent of commercial products and services (where the dictionary cannot be modified). Using third-party services could also be a security and privacy problem in specific applications in which unsecured audio data must not be sent to uncontrolled environments (voice data transferred to servers around the globe). Using our experience with speech recognition applications, we have been able to construct a multi-thread, server-based speech recognition solution with a simple application programming interface (API) to a speech recognition engine modified to the specific needs of a particular application.
Ruotsalainen, J H; Sellman, J; Lehto, L; Jauhiainen, M; Verbeek, J H
Poor voice quality due to a voice disorder can lead to a reduced quality of life. In occupations where voice use is substantial it can lead to periods of absence from work. To evaluate the effectiveness of interventions to prevent voice disorders in adults. We searched MEDLINE (PubMed, 1950 to 2006), EMBASE (1974 to 2006), CENTRAL (The Cochrane Library, Issue 2 2006), CINAHL (1983 to 2006), PsychINFO (1967 to 2006), Science Citation Index (1986 to 2006) and the Occupational Health databases OSH-ROM (to 2006). The date of the last search was 05/04/06. Randomised controlled clinical trials (RCTs) of interventions evaluating the effectiveness of treatments to prevent voice disorders in adults. For work-directed interventions interrupted time series and prospective cohort studies were also eligible. Two authors independently extracted data and assessed trial quality. Meta-analysis was performed where appropriate. We identified two randomised controlled trials including a total of 53 participants in intervention groups and 43 controls. One study was conducted with teachers and the other with student teachers. Both trials were poor quality. Interventions were grouped into 1) direct voice training, 2) indirect voice training and 3) direct and indirect voice training combined. 1) Direct voice training: One study did not find a significant decrease of the Voice Handicap Index for direct voice training compared to no intervention. 2) Indirect voice training: One study did not find a significant decrease of the Voice Handicap Index for indirect voice training when compared to no intervention. 3) Direct and indirect voice training combined: One study did not find a decrease of the Voice Handicap Index for direct and indirect voice training combined when compared to no intervention. The same study did however find an improvement in maximum phonation time (Mean Difference -3.18 sec; 95% CI -4.43 to -1.93) for direct and indirect voice training combined when compared to no
Murphy, Dooley Joel
This preliminary study surveys whether/which avatar body parts are visible in first-wave consumer virtual reality (VR) applications for the HTC Vive (n = 200). A simple coding schema for assessing avatar bodily coherence (ABC) is piloted and evaluated. Results provide a snapshot of ABC in popular...... high-end VR applications in Q3 2016. It is reported (Table 1) that 86.5% of sampled items feature fully invisible avatars, 9% depict hands only, and 4.5% feature a head, torso, or legs, but still with some degree of bodily incoherence. Findings suggest that users may experience a sense of ownership and...
Lady Catherine Cantor Cutiva
Objectives: To characterize the objective voice parameters among school workers, and to identify associated factors of three objective voice parameters, namely fundamental frequency (F0), sound pressure level (SPL) and maximum phonation time (MPT). Materials and methods: We conducted a cross-sectional study among 116 Colombian teachers and 20 Colombian non-teachers. After signing the informed consent form, participants filled out a questionnaire. Then, a voice sample was recorded and evaluated perceptually by a speech therapist and by objective voice analysis with Praat software. Short-term environmental measurements of sound level, temperature, humidity, and reverberation time were conducted during visits to the workplaces, such as classrooms and offices. Linear regression analysis was used to determine associations between individual and work-related factors and objective voice parameters. Results: Compared with men, women had higher fundamental frequency (201 Hz for teachers and 209 Hz for non-teachers vs. 120 Hz for teachers and 127 Hz for non-teachers) and sound pressure level (82 dB vs. 80 dB), and shorter maximum phonation time (around 14 seconds vs. around 16 seconds). Female teachers younger than 50 years of age showed a significant tendency to speak with lower fundamental frequency and shorter MPT compared with female teachers older than 50 years of age. Female teachers had significantly higher fundamental frequency (66 Hz), higher sound pressure level (2 dB) and shorter maximum phonation time (2 seconds) than male teachers. Conclusion: Female teachers younger than 50 years of age had significantly lower F0 and shorter MPT compared with those older than 50 years of age. The multivariate analysis showed that gender was a much more important determinant of variations in F0, SPL and MPT than age and teaching occupation. Objectively measured temperature also contributed to the changes in SPL among school workers.
Pisanski, Katarzyna; Oleszkiewicz, Anna; Sorokowska, Agnieszka
Vocal tract resonances provide reliable information about a speaker's body size that human listeners use for biosocial judgements as well as speech recognition. Although humans can accurately assess men's relative body size from the voice alone, how this ability is acquired remains unknown. In this study, we test the prediction that accurate voice-based size estimation is possible without prior audiovisual experience linking low frequencies to large bodies. Ninety-one healthy congenitally or early blind, late blind and sighted adults (aged 20-65) participated in the study. On the basis of vowel sounds alone, participants assessed the relative body sizes of male pairs of varying heights. Accuracy of voice-based body size assessments significantly exceeded chance and did not differ among participants who were sighted, or congenitally blind or who had lost their sight later in life. Accuracy increased significantly with relative differences in physical height between men, suggesting that both blind and sighted participants used reliable vocal cues to size (i.e. vocal tract resonances). Our findings demonstrate that prior visual experience is not necessary for accurate body size estimation. This capacity, integral to both nonverbal communication and speech perception, may be present at birth or may generalize from broader cross-modal correspondences. © 2016 The Author(s).
Lawrence, Debbie J.; Hettchen, William
The Voice Activated Information System (VAIS), developed by USACERL, allows inspectors to verbally log on-site inspection reports on a hand-held tape recorder. The tape is later processed by the VAIS, which enters the information into the system's database and produces a written report. The Voice Operated Information System (VOIS), developed by USACERL and Automated Sciences Group through a USACERL cooperative research and development agreement (CRDA), is an improved voice recognition system based on the concepts and function of the VAIS. To determine the applicability of the VOIS to Corps of Engineers construction projects, Technology Transfer Test Bed (T3B) funds were provided to the Corps of Engineers National Security Agency (NSA) Area Office (Fort Meade) to procure and implement the VOIS, and to train personnel in its use. This report summarizes the NSA application of the VOIS to quality assurance inspection of radio frequency shielding and to progress payment logs, and concludes that the VOIS is an easily implemented system that can offer improvements when applied to repetitive inspection procedures. Use of VOIS can save time during inspection, improve documentation storage, and provide flexible retrieval of stored information.
Pattern recognition aims to classify data (patterns) based either on a priori knowledge or on statistical information extracted from the data. In this paper we will concentrate on statistical pattern recognition using a new probabilistic approach which makes it possible to select the so-called 'informative' features. We develop a pattern recognition algorithm which is based on the conditional independence structure underlying the statistical data. Our method was successfully applied to a real problem of recognizing Parkinson's disease on the basis of voice disorders.
Paulo Eduardo Przysiezny
INTRODUCTION: Dysphonia is the main symptom of the disorders of oral communication. However, voice disorders also present with other symptoms such as difficulty in maintaining the voice (asthenia), vocal fatigue, variation in habitual vocal fundamental frequency, hoarseness, lack of vocal volume and projection, loss of vocal efficiency, and weakness when speaking. There are several proposals for the etiologic classification of dysphonia: functional, organofunctional, organic, and work-related voice disorder (WRVD). OBJECTIVE: To conduct a literature review on WRVD and on the current Brazilian labor legislation. METHODS: This was a review article with bibliographical research conducted on the PubMed and Bireme databases, using the terms "work-related voice disorder", "occupational dysphonia", "dysphonia and labor legislation", and a review of relevant labor and social security laws. CONCLUSION: WRVD is a situation that is frequently listed as a reason for work absenteeism, functional rehabilitation, or prolonged absence from work. Currently, forensic physicians have no comparative parameters to help with the analysis of vocal disorders. In certain situations WRVD may cause work disability. This disorder may be labor-related, or be an adjuvant factor to work-related diseases.
Heesche, Bjarke; MacDonald, Ewen; Fogh, Rune
This paper describes a voice sensor, suitable for modular robotic systems, which estimates the energy and fundamental frequency, F0, of the user’s voice. Through a number of example applications and tests with children, we observe how the voice sensor facilitates playful interaction between child...... children and two different robot configurations. In future work, we will investigate if such a system can motivate children to improve voice control and explore how to extend the sensor to detect emotions in the user’s voice....
Introduction: Voice disorders are a well-known complication often associated with thyroid gland diseases, and because voice is still the basic means of communication it is very important to maintain healthy voice quality. Objectives: This study asked whether there is a statistically significant difference between results of voice self-assessment, perceptual voice assessment and acoustic voice analysis before and after thyroidectomy, and whether there are statistically significant correlations between variables of voice self-assessment, perceptual assessment and acoustic analysis before and after thyroidectomy. Methods: This research included 12 participants aged between 41 and 76. Voice self-assessment was conducted with the Croatian version of the Voice Handicap Index (VHI). Recorded reading samples were used for perceptual assessment and later evaluated by two clinical speech and language therapists. Recorded samples of phonation were used for acoustic analysis, which was conducted with the acoustic program Praat. All of the data were processed through descriptive statistics and nonparametric statistical methods. Results: Results showed that there are statistically significant differences between results of voice self-assessments and results of acoustic analysis before and after thyroidectomy. Statistically significant correlations were found between variables of perceptual assessment and acoustic analysis. Conclusion: The results indicate the importance of multidimensional, preoperative and postoperative assessment. This kind of assessment allows the clinician to describe all of the voice features and provides the patient with appropriate recommendations for further rehabilitation in order to optimize voice outcomes.
Ford, W.; Shirk, D.G.
The advent of microprocessors and other large-scale integration (LSI) circuits is making voice input and output for computers and instruments practical; specialized LSI chips for speech processing are appearing on the market. Voice can be used to input data or to issue instrument commands; this allows the operator to engage in other tasks, move about, and to use standard data entry systems. Voice synthesizers can generate audible, easily understood instructions. Using voice characteristics, a control system can verify speaker identity for security purposes. Two simple voice-controlled systems have been designed at Los Alamos for nuclear safeguards applications. Each can easily be expanded as time allows. The first system is for instrument control; it accepts voice commands and issues audible operator prompts. The second system is for access control. The speaker's voice is used to verify his identity and to actuate external devices.
At present, the development of Virtual Reality (VR) technology is expanding due to the importance and needs to use the 3D elements and 360 degrees panorama in expressing a clearer picture to consumers in various fields such as education, military, medicine, entertainment and so on. The web based VR kiosk project in Darulaman's Teacher Training…
Beatson, Scott A.; Ben Zakour, Nouri L.; Totsika, Makrina
the evolution and molecular mechanisms that underpin ABU, the genome of the ABU E. coli strain VR50 was sequenced. Analysis of the complete genome indicated that it most resembles E. coli K-12, with the addition of a 94-kb genomic island (GI-VR50-pheV), eight prophages, and multiple plasmids. GI-VR50-pheV has...... a mosaic structure and contains genes encoding a number of UTI-associated virulence factors, namely, Afa (afimbrial adhesin), two autotransporter proteins (Ag43 and Sat), and aerobactin. We demonstrated that the presence of this island in VR50 confers its ability to colonize the murine bladder, as a VR50...... mutant with GI-VR50-pheV deleted was attenuated in a mouse model of UTI in vivo. We established that Afa is the island-encoded factor responsible for this phenotype using two independent deletion (Afa operon and AfaE adhesin) mutants. E. coli VR50afa and VR50afaE displayed significantly decreased ability...
... agency for services, although an EN and a State VR agency may want to enter into an individualized agreement to meet the needs of a single beneficiary. ...' Participation Agreements Between Employment Networks and State Vr Agencies § 411.410 Does each referral from an...
Spanish, syntax, grammaticalisation, past participle, passive voice, middle voice, language development......
Forensic speaker recognition is experiencing a remarkable paradigm shift in terms of the evaluation framework and presentation of voice evidence. This paper proposes a new method of forensic automatic speaker recognition using the likelihood ratio framework to quantify the strength of voice evidence. The proposed method uses a reference database to calculate the within- and between-speaker variability. Some acoustic-phonetic features are extracted automatically using the software VoiceSauce. The effectiveness of the approach was tested using two Mandarin databases: a mobile telephone database and a landline database. The experimental results indicate that these acoustic-phonetic features do have some discriminating potential and are worth trying in discrimination. The automatic acoustic-phonetic features have acceptable discriminative performance and can provide more reliable results in evidence analysis when fused with other kinds of voice features.
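The likelihood ratio framework named in this abstract can be illustrated with a minimal sketch. It assumes a single scalar feature and Gaussian models for within- and between-speaker variability; the function names and all numbers are hypothetical and are not taken from the paper.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def likelihood_ratio(evidence, suspect_mean, within_std,
                     population_mean, between_std):
    """LR = p(evidence | same speaker) / p(evidence | different speaker).

    Assumption: the different-speaker model is a Gaussian over the whole
    population, whose variance is the between-speaker variance plus the
    within-speaker variance.
    """
    population_std = math.sqrt(between_std ** 2 + within_std ** 2)
    numerator = gaussian_pdf(evidence, suspect_mean, within_std)
    denominator = gaussian_pdf(evidence, population_mean, population_std)
    return numerator / denominator

# Hypothetical F0 values in Hz, chosen only for illustration.
lr_close = likelihood_ratio(121.0, 120.0, 5.0, 140.0, 20.0)
lr_far = likelihood_ratio(160.0, 120.0, 5.0, 140.0, 20.0)
```

Under these assumptions, a ratio above 1 supports the same-speaker hypothesis and a ratio below 1 supports the different-speaker hypothesis.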
Heidari Jozam, M.; Allameh, E.; Vries, de B.; Timmermans, H.J.P.; Masoud, M.; Andreev, S.; Balandin, S.; Yevgeni, Koucheryavy
In this paper, we propose a prototype of a user centered design system for Smart Homes which lets users: (1) configure different interactive tasks, and (2) express activity specifications and preferences during the design process. The main objective of this paper is how to create and to implement VR
Gao, Lili; Zhang, Yan; Deng, Jinghua; Yu, Wenbing; Yu, Yunxia
BACKGROUND Alzheimer disease (AD) is a chronic neurodegenerative disease that is one of the most prevalent health problems among seniors. The cause of AD has not yet been elucidated, but many risk factors have been identified that might contribute to the pathogenesis and prognosis of AD. We conducted a meta-analysis of studies involving CHAT, TFAM, and VR22 polymorphisms and AD susceptibility to further understand the pathogenesis of AD. MATERIAL AND METHODS PubMed/Medline, Embase, Web of Science, the Cochrane Library, and Google Scholar were searched for relevant articles. Rs1880676, rs2177369, rs3810950, and rs868750 of CHAT; rs1937 and rs2306604 of TFAM; and rs10997691 and rs7070570 of VR22 are studied in this meta-analysis. RESULTS A total of 51 case-control studies with 16 446 cases and 16 057 controls were enrolled. For CHAT, rs2177369 (G>A) in whites and rs3810950 (G>A) in Asians were found to be associated with AD susceptibility. No association was detected between rs1880676 and rs868750 and AD risk. For TFAM and VR22, no significant association was detected in studied single-nucleotide polymorphisms (SNPs). CONCLUSIONS Rs2177369 and rs3810950 of CHAT are associated with AD susceptibility, but rs1880676 and rs868750 are not. Rs1937 and rs2306604 of TFAM, and rs10997691 and rs7070570 of VR22 are not significantly associated with AD risk.
Cameron, Ian; Crosthwaite, Caroline; Norton, Christine; Balliu, Nicoleta; Tadé, Moses; Hoadley, Andrew; Shallcross, David; Barton, Geoff
This work presents a unique education resource for both process engineering students and the industry workforce. The learning environment is based around spherical imagery of real operating plants coupled with interactive embedded activities and content. This Virtual Reality (VR) learning tool has been developed by applying aspects of relevant…
Herbelin, Bruno; Grillon, Helena; De Heras Ciechomski, Pablo
This article presents a simple and intuitive way to represent the eye-tracking data gathered during immersive virtual reality exposure therapy sessions. Eye-tracking technology is used to observe gaze movements during virtual reality sessions and the gaze-map chromatic gradient coding allows to...... is fully compatible with different VR exposure systems and provides clinically meaningful data....
Alverson, Charlotte Y.; Yamamoto, Scott H.
This study utilized hierarchical linear modeling analysis of a 10-year extant dataset from Rehabilitation Services Administration to investigate significant predictors of employment outcomes for vocational rehabilitation (VR) clients with autism. Predictor variables were gender, ethnicity, attained education level, IEP status in high school,…
Schouten, H.J.; Brinkhuis, J.; Burgh, van der S.; Schaart, J.; Groenwold, R.; Broggini, G.A.L.; Gessler, C.
Apple scab, caused by Venturia inaequalis, is a serious disease of apple. Previously, the scab resistance Rvi15 (Vr2) from the accession GMAL 2473 was genetically mapped, and three candidate resistance genes were identified. Here, we report the cloning and functional characterization of these three
Lee, I. S.; Yoon, S. H.; Shim, K. W.; Yu, Y. H.; Suh, K. Y.
There continues to be an increasing demand of electricity around the globe to fuel the industrial growth and to promote the human welfare. The economic activities have brought about richness in our material and cultural lives, in which process the electric power has been at the heart of the versatile energy sources. In order to timely and competitively respond to rapidly changing energy environment in the twenty-first century there is a growing need to build the advanced nuclear power plants in the unlimited K, which were confirmed by FTIR and 51 V Ncommissioning. One can then realistically evaluate their construction time and cost per varying methods and options available from the leading-edge technology. In particular a great deal of efforts have yet to be made for time- and cost-dependent plant simulation and dynamically coupled database construction in the VR space. The operator training and personnel education may also benefit from the VR technology. The present work is being proposed in the three-dimensional space and time plus cost coordinates, i.e. four plus dimensional (4 + D) coordinates. The 4 + D VR application will enable the nuclear industry to narrow the technological gap from the other leading industries that have long since been employing the VR engineering. The 4 + D technology will help nurture public understanding of the special discipline of nuclear power plants. The technology will also facilitate public access to the knowledge on the nuclear science and engineering which has so far been monopolized by the academia, national laboratories and the heavy industry. The 4 + D virtual design and construction will open up the new horizon for revitalization of the nuclear industry over the globe in the foreseeable future. Considering the long construction and operation time for the nuclear power plants, the preliminary VR simulation capability for the plants will supply the vital information not only for the actual design and construction of the
Bell, R. E.; Turrin, M.; Frearson, N.; Boghosian, A.; Ferrini, V. L.; Simpson, F.
The geosciences are rich in imagery, making them compelling material for immersive teaching experiences. We often work in remote locations, places where few others are able to travel. Flat 2D images from the field have served explorers and scientists well, from the lantern slides brought back from Antarctica to the images scientists and educators now use in PowerPoint presentations. These images provide a backdrop to introduce the experience for formal classes and informal presentations. Our stories from the field bring the setting alive for the participants. The travelers presented and the audience passively listened. Immersive learning opportunities are much more powerful than lecturing. We have enlisted both VR and drone imagery to bring learners fully into the experience of science. A 360 VR image brings the viewer into the moment of discovery. Both have been shown to create an active learning setting fully under the learner's control; learners explore at their own pace, following their own interests. This learning 'sticks', becoming part of the participant's own unique experience in the space. We are building VR images of field experiences and VR data immersion experiences that will transport people into new locations, building a field experience that they can not only see but fully explore. Through VR we introduce new experiences that showcase our science, our careers and our collaborations. Users can spin the view up to see the helicopter landing in a remote field location by the ice. Spin to the right and see a colleague collecting a reading from instruments that have been pulled from the LC130 aircraft. Turn the view to the left and see the harsh windswept environment along the edge of an ice shelf. Look down and note that your feet are encased in snow boots to keep them warm and stable on the ice. The viewer is in the field as part of the science team. Learning in the classroom and through social media is now fully 360 and fully immersive.
Al-Qahtani, Noura H
To examine whether prenatal exposure to music and voice alters foetal behaviour and whether foetal response to music differs from that to human voice. A prospective observational study was conducted in 20 normal term pregnant mothers. Ten foetuses were exposed to music and voice for 15 s at different sound pressure levels to find the optimal setting for the auditory stimulation. Music, voice and sham were played to another 10 foetuses via a headphone on the maternal abdomen. The sound pressure level was 105 dB and 94 dB for music and voice, respectively. Computerised assessments of foetal heart rate and activity were recorded. Ninety actocardiograms were obtained for the whole group. One-way ANOVA followed by post hoc (Student-Newman-Keuls method) analysis was used to determine whether there is a significant difference in foetal response to music and voice versus sham. Foetuses responded with heart rate acceleration and motor response to both music and voice. This was statistically significant compared to sham. There was no significant difference between the foetal heart rate acceleration to music and voice. Prenatal exposure to music and voice alters the foetal behaviour. No difference was detected in foetal response to music and voice.
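The one-way ANOVA step mentioned in this abstract can be sketched as follows. This is a minimal illustration of how the F statistic is formed from between-group and within-group variability; the three groups stand in for music, voice and sham, and their values are invented for the example, not data from the study. The post hoc Student-Newman-Keuls step is not shown.

```python
def one_way_anova_f(groups):
    """Return the F statistic for a one-way ANOVA over a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variability of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical responses for three conditions (illustrative values only).
sham = [1.0, 2.0, 3.0]
voice = [2.0, 3.0, 4.0]
music = [5.0, 6.0, 7.0]
f_stat = one_way_anova_f([sham, voice, music])
```

A large F value indicates that the group means differ more than the within-group scatter would suggest; a significance decision additionally requires the F distribution with (k - 1, N - k) degrees of freedom.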
Voice Recognition (VR) facilitates human interaction with a machine. VR may be used to replace the manual task of pushing buttons on a wireless telephone keypad. This is particularly useful when the hands of the user are busy with other activities like driving a car. However, the VRS has several limitations. The VRS requires a lot of training and customization in order to be used effectively by individual users, as each individual has a different voice pattern. Besides the voice...
Sebe, Nicu; Cohen, Ira; Gevers, Theo; Huang, Thomas S.
Recent technological advances have enabled human users to interact with computers in ways previously unimaginable. Beyond the confines of the keyboard and mouse, new modalities for human-computer interaction such as voice, gesture, and force-feedback are emerging. Despite important advances, one necessary ingredient for natural interaction is still missing: emotions. Emotions play an important role in human-to-human communication and interaction, allowing people to express themselves beyond the verbal domain. The ability to understand human emotions is desirable for the computer in several applications. This paper explores new ways of human-computer interaction that enable the computer to be more aware of the user's emotional and attentional expressions. We present the basic research in the field and the recent advances in emotion recognition from facial, voice, and physiological signals, where the different modalities are treated independently. We then describe the challenging problem of multimodal emotion recognition and we advocate the use of probabilistic graphical models when fusing the different modalities. We also discuss the difficult issues of obtaining reliable affective data, obtaining ground truth for emotion recognition, and the use of unlabeled data.
Caranica, Alexandru; Burileanu, Corneliu
The novelty of this work relies on the application of an open source research software toolkit (CMU Sphinx) to train, build and evaluate a speech recognition system, with speaker-independent support, for voice-controlled hardware applications. Moreover, we propose to use the trained acoustic model to successfully decode offline voice commands on embedded hardware, such as an ARMv6 low-cost SoC, Raspberry PI. This type of single-board computer, mainly used for educational and research activities, can serve as a proof-of-concept software and hardware stack for low cost voice automation systems.
Ana Cristina Nunes Ruas
INTRODUCTION: Leishmaniasis is considered one of the six most important infectious diseases because of its high detection coefficient and ability to produce deformities. In most cases, mucosal leishmaniasis (ML) occurs as a consequence of cutaneous leishmaniasis. If left untreated, mucosal lesions can leave sequelae, interfering in the swallowing, breathing, voice and speech processes and requiring rehabilitation. OBJECTIVE: To describe the anatomical characteristics and voice quality of ML patients. MATERIALS AND METHODS: A descriptive cross-sectional study was conducted in a cohort of ML patients treated at the Laboratory for Leishmaniasis Surveillance of the Evandro Chagas National Institute of Infectious Diseases-Fiocruz, between 2010 and 2013. The patients were submitted to otorhinolaryngologic clinical examination by endoscopy of the upper airways and digestive tract and to speech-language assessment through directed anamnesis, auditory perception, phonation times and vocal acoustic analysis. The variables of interest were epidemiologic (sex and age) and clinical (lesion location, associated symptoms and voice quality). RESULTS: 26 patients under ML treatment and monitored by speech therapists were studied. 21 (81%) were male and five (19%) female, with ages ranging from 15 to 78 years (54.5 ± 15.0 years). The lesions were distributed in the following structures: 88.5% nasal, 38.5% oral, 34.6% pharyngeal and 19.2% laryngeal, with some patients presenting lesions in more than one anatomic site. The main complaint was nasal obstruction (73.1%), followed by dysphonia (38.5%), odynophagia (30.8%) and dysphagia (26.9%). 23 patients (84.6%) presented voice quality perturbations. Dysphonia was significantly associated with lesions in the larynx, pharynx and oral cavity. CONCLUSION: We observed that vocal quality perturbations are frequent in patients with mucosal leishmaniasis, even without laryngeal lesions; they are probably associated to disorders of some
Veltri, Daniel; Kamath, Uday; Shehu, Amarda
Bacterial resistance to antibiotics is a growing concern. Antimicrobial peptides (AMPs), natural components of innate immunity, are popular targets for developing new drugs. Machine learning methods are now commonly adopted by wet-laboratory researchers to screen for promising candidates. In this work we utilize deep learning to recognize antimicrobial activity. We propose a neural network model with convolutional and recurrent layers that leverage primary sequence composition. Results show that the proposed model outperforms state-of-the-art classification models on a comprehensive data set. By utilizing the embedding weights, we also present a reduced-alphabet representation and show that reasonable AMP recognition can be maintained using nine amino-acid types. Models and data sets are made freely available through the Antimicrobial Peptide Scanner vr.2 web server at: www.ampscanner.com. email@example.com for general inquiries and firstname.lastname@example.org for web server information. Supplementary data are available at Bioinformatics online.
Niebudek-Bogusz, Ewa; Fiszer, Marta; Sliwińska-Kowalska, Mariola
Laryngovideostroboscopy is the method most frequently used in the assessment of voice disorders. However, the employment of quantitative methods, such as voice acoustic analysis, is essential for evaluating the effectiveness of prophylactic and therapeutic activities as well as for objective medical certification of larynx pathologies. The aim of this study was to examine voice acoustic parameters in female teachers with occupational voice diseases. Acoustic analysis (IRIS software) was performed in 66 female teachers, including 35 teachers with occupational voice diseases and 31 with functional dysphonia. The teachers with occupational voice diseases presented a lower average fundamental frequency (193 Hz) compared to the group with functional dysphonia (209 Hz) and to the normative value (236 Hz), whereas other acoustic parameters did not differ significantly between the two groups. Voice acoustic analysis, when applied separately from vocal loading, cannot be used as a testing method to verify the diagnosis of occupational voice disorders.
Jones, Benedict C; Feinberg, David R; DeBruine, Lisa M; Little, Anthony C; Vukovic, Jovana
Most previous studies of vocal attractiveness have focused on preferences for physical characteristics of voices such as pitch. Here we examine the content of vocalizations in interaction with such physical traits, finding that vocal cues of social interest modulate the strength of men's preferences for raised pitch in women's voices. Men showed stronger preferences for raised pitch when judging the voices of women who appeared interested in the listener than when judging the voices of women ...
Chirico, Andrea; Lucidi, Fabio; De Laurentiis, Michele; Milanese, Carla; Napoli, Alessandro; Giordano, Antonio
Virtual reality (VR), a computer-generated virtual environment, has been increasingly used in the entertainment world becoming a very new evolving field, but VR technology has also found a variety of applications in the biomedical field. VR can offer to subjects a safe environment within which to carry on different interventions ranging from the rehabilitation of discharged patients directly at home, to the support of hospitalized patients during different procedures and also of oncological inpatient subjects. VR appears as a promising tool for support and monitoring treatments in cancer patients influencing psychological and physiological functions. The aim of this systematic review is to provide an overview of all the studies that used VR intervention on cancer patients and analyze their main findings. Nineteen studies across nearly a thousand articles were identified that explored effects of VR interventions on cancer patients. Although these studies varied greatly in setting and design, this review identified some overarching themes. Results found that VR improved patients' emotional well-being, and diminished cancer-related psychological symptoms. The studies explored various relevant variables including different types of settings (i.e., during chemotherapy, during pain procedures, during hospitalization). Here, we point to the need of a global and multi-disciplinary approach aimed at analyzing the effects of VR taking advantage of the new technology systems like biosensors as well as electroencephalogram monitoring pre, during, and after intervention. Devoting more attention to bio-physiological variables, standardized procedures, extending duration to longitudinal studies and adjusting for motion sickness related to VR treatment need to become standard of this research field. © 2015 Wiley Periodicals, Inc.
Patole Milind S
Background: Lactobacillus plantarum is considered a safe and effective probiotic microorganism. Among various sources of isolation, traditionally fermented foods are considered to be rich in Lactobacillus spp., which can be exploited for their probiotic attributes. The antibacterial property of L. plantarum has been demonstrated against various enteric pathogens in both in vitro and in vivo systems. This study was aimed at characterizing L. plantarum isolated from Kutajarista, an ayurvedic fermented biomedicine, and assessing its antagonistic property against a common enteropathogen, Aeromonas veronii. Results: We report the isolation of L. plantarum (VR1) from Kutajarista, and the efficacy of its cell free supernatant (CFS) in amelioration of cytotoxicity caused by Aeromonas veronii. Regarding probiotic attributes, VR1 was tolerant to pH 2, 0.3% bile salts and simulated gastric juice. Additionally, VR1 also exhibited adhesive property to the human intestinal HT-29 cell line. Furthermore, CFS of VR1 was antibacterial to enteric pathogens like Pseudomonas aeruginosa, Staphylococcus aureus, Escherichia coli, Aeromonas veronii and clinical isolates of P. aeruginosa and E. coli. Detailed study regarding the effect of VR1 CFS on A. veronii cytotoxicity showed a significant decrease in vacuole formation and detrimental cellular changes in Vero cells. On the other hand, A. veronii CFS caused disruption of tight junction proteins ZO-1 and actin in the MDCK cell line, which was prevented by pre-incubation with CFS of VR1. Conclusions: This is the first study to report isolation of L. plantarum (VR1) from Kutajarista and characterisation of its probiotic attributes. Our study demonstrates the antagonistic property of VR1 to A. veronii and the effect of VR1 CFS in reduction of cellular damage caused by A. veronii in both Vero and MDCK cell lines.
Objective: Voice onset time (VOT) is known to be a cue for the distinction between voiced and voiceless stops, and it can be used to describe or categorize a range of developmental, neuromotor and linguistic disorders. The aim of this study is to determine standard values of voice onset time for the Azerbaijani language (Tabriz dialect). Materials & Methods: In this descriptive-analytical study, 30 Azeri participants, selected by convenience sampling, uttered 46 monosyllabic words beginning with 6 Azerbaijani stops twice. Using Praat software, the voice onset time values were analyzed by waveform and wideband spectrogram in milliseconds. Vowel effect, sex differences and the effect of place of articulation on VOT were evaluated, and data were analyzed by one-way ANOVA test. Results: There was no significant difference in voice onset time between male and female Azeri speakers (P<0.05). Vowel and place of articulation had significant correlation with voice onset time (P<0.001). Voice onset time values for /b/, /p/, /d/, /t/, /g/, /k/, and [c], [ɟ] allophones were 10.64, 86.88, 13.35, 87.09, 26.25, 100.62, 131.19, 63.18 milliseconds, respectively. Conclusion: Voice onset time values are the same for Azerbaijani men and women. However, as in many other languages, back and high vowels and back place of articulation lengthen VOT. Also, voiceless stops are aspirated in this language and voiced stops have positive VOT values.
Kim, Youngmoo E.
The singing voice is the oldest musical instrument, but its versatility and emotional power are unmatched. Through the combination of music, lyrics, and expression, the voice is able to affect us in ways that no other instrument can. The fact that vocal music is prevalent in almost all cultures is indicative of its innate appeal to the human aesthetic. Singing also permeates most genres of music, attesting to the wide range of sounds the human voice is capable of producing. As listeners we are naturally drawn to the sound of the human voice, and, when present, it immediately becomes the focus of our attention.
Pedersen, Inge Nygaard; Storm, Sanne
Aspects will be drawn on the human voice as tool for embodying our psychological and physiological state, and attempting integration of feelings. Presentations and dialogues on different methods and techniques in "Therapy related body-and voice work.", as well as the human voice as a tool for non...
Matejka, K.; Sklenka, L.
The VR-1 training reactor has been serving students of the Faculty of Nuclear Science and Physical Engineering, Czech Technical University in Prague, for more than 12 years now. The operation history of the reactor is highlighted. The major changes made at the VR-1 reactor are outlined and the main experimentally verified core configurations are shown. Some components of the new equipment installed on the VR-1 reactor are described in detail. The fields of application are shown: the reactor serves not only the training of university students within whole Czech Republic but also the training of specialists, research activities, and information programmes in the nuclear power domain. (P.A.)
Mean Foong, Oi; Low, Tang Jung; La, Wai Wan
The process of learning and understanding sign language may be cumbersome to some; this paper therefore proposes a voice (English language) to sign language translation system using speech and image processing techniques. Speech processing, which includes speech recognition, is the study of recognizing the words being spoken, regardless of who the speaker is. This project uses template-based recognition as the main approach, in which the V2S system first needs to be trained with speech patterns based on some generic spectral parameter set. These spectral parameter sets are then stored as templates in a database. The system performs recognition by matching the parameter set of the input speech with the stored templates, and finally displays the sign language in video format. Empirical results show that the system has an 80.3% recognition rate.
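The template-matching step described in this abstract can be illustrated with a minimal sketch. It is not the V2S implementation: the Euclidean distance measure, the three-element feature vectors and the word labels are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recognize(features, templates):
    """Return the label of the stored template closest to the input features."""
    return min(templates, key=lambda label: euclidean(features, templates[label]))

# Hypothetical template database: word -> spectral parameter vector.
templates = {
    "hello": [0.9, 0.1, 0.3],
    "thanks": [0.2, 0.8, 0.5],
}

# An input utterance is mapped to the nearest stored template.
word = recognize([0.85, 0.15, 0.35], templates)
print(word)
```

In a system like the one described, the recognized label would then select the sign language video to display.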
Cordaro, Daniel T; Keltner, Dacher; Tshering, Sumjay; Wangchuk, Dorji; Flynn, Lisa M
With data from 10 different globalized cultures and 1 remote, isolated village in Bhutan, we examined universals and cultural variations in the recognition of 16 nonverbal emotional vocalizations. College students in 10 nations (Study 1) and villagers in remote Bhutan (Study 2) were asked to match emotional vocalizations to 1-sentence stories of the same valence. Guided by previous conceptualizations of recognition accuracy, across both studies, 7 of the 16 vocal burst stimuli were found to have strong or very strong recognition in all 11 cultures, 6 vocal bursts were found to have moderate recognition, and 4 were not universally recognized. All vocal burst stimuli varied significantly in terms of the degree to which they were recognized across the 11 cultures. Our discussion focuses on the implications of these results for current debates concerning the emotion conveyed in the voice. (c) 2016 APA, all rights reserved.
Riva, G; Bacchetta, M; Baruffi, M; Borgomainerio, E; Defrance, C; Gatti, F; Galimberti, C; Fontaneto, S; Marchi, S; Molinari, E; Nugues, P; Rinaldi, S; Rovetta, A; Ferretti, G S; Tonci, A; Wann, J; Vincelli, F
Virtual reality (VR) is an emerging technology that alters the way individuals interact with computers: a 3D computer-generated environment in which a person can move about and interact as if actually inside it. Given the high computational power required to create virtual environments, these are usually developed on expensive high-end workstations. However, the significant advances in PC hardware made over the last three years are making PC-based VR a possible solution for clinical assessment and therapy. VREPAR - Virtual Reality Environments for Psychoneurophysiological Assessment and Rehabilitation - are two European Community funded projects (Telematics for health - HC 1053/HC 1055 - http://www.psicologia.net) that are trying to develop a modular PC-based virtual reality system for the medical market. The paper describes the rationale of the developed modules and the preliminary results obtained.
Sreedhara, S.; Huh, Kang Y.; Park, Hoyoung
The current worldwide energy crisis has made the search for alternative fuels inevitable. In this paper the combustion characteristics of vacuum residue (VR) are investigated numerically against experimental data in typical operating conditions of a furnace. The heat release reaction is modeled as sequential steps of devolatilization, simplified gas phase reaction and char oxidation, as for pulverized coal. Thermal and fuel NO are predicted by the conditional moment closure (CMC) method for estimation of elementary reaction rates. It turns out that the Sauter mean diameter (SMD) of VR droplets is a crucial parameter for better combustion efficiency and lower NO. Reasonable agreement is achieved for spatial distributions of major species, temperature and NO for all test cases with different fuel and steam flow rates.
Chen, Jiekang; Huang, Qitai; Guan, Min
Virtual Reality (VR) products ultimately serve human eyes, so the optical properties of VR optical systems must be consistent with the characteristics of human eyes. A monocular coaxial VR optical system is simulated in ZEMAX. A diffraction grating is added to the optical surface next to the eye, and the light emitted from the diffraction grating is deflected, forming an asymmetrical field of view (FOV). The lateral chromatic aberration caused by the diffraction grating is then corrected by the chromatic dispersion of a prism. Finally, an aspheric surface is added to further optimize the design. During the optical design of the system, the main problem is how to balance the dispersion of the diffraction grating and the prism. The balance was achieved by repeatedly adjusting the parameters of the grating and the prism, and then adding aspheric surfaces. In order to make the asymmetric FOV of the system consistent with the angle of the visual axis, and to keep the stereo vision area clear, the smaller half FOV of the monocular system is required to reach 30°. Eventually, a system with an asymmetrical FOV of 30°+40° was designed. In addition, the aberration curve of the system was analyzed in ZEMAX, and the binocular FOV was calculated according to the principle of binocular overlap. The results show that the asymmetric FOV of the VR monocular optical system fits human eyes and that the imaging quality matches human visual characteristics. At the same time, the diffraction grating increases the binocular FOV, which lowers the FOV required of the monocular design.
Use of VR Technology and Passive Haptics for MANPADS Training System, by Faisal Rashid, September 2017. Performing organization: Naval... Distribution is unlimited. ...reach satisfactory technical performance like latency and frame rate, while generating the sensory stimuli needed for this type of training: visual...
Kress, Bernard; Saeedi, Ehsan; Brac-de-la-Perriere, Vincent
This paper reviews the various optical technologies that have been developed to implement HMDs (Head Mounted Displays), both as AR (Augmented Reality) devices, VR (Virtual Reality) devices and more recently as smart glasses, smart eyewear or connected glasses. We review the typical requirements and optical performances of such devices and categorize them into distinct groups, which are suited for different (and constantly evolving) market segments, and analyze such market segmentation.
Farzad Pour Rahimian
Communications for information synchronization during the conceptual design phase require designers to employ more intuitive digital design tools. This paper presents findings of a feasibility study for using VR 3D sketching interface in order to replace current non-intuitive CAD tools. We used a sequential mixed method research methodology including a qualitative case study and a cognitive-based quantitative protocol analysis experiment. Foremost, the case study research was conducted in order to understand how novice designers make intuitive decisions. The case study documented the failure of conventional sketching methods in articulating complicated design ideas and shortcomings of current CAD tools in intuitive ideation. The case study’s findings then became the theoretical foundations for testing the feasibility of using VR 3D sketching interface during design. The latter phase of study evaluated the designers’ spatial cognition and collaboration at six different levels: "physical-actions", "perceptual-actions", "functional-actions", "conceptual-actions", "cognitive synchronizations", and "gestures". The results and confirmed hypotheses showed that the utilized tangible 3D sketching interface improved novice designers’ cognitive and collaborative design activities. In summary this paper presents the influences of current external representation tools on designers’ cognition and collaboration as well as providing the necessary theoretical foundations for implementing VR 3D sketching interface. It contributes towards transforming conceptual architectural design phase from analogue to digital by proposing a new VR design interface. The paper proposes this transformation to fill in the existing gap between analogue conceptual architectural design process and remaining digital engineering parts of building design process hence expediting digital design process.
Title: Preparation of VR presentations from CAD models Author: Martin Kudlvasr Department: Dept. of Software and Computer Science Education, Faculty of Mathematics and Physics, Charles University in Prague Supervisor: Prof.Ing. Jiří Žára, CSc. Supervisor's e-mail address: Abstract: Analysis of a specific problem in virtual engineering department in ŠKODA AUTO, a. s. company, concept of the problem solution and implementation of the solution. Virtual engineering department emp...
Arunachalam, Ravikumar; Boominathan, Prakash; Mahalingam, Shenbagavalli
Carnatic singing is a classical South Indian style of music that involves rigorous training to produce an "open throated" loud, predominantly low-pitched singing, embedded with vocal nuances in higher pitches. Voice problems in singers are not uncommon. The objective was to report the nature of voice problems and apply a routine protocol to assess the voice. Forty-five trained performing singers (females: 36 and males: 9) who reported to a tertiary care hospital with voice problems underwent voice assessment. The study analyzed their problems and the clinical findings. Voice change, difficulty in singing higher pitches, and voice fatigue were major complaints. Most of the singers suffered laryngopharyngeal reflux that coexisted with muscle tension dysphonia and chronic laryngitis. Speaking voices were rated predominantly as "moderate deviation" on GRBAS (Grade, Rough, Breathy, Asthenia, and Strain). Maximum phonation time ranged from 4 to 29 seconds (females: 10.2, standard deviation [SD]: 5.28 and males: 15.7, SD: 5.79). Singing frequency range was reduced (females: 21.3 Semitones and males: 23.99 Semitones). Dysphonia severity index (DSI) scores ranged from -3.5 to 4.91 (females: 0.075 and males: 0.64). Singing frequency range and DSI did not show significant difference between sex and across clinical diagnosis. Self-perception using voice disorder outcome profile revealed overall severity score of 5.1 (SD: 2.7). Findings are discussed from a clinical intervention perspective. Study highlighted the nature of voice problems (hyperfunctional) and required modifications in assessment protocol for Carnatic singers. Need for regular assessments and vocal hygiene education to maintain good vocal health are emphasized as outcomes. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Recent years have seen an increasing availability of HBIM models that are rich in geometric and informational content. However, there is still a lack of research implementing dedicated libraries, based on parametric intelligence and semantic awareness, for architectural heritage. Additional challenges arise from their portability to non-desktop environments (such as VR). The research article demonstrates the validity of a workflow applied to architectural heritage which, starting from semantic modeling, reaches visualization in a virtual reality environment, passing through the necessary phases of export, data migration and management. The three-dimensional modeling of the classical Doric order takes place in the BIM work environment and is the necessary starting point for the implementation of data, parametric intelligence and the definition of ontologies that exclusively qualify the model. The study also enables an effective method for data migration from the BIM model to databases integrated into VR technologies for AH. Furthermore, the process proposes a methodology, applicable in a return path, suited to achieving appropriate data enrichment of each model and to enabling interaction with the model in a VR environment.
Stefanik, Milan; Rataj, Jan; Huml, Ondrej; Sklenka, Lubomir
The VR-1 training reactor operated by the Czech Technical University in Prague is utilized mainly for education of students and training of various reactor staff; however, R&D is also carried out at the reactor. The experimental instrumentation of the reactor can be used for the irradiation experiments and neutron activation analysis. In this paper, the neutron activation analysis (NAA) is used for a study of dietary supplements containing the zinc (one of the essential trace elements for the human body). This analysis includes the dietary supplement pills of different brands; each brand is represented by several different batches of pills. All pills were irradiated together with the standard activation etalons in the vertical channel of the VR-1 reactor at the nominal power (80 W). Activated samples were investigated by the nuclear gamma-ray spectrometry technique employing the semiconductor HPGe detector. From resulting saturated activities, the amount of mineral element (Zn) in the pills was determined using the comparative NAA method. The results show clearly that the VR-1 training reactor is utilizable for neutron activation analysis experiments.
Good public speaking skills are essential in many professions as well as everyday life, but speech anxiety is a common problem. While it is established that public speaking training in virtual reality (VR) is effective, comprehensive studies on the underlying factors that contribute to this success are rare. The “quality evaluation of user-system interaction in virtual reality” framework for evaluation of VR applications is presented that includes system features, user factors, and moderating variables. Based on this framework, variables that are postulated to influence the quality of a public speaking training application were selected for a first validation study. In a cross-sectional, repeated measures laboratory study [N = 36 undergraduate students; 36% men, 64% women, mean age = 26.42 years (SD = 3.42)], the effects of task difficulty (independent variable), ability to concentrate, fear of public speaking, and social presence (covariates) on public speaking performance (dependent variable) in a virtual training scenario were analyzed, using stereoscopic visualization on a screen. The results indicate that the covariates moderate the effect of task difficulty on speech performance, turning it into a non-significant effect. Further interrelations are explored. The presenter’s reaction to the virtual agents in the audience shows a tendency of overlap of explained variance with task difficulty. This underlines the need for more studies dedicated to the interaction of contributing factors for determining the quality of VR public speaking applications.
Kuka, Daniela; Elias, Oliver; Martins, Ronald; Lindinger, Christopher; Pramböck, Andreas; Jalsovec, Andreas; Maresch, Pascal; Hörtner, Horst; Brandl, Peter
DEEP SPACE is a large-scale platform for interactive, stereoscopic and high resolution content. The spatial and the system design of DEEP SPACE face constraints of CAVE™-like systems with respect to multi-user interactive storytelling. To be used as a research platform and as a public exhibition space for many people, DEEP SPACE can process interactive, stereoscopic applications on two projection walls with a size of 16 by 9 meters and a resolution of four times 1080p (4K) each. The processed applications range from Virtual Reality (VR) environments to 3D movies to computationally intensive 2D productions. In this paper, we describe DEEP SPACE as an experimental VR platform for multi-user interactive storytelling. We focus on the system design relevant for the platform, including the integration of the Apple iPod Touch technology as VR control, and a special case study that demonstrates the research efforts in the field of multi-user interactive storytelling. The described case study, entitled "Papyrate's Island", provides a prototypical scenario of how physical drawings may impact digital narratives. In this special case, DEEP SPACE helps us to explore the hypothesis that drawing, a primordial human creative skill, gives us access to entirely new creative possibilities in the domain of interactive storytelling.
Paez-Espino, David; Chen, I-Min A; Palaniappan, Krishna; Ratner, Anna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Huang, Jinghua; Markowitz, Victor M; Nielsen, Torben; Huntemann, Marcel; K Reddy, T B; Pavlopoulos, Georgios A; Sullivan, Matthew B; Campbell, Barbara J; Chen, Feng; McMahon, Katherine; Hallam, Steve J; Denef, Vincent; Cavicchioli, Ricardo; Caffrey, Sean M; Streit, Wolfgang R; Webster, John; Handley, Kim M; Salekdeh, Ghasem H; Tsesmetzis, Nicolas; Setubal, Joao C; Pope, Phillip B; Liu, Wen-Tso; Rivers, Adam R; Ivanova, Natalia N; Kyrpides, Nikos C
Viruses represent the most abundant life forms on the planet. Recent experimental and computational improvements have led to a dramatic increase in the number of viral genome sequences identified primarily from metagenomic samples. As a result of the expanding catalog of metagenomic viral sequences, there exists a need for a comprehensive computational platform integrating all these sequences with associated metadata and analytical tools. Here we present IMG/VR (https://img.jgi.doe.gov/vr/), the largest publicly available database of 3908 isolate reference DNA viruses with 264 413 computationally identified viral contigs from >6000 ecologically diverse metagenomic samples. Approximately half of the viral contigs are grouped into genetically distinct quasi-species clusters. Microbial hosts are predicted for 20 000 viral sequences, revealing nine microbial phyla previously unreported to be infected by viruses. Viral sequences can be queried using a variety of associated metadata, including habitat type and geographic location of the samples, or taxonomic classification according to hallmark viral genes. IMG/VR has a user-friendly interface that allows users to interrogate all integrated data and interact by comparing with external sequences, thus serving as an essential resource in the viral genomics community. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Kuriakose, Selvia; Lahiri, Uttama
Individuals with autism are often characterized by impairments in communication, reciprocal social interaction and explicit expression of their affective states. In conventional techniques, a therapist adjusts the intervention paradigm by monitoring the affective state e.g., anxiety of these individuals for effective floor-time-therapy. Conventional techniques, though powerful, are observation-based and face resource limitations. Technology-assisted systems can provide a quantitative, individualized rehabilitation platform. Presently-available systems are designed primarily to chain learning via aspects of one's performance alone restricting individualization. Specifically, these systems are not sensitive to one's anxiety. Our presented work seeks to bridge this gap by developing a novel VR-based interactive system with Anxiety-Sensitive adaptive technology. Specifically, such a system is capable of objectively identifying and quantifying one's anxiety level from real-time biomarkers, along with performance metrics. In turn it can adaptively respond in an individualized manner to foster improved social communication skills. In our present research, we have used Virtual Reality (VR) to design a proof-of-concept application that exposes participants to social tasks of varying challenges. Results of a preliminary usability study indicate the potential of our VR-based Anxiety-Sensitive system to foster improved task performance, thereby serving as a potent complementary tool in the hands of therapist.
Dacakis, Georgia; Oates, Jennifer; Douglas, Jacinta
The Transsexual Voice Questionnaire (TVQ MtF) was designed to capture the voice-related perceptions of individuals whose gender identity as female is the opposite of their birth-assigned gender (MtF women). Evaluation of the psychometric properties of the TVQ MtF is ongoing. To investigate associations between TVQ MtF scores and (1) self-perceptions of voice femininity and (2) acoustic parameters of voice pitch and voice quality in order to evaluate further the validity of the TVQ MtF. A strong correlation between TVQ MtF scores and self-ratings of voice femininity was predicted, but no association between TVQ MtF scores and acoustic measures of voice pitch and quality was proposed. Participants were 148 MtF women (mean age 48.14 years) recruited from the La Trobe Communication Clinic and the clinics of three doctors specializing in transgender health. All participants completed the TVQ MtF and 34 of these participants also provided a voice sample for acoustic analysis. Pearson product-moment correlation analysis was conducted to examine the associations between TVQ MtF scores and (1) self-perceptions of voice femininity and (2) acoustic measures of F0, jitter (%), shimmer (dB) and harmonic-to-noise ratio (HNR). Strong negative correlations between the participants' perceptions of their voice femininity and the TVQ MtF scores demonstrated that for this group of MtF women a low self-rating of voice femininity was associated with more frequent negative voice-related experiences. This association was strongest with the vocal-functioning component of the TVQ MtF. These strong correlations and high levels of shared variance between the TVQ MtF and a measure of a related construct provide evidence for the convergent validity of the TVQ MtF. The absence of significant correlations between the TVQ MtF and the acoustic data is consistent with the equivocal findings of earlier research. This finding indicates that these two measures assess different aspects of the voice.
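For readers unfamiliar with the statistic named in this abstract, the following is a minimal sketch of the Pearson product-moment correlation and the shared variance (r squared) derived from it. The score lists are made-up illustrative numbers, not data from the study, and the variable names are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up example: questionnaire totals vs. self-rated voice femininity.
questionnaire_scores = [90, 75, 60, 45, 30]
femininity_ratings = [1, 2, 3, 4, 5]

r = pearson_r(questionnaire_scores, femininity_ratings)
shared_variance = r ** 2  # proportion of variance the two measures share
print(r, shared_variance)
```

A negative r here corresponds to the direction reported in the abstract: lower self-ratings of voice femininity go with higher questionnaire scores.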
The ‘temporal voice areas’ (TVAs) (Belin et al., 2000) of the human brain show greater neuronal activity in response to human voices than to other categories of nonvocal sounds. However, a direct link between TVA activity and voice perception behaviour has not yet been established. Here we show that a functional magnetic resonance imaging (fMRI) measure of activity in the TVAs predicts individual performance at a separately administered voice memory test. This relation holds when general sound memory ability is taken into account. These findings provide the first evidence that the TVAs are specifically involved in voice cognition.
Multimodal signal analysis based on sophisticated sensors, efficient communication systems and fast parallel processing methods has a rapidly increasing range of multidisciplinary applications. The present paper is devoted to pattern recognition, machine learning, and the analysis of sleep stages in the detection of sleep disorders using polysomnography (PSG) data, including electroencephalography (EEG), breathing (Flow), and electro-oculogram (EOG) signals. The proposed method is based on the classification of selected features by a neural network system with sigmoidal and softmax transfer functions using Bayesian methods for the evaluation of the probabilities of the separate classes. The application is devoted to the analysis of the sleep stages of 184 individuals with different diagnoses, using EEG and further PSG signals. Data analysis points to an average increase of the length of the Wake stage by 2.7% per 10 years and a decrease of the length of the Rapid Eye Movement (REM) stages by 0.8% per 10 years. The mean classification accuracy for given sets of records and single EEG and multimodal features is 88.7% (standard deviation, STD: 2.1) and 89.6% (STD: 1.9), respectively. The proposed methods enable the use of adaptive learning processes for the detection and classification of health disorders based on prior specialist experience and man–machine interaction.
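The softmax transfer function mentioned in this abstract turns a vector of output scores into class probabilities. The sketch below shows only that step; the stage labels and the score values are hypothetical, and it does not reproduce the authors' network or their Bayesian evaluation.

```python
import math

def softmax(scores):
    """Convert raw output scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical sleep-stage labels and output-layer scores for one epoch.
STAGES = ["Wake", "REM", "NonREM"]
probs = softmax([2.0, 0.5, 1.0])

# The predicted stage is the class with the highest probability.
predicted = STAGES[probs.index(max(probs))]
print(predicted, probs)
```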
Hughes, Susan M; Nicholson, Shevon E
This study examined self-recognition processing in both the auditory and visual modalities by determining how comparable hearing a recording of one's own voice was to seeing a photograph of one's own face. We also investigated whether the simultaneous presentation of auditory and visual self-stimuli would either facilitate or inhibit self-identification. Ninety-one participants completed reaction-time tasks of self-recognition when presented with their own faces, own voices, and combinations of the two. Reaction time and errors made when responding with both the right and left hand were recorded to determine if there were lateralization effects on these tasks. Our findings showed that visual self-recognition for facial photographs appears to be superior to auditory self-recognition for voice recordings. Furthermore, a combined presentation of one's own face and voice appeared to inhibit rather than facilitate self-recognition and there was a left-hand advantage for reaction time on the combined-presentation tasks. Copyright © 2010 Elsevier Inc. All rights reserved.
JSAA has been seeking to provide an opportunity for Student Affairs professionals and higher education scholars from around the globe to share their research and experiences of student services and student affairs programmes from their respective regional and institutional contexts. This has been given a specific platform with the guest-edited issue “Voices from Around the Globe” which is the result of a collaboration with the International Association of Student Affairs and Services (IASAS), and particularly with the guest editors, Kathleen Callahan and Chinedu Mba.
Martins, Regina Helena Garcia; do Amaral, Henrique Abrantes; Tavares, Elaine Lara Mendes; Martins, Maira Garcia; Gonçalves, Tatiana Maria; Dias, Norimar Hernandes
Voice disorders affect adults and children and have different causes in different age groups. The aim of the study is to present the etiology and diagnosis of dysphonia in a large population of patients with this voice disorder. We evaluated 2019 patients with dysphonia who attended the Voice Disease outpatient clinics of a university hospital. Parameters assessed were age, gender, profession, associated symptoms, smoking, and videolaryngoscopy diagnoses. Of the 2019 patients with dysphonia who were included in this study, 786 were male (38.93%) and 1233 were female (61.07%). The age groups were as follows: 1-6 years (n = 100); 7-12 years (n = 187); 13-18 years (n = 92); 19-39 years (n = 494); 41-60 years (n = 811); and >60 years (n = 335). Symptoms associated with dysphonia were vocal overuse (n = 677), gastroesophageal symptoms (n = 535), and nasosinusal symptoms (n = 497). The predominant professions of the patients were domestic workers, students, and teachers. Smoking was reported by 13.6% of patients. With regard to the etiology of dysphonia, in children (1-18 years old), nodules (n = 225; 59.3%), cysts (n = 39; 10.3%), and acute laryngitis (n = 26; 6.8%) prevailed. In adults (19-60 years old), functional dysphonia (n = 268; 20.5%), acid laryngitis (n = 164; 12.5%), and vocal polyps (n = 156; 12%) predominated. In patients older than 60 years, presbyphonia (n = 89; 26.5%), functional dysphonia (n = 59; 17.6%), and Reinke's edema (n = 48; 14%) predominated. In this population of 2019 patients with dysphonia, adults and women were predominant. Dysphonia had different etiologies in the age groups studied. Nodules and cysts were predominant in children, functional dysphonia and reflux in adults, and presbyphonia and Reinke's edema in the elderly. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Note from the interviewer: Diane Austin's new book “The Theory and Practice of Vocal Psychotherapy: Songs of the Self” (2008), which was published recently, has been an excellent opportunity to learn more about the use of voice in therapy, its clinical applications and the enormous possibilities it offers within a psychotherapeutic setting. This interview focuses on introducing some of these aspects based on Austin’s work, and on exploring her background, motivations and considerations towards this pioneering music-therapeutic approach. The interview has been edited by Diane Austin and Evangelia Papanikolaou and took place via a series of emails, dated from September to December 2009.
Kooijman, P G C; de Jong, F I C R S; Oudes, M J; Huinck, W; van Acht, H; Graamans, K
The aim of this study was to investigate the relationship between extrinsic laryngeal muscular hypertonicity and deviant body posture on the one hand and voice handicap and voice quality on the other hand in teachers with persistent voice complaints and a history of voice-related absenteeism. The study group consisted of 25 female teachers. A voice therapist assessed extrinsic laryngeal muscular tension and a physical therapist assessed body posture. The assessed parameters were clustered in categories. The parameters in the different categories represent the same function. Further a tension/posture index was created, which is the summation of the different parameters. The different parameters and the index were related to the Voice Handicap Index (VHI) and the Dysphonia Severity Index (DSI). The scores of the VHI and the individual parameters differ significantly except for the posterior weight bearing and tension of the sternocleidomastoid muscle. There was also a significant difference between the individual parameters and the DSI, except for tension of the cricothyroid muscle and posterior weight bearing. The score of the tension/posture index correlates significantly with both the VHI and the DSI. In a linear regression analysis, the combination of hypertonicity of the sternocleidomastoid, the geniohyoid muscles and posterior weight bearing is the most important predictor for a high voice handicap. The combination of hypertonicity of the geniohyoid muscle, posterior weight bearing, high position of the hyoid bone, hypertonicity of the cricothyroid muscle and anteroposition of the head is the most important predictor for a low DSI score. The results of this study show the higher the score of the index, the higher the score of the voice handicap and the worse the voice quality is. Moreover, the results are indicative for the importance of assessment of muscular tension and body posture in the diagnosis of voice disorders.
Ebersole, Barbara; Soni, Resha S; Moran, Kathleen; Lango, Miriam; Devarajan, Karthik; Jamal, Nausheen
Examine the relationship among the severity of patient-perceived voice impairment, perceptual dysphonia severity, occupational voice demand, and voice therapy adherence. Identify clinical predictors of increased risk for therapy nonadherence. A retrospective cohort study of patients presenting with a chief complaint of persistent dysphonia at an interdisciplinary voice center was done. The Voice Handicap Index-10 (VHI-10) and the Voice-Related Quality of Life (V-RQOL) survey scores, clinician rating of dysphonia severity using the Grade score from the Grade, Roughness Breathiness, Asthenia, and Strain scale, occupational voice demand, and patient demographics were tested for associations with therapy adherence, defined as completion of the treatment plan. Classification and Regression Tree (CART) analysis was performed to establish thresholds for nonadherence risk. Of 166 patients evaluated, 111 were recommended for voice therapy. The therapy nonadherence rate was 56%. Occupational voice demand category, VHI-10, and V-RQOL scores were the only factors significantly correlated with therapy adherence (P demand are significantly more likely to be nonadherent with therapy than those with high occupational voice demand (P 40 is a significant cutoff point for predicting therapy nonadherence (P demand and patient perception of impairment are significantly and independently correlated with therapy adherence. A VHI-10 score of ≤9 or a V-RQOL score of >40 is a significant cutoff point for predicting nonadherence risk. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Jones, Benedict C; Feinberg, David R; Debruine, Lisa M; Little, Anthony C; Vukovic, Jovana
Most previous studies of vocal attractiveness have focused on preferences for physical characteristics of voices such as pitch. Here we examine the content of vocalizations in interaction with such physical traits, finding that vocal cues of social interest modulate the strength of men's preferences for raised pitch in women's voices. Men showed stronger preferences for raised pitch when judging the voices of women who appeared interested in the listener than when judging the voices of women who appeared relatively disinterested in the listener. These findings show that voice preferences are not determined solely by physical properties of voices and that men integrate information about voice pitch and the degree of social interest expressed by women when forming voice preferences. Women's preferences for raised pitch in women's voices were not modulated by cues of social interest, suggesting that the integration of cues of social interest and voice pitch when men judge the attractiveness of women's voices may reflect adaptations that promote efficient allocation of men's mating effort.
Gornet, Matthew F; Copay, Anne G; Sorensen, Katrine M; Schranck, Francine W
Health-related quality-of-life outcomes have been collected with the Medical Outcomes Study (MOS) Short Form 36 (SF-36) survey. Boston University School of Public Health has developed algorithms for the conversion of SF-36 to Veterans RAND 12-Item Health Survey (VR-12) Physical Component Summary (PCS) and Mental Component Summary (MCS) scores. The purpose of the present study is to investigate the conversion of the SF-36 to VR-12 PCS and MCS scores. Preoperative and postoperative SF-36 were collected from patients who underwent lumbar or cervical surgery from a single surgeon between August 1998 and January 2013. Short Form 36 PCS and MCS scores were calculated following their original instructions. The SF-36 answers were then converted to VR-12 PCS and MCS scores following the algorithm provided by the Boston University School of Public Health. The mean score, preoperative to postoperative change, and proportions of patients who reach the minimum detectable change were compared between SF-36 and VR-12. A total of 1,968 patients (1,559 lumbar and 409 cervical) had completed preoperative and postoperative SF-36. The values of the SF-36 and VR-12 mean scores were extremely similar, with score differences ranging from 0.77 to 1.82. The preoperative to postoperative improvement was highly significant (p […]) […] SF-36 and VR-12 scores. The mean change scores were similar, with a difference of up to 0.93 for PCS and up to 0.37 for MCS. Minimum detectable change (MDC) values were almost identical for SF-36 and VR-12, with a difference of 0.12 for PCS and up to 0.41 for MCS. The proportions of patients whose change in score reached MDC were also nearly identical for SF-36 and VR-12. About 90% of the patients above SF-36 MDC were also above VR-12 MDC. The converted VR-12 scores, similar to the SF-36 scores, detect a significant postoperative improvement in PCS and MCS scores. The calculated MDC values and the proportions of patients whose score improvement reach MDC are similar for
Baird, Alice Emily; Hasse Jørgensen, Stina; Parada-Cabaleiro, Emilia
Along with the rise of artificial intelligence and the internet-of-things, synthesized voices are now common in daily life, providing us with guidance, assistance, and even companionship. From formant to concatenative synthesis, the synthesized voice continues to be defined by the same traits we...
Tong, Siu Yin Annie; Adamson, Bob
The value of student voices in dialogues about learning improvement is acknowledged in the literature. This paper examines how the views of students regarding School-based Assessment (SBA), a significant shift in examination policy and practice in secondary schools in Hong Kong, have largely been ignored. The study captures student voices through…
Abel, R. S.; Watkins, H. E.
A modified electronic voice encoder (Vocoder) includes an independent analog mode of operation in addition to the conventional digital mode. The Vocoder is bandwidth compression equipment that permits voice transmission over channels having only a fraction of the bandwidth required for conventional telephone-quality speech transmission.
Euler, James S.
The author's voice is implicit in all writing, even technical writing. It is the expression of the writer's attitude toward audience, subject matter, and self. Effective use of voice is made possible by recognizing the three roles of the technical writer: transmitter, translator, and author. As a transmitter, the writer must consciously apply an…
Common Core proponents and detractors debate its merits, but students have voiced their opinion for years. Using a decade's worth of data gathered through design-research on youth voice, this article discusses what high school students have long described as more ideal learning environments for themselves--and how remarkably similar the Common…
Rees, C.; Alfes, K.; Gatenby, M.
This paper considers the relationship between employee voice and employee engagement. Employee perceptions of voice behaviour aimed at improving the functioning of the work group are found to have both a direct impact and an indirect impact on levels of employee engagement. Analysis of data from two
GPB Consulting has developed a scientific approach to voice coaching. A digital recording of the voice is sent to a lab in Switzerland and analyzed by a computer programme designed by a doctor of psychology and linguistics and a scientist at CERN (1 page).
Behlau, Mara; Zambon, Fabiana; Madazio, Glaucya
Recent advances with regard to occupational voice disorders are highlighted with emphasis on issues warranting consideration when assessing, training, and treating professional voice users. Findings include the many particularities between the various categories of professional voice users, the concept that the environment plays a major role in occupational voice disorders, and that biopsychosocial influences should be analyzed on an individual basis. Assessment via self-evaluation protocols to quantify the impact of these disorders is mandatory as a component of an evaluation and to document treatment outcomes. Discomfort or odynophonia has evolved as a critical symptom in this population. Clinical trials are limited and the complexity of the environment may be a limitation in experiment design. This review reinforced the need for large population studies of professional voice users; new data highlighted important factors specific to each group of voice users. Interventions directed at student teachers are necessary not only to improve the quality of future professionals but also to avoid the frustration and limitations associated with chronic voice problems. The causative relationship between the work environment and voice disorders has not yet been established. Randomized controlled trials are lacking and must be a focus to enhance treatment paradigms for this population.
Saylam, Güleser; Şahin, Mustafa; Demiral, Dilek; Bayır, Ömer; Yüceege, Melike Bağnu; Çadallı Tatar, Emel; Korkmaz, Mehmet Hakan
The aim of this study was to investigate alterations in voice parameters among patients using continuous positive airway pressure (CPAP) for the treatment of obstructive sleep apnea syndrome. Patients with an indication for CPAP treatment without any voice problems and with normal laryngeal findings were included and voice parameters were evaluated before and 1 and 6 months after CPAP. Videolaryngostroboscopic findings, a self-rated scale (Voice Handicap Index-10, VHI-10), perceptual voice quality assessment (GRBAS: grade, roughness, breathiness, asthenia, strain), and acoustic parameters were compared. Data from 70 subjects (48 men and 22 women) with a mean age of 44.2 ± 6.0 years were evaluated. When compared with the pre-CPAP treatment period, there was a significant increase in the VHI-10 score after 1 month of treatment and in VHI-10 and total GRBAS scores, jitter percent (P = 0.01), shimmer percent, noise-to-harmonic ratio, and voice turbulence index after 6 months of treatment. Vague negative effects on voice parameters after the first month of CPAP treatment became more evident after 6 months. We demonstrated nonsevere alterations in the voice quality of patients under CPAP treatment. Given that CPAP is a long-term treatment, it is important to keep these alterations in mind.
From the point of view of occupational health, the field of voice disorders is very poorly developed as compared, for instance, to the prevention and diagnostics of occupational hearing disorders. In fact, voice disorders have not even been recognized in the field of occupational medicine. Hence, it is obviously very rare in most countries that the voice disorder of a professional voice user, e.g. a teacher, a singer or an actor, is accepted as an occupational disease by insurance companies. However, occupational voice problems do not lack significance from the point of view of the patient. We also know from questionnaires and clinical studies that voice complaints are very common. Another example of job-related health problems, which has proved more successful in terms of its occupational health status, is the repetition strain injury of the elbow, i.e. the "tennis elbow". Its textbook definition could be used as such to describe an occupational voice disorder ("dysphonia professionalis"). In the present paper the effects of such risk factors as vocal loading itself, background noise and room acoustics and low relative humidity of the air are discussed. Due to individual factors underlying the development of professional voice disorders, recommendations rather than regulations are called for. There are many simple and even relatively low-cost methods available for the prevention of vocal problems as well as for supporting rehabilitation.
... enter puberty earlier or later than others. How Deep Will My Voice Get? How deep a guy's voice gets depends on his genes: ...
Rubin, Lucille S.
This report is the result of a six-week study in which the voice training offerings at four schools of drama in London were examined using interviews of teachers and directors, observation of voice classes, and attendance at studio presentations and public performances. The report covers such topics as: textbooks and references being used; courses…
Vocal demands of teaching are considerable and these challenges are greater for choral directors who depend on the voice as a musical and instructive instrument. The purpose of this study was to (1) examine choral directors' vocal condition using a modified Voice Handicap Index (VHI), and (2) determine the extent to which the major variables…
After being familiarized with two voices, either implicit (auditory lexical decision) or explicit memory (auditory recognition) for words from silently read sentences was assessed among 32 men and 32 women volunteers. In the silently read sentences, the sex of speaker was implied in the initial words, e.g., "He said, ..." or "She said...". Tone in question versus statement was also manipulated by appropriate punctuation. Auditory lexical decision priming was found for sex- and tone-consistent items following silent reading, but only up to 5 min. after silent reading. In a second study, similar lexical decision priming was found following listening to the sentences, although these effects remained reliable after a 2-day delay. The effect sizes for lexical decision priming showed that tone-consistency and sex-consistency were strong following both silent reading and listening 5 min. after studying. These results suggest that readers create episodic traces of text from auditory images of silently read sentences as they do during listening.
Martins, Regina Helena Garcia; Pereira, Eny Regina Bóia Neves; Hidalgo, Caio Bosque; Tavares, Elaine Lara Mendes
Voice disorders are very prevalent among teachers and consequences are serious. Although the literature is extensive, there are differences in the concepts and methodology related to voice problems; most studies are restricted to analyzing the responses of teachers to questionnaires and only a few studies include vocal assessments and videolaryngoscopic examinations to obtain a definitive diagnosis. To review demographic studies related to vocal disorders in teachers to analyze the diverse methodologies, the prevalence rates pointed out by the authors, the main risk factors, the most prevalent laryngeal lesions, and the repercussions of dysphonias on professional activities. The available literature (from 1997 to 2013) was narratively reviewed based on Medline, PubMed, Lilacs, SciELO, and Cochrane library databases. Excluded were articles that specifically analyzed treatment modalities and those that did not make their abstracts available in those databases. The keywords included were teacher, dysphonia, voice disorders, professional voice. Copyright © 2014 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Gill, Brian P; Herbst, Christian T
The final keynote panel of the 10th Pan-European Voice Conference (PEVOC) was concerned with the topic 'Voice pedagogy-what do we need?' In this communication the panel discussion is summarized, and the authors provide a deepening discussion on one of the key questions, addressing the roles and tasks of people working with voice students. In particular, a distinction is made between (1) voice building (derived from the German term 'Stimmbildung'), primarily comprising the functional and physiological aspects of singing; (2) coaching, mostly concerned with performance skills; and (3) singing voice rehabilitation. Both public and private educators are encouraged to apply this distinction to their curricula, in order to arrive at more efficient singing teaching and to reduce the risk of vocal injury to the singers concerned.
This article deals with the impact of wireless (Wi-Fi) networks on the perceived quality of voice services. The Quality of Service (QoS) metrics must be monitored in the computer network during the voice data transmission to ensure the proper voice service quality the end-user has paid for, especially in wireless networks. In addition to QoS, the research area called Quality of Experience (QoE) provides metrics and methods for quality evaluation from the end-user's perspective. This article focuses on QoE estimation of Voice over IP (VoIP) calls in wireless networks using a network simulator. Results contribute to voice quality estimation based on characteristics of the wireless network and the location of a wireless client.
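As an illustration of how a QoE score can be derived from network-level measurements, the sketch below converts a transmission rating factor R into an estimated Mean Opinion Score (MOS) using the mapping from the ITU-T G.107 E-model. This is an assumption for illustration only: the abstract does not say which QoE model the authors used, and the function name is hypothetical.

```python
def r_factor_to_mos(r: float) -> float:
    """Map an E-model rating factor R to an estimated MOS (ITU-T G.107 mapping).

    Assumption: the E-model is used here only as a familiar example of a
    QoS-to-QoE mapping; the article may rely on a different method.
    """
    if r <= 0:
        return 1.0   # worst possible score
    if r >= 100:
        return 4.5   # the mapping saturates at 4.5
    # Cubic mapping defined for 0 < R < 100
    return 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)


if __name__ == "__main__":
    # Lower R (for example, a client far from the access point with more
    # loss and delay) maps to a lower estimated MOS.
    for r in (50.0, 70.0, 93.2):
        print(f"R = {r:5.1f} -> MOS = {r_factor_to_mos(r):.2f}")
```

In a simulator study, R would typically be computed per call from measured delay, jitter, and packet loss before being mapped to MOS.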
Angiuoli, Samuel V; Matalka, Malcolm; Gussman, Aaron; Galens, Kevin; Vangala, Mahesh; Riley, David R; Arze, Cesar; White, James R; White, Owen; Fricke, W Florian
Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom built virtual machines to distribute pre-packaged with pre-configured software. We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition CloVR supports use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. The CloVR VM and associated architecture lowers the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high throughput data processing. PMID:21878105
Uskul, Ayse K; Paulmann, Silke; Weick, Mario
Listeners have to pay close attention to a speaker's tone of voice (prosody) during daily conversations. This is particularly important when trying to infer the emotional state of the speaker. Although a growing body of research has explored how emotions are processed from speech in general, little is known about how psychosocial factors such as social power can shape the perception of vocal emotional attributes. Thus, the present studies explored how social power affects emotional prosody recognition. In a correlational study (Study 1) and an experimental study (Study 2), we show that high power is associated with lower accuracy in emotional prosody recognition than low power. These results, for the first time, suggest that individuals experiencing high or low power perceive emotional tone of voice differently. (c) 2016 APA, all rights reserved.
Fan, Jieyan; Wu, Dapeng; Nucci, Antonio; Keralapura, Ram; Gao, Lixin
Given the rising popularity of voice and video services over the Internet, accurately identifying voice and video traffic that traverses their networks has become a critical task for Internet service providers (ISPs). As the number of proprietary applications that deliver voice and video services to end users increases over time, the search for a single methodology that can accurately detect such services while being application independent remains open. This problem becomes even more complicated when voice and video service providers like Skype, Microsoft, and Google bundle their voice and video services with other services like file transfer and chat. For example, a bundled Skype session can contain both a voice stream and a file transfer stream in the same layer-3/layer-4 flow. In this context, traditional techniques to identify voice and video streams do not work. In this paper, we propose a novel self-learning classifier, called VVS-I, that detects the presence of voice and video streams in flows with minimum manual intervention. Our classifier works in two phases: a training phase and a detection phase. In the training phase, VVS-I first extracts the relevant features, and subsequently constructs a fingerprint of a flow using power spectral density (PSD) analysis. In the detection phase, it compares the fingerprint of a flow to the existing fingerprints learned during the training phase, and subsequently classifies the flow. Our classifier is not only capable of detecting voice and video streams that are hidden in different flows, but is also capable of detecting different applications (like Skype, MSN, etc.) that generate these voice/video streams. We show that our classifier can achieve close to 100% detection rate while keeping the false positive rate to less than 1%.
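The two-phase idea (build PSD fingerprints in training, match against them in detection) can be sketched as follows. This is a simplified illustration and not the VVS-I algorithm itself: the choice of feature (a fixed-length series such as packet sizes), the normalisation, the Euclidean nearest-fingerprint rule, and all function names are assumptions made for the example.

```python
import cmath
import math


def psd(samples):
    """Normalised power spectral density of a real-valued series (naive DFT)."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the DC component
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(centered))
        spectrum.append(abs(acc) ** 2 / n)
    total = sum(spectrum) or 1.0  # avoid dividing by zero for flat series
    return [p / total for p in spectrum]


def distance(a, b):
    """Euclidean distance between two equal-length fingerprints."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(flow, fingerprints):
    """Return the label of the stored fingerprint closest to the flow's PSD."""
    fp = psd(flow)
    return min(fingerprints, key=lambda label: distance(fp, fingerprints[label]))


if __name__ == "__main__":
    # Training phase: synthetic series with different periodicities stand in
    # for real flow features (hypothetical data).
    voice = [100 + 50 * math.sin(2 * math.pi * t / 4) for t in range(32)]
    bulk = [100 + 50 * math.sin(2 * math.pi * t / 16) for t in range(32)]
    fingerprints = {"voice": psd(voice), "bulk": psd(bulk)}

    # Detection phase: a phase-shifted copy of the voice-like pattern.
    probe = [100 + 50 * math.sin(2 * math.pi * t / 4 + 0.7) for t in range(32)]
    print(classify(probe, fingerprints))
```

A PSD-based fingerprint is insensitive to where in the flow the periodic pattern starts, which is one reason spectral features suit regularly paced media streams.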
Larson, William E.
A computer program has been developed that improves the efficiency of wind tunnel model leak checking. The program uses a voice recognition unit to relay a technician's commands to the computer. The computer, after receiving a command, can respond to the technician via a voice response unit. Information about the model pressure orifice being checked is displayed on a gas-plasma terminal. On command, the program records up to 30 seconds of pressure data. After the recording is complete, the raw data and a straight line fit of the data are plotted on the terminal. This allows the technician to make a decision on the integrity of the orifice being checked. All results of the leak check program are stored in a database file that can be listed on the line printer for record keeping purposes or displayed on the terminal to help the technician find unchecked orifices. This program allows one technician to check a model for leaks instead of the two or three previously required.
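The straight-line fit of recorded pressure data can be illustrated with an ordinary least-squares fit. This is a minimal sketch, not the original program: the sample data, units, the pass/fail threshold, and the function names are hypothetical.

```python
def fit_line(times, pressures):
    """Least-squares line through (time, pressure) samples: (slope, intercept)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_p = sum(pressures) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (p - mean_p) for t, p in zip(times, pressures))
    slope = sxy / sxx
    return slope, mean_p - slope * mean_t


def orifice_leaks(times, pressures, max_drop_per_s=0.005):
    """Flag a leak when pressure falls faster than a (hypothetical) threshold."""
    slope, _ = fit_line(times, pressures)
    return slope < -max_drop_per_s


if __name__ == "__main__":
    # Up to 30 seconds of data, one sample per second (synthetic values).
    times = list(range(31))
    pressures = [101.3 - 0.02 * t for t in times]
    slope, intercept = fit_line(times, pressures)
    print(f"slope = {slope:.4f} per s, intercept = {intercept:.2f}")
    print("leak suspected:", orifice_leaks(times, pressures))
```

The slope of the fitted line summarises the pressure decay rate, which is the quantity a technician would judge when deciding on the integrity of an orifice.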
Doukas, Nikolaos; Bardis, Nikolaos G.
Speech recognition systems allow human - machine communication to acquire an intuitive nature that approaches the simplicity of inter - human communication. Small vocabulary speech recognition is a subset of the overall speech recognition problem, where only a small number of words need to be recognized. Speaker independent small vocabulary recognition can find significant applications in field equipment used by military personnel. Such equipment may typically be controlled by a small number of commands that need to be given quickly and accurately, under conditions where delicate manual operations are difficult to achieve. This type of application could hence significantly benefit by the use of robust voice operated control components, as they would facilitate the interaction with their users and render it much more reliable in times of crisis. This paper presents current challenges involved in attaining efficient and robust small vocabulary speech recognition. These challenges concern feature selection, classification techniques, speaker diversity and noise effects. A state machine approach is presented that facilitates the voice guidance of different equipment in a variety of situations.
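A state machine for small-vocabulary voice control can be sketched as a transition table keyed by the current state and the recognised word. The states and command words below are hypothetical and chosen only to show the mechanism; the paper's actual vocabulary and states are not given in the abstract.

```python
# (current state, recognised word) -> next state
TRANSITIONS = {
    ("idle", "select"): "selected",
    ("selected", "confirm"): "active",
    ("selected", "cancel"): "idle",
    ("active", "stop"): "idle",
}


def step(state, word):
    """Advance the machine; words not valid in this state are ignored."""
    return TRANSITIONS.get((state, word), state)


def allowed_words(state):
    """Words the recogniser needs to distinguish in the given state."""
    return sorted(w for (s, w) in TRANSITIONS if s == state)


def run(words, state="idle"):
    """Feed a sequence of recognised words through the machine."""
    for word in words:
        state = step(state, word)
    return state


if __name__ == "__main__":
    print(run(["select", "confirm"]))
    print(allowed_words("selected"))
```

One benefit of this design is that each state exposes only a few valid words, so the recogniser has a smaller set to discriminate at any moment and a misrecognised word outside that set has no effect.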
Chakraborty, Anya; Chakrabarti, Bhismadev
Atypical self-processing is an emerging theme in autism research, suggested by lower self-reference effect in memory, and atypical neural responses to visual self-representations. Most research on physical self-processing in autism uses visual stimuli. However, the self is a multimodal construct, and therefore, it is essential to test self-recognition in other sensory modalities as well. Self-recognition in the auditory modality remains relatively unexplored and has not been tested in relation to autism and related traits. This study investigates self-recognition in the auditory and visual domains in the general population and tests if it is associated with autistic traits. Thirty-nine neurotypical adults participated in a two-part study. In the first session, each participant's voice was recorded and face was photographed, and these were morphed respectively with voices and faces from unfamiliar identities. In the second session, participants performed a 'self-identification' task, classifying each morph as 'self' voice (or face) or an 'other' voice (or face). All participants also completed the Autism Spectrum Quotient (AQ). For each sensory modality, the slope of the self-recognition curve was used as the individual self-recognition metric. These two self-recognition metrics were tested for association between each other, and with autistic traits. Fifty percent 'self' response was reached for a higher percentage of self in the auditory domain compared to the visual domain (t = 3.142; P […]) […] self-recognition bias across sensory modalities (τ = -0.165, P = 0.204). Higher recognition bias for self-voice was observed in individuals higher in autistic traits (τ AQ = 0.301, P = 0.008). No such correlation was observed between recognition bias for self-face and autistic traits (τ AQ = -0.020, P = 0.438). Our data show that recognition bias for physical self-representation is not related across sensory modalities. Further, individuals with higher autistic traits were better able
Hughes, Susan M; Harrison, Marissa A
Evidence suggests that many physical, behavioral, and trait qualities can be detected solely from the sound of a person's voice, irrespective of the semantic information conveyed through speech. This study examined whether raters could accurately assess the likelihood that a person has cheated on committed, romantic partners simply by hearing the speaker's voice. Independent raters heard voice samples of individuals who self-reported that they either cheated or had never cheated on their romantic partners. To control for aspects that may clue a listener to the speaker's mate value, we used voice samples that did not differ between these groups for voice attractiveness, age, voice pitch, and other acoustic measures. We found that participants indeed rated the voices of those who had a history of cheating as more likely to cheat. Male speakers were given higher ratings for cheating, while female raters were more likely to ascribe the likelihood to cheat to speakers. Additionally, we manipulated the pitch of the voice samples, and for both sexes, the lower pitched versions were consistently rated to be from those who were more likely to have cheated. Regardless of the pitch manipulation, speakers were able to assess actual history of infidelity; the one exception was that men's accuracy decreased when judging women whose voices were lowered. These findings expand upon the idea that the human voice may be of value as a cheater detection tool and very thin slices of vocal information are all that is needed to make certain assessments about others.
Ahmadi, Farzaneh; Noorian, Farzad; Novakovic, Daniel; van Schaik, André
Despite emergent progress in many fields of bionics, a functional Bionic Voice prosthesis for laryngectomy patients (larynx amputees) has not yet been achieved, leading to a lifetime of vocal disability for these patients. This study introduces a novel framework of Pneumatic Bionic Voice Prostheses as an electronic adaptation of the Pneumatic Artificial Larynx (PAL) device. The PAL is a non-invasive mechanical voice source, driven exclusively by respiration with an exceptionally high voice quality, comparable to the existing gold standard of Tracheoesophageal (TE) voice prosthesis. Following PAL design closely as the reference, Pneumatic Bionic Voice Prostheses seem to have a strong potential to substitute the existing gold standard by generating a similar voice quality while remaining non-invasive and non-surgical. This paper designs the first Pneumatic Bionic Voice prosthesis and evaluates its onset and offset control against the PAL device through pre-clinical trials on one laryngectomy patient. The evaluation on a database of more than five hours of continuous/isolated speech recordings shows a close match between the onset/offset control of the Pneumatic Bionic Voice and the PAL with an accuracy of 98.45 ±0.54%. When implemented in real-time, the Pneumatic Bionic Voice prosthesis controller has an average onset/offset delay of 10 milliseconds compared to the PAL. Hence it addresses a major disadvantage of previous electronic voice prostheses, including myoelectric Bionic Voice, in meeting the short time-frames of controlling the onset/offset of the voice in continuous speech. PMID:29466455
Behroozmand, Roozbeh; Larson, Charles R
The motor-driven predictions about expected sensory feedback (efference copies) have been proposed to play an important role in recognition of sensory consequences of self-produced motor actions. In the auditory system, this effect was suggested to result in suppression of sensory neural responses to self-produced voices that are predicted by the efference copies during vocal production in comparison with passive listening to the playback of the identical self-vocalizations. In the present study, event-related potentials (ERPs) were recorded in response to upward pitch shift stimuli (PSS) with five different magnitudes (0, +50, +100, +200 and +400 cents) at voice onset during active vocal production and passive listening to the playback. Results indicated that the suppression of the N1 component during vocal production was largest for unaltered voice feedback (PSS: 0 cents), became smaller as the magnitude of PSS increased to 200 cents, and was almost completely eliminated in response to 400 cents stimuli. Findings of the present study suggest that the brain utilizes the motor predictions (efference copies) to determine the source of incoming stimuli and maximally suppresses the auditory responses to unaltered feedback of self-vocalizations. The reduction of suppression for 50, 100 and 200 cents and its elimination for 400 cents pitch-shifted voice auditory feedback support the idea that motor-driven suppression of voice feedback leads to distinctly different sensory neural processing of self vs. non-self vocalizations. This characteristic may enable the audio-vocal system to more effectively detect and correct for unexpected errors in the feedback of self-produced voice pitch compared with externally-generated sounds.
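As a worked illustration of the pitch-shift magnitudes quoted in this abstract: a cent is 1/1200 of an octave, so a shift of c cents multiplies the fundamental frequency by 2^(c/1200). The sketch below applies that relation to the five magnitudes used in the study. The 200 Hz baseline and the function name are assumptions chosen for illustration; they are not taken from the paper.

```python
def shift_frequency(f0_hz, cents):
    """Return the frequency obtained by shifting f0_hz upward by `cents`.

    Uses the standard definition: 1200 cents = one octave (a factor of 2).
    """
    return f0_hz * 2.0 ** (cents / 1200.0)


if __name__ == "__main__":
    baseline_hz = 200.0  # hypothetical voice fundamental, not from the study
    for cents in (0, 50, 100, 200, 400):  # magnitudes listed in the abstract
        shifted = shift_frequency(baseline_hz, cents)
        print(f"+{cents:3d} cents -> {shifted:.2f} Hz")
```

On this scale the largest stimulus (+400 cents) is a third of an octave, while +100 cents corresponds to one equal-tempered semitone.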
Dudley, James; Eames, Catrin; Mulligan, John; Fisher, Naomi
Developing compassion towards oneself has been linked to improvement in many areas of psychological well-being, including psychosis. Furthermore, developing a non-judgemental, accepting way of relating to voices is associated with lower levels of distress for people who hear voices. These factors have also been associated with secure attachment. This study explores associations between the constructs of mindfulness of voices, self-compassion, and distress from hearing voices and how secure attachment style related to each of these variables. Cross-sectional online. One hundred and twenty-eight people (73% female; M age = 37.5; 87.5% Caucasian) who currently hear voices completed the Self-Compassion Scale, Southampton Mindfulness of Voices Questionnaire, Relationships Questionnaire, and Hamilton Programme for Schizophrenia Voices Questionnaire. Results showed that mindfulness of voices mediated the relationship between self-compassion and severity of voices, and self-compassion mediated the relationship between mindfulness of voices and severity of voices. Self-compassion and mindfulness of voices were significantly positively correlated with each other and negatively correlated with distress and severity of voices. Mindful relation to voices and self-compassion are associated with reduced distress and severity of voices, which supports the proposed potential benefits of mindful relating to voices and self-compassion as therapeutic skills for people experiencing distress by voice hearing. Greater self-compassion and mindfulness of voices were significantly associated with less distress from voices. These findings support theory underlining compassionate mind training. Mindfulness of voices mediated the relationship between self-compassion and distress from voices, indicating a synergistic relationship between the constructs. Although the current findings do not give a direction of causation, consideration is given to the potential impact of mindful and
Park, Young Ho; Park, Kang Ryoung
On the basis of the increased emphasis placed on the protection of privacy, biometric recognition systems using physical or behavioural characteristics such as fingerprints, facial characteristics, iris and finger‐vein patterns or the voice have been introduced in applications including door access control, personal certification, Internet banking and ATM machines. Among these, finger‐vein recognition is advantageous in that it involves the use of inexpensive and small devices that are diffic...
We implement a biometric authentication system on the Android platform, which is based on text-dependent speaker recognition. The Android version used in the application is Android 4.0. The application makes use of the Modular Audio Recognition Framework, from which many of the algorithms are adapted in the processes of preprocessing and feature extraction. In addition, we employ the Dynamic Time Warping (DTW) algorithm for the comparison of different voice features. A training procedure is i...
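The comparison step mentioned in this abstract can be sketched with the textbook form of Dynamic Time Warping. This is a minimal illustration, not the authors' implementation: the feature vectors, the Euclidean local distance, the `accept` helper and its threshold are all assumptions, and the Modular Audio Recognition Framework is not used here.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping cost between two sequences of feature vectors.

    a, b: lists of equal-dimension numeric vectors (e.g. one vector per frame).
    Local distance is Euclidean; allowed steps are match, insertion, deletion.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # frame of a aligned again with same frame of b
                cost[i][j - 1],      # frame of b aligned again with same frame of a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def accept(enrolled, attempt, threshold=1.0):
    """Hypothetical decision rule: accept if the warped distance is small."""
    return dtw_distance(enrolled, attempt) <= threshold


if __name__ == "__main__":
    template = [[0.0], [1.0], [2.0]]
    slower = [[0.0], [1.0], [1.0], [2.0]]  # same contour, spoken more slowly
    print(dtw_distance(template, slower))
    print(accept(template, slower))
```

DTW suits text-dependent verification because the same passphrase spoken at a different pace still aligns with a low cost, whereas a frame-by-frame comparison would penalize the timing difference.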
Salturk, Ziya; Kumral, Tolgar Lutfi; Aydoğdu, Imran; Arslanoğlu, Ahmet; Berkiten, Güler; Yildirim, Güven; Uyar, Yavuz
To evaluate the psychological effects of dysphonia in voice professionals compared to non-voice professionals and in both genders. Cross-sectional analysis. Forty-eight voice professionals and 52 non-voice professionals with dysphonia were included in this study. All participants underwent a complete ear, nose, and throat examination and an evaluation for pathologies that might affect vocal quality. Participants were asked to complete the Turkish versions of the Voice Handicap Index-30 (VHI-30), Perceived Stress Scale (PSS), and the Hospital Anxiety and Depression Scale (HADS). HADS scores were evaluated as HADS-A (anxiety) and HADS-D (depression). Dysphonia status was evaluated perceptually using the grade, roughness, breathiness, asthenia, and strain (GRBAS) scale. The results were compared statistically. Significant differences between the two groups were evident when the VHI-30 and PSS data were compared (P = .00001 and P = .00001, respectively). However, neither HADS score (HADS-A and HADS-D) differed between groups. An analysis of the scores in terms of sex revealed that females had significantly higher PSS scores (P = .006). The GRBAS scale revealed no difference between groups (P = .819, .931, .803, .655, and .803, respectively). No between-sex differences in the VHI-30 or HADS scores were evident. We found that voice professionals and females experienced more stress and were more dissatisfied with their voices. 4. © 2015 The American Laryngological, Rhinological and Otological Society, Inc.
Bele, Irene Velsvik
This study focuses on speaking voice quality in male teachers (n = 35) and male actors (n = 36), who represent untrained and trained voice users, because we wanted to investigate normal and supranormal voices. In this study, both substantial and methodologic aspects were considered. It includes a method for perceptual voice evaluation, and a basic issue was rater reliability. A listening group of 10 listeners, 7 experienced speech-language therapists, and 3 speech-language therapist students evaluated the voices by 15 vocal characteristics using VA scales. Two sets of voice signals were investigated: text reading (2 loudness levels) and sustained vowel (3 levels). The results indicated a high interrater reliability for most perceptual characteristics. Connected speech was evaluated more reliably, especially at the normal level, but both types of voice signals were evaluated reliably, although the reliability for connected speech was somewhat higher than for vowels. Experienced listeners tended to be more consistent in their ratings than did the student raters. Some vocal characteristics achieved acceptable reliability even with a smaller panel of listeners. The perceptual characteristics grouped in 4 factors reflected perceptual dimensions.
The purpose of this article is to demonstrate and account for the weak emergence of 'voice' in the writing of students embarking upon their postgraduate studies in Geosciences. The two elements of 'voice' that are emphasised are 'voice' as style of expression and 'voice' as the ability to write distinctly, yet building upon ...
Lopes, Leonardo Wanderley; da Silva, Karoline Evangelista; da Silva Evangelista, Deyverson; Almeida, Anna Alice; Silva, Priscila Oliveira Costa; Lucero, Jorge; Behlau, Mara
To analyze the performance of a phonatory deviation diagram (PDD) in discriminating the presence and severity of voice deviation and the predominant voice quality of synthesized voices. A speech-language pathologist performed the auditory-perceptual analysis of the synthesized voice (n = 871). The PDD distribution of voice signals was analyzed according to area, quadrant, shape, and density. Differences in signal distribution regarding the PDD area and quadrant were detected when differentiating the signals with and without voice deviation and with different predominant voice quality. Differences in signal distribution were found in all PDD parameters as a function of the severity of voice disorder. The PDD area and quadrant can differentiate normal voices from deviant synthesized voices. There are differences in signal distribution in PDD area and quadrant as a function of the severity of voice disorder and the predominant voice quality. However, the PDD area and quadrant do not differentiate the signals as a function of severity of voice disorder and differentiated only the breathy and rough voices from the normal and strained voices. PDD density is able to differentiate only signals with moderate and severe deviation. PDD shape shows differences between signals with different severities of voice deviation. © 2018 S. Karger AG, Basel.
Speech recognition applications are known to require significant resources. Embedded speech recognition, however, allows only a few KB of memory, a few MIPS, and a small amount of training data. To fit the resource constraints of embedded applications, an approach based on a semicontinuous HMM system using state-independent acoustic modelling is proposed. A transformation is computed and applied to the global model to obtain each HMM state-dependent probability density function, so that only the transformation parameters need to be stored. This approach is evaluated on two tasks: digit and voice-command recognition. A fast adaptation technique for acoustic models is also proposed. To significantly reduce computational costs, the adaptation is performed only on the global model (using related speaker recognition adaptation techniques), with no need for state-dependent data. The whole approach results in a relative gain of more than 20% compared to a basic HMM-based system fitting the constraints.
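The storage idea in this abstract can be illustrated with the textbook semicontinuous form, in which every HMM state shares one global pool of Gaussians and differs only in a small per-state parameter set. The abstract does not say what the transformation is; the sketch below assumes it is a per-state vector of mixture weights, and all numbers, state names and function names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


# Global, state-independent model: one shared pool of (mean, variance) pairs.
# Values are made up for illustration.
GLOBAL_MODEL = [(-1.0, 1.0), (0.0, 1.0), (2.0, 0.5)]

# Assumed per-state "transformation": only a weight vector is stored per state,
# not a separate set of Gaussians.
STATE_WEIGHTS = {
    "s0": [0.7, 0.2, 0.1],
    "s1": [0.1, 0.3, 0.6],
}


def state_likelihood(state, x):
    """State-dependent density obtained by re-weighting the shared Gaussians."""
    densities = [gaussian_pdf(x, mean, var) for mean, var in GLOBAL_MODEL]
    return sum(w * d for w, d in zip(STATE_WEIGHTS[state], densities))


if __name__ == "__main__":
    for state in STATE_WEIGHTS:
        print(state, state_likelihood(state, 2.0))
```

With K shared Gaussians and S states, this layout stores K Gaussians plus S weight vectors instead of S separate mixtures, which is where the memory saving comes from under this assumption.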
Frank, Lisa A.
In December, 2011, over 800 people experienced the exhibit, :"der"//pattern for a virtual environment, created for the fully immersive CAVE™ at the University of Wisconsin-Madison. This exhibition took my nature-based photographic work and reinterpreted it for virtual reality (VR). Varied responses such as: "It's like a moment of joy," or "I had to see it twice," or "I'm still thinking about it weeks later" were common. Although an implied goal of my 2D artwork is to create a connection that makes viewers more aware of what it means to be a part of the natural world, these six VR environments opened up an unexpected area of inquiry that my 2D work has not. Even as the experience was mediated by machines, there was a softening at the interface between technology and human sensibility. Somehow, for some people, through the unlikely auspices of a computer-driven environment, the project spoke to a human essence that they connected with in a way that went beyond all expectations and felt completely out of my hands. Other interesting behaviors were noted: in some scenarios some spoke of intense anxiety, acrophobia, claustrophobia-even fear of death when the scene took them underground. These environments were believable enough to cause extreme responses and disorientation for some people; were fun, pleasant and wonder-filled for most; and were liberating, poetic and meditative for many others. The exhibition seemed to promote imaginative skills, creativity, emotional insight, and environmental sensitivity. It also revealed the CAVE™ to be a powerful tool that can encourage uniquely productive experiences. Quite by accident, I watched as these nature-based environments revealed and articulated an essential relationship between the human spirit and the physical world. The CAVE™ is certainly not a natural space, but there is clear potential to explore virtual environments as a path to better and deeper connections between people and nature. We've long associated contact
Husted, Mia; Lind, Unni
and restrictions, Call for aesthetics and sensuality, Longings for home and parents, Longings for better social relations Making children's voice visible allows preschool teachers to reflect children's knowledge and life world in pedagogical practice. Keywords: empowerment and participation, action research...... children to raise and render visible their own critique and wishes related to their everyday life in daycare. Research on how and why to engage children as participants in research and in institutional developments addresses overall interests in democratization and humanization that can be traced back...... to strategies for Nordic welfare developments and the Conventions on Children's Rights. The theoretical and methodological framework follows the lines of how to form and learn democracy of Lewin (1948) and Dewey (1916). The study is carried out as action research involving 50 children at age three to five...
Sörbom, Adrienne; Garsten, Christina
This paper departs from an interest in the involvement of business leaders in the sphere of politics, in the broad sense. Many global business leaders today do much more than engage narrowly in their own corporation and its search for profit. At a general level, we are seeing a proliferation...... as political. What is the role of business in the World Economic Forum, and how do business corporations advance their interests through the WEF? The results show that corporations find a strategically positioned amplifier for their non-market interests in the WEF. The WEF functions to enhance and gain...... leverage for their ideas and priorities in a highly selective and resourceful environment. In the long run, both the market priorities and the political interests of business may be served by engagement in the WEF. However, the WEF cannot only be conceived as the extended voice of corporations. The WEF...
Van der Hoven, Christopher; Michea, Adela; Varnes, Claus
, for example there are studies that have strongly criticized focus groups, interviews and surveys (e.g. Ulwick, 2002; Goffin et al, 2010; Sandberg, 2002). In particular, a point is made that, “…traditional market research and development approaches proved to be particularly ill-suited to breakthrough products...... the voice of the customer (VoC) through market research is well documented (Davis, 1993; Mullins and Sutherland, 1998; Cooper et al., 2002; Flint, 2002; Davilla et al., 2006; Cooper and Edgett, 2008; Cooper and Dreher, 2010; Goffin and Mitchell, 2010). However, not all research methods are well received......” (Deszca et al, 2010, p613). Therefore, in situations where traditional techniques - interviews and focus groups - are ineffective, the question is which market research techniques are appropriate, particularly for developing breakthrough products? To investigate this, an attempt was made to access...
Martin, Lisa A; Hassinger, Jane A; Debbink, Michelle; Harris, Lisa H
Researchers have described the difficulties of doing abortion work, including the psychosocial costs to individual providers. Some have discussed the self-censorship in which providers engage to protect themselves and the pro-choice movement. However, few have examined the costs of this self-censorship to public discourse and social movements in the US. Using qualitative data collected during abortion providers' discussions of their work, we explore the tensions between their narratives and pro-choice discourse, and examine the types of stories that are routinely silenced - narratives we name "dangertalk". Using these data, we theorize about the ways in which giving voice to these tensions might transform current abortion discourse by disrupting false dichotomies and better reflecting the complex realities of abortion. We present a conceptual model for dangertalk in abortion discourse, connecting it to functions of dangertalk in social movements more broadly. Copyright © 2017 Elsevier Ltd. All rights reserved.
Pedro Gilberto GOMES
Mediatization has increasingly become a key concept, essential for describing both the present and the history of the media and the communicative change now taking place. It has become part of a whole and cannot be seen as a separate sphere. From this perspective, mediatization is used as a concept to describe the expansion of the different technical means and to consider the interrelationships between communicative change, the media and sociocultural change. However, although many researchers use the concept of mediatization, each gives it the meaning that best suits their own needs. The concept is therefore treated with multiple voices. This paper discusses this problem and presents a preliminary position on the matter.
Bertram, Lars; Mullin, Kristina; Parkinson, Michele; Hsiao, Monica; Moscarillo, Thomas J; Wagner, Steven L; Becker, K David; Velicelebi, Gonul; Blacker, Deborah; Tanzi, Rudolph E
Background Recently, conflicting reports have been published on the potential role of genetic variants in the α‐T catenin gene (VR22; CTNNA3) on the risk for Alzheimer's disease. In these papers, evidence for association is mostly observed in multiplex families with Alzheimer's disease, whereas case–control samples of sporadic Alzheimer's disease are predominantly negative. Methods After sequencing VR22 in multiplex families with Alzheimer's disease linked to chromosome 10q21, we identified a novel non‐synonymous (Ser596Asn; rs4548513) single nucleotide polymorphism (SNP). This and four non‐coding SNPs were assessed in two independent samples of families with Alzheimer's disease, one with 1439 subjects from 437 multiplex families with Alzheimer's disease and the other with 489 subjects from 217 discordant sibships. Results A weak association with the Ser596Asn SNP in the multiplex sample, predominantly in families with late‐onset Alzheimer's disease (p = 0.02), was observed. However, this association does not seem to contribute substantially to the chromosome 10 Alzheimer's disease linkage signal that we and others have reported previously. No evidence was found of association with any of the four additional SNPs tested in the multiplex families with Alzheimer's disease. Finally, the Ser596Asn change was not associated with the risk for Alzheimer's disease in the independent discordant sibship sample. Conclusions This is the first study to report evidence of an association between a potentially functional, non‐synonymous SNP in VR22 and the risk for Alzheimer's disease. As the underlying effects are probably small, and are only seen in families with multiple affected members, the population‐wide significance of this finding remains to be determined. PMID:17209133
Park, S. Y.; Yoo, H. J.; Lee, M. S.; Hong, J. H.; Lee, Y. K.
In the Hadong fossil power plant simulator project (January 1998 to July 2000), KEPRI applied virtual reality to the simulator. To provide more efficient operator training, KEPRI further developed the virtual reality technology into VR-CATS (Virtual Reality Computer Assistance Training System), a web-based multimedia training system with virtual reality technologies, in the KNPEC-2 projects. By visualizing the nuclear power plant system with stereoscopic 3-D graphics, VR-CATS enables trainees to navigate the whole nuclear power plant, including high-radiation areas and other restricted areas. In addition, instructors can train the local operators to operate the local valves and other equipment in the local area of the plant. It aims at helping trainees understand system locations and system functions more easily. By reproducing the main control room with stereoscopic 3-D graphics and linking it with P&IDs, operating procedures, and plant components, virtual panels maximize training effects. During classroom training, the instructor can access the stand-by host computer of the simulator through a network. This enables the instructor to operate the simulator with only a soft panel. With the soft panel, the instructor can activate any malfunction that he wants to teach, show the trends of major parameters to the trainees, and discuss them with the trainees. This desktop simulator function helps trainees understand the basic symptoms of accidents. With CBT, operators can easily understand why some parameters are increasing or decreasing and what they should do to make the system stable. The VR-CATS for Uljin is equipped with a much stronger and higher-level virtual environment. First, all components of the virtual plant are linked with P&IDs, ISO drawings, and the engineering database. In addition, the virtual MCR provides a much more immersive environment with virtual reality equipment such as an HMD and data glove. Operators can also do collaboration work in the network through avatar, real
Iwata, Naoki; Fujiwara, Michitaka; Kodera, Yasuhiro; Tanaka, Chie; Ohashi, Norifumi; Nakayama, Goro; Koike, Masahiko; Nakao, Akimasa
Laparoscopic surgery requires fundamental skills peculiar to endoscopic procedures such as eye-hand coordination. Acquisition of such skills prior to performing actual surgery is highly desirable for favorable outcome. Virtual-reality simulators have been developed for both surgical training and assessment of performance. The aim of the current study is to show construct validity of a novel simulator, LapVR (Immersion Medical, San Jose, CA, USA), for Japanese surgeons and surgical residents. Forty-four subjects were divided into the following three groups according to their experience in laparoscopic surgery: 14 residents (RE) with no experience in laparoscopic surgery, 14 junior surgeons (JR) with little experience, and 16 experienced surgeons (EX). All subjects executed "essential task 1" programmed in the LapVR, which consists of six tasks, resulting in automatic measurement of 100 parameters indicating various aspects of laparoscopic skills. Time required for each task tended to be inversely correlated with experience in laparoscopic surgery. For the peg transfer skill, statistically significant differences were observed between EX and RE in three parameters, including total time and average time taken to complete the procedure and path length for the nondominant hand. For the cutting skill, similar differences were observed between EX and RE in total time, number of unsuccessful cutting attempts, and path length for the nondominant hand. According to the programmed comprehensive evaluation, performance in terms of successful completion of the task and actual experience of the participants in laparoscopic surgery correlated significantly for the peg transfer (P=0.007) and cutting skills (P=0.026). The peg transfer and cutting skills could best distinguish between EX and RE. This study is the first to provide evidence that LapVR has construct validity to discriminate between novice and experienced laparoscopic surgeons.
The rise of research and advocacy over the years to establish a disability voice in Australia with regard to bioethical issues is explored. This includes an analysis of some of the political processes and engagement in mainstream bioethical debate. An understanding of the politics of rejected knowledge is vital in understanding the muted disability voices in Australian bioethics and public policy. It is also suggested that the voices of those who are marginalised or oppressed in society, such as people with disability, have a particular contribution to make in fostering critical bioethics.
Harriet Mary Jessica Smith
This study addressed the effect of misleading post-event information (PEI) on voice ratings, identification accuracy, and confidence, as well as the link between verbal recall and accuracy. Participants listened to a dialogue between male and female targets, then read misleading information about voice pitch. Participants engaged in verbal recall, rated voices on a feature checklist, and made a lineup decision. Accuracy rates were low, especially on target-absent lineups. Confidence and accuracy were unrelated, but the number of facts recalled about the voice predicted later lineup accuracy. There was a main effect of misinformation on ratings of target voice pitch, but there was no effect on identification accuracy or confidence ratings. As voice lineup evidence from earwitnesses is used in courts, the findings have potential applied relevance.
Lotrecchiano, Gaetano R; Kane, Mary; Zocchi, Mark S; Gosa, Jessica; Lazar, Danielle; Pines, Jesse M
Purpose The purpose of this paper is to describe the use of group concept mapping (GCM) as a tool for developing a conceptual model of an episode of acute, unscheduled care from illness or injury to outcomes such as recovery, death and chronic illness. Design/methodology/approach After generating a literature review and drafting an initial conceptual model, GCM software (CS Global MAX™) is used to organize and identify strengths and directionality between concepts generated through feedback about the model from several stakeholder groups: acute care and non-acute care providers, patients, payers and policymakers. Through online and in-person population-specific focus groups, the GCM approach seeks feedback, assigned relationships and articulated priorities from participants to produce an output map that describes overarching concepts and relationships within and across subsamples. Findings A clustered concept map made up of relational data points that produced a taxonomy of feedback was used to update the model for use in soliciting additional feedback from two technical expert panels (TEPs), and finally, a public comment exercise was performed. The results were a stakeholder-informed improved model for an acute care episode, identified factors that influence process and outcomes, and policy recommendations, which were delivered to the Department of Health and Human Services' (DHHS) Assistant Secretary for Preparedness and Response. Practical implications This study provides an example of the value of cross-population multi-stakeholder input to increase voice in shared problem health stakeholder groups. Originality/value This paper provides GCM results and a visual analysis of the relational characteristics both within and across sub-populations involved in the study. It also provides an assessment of observational key factors supporting how different stakeholder voices can be integrated to inform model development and policy recommendations.
Paulmann, Silke; Uskul, Ayse K
This cross-cultural study of emotional tone of voice recognition tests the in-group advantage hypothesis (Elfenbein & Ambady, 2002) employing a quasi-balanced design. Individuals of Chinese and British background were asked to recognise pseudosentences produced by Chinese and British native speakers, displaying one of seven emotions (anger, disgust, fear, happy, neutral tone of voice, sad, and surprise). Findings reveal that emotional displays were recognised at rates higher than predicted by chance; however, members of each cultural group were more accurate in recognising the displays communicated by a member of their own cultural group than by a member of the other cultural group. Moreover, the evaluation of error matrices indicates that both culture groups relied on similar mechanisms when recognising emotional displays from the voice. Overall, the study reveals evidence for both universal and culture-specific principles in vocal emotion recognition.
The paper presents the utilisation of the VR-1 reactor for nuclear education and training at national and international level. The VR-1 reactor has been operated by the Czech Technical University since December 1990. It is a pool-type light water reactor based on enriched uranium (19.7% ²³⁵U) with a maximum thermal power of 1 kW, and up to 5 kW for short periods. The neutron moderator is light water, which also serves as reflector, biological shielding and coolant. Heat is removed from the core by natural convection. The pool arrangement of the reactor facilitates access to the core, the insertion and removal of various experimental samples and detectors, and easy and safe handling of fuel assemblies. The reactor core can contain from 17 to 21 IRT-4M fuel assemblies, depending on the geometric arrangement and the kind of experiments to be performed. The reactor is equipped with several experimental devices, e.g. horizontal, radial and tangential channels used to extract a neutron beam, a reactivity oscillator for dynamics studies, and a bubble boiling simulator. The reactor has been used very efficiently, especially for the education and training of university students and NPP specialists, for more than 18 years. The VR-1 reactor is utilised within various national and international activities such as the Czech Nuclear Education Network (CENEN), the European Nuclear Education Network and the Eastern European Research Reactor Initiative (EERRI). The reactor is well equipped for education and training not only through the experimental facility itself but also through the continuous development of training methods and improvement of educational experiments. The educational experiments can be combined into training courses attended by students according to their study specialization and knowledge level. The training programme is aimed at reactor and neutron physics, dosimetry, nuclear safety, and control of nuclear installations. Every year, approximately 250 university students undergo
Rissanen, Mikko J; Kume, Naoto; Kuroda, Yoshihiro; Kuroda, Tomohiro; Yoshimura, Koji; Yoshihara, Hiroyuki
Many VR technology based training systems use expert's motion data as the training aid, but would not provide any short-cut to teaching medical skills that do not depend on exact motions. Earlier we presented Annotated Simulation Records (ASRs), which can be used to encapsulate experts' insight on psychomotor skills. Annotations made to behavioural parameters in training simulators enable asynchronous teaching instead of just motion training in a proactive way to the learner. We evaluated ASRs for asynchronous teaching of Digital Rectal Examination (DRE) with 3 urologists and 8 medical students. The ASRs were found more effective than motion-based training with verbal feedback.
Kim, Jaebok; Park, Jeong-Sik
This paper proposes an efficient speech emotion recognition (SER) approach that utilizes personal voice data accumulated on personal devices. A representative weakness of conventional SER systems is the user-dependent performance induced by the speaker independent (SI) acoustic model framework. But,
Nadolski, Rob; Bahreini, Kiavash; Westera, Wim
This paper presentation describes how our FILTWAM software artifacts for face and voice emotion recognition will be used for assessing learners' progress and providing adequate feedback in an online game-based communication skills training. This constitutes an example of in-game assessment for
Bahreini, Kiavash; Nadolski, Rob; Westera, Wim
This paper describes how our FILTWAM software artifacts for face and voice emotion recognition will be used for assessing learners' progress and providing adequate feedback in an online game-based communication skills training. This constitutes an example of in-game assessment for mainly formative
Recently, several music information retrieval (MIR) systems which retrieve musical pieces by the user's singing voice have been developed. All of these systems use only melody information for retrieval, although lyrics information is also useful for retrieval. In this paper, we propose a new MIR system that uses both lyrics and melody information. First, we propose a new lyrics recognition method. A finite state automaton (FSA) is used as recognition grammar, and about 86% retrieval accuracy was obtained. We also develop an algorithm for verifying a hypothesis output by a lyrics recognizer. Melody information is extracted from an input song using several pieces of information of the hypothesis, and a total score is calculated from the recognition score and the verification score. From the experimental results, 95.0% retrieval accuracy was obtained with a query consisting of five words.
Rantala, Leena M; Hakala, Suvi; Holmqvist, Sofia; Sala, Eeva
The aim of the study was to investigate if voice ergonomic risk factors in classrooms correlated with acoustic parameters of teachers' voice production. The voice ergonomic risk factors in the fields of working culture, working postures and indoor air quality were assessed in 40 classrooms using the Voice Ergonomic Assessment in Work Environment - Handbook and Checklist. Teachers (32 females, 8 males) from the above-mentioned classrooms recorded text readings before and after a working day. Fundamental frequency, sound pressure level (SPL) and the slope of the spectrum (alpha ratio) were analyzed. The higher the number of the risk factors in the classrooms, the higher SPL the teachers used and the more strained the males' voices (increased alpha ratio) were. The SPL was already higher before the working day in the teachers with higher risk than in those with lower risk. In the working environment with many voice ergonomic risk factors, speakers increase voice loudness and use more strained voice quality (males). A practical implication of the results is that voice ergonomic assessments are needed in schools. Copyright © 2013 S. Karger AG, Basel.
Niebudek-Bogusz, Ewa; Kuzańska, Anna; Błoch, Piotr; Domańska, Maja; Woźnicka, Ewelina; Politański, Piotr; Sliwińska-Kowalska, Mariola
The aim of this study was to assess the applicability of the Voice Handicap Index (VHI) to the evaluation of the effectiveness of functional voice disorder treatment in teachers. The subjects were 45 female teachers with functional dysphonia who evaluated their voice problems according to the subjective VHI scale before and after phoniatric management. Group I (29 patients) were subjected to vocal training, whereas group II (16 patients) received only voice hygiene instructions. The results demonstrated that differences in the mean VHI score before and after phoniatric treatment were significantly higher in group I than in group II (p teacher's dysphonia.
Pelegrin Garcia, David; Lyberg-Åhlander, Viveka; Rydell, Roland
of the classroom. The results thus suggest that teachers with voice problems are more aware of classroom acoustic conditions than their healthy colleagues and make use of the more supportive rooms to lower their voice levels. This behavior may result from an adaptation process of the teachers with voice problems...... of the voice problems was made with a questionnaire and a laryngological examination. During teaching, the sound pressure level at the teacher’s position was monitored. The teacher’s voice level and the activity noise level were separated using mixed Gaussians. In addition, objective acoustic parameters...... of Reverberation Time and Voice Support were measured in the 30 empty classrooms of the study. An empirical model shows that the measured voice levels depended on the activity noise levels and the voice support. Teachers with and without voice problems were differently affected by the voice support...
... Aphasia ... Former Auctioneer Finds Voice After Aphasia: Speech impairment changed his life. One unremarkable September ... 10 Tips for Communicating with Someone Who Has Aphasia: Talk to them in a quiet, calm, relaxed ...
Vilas Bôas, C. S. N.; Gobara, S. T.
This article presents a device constructed with low-cost material to demonstrate and explain voice production. It also provides a contextualized, interdisciplinary approach to introduce the study of sound waves.
Vatne, Torun M; Finset, Arnstein; Ørnes, Knut; Ruland, Cornelia M
Adult patients present concerns as defined in the Verona Coding Definitions of Emotional Sequences (VR-CoDES), but we do not know how children express their concerns during medical consultations. This study aimed to evaluate the applicability of VR-CoDES to pediatric oncology consultations. Twenty-eight pediatric consultations were coded with the Verona Coding Definitions of Emotional Sequences (VR-CoDES), and the material was also qualitatively analyzed for descriptive purposes. Five consultations were randomly selected for reliability testing and descriptive statistics were computed. Perfect inter-rater reliability for concerns and moderate reliability for cues were obtained. Cues and/or concerns were present in over half of the consultations. Cues were more frequent than concerns, with the majority of cues being verbal hints to hidden concerns or non-verbal cues. Intensity of expressions, limitations in vocabulary, commonality of statements, and complexity of the setting complicated the use of VR-CoDES. Child-specific cues; use of the imperative, cues about past experiences, and use of onomatopoeia were observed. Children with cancer express concerns during medical consultations. VR-CoDES is a reliable tool for coding concerns in pediatric data sets. For future applications in pediatric settings an appendix should be developed to incorporate the child-specific traits. Copyright (c) 2010 Elsevier Ireland Ltd. All rights reserved.
Santos, Julio A. dos; Mól, Antônio C. de A.; Santo, André C. Do E. [Instituto de Engenharia Nuclear (IEN/CNEN-RJ), Rio de Janeiro, RJ (Brazil); Centro Universitário Carioca (UniCarioca), Rio de Janeiro, RJ (Brazil)]
Radioactive waste is any material resulting from human activity that contains elements emitting radiation that can pose risks to health and the environment. Such materials are therefore also highly hazardous to those who store radioactive waste in nuclear facilities. Virtual reality (VR), meanwhile, has been applied to a wide range of purposes, such as simulations for educational systems, military applications and various kinds of training. VR can be considered the junction of three basic principles: immersion, interaction and involvement. Based on these principles, this work aimed to develop a simulator of a nuclear tailings repository for mobile computing, with the Samsung Gear VR helmet as the interaction interface. The simulator was developed in the Unity 3D tool, and the elements that make up the scenario were built in the 3D MAX program. The work examined virtual reality in conjunction with the Gear VR to aid the sensation of immersion, together with the possibility of interaction through joysticks. The purpose was to provide greater insight into the operating environment. (author)
Song, Sub Lee; Lee, Byung Il; Park, Seong Jun; Lee, Dewhey; Park, Younwon [BEES Inc., Daejeon (Korea, Republic of)]
The Act on Physical Protection and Radiological Emergency (APPRE) was amended so that a nuclear licensee shall formulate a radiological emergency exercise plan as prescribed by Ordinance of the Prime Minister and execute the plan with the approval of the Nuclear Safety and Security Commission (NSSC). Radiological emergency exercises are currently conducted mainly in the field. A field exercise essentially requires the participation of a large population. Constraints on time, cost, communication and participation limit the effectiveness of field exercises. Public participants often mistake the exercise for a real event, which leads to conflicts. Furthermore, the exercise program is too idealized to reflect a real accident situation. From this point of view, the application of virtual reality (VR) technology is attracting attention for its many advantages, and it is expected to resolve these problems. Our research team is currently developing a VR-based radiological emergency exercise system. This paper introduces the advantages and actual applications of VR-based training. Building on those advantages and addressing the existing disadvantages, our VR-based radiological emergency exercise system will be developed. The system will adopt not only physically interactive features but also interactive, realistic scenarios that take failures into account. The ultimate goal of the system is the safe and complete evacuation of residents in the event of a radioactive accident.
... the efficacy of the barrier skin cream SERPACWA and the four listed decontamination products in the haired guinea pig model following exposure to VR (Russian VX, EA4243, Soviet V-gas) ...
The task for this project was to design, develop, test, and deploy a facial recognition system for the Kennedy Space Center Augmented/Virtual Reality Lab. This system will serve as a means of user authentication as part of the NUI of the lab. The overarching goal is to create a seamless user interface that will allow the user to initiate and interact with AR and VR experiences without ever needing to use a mouse or keyboard at any step in the process.
Zekveld, A.A.; Rudner, M.; Kramer, S.E.; Lyzenga, J.; Ronnberg, J.
We investigated changes in speech recognition and cognitive processing load due to the masking release attributable to decreasing similarity between target and masker speech. This was achieved by using masker voices with either the same (female) gender as the target speech or different gender (male)
In smart houses, contemporary achievements in the fields of automation, communications, security and artificial intelligence increase comfort and improve the quality of users' lives. For this thesis we developed a system for managing a smart house with voice commands via a smartphone. We focused mainly on voice commands. We want to move from communication with fingers (touch) to a more natural, human form of interaction: speech. We developed the entire chain of communication, by which t...
Akinbode, R; Lam, K B H; Ayres, J G; Sadhra, S
The prolonged use or abuse of voice may lead to vocal fatigue and vocal fold tissue damage. School teachers routinely use their voices intensively at work and are therefore at a higher risk of dysphonia. To determine the prevalence of voice disorders among primary school teachers in Lagos, Nigeria, and to explore associated risk factors. Teaching and non-teaching staff from 19 public and private primary schools completed a self-administered questionnaire to obtain information on personal lifestyles, work experience and environment, and voice disorder symptoms. Dysphonia was defined as the presence of at least one of the following: hoarseness, repetitive throat clearing, tired voice or straining to speak. A total of 341 teaching and 155 non-teaching staff participated. The prevalence of dysphonia in teachers was 42% compared with 18% in non-teaching staff. A significantly higher proportion of the teachers reported that voice symptoms had affected their ability to communicate effectively. School type (public/private) did not predict the presence of dysphonia. Statistically significant associations were found for regular caffeinated drink intake (odds ratio [OR] = 3.07; 95% confidence interval [CI]: 1.51-6.62), frequent upper respiratory tract infection (OR = 3.60; 95% CI: 1.39-9.33) and raised voice while teaching (OR = 10.1; 95% CI: 5.07-20.2). Nigerian primary school teachers were at risk for dysphonia. Important environment and personal factors were upper respiratory infection, the need to frequently raise the voice when teaching and regular intake of caffeinated drinks. Dysphonia was not associated with age or years of teaching. © The Author 2014. Published by Oxford University Press on behalf of the Society of Occupational Medicine. All rights reserved.
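As a reading aid for the odds ratios and confidence intervals quoted above, the following is a minimal sketch of how a crude (unadjusted) odds ratio and its Wald 95% interval are computed from a 2x2 table. The function name `odds_ratio_ci` is hypothetical, and the cell counts are an assumption: they are reconstructed by rounding the reported prevalences (42% of 341 teachers, 18% of 155 non-teaching staff) and are not figures taken from the study. The result is therefore not comparable to the risk-factor odds ratios reported in the abstract.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Assumed counts, reconstructed from rounded percentages (not from the paper):
# 143 of 341 teachers and 28 of 155 non-teaching staff with dysphonia.
teachers_yes, teachers_no = 143, 341 - 143
others_yes, others_no = 28, 155 - 28

crude_or, ci_low, ci_high = odds_ratio_ci(
    teachers_yes, teachers_no, others_yes, others_no
)
print(f"crude OR = {crude_or:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

An interval that excludes 1 is what "statistically significant association" refers to in this kind of analysis; adjusted estimates, such as those from a regression model, will generally differ from this crude value.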
Egger, Jan; Gall, Markus; Wallner, Jürgen; Boechat, Pedro; Hann, Alexander; Li, Xing; Chen, Xiaojun; Schmalstieg, Dieter
Virtual Reality, an immersive technology that replicates an environment via computer-simulated reality, gets a lot of attention in the entertainment industry. However, VR also has great potential in other areas, like the medical domain. Examples are intervention planning, training and simulation. This is especially of use in medical operations where an aesthetic outcome is important, such as facial surgeries. Alas, importing medical data into Virtual Reality devices is not necessarily trivial, in particular when a direct connection to a proprietary application is desired. Moreover, most researchers do not build their medical applications from scratch, but rather leverage platforms like MeVisLab, MITK, OsiriX or 3D Slicer. These platforms have in common that they use libraries like ITK and VTK, and provide a convenient graphical interface. However, ITK and VTK do not support Virtual Reality directly. In this study, the usage of a Virtual Reality device for medical data under the MeVisLab platform is presented. The OpenVR library is integrated into the MeVisLab platform, allowing direct and uncomplicated usage of the HTC Vive head-mounted display inside the MeVisLab platform. Medical data coming from other MeVisLab modules can be connected directly by drag-and-drop to the Virtual Reality module, rendering the data inside the HTC Vive for immersive virtual reality inspection.
Roy, S; Klinger, E; Légeron, P; Lauer, F; Chemin, I; Nugues, P
Social phobia is an anxiety disorder that is accessible to two forms of treatment yielding scientifically validated results: drugs and cognitive-behavioral therapies. Graded exposure to feared social situations is fundamental to obtaining an improvement of the anxious symptoms. Traditionally, exposure therapies are done either in vivo or by imagining the situations. In vivo exposure is sometimes difficult to control, and many patients have difficulty using imagination. Virtual reality (VR) seems to bring significant advantages. It allows exposure to numerous and varied situations. This paper reports the definition of a clinical protocol whose purpose is to assess the efficiency of a VR therapy compared to a CBT and to the absence of treatment for social phobic patients. It explains the diagnosis of the illness and its usual treatments. It sets out the full architecture of the study, the assessment tools, and the content and course of the therapy sessions. Finally, it reports first results of a clinical trial in a between-group design in 10 patients suffering from social phobia. The virtual environments used in the treatment reproduce four situations that social phobics find most threatening: performance, intimacy, scrutiny and assertiveness. With the help of the therapist, the patient learns adapted cognitions and behaviors with the aim of reducing her or his anxiety in the corresponding real situations. The novelty of our work is to address a group of situations that the phobic patient is most likely to experience and to treat patients according to a precise protocol.
Poeschl, Sandra; Doering, Nicola
Virtual training applications with high levels of immersion or fidelity (for example for social phobia treatment) produce high levels of presence and therefore belong to the most successful Virtual Reality developments. Whereas display and interaction fidelity (as sub-dimensions of immersion) and their influence on presence are well researched, realism of the displayed simulation depends on the specific application and is therefore difficult to measure. We propose to measure simulation realism by using a self-report questionnaire. The German VR Simulation Realism Scale for VR training applications was developed based on a translation of scene realism items from the Witmer-Singer Presence Questionnaire. Items for realism of virtual humans (for example for social phobia training applications) were supplemented. A sample of N = 151 students rated simulation realism of a Fear of Public Speaking application. Four factors were derived by item analysis and principal component analysis (Varimax rotation), representing Scene Realism, Audience Behavior, Audience Appearance and Sound Realism. The scale developed can be used as a starting point for future research and measurement of simulation realism for applications including virtual humans.
Stansfield, S.; Shawver, D.; Sobel, A.
This paper presents a prototype virtual reality (VR) system for training medical first responders. The initial application is to battlefield medicine and focuses on the training of medical corpsmen and other front-line personnel who might be called upon to provide emergency triage on the battlefield. The system is built upon Sandia's multi-user, distributed VR platform and provides an interactive, immersive simulation capability. The user is represented by an Avatar and is able to manipulate his virtual instruments and carry out medical procedures. A dynamic casualty simulation provides realistic cues to the patient's condition (e.g. changing blood pressure and pulse) and responds to the actions of the trainee (e.g. a change in the color of a patient's skin may result from a check of the capillary refill rate). The current casualty simulation is of an injury resulting in a tension pneumothorax. This casualty model was developed by the University of Pennsylvania and integrated into the Sandia MediSim system.
Doblack, Benjamin N; Allis, Tim; Dávila, Lilian P
The increasing development of computing (hardware and software) in the last decades has impacted scientific research in many fields including materials science, biology, chemistry and physics among many others. A new computational system for the accurate and fast simulation and 3D/VR visualization of nanostructures is presented here, using the open-source molecular dynamics (MD) computer program LAMMPS. This alternative computational method uses modern graphics processors, NVIDIA CUDA technology and specialized scientific codes to overcome processing speed barriers common to traditional computing methods. In conjunction with a virtual reality system used to model materials, this enhancement allows the addition of accelerated MD simulation capability. The motivation is to provide a novel research environment which simultaneously allows visualization, simulation, modeling and analysis. The research goal is to investigate the structure and properties of inorganic nanostructures (e.g., silica glass nanosprings) under different conditions using this innovative computational system. The work presented outlines a description of the 3D/VR Visualization System and basic components, an overview of important considerations such as the physical environment, details on the setup and use of the novel system, a general procedure for the accelerated MD enhancement, technical information, and relevant remarks. The impact of this work is the creation of a unique computational system combining nanoscale materials simulation, visualization and interactivity in a virtual environment, which is both a research and teaching instrument at UC Merced.
As qualitative research undertakings are not independent of the researcher, the “indissoluble interrelationship between interpreter and interpretation” (Thomas & James, 2006, p. 782) renders it necessary for researchers to understand that their text is a representation, a version of the truth that is the product of writerly choices, and that it is discursive. Endlessly creative, artistic and political, as there is no single interpretative truth, the interpretative process facilitates the refashioning of representations, the remaking of choices and the probing of discourses. As a consequence of the particularity of any researcher’s account, issues pertaining to researcher identity and authorial stance always remain central to research endeavours (Kamler & Thomson, 2006, p. 68; Denzin & Lincoln, 2011, pp. 14-15). Therefore, researchers are encouraged to be reflexive about their analyses and research accounts (Elliott, 2005, p. 152), as reflexivity helps spotlight the role of the researcher as narrator. In turn, spotlighting the researcher as narrator foregrounds a range of complex issues about voice, representation and interpretive authority (Chase, 2005, p. 657; Genishi & Glupczynski, 2006, p. 671; Eisenhart, 2006). In essence, therefore, this paper is reflective of the challenges of “doing” qualitative research in educational settings. Its particular focus, the shaping of beginning primary teachers’ identities in Ireland throughout the course of their initial year of occupational experience post-graduation, endeavours to highlight issues pertaining to the researcher as narrator (O’Sullivan, 2014).
O'Connor, Jillian J M; Re, Daniel E; Feinberg, David R
Sexual infidelity can be costly to members of both the extra-pair and the paired couple. Thus, detecting infidelity risk is potentially adaptive if it aids in avoiding cuckoldry or loss of parental and relationship investment. Among men, testosterone is inversely related to voice pitch, relationship and offspring investment, and is positively related to the pursuit of short-term relationships, including extra-pair sex. Among women, estrogen is positively related to voice pitch, attractiveness, and the likelihood of extra-pair involvement. Although prior work has demonstrated a positive relationship between men's testosterone levels and infidelity, this study is the first to investigate attributions of infidelity as a function of sexual dimorphism in male and female voices. We found that men attributed high infidelity risk to feminized women's voices, but not significantly more often than did women. Women attributed high infidelity risk to masculinized men's voices at significantly higher rates than did men. These data suggest that voice pitch is used as an indicator of sexual strategy in addition to underlying mate value. The aforementioned attributions may be adaptive if they prevent cuckoldry and/or loss of parental and relationship investment via avoidance of partners who may be more likely to be unfaithful.
Lee, Yune Sang; Peelle, Jonathan E; Kraemer, David; Lloyd, Samuel; Granger, Richard
Past neuroimaging studies have documented discrete regions of human temporal cortex that are more strongly activated by conspecific voice sounds than by nonvoice sounds. However, the mechanisms underlying this voice sensitivity remain unclear. In the present functional MRI study, we took a novel approach to examining voice sensitivity, in which we applied a signal detection paradigm to the assessment of multivariate pattern classification among several living and nonliving categories of auditory stimuli. Within this framework, voice sensitivity can be interpreted as a distinct neural representation of brain activity that correctly distinguishes human vocalizations from other auditory object categories. Across a series of auditory categorization tests, we found that bilateral superior and middle temporal cortex consistently exhibited robust sensitivity to human vocal sounds. Although the strongest categorization was in distinguishing human voice from other categories, subsets of these regions were also able to distinguish reliably between nonhuman categories, suggesting a general role in auditory object categorization. Our findings complement the current evidence of cortical sensitivity to human vocal sounds by revealing that the greatest sensitivity during categorization tasks is devoted to distinguishing voice from nonvoice categories within human temporal cortex. Copyright © 2015 the American Physiological Society.
The article deals with methods for measuring the quality of voice transmitted over the mobile network, as well as related problems, algorithms and options. It presents the voice quality measurement system that was created and discusses its adequacy and efficiency. The author also presents the results of applying the system under the optimal hardware configuration. Under almost ideal conditions, the system evaluates voice quality with an average MOS estimate of 3.85, while the standardized TEMS Investigation 9.0 gives an average MOS estimate of 4.05. Next, the article discusses the implementation of a voice quality predictor and investigates the predictor using nonlinear and linear methods for predicting the dependence of voice quality on the mobile network settings. Nonlinear prediction using an artificial neural network resulted in a correlation coefficient of 0.62, while linear prediction using the least mean squares method resulted in a correlation coefficient of 0.57. An analytical expression of voice quality as a function of three network parameters (BER, C/I, RSSI) is given as well. Article in Lithuanian.
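To make the linear predictor and its correlation coefficient concrete, the following minimal sketch fits a one-variable least-squares line and scores it with Pearson's r. It is an illustration only: it assumes that "least mean squares" refers to an ordinary least-squares fit, it uses a single predictor (RSSI) rather than the three parameters named in the abstract, and the sample values and the function names `fit_line` and `pearson` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical measurements (not from the article): RSSI in dBm and measured MOS.
rssi = [-100, -95, -90, -85, -80, -75]
mos = [2.1, 2.9, 3.0, 3.6, 3.5, 4.0]

slope, intercept = fit_line(rssi, mos)
predicted = [slope * x + intercept for x in rssi]

# Correlation between predicted and measured MOS, the kind of figure
# the abstract reports for its predictors.
r = pearson(predicted, mos)
print(f"slope={slope:.4f}, intercept={intercept:.4f}, r={r:.3f}")
```

A multi-parameter predictor would replace `fit_line` with a multiple regression; the scoring step stays the same.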
Schiller, Isabel S; Morsomme, Dominique; Remacle, Angélique
This study aimed (1) to investigate music theory teachers' professional and extra-professional vocal loading and background noise exposure, (2) to determine the correlation between vocal loading and background noise, and (3) to determine the correlation between vocal loading and self-evaluation data. Using voice dosimetry, 13 music theory teachers were monitored for one workweek. The parameters analyzed were voice sound pressure level (SPL), fundamental frequency (F0), phonation time, vocal loading index (VLI), and noise SPL. Spearman correlation was used to correlate vocal loading parameters (voice SPL, F0, and phonation time) and noise SPL. Each day, the subjects self-assessed their voice using visual analog scales. VLI and self-evaluation data were correlated using Spearman correlation. Vocal loading parameters and noise SPL were significantly higher in the professional than in the extra-professional environment. Voice SPL, phonation time, and female subjects' F0 correlated positively with noise SPL. VLI correlated with self-assessed voice quality, vocal fatigue, and amount of singing and speaking voice produced. Teaching music theory is a profession with high vocal demands. More background noise is associated with increased vocal loading and may indirectly increase the risk for voice disorders. Correlations between VLI and self-assessments suggest that these teachers are well aware of their vocal demands and feel their effect on voice quality and vocal fatigue. Visual analog scales seem to represent a useful tool for subjective vocal loading assessment and associated symptoms in these professional voice users. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
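The Spearman correlation used in this study is a Pearson correlation computed on ranks. The sketch below shows that computation with average ranks for ties; the function names and the sample noise and voice levels are hypothetical and are not taken from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x = sum(rx) / n
    mean_y = sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily values: noise SPL and voice SPL in dB.
noise_spl = [62.0, 65.5, 70.1, 68.3, 74.0]
voice_spl = [71.2, 72.0, 76.4, 75.9, 80.3]
print(spearman(noise_spl, voice_spl))
```

Because only ranks matter, any monotonic relationship between the two variables yields a coefficient of 1 or -1, whether or not it is linear.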
Wolfe, Jace; Morais, Mila; Schafer, Erin; Agrawal, Smita; Koch, Dawn
Cochlear implant recipients often experience difficulty with understanding speech in the presence of noise. Cochlear implant manufacturers have developed sound processing algorithms designed to improve speech recognition in noise, and research has shown these technologies to be effective. Remote microphone technology utilizing adaptive, digital wireless radio transmission has also been shown to provide significant improvement in speech recognition in noise. There are no studies examining the potential improvement in speech recognition in noise when these two technologies are used simultaneously. The goal of this study was to evaluate the potential benefits and limitations associated with the simultaneous use of a sound processing algorithm designed to improve performance in noise (Advanced Bionics ClearVoice) and a remote microphone system that incorporates adaptive, digital wireless radio transmission (Phonak Roger). A two-by-two repeated measures design was used to compare performance without these technologies, with each technology used separately, and with both technologies used simultaneously. The participants were eleven Advanced Bionics (AB) cochlear implant recipients, aged 11 to 68 years. AzBio sentence recognition was measured in quiet and in the presence of classroom noise ranging in level from 50 to 80 dBA in 5-dB steps. Performance was evaluated in four conditions: (1) no ClearVoice and no Roger, (2) ClearVoice enabled without the use of Roger, (3) ClearVoice disabled with Roger enabled, and (4) simultaneous use of ClearVoice and Roger. Speech recognition in quiet was better than speech recognition in noise for all conditions. Use of ClearVoice and Roger each provided significant improvement in speech recognition in noise. The best performance in noise was obtained with the simultaneous use of ClearVoice and Roger. ClearVoice and Roger technology each improve speech recognition in noise, particularly when used at the same time.
Arndt, H.; Stockheim, D.; Mutze, S.; Petersein, J.; Gregor, P.; Hamm, B.
Purpose: Applicability and benefits of digital speech recognition in diagnostic radiology were tested using the speech recognition system SP 6000. Methods: The speech recognition system SP 6000 was integrated into the network of the institute and connected to the existing Radiological Information System (RIS). Three subjects used this system for writing 2305 findings from dictation. After the recognition process the date, length of dictation, time required for checking/correction, kind of examination and error rate were recorded for every dictation. With the same subjects, a comparison was made with 625 conventionally written findings. Results: After a 1-hour initial training the average error rates were 8.4 to 13.3%. The first adaptation of the speech recognition system (after nine days) decreased the average error rates to 2.4 to 10.7% due to the ability of the program to learn. The second and third adaptations resulted only in small changes of the error rate. An individual comparison of the error rate developments in the same kind of investigation showed that the error rate is relatively independent of the individual user. Conclusion: The results show that the speech recognition system SP 6000 can be regarded as an advantageous alternative for quickly recording radiological findings. A comparison between manually writing and dictating the findings confirms the individual differences in writing speed and shows the advantage of voice recognition over normal keyboard performance. (orig.) [de]
Belgacem, H.; Cherif, A.
One of the biggest challenges in vocal transformation with the TD-PSOLA technique is selecting the modification parameters that yield a successful speech resynthesis. The best selection methods rely on human raters. This study focuses on the automatic determination of pathological voice transformation coefficients using an artificial neural network, comparing the results with previous manual work. Four characteristic parameters (RATA-PLP, Jitter, Shimmer and RAP) were chosen. The system is developed with supervised training and consists of a recognition stage (neural network) followed by a synthesis stage (TD-PSOLA). The experimental results show that the parameter sets selected by the proposed system can be used successfully for resynthesis, demonstrating that the system can assist in the transformation of pathological voices.
The ability to effectively respond to emotional information carried in the human voice plays a pivotal role for social interactions. We examined how genetic factors, especially the serotonin transporter genetic variation (5-HTTLPR), affect the neurodynamics of emotional voice processing in infants and adults by measuring event-related brain potentials (ERPs). The results revealed that infants distinguish between emotions during an early perceptual processing stage, whereas adults recognize and evaluate the meaning of emotions during later semantic processing stages. While infants do discriminate between emotions, only in adults was genetic variation associated with neurophysiological differences in how positive and negative emotions are processed in the brain. This suggests that genetic association with neurocognitive functions emerges during development, emphasizing the role that variation in serotonin plays in the maturation of brain systems involved in emotion recognition.
Jain, Anil K
This report describes research efforts towards developing algorithms for a robust face recognition system to overcome many of the limitations found in existing two-dimensional facial recognition systems...
K.C., Santosh; Wendling, Laurent
The chapter focuses on one of the key issues in document image processing, i.e., graphical symbol recognition. Graphical symbol recognition is a sub-field of a larger research domain: pattern recognition. The chapter covers several approaches (i.e., statistical, structural and syntactic) and specially designed symbol recognition techniques inspired by real-world industrial problems. It, in general, contains research problems, state-of-the-art methods that convey basic s...
Sprecher, Alicia; Olszewski, Aleksandra; Jiang, Jack J; Zhang, Yu
The addition of a fourth type of voice to Titze's voice classification scheme is proposed. This fourth voice type is characterized by primarily stochastic noise behavior and is therefore unsuitable for both perturbation and correlation dimension analysis. Forty voice samples were classified into the proposed four types using narrowband spectrograms. Acoustic, perceptual, and correlation dimension analyses were completed for all voice samples. Perturbation measures tended to increase with voice type. Based on reliability cutoffs, the type 1 and type 2 voices were considered suitable for perturbation analysis. Measures of unreliability were higher for type 3 and 4 voices. Correlation dimension analyses increased significantly with signal type as indicated by a one-way analysis of variance. Notably, correlation dimension analysis could not quantify the type 4 voices. The proposed fourth voice type represents a subset of voices dominated by noise behavior. Current measures capable of evaluating type 4 voices provide only qualitative data (spectrograms, perceptual analysis, and an infinite correlation dimension). Type 4 voices are highly complex and the development of objective measures capable of analyzing these voices remains a topic of future investigation.
Yazhou Jin,* Zhiqi Mao,* Zhipei Ling, Xin Xu, Zhiyuan Zhang, Xinguang Yu. Department of Neurosurgery, People’s Liberation Army General Hospital, Beijing, People’s Republic of China. *These authors contributed equally to this work.
Background: Parkinson’s disease (PD) patients exhibit deficits in emotional recognition and expression abilities, including emotional faces and voices. The aim of this study was to explore emotional processing in pre-deep brain stimulation (pre-DBS) PD patients using two sensory modalities (visual and auditory). Methods: Fifteen PD patients who needed DBS surgery and 15 healthy, age- and gender-matched controls were recruited as participants. All participants were assessed by the Karolinska Directed Emotional Faces database 50 Faces Recognition test. Vocal recognition was evaluated by the Montreal Affective Voices database 50 Voices Recognition test. For emotional facial expression, the participants were asked to imitate five basic emotions (neutral, happiness, anger, fear, and sadness). The subjects were required to express nonverbal vocalizations of the five basic emotions. Fifteen Chinese native speakers were recruited as decoders. We recorded the accuracy of the responses, reaction time, and confidence level. Results: For emotional recognition and expression, the PD group scored lower on both facial and vocal emotional processing than did the healthy control group. There were significant differences between the two groups in both reaction time and confidence level. A significant relationship was also found between emotional recognition and emotional expression when considering all participants between the two groups together. Conclusion: The PD group exhibited poorer performance on both the recognition and expression tasks. Facial emotion deficits and vocal emotion abnormalities were associated with each other. In addition, our data allow us to speculate that emotional recognition and expression may share a common
Kuechler, Erich R; Giese, Timothy J; York, Darrin M
To better represent the solvation effects observed along reaction pathways, and of ionic species in general, a charge-dependent variable-radii smooth conductor-like screening model (VR-SCOSMO) is developed. This model is implemented and parameterized with a third order density-functional tight binding quantum model, DFTB3/3OB-OPhyd, a quantum method which was developed for organic and biological compounds, utilizing a specific parameterization for phosphate hydrolysis reactions. Unlike most other applications with the DFTB3/3OB model, an auxiliary set of atomic multipoles is constructed from the underlying DFTB3 density matrix which is used to interact the solute with the solvent response surface. The resulting method is variational, produces smooth energies, and has analytic gradients. As a baseline, a conventional SCOSMO model with fixed radii is also parameterized. The SCOSMO and VR-SCOSMO models shown have comparable accuracy in reproducing neutral-molecule absolute solvation free energies; however, the VR-SCOSMO model is shown to reduce the mean unsigned errors (MUEs) of ionic compounds by half (about 2-3 kcal/mol). The VR-SCOSMO model presents similar accuracy as a charge-dependent Poisson-Boltzmann model introduced by Hou et al. [J. Chem. Theory Comput. 6, 2303 (2010)]. VR-SCOSMO is then used to examine the hydrolysis of trimethylphosphate and seven other phosphoryl transesterification reactions with different leaving groups. Two-dimensional energy landscapes are constructed for these reactions and calculated barriers are compared to those obtained from ab initio polarizable continuum calculations and experiment. Results of the VR-SCOSMO model are in good agreement in both cases, capturing the rate-limiting reaction barrier and the nature of the transition state.
Niebudek-Bogusz, Ewa; Fiszer, Marta; Kotylo, Piotr; Sliwinska-Kowalska, Mariola
It has been shown that teachers are at risk of developing occupational dysphonia, which accounts for over 25% of all occupational diseases diagnosed in Poland. The most frequently used method of diagnosing voice diseases is videostroboscopy. However, to facilitate objective evaluation of voice efficiency as well as medical certification of occupational voice disorders, it is crucial to implement quantitative methods of voice assessment, particularly voice acoustic analysis. The aim of the study was to assess the results of acoustic analysis in 66 female teachers (aged 40-64 years), including 35 subjects with occupational voice pathologies (e.g., vocal nodules) and 31 subjects with functional dysphonia. The acoustic analysis was performed using the IRIS software, before and after a 30-minute vocal loading test. All participants were subjected also to laryngological and videostroboscopic examinations. After the vocal effort, the acoustic parameters displayed statistically significant abnormalities, mostly lowered fundamental frequency (Fo) and incorrect values of shimmer and noise to harmonic ratio. To conclude, quantitative voice acoustic analysis using the IRIS software seems to be an effective complement to voice examinations, which is particularly helpful in diagnosing occupational dysphonia.
van der Torn, M.; van Gogh, C.D.L.; Verdonck-de Leeuw, I M; Festen, J.M.; Mahieu, H.F.
OBJECTIVE: To analyse the cause of failing voice production by a sound-producing voice prosthesis (SPVP). METHODS: The functioning of a prototype SPVP is described in a female laryngectomee before and after its sound-producing mechanism was impeded by tracheal phlegm. This assessment included:
Bruna Pasqualini Genro
VR1 vanilloid receptors are present in large quantities in the peripheral nervous system (PNS) and have been widely studied as integrators of noxious stimuli. The detection of this vanilloid system in the central nervous system (CNS) as well raises the question of what the physiological role of the VR1 receptors located in the brain might be. In the present study, we address the function of these receptors in the hippocampus, a structure essential for the formation of aversive memories. The study examined...
If VR-based medical training and assessment is to improve patient care and safety (i.e. a genuine health gain), it has to be based on clinically relevant measurement of performance. Metrics on errors are particularly useful for capturing and correcting undesired behaviors before they occur in the operating room. However, translating clinically relevant metrics and errors into meaningful system design is a challenging process. This paper discusses how an existing task and error analysis was translated into the system design of a VR-based training and assessment environment for Ultrasound Guided Regional Anesthesia (UGRA).
Rothenberg, Martin; Schutte, Harm K
In 1985, at a conference sponsored by the National Institutes of Health, Martin Rothenberg first described a form of nonlinear source-tract acoustic interaction that some sopranos, singing in their high range, can use to reduce the total airflow, hold a note longer, and simultaneously enrich the quality of the voice without straining it. (M. Rothenberg, "Source-Tract Acoustic Interaction in the Soprano Voice and Implications for Vocal Efficiency," Fourth International Conference on Vocal Fold Physiology, New Haven, Connecticut, June 3-6, 1985.) In this paper, we describe additional evidence for this type of nonlinear source-tract interaction in some soprano singing and describe an analogous interaction phenomenon in communication engineering. We also present some implications for voice research and pedagogy. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Meltzner, Geoffrey S; Heaton, James T; Deng, Yunbin; De Luca, Gianluca; Roy, Serge H; Kline, Joshua C
Each year thousands of individuals require surgical removal of their larynx (voice box) due to trauma or disease, and thereby require an alternative voice source or assistive device to verbally communicate. Although natural voice is lost after laryngectomy, most muscles controlling speech articulation remain intact. Surface electromyographic (sEMG) activity of speech musculature can be recorded from the neck and face, and used for automatic speech recognition to provide speech-to-text or synthesized speech as an alternative means of communication. This is true even when speech is mouthed or spoken in a silent (subvocal) manner, making it an appropriate communication platform after laryngectomy. In this study, 8 individuals at least 6 months after total laryngectomy were recorded using 8 sEMG sensors on their face (4) and neck (4) while reading phrases constructed from a 2,500-word vocabulary. A unique set of phrases were used for training phoneme-based recognition models for each of the 39 commonly used phonemes in English, and the remaining phrases were used for testing word recognition of the models based on phoneme identification from running speech. Word error rates were on average 10.3% for the full 8-sensor set (averaging 9.5% for the top 4 participants), and 13.6% when reducing the sensor set to 4 locations per individual (n=7). This study provides a compelling proof-of-concept for sEMG-based alaryngeal speech recognition, with the strong potential to further improve recognition performance.
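The word error rates quoted above are conventionally computed as the word-level edit distance between the reference and the recognized text, divided by the number of reference words. The sketch below shows that standard calculation; the function name and the example sentences are illustrative assumptions and are not taken from the study's evaluation code.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over six words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, the rate can exceed 100% when the hypothesis is much longer than the reference.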
The paper describes an FPGA-based implementation of a Lithuanian isolated word recognition algorithm. An FPGA was selected for parallel process implementation using VHDL to ensure fast signal processing at a low clock rate. Cepstrum analysis was applied to extract features from the voice signal. The dynamic time warping algorithm was used to compare the vectors of cepstrum coefficients. A library of features for 100 words was created and stored in the internal FPGA BRAM memory. Experimental testing with speaker-dependent records demonstrated a recognition rate of 94%. A recognition rate of 58% was achieved for speaker-independent records. Calculation of cepstrum coefficients took 8.52 ms at a 50 MHz clock, while 100 DTWs took 66.56 ms at a 25 MHz clock. Article in Lithuanian.
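The template-matching step described here, in which dynamic time warping (DTW) compares sequences of cepstrum vectors, can be sketched in software as follows. This is a plain textbook DTW with a Euclidean local distance, not the paper's VHDL design; the function names, the toy two-dimensional "cepstrum" vectors and the template labels are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = local + min(cost[i - 1][j],      # stretch b
                                     cost[i][j - 1],      # stretch a
                                     cost[i - 1][j - 1])  # step both
    return cost[n][m]


def recognize(features, templates):
    """Return the label of the stored template with the smallest DTW cost."""
    return min(templates, key=lambda word: dtw_distance(features, templates[word]))


# Toy library: each word is a short sequence of 2-D feature vectors.
templates = {
    "word_a": [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0)],
    "word_b": [(2.0, 2.0), (1.0, 2.5), (0.0, 3.0)],
}

# A slower utterance of "word_a": same shape, repeated frames.
utterance = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.5), (1.0, 0.5), (2.0, 1.0)]
print(recognize(utterance, templates))
```

Each template comparison is independent of the others, which is what makes the 100-template search a natural fit for the parallel hardware the paper targets.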
El Chafei, Cherif
This study describes a system for automatic speaker recognition using the pitch of the voice. The pre-processing consists of extracting the speakers' discriminating characteristics from the pitch. The recognition programme first performs a preselection and then calculates the distance between the characteristics of the speaker to be recognized and those of the speakers already recorded. A recognition experiment was carried out with 15 speakers and included 566 tests spread over an intermittent period of four months. The discriminating characteristics used offer several interesting qualities. The algorithms for measuring the characteristics on the one hand, and for classifying the speakers on the other, are simple. The results obtained in real time with a minicomputer are satisfactory. Furthermore, they could probably be improved by considering other discriminating characteristics of the speakers, but this was unfortunately not possible in this study. (author) [fr]
History of present illness: An 80-year-old female with a history of Crohn’s disease presented to the emergency department with chest pain. She had two weeks of exertional chest pain that preceded an episode of chest pain immediately prior to arrival associated with diaphoresis. Her pain nearly completely resolved with sublingual nitroglycerin provided by pre-hospital personnel. She was hemodynamically stable with normal vital signs on arrival. An ECG was immediately obtained. Significant findings: The ECG shows ST-segment depressions in precordial leads V3 through V6, and limb leads I, II, and aVL, and 1 mm of ST-segment elevation in aVR. The initial troponin I was elevated at 1.37 ng/mL (upper limit of normal 0.40). Cardiology decided to delay catheterization until the next day, when diffuse coronary disease was discovered (including 90% left circumflex stenosis, 60% proximal and 75% mid-left anterior descending stenosis, 75% third diagonal branch stenosis, and 90% posterior descending artery stenosis). The following day, the patient went to the operating room for coronary artery bypass grafting (CABG). Discussion: Traditionally, lead aVR has not received attention when interpreting acutely ischemic changes on ECG, leading some to refer to it as “the forgotten lead.”1 Current guidelines acknowledge the significance of multilead ST depression with coexistent ST elevation in aVR, and this pattern has been identified as the strongest predictor of severe left main coronary artery and/or 3-vessel disease (LM/3VD).2-3 When this ECG pattern is recognized in patients with ischemic symptoms, the emergency physician should involve cardiology early. When managing patients with suspected LM/3VD, it is important to withhold dual anti-platelet therapy as CABG is likely to be indicated,1,3 and guidelines recommend discontinuing P2Y12 inhibitors like clopidogrel or ticagrelor at least 24 hours prior to urgent CABG.2
Kraljevski, Ivan; Tan, Zheng-Hua; Paola Bissiri, Maria
This present paper aims to answer the question whether forced-alignment speech recognition can be used as an alternative to humans in generating reference Voice Activity Detection (VAD) transcriptions. An investigation of the level of agreement between automatic/manual VAD transcriptions and the reference ones produced by a human expert was carried out. Thereafter, statistical analysis was employed on the automatically produced and the collected manual transcriptions. Experimental results confirmed that forced-alignment speech recognition can provide accurate and consistent VAD labels.
Speech recognition and synthesis technologies have become commercially viable over recent years. Two current market leading products in speech recognition technology are Dragon NaturallySpeaking and IBM ViaVoice. This report describes the development of speech user interfaces incorporating these products with Lotus Notes and Java applications. These interfaces enable data entry using speech recognition and allow warnings and instructions to be issued via speech synthesis. The development of a military vocabulary to improve user interaction is discussed. The report also describes an evaluation in terms of speed of the various speech user interfaces developed using Dragon NaturallySpeaking and IBM ViaVoice with a Lotus Notes Command and Control Support System Log database.
Recognition and toleration are ways of relating to the diversity characteristic of multicultural societies. The article concerns the possible meanings of toleration and recognition, and the conflict that is often claimed to exist between these two approaches to diversity. Different forms or interpretations of recognition and toleration are considered, confusing and problematic uses of the terms are noted, and the compatibility of toleration and recognition is discussed. The article argues that there is a range of legitimate and importantly different conceptions of both toleration and recognition...
The results of 12 years' efforts devoted to the construction of the VR-1 "Vrabec" training reactor at the Faculty of Nuclear Science and Physical Engineering, Czech Technical University in Prague and to establishing the training reactor department, as well as the contribution of the training reactor facility to the teaching and scientific activities of the Faculty are presented in a comprehensive manner. The thesis is divided into 2 parts: (i) preconditions, reactor construction and commissioning, and constituting the reactor department, and (ii) basic and comprehensive information concerning the current utilization of the reactor for the benefit of students from various university level institutions. The prospects of scientific activities of the department are also outlined. Attention is paid to selected nuclear safety aspects of the reactor during operation and teaching of students, as well as to its innovated digital control system whose implementation is planned. The results achieved are compared with the initial goals and with similar experience abroad. (P.A.)
Wei, Riyu; Wu, Heng
An interactive genetic algorithm (IGA) framework for the design of support schemes for deep excavations is proposed in this paper, in which virtual reality (VR) is used as an aid to the interactive evaluation of design schemes. The fitness of a scheme individual is evaluated in two steps. First, a fitness value is automatically assigned to a scheme individual according to its estimated construction cost. Human evaluation is then introduced to modify the fitness value by taking into account other factors, such as feasibility. The design scheme is composed of four basic categories, i.e., cantilever walls, reinforced soil walls, tieback systems and bracing systems, each of which is encoded by a binary string. To assist human evaluation, 3D models of design schemes are created and visualized in a virtual reality environment, providing designers with a realistic sense of the various schemes.
Krogh, Ellen; Piekut, Anke
This paper investigates issues of voice and narrative in L1 writing. Three branches of research are initially discussed: research on narratives as resources for identity work, research on writer identity and voice as an essential aspect of identity, and research on Bildung in L1 writing. Subsequ... in lower secondary L1, she found that her previous writing strategies were not rewarded in upper secondary school. In the second empirical study, two upper-secondary exam papers are investigated, with a focus on their approaches to exam genres and their use of narrative resources to address issues... training of voice and narratives as a resource for academic writing, and that the Bildung potential of L1 writing may be tied to this issue.
1292.2...; requests for recognition; withdrawal of recognition; accreditation of representatives; roster. (a) Qualifications of organizations. A non-profit religious, charitable, social service, or similar organization...
Lungaro, Pietro; Sjoberg, Rickard; Valero, Alfredo Jose Fanghella; Mittal, Ashutosh; Tollmar, Konrad
This paper presents a novel approach to content delivery for video streaming services. It exploits information from connected eye-trackers embedded in the next generation of VR Head Mounted Displays (HMDs). The proposed solution aims to deliver high visual quality, in real time, around the users' fixation points while lowering the quality everywhere else. The goal of the proposed approach is to substantially reduce the overall bandwidth requirements for supporting VR video experiences while delivering high levels of user-perceived quality. The prerequisites to achieve these results are: (1) mechanisms that can cope with different degrees of latency in the system and (2) solutions that support fast adaptation of video quality in different parts of a frame, without requiring a large increase in bitrate. A novel codec configuration, capable of supporting near-instantaneous video quality adaptation in specific portions of a video frame, is presented. The proposed method exploits in-built properties of HEVC encoders and, while it introduces a moderate amount of error, these errors are undetectable by users. Fast adaptation is the key to enabling gaze-aware streaming and its reduction in bandwidth. A testbed implementing gaze-aware streaming, together with a prototype HMD with an in-built eye tracker, is presented and was used for testing with real users. The studies quantified the bandwidth savings achievable by the proposed approach and characterized the relationships between Quality of Experience (QoE) and network latency. The results showed that up to 83% less bandwidth is required to deliver high QoE levels to the users, as compared to conventional solutions.
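The core idea of delivering high quality around the fixation point and lower quality elsewhere can be illustrated with a per-tile quality map. The sketch below is an assumption-laden illustration, not the paper's codec configuration: the tile grid, the circular foveal region, the function name `tile_qualities` and the two quantization parameter (QP) values are all hypothetical, chosen only to show the selection logic (in HEVC, a lower QP means higher quality).

```python
import math


def tile_qualities(cols, rows, gaze_x, gaze_y, radius, near_qp=22, far_qp=40):
    """Assign a QP to each tile of a frame based on distance from the gaze point.

    Coordinates are normalized to [0, 1] in both axes. Tiles whose centre lies
    within `radius` of the fixation point get `near_qp` (higher quality); all
    other tiles get `far_qp` (lower quality, fewer bits).
    """
    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            centre_x = (c + 0.5) / cols
            centre_y = (r + 0.5) / rows
            near = math.hypot(centre_x - gaze_x, centre_y - gaze_y) <= radius
            row.append(near_qp if near else far_qp)
        grid.append(row)
    return grid


# Hypothetical 4x2 tiling with the user fixating the top-left tile.
for row in tile_qualities(cols=4, rows=2, gaze_x=0.125, gaze_y=0.25, radius=0.2):
    print(row)
```

A practical system would also need to widen the high-quality region as network latency grows, since the gaze may have moved by the time the adapted tiles arrive; that trade-off is what the paper's latency study addresses.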
Liang, Yo-Wen; Lee, An-Sheng; Liu, Shuo-Fang
The difficulty in applying Virtual Reality to industrial design education and learning is that VR engineers cannot comprehend which functions or elements are important for students. In addition, a general-purpose VR system usually confuses the students and provides neither good manipulation means nor useful toolkits. To solve these problems, the…
Moro, Christian; Stromberga, Zane; Stirling, Allan
Consumer-grade virtual reality has recently become available for both desktop and mobile platforms and may redefine the way that students learn. However, the decision regarding which device to utilise within a curriculum is unclear. Desktop-based VR has considerably higher setup costs involved, whereas mobile-based VR cannot produce the quality of…
Objective. The purpose of this pilot study was to determine whether Super Pop VR, a low-cost virtual reality (VR) system, was a feasible system for documenting improvement in children with cerebral palsy (CP) and whether a home-based VR intervention was effective. Methods. Three children with CP participated in this study and received an 8-week VR intervention (30 minutes × 5 sessions/week) using the commercial EyeToy Play VR system. Reaching kinematics measured by Super Pop VR and two fine motor tools (Bruininks-Oseretsky Test of Motor Proficiency second edition, BOT-2, and Pediatric Motor Activity Log, PMAL) were tested before, midway through, and after the intervention. Results. All children successfully completed the evaluations using the Super Pop VR system at home, where 85% of the reaches collected were used to compute reaching kinematics, which is compatible with the literature using expensive motion analysis systems. Only the child with hemiplegic CP and more impaired arm function improved the reaching kinematics and functional use of the affected hand after intervention. Conclusion. Super Pop VR proved to be a feasible evaluation tool in children with CP.
Matheson, Jennifer L.
Transcribing interview data is a time-consuming task that most qualitative researchers dislike. Transcribing is even more difficult for people with physical limitations because traditional transcribing requires manual dexterity and the ability to sit at a computer for long stretches of time. Researchers have begun to explore using an automated…
Jones, Anna B; Farrall, Andrew J; Belin, Pascal; Pernet, Cyril R
As we listen to someone speaking, we extract both linguistic and non-linguistic information. Knowing how these two sets of information are processed in the brain is fundamental for the general understanding of social communication, speech recognition and therapy of language impairments. We investigated the pattern of performances in phoneme versus gender categorization in left and right hemisphere stroke patients, and found an anatomo-functional dissociation in the right frontal cortex, establishing a new syndrome in voice discrimination abilities. In addition, phoneme and gender performances were more often associated than dissociated in the left hemisphere patients, suggesting common neural underpinnings. Copyright © 2015 Elsevier Ltd. All rights reserved.
Lyberg-Åhlander, Viveka; Rydell, Roland; Löfqvist, Anders
use and prevalence of voice problems in teachers and to explore their ratings of vocally loading aspects of their working environment. Method: A questionnaire-survey in 467 teachers aiming to explore the prevalence of voice problems in teaching staff identified teachers with voice problems and vocally...... in the teaching environment and aspects of the classroom environment were also measured. Results: Teachers with voice problems were more affected by any loading factor in the work-environment and were more perceptive of the room acoustics. Differences between the groups were found during field......-measurements of the voice, while there were no differences in the findings from the clinical examinations of larynx and voice. Conclusion: Teachers suffering from voice problems react stronger to loading factors in the teaching environment. It is in the interplay between the individual and the work environment that voice...
Holzrichter, J.F.; Ng, L.C.
The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function for each time frame. The formation of feature vectors defining all acoustic speech units over well-defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs
Holzrichter, John F.; Ng, Lawrence C.
The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function for each time frame. The formation of feature vectors defining all acoustic speech units over well-defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching.
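The deconvolution step mentioned above can be illustrated with a small, idealized sketch. It assumes a noise-free frame in which the acoustic output is exactly the discrete convolution of a known excitation with an unknown impulse response, and it recovers that response by polynomial long division. This is only one textbook way to deconvolve and is not claimed to be the procedure used in the cited work; the function names and sample values are hypothetical.

```python
def convolve(x, h):
    """Discrete linear convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def deconvolve(y, x):
    """Recover h from y = convolve(x, h) by polynomial long division.

    Assumes a noise-free signal and x[0] != 0. Real speech frames would
    normally need a regularized (for example frequency-domain) method.
    """
    if x[0] == 0:
        raise ValueError("first excitation sample must be non-zero")
    n_out = len(y) - len(x) + 1
    remainder = list(y)
    h = []
    for k in range(n_out):
        coeff = remainder[k] / x[0]
        h.append(coeff)
        # Subtract this coefficient's contribution from the remaining output.
        for j, xj in enumerate(x):
            remainder[k + j] -= coeff * xj
    return h


# Hypothetical single time frame: excitation and vocal-tract impulse response.
excitation = [1.0, 0.5, 0.25]
impulse_response = [2.0, -1.0, 0.5]
acoustic_output = convolve(excitation, impulse_response)

recovered = deconvolve(acoustic_output, excitation)
print(recovered)
```

The recovered sequence plays the role of the per-frame transfer function description; its coefficients could then be stored as part of a feature vector for that frame.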
Nezbeda, Ivo; Melnyk, R.; Trokhymchuk, A.
Vol. 309, No. 2 (2011), pp. 174-178. ISSN 0378-3812. R&D Projects: GA AV ČR IAA400720710. Institutional research plan: CEZ:AV0Z40720504. Keywords: perturbation theory; SAFT-VR; augmented van der Waals. Subject RIV: CF - Physical; Theoretical Chemistry. Impact factor: 2.139, year: 2011
The goal of this article is to present a first list of ethical concerns that may arise from research and personal use of virtual reality (VR) and related technology, and to offer concrete recommendations for minimizing those risks. Many of the recommendations call for focused research initiatives. In the first part of the article, we discuss the relevant evidence from psychology that motivates our concerns. In section 1.1, we cover some of the main results suggesting that one’s environment can influence one’s psychological states, as well as recent work on inducing illusions of embodiment. Then, in section 1.2, we go on to discuss recent evidence indicating that immersion in VR can have psychological effects that last after leaving the virtual environment. In the second part of the article we turn to the risks and recommendations. We begin, in section 2.1, with the research ethics of VR, covering six main topics: the limits of experimental environments, informed consent, clinical risks, dual-use, online research, and a general point about the limitations of a code of conduct for research. Then, in section 2.2, we turn to the risks of VR for the general public, covering four main topics: long-term immersion, neglect of the social and physical environment, risky content, and privacy. We offer concrete recommendations for each of these ten topics, summarized in Table 1.
Cobb, S.C.; Richir, S.; D'Cruz, M.; Klinger, E.; Day, A.; David, P.; Gardeux, F.; van den Broek, Egon; van der Voort, Mascha C.; Meijer, F.; Izkara, J.L.; Mavrikios, D.
INTUITION is the European Network of Excellence on virtual reality and virtual environments applications for future workspaces. The purpose of the network is to gather expertise from partner members and determine the future research agenda for the development and use of virtual reality (VR)
Dammann, F.; Bode, A.; Heuschmid, M.; Schwaderer, E.; Schaich, M.; Seemann, M.; Claussen, C.D.; Maassen, M.; Zenner, H.P.
Purpose: To prove the feasibility of a preoperative fitting test for an implantable hearing aid using a VR environment. Methods: A high-resolution spiral CT was performed after mastoidectomy in 10 temporal bone specimens. The bony structures were segmented and merged with the computer-aided design (CAD) data of the hearing aid in a VR environment. For each specimen a three-dimensional fitting test was carried out by three examiners determining the implantability of the hearing aid. The implantation simulation was compared with the real implantation procedure performed by an experienced ENT surgeon. Results: The VR system used enabled real-time 3D visualisation and manipulation of CT and CAD data. All objects could be independently moved in all three dimensions. The VR fitting test corresponded closely with the real implantation. The implantability of the hearing aid was properly predicted by all three examiners. Conclusion: Merging CT and CAD data in a virtual reality environment has high potential for the presurgical determination of the fit and mountability of medical implants in complex anatomical regions. (orig.)
Zimmermann, C.; Piccolo, L. del; Bensing, J.; Bergvik, S.; Haes, H. de; Eide, H.; Fletcher, I.; Goss, C.; Heaven, C.; Humphris, G.; Young-Mi, K.; Langewitz, W.; Meeuwesen, L.; Nuebling, M.; Rimondini, M.; Salmon, P.; Dulmen, S. van; Wissow, L.; Zandbelt, L.; Finset, A.
Objective: To present the Verona Coding Definitions of Emotional Sequences (VR-CoDES CC), a consensus based system for coding patient expressions of emotional distress in medical consultations, defined as Cues or Concerns. Methods: The system was developed by an international group of communication
Pausch, Randy; Aviles, Walter; Durlach, Nathaniel; Robinett, Warren; Zyda, Michael
In 1992, at the request of a consortium of federal agencies, the National Research Council established a committee to "recommend a national research and development agenda in the area of virtual reality" to set U.S. government R&D funding priorities for virtual reality (VR) for the next decade....
Verdaasdonk, E.G.G.; Dankelman, J.; Lange, J.F.; Stassen, L.P.S.
Background- Laparoscopic suturing is one of the most difficult tasks in endoscopic surgery, requiring extensive training. The aim of this study was to determine the transfer validity of knot-tying training on a virtual-reality (VR) simulator to a realistic laparoscopic environment. Methods- Twenty
Suzuki, Naoki; Hattori, Asaki; Tanoue, Kazuo; Ieiri, Satoshi; Konishi, Kozo; Tomikawa, Morimasa; Kenmotsu, Hajime; Hashizume, Makoto
This report presents the development of a VR (virtual reality) training system for a robotic peroral procedure for endoscopic resection of gastric mucosa; training is essential because the procedure differs from conventional ones. The VR operating space uses the authors' sphere-filled organ model (SFM), which deforms under and pushes back against external forces like soft tissue, in real time. The deformation and repulsion are computable. The SFM space is reconstructed as a 3D model of the interior of the stomach from MRI data. The endoscope carries two robotic forceps arms, each equipped with an inner needle knife, on the right and left sides of its tip, and is inserted perorally for the operation. In VR, the forceps can grasp the gastric mucosa, cut it with the knife to complete the resection, and carry the specimen out of the body. For procedure training, the time required for hemostasis, the bleeding volume, the trajectories of the arms, and the intensity and direction of the applied external force are recorded, so that the trainee's safety and skill level can be evaluated in VR. A hydration step and clipping to close the wound are to be added to the procedure. (T.T.)
Hessel, M.; Buzink, S.N.; Schoot, B.C.; Jakimowicz, J.J.
Objective: To secure patient safety, skills needed for laparoscopy are preferably obtained in a non-patient setting. Therefore, we assessed face and construct validity of performance of a salpingectomy in case of ectopic pregnancy on the SimSurgery SEP VR simulator. Materials and Methods: Fifteen
Gao Zhenlong; Wang Qiang; Liu Caixia
Objective: To explore the effects of slice thickness, reconstructive thickness and reconstructive interval on VR image quality in multi-slice CT, in order to select the best slice thickness and reconstructive parameters for the imaging. Methods: Multi-slice CT scans were performed on a rubber dinosaur model with different slice thicknesses. VR images were reconstructed with different reconstructive thicknesses and reconstructive intervals. Five radiologists were invited to evaluate the quality of the images without knowing anything about the parameters. Results: The slice thickness, reconstructive thickness and reconstructive interval all affected VR image quality, to different degrees. The effective coefficients were V1 = 1413.033, V2 = 563.733 and V3 = 390.533, respectively. The parameters interacted with one another (P<0.05). The smaller these parameters, the better the image quality. With a small slice thickness and a reconstructive thickness equal to the slice thickness, the image quality showed no obvious difference when the reconstructive interval was 1/2, 1/3 or 1/4 of the slice thickness. Conclusion: A relatively small scan slice thickness, a reconstructive thickness equal to the slice thickness and a reconstructive interval of 1/2 of the slice thickness should be selected for the best VR image quality. The image quality depends mostly on the slice thickness. (authors)
Yu, Francis T. S.; Jutamulia, Suganda
Contributors; Preface; 1. Pattern recognition with optics Francis T. S. Yu and Don A. Gregory; 2. Hybrid neural networks for nonlinear pattern recognition Taiwei Lu; 3. Wavelets, optics, and pattern recognition Yao Li and Yunglong Sheng; 4. Applications of the fractional Fourier transform to optical pattern recognition David Mendlovic, Zeev Zalevsky and Haldun M. Ozaktas; 5. Optical implementation of mathematical morphology Tien-Hsin Chao; 6. Nonlinear optical correlators with improved discrimination capability for object location and recognition Leonid P. Yaroslavsky; 7. Distortion-invariant quadratic filters Gregory Gheen; 8. Composite filter synthesis as applied to pattern recognition Shizhou Yin and Guowen Lu; 9. Iterative procedures in electro-optical pattern recognition Joseph Shamir; 10. Optoelectronic hybrid system for three-dimensional object pattern recognition Guoguang Mu, Mingzhe Lu and Ying Sun; 11. Applications of photorefractive devices in optical pattern recognition Xiangyang Yang; 12. Optical pattern recognition with microlasers Eung-Gi Paek; 13. Optical properties and applications of bacteriorhodopsin Q. Wang Song and Yu-He Zhang; 14. Liquid-crystal spatial light modulators Aris Tanone and Suganda Jutamulia; 15. Representations of fully complex functions on real-time spatial light modulators Robert W. Cohn and Laurence G. Hassebrook; Index.
Speech recognition, or speech-to-text processing, is the process by which a computer recognizes human speech and converts it into text. In speech recognition, transcripts are created by taking recordings of speech as audio and their text transcriptions. Speech-based applications that include Natural Language Processing (NLP) techniques are popular and an active area of research. Input to such applications is in natural language and output is obtained in natural language. Speech recognition mostly revolves around three approaches, namely the acoustic-phonetic approach, the pattern recognition approach and the artificial intelligence approach. Creation of an acoustic model requires a large database of speech and training algorithms. The output of an ASR system is the recognition and translation of spoken language into text by computers and computerized devices. ASR today finds enormous application in tasks that require human-machine interfaces, such as voice dialing. Our key contribution in this paper is to create corpora for the Marathi language and explore the use of the Sphinx engine for automatic speech recognition.
Multimedia telephony is a delay-sensitive application. Packet losses, relatively less critical than delay, are allowed up to a certain threshold. They represent the QoS constraints that have to be respected to guarantee the operation of the telephony service and user satisfaction. In this work we introduce a new smartphone architecture characterized by two process levels called application processor (AP) and mobile termination (MT), respectively. Here, they communicate through a serial channel. Moreover, we focus our attention on two very important UMTS services: voice and video telephony. Through a simulation study the impact of voice and video telephony is evaluated on the structure considered using the protocols known at this moment to realize voice and video telephony.
The authors present a computer-based expert system called Mammo-Icon, which automatically assists the radiologist's case analysis by reviewing the trigger phrase output of a commercially available voice transcription system in the domain of mammography. A commercially available PC-based voice dictation system is coupled to an expert system implemented on a microcomputer. The software employs the LISP and C computer languages. Mammo-Icon responds to the trigger phrase output of a voice dictation system with a textual discussion of the potential significance of the findings that have been described and a display of reference images that may help the radiologist to confirm a suspected diagnosis or consider additional diagnoses. This results in automatic availability of potentially useful computer-based expert advice, making such systems much more likely to be used in routine clinical practice.
Music is a powerful medium capable of eliciting a broad range of emotions. Although the relationship between language and music is well documented, relatively little is known about the effects of lyrics and the voice on the emotional processing of music and on listeners’ preferences. In the present study, we investigated the effects of vocals in music on participants’ perceived valence and arousal in songs. Participants (N = 50) made valence and arousal ratings for familiar songs that were presented with and without the voice. We observed robust effects of vocal content on perceived arousal. Furthermore, we found that the effect of the voice on enhancing arousal ratings is independent of familiarity of the song and differs across genders and age: females were more influenced by vocals than males; furthermore these gender effects were enhanced among older adults. Results highlight the effects of gender and aging in emotion perception and are discussed in terms of the social roles of music.
Mohammad Reza Beyranvand
Introduction: Among the 12 leads studied in electrocardiography (ECG), lead aVR can be considered the most forgotten, since little attention is paid to it as the mirror image of other leads. Therefore, the present study was designed to evaluate the prevalence of ST segment changes in lead aVR and its relationship with the outcome of these patients. Methods: In this retrospective cross-sectional study, the medical profiles of patients who had presented to the emergency department with a final diagnosis of myocardial infarction (MI) in a 4-year period were evaluated regarding changes of the ST segment in lead aVR and their relationship with in-hospital mortality, the number of vessels involved, infarct location and cardiac ejection fraction. Results: 288 patients with a mean age of 59.00 ± 13.14 years (18 – 91) were evaluated (79.2% male). 168 (58.3%) patients had the mentioned changes (79.2% male). There was no significant relationship between the presence of ST changes in lead aVR and infarct location (p = 0.976), number of vessels involved (p = 0.269) or ejection fraction on admission (p = 0.801). However, ST elevation ≥ 1 mV in lead aVR had a significant relationship with mortality (Odds = 7.72, 95% CI: 3.07 – 19.42, p < 0.001). Sensitivity, specificity, positive and negative predictive values and positive and negative likelihood ratios of ST elevation ≥ 1 for prediction of in-hospital mortality were 41.66 (95% CI: 22.79 – 63.05), 91.53 (95% CI: 87.29 – 94.50), 31.25 (95% CI: 16.74 – 50.13), 94.44 (95% CI: 90.65 – 96.81), 0.45 (95% CI: 0.25 – 0.79), and 0.05 (95% CI: 0.03 – 0.09), respectively. Conclusion: Based on the results of the present study, the prevalence of ST segment changes in lead aVR was estimated to be 58.3%. There was no significant relationship between these changes and the number of vessels involved in angiography, infarct location and cardiac ejection fraction. However, presence of ST elevation ≥ 1 in lead aVR
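The accuracy measures quoted in this abstract are all derived from a 2×2 table of test result against outcome. As a hedged illustration, the sketch below computes them with the usual textbook definitions. The four cell counts are hypothetical: they are not given in the abstract and were picked only so that the resulting percentages land close to the reported ones. The likelihood ratios follow the standard formulas, which may differ from how the study derived its own figures.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard accuracy measures from the four cells of a 2x2 table.

    tp: test positive, outcome present    fn: test negative, outcome present
    fp: test positive, outcome absent     tn: test negative, outcome absent
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),   # positive predictive value
        "npv": tn / (tn + fn),   # negative predictive value
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "odds_ratio": (tp * tn) / (fn * fp),
    }


# Hypothetical counts (an assumption, not data from the study):
# positive test = ST elevation in lead aVR, outcome = in-hospital death.
metrics = diagnostic_metrics(tp=10, fn=14, fp=22, tn=238)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Keeping the computation in one function makes it easy to check that a set of reported measures is mutually consistent with a single underlying table.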
He, Zhiyong; Zhang, Zhengguang; Zhao, Chunshen
During the promotion and application of rural information services, voice interaction across different geographical dialects is a very complex issue. Through in-depth analysis of core TTS technologies, this paper presents methods for intelligent segmentation, a word segmentation algorithm and intelligent voice thesaurus construction in different dialect contexts. It then describes a COM-based development methodology and programming method for implementing a context-specific voice processing system. The method has a certain reference value for rural dialect and voice processing applications.
Hattori, Mariko; Sumita, Yuka I.; Taniguchi, Hisashi
Objective speech evaluation using acoustic measurement is needed for the proper rehabilitation of maxillectomy patients. For digital evaluation of consonants, measurement of voice onset time is one option. However, voice onset time has not been measured in maxillectomy patients as their consonant sound spectra exhibit unique characteristics that make the measurement of voice onset time challenging. In this study, we established criteria for measuring voice onset time in maxillectomy patients ...
Elizabeth U. Grillo; Jenna N. Brosious; Staci L. Sorrell; Supraja Anand
This study assessed the within-subject variability of voice measures captured using different recording devices (i.e., smartphones and head mounted microphone) and software programs (i.e., Analysis of Dysphonia in Speech and Voice (ADSV), Multi-dimensional Voice Program (MDVP), and Praat). Correlations between the software programs that calculated the voice measures were also analyzed. Results demonstrated no significant within-subject variability across devices and software and that some o...
Large vocabulary speech recognition, its techniques and its software and hardware technology, are being developed, aimed at providing the office user with a tool that could significantly improve both the quantity and the quality of his work: the dictation machine, which allows memos and documents to be input using voice and a microphone instead of fingers and a keyboard. The IBM Rome Science Center, together with the IBM Research Division, has built a prototype recognizer that accepts sentences in natural language from a 20,000-word Italian vocabulary. The unit runs on a personal computer equipped with special hardware capable of providing all the necessary computing power. The first laboratory experiments yielded very interesting results and pointed out system characteristics that make its use possible in operational environments. To this purpose, the dictation of medical reports was considered a suitable application. In cooperation with the 2nd Radiology Department of S. Maria della Misericordia Hospital (Udine, Italy), the system was tested by radiology department doctors during their everyday work. The doctors were able to dictate their reports directly to the unit. The text appeared immediately on the screen, and any errors could be corrected either by voice or by using the keyboard. At the end of report dictation, the doctors could both print and archive the text. The report could also be forwarded to the hospital information system, when the latter was available. Our results have been very encouraging: the system proved to be robust, simple to use, and accurate (over 95% average recognition rate). The experiment was valuable for the suggestions and comments it produced, and its results are useful for system evolution towards improved system management and efficiency.
The voice provides an entrance to discuss gender and related fundamental issues in electroacoustic music that are relevant as well in other musical genres and outside of music per se: the role of the female voice; the use of language versus non-verbal vocal sounds; the relation of voice, embodiment
Using mythology as a generative matrix, this article investigates the relationship between knowledge, words, embodiment and gender as they play out in academic writing's voice and, in particular, in doctoral voice. The doctoral thesis is defensive, a performance seeking admittance into discipline scholarship. Yet in finding its scholarly voice,…
Rocha, Bruna Rainho; Behlau, Mara
To verify the influence of sleep quality on the voice. Descriptive and analytical cross-sectional study. Data were collected by an online or printed survey divided in three parts: (1) demographic data and vocal health aspects; (2) self-assessment of sleep and vocal quality, and the influence that sleep has on voice; and (3) sleep and voice self-assessment inventories-the Epworth Sleepiness Scale (ESS), the Pittsburgh Sleep Quality Index (PSQI), and the Voice Handicap Index reduced version (VHI-10). A total of 862 people were included (493 women, 369 men), with a mean age of 32 years old (maximum age of 79 and minimum age of 18 years old). The perception of the influence that sleep has on voice showed a difference (P influence a voice handicap are vocal self-assessment, ESS total score, and self-assessment of the influence that sleep has on voice. The absence of daytime sleepiness is a protective factor (odds ratio [OR] > 1) against perceived voice handicap; the presence of daytime sleepiness is a damaging factor (OR influences voice. Perceived poor sleep quality is related to perceived poor vocal quality. Individuals with a voice handicap observe a greater influence of sleep on voice than those without. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Gunjawate, Dhanshree R.; Ravi, Rohit; Bellur, Rajashekhar
Purpose: Singers are vocal athletes having specific demands from their voice and require special consideration during voice evaluation. Presently, there is a lack of standards for acoustic evaluation in them. The aim of the present study was to systematically review the available literature on the acoustic analysis of voice in singers. Method: A…
Boltežar, Lučka; Šereg Bahar, Maja
The aim of this paper is to compare the prevalence of voice disorders and the risk factors for them in different occupations with a vocal load in Slovenia. A meta-analysis of six different Slovenian studies involving teachers, physicians, salespeople, Catholic priests, nurses and speech-and-language therapists (SLTs) was performed. In all six studies, similar questions about the prevalence of voice disorders and the causes for them were included. The comparison of the six studies showed that more than 82% of the 2347 included subjects had voice problems at some time during their career. The teachers were the most affected by voice problems. The prevalent cause of voice problems was the vocal load in teachers and salespeople and respiratory-tract infections in all the other occupational groups. When the occupational groups were compared, it was found that the teachers had more voice problems and showed less care for their voices than the priests. The physicians had more voice problems and showed better consideration of vocal hygiene rules than the SLTs. The majority of all the included subjects did not receive instructions about voice care during education. In order to decrease the prevalence of voice disorders in vocal professionals, a screening program is recommended before the beginning of their studies. Regular courses on voice care and proper vocal technique should be obligatory for all professional voice users during their career. The inclusion of dysphonia in the list of occupational diseases should be considered in Slovenia as it is in some European countries.
Fritsch, Jonas; Jacobsen, Mogens
In this paper, we present the preliminary results from an ongoing interaction design experiment, the Voice Pump. The Voice Pump is an affectively engaging air-based interface for attuning to the differential qualities of voices in order to change attachments between native Danish speakers and non-native...
This is the first text to provide a unified and self-contained introduction to visual pattern recognition and machine learning. It is useful as a general introduction to artificial intelligence and knowledge engineering, and no previous knowledge of pattern recognition or machine learning is necessary. It covers the basics of various pattern recognition and machine learning methods. Translated from Japanese, the book also features chapter exercises, keywords, and summaries.
Ventura, Joseph; Wood, Rachel C; Jimenez, Amy M; Hellemann, Gerhard S
In schizophrenia patients, one of the most commonly studied deficits of social cognition is emotion processing (EP), which has documented links to facial recognition (FR). But, how are deficits in facial recognition linked to emotion processing deficits? Can neurocognitive and symptom correlates of FR and EP help differentiate the unique contribution of FR to the domain of social cognition? A meta-analysis of 102 studies (combined n=4826) in schizophrenia patients was conducted to determine the magnitude and pattern of relationships between facial recognition, emotion processing, neurocognition, and type of symptom. Meta-analytic results indicated that facial recognition and emotion processing are strongly interrelated (r=.51). In addition, the relationship between FR and EP through voice prosody (r=.58) is as strong as the relationship between FR and EP based on facial stimuli (r=.53). Further, the relationship between emotion recognition, neurocognition, and symptoms is independent of the emotion processing modality - facial stimuli and voice prosody. The association between FR and EP that occurs through voice prosody suggests that FR is a fundamental cognitive process. The observed links between FR and EP might be due to bottom-up associations between neurocognition and EP, and not simply because most emotion recognition tasks use visual facial stimuli. In addition, links with symptoms, especially negative symptoms and disorganization, suggest possible symptom mechanisms that contribute to FR and EP deficits. © 2013 Elsevier B.V. All rights reserved.
Webb, Andrew R
Statistical pattern recognition relates to the use of statistical techniques for analysing data measurements in order to extract information and make justified decisions. It is a very active area of study and research, which has seen many advances in recent years. Applications such as data mining, web searching, multimedia data retrieval, face recognition, and cursive handwriting recognition, all require robust and efficient pattern recognition techniques. This third edition provides an introduction to statistical pattern theory and techniques, with material drawn from a wide range of fields,
Smits, R.; Marres, H.A.; de Jong, F.
BACKGROUND: Voice disorders have a multifactorial genesis and may be present in various ways. They can cause a significant communication handicap and impaired quality of life. OBJECTIVE: To assess the effect of vocal fold lesions and voice quality on voice handicap and psychosomatic well-being.
National Aeronautics and Space Administration — Speaking to the cockpit as a method of system management in flight can become an effective interaction method, since voice communication is very efficient. Automated...
Manka, David L
Voice over Internet Protocol (VoIP) is an emerging technology with the potential to assist the United States Marine Corps in solving communication challenges stemming from modern operational concepts...
The prevalence of voice disorders in the teacher population in Latvia has not been studied so far and this is the first epidemiological study whose goal is to investigate the prevalence of voice disorders and their risk factors in this professional group. A wide cross-sectional study using stratified sampling methodology was implemented in the general education schools of Latvia. The self-administered voice risk factor questionnaire and the Voice Handicap Index were completed by 522 teachers. Two teacher groups were formed: the voice disorders group, which included 235 teachers with actual voice problems or problems during the last 9 months; and the control group, which included 174 teachers without voice disorders. Sixty-six percent of teachers gave a positive answer to the following question: Have you ever had problems with your voice? Voice problems are more often found in female than male teachers (68.2% vs 48.8%). Music teachers suffer from voice disorders more often than teachers of other subjects. Eighty-two percent of teachers first faced voice problems in their professional career. The odds of voice disorders increase if the following risk factors exist: extra vocal load, shouting, throat clearing, neglect of personal health, background noise, chronic illnesses of the upper respiratory tract, allergy, job dissatisfaction, and regular stress in the workplace. The study findings indicated a high risk of voice disorders among Latvian teachers. The study confirmed data concerning the multifactorial etiology of voice disorders. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Sophocles’ Oedipus the King has often inspired concurrent interpretations examining the tragic irony of the play and the traumatic neurosis of its protagonist. The Theban king epitomizes a man who knows everything but himself, and Sophocles’ use of irony allows Oedipus to discover the truth in a manner that Freud viewed in The Interpretation of Dreams as “comparable to the work of a psychoanalysis.” Psychoanalytical readings of Oedipus at times depend greatly on his role as a doubled figure, but this article specifically investigates his doubled voice in order to demonstrate the interrelated, chiasmic relationship between Oedipus’ trauma and the trope of irony. It argues, in fact, that irony serves as the language, so to speak, of the traumatic experiences haunting the king and his city, but it also posits that this doubled voice compounds the irony of the play and its hero. In other words, in addition to the Sophoclean irony that dominates the work, the doubling of the king’s voice reveals a modified form of Socratic irony that contributes to the tragedy’s power. Consequently, even after the king’s recognition of the truth ultimately resolves the work’s tragic irony, Oedipus remains divided by a state of simultaneous knowledge and ignorance.
Jackson, Keith; Jackson, Jacqui; Hopkinson, Gillian
This full paper from the Marketing and Retail track of BAM 2013 investigates the relationships between suppliers and retailers in the UK convenience store sector in terms of Hirschman's model whereby members of a group can influence it by either expressing their opinions (voice) or leaving it in protest (exit). Suppliers may create loyalty among retailers by raising exit costs and/or allowing them to express their voices. The investigation was carried out using the recorded turnover of the to...
Fischer, Emily; Goberman, Alexander M.
Research has found that speaking rate has an effect on voice onset time (VOT). Given that Parkinson disease (PD) affects speaking rate, the purpose of this study was to examine VOT with the effect of rate removed (VOT ratio), along with the traditional VOT measure, in individuals with PD. VOT and VOT ratio were examined in 9 individuals with PD…