WorldWideScience

Sample records for voiced sound signal

  1. Sound induced activity in voice sensitive cortex predicts voice memory ability

    Directory of Open Access Journals (Sweden)

    Rebecca eWatson

    2012-04-01

    Full Text Available The ‘temporal voice areas’ (TVAs; Belin et al., 2000) of the human brain show greater neuronal activity in response to human voices than to other categories of nonvocal sounds. However, a direct link between TVA activity and voice perception behaviour has not yet been established. Here we show that a functional magnetic resonance imaging (fMRI) measure of activity in the TVAs predicts individual performance at a separately administered voice memory test. This relation holds when general sound memory ability is taken into account. These findings provide the first evidence that the TVAs are specifically involved in voice cognition.

  2. Mapping Phonetic Features for Voice-Driven Sound Synthesis

    Science.gov (United States)

    Janer, Jordi; Maestre, Esteban

    In applications where the human voice controls the synthesis of musical instrument sounds, phonetics convey musical information that might be related to the sound of the imitated musical instrument. Our initial hypothesis is that phonetics are user- and instrument-dependent, but remain constant for a single subject and instrument. We propose a user-adapted system, where mappings from voice features to synthesis parameters depend on how subjects sing musical articulations, i.e., note-to-note transitions. The system consists of two components: first, a voice signal segmentation module that automatically determines note-to-note transitions; second, a classifier that determines the type of musical articulation for each transition based on a set of phonetic features. To validate our hypothesis, we ran an experiment in which subjects imitated real instrument recordings with their voice. The performance recordings consisted of short phrases of saxophone and violin performed in three grades of musical articulation, labeled staccato, normal, and legato. The results of a supervised-training classifier (user-dependent) are compared to those of a classifier based on heuristic rules (user-independent). Finally, from these results we show how to control articulation in a sample-concatenation synthesizer by selecting the most appropriate samples.
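
    As a rough illustration of the rule-based, user-independent baseline described above, a transition could be classified from two cues; the feature names and thresholds below are simplified stand-ins for the paper's phonetic features, not the published rules:

        def classify_articulation(energy_dip_db, gap_ms):
            """Heuristic articulation classifier for one note-to-note transition.

            energy_dip_db : drop in RMS energy (dB) at the note boundary
            gap_ms        : duration (ms) of the unvoiced gap between notes
            Thresholds are illustrative assumptions, not published values.
            """
            if gap_ms > 80 or energy_dip_db > 20:
                return "staccato"   # clearly separated notes
            if gap_ms < 10 and energy_dip_db < 6:
                return "legato"     # continuous, connected transition
            return "normal"

        print(classify_articulation(energy_dip_db=25.0, gap_ms=120.0))  # staccato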

  3. Analysis of failure of voice production by a sound-producing voice prosthesis

    NARCIS (Netherlands)

    van der Torn, M.; van Gogh, C.D.L.; Verdonck-de Leeuw, I M; Festen, J.M.; Mahieu, H.F.

    OBJECTIVE: To analyse the cause of failing voice production by a sound-producing voice prosthesis (SPVP). METHODS: The functioning of a prototype SPVP is described in a female laryngectomee before and after its sound-producing mechanism was impeded by tracheal phlegm. This assessment included:

  4. Developmental Changes in Locating Voice and Sound in Space

    Science.gov (United States)

    Kezuka, Emiko; Amano, Sachiko; Reddy, Vasudevi

    2017-01-01

    We know little about how infants locate voice and sound in a complex multimodal space. Using a naturalistic laboratory experiment, the present study tested 35 infants at three ages: 4 months (15 infants), 5 months (12 infants), and 7 months (8 infants). While they were engaged frontally with one experimenter, infants were presented with (a) a second experimenter’s voice and (b) castanet sounds from three different locations (left, right, and behind). There were clear increases with age in the successful localization of sounds from all directions, and a decrease in the number of repetitions required for success. Nonetheless, even at 4 months two-thirds of the infants attempted to search for the voice or sound. At all ages localizing sounds from behind was more difficult, and this ability was clearly present only at 7 months. Perseverative errors (looking at the last location) were present at all ages and appeared to be task specific (present in the 7-month-olds only for the behind location). Spontaneous attention shifts by the infants between the two experimenters, evident at 7 months, provide early evidence for infant initiation of triadic attentional engagements. No advantage was found for voice over castanet sounds in this study. Auditory localization is a complex and contextual process emerging gradually in the first half of the first year. PMID:28979220

  5. The Sound of Voice: Voice-Based Categorization of Speakers' Sexual Orientation within and across Languages.

    Directory of Open Access Journals (Sweden)

    Simone Sulpizio

    Full Text Available Empirical research initially showed that English listeners are able to identify speakers' sexual orientation based on voice cues alone. However, the accuracy of this voice-based categorization, as well as its generalizability to other languages (language dependency) and to non-native speakers (language specificity), has recently been questioned. We address these open issues in five experiments. First, we tested whether Italian and German listeners are able to correctly identify the sexual orientation of same-language male speakers. Then, participants of both nationalities listened to voice samples and rated the sexual orientation of both Italian and German male speakers. We found that listeners were unable to identify the speakers' sexual orientation correctly; however, speakers were consistently categorized as either heterosexual or gay on the basis of how they sounded. Moreover, a similar pattern of results emerged when listeners judged the sexual orientation of speakers of their own language and of the foreign language. Overall, this research suggests that voice-based categorization of sexual orientation reflects listeners' expectations of how gay voices sound rather than being an accurate detector of the speakers' actual sexual identity. Results are discussed with regard to accuracy, acoustic features of voices, language dependency and language specificity.

  6. METHODS FOR QUALITY ENHANCEMENT OF USER VOICE SIGNAL IN VOICE AUTHENTICATION SYSTEMS

    Directory of Open Access Journals (Sweden)

    O. N. Faizulaieva

    2014-03-01

    Full Text Available The rationale for using the computer system user's voice in the authentication process is established. The scientific task of improving the signal-to-noise ratio of the user's voice signal in the authentication system is considered. The object of study is the process of input and extraction of the voice signal of an authentication system user in computer systems and networks. Methods and means for the input and extraction of the voice signal against external interference signals are investigated, and methods for quality enhancement of the user's voice signal in voice authentication systems are suggested. As modern computer facilities, including mobile ones, have a two-channel audio card, the use of two microphones is proposed for the voice signal input of the authentication system. The task of forming a lobe of the microphone array over the desired voice-signal registration band (100 Hz to 8 kHz) is solved. The directional properties of the proposed microphone array reduce the influence of external interference signals by a factor of two to three in the frequency range from 4 to 8 kHz. The possibilities of space-time processing of the recorded signals using constant and adaptive weighting factors are investigated. Simulation results of the proposed system for the input and extraction of signals during digital processing of narrowband signals are presented. The proposed solutions make it possible to improve the signal-to-noise ratio of the recorded useful signals by 10-20 dB under the influence of external interference signals in the frequency range from 4 to 8 kHz. The results may be useful to specialists working in the fields of voice recognition and speaker discrimination.
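
    A minimal sketch of the two-microphone idea: delay-and-sum steering of a closely spaced pair (the 20 mm spacing is taken from the companion study below; the processing is a generic beamformer, not the authors' exact scheme):

        import numpy as np

        C = 343.0    # speed of sound, m/s
        D = 0.02     # microphone spacing, m (assumed)
        FS = 16000   # sample rate, Hz

        def delay_and_sum(x_left, x_right, angle_deg):
            """Steer a two-microphone array toward angle_deg (0 = broadside).
            The fractional inter-microphone delay is applied in the frequency
            domain before the two channels are averaged."""
            tau = D * np.sin(np.radians(angle_deg)) / C
            n = len(x_right)
            freqs = np.fft.rfftfreq(n, d=1.0 / FS)
            shifted = np.fft.irfft(np.fft.rfft(x_right) *
                                   np.exp(-2j * np.pi * freqs * tau), n)
            return 0.5 * (x_left + shifted)

        # Uncorrelated noise on the two channels drops ~3 dB after summation.
        t = np.arange(FS) / FS
        target = np.sin(2 * np.pi * 440 * t)
        x1 = target + 0.5 * np.random.randn(FS)
        x2 = target + 0.5 * np.random.randn(FS)
        y = delay_and_sum(x1, x2, angle_deg=0.0)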

  7. Facing Sound - Voicing Art

    DEFF Research Database (Denmark)

    Lønstrup, Ansa

    2013-01-01

    This article is based on examples of contemporary audiovisual art, with a special focus on the Tony Oursler exhibition Face to Face at Aarhus Art Museum ARoS in Denmark in March-July 2012. My investigation involves a combination of qualitative interviews with visitors, observations of the audience's interactions with the exhibition and the artwork in the museum space, and short analyses of individual works of art based on reception aesthetics and phenomenology and inspired by newer writings on sound, voice and listening.

  8. EXPERIMENTAL STUDY OF FIRMWARE FOR INPUT AND EXTRACTION OF USER’S VOICE SIGNAL IN VOICE AUTHENTICATION SYSTEMS

    Directory of Open Access Journals (Sweden)

    O. N. Faizulaieva

    2014-09-01

    Full Text Available The scientific task of improving the signal-to-noise ratio of the user's voice signal during voice authentication in computer systems and networks is considered. The object of study is the process of input and extraction of the voice signal of an authentication system user in computer systems and networks. Methods and means for the input and extraction of the voice signal against external interference signals are investigated, and ways of improving the quality of the user's voice signal in voice authentication systems are studied experimentally. Firmware for an experimental unit for the input and extraction of the user's voice signal under external interference is considered. As modern computer facilities, including mobile ones, have a two-channel audio card, two microphones are used for voice signal input. The 20 mm distance between the sonic-wave sensors forms a single directional lobe of the microphone array over the desired voice-signal registration band (100 Hz to 8 kHz). According to the experimental results, the directional properties of the proposed microphone array, combined with space-time processing of the recorded signals using constant and adaptive weighting factors, considerably reduce the influence of interference signals. The results of the firmware experiments on the input and extraction of the user's voice signal under external interference are shown. The proposed solutions improve the signal-to-noise ratio of the recorded useful signals by up to 20 dB under the influence of external interference signals in the frequency range from 4 to 8 kHz. The results may be useful to specialists working in the fields of voice recognition and speaker discrimination.
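
    The "adaptive weighting factors" mentioned above can be illustrated with a basic normalized-LMS canceller, a generic textbook scheme rather than the authors' firmware: one channel carries the noisy voice, a reference channel carries correlated interference, and the weights adapt to subtract it:

        import numpy as np

        def nlms_cancel(primary, reference, taps=32, mu=0.1, eps=1e-8):
            """Normalized LMS: remove the part of `reference` that leaks
            into `primary`; the error signal is the cleaned voice."""
            w = np.zeros(taps)
            out = np.zeros(len(primary))
            for n in range(taps, len(primary)):
                x = reference[n - taps:n][::-1]   # most recent sample first
                e = primary[n] - w @ x            # error = voice estimate
                w += mu * e * x / (x @ x + eps)   # adaptive weight update
                out[n] = e
            return out

        rng = np.random.default_rng(0)
        fs = 8000
        t = np.arange(2 * fs) / fs
        voice = np.sin(2 * np.pi * 220 * t) * (t < 1.5)
        noise = rng.standard_normal(len(t))
        cleaned = nlms_cancel(voice + 0.8 * noise, noise)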

  9. Updating signal typing in voice: addition of type 4 signals.

    Science.gov (United States)

    Sprecher, Alicia; Olszewski, Aleksandra; Jiang, Jack J; Zhang, Yu

    2010-06-01

    The addition of a fourth type of voice to Titze's voice classification scheme is proposed. This fourth voice type is characterized by primarily stochastic noise behavior and is therefore unsuitable for both perturbation and correlation dimension analysis. Forty voice samples were classified into the proposed four types using narrowband spectrograms. Acoustic, perceptual, and correlation dimension analyses were completed for all voice samples. Perturbation measures tended to increase with voice type. Based on reliability cutoffs, the type 1 and type 2 voices were considered suitable for perturbation analysis. Measures of unreliability were higher for type 3 and 4 voices. Correlation dimension analyses increased significantly with signal type as indicated by a one-way analysis of variance. Notably, correlation dimension analysis could not quantify the type 4 voices. The proposed fourth voice type represents a subset of voices dominated by noise behavior. Current measures capable of evaluating type 4 voices provide only qualitative data (spectrograms, perceptual analysis, and an infinite correlation dimension). Type 4 voices are highly complex and the development of objective measures capable of analyzing these voices remains a topic of future investigation.
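
    The perturbation analysis deemed reliable for type 1 and 2 signals reduces, in its simplest form, to cycle-to-cycle jitter; a minimal sketch using the standard textbook definition, not the study's exact toolchain:

        import numpy as np

        def local_jitter_percent(periods):
            """Jitter (local, %): mean absolute difference between consecutive
            glottal cycle lengths, relative to the mean cycle length."""
            periods = np.asarray(periods, dtype=float)
            return 100.0 * np.mean(np.abs(np.diff(periods))) / np.mean(periods)

        # Cycle lengths (s) of a 100 Hz voice with mild random perturbation.
        rng = np.random.default_rng(1)
        periods = 0.010 + 0.0001 * rng.standard_normal(200)
        print(f"jitter = {local_jitter_percent(periods):.2f} %")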

  10. Multivariate sensitivity to voice during auditory categorization.

    Science.gov (United States)

    Lee, Yune Sang; Peelle, Jonathan E; Kraemer, David; Lloyd, Samuel; Granger, Richard

    2015-09-01

    Past neuroimaging studies have documented discrete regions of human temporal cortex that are more strongly activated by conspecific voice sounds than by nonvoice sounds. However, the mechanisms underlying this voice sensitivity remain unclear. In the present functional MRI study, we took a novel approach to examining voice sensitivity, in which we applied a signal detection paradigm to the assessment of multivariate pattern classification among several living and nonliving categories of auditory stimuli. Within this framework, voice sensitivity can be interpreted as a distinct neural representation of brain activity that correctly distinguishes human vocalizations from other auditory object categories. Across a series of auditory categorization tests, we found that bilateral superior and middle temporal cortex consistently exhibited robust sensitivity to human vocal sounds. Although the strongest categorization was in distinguishing human voice from other categories, subsets of these regions were also able to distinguish reliably between nonhuman categories, suggesting a general role in auditory object categorization. Our findings complement the current evidence of cortical sensitivity to human vocal sounds by revealing that the greatest sensitivity during categorization tasks is devoted to distinguishing voice from nonvoice categories within human temporal cortex. Copyright © 2015 the American Physiological Society.

  11. A SOUND SOURCE LOCALIZATION TECHNIQUE TO SUPPORT SEARCH AND RESCUE IN LOUD NOISE ENVIRONMENTS

    Science.gov (United States)

    Yoshinaga, Hiroshi; Mizutani, Koichi; Wakatsuki, Naoto

    At some sites of earthquakes and other disasters, rescuers search for people buried under rubble by listening for the sounds they make, so a technique to localize sound sources amidst loud noise would support such search and rescue operations. In this paper, we discuss an experiment performed to test an array signal processing technique that searches for imperceptible sound in loud noise environments. Two loudspeakers simultaneously played generator noise and a voice attenuated by 20 dB (1/100 of the power) relative to that noise in an outdoor space where cicadas were chirping. The sound was received by a horizontally placed linear microphone array, 1.05 m in length and consisting of 15 microphones. The direction and distance of the voice were computed, and the voice was extracted and played back as an audible sound by array signal processing.
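
    A common way to estimate the direction of a weak source with one microphone pair from such an array is the generalized cross-correlation with phase transform (GCC-PHAT); the sketch below is a generic implementation with an assumed pair spacing, not the authors' algorithm:

        import numpy as np

        def gcc_phat(sig, ref, fs, max_tau):
            """Time difference of arrival (s) between two microphones."""
            n = len(sig) + len(ref)
            S = np.fft.rfft(sig, n) * np.conj(np.fft.rfft(ref, n))
            S /= np.abs(S) + 1e-12          # phase-transform whitening
            cc = np.fft.irfft(S, n)
            shift = int(fs * max_tau)
            cc = np.concatenate((cc[-shift:], cc[:shift + 1]))
            return (np.argmax(np.abs(cc)) - shift) / fs

        fs, d, c = 16000, 0.075, 343.0      # assumed pair spacing: 7.5 cm
        src = np.random.default_rng(2).standard_normal(fs)
        tau = gcc_phat(np.roll(src, 2), src, fs, max_tau=d / c)  # 2-sample delay
        angle = np.degrees(np.arcsin(np.clip(tau * c / d, -1.0, 1.0)))
        print(f"estimated arrival angle: {angle:.1f} degrees")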

  12. Investigating the neural correlates of voice versus speech-sound directed information in pre-school children.

    Directory of Open Access Journals (Sweden)

    Nora Maria Raschle

    Full Text Available Studies in sleeping newborns and infants propose that the superior temporal sulcus is involved in speech processing soon after birth. Speech processing also implicitly requires the analysis of the human voice, which conveys both linguistic and extra-linguistic information. However, due to technical and practical challenges when neuroimaging young children, evidence of the neural correlates of speech and/or voice processing in toddlers and young children remains scarce. In the current study, we used functional magnetic resonance imaging (fMRI) in 20 typically developing preschool children (average age = 5.8 y; range 5.2-6.8 y) to investigate brain activation during judgments about vocal identity versus the initial speech sound of spoken object words. fMRI results reveal common brain regions responsible for voice-specific and speech-sound-specific processing of spoken object words, including bilateral primary and secondary language areas of the brain. Contrasting voice-specific with speech-sound-specific processing predominantly activates the anterior part of the right-hemispheric superior temporal sulcus. Furthermore, the right STS is functionally correlated with left-hemispheric temporal and right-hemispheric prefrontal regions. This finding underlines the importance of the right superior temporal sulcus as a temporal voice area and indicates that by the age of five this brain region is specialized and functions similarly to that of adults. We thus extend previous knowledge of voice-specific regions and their functional connections to the young brain, which may further our understanding of the neuronal mechanisms of speech-specific processing in children with developmental disorders, such as autism or specific language impairments.

  13. Start/End Delays of Voiced and Unvoiced Speech Signals

    Energy Technology Data Exchange (ETDEWEB)

    Herrnstein, A

    1999-09-24

    Recent experiments using low-power EM-radar-like sensors (e.g., GEMs) have demonstrated a new method for measuring vocal fold activity and the onset times of voiced speech, as vocal fold contact begins to take place; similarly, the end time of a voiced speech segment can be measured. Secondly, it appears that in most normal uses of American English speech, unvoiced-speech segments directly precede or directly follow voiced-speech segments. For many applications it is useful to know the typical durations of these unvoiced speech segments. A corpus of spoken 'TIMIT' words, phrases, and sentences, assembled earlier and recorded using simultaneously measured acoustic and EM-sensor glottal signals from 16 male speakers, was used for this study. By inspecting the onset (or end) of unvoiced speech using the acoustic signal, and the onset (or end) of voiced speech using the EM-sensor signal, the average durations of unvoiced segments preceding the onset of vocalization were found to be 300 ms, and of following segments, 500 ms. An unvoiced speech period is then defined in time, first by using the onset of the EM-sensed glottal signal as the onset-time marker for the voiced speech segment and the end marker for the unvoiced segment; then, by subtracting 300 ms from the onset time mark of voicing, the start time of the unvoiced speech segment is found. Similarly, the times for a following unvoiced speech segment can be found. While data of this nature have proven useful for work in our laboratory, a great deal of additional work remains to validate such data for general populations of users. These procedures have been useful for applying optimal processing algorithms over time segments of unvoiced, voiced, and non-speech acoustic signals. For example, these data appear to be of use in speaker validation, in vocoding, and in denoising algorithms.
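
    In code, the bookkeeping described above amounts to padding each EM-sensed voicing interval with the reported average unvoiced durations (300 ms before onset, 500 ms after the end); a small sketch under exactly those assumptions:

        PRE_UNVOICED_S = 0.300    # average unvoiced run-in before voicing onset
        POST_UNVOICED_S = 0.500   # average unvoiced tail after voicing ends

        def speech_segments(voicing_intervals):
            """Map (onset, end) times in seconds from the EM glottal sensor to
            (unvoiced_start, voiced_onset, voiced_end, unvoiced_end) tuples."""
            return [(max(0.0, on - PRE_UNVOICED_S), on, off, off + POST_UNVOICED_S)
                    for on, off in voicing_intervals]

        print(speech_segments([(1.20, 1.85), (2.60, 3.10)]))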

  14. Reduction of heart sound interference from lung sound signals using empirical mode decomposition technique.

    Science.gov (United States)

    Mondal, Ashok; Bhattacharya, P S; Saha, Goutam

    2011-01-01

    During the recording of lung sound (LS) signals from the chest wall of a subject, heart sound (HS) signals always interfere. This obscures the features of the lung sound signal and creates confusion about any pathological state of the lungs. A novel method based on the empirical mode decomposition (EMD) technique is proposed in this paper for reducing the undesired heart sound interference from the desired lung sound signals. In this method, the mixed signal is split into several components; some of these components contain larger proportions of interfering signals such as heart sounds or environmental noise, and these are filtered out. Experiments have been conducted on simulated and real-time recorded mixed signals of heart and lung sounds. The proposed method is found to be superior in terms of time domain, frequency domain, and time-frequency domain representations, and also in a listening test performed by a pulmonologist.
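
    A minimal sketch of the decomposition step, assuming the third-party PyEMD package provides the empirical mode decomposition; deciding which intrinsic mode functions to discard is the heart of the published method and is only crudely approximated by the zero-crossing heuristic here:

        import numpy as np
        from PyEMD import EMD   # assumed dependency: pip install EMD-signal

        fs = 4000
        t = np.arange(2 * fs) / fs
        lung = 0.3 * np.random.randn(len(t))        # broadband lung-sound stand-in
        heart = np.sin(2 * np.pi * 1.2 * t) ** 63   # slow periodic thumps
        mixed = lung + heart

        imfs = EMD().emd(mixed, t)                  # intrinsic mode functions
        # Keep fast modes (high zero-crossing rate, lung-like); drop slow,
        # heart-like modes before reconstructing the cleaned signal.
        keep = [imf for imf in imfs
                if np.mean(np.abs(np.diff(np.sign(imf)))) > 0.05]
        cleaned = np.sum(keep, axis=0)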

  15. Analysis of the Auditory Feedback and Phonation in Normal Voices.

    Science.gov (United States)

    Arbeiter, Mareike; Petermann, Simon; Hoppe, Ulrich; Bohr, Christopher; Doellinger, Michael; Ziethe, Anke

    2018-02-01

    The aim of this study was to investigate the auditory feedback mechanisms and voice quality during phonation in response to a spontaneous pitch change in the auditory feedback. Does the pitch shift reflex (PSR) change voice pitch and voice quality? Quantitative and qualitative voice characteristics were analyzed during the PSR. Twenty-eight healthy subjects underwent transnasal high-speed videoendoscopy (HSV) at 8000 fps during sustained phonation of [a]. While phonating, the subjects heard their own voice pitched up by 700 cents (an interval of a fifth) for 300 milliseconds in their auditory feedback. The electroencephalography (EEG), acoustic voice signal, electroglottography (EGG), and HSV recordings were analyzed to statistically compare feedback mechanisms between the pitched and unpitched conditions of the phonation paradigm, and quantitative and qualitative voice characteristics were analyzed. The PSR was successfully detected in all signals of the experimental tools (EEG, EGG, acoustic voice signal, HSV). A significant increase of the perturbation measures and of the values of the acoustic parameters during the PSR was observed, especially for the audio signal. The auditory feedback mechanism seems to control not only voice pitch but also aspects of voice quality.

  16. 33 CFR 67.20-10 - Sound signal.

    Science.gov (United States)

    2010-07-01

    ... 33 Navigation and Navigable Waters 1 2010-07-01 2010-07-01 false Sound signal. 67.20-10 Section 67... AIDS TO NAVIGATION ON ARTIFICIAL ISLANDS AND FIXED STRUCTURES Class "A" Requirements § 67.20-10 Sound signal. (a) The owner of a Class "A" structure shall: (1) Install a sound signal that has a rated range...

  17. Analysis of acoustic sound signal for ONB measurement

    International Nuclear Information System (INIS)

    Park, S. J.; Kim, H. I.; Han, K. Y.; Chai, H. T.; Park, C.

    2003-01-01

    The onset of nucleate boiling (ONB) was measured in a test fuel bundle composed of several fuel element simulators (FES) by analysing the acoustic sound signals. In order to measure ONB, a hydrophone, a pre-amplifier, and a data acquisition system to acquire and process the acoustic signal were prepared. The acoustic signal generated in the coolant is converted to a current signal through the hydrophone. When the signal is analyzed in the frequency domain, each sound signal can be identified according to the origin of its sound source. As the power is increased to a certain degree, nucleate boiling starts, and the frequent formation and collapse of void bubbles produce a sound signal. By measuring this sound signal one can pinpoint the ONB. Since the signal characteristics are identical for different mass flow rates, this method is applicable for ascertaining ONB.
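
    The detection idea, frequency-domain analysis of the hydrophone signal with ONB flagged when bubble-collapse noise raises the power in a characteristic band, can be sketched generically; the band edges, threshold, and sample rate below are illustrative assumptions:

        import numpy as np
        from scipy.signal import welch

        FS = 100_000            # hydrophone sample rate, Hz (assumed)
        BAND = (5_000, 20_000)  # band where boiling noise is expected (assumed)

        def band_power_db(x):
            """Welch PSD integrated over the boiling-noise band, in dB."""
            f, pxx = welch(x, fs=FS, nperseg=4096)
            mask = (f >= BAND[0]) & (f <= BAND[1])
            return 10 * np.log10(np.sum(pxx[mask]) * (f[1] - f[0]))

        def onb_detected(x, baseline_db, margin_db=6.0):
            """Flag ONB when band power rises margin_db above a quiet baseline."""
            return band_power_db(x) > baseline_db + margin_db

        quiet = 1e-3 * np.random.randn(FS)
        boiling = quiet + 5e-3 * np.random.randn(FS)   # added bubble noise
        print(onb_detected(boiling, baseline_db=band_power_db(quiet)))  # True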

  18. Double Fourier analysis for Emotion Identification in Voiced Speech

    International Nuclear Information System (INIS)

    Sierra-Sosa, D.; Bastidas, M.; Ortiz P, D.; Quintero, O.L.

    2016-01-01

    We propose a novel analysis alternative, based on two Fourier transforms, for emotion recognition from speech. Fourier analysis allows the display and synthesis of different signals in terms of power spectral density distributions. A spectrogram of the voice signal is obtained by performing a short-time Fourier transform with Gaussian windows; this spectrogram portrays frequency-related features, such as vocal tract resonances and quasi-periodic excitations during voiced sounds. Emotions induce such characteristics in speech, which become apparent in the spectrogram's time-frequency distribution. The signal's time-frequency representation from the spectrogram is then treated as an image and processed through a two-dimensional Fourier transform in order to perform a spatial Fourier analysis of it. Finally, features related to emotions in voiced speech are extracted and presented.
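
    A compact sketch of the described pipeline, a Gaussian-windowed short-time Fourier transform followed by a two-dimensional Fourier transform of the spectrogram magnitude treated as an image; the signal and parameter values are illustrative:

        import numpy as np
        from scipy.signal import stft

        fs = 16000
        t = np.arange(fs) / fs
        # Toy voiced signal: a 120 Hz harmonic series standing in for speech.
        x = sum(np.sin(2 * np.pi * 120 * k * t) / k for k in range(1, 20))

        # First Fourier transform: Gaussian-windowed STFT -> spectrogram.
        f, tau, Z = stft(x, fs=fs, window=('gaussian', 64),
                         nperseg=512, noverlap=384)
        S = np.abs(Z)

        # Second Fourier transform: treat the spectrogram as an image.
        features = np.abs(np.fft.fftshift(np.fft.fft2(S)))
        print(S.shape, features.shape)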

  19. Tutorial and Guidelines on Measurement of Sound Pressure Level in Voice and Speech

    Science.gov (United States)

    Švec, Jan G.; Granqvist, Svante

    2018-01-01

    Purpose: Sound pressure level (SPL) measurement of voice and speech is often considered a trivial matter, but the measured levels are often reported incorrectly or incompletely, making them difficult to compare among various studies. This article aims at explaining the fundamental principles behind these measurements and providing guidelines to…
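
    The quantity behind these guidelines is the sound pressure level itself, referenced to 20 µPa; a one-function sketch that presumes the recording chain has already been calibrated to pascals, which is the step such guidelines stress:

        import numpy as np

        P_REF = 20e-6  # standard reference pressure in air, Pa

        def spl_db(pressure_pa):
            """Sound pressure level (dB SPL) of a calibrated pressure waveform."""
            p_rms = np.sqrt(np.mean(np.square(pressure_pa)))
            return 20.0 * np.log10(p_rms / P_REF)

        # A 1 kHz tone with 1 Pa amplitude is about 91 dB SPL.
        t = np.arange(48000) / 48000
        print(f"{spl_db(np.sin(2 * np.pi * 1000 * t)):.1f} dB SPL")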

  20. Auditory Sketches: Very Sparse Representations of Sounds Are Still Recognizable.

    Directory of Open Access Journals (Sweden)

    Vincent Isnard

    Full Text Available Sounds in our environment like voices, animal calls or musical instruments are easily recognized by human listeners. Understanding the key features underlying this robust sound recognition is an important question in auditory science. Here, we studied the recognition by human listeners of new classes of sounds: acoustic and auditory sketches, sounds that are severely impoverished but still recognizable. Starting from a time-frequency representation, a sketch is obtained by keeping only sparse elements of the original signal, here, by means of a simple peak-picking algorithm. Two time-frequency representations were compared: a biologically grounded one, the auditory spectrogram, which simulates peripheral auditory filtering, and a simple acoustic spectrogram, based on a Fourier transform. Three degrees of sparsity were also investigated. Listeners were asked to recognize the category to which a sketch sound belongs: singing voices, bird calls, musical instruments, and vehicle engine noises. Results showed that, with the exception of voice sounds, very sparse representations of sounds (10 features, or energy peaks, per second) could be recognized above chance. No clear differences could be observed between the acoustic and the auditory sketches. For the voice sounds, however, a completely different pattern of results emerged, with at-chance or even below-chance recognition performances, suggesting that the important features of the voice, whatever they are, were removed by the sketch process. Overall, these perceptual results were well correlated with a model of auditory distances, based on spectro-temporal excitation patterns (STEPs). This study confirms the potential of these new classes of sounds, acoustic and auditory sketches, to study sound recognition.
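
    The sketch construction itself, keeping only the strongest time-frequency peaks and zeroing everything else, can be outlined as follows; this simplified pass picks peaks globally rather than with the study's exact per-second algorithm:

        import numpy as np
        from scipy.signal import stft

        def sparse_sketch(x, fs, peaks_per_second=10):
            """Keep only the strongest spectrogram bins (one 'feature' per peak)."""
            f, t, Z = stft(x, fs=fs, nperseg=512)
            S = np.abs(Z)
            n_keep = max(1, int(peaks_per_second * len(x) / fs))
            flat = S.ravel()
            keep_idx = np.argpartition(flat, -n_keep)[-n_keep:]
            sketch = np.zeros_like(flat)
            sketch[keep_idx] = flat[keep_idx]
            return sketch.reshape(S.shape)   # sparse time-frequency representation

        fs = 16000
        x = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)
        print(np.count_nonzero(sparse_sketch(x, fs)))   # 10 features for 1 s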

  1. Estimation of sound pressure levels of voiced speech from skin vibration of the neck

    NARCIS (Netherlands)

    Svec, JG; Titze, IR; Popolo, PS

    How accurately can sound pressure levels (SPLs) of speech be estimated from skin vibration of the neck? Measurements using a small accelerometer were carried out in 27 subjects (10 males and 17 females) who read Rainbow and Marvin Williams passages in soft, comfortable, and loud voice, while skin

  2. Voice Morphing Using 3D Waveform Interpolation Surfaces and Lossless Tube Area Functions

    Directory of Open Access Journals (Sweden)

    Lavner Yizhar

    2005-01-01

    Full Text Available Voice morphing is the process of producing intermediate or hybrid voices between the utterances of two speakers. It can also be defined as the process of gradually transforming the voice of one speaker to that of another. The ability to change the speaker's individual characteristics and to produce high-quality voices can be used in many applications. Examples include multimedia and video entertainment, as well as enrichment of speech databases in text-to-speech systems. In this study we present a new technique which enables production of a given number of intermediate voices or of utterances which gradually change from one voice to another. This technique is based on two components: (1) creation of a 3D prototype waveform interpolation (PWI) surface from the LPC residual signal, to produce an intermediate excitation signal; (2) a representation of the vocal tract by a lossless tube area function, and an interpolation of the parameters of the two speakers. The resulting synthesized signal sounds like a natural voice lying between the two original voices.
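
    The vocal-tract half of the scheme can be illustrated by interpolating two lossless-tube area functions and converting the result to synthesis-filter coefficients via the standard reflection-coefficient step-up recursion; the area values and the 0.5 morphing factor are hypothetical, and this is a generic sketch rather than the paper's implementation:

        import numpy as np

        def areas_to_reflection(areas):
            """Reflection coefficients of a lossless tube from section areas."""
            a = np.asarray(areas, dtype=float)
            return (a[:-1] - a[1:]) / (a[:-1] + a[1:])

        def reflection_to_lpc(k):
            """Reflection coefficients -> LPC polynomial (Levinson step-up)."""
            a = np.array([1.0])
            for ki in k:
                a_ext = np.concatenate([a, [0.0]])
                a = a_ext + ki * a_ext[::-1]
            return a

        # Hypothetical tube area functions (cm^2) for the two speakers.
        A1 = np.array([2.6, 1.8, 1.2, 2.0, 3.5, 4.0, 3.2, 2.5])
        A2 = np.array([1.9, 2.4, 2.9, 1.5, 2.2, 3.8, 4.5, 3.0])
        alpha = 0.5                                 # morphing factor, 0..1
        A_mid = (1 - alpha) * A1 + alpha * A2
        print(reflection_to_lpc(areas_to_reflection(A_mid)))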

  3. I like my voice better: self-enhancement bias in perceptions of voice attractiveness.

    Science.gov (United States)

    Hughes, Susan M; Harrison, Marissa A

    2013-01-01

    Previous research shows that the human voice can communicate a wealth of nonsemantic information; preferences for voices can predict health, fertility, and genetic quality of the speaker, and people often use voice attractiveness, in particular, to make these assessments of others. But it is not known what we think of the attractiveness of our own voices as others hear them. In this study eighty men and women rated the attractiveness of an array of voice recordings of different individuals and were not told that their own recorded voices were included in the presentation. Results showed that participants rated their own voices as sounding more attractive than others had rated their voices, and participants also rated their own voices as sounding more attractive than they had rated the voices of others. These findings suggest that people may engage in vocal implicit egotism, a form of self-enhancement.

  4. The Relationship Between Acoustic Signal Typing and Perceptual Evaluation of Tracheoesophageal Voice Quality for Sustained Vowels.

    Science.gov (United States)

    Clapham, Renee P; van As-Brooks, Corina J; van Son, Rob J J H; Hilgers, Frans J M; van den Brekel, Michiel W M

    2015-07-01

    To investigate the relationship between acoustic signal typing and perceptual evaluation of sustained vowels produced by tracheoesophageal (TE) speakers and the use of signal typing in the clinical setting. Two evaluators independently categorized 1.75-second segments of narrow-band spectrograms according to acoustic signal typing and independently evaluated the recording of the same segments on a visual analog scale according to overall perceptual acoustic voice quality. The relationship between acoustic signal typing and overall voice quality (as a continuous scale and as a four-point ordinal scale) was investigated and the proportion of inter-rater agreement as well as the reliability between the two measures is reported. The agreement between signal type (I-IV) and ordinal voice quality (four-point scale) was low but significant, and there was a significant linear relationship between the variables. Signal type correctly predicted less than half of the voice quality data. There was a significant main effect of signal type on continuous voice quality scores with significant differences in median quality scores between signal types I-IV, I-III, and I-II. Signal typing can be used as an adjunct to perceptual and acoustic evaluation of the same stimuli for TE speech as part of a multidimensional evaluation protocol. Signal typing in its current form provides limited predictive information on voice quality, and there is significant overlap between signal types II and III and perceptual categories. Future work should consider whether the current four signal types could be refined. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  5. Aerodynamic and sound intensity measurements in tracheoesophageal voice

    NARCIS (Netherlands)

    Grolman, Wilko; Eerenstein, Simone E. J.; Tan, Frédérique M. L.; Tange, Rinze A.; Schouwenburg, Paul F.

    2007-01-01

    BACKGROUND: In laryngectomized patients, tracheoesophageal voice generally provides a better voice quality than esophageal voice. Understanding the aerodynamics of voice production in patients with a voice prosthesis is important for optimizing prosthetic designs and successful voice rehabilitation.

  6. [Encapsulated voices : Estonian sound recordings from the German prisoner-of-war camps in 1916-1918] / Tõnu Tannberg

    Index Scriptorium Estoniae

    Tannberg, Tõnu, 1961-

    2013-01-01

    Review of: Encapsulated voices : Estonian sound recordings from the German prisoner-of-war camps in 1916-1918 (Das Baltikum in Geschichte und Gegenwart, 5). Edited by Jaan Ross. Böhlau Verlag. Cologne, Weimar and Vienna 2012

  7. The Prevalence of Stuttering, Voice, and Speech-Sound Disorders in Primary School Students in Australia

    Science.gov (United States)

    McKinnon, David H.; McLeod, Sharynne; Reilly, Sheena

    2007-01-01

    Purpose: The aims of this study were threefold: to report teachers' estimates of the prevalence of speech disorders (specifically, stuttering, voice, and speech-sound disorders); to consider correspondence between the prevalence of speech disorders and gender, grade level, and socioeconomic status; and to describe the level of support provided to…

  8. Speech masking and cancelling and voice obscuration

    Science.gov (United States)

    Holzrichter, John F.

    2013-09-10

    A non-acoustic sensor is used to measure a user's speech and then to broadcast an obscuring acoustic signal that diminishes the user's vocal acoustic output intensity and/or distorts the voice sounds, making them unintelligible to persons nearby. The non-acoustic sensor is positioned proximate to or contacting a user's neck or head skin tissue for sensing speech-production information.

  9. The shouted voice: A pilot study of laryngeal physiology under extreme aerodynamic pressure.

    Science.gov (United States)

    Lagier, Aude; Legou, Thierry; Galant, Camille; Amy de La Bretèque, Benoit; Meynadier, Yohann; Giovanni, Antoine

    2017-12-01

    The objective was to study the behavior of the larynx during shouted voice production, when the larynx is exposed to extremely high subglottic pressure. The study involved electroglottographic, acoustic, and aerodynamic analyses of shouts produced at maximum effort by three male participants. In the normal speaking voice, the voice sound pressure level (SPL) is proportional to the subglottic pressure. However, when the subglottic pressure reached high levels, the voice SPL reached a maximum value and then decreased as the subglottic pressure increased further. Furthermore, the electroglottographic signal sometimes lost its periodicity during the shout, suggesting irregular vocal fold vibration.

  10. Objective voice parameters in Colombian school workers with healthy voices

    NARCIS (Netherlands)

    L.C. Cantor Cutiva (Lady Catherine); A. Burdorf (Alex)

    2015-01-01

    textabstractObjectives: To characterize the objective voice parameters among school workers, and to identify associated factors of three objective voice parameters, namely fundamental frequency, sound pressure level and maximum phonation time. Materials and methods: We conducted a cross-sectional

  11. Applied Chaos Level Test for Validation of Signal Conditions Underlying Optimal Performance of Voice Classification Methods

    Science.gov (United States)

    Liu, Boquan; Polce, Evan; Sprott, Julien C.; Jiang, Jack J.

    2018-01-01

    Purpose: The purpose of this study is to introduce a chaos level test to evaluate linear and nonlinear voice type classification method performances under varying signal chaos conditions without subjective impression. Study Design: Voice signals were constructed with differing degrees of noise to model signal chaos. Within each noise power, 100…

  12. Connections between voice ergonomic risk factors and voice symptoms, voice handicap, and respiratory tract diseases.

    Science.gov (United States)

    Rantala, Leena M; Hakala, Suvi J; Holmqvist, Sofia; Sala, Eeva

    2012-11-01

    The aim of the study was to investigate the connections between voice ergonomic risk factors found in classrooms and voice-related problems in teachers. Voice ergonomic assessment was performed in 39 classrooms in 14 elementary schools by means of the Voice Ergonomic Assessment in Work Environment - Handbook and Checklist. The voice ergonomic risk factors assessed included working culture, noise, indoor air quality, working posture, stress, and access to a sound amplifier. Teachers from the above-mentioned classrooms reported their voice symptoms and respiratory tract diseases and completed a Voice Handicap Index (VHI). The more voice ergonomic risk factors found in the classroom, the higher were the teachers' total scores on voice symptoms and VHI. Stress was the factor that correlated most strongly with voice symptoms, and poor indoor air quality increased the occurrence of laryngitis. Voice ergonomics were poor in the classrooms studied, and the voice ergonomic risk factors affected the voice. It is important to convey information on voice ergonomics to education administrators and to those responsible for school planning and the upkeep of school buildings. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  13. Assessments of Voice Use and Voice Quality among College/University Singing Students Ages 18–24 through Ambulatory Monitoring with a Full Accelerometer Signal

    Science.gov (United States)

    Schloneger, Matthew; Hunter, Eric

    2016-01-01

    The multiple social and performance demands placed on college/university singers could put their still-developing voices at risk. Previous ambulatory monitoring studies have analyzed the duration, intensity, and frequency (in Hz) of voice use among such students. Nevertheless, no studies to date have incorporated simultaneous acoustic voice quality measures into the acquisition of these measures to allow for direct comparison during the same voicing period. Such data could provide greater insight into how young singers use their voices, as well as identify potential correlations between vocal dose and acoustic changes in voice quality. The purpose of this study was to assess the voice use and estimated voice quality of college/university singing students (18–24 y/o, N = 19). Ambulatory monitoring was conducted over three full, consecutive weekdays, measuring voice from an unprocessed accelerometer signal recorded at the neck. From this signal, traditional vocal dose metrics such as phonation percentage, dose time, cycle dose, and distance dose were analyzed. Additional acoustic measures included perceived pitch, pitch strength, LTAS slope, alpha ratio, dB SPL 1-3 kHz, and harmonics-to-noise ratio. Major findings from more than 800 hours of recording indicated that among these students (a) higher vocal doses correlated significantly with greater voice intensity, more vocal clarity, and less perturbation; and (b) there were significant differences in some acoustic voice quality metrics between non-singing, solo singing, and choral singing. PMID:26897545
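
    The dose metrics named above have standard definitions in vocal dosimetry: phonation percentage is the voiced fraction of time, time dose sums voiced frame durations, cycle dose sums F0 over voiced frames, and distance dose additionally weights by vibration amplitude. A sketch from per-frame F0 and amplitude tracks, with the frame length and amplitude values as assumptions:

        import numpy as np

        DT = 0.05  # frame length, s (assumed)

        def vocal_doses(f0_hz, amp_mm):
            """Vocal doses from per-frame F0 (Hz) and vocal fold vibration
            amplitude (mm); unvoiced frames are marked by f0 == 0."""
            voiced = f0_hz > 0
            pct = 100.0 * np.mean(voiced)              # phonation percentage
            dt_dose = DT * np.sum(voiced)              # time dose, s
            dc_dose = DT * np.sum(f0_hz[voiced])       # cycle dose, cycles
            dd_dose = DT * 4e-3 * np.sum(amp_mm[voiced] * f0_hz[voiced])  # m
            return pct, dt_dose, dc_dose, dd_dose

        f0 = np.array([0, 210, 215, 0, 0, 220, 225, 230, 0, 0], dtype=float)
        amp = np.full_like(f0, 0.8)                    # mm, illustrative
        print(vocal_doses(f0, amp))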

  14. Objective Voice Parameters in Colombian School Workers with Healthy Voices

    Directory of Open Access Journals (Sweden)

    Lady Catherine Cantor Cutiva

    2015-09-01

    Full Text Available Objectives: To characterize the objective voice parameters among school workers, and to identify associated factors of three objective voice parameters, namely fundamental frequency, sound pressure level and maximum phonation time. Materials and methods: We conducted a cross-sectional study among 116 Colombian teachers and 20 Colombian non-teachers. After signing the informed consent form, participants filled out a questionnaire. Then, a voice sample was recorded and evaluated perceptually by a speech therapist and by objective voice analysis with Praat software. Short-term environmental measurements of sound level, temperature, humidity, and reverberation time were conducted during visits at the workplaces, such as classrooms and offices. Linear regression analysis was used to determine associations between individual and work-related factors and objective voice parameters. Results: Compared with men, women had higher fundamental frequency (201 Hz for teachers and 209 Hz for non-teachers vs. 120 Hz for teachers and 127 Hz for non-teachers) and sound pressure level (82 dB vs. 80 dB), and shorter maximum phonation time (around 14 seconds vs. around 16 seconds). Female teachers younger than 50 years of age showed a significant tendency to speak with lower fundamental frequency and shorter maximum phonation time compared with female teachers older than 50 years of age. Female teachers had significantly higher fundamental frequency (by 66 Hz), higher sound pressure level (by 2 dB) and shorter maximum phonation time (by 2 seconds) than male teachers. Conclusion: Female teachers younger than 50 years of age had significantly lower fundamental frequency and shorter maximum phonation time compared with those older than 50 years of age. The multivariate analysis showed that gender was a much more important determinant of variations in fundamental frequency, sound pressure level, and maximum phonation time than age and teaching occupation. Objectively measured temperature also contributed to changes in sound pressure level among school workers.
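
    Fundamental frequency and sound level can be extracted with Praat's algorithms; a minimal sketch assuming the parselmouth Python bindings and a hypothetical recording file (maximum phonation time, by definition, comes from a timed sustained-vowel task rather than from the file itself):

        import numpy as np
        import parselmouth   # assumed dependency: pip install praat-parselmouth

        snd = parselmouth.Sound("voice_sample.wav")    # hypothetical file

        # Fundamental frequency: median F0 over voiced frames.
        pitch = snd.to_pitch()
        f0 = pitch.selected_array['frequency']
        f0_median = np.median(f0[f0 > 0])

        # Intensity in dB (maps to SPL only if the chain is calibrated).
        level_mean = snd.to_intensity().values.mean()

        print(f"F0 = {f0_median:.0f} Hz, level = {level_mean:.1f} dB")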

  15. The recognition of female voice based on voice registers in singing techniques in real time using the Hankel transform method and Macdonald function

    Science.gov (United States)

    Meiyanti, R.; Subandi, A.; Fuqara, N.; Budiman, M. A.; Siahaan, A. P. U.

    2018-03-01

    A singer does not merely recite the lyrics of a song; particular vocal techniques are also used to make the song more beautiful. In singing technique, female singers have a more diverse set of voice registers than male singers. There are many registers of the human voice, but the voice registers used while singing include chest voice, head voice, falsetto, and vocal fry. This research on the recognition of female voice registers in singing technique, in real time, is built using Borland Delphi 7.0. The recognition process is performed both on recorded voice samples and in real time. Voice input yields weighted energy values based on calculations using the Hankel transform method and Macdonald functions. The results showed that the accuracy of the system depends on the accuracy of the vocal technique trained and tested; the average recognition rate reached 48.75 percent for recorded voice registers and 57 percent for voice registers in real time.

  16. You're a What? Voice Actor

    Science.gov (United States)

    Liming, Drew

    2009-01-01

    This article talks about voice actors and features Tony Oliver, a professional voice actor. Voice actors help to bring one's favorite cartoon and video game characters to life. They also do voice-overs for radio and television commercials and movie trailers. These actors use the sound of their voice to sell a character's emotions--or an advertised…

  17. Sound specificity effects in spoken word recognition: The effect of integrality between words and sounds.

    Science.gov (United States)

    Strori, Dorina; Zaar, Johannes; Cooke, Martin; Mattys, Sven L

    2018-01-01

    Recent evidence has shown that nonlinguistic sounds co-occurring with spoken words may be retained in memory and affect later retrieval of the words. This sound-specificity effect shares many characteristics with the classic voice-specificity effect. In this study, we argue that the sound-specificity effect is conditional upon the context in which the word and sound coexist. Specifically, we argue that, besides co-occurrence, integrality between words and sounds is a crucial factor in the emergence of the effect. In two recognition-memory experiments, we compared the emergence of voice and sound specificity effects. In Experiment 1, we examined two conditions where integrality is high. Namely, the classic voice-specificity effect (Exp. 1a) was compared with a condition in which the intensity envelope of a background sound was modulated along the intensity envelope of the accompanying spoken word (Exp. 1b). Results revealed a robust voice-specificity effect and, critically, a comparable sound-specificity effect: A change in the paired sound from exposure to test led to a decrease in word-recognition performance. In the second experiment, we sought to disentangle the contribution of integrality from a mere co-occurrence context effect by removing the intensity modulation. The absence of integrality led to the disappearance of the sound-specificity effect. Taken together, the results suggest that the assimilation of background sounds into memory cannot be reduced to a simple context effect. Rather, it is conditioned by the extent to which words and sounds are perceived as integral as opposed to distinct auditory objects.

  18. Singing Voice Analysis, Synthesis, and Modeling

    Science.gov (United States)

    Kim, Youngmoo E.

    The singing voice is the oldest musical instrument, but its versatility and emotional power are unmatched. Through the combination of music, lyrics, and expression, the voice is able to affect us in ways that no other instrument can. The fact that vocal music is prevalent in almost all cultures is indicative of its innate appeal to the human aesthetic. Singing also permeates most genres of music, attesting to the wide range of sounds the human voice is capable of producing. As listeners we are naturally drawn to the sound of the human voice, and, when present, it immediately becomes the focus of our attention.

  19. Face the voice

    DEFF Research Database (Denmark)

    Lønstrup, Ansa

    2014-01-01

    will be based on a reception aesthetic and phenomenological approach, the latter as presented by Don Ihde in his book Listening and Voice: Phenomenologies of Sound, and my analytical sketches will be related to theoretical statements concerning the understanding of voice and media (Cavarero, Dolar, LaBelle, Neumark). Finally, the article will discuss the specific artistic combination and our auditory experience of mediated human voices and sculpturally projected faces in an art museum context under the general conditions of the societal panophonia of disembodied and mediated voices, as promoted by Steven

  20. Reverberation impairs brainstem temporal representations of voiced vowel sounds: challenging periodicity-tagged segregation of competing speech in rooms

    Directory of Open Access Journals (Sweden)

    Mark eSayles

    2015-01-01

    Full Text Available The auditory system typically processes information from concurrently active sound sources (e.g., two voices speaking at once), in the presence of multiple delayed, attenuated and distorted sound-wave reflections (reverberation). Brainstem circuits help segregate these complex acoustic mixtures into auditory objects. Psychophysical studies demonstrate a strong interaction between reverberation and fundamental-frequency (F0) modulation, leading to impaired segregation of competing vowels when segregation is on the basis of F0 differences. Neurophysiological studies of complex-sound segregation have concentrated on sounds with steady F0s, in anechoic environments. However, F0 modulation and reverberation are quasi-ubiquitous. We examine the ability of 129 single units in the ventral cochlear nucleus of the anesthetized guinea pig to segregate the concurrent synthetic vowel sounds /a/ and /i/, based on temporal discharge patterns under closed-field conditions. We address the effects of added real-room reverberation, F0 modulation, and the interaction of these two factors, on brainstem neural segregation of voiced speech sounds. A firing-rate representation of single vowels' spectral envelopes is robust to the combination of F0 modulation and reverberation: local firing-rate maxima and minima across the tonotopic array code vowel-formant structure. However, single-vowel F0-related periodicity information in shuffled inter-spike interval distributions is significantly degraded in the combined presence of reverberation and F0 modulation. Hence, segregation of double vowels' spectral energy into two streams (corresponding to the two vowels) on the basis of temporal discharge patterns is impaired by reverberation, specifically when F0 is modulated. All unit types (primary-like, chopper, onset) are similarly affected. These results offer neurophysiological insights into the perceptual organization of complex acoustic scenes under realistically challenging…

  1. Separation and reconstruction of high pressure water-jet reflective sound signal based on ICA

    Science.gov (United States)

    Yang, Hongtao; Sun, Yuling; Li, Meng; Zhang, Dongsu; Wu, Tianfeng

    2011-12-01

    The impact of a high-pressure water-jet on targets of different materials produces different reflected sound mixtures. In order to reconstruct the distribution of reflected sound signals along the linear detecting line accurately and to separate the environmental noise effectively, the mixed sound signals acquired by a linear microphone array were processed by independent component analysis (ICA). The basic principle of ICA and the FastICA algorithm are described in detail. An emulation experiment was designed: the environmental noise was simulated using band-limited white noise, and the reflected sound signal was simulated using a pulse signal. The attenuation of the reflected sound signal over different transmission distances was simulated by weighting the sound signal with different coefficients. The mixed sound signals acquired by the linear microphone array were synthesized from the above simulated signals and were whitened and separated by ICA. The final results verified that environmental noise separation and reconstruction of the sound distribution along the detecting line can be realized effectively.
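
    A small emulation in the same spirit, using scikit-learn's FastICA to unmix a pulse-like "reflection" from band-limited noise; the mixing weights and signal shapes are illustrative, not the paper's setup:

        import numpy as np
        from sklearn.decomposition import FastICA

        rng = np.random.default_rng(3)
        n = 4000
        t = np.arange(n)

        pulse = np.exp(-((t - 1000) % 800) / 40.0)     # repeating reflection pulse
        noise = np.convolve(rng.standard_normal(n),    # band-limited white noise
                            np.ones(8) / 8, mode='same')
        S = np.c_[pulse, noise]

        A = np.array([[1.0, 0.6],                      # attenuation weights seen
                      [0.4, 1.0]])                     # by two array microphones
        X = S @ A.T                                    # observed mixtures

        ica = FastICA(n_components=2, whiten='unit-variance', random_state=0)
        S_est = ica.fit_transform(X)                   # separated components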

  2. Foetal response to music and voice.

    Science.gov (United States)

    Al-Qahtani, Noura H

    2005-10-01

    To examine whether prenatal exposure to music and voice alters foetal behaviour, and whether the foetal response to music differs from that to the human voice. A prospective observational study was conducted in 20 normal term pregnant mothers. Ten foetuses were exposed to music and voice for 15 s at different sound pressure levels to find the optimal setting for auditory stimulation. Music, voice and sham were played to another 10 foetuses via a headphone on the maternal abdomen. The sound pressure levels were 105 dB and 94 dB for music and voice, respectively. Computerised assessments of foetal heart rate and activity were recorded, and 90 actocardiograms were obtained for the whole group. One-way ANOVA followed by post hoc analysis (Student-Newman-Keuls method) was used to determine whether there was a significant difference in foetal response to music and voice versus sham. Foetuses responded with heart rate acceleration and a motor response to both music and voice; this was statistically significant compared to sham. There was no significant difference between the foetal heart rate acceleration to music and to voice. Prenatal exposure to music and voice alters foetal behaviour, and no difference was detected between foetal responses to music and voice.

  3. Sounds like a winner: voice pitch influences perception of leadership capacity in both men and women.

    Science.gov (United States)

    Klofstad, Casey A; Anderson, Rindy C; Peters, Susan

    2012-07-07

    It is well known that non-human animals respond to information encoded in vocal signals, and the same can be said of humans. Specifically, human voice pitch affects how speakers are perceived. As such, does voice pitch affect how we perceive and select our leaders? To answer this question, we recorded men and women saying 'I urge you to vote for me this November'. Each recording was manipulated digitally to yield a higher- and lower-pitched version of the original. We then asked men and women to vote for either the lower- or higher-pitched version of each voice. Our results show that both men and women select male and female leaders with lower voices. These findings suggest that men and women with lower-pitched voices may be more successful in obtaining positions of leadership. This might also suggest that because women, on average, have higher-pitched voices than men, voice pitch could be a factor that contributes to fewer women holding leadership roles than men. Additionally, while people are free to choose their leaders, these results clearly demonstrate that these choices cannot be understood in isolation from biological influences.

  4. Emotional voices in context: a neurobiological model of multimodal affective information processing.

    Science.gov (United States)

    Brück, Carolin; Kreifelts, Benjamin; Wildgruber, Dirk

    2011-12-01

    Just as eyes are often considered a gateway to the soul, the human voice offers a window through which we gain access to our fellow human beings' minds - their attitudes, intentions and feelings. Whether in talking or singing, crying or laughing, sighing or screaming, the sheer sound of a voice communicates a wealth of information that, in turn, may serve the observant listener as a valuable guidepost in social interaction. But how do human beings extract information from the tone of a voice? In an attempt to answer this question, the present article reviews empirical evidence detailing the cerebral processes that underlie our ability to decode emotional information from vocal signals. The review will focus primarily on two prominent classes of vocal emotion cues: laughter and speech prosody (i.e., the tone of voice while speaking). Following a brief introduction, behavioral as well as neuroimaging data will be summarized that allow us to outline the cerebral mechanisms associated with the decoding of emotional voice cues, as well as the influence of various context variables (e.g., co-occurring facial and verbal emotional signals, attention focus, person-specific parameters such as gender and personality) on the respective processes. Building on the presented evidence, a cerebral network model will be introduced that proposes a differential contribution of various cortical and subcortical brain structures to the processing of emotional voice signals both in isolation and in the context of accompanying (facial and verbal) emotional cues. Copyright © 2011 Elsevier B.V. All rights reserved.

  5. Sound specificity effects in spoken word recognition: The effect of integrality between words and sounds

    DEFF Research Database (Denmark)

    Strori, Dorina; Zaar, Johannes; Cooke, Martin

    2017-01-01

    Recent evidence has shown that nonlinguistic sounds co-occurring with spoken words may be retained in memory and affect later retrieval of the words. This sound-specificity effect shares many characteristics with the classic voice-specificity effect. In this study, we argue that the sound-specificity effect is conditional upon the context in which the word and sound coexist. Specifically, we argue that, besides co-occurrence, integrality between words and sounds is a crucial factor in the emergence of the effect. In two recognition-memory experiments, we compared the emergence of voice and sound specificity effects. … We sought to disentangle the contribution of integrality from a mere co-occurrence context effect by removing the intensity modulation. The absence of integrality led to the disappearance of the sound-specificity effect. Taken together, the results suggest that the assimilation of background sounds into memory cannot be reduced to a simple context effect…

  7. The Belt voice: Acoustical measurements and esthetic correlates

    Science.gov (United States)

    Bounous, Barry Urban

    This dissertation explores the esthetic attributes of the Belt voice through spectral acoustical analysis. The process of understanding the nature and safe practice of Belt is just beginning, whereas the understanding of classical singing is well established. The unique nature of the Belt sound creates difficulties for voice teachers attempting to evaluate the quality and appropriateness of a particular sound or performance. This study attempts to answer the question "does Belt conform to a set of measurable esthetic standards?" In answering this question, this paper expands on a previous study of the esthetic attributes of the classical baritone voice (see "Vocal Beauty", NATS Journal 51, 1), which drew some tentative conclusions about the Belt voice but had an inadequate sample pool of subjects. Further, this study demonstrates that it is possible to scientifically investigate the realm of musical esthetics in the singing voice. It is possible to go beyond the "a trained voice compared to an untrained voice" paradigm when evaluating quantitative vocal parameters and actually investigate what truly beautiful voices do. There are functions of sound-energy transference (measured in dB) which may affect the nervous system in predictable ways and which can be measured and associated with esthetics. This study does not show consistency in measurements of absolute beauty (taste), even among Belt teachers and researchers, but does show some markers, of varying degrees of importance, which may point to a difference between our cognitive, learned response to singing and our emotional, more visceral response to sounds. The markers which are significant in determining vocal beauty are: (1) Vibrancy - characteristics of vibrato including speed, width, and consistency (low variability). (2) Spectral makeup - ratio of partial strength above the fundamental to the fundamental. (3) Activity of the voice - the quantity of energy being produced. (4

  8. Analyzing the mediated voice - a datasession

    DEFF Research Database (Denmark)

    Lawaetz, Anna

    Broadcast voices are technologically manipulated. In order to achieve a certain authenticity or sound of "reality", the voices are, paradoxically, filtered and trained in order to reach the listeners. This "mise-en-scène" is important knowledge when it comes to the development of a consistent method o... of analysis of the mediated voice...

  9. Prior and posterior probabilistic models of uncertainties in a model for producing voice

    International Nuclear Information System (INIS)

    Cataldo, Edson; Sampaio, Rubens; Soize, Christian

    2010-01-01

    The aim of this paper is to use Bayesian statistics to update a probability density function related to the tension parameter, which is one of the main parameters responsible for changing the fundamental frequency of a voice signal generated by a mechanical/mathematical model for producing voiced sounds. We follow a parametric approach for stochastic modeling, which requires the adoption of random variables to represent the uncertain parameters present in the cited model. For each random variable, a probability density function is constructed using the Maximum Entropy Principle, and the Monte Carlo method is used to generate voice signals as the output of the model. Then, a probability density function of the voice fundamental frequency is constructed. The random variables are fit to experimental data so that the probability density function of the fundamental frequency obtained by the model can be as near as possible to a probability density function obtained from experimental data. New values are obtained experimentally for the fundamental frequency, and they are used to update the probability density function of the tension parameter via Bayes's theorem.
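
    A minimal numerical sketch of this kind of Bayesian update, in Python. The voice model is reduced to a hypothetical map from tension to fundamental frequency plus Gaussian observation noise; the function model, the grid bounds, the observations, and the noise level are all illustrative assumptions, not the paper's actual mechanical model.

    import numpy as np

    def model(tension):
        # Hypothetical stand-in for the mechanical/mathematical voice model:
        # higher fold tension -> higher fundamental frequency (Hz).
        return 80.0 + 60.0 * tension

    # Prior over the tension parameter (Maximum Entropy on [0, 1] -> uniform).
    grid = np.linspace(0.0, 1.0, 1000)
    prior = np.ones_like(grid) / grid.size

    # Newly measured fundamental frequencies (Hz), assumed noisy (invented values).
    observations = np.array([118.0, 122.5, 120.1])
    sigma = 3.0  # assumed measurement-noise standard deviation (Hz)

    # Bayes's theorem on the grid: posterior proportional to likelihood x prior.
    log_like = sum(-0.5 * ((f0 - model(grid)) / sigma) ** 2 for f0 in observations)
    posterior = prior * np.exp(log_like - log_like.max())
    posterior /= posterior.sum()

    print("posterior mean tension:", np.sum(grid * posterior))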

  10. Bodies, Spaces, Voices, Silences

    Directory of Open Access Journals (Sweden)

    Donatella Mazzoleni

    2013-07-01

    A good architecture should not only provide functional, formal and technical quality for urban spaces, but also let the voice of the city be perceived, listened to, enjoyed. Every city has its own specific sound identity, or "ISO" (R. O. Benenzon), made up of a complex texture of background noises and fluctuations of sound figures emerging and disappearing in a game of continuous fadings. For instance, the ISO of Naples is characterized by a widespread need of hearing the sound return of one's/others' voices, and by a hatred of silence. Cities may fall ill: illness from noise, within super-crowded neighbourhoods, or illness from silence, in the forced isolation of peripheries. The proposal of an urban music therapy denotes a novel and innovative, enlarged interdisciplinary research path, where architecture, music, medicine, psychology, and communication science may converge in order to work toward rebalancing the spaces and relational life of the urban collectivity, through the care of its body and sound dimensions.

  11. Sounds of Modified Flight Feathers Reliably Signal Danger in a Pigeon.

    Science.gov (United States)

    Murray, Trevor G; Zeil, Jochen; Magrath, Robert D

    2017-11-20

    In his book on sexual selection, Darwin [1] devoted equal space to non-vocal and vocal communication in birds. Since then, vocal communication has become a model for studies of neurobiology, learning, communication, evolution, and conservation [2, 3]. In contrast, non-vocal "instrumental music," as Darwin called it, has only recently become subject to sustained inquiry [4, 5]. In particular, outstanding work reveals how feathers, often highly modified, produce distinctive sounds [6-9], and suggests that these sounds have evolved at least 70 times, in many orders [10]. It remains to be shown, however, that such sounds are signals used in communication. Here we show that crested pigeons (Ocyphaps lophotes) signal alarm with specially modified wing feathers. We used video and feather-removal experiments to demonstrate that the highly modified 8th primary wing feather (P8) produces a distinct note during each downstroke. The sound changes with wingbeat frequency, so that birds fleeing danger produce wing sounds with a higher tempo. Critically, a playback experiment revealed that only if P8 is present does the sound of escape flight signal danger. Our results therefore indicate, nearly 150 years after Darwin's book, that modified feathers can be used for non-vocal communication, and they reveal an intrinsically reliable alarm signal. Copyright © 2017 Elsevier Ltd. All rights reserved.

  12. Connections between voice ergonomic risk factors in classrooms and teachers' voice production.

    Science.gov (United States)

    Rantala, Leena M; Hakala, Suvi; Holmqvist, Sofia; Sala, Eeva

    2012-01-01

    The aim of the study was to investigate whether voice ergonomic risk factors in classrooms correlated with acoustic parameters of teachers' voice production. The voice ergonomic risk factors in the fields of working culture, working postures, and indoor air quality were assessed in 40 classrooms using the Voice Ergonomic Assessment in Work Environment - Handbook and Checklist. Teachers (32 females, 8 males) from the above-mentioned classrooms recorded text readings before and after a working day. Fundamental frequency, sound pressure level (SPL), and the slope of the spectrum (alpha ratio) were analyzed. The higher the number of risk factors in a classroom, the higher the SPL the teachers used and the more strained the male teachers' voices were (increased alpha ratio). The SPL was already higher before the working day in the teachers with higher risk than in those with lower risk. In a working environment with many voice ergonomic risk factors, speakers increase voice loudness and, in the case of males, use a more strained voice quality. A practical implication of the results is that voice ergonomic assessments are needed in schools. Copyright © 2013 S. Karger AG, Basel.

  13. A homology sound-based algorithm for speech signal interference

    Science.gov (United States)

    Jiang, Yi-jiao; Chen, Hou-jin; Li, Ju-peng; Zhang, Zhan-song

    2015-12-01

    Aiming at secure analog speech communication, a homology sound-based algorithm for speech signal interference is proposed in this paper. We first split the speech signal into phonetic fragments by a short-term energy method and establish an interference noise cache library from the phonetic fragments. We then implement the homology sound interference by mixing randomly selected interferential fragments with the original speech in real time. The computer simulation results indicate that the interference produced by this algorithm has the advantages of real-time operation, randomness, and high correlation with the original signal, compared with traditional noise interference methods such as white-noise interference. With further study, the proposed algorithm may be readily used in secure speech communication.
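
    A minimal sketch of the homology-interference idea under assumed parameters (frame length, relative energy threshold): segment the speech by short-term energy, cache the high-energy phonetic fragments, then mix randomly chosen cached fragments back over the signal frame by frame. All names and settings are illustrative, not the paper's.

    import numpy as np

    def short_term_energy(x, frame=256):
        n = len(x) // frame
        return np.array([np.sum(x[i*frame:(i+1)*frame]**2) for i in range(n)])

    def homology_interference(x, frame=256, rel_threshold=0.1, rng=None):
        rng = rng or np.random.default_rng()
        energy = short_term_energy(x, frame)
        # Cache library: frames whose energy exceeds a fraction of the peak.
        cache = [x[i*frame:(i+1)*frame] for i, e in enumerate(energy)
                 if e > rel_threshold * energy.max()]
        # Mix a randomly selected cached fragment over every frame.
        y = x.copy()
        for i in range(len(energy)):
            frag = cache[rng.integers(len(cache))]
            y[i*frame:(i+1)*frame] += frag
        return y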

  14. Cultural and language differences in voice quality perception: a preliminary investigation using synthesized signals.

    Science.gov (United States)

    Yiu, Edwin M-L; Murdoch, Bruce; Hird, Kathryn; Lau, Polly; Ho, Elaine Mandy

    2008-01-01

    Perceptual voice evaluation is a common clinical tool. However, to date, there is no consensus as to which common qualities should be measured. Available evidence shows that voice quality is a language-specific property which may differ across languages. Familiarity with a language may affect the perception of, and the reliability of rating, voice quality. The present study set out to investigate the effects of listeners' cultural and language backgrounds on the perception of voice qualities. Forty speech pathology students from Australia and Hong Kong were asked to rate the breathy and rough qualities of synthesized voice signals in Cantonese and English. Results showed that the English stimulus sets as a whole were rated less severely than the Cantonese stimuli by both groups of listeners. In addition, the male Cantonese and English breathy stimuli were rated differently by the Australian and Hong Kong listeners. These results provide some evidence to support the claim that the cultural and language backgrounds of listeners affect the perception of some voice quality types. Thus, the cultural and language backgrounds of judges should be taken into consideration in clinical voice evaluation. Copyright © 2008 S. Karger AG, Basel.

  15. Analysis of sound data streamed over the network

    Directory of Open Access Journals (Sweden)

    Jiří Fejfar

    2013-01-01

    In this paper we inspect the difference between an original sound recording and the signal captured after streaming the original recording over a network loaded with heavy traffic. There are several kinds of failures occurring in the captured recording caused by network congestion. We try to find a method to evaluate the correctness of the streamed audio. Usually, metrics are based on human perception of the signal, such as "the signal is clear, without audible failures", "the signal has some failures but is understandable", or "the signal is inarticulate". These approaches need to be statistically evaluated on a broad set of respondents, which is time and resource consuming. We instead propose metrics based on signal properties that allow us to compare the original and captured recordings. In this paper we use the algorithm called Dynamic Time Warping (Müller, 2007), commonly used for time series comparison. Some other time series exploration approaches can be found in (Fejfar, 2011) and (Fejfar, 2012). The data was acquired in our network laboratory, simulating network traffic by downloading files and streaming audio and video simultaneously. Our former experiment inspected Quality of Service (QoS) and its impact on failures in the received audio data stream. This experiment focuses on the comparison of sound recordings rather than on network mechanisms. We focus, in this paper, on a real-time audio stream such as a telephone call, where it is not possible to stream audio in advance to a "pool". Instead, it is necessary to achieve as small a delay as possible between speaker voice recording and listener voice replay. We use the RTP protocol for streaming audio.
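
    For reference, a textbook dynamic time warping distance of the kind used in the comparison above (Müller, 2007), sketched in Python on plain 1-D samples; in practice per-frame spectral features would typically be compared instead.

    import numpy as np

    def dtw_distance(a, b):
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = abs(a[i - 1] - b[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[n, m]

    # Example: compare an original and a (time-shifted, lossy) captured signal.
    original = np.sin(np.linspace(0, 10, 100))
    captured = np.sin(np.linspace(0.5, 10.5, 95))
    print("DTW distance:", dtw_distance(original, captured))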

  16. Performance of Phonatory Deviation Diagrams in Synthesized Voice Analysis.

    Science.gov (United States)

    Lopes, Leonardo Wanderley; da Silva, Karoline Evangelista; da Silva Evangelista, Deyverson; Almeida, Anna Alice; Silva, Priscila Oliveira Costa; Lucero, Jorge; Behlau, Mara

    2018-05-02

    To analyze the performance of the phonatory deviation diagram (PDD) in discriminating the presence and severity of voice deviation and the predominant voice quality of synthesized voices. A speech-language pathologist performed the auditory-perceptual analysis of the synthesized voices (n = 871). The PDD distribution of the voice signals was analyzed according to area, quadrant, shape, and density. Differences in signal distribution with regard to PDD area and quadrant were detected when differentiating signals with and without voice deviation and signals with different predominant voice qualities. Differences in signal distribution were found for all PDD parameters as a function of the severity of voice deviation. The PDD area and quadrant can differentiate normal voices from deviant synthesized voices. There are differences in signal distribution in PDD area and quadrant as a function of the severity of voice deviation and the predominant voice quality. However, the PDD area and quadrant do not fully differentiate the signals as a function of severity of voice deviation, and they differentiated only the breathy and rough voices from the normal and strained voices. PDD density is able to differentiate only signals with moderate and severe deviation. PDD shape shows differences between signals with different severities of voice deviation. © 2018 S. Karger AG, Basel.

  17. Natural asynchronies in audiovisual communication signals regulate neuronal multisensory interactions in voice-sensitive cortex.

    Science.gov (United States)

    Perrodin, Catherine; Kayser, Christoph; Logothetis, Nikos K; Petkov, Christopher I

    2015-01-06

    When social animals communicate, the onset of informative content in one modality varies considerably relative to the other, such as when visual orofacial movements precede a vocalization. These naturally occurring asynchronies do not disrupt intelligibility or perceptual coherence. However, they occur on time scales where they likely affect integrative neuronal activity in ways that have remained unclear, especially for hierarchically downstream regions in which neurons exhibit temporally imprecise but highly selective responses to communication signals. To address this, we exploited naturally occurring face- and voice-onset asynchronies in primate vocalizations. Using these as stimuli we recorded cortical oscillations and neuronal spiking responses from functional MRI (fMRI)-localized voice-sensitive cortex in the anterior temporal lobe of macaques. We show that the onset of the visual face stimulus resets the phase of low-frequency oscillations, and that the face-voice asynchrony affects the prominence of two key types of neuronal multisensory responses: enhancement or suppression. Our findings show a three-way association between temporal delays in audiovisual communication signals, phase-resetting of ongoing oscillations, and the sign of multisensory responses. The results reveal how natural onset asynchronies in cross-sensory inputs regulate network oscillations and neuronal excitability in the voice-sensitive cortex of macaques, a suggested animal model for human voice areas. These findings also advance predictions on the impact of multisensory input on neuronal processes in face areas and other brain regions.

  18. Enhancement of speech signals - with a focus on voiced speech models

    DEFF Research Database (Denmark)

    Nørholm, Sidsel Marie

    This thesis deals with speech enhancement, i.e., noise reduction in speech signals. This has applications in, e.g., hearing aids and teleconference systems. We consider a signal-driven approach to speech enhancement where a model of the speech is assumed and filters are generated based...... on this model. The basic model used in this thesis is the harmonic model which is a commonly used model for describing the voiced part of the speech signal. We show that it can be beneficial to extend the model to take inharmonicities or the non-stationarity of speech into account. Extending the model...

  19. Voice Quality and Gender Stereotypes: A Study of Lebanese Women With Reinke's Edema.

    Science.gov (United States)

    Matar, Nayla; Portes, Cristel; Lancia, Leonardo; Legou, Thierry; Baider, Fabienne

    2016-12-01

    Women with Reinke's edema (RW) report being mistaken for men during telephone conversations. For this reason, their masculine-sounding voices are interesting for the study of gender stereotypes. The study's objective is to verify their complaint and to understand the cues used in gender identification. Using a self-evaluation study, we verified RW's perception of their own voices. We compared the acoustic parameters of vowels produced by 10 RW to those produced by 10 men and 10 women with healthy voices (hereafter referred to as NW) in Lebanese Arabic. We conducted a perception study for the evaluation of RW, healthy men's, and NW voices by naïve listeners. RW self-evaluated their voices as masculine and their gender identities as feminine. The acoustic parameters that distinguish RW from NW voices concern fundamental frequency, spectral slope, harmonicity of the voicing signal, and complexity of the spectral envelope. Naïve listeners very often rate RW as surely masculine. Listeners may rate RW's gender incorrectly. These incorrect gender ratings are correlated with acoustic measures of fundamental frequency and voice quality. Further investigations will reveal the contribution of each of these parameters to gender perception and guide the treatment plan of patients complaining of a gender ambiguous voice.

  20. Reconstruction of sound source signal by analytical passive TR in the environment with airflow

    Science.gov (United States)

    Wei, Long; Li, Min; Yang, Debin; Niu, Feng; Zeng, Wu

    2017-03-01

    In the acoustic design of air vehicles, the time-domain signals of noise sources on the surface of the vehicle can serve as data support to reveal the noise source generation mechanism, analyze acoustic fatigue, and take measures for noise insulation and reduction. To rapidly reconstruct time-domain sound source signals in an environment with flow, a method combining the analytical passive time reversal mirror (AP-TR) with a shear flow correction is proposed. In this method, the negative influence of flow on sound wave propagation is suppressed by the shear flow correction, yielding a corrected acoustic propagation time delay and path. The corrected time delay and path, together with the microphone array signals, are then submitted to the AP-TR, reconstructing more accurate sound source signals in the environment with airflow. As an analytical method, AP-TR offers a supplementary way to reconstruct the signal of a sound source in 3D space in an environment with airflow, instead of numerical TR. Experiments on the reconstruction of the sound source signals of a pair of loudspeakers were conducted in an anechoic wind tunnel with subsonic airflow to validate the effectiveness and advantages of the proposed method. Moreover, a theoretical and experimental comparison between AP-TR and time-domain beamforming in reconstructing the sound source signal is also discussed.

  1. Measurement of Voice Onset Time in Maxillectomy Patients

    OpenAIRE

    Hattori, Mariko; Sumita, Yuka I.; Taniguchi, Hisashi

    2014-01-01

    Objective speech evaluation using acoustic measurement is needed for the proper rehabilitation of maxillectomy patients. For digital evaluation of consonants, measurement of voice onset time is one option. However, voice onset time has not been measured in maxillectomy patients as their consonant sound spectra exhibit unique characteristics that make the measurement of voice onset time challenging. In this study, we established criteria for measuring voice onset time in maxillectomy patients ...

  2. Applications of Hilbert Spectral Analysis for Speech and Sound Signals

    Science.gov (United States)

    Huang, Norden E.

    2003-01-01

    A new method for analyzing nonlinear and nonstationary data has been developed, and its natural applications are to speech and sound signals. The key part of the method is the Empirical Mode Decomposition method, with which any complicated data set can be decomposed into a finite and often small number of Intrinsic Mode Functions (IMF). An IMF is defined as any function having the same number of zero-crossings and extrema, and having symmetric envelopes defined by the local maxima and minima, respectively. An IMF also admits a well-behaved Hilbert transform. This decomposition method is adaptive and, therefore, highly efficient. Since the decomposition is based on the local characteristic time scale of the data, it is applicable to nonlinear and nonstationary processes. With the Hilbert transform, the Intrinsic Mode Functions yield instantaneous frequencies as functions of time, which give sharp identifications of embedded structures. This invention can be used to process all acoustic signals. Specifically, it can process speech signals for speech synthesis, speaker identification and verification, speech recognition, and sound signal enhancement and filtering. Additionally, acoustic signals from machinery are essentially the way the machines talk to us: the acoustic signals from machines, whether as sound through the air or as vibration on the machines themselves, can reveal the operating conditions of the machines. Thus, we can use the acoustic signal to diagnose problems in machines.
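
    A short sketch of this Hilbert spectral analysis pipeline, assuming the third-party PyEMD package (PyPI name EMD-signal) for the Empirical Mode Decomposition step and SciPy for the Hilbert transform; the test signal is synthetic.

    import numpy as np
    from PyEMD import EMD            # pip install EMD-signal (assumed dependency)
    from scipy.signal import hilbert

    fs = 8000.0
    t = np.arange(0, 1.0, 1.0 / fs)
    x = np.sin(2 * np.pi * 100 * t) + 0.5 * np.sin(2 * np.pi * 340 * t)

    imfs = EMD().emd(x)              # decompose into Intrinsic Mode Functions

    for k, imf in enumerate(imfs):
        analytic = hilbert(imf)      # well-behaved Hilbert transform per IMF
        phase = np.unwrap(np.angle(analytic))
        inst_freq = np.diff(phase) * fs / (2 * np.pi)  # instantaneous frequency (Hz)
        print(f"IMF {k}: median instantaneous frequency {np.median(inst_freq):.1f} Hz")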

  3. [Realization of Heart Sound Envelope Extraction Implemented on LabVIEW Based on Hilbert-Huang Transform].

    Science.gov (United States)

    Tan, Zhixiang; Zhang, Yi; Zeng, Deping; Wang, Hua

    2015-04-01

    This paper presents research on a heart sound envelope extraction system. The system was implemented in LabVIEW based on the Hilbert-Huang transform (HHT). We first used a sound card to collect the heart sound, and then implemented the complete system program for signal acquisition, pretreatment, and envelope extraction in LabVIEW based on the theory of the HHT. Finally, we used a test case to show that the system can collect heart sounds, preprocess them, and extract the envelope easily. The system retains and displays the characteristics of the heart sound envelope well, and its program and methods are relevant to other research, such as work on vibration and voice signals.
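
    The envelope-extraction core can be sketched in a few lines of Python rather than LabVIEW: band-limit the heart sound, take the Hilbert envelope, then smooth it. The band edges and smoothing cutoff below are assumptions rather than values from the paper, and the EMD stage of the full HHT is omitted.

    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    def heart_sound_envelope(x, fs, low=25.0, high=140.0):
        b, a = butter(4, [low / (fs / 2), high / (fs / 2)], btype="band")
        band = filtfilt(b, a, x)               # keep the S1/S2 frequency band
        env = np.abs(hilbert(band))            # Hilbert envelope
        b2, a2 = butter(2, 20.0 / (fs / 2))    # smooth the envelope below 20 Hz
        return filtfilt(b2, a2, env)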

  4. Keep Your Voice Sound: How to Prevent and Avoid Voice Problems

    Science.gov (United States)

    Tips for keeping your voice sound and avoiding voice problems include: drink 6 to 8 glasses of water a day, which helps keep your vocal folds moist and healthy; and limit intake of caffeinated or alcoholic drinks, as these can dehydrate your body and make the ...

  5. Effect of sound on gap-junction-based intercellular signaling: Calcium waves under acoustic irradiation.

    Science.gov (United States)

    Deymier, P A; Swinteck, N; Runge, K; Deymier-Black, A; Hoying, J B

    2015-01-01

    We present a previously unrecognized effect of sound waves on gap-junction-based intercellular signaling such as in biological tissues composed of endothelial cells. We suggest that sound irradiation may, through temporal and spatial modulation of cell-to-cell conductance, create intercellular calcium waves with unidirectional signal propagation associated with nonconventional topologies. Nonreciprocity in calcium wave propagation induced by sound wave irradiation is demonstrated in the case of a linear and a nonlinear reaction-diffusion model. This demonstration should be applicable to other types of gap-junction-based intercellular signals, and it is thought that it should be of help in interpreting a broad range of biological phenomena associated with the beneficial therapeutic effects of sound irradiation and possibly the harmful effects of sound waves on health.
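
    A toy one-dimensional illustration of the modulated-conductance idea, assuming a plain diffusion model: a calcium-like quantity diffuses along a chain of cells whose cell-to-cell coupling is modulated in space and time by a travelling "sound" wave. All parameters are invented for illustration and the reaction terms of the paper's models are omitted.

    import numpy as np

    n, steps, dt = 100, 2000, 0.01
    u = np.zeros(n)
    u[0] = 1.0                                     # stimulate one end of the chain
    x = np.arange(n)
    for t in range(steps):
        # travelling modulation of the gap-junction conductance
        g = 0.5 * (1.0 + 0.8 * np.sin(0.3 * x - 0.05 * t))
        flux = g[:-1] * np.diff(u)                 # conductance-weighted diffusion
        u[:-1] += dt * flux                        # mass-conserving exchange
        u[1:] -= dt * flux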

  6. Optical voice encryption based on digital holography.

    Science.gov (United States)

    Rajput, Sudheesh K; Matoba, Osamu

    2017-11-15

    We propose an optical voice encryption scheme based on digital holography (DH). An off-axis DH is employed to acquire voice information by obtaining the phase retardation occurring in the object wave due to sound wave propagation. The acquired hologram, including the voice information, is encrypted using optical image encryption. DH reconstruction and decryption with all the correct parameters can retrieve the original voice. The scheme has the capability to record the human voice in holograms and encrypt it directly. These aspects make the scheme suitable for other security applications and help establish the voice as a potential security tool. We present experimental results and, in part, simulation results.

  7. Fish protection at water intakes using a new signal development process and sound system

    International Nuclear Information System (INIS)

    Loeffelman, P.H.; Klinect, D.A.; Van Hassel, J.H.

    1991-01-01

    American Electric Power Company, Inc., is exploring the feasibility of using a patented signal development process and sound system to guide aquatic animals with underwater sound. Sounds from animals such as chinook salmon, steelhead trout, striped bass, freshwater drum, largemouth bass, and gizzard shad can be used to synthesize a new signal that stimulates the animal in the most sensitive portion of its hearing range. AEP's field tests demonstrate that adult chinook salmon, steelhead trout, and warmwater fish, as well as steelhead trout and chinook salmon smolts, can be repelled with a properly tuned system. The signal development process and sound system are designed to be transportable and to use animals at the site, so as to incorporate site-specific factors known to affect underwater sound, e.g., bottom shape and type, water current, and temperature. Because the overall goal of this research was to determine the feasibility of using sound to divert fish, it was essential that the approach use a signal development process which could be customized to the animals and site conditions at any hydropower plant site.

  8. Sound and sound sources

    DEFF Research Database (Denmark)

    Larsen, Ole Næsbye; Wahlberg, Magnus

    2017-01-01

    There is no difference in principle between the infrasonic and ultrasonic sounds, which are inaudible to humans (or other animals) and the sounds that we can hear. In all cases, sound is a wave of pressure and particle oscillations propagating through an elastic medium, such as air. This chapter...... is about the physical laws that govern how animals produce sound signals and how physical principles determine the signals’ frequency content and sound level, the nature of the sound field (sound pressure versus particle vibrations) as well as directional properties of the emitted signal. Many...... of these properties are dictated by simple physical relationships between the size of the sound emitter and the wavelength of emitted sound. The wavelengths of the signals need to be sufficiently short in relation to the size of the emitter to allow for the efficient production of propagating sound pressure waves...

  9. The electronic cry: Voice and gender in electroacoustic music

    NARCIS (Netherlands)

    Bosma, H.M.

    2013-01-01

    The voice provides an entrance to discuss gender and related fundamental issues in electroacoustic music that are relevant as well in other musical genres and outside of music per se: the role of the female voice; the use of language versus non-verbal vocal sounds; the relation of voice, embodiment

  10. A pilot study of the relations within which hearing voices participates: Towards a functional distinction between voice hearers and controls

    NARCIS (Netherlands)

    McEnteggart, C.; Barnes-Holmes, Y.; Egger, J.I.M.; Barnes-Holmes, D.

    2016-01-01

    The current research used the Implicit Relational Assessment Procedure (IRAP) as a preliminary step toward bringing a broad, functional approach to understanding psychosis, by focusing on the specific phenomenon of auditory hallucinations of voices and sounds (often referred to as hearing voices).

  11. Bodies, Spaces, Voices, Silences

    OpenAIRE

    Donatella Mazzoleni; Pietro Vitiello

    2013-01-01

    A good architecture should not only allow functional, formal and technical quality for urban spaces, but also let the voice of the city be perceived, listened, enjoyed. Every city has got its specific sound identity, or “ISO” (R. O. Benenzon), made up of a complex texture of background noises and fluctuation of sound figures emerging and disappearing in a game of continuous fadings. For instance, the ISO of Naples is characterized by a spread need of hearing the sound return of one’s/others v...

  12. Mechanics of human voice production and control.

    Science.gov (United States)

    Zhang, Zhaoyan

    2016-10-01

    As the primary means of communication, voice plays an important role in daily life. Voice also conveys personal information such as social status, personal traits, and the emotional state of the speaker. Mechanically, voice production involves complex fluid-structure interaction within the glottis and its control by laryngeal muscle activation. An important goal of voice research is to establish a causal theory linking voice physiology and biomechanics to how speakers use and control voice to communicate meaning and personal information. Establishing such a causal theory has important implications for clinical voice management, voice training, and many speech technology applications. This paper provides a review of voice physiology and biomechanics, the physics of vocal fold vibration and sound production, and laryngeal muscular control of the fundamental frequency of voice, vocal intensity, and voice quality. Current efforts to develop mechanical and computational models of voice production are also critically reviewed. Finally, issues and future challenges in developing a causal theory of voice production and perception are discussed.

  13. Influence of classroom acoustics on the voice levels of teachers with and without voice problems: a field study

    DEFF Research Database (Denmark)

    Pelegrin Garcia, David; Lyberg-Åhlander, Viveka; Rydell, Roland

    2010-01-01

    of the classroom. The results thus suggest that teachers with voice problems are more aware of classroom acoustic conditions than their healthy colleagues and make use of the more supportive rooms to lower their voice levels. This behavior may result from an adaptation process of the teachers with voice problems...... of the voice problems was made with a questionnaire and a laryngological examination. During teaching, the sound pressure level at the teacher’s position was monitored. The teacher’s voice level and the activity noise level were separated using mixed Gaussians. In addition, objective acoustic parameters...... of Reverberation Time and Voice Support were measured in the 30 empty classrooms of the study. An empirical model shows that the measured voice levels depended on the activity noise levels and the voice support. Teachers with and without voice problems were differently affected by the voice support...

  14. Mobile voice health monitoring using a wearable accelerometer sensor and a smartphone platform.

    Science.gov (United States)

    Mehta, Daryush D; Zañartu, Matías; Feng, Shengran W; Cheyne, Harold A; Hillman, Robert E

    2012-11-01

    Many common voice disorders are chronic or recurring conditions that are likely to result from faulty and/or abusive patterns of vocal behavior, referred to generically as vocal hyperfunction. An ongoing goal in clinical voice assessment is the development and use of noninvasively derived measures to quantify and track the daily status of vocal hyperfunction so that the diagnosis and treatment of such behaviorally based voice disorders can be improved. This paper reports on the development of a new, versatile, and cost-effective clinical tool for mobile voice monitoring that acquires the high-bandwidth signal from an accelerometer sensor placed on the neck skin above the collarbone. Using a smartphone as the data acquisition platform, the prototype device provides a user-friendly interface for voice use monitoring, daily sensor calibration, and periodic alert capabilities. Pilot data are reported from three vocally normal speakers and three subjects with voice disorders to demonstrate the potential of the device to yield standard measures of fundamental frequency and sound pressure level and model-based glottal airflow properties. The smartphone-based platform enables future clinical studies for the identification of the best set of measures for differentiating between normal and hyperfunctional patterns of voice use.

  15. The sound of trustworthiness: Acoustic-based modulation of perceived voice personality.

    Directory of Open Access Journals (Sweden)

    Pascal Belin

    When we hear a new voice we automatically form a "first impression" of the voice owner's personality; a single word is sufficient to yield ratings highly consistent across listeners. Past studies have shown correlations between personality ratings and acoustical parameters of voice, suggesting a potential acoustical basis for voice personality impressions, but its nature and extent remain unclear. Here we used data-driven voice computational modelling to investigate the link between acoustics and perceived trustworthiness in the single word "hello". Two prototypical voice stimuli were generated based on the acoustical features of voices rated low or high in perceived trustworthiness, respectively, as well as a continuum of stimuli inter- and extrapolated between these two prototypes. Five hundred listeners provided trustworthiness ratings on the stimuli via an online interface. We observed an extremely tight relationship between trustworthiness ratings and position along the trustworthiness continuum (r = 0.99). Not only were trustworthiness ratings higher for the high- than the low-trustworthiness prototype, but the difference could be modulated quasi-linearly by reducing or exaggerating the acoustical difference between the prototypes, resulting in a strong caricaturing effect. The f0 trajectory, or intonation, appeared to be a parameter of particular relevance: hellos rated high in trustworthiness were characterized by a high starting f0, then a marked decrease at mid-utterance, finishing on a strong rise. These results demonstrate a strong acoustical basis for voice personality impressions, opening the door to multiple potential applications.
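
    The prototype-continuum manipulation can be sketched as linear inter- and extrapolation between two acoustic feature vectors, for instance samples of the f0 trajectory; the values below are invented, and alpha outside [0, 1] produces the caricatured stimuli mentioned above.

    import numpy as np

    low_proto = np.array([110.0, 108.0, 95.0, 100.0])    # f0 (Hz) over time (invented)
    high_proto = np.array([135.0, 120.0, 100.0, 140.0])  # high start, final rise (invented)

    def morph(alpha):
        # alpha 0 -> low prototype, 1 -> high; outside [0, 1] -> caricature
        return (1 - alpha) * low_proto + alpha * high_proto

    continuum = [morph(a) for a in np.linspace(-0.5, 1.5, 9)]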

  16. Performance of the phonatory deviation diagram in the evaluation of rough and breathy synthesized voices.

    Science.gov (United States)

    Lopes, Leonardo Wanderley; Freitas, Jonas Almeida de; Almeida, Anna Alice; Silva, Priscila Oliveira Costa; Alves, Giorvan Ânderson Dos Santos

    2017-07-05

    Voice disorders alter the sound signal in several ways, combining several types of vocal emission disturbances and noise. The Phonatory Deviation Diagram (PDD) is a two-dimensional chart that allows the evaluation of the vocal signal based on a combination of periodicity (jitter, shimmer, and correlation coefficient) and noise (Glottal to Noise Excitation - GNE) measurements. The use of synthesized signals, for which one has greater control over and knowledge of the production conditions, may allow a better understanding of the physiological and acoustic mechanisms underlying vocal emission and its main perceptual-auditory correlates regarding the intensity of the deviation and the type of vocal quality. The aim was to analyze the performance of the PDD in discriminating the presence and degree of roughness and breathiness in synthesized voices. A total of 871 synthesized vocal signals corresponding to the vowel /ɛ/ were used. The perceptual-auditory analysis of the degree of roughness and breathiness of the synthesized signals was performed using a Visual Analogue Scale (VAS). Subsequently, the signals were categorized regarding the presence/absence of these parameters based on VAS cutoff values. Acoustic analysis was performed by assessing the distribution of vocal signals according to PDD area, quadrant, shape, and density. The equality-of-proportions and chi-square tests were performed to compare the variables. Rough and breathy vocal signals were located predominantly outside the normal range and in the lower right quadrant of the PDD. Voices with higher degrees of roughness and breathiness were located outside the area of normality, in the lower right quadrant, and had concentrated density. The normality area and the PDD quadrant can discriminate healthy voices from rough and breathy ones. Voices with higher degrees of roughness and breathiness are proportionally located outside the area of normality, in the lower right quadrant, and with concentrated density.
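
    As a minimal illustration of the periodicity measures that feed the PDD, the classic percent-jitter and percent-shimmer definitions are sketched below; glottal cycle lengths and peak amplitudes are assumed to be already extracted, and the GNE noise axis is omitted.

    import numpy as np

    def jitter_percent(periods):
        periods = np.asarray(periods, dtype=float)
        return 100.0 * np.mean(np.abs(np.diff(periods))) / np.mean(periods)

    def shimmer_percent(amplitudes):
        amplitudes = np.asarray(amplitudes, dtype=float)
        return 100.0 * np.mean(np.abs(np.diff(amplitudes))) / np.mean(amplitudes)

    # Example: slightly irregular glottal cycles (period in ms, peak amplitude).
    print(jitter_percent([5.0, 5.1, 4.9, 5.05]))     # ~ a few percent
    print(shimmer_percent([1.0, 0.95, 1.02, 0.97]))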

  17. A Novel Voice Sensor for the Detection of Speech Signals

    Directory of Open Access Journals (Sweden)

    Kun-Ching Wang

    2013-12-01

    In order to develop a novel voice sensor to detect human voices, the use of features that are robust to noise is an important issue. A voice sensor is also called a voice activity detector (VAD). Because the inherent formant structure occurs only in the speech spectrogram (well known as the voiceprint), Wu et al. were the first to use band-spectral entropy (BSE) to describe the characteristics of voiceprints. However, the performance of VAD based on the BSE feature degrades in colored-noise (or voiceprint-like noise) environments. In order to solve this problem, we propose the two-dimensional part-band energy entropy (TD-PBEE) parameter, based on two variables - the part-band partition number along the frequency index and the long-term window size along the time index - to further improve the BSE-based VAD algorithm. The two variables can efficiently represent the characteristics of voiceprints in each critical frequency band and exploit long-term information in noisy speech spectrograms, respectively. The TD-PBEE parameter can be regarded as a PBEE parameter extended over time. First, the strength of voiceprints can be partly enhanced by using four entropies applied to four part-bands; the four part-band energy entropies describe the voiceprints in detail. Given the non-stationary characteristics of speech and various noises, we then use long-term information processing to refine the PBEE, so that voice-like noise can be distinguished from noisy speech through the concept of PBEE with long-term information. Our experiments show that the proposed feature extraction with the TD-PBEE parameter is quite insensitive to background noise. The proposed TD-PBEE-based VAD algorithm was evaluated for four types of noise and five signal-to-noise ratio (SNR) levels. We find that the accuracy of the proposed TD-PBEE-based VAD algorithm, averaged over all noises and all SNR levels, is better than that of the other VAD algorithms considered.
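
    A minimal sketch of the part-band energy entropy at the core of the TD-PBEE parameter: split each frame's power spectrum into a few part-bands and compute one spectral entropy per band. The equal band split and the epsilon guards are assumptions, and the long-term refinement over the time index is omitted.

    import numpy as np

    def part_band_energy_entropy(frame, fs, n_bands=4):
        spec = np.abs(np.fft.rfft(frame)) ** 2
        bands = np.array_split(spec, n_bands)        # split spectrum into part-bands
        entropies = []
        for band in bands:
            p = band / (band.sum() + 1e-12)          # per-band energy distribution
            entropies.append(-np.sum(p * np.log(p + 1e-12)))
        return np.array(entropies)                   # one entropy per part-band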

  18. Audio-Visual: Disembodied Voices in Theory

    OpenAIRE

    Le Fèvre-Berthelot, Anaïs

    2013-01-01

    After a survey of the major critical trends since the generalization of synchronized film sound, this bibliographical essay sets out to delineate the way film sound studies have developed around issues of taxonomy, meaning, and reception. Focusing on the treatment of the disembodied voice by various theorists, three trends can be identified: borrowing from semiology and narratology, an essentially descriptive approach first emerges that creates a new vocabulary to talk about sound and analyze...

  19. Stop consonant voicing in young children's speech: Evidence from a cross-sectional study

    Science.gov (United States)

    Ganser, Emily

    There are intuitive reasons to believe that speech-sound acquisition and language acquisition should be related in development. Surprisingly, only recently has research begun to parse just how the two might be related. This study investigated possible correlations between speech-sound acquisition and language acquisition, as part of a large-scale, longitudinal study of the relationship between different types of phonological development and vocabulary growth in the preschool years. Productions of voiced and voiceless stop-initial words were recorded from 96 children aged 28-39 months. Voice Onset Time (VOT, in ms) for each token context was calculated. A mixed-model logistic regression was fitted which predicted whether the sound was intended to be voiced or voiceless based on its VOT. This model estimated the slope of the logistic function for each child. This slope was referred to as Robustness of Contrast (based on Holliday, Reidy, Beckman, and Edwards, 2015), defined as the degree of categorical differentiation between the production of two speech sounds or classes of sounds, in this case voiced and voiceless stops. Results showed a wide range of slopes across individual children, suggesting that slope-derived Robustness of Contrast could be a viable means of measuring a child's acquisition of the voicing contrast. Robustness of Contrast was then compared to traditional measures of speech and language skills to investigate whether there was any correlation between the production of stop voicing and broader measures of speech and language development. The Robustness of Contrast measure was found to correlate with all individual measures of speech and language, suggesting that it might indeed be predictive of later language skills.
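
    A simplified sketch of the Robustness of Contrast idea: fit a logistic function predicting voiced versus voiceless from VOT and read off its slope. A plain per-child logistic regression stands in for the paper's mixed-model version, and the VOT values are invented.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    vot_ms = np.array([8, 12, 15, 20, 45, 60, 70, 85]).reshape(-1, 1)
    voiceless = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # 1 = intended voiceless stop

    clf = LogisticRegression().fit(vot_ms, voiceless)
    print("slope (robustness of contrast):", clf.coef_[0][0])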

  20. Voice Use Among Music Theory Teachers: A Voice Dosimetry and Self-Assessment Study.

    Science.gov (United States)

    Schiller, Isabel S; Morsomme, Dominique; Remacle, Angélique

    2017-07-25

    This study aimed (1) to investigate music theory teachers' professional and extra-professional vocal loading and background noise exposure, (2) to determine the correlation between vocal loading and background noise, and (3) to determine the correlation between vocal loading and self-evaluation data. Using voice dosimetry, 13 music theory teachers were monitored for one workweek. The parameters analyzed were voice sound pressure level (SPL), fundamental frequency (F0), phonation time, vocal loading index (VLI), and noise SPL. Spearman correlation was used to correlate vocal loading parameters (voice SPL, F0, and phonation time) and noise SPL. Each day, the subjects self-assessed their voice using visual analog scales. VLI and self-evaluation data were correlated using Spearman correlation. Vocal loading parameters and noise SPL were significantly higher in the professional than in the extra-professional environment. Voice SPL, phonation time, and female subjects' F0 correlated positively with noise SPL. VLI correlated with self-assessed voice quality, vocal fatigue, and amount of singing and speaking voice produced. Teaching music theory is a profession with high vocal demands. More background noise is associated with increased vocal loading and may indirectly increase the risk for voice disorders. Correlations between VLI and self-assessments suggest that these teachers are well aware of their vocal demands and feel their effect on voice quality and vocal fatigue. Visual analog scales seem to represent a useful tool for subjective vocal loading assessment and associated symptoms in these professional voice users. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  1. Advanced Time-Frequency Representation in Voice Signal Analysis

    Directory of Open Access Journals (Sweden)

    Dariusz Mika

    2018-03-01

    The most commonly used time-frequency representation in voice signal analysis is the spectrogram. This representation belongs to Cohen's class, the class of time-frequency energy distributions. From the standpoint of resolution properties, the spectrogram is not optimal. Within Cohen's class, representations are known which have better resolution properties. All of them are created by smoothing the Wigner-Ville distribution (WVD), which is characterized by the best resolution but also the strongest harmful interference terms. The smoothing functions used determine a compromise between resolution properties and the elimination of harmful interference terms. Another class of time-frequency energy distributions is the affine class. From the point of view of readability of the analysis, the best properties are obtained by redistributing the energy of a representation using a general methodology referred to as reassignment, which can be applied to any time-frequency representation. Reassigned distributions efficiently combine a reduction of the interference terms, provided by a well-adapted smoothing kernel, with an increased concentration of the signal components.
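
    The baseline representation discussed above, computed with SciPy on a synthetic chirp; reassigned variants are available in third-party libraries (for example librosa's reassigned_spectrogram) but are not shown here.

    import numpy as np
    from scipy.signal import spectrogram

    fs = 16000
    t = np.arange(0, 1.0, 1 / fs)
    x = np.sin(2 * np.pi * (200 + 150 * t) * t)      # a simple chirp as a stand-in voice

    f, tt, Sxx = spectrogram(x, fs=fs, nperseg=512, noverlap=384)
    print(Sxx.shape)  # (frequency bins, time frames)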

  2. Sound [signal] noise

    DEFF Research Database (Denmark)

    Bjørnsten, Thomas

    2012-01-01

    The article discusses the intricate relationship between sound and signification through notions of noise. The emergence of new fields of sonic artistic practices has generated several questions of how to approach sound as aesthetic form and material. During the past decade an increased attention...... has been paid to, for instance, a category such as ‘sound art’ together with an equally strengthened interest in phenomena and concepts that fall outside the accepted aesthetic procedures and constructions of what we traditionally would term as musical sound – a recurring example being ‘noise’....

  3. Your Cheatin' Voice Will Tell on You: Detection of Past Infidelity from Voice.

    Science.gov (United States)

    Hughes, Susan M; Harrison, Marissa A

    2017-01-01

    Evidence suggests that many physical, behavioral, and trait qualities can be detected solely from the sound of a person's voice, irrespective of the semantic information conveyed through speech. This study examined whether raters could accurately assess the likelihood that a person has cheated on committed, romantic partners simply by hearing the speaker's voice. Independent raters heard voice samples of individuals who self-reported that they either cheated or had never cheated on their romantic partners. To control for aspects that may clue a listener to the speaker's mate value, we used voice samples that did not differ between these groups for voice attractiveness, age, voice pitch, and other acoustic measures. We found that participants indeed rated the voices of those who had a history of cheating as more likely to cheat. Male speakers were given higher ratings for cheating, while female raters were more likely to ascribe the likelihood to cheat to speakers. Additionally, we manipulated the pitch of the voice samples, and for both sexes, the lower pitched versions were consistently rated to be from those who were more likely to have cheated. Regardless of the pitch manipulation, speakers were able to assess actual history of infidelity; the one exception was that men's accuracy decreased when judging women whose voices were lowered. These findings expand upon the idea that the human voice may be of value as a cheater detection tool and very thin slices of vocal information are all that is needed to make certain assessments about others.

  4. Experiences with voice to design ceramics

    DEFF Research Database (Denmark)

    Hansen, Flemming Tvede; Jensen, Kristoffer

    2014-01-01

    This article presents SoundShaping, a system to create ceramics from the human voice and thus how digital technology makes new possibilities in ceramic craft. The article is about how experiential knowledge that the craftsmen gains in a direct physical and tactile interaction with a responding...... material can be transformed and utilised in the use of digital technologies. SoundShaping is based on a generic audio feature extraction system and the principal component analysis to ensure that the pertinent information in the voice is used. Moreover, 3D shape is created using simple geometric rules....... The shape is output to a 3D printer to make ceramic results. The system demonstrates the close connection between digital technology and craft practice. Several experiments and reflections demonstrate the validity of this work....
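
    A minimal sketch of the feature-extraction-plus-PCA front end that SoundShaping is described as using: frame-level audio features reduced to a few principal components that could drive shape parameters. The specific features, dimensions, and stand-in data are assumptions.

    import numpy as np
    from sklearn.decomposition import PCA

    def frame_features(frame, fs):
        spec = np.abs(np.fft.rfft(frame))
        freqs = np.fft.rfftfreq(len(frame), 1 / fs)
        centroid = np.sum(freqs * spec) / (np.sum(spec) + 1e-12)
        rms = np.sqrt(np.mean(frame**2))
        return [centroid, rms, np.max(spec)]

    rng = np.random.default_rng(1)
    frames = rng.normal(size=(50, 1024))                     # stand-in voice frames
    feats = np.array([frame_features(f, 44100) for f in frames])
    shape_params = PCA(n_components=2).fit_transform(feats)  # could drive a 3D shape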

  5. Experiences with Voice to Design Ceramics

    DEFF Research Database (Denmark)

    Hansen, Flemming Tvede; Jensen, Kristoffer

    2013-01-01

    This article presents SoundShaping, a system to create ceramics from the human voice and thus how digital technology makes new possibilities in ceramic craft. The article is about how experiential knowledge that the craftsmen gains in a direct physical and tactile interaction with a responding...... material can be transformed and utilized in the use of digital technologies. SoundShaping is based on a generic audio feature extraction system and the principal component analysis to ensure that the pertinent information in the voice is used. Moreover, 3D shape is created using simple geometric rules....... The shape is output to a 3D printer to make ceramic results. The system demonstrates the close connection between digital technology and craft practice. Several experiments and reflections demonstrate the validity of this work....

  6. Spectral distribution of solo voice and accompaniment in pop music.

    Science.gov (United States)

    Borch, Daniel Zangger; Sundberg, Johan

    2002-01-01

    Singers performing in popular styles of music mostly rely on feedback provided by monitor loudspeakers on the stage. The highest sound level that these loudspeakers can provide without feedback noise is often too low to be heard over the ambient sound level on the stage. Long-term-average spectra of some orchestral accompaniments typically used in pop music are compared with those of classical symphony orchestras. In loud pop accompaniment, the sound level difference between 0.5 and 2.5 kHz is similar to that of a Wagner orchestra. Long-term-average spectra of pop singers' voices showed no sign of a singer's formant but a peak near 3.5 kHz. It is suggested that pop singers' difficulties in hearing their own voices may be reduced if the frequency range 3-4 kHz is boosted in the monitor sound.
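
    A long-term-average spectrum (LTAS) of the kind compared above can be approximated by Welch averaging of the power spectrum and converting to dB; the frame length here is an assumption.

    import numpy as np
    from scipy.signal import welch

    def ltas_db(x, fs, nperseg=4096):
        # Welch averaging over many frames approximates a long-term-average spectrum.
        f, pxx = welch(x, fs=fs, nperseg=nperseg)
        return f, 10 * np.log10(pxx + 1e-20)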

  7. Root phonotropism: Early signalling events following sound perception in Arabidopsis roots.

    Science.gov (United States)

    Rodrigo-Moreno, Ana; Bazihizina, Nadia; Azzarello, Elisa; Masi, Elisa; Tran, Daniel; Bouteau, François; Baluska, Frantisek; Mancuso, Stefano

    2017-11-01

    Sound is a fundamental form of energy, and it has been suggested that plants can make use of acoustic cues to obtain information about their environments and to alter and fine-tune their growth and development. Despite an increasing body of evidence indicating that sound can influence plant growth and physiology, many questions concerning the effect of sound waves on plant growth and the underlying signalling mechanisms remain unanswered. Here we show that in Arabidopsis thaliana, exposure to sound waves (200 Hz) for 2 weeks induced positive phonotropism in roots, which grew toward the sound source. We found that sound waves very quickly (within minutes) triggered an increase in cytosolic Ca2+, possibly mediated by an influx through the plasma membrane and a release from internal stores. Sound waves likewise elicited rapid reactive oxygen species (ROS) production and K+ efflux. Taken together, these results suggest that changes in ion fluxes (Ca2+ and K+) and an increase in superoxide production are involved in sound perception in plants, as previously established in animals. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. A model to explain human voice production

    Science.gov (United States)

    Vilas Bôas, C. S. N.; Gobara, S. T.

    2018-05-01

    This article presents a device constructed with low-cost material to demonstrate and explain voice production. It also provides a contextualized, interdisciplinary approach to introduce the study of sound waves.

  9. Evaluating signal-to-noise ratios, loudness, and related measures as indicators of airborne sound insulation.

    Science.gov (United States)

    Park, H K; Bradley, J S

    2009-09-01

    Subjective ratings of the audibility, annoyance, and loudness of music and speech sounds transmitted through 20 different simulated walls were used to identify better single-number ratings of airborne sound insulation. The first part of this research considered standard measures such as the sound transmission class, the weighted sound reduction index (Rw), and variations of these measures [H. K. Park and J. S. Bradley, J. Acoust. Soc. Am. 126, 208-219 (2009)]. This paper considers a number of other measures, including signal-to-noise ratios related to the intelligibility of speech and measures related to the loudness of sounds. An exploration of the importance of the included frequencies showed that the optimum ranges of included frequencies were different for speech and music sounds. Measures related to speech intelligibility were useful indicators of responses to speech sounds but were not as successful for music sounds. A-weighted level differences, signal-to-noise ratios, and an A-weighted sound transmission loss measure were good predictors of responses when the included frequencies were optimized for each type of sound. The addition of new spectrum adaptation terms to Rw values was found to be the most practical approach for achieving more accurate predictions of subjective ratings of transmitted speech and music sounds.
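
    For the A-weighted measures mentioned above, the standard A-weighting curve (as defined in IEC 61672-1) can be computed directly; the study's full measures are more involved, so this is only the weighting step.

    import numpy as np

    def a_weighting_db(f):
        # Analytic A-weighting magnitude response, normalized to 0 dB at 1 kHz.
        f = np.asarray(f, dtype=float)
        ra = (12194.0**2 * f**4) / (
            (f**2 + 20.6**2)
            * np.sqrt((f**2 + 107.7**2) * (f**2 + 737.9**2))
            * (f**2 + 12194.0**2)
        )
        return 20 * np.log10(ra) + 2.0

    print(a_weighting_db([125, 500, 1000, 4000]))  # approx. [-16.1, -3.2, 0.0, +1.0] dB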

  10. Masculine Voices Predict Well-Being in Female-to-Male Transgender Individuals.

    Science.gov (United States)

    Watt, Seth O; Tskhay, Konstantin O; Rule, Nicholas O

    2018-05-01

    Voices convey important social information about an individual's identity, including gender. This is especially relevant to transgender individuals, who cite voice alteration as a primary goal of the gender alignment process. Although the voice is a primary target of testosterone therapy among female-to-male (FTM) trans people, little research has explored the effects of such changes on their psychological well-being. Here, we investigated how FTMs' vocal gender related to their well-being. A total of 77 FTMs (M age  = 25.45 years, SD = 6.77) provided voice samples and completed measures of their well-being and psychological health. An independent group of 32 naïve raters (M age  = 22.16 years, SD = 8.21) subsequently rated the voice samples for masculinity. We found that FTMs whose voices sounded more congruent with their experienced gender (i.e., sounded more masculine) reported greater well-being (better life satisfaction, quality of life, and self-esteem; lower levels of anxiety and depression) than FTMs with less gender congruent (i.e., more feminine) voices (β = .48). The convergence between outwardly perceived vocal gender and gender identity brought about through hormone replacement therapy may therefore support greater well-being for FTMs.

  11. 33 CFR 81.20 - Lights and sound signal appliances.

    Science.gov (United States)

    2010-07-01

    33 CFR 81.20 - Lights and sound signal appliances. Each vessel under the 72 COLREGS, except the vessels of the Navy, is exempt from the requirements...

  12. Sound algorithms

    OpenAIRE

    De Götzen , Amalia; Mion , Luca; Tache , Olivier

    2007-01-01

    We call sound algorithms the categories of algorithms that deal with digital sound signals. Sound algorithms appeared in the very infancy of computing. Sound algorithms present strong specificities that are the consequence of two dual considerations: the properties of the digital sound signal itself and its uses, and the properties of auditory perception.

  13. Processing Electromyographic Signals to Recognize Words

    Science.gov (United States)

    Jorgensen, C. C.; Lee, D. D.

    2009-01-01

    A recently invented speech-recognition method applies to words that are articulated by means of the tongue and throat muscles but are otherwise not voiced or, at most, are spoken sotto voce. This method could satisfy a need for speech recognition under circumstances in which normal audible speech is difficult, poses a hazard, is disturbing to listeners, or compromises privacy. The method could also be used to augment traditional speech recognition by providing an additional source of information about articulator activity. The method can be characterized as intermediate between (1) conventional speech recognition through processing of voice sounds and (2) a method, not yet developed, of processing electroencephalographic signals to extract unspoken words directly from thoughts. This method involves computational processing of digitized electromyographic (EMG) signals from muscle innervation acquired by surface electrodes under a subject's chin near the tongue and on the side of the subject's throat near the larynx. After preprocessing, digitization, and feature extraction, EMG signals are processed by a neural-network pattern classifier, implemented in software, that performs the bulk of the recognition task as described.
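
    The pattern-classification stage can be sketched with any off-the-shelf multilayer perceptron. The sketch below is a minimal stand-in, not the authors' implementation: the feature vectors are random placeholders for preprocessed surface-EMG features, and the five word classes are hypothetical.

        import numpy as np
        from sklearn.neural_network import MLPClassifier
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(0)

        # Placeholder data: 200 utterances x 40 features per utterance. In the method
        # described above, the features would come from preprocessed surface-EMG
        # channels (chin/throat electrodes); here they are random stand-ins.
        X = rng.normal(size=(200, 40))
        y = rng.integers(0, 5, size=200)        # 5 hypothetical word classes

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

        # A multilayer perceptron performing the pattern-classification stage
        clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000, random_state=0)
        clf.fit(X_tr, y_tr)
        print(f"held-out accuracy (chance-level on random stand-ins): {clf.score(X_te, y_te):.2f}")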

  14. Associations between the Transsexual Voice Questionnaire (TVQMtF) and self-report of voice femininity and acoustic voice measures.

    Science.gov (United States)

    Dacakis, Georgia; Oates, Jennifer; Douglas, Jacinta

    2017-11-01

    Evidence supporting the validity of the TVQ MtF is strong and indicates that it is a sound measure for capturing the MtF woman's self-perceptions of her vocal functioning and of how her voice impacts on her everyday life. © 2017 Royal College of Speech and Language Therapists.

  15. Office noise: Can headphones and masking sound attenuate distraction by background speech?

    Science.gov (United States)

    Jahncke, Helena; Björkeholm, Patrik; Marsh, John E; Odelius, Johan; Sörqvist, Patrik

    2016-11-22

    Background speech is one of the most disturbing noise sources at shared workplaces in terms of both annoyance and performance-related disruption. It is therefore important to identify techniques that can efficiently protect performance against distraction. It is also important that the techniques are perceived as satisfactory and are subjectively evaluated as effective in their capacity to reduce distraction. The aim of the current study was to compare three methods of attenuating distraction from background speech: masking a background voice with nature sound through headphones, masking a background voice with other voices through headphones, and merely wearing headphones (without masking) as a way to attenuate the background sound. Quiet was deployed as a baseline condition. Thirty students participated in an experiment employing a repeated-measures design. Performance (serial short-term memory) was impaired by background speech (one voice), but this impairment was attenuated when the speech was masked, in particular when it was masked by nature sound. Furthermore, perceived workload was lowest in the quiet condition and significantly higher in all other sound conditions. Notably, the headphones tested as a sound-attenuating device (i.e. without masking) did not protect against the effects of background speech on performance and subjective workload. Nature sound was the only masker that worked as a protector of performance, at least in the context of the serial recall task. However, despite the attenuation of distraction by nature sound, perceived workload was still high, suggesting that it is difficult to find a masker that is both effective and perceived as satisfactory.

  16. Trailblazers and Cassandras: Other Voices in Northern Ireland

    DEFF Research Database (Denmark)

    McQuaid, Sara Dybris

    2012-01-01

    voices and alternative positions in the process of conflict interpretation and resolution. This essay will outline a ‘thumbnail’ sketch of three areas in which ‘other’ voices are sidelined or silenced: in terms of political discourses; community discourses; and wider academic and public discourses......’ and ‘Cassandras’ the essay concludes that the arguments forwarded by other voices are not disappeared but adapted and realigned to the reigning discourses, and that there is not so much a culture of silence surrounding ‘other’ voices as a certain selective and sectarian hearing in picking them up. Whilst...... it follows that ‘other’ voices have failed to dissolve the magnetic field of Northern Irish politics, the essay suggests that in order to rise to current political challenges in Northern Ireland it is worthwhile sounding out the historical and contemporary ‘other’ voices for carefully thought out and non...

  17. “The Voice as the Sound of Many Waters” in the Book of Revelation in Light of Old Testament Semantics: A Threatening Message or One of Beauty?

    Directory of Open Access Journals (Sweden)

    Joanna Nowińska

    2017-03-01

    Full Text Available The sentence ἡ φωνὴ αὐτοῦ ὡς φωνὴ ὑδάτων πολλῶν is not very common in the Bible, despite the fact that the subject of God's voice is one of the main motifs, and not only in the Old Testament. It is used twice in Ezekiel and three times in the Book of Revelation. Both connect this motif with God to describe His identity and deeds. In the Bible, the "many waters" do not only mean force, danger and terrible rule. They are also a metaphor for abundance, which is a good condition for progress, because water gives life. So "the voice as the sound of many waters" is a message of power, liveliness, beauty, and care. It is so strong a voice that nobody and nothing is capable of overcoming it. Everybody who wants to can hear it. It is like a voice that embraces from all sides. The Book of Revelation describes Jesus' voice (Rev 1:15) and the voice from heaven (Rev 14:2) in such a way. Also for John, the mystery of internal experience (Rev 19:6) avoids any categorization. But for God, it is the preferred way to communicate with human beings.

  18. Effects of voice harmonic complexity on ERP responses to pitch-shifted auditory feedback.

    Science.gov (United States)

    Behroozmand, Roozbeh; Korzyukov, Oleg; Larson, Charles R

    2011-12-01

    The present study investigated the neural mechanisms of voice pitch control for different levels of harmonic complexity in the auditory feedback. Event-related potentials (ERPs) were recorded in response to +200 cents pitch perturbations in the auditory feedback of self-produced natural human vocalizations, complex and pure tone stimuli during active vocalization and passive listening conditions. During active vocal production, ERP amplitudes were largest in response to pitch shifts in the natural voice, moderately large for non-voice complex stimuli and smallest for the pure tones. However, during passive listening, neural responses were equally large for pitch shifts in voice and non-voice complex stimuli but still larger than that for pure tones. These findings suggest that pitch change detection is facilitated for spectrally rich sounds such as natural human voice and non-voice complex stimuli compared with pure tones. Vocalization-induced increase in neural responses for voice feedback suggests that sensory processing of naturally-produced complex sounds such as human voice is enhanced by means of motor-driven mechanisms (e.g. efference copies) during vocal production. This enhancement may enable the audio-vocal system to more effectively detect and correct for vocal errors in the feedback of natural human vocalizations to maintain an intended vocal output for speaking. Copyright © 2011 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.

  19. Stochastic Signal Processing for Sound Environment System with Decibel Evaluation and Energy Observation

    Directory of Open Access Journals (Sweden)

    Akira Ikuta

    2014-01-01

    Full Text Available In real sound environment systems, a specific signal shows various types of probability distribution, and the observation data are usually contaminated by external noise (e.g., background noise of non-Gaussian distribution type). Furthermore, there potentially exist various nonlinear correlations in addition to the linear correlation between input and output time series. Consequently, the system input and output relationship in the real phenomenon often cannot be represented by a simple model using only the linear correlation and lower order statistics. In this study, complex sound environment systems that are difficult to analyze by the usual structural methods are considered. By introducing an estimation method for the system parameters reflecting correlation information for the conditional probability distribution under existence of external noise, a prediction method for the output response probability of sound environment systems is theoretically proposed in a form suitable for the additive property of the energy variable and evaluation on the decibel scale. The effectiveness of the proposed stochastic signal processing method is experimentally confirmed by applying it to observed data from sound environment systems.

  20. Measurement of voice onset time in maxillectomy patients.

    Science.gov (United States)

    Hattori, Mariko; Sumita, Yuka I; Taniguchi, Hisashi

    2014-01-01

    Objective speech evaluation using acoustic measurement is needed for the proper rehabilitation of maxillectomy patients. For digital evaluation of consonants, measurement of voice onset time is one option. However, voice onset time has not been measured in maxillectomy patients, as their consonant sound spectra exhibit unique characteristics that make the measurement of voice onset time challenging. In this study, we established criteria for measuring voice onset time in maxillectomy patients for objective speech evaluation. We examined voice onset time for /ka/ and /ta/ in 13 maxillectomy patients by calculating the number of valid measurements of voice onset time out of three trials for each syllable. Wilcoxon's signed rank test showed that voice onset time measurements were more successful for /ka/ and /ta/ when a prosthesis was used (Z = -2.232, P = 0.026 and Z = -2.401, P = 0.016, resp.) than when a prosthesis was not used. These results indicate that a prosthesis affected voice onset time measurement in these patients. Although more research in this area is needed, measurement of voice onset time has the potential to be used to evaluate consonant production in maxillectomy patients wearing a prosthesis.

  1. Measurement of Voice Onset Time in Maxillectomy Patients

    Directory of Open Access Journals (Sweden)

    Mariko Hattori

    2014-01-01

    Full Text Available Objective speech evaluation using acoustic measurement is needed for the proper rehabilitation of maxillectomy patients. For digital evaluation of consonants, measurement of voice onset time is one option. However, voice onset time has not been measured in maxillectomy patients, as their consonant sound spectra exhibit unique characteristics that make the measurement of voice onset time challenging. In this study, we established criteria for measuring voice onset time in maxillectomy patients for objective speech evaluation. We examined voice onset time for /ka/ and /ta/ in 13 maxillectomy patients by calculating the number of valid measurements of voice onset time out of three trials for each syllable. Wilcoxon’s signed rank test showed that voice onset time measurements were more successful for /ka/ and /ta/ when a prosthesis was used (Z=−2.232, P=0.026 and Z=−2.401, P=0.016, resp.) than when a prosthesis was not used. These results indicate that a prosthesis affected voice onset time measurement in these patients. Although more research in this area is needed, measurement of voice onset time has the potential to be used to evaluate consonant production in maxillectomy patients wearing a prosthesis.
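
    Voice onset time (VOT) is the interval between the release burst of a stop consonant and the onset of voicing. The sketch below estimates it on a synthetic /ta/-like token, using a simple amplitude threshold for the burst and a crude autocorrelation periodicity test for voicing onset; the thresholds, frame sizes and the signal itself are all assumptions for illustration, far simpler than the criteria the study had to establish for maxillectomy speech.

        import numpy as np

        fs = 16000
        t_burst, t_voice = 0.100, 0.160            # hypothetical ground truth (s)

        # Synthetic /ta/-like token: silence, a 20 ms noise burst, then a voiced vowel
        rng = np.random.default_rng(1)
        sig = np.zeros(int(0.5 * fs))
        b0 = int(t_burst * fs)
        sig[b0:b0 + int(0.02 * fs)] = 0.3 * rng.normal(size=int(0.02 * fs))
        v0 = int(t_voice * fs)
        tt = np.arange(len(sig) - v0) / fs
        sig[v0:] += 0.8 * np.sign(np.sin(2 * np.pi * 120 * tt))   # 120 Hz "voicing"

        def frame_is_voiced(frame, fs, f0_range=(75, 300)):
            # Very crude periodicity test: normalized autocorrelation peak in the F0 lag range
            frame = frame - frame.mean()
            if np.max(np.abs(frame)) < 0.05:
                return False
            ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
            lo, hi = int(fs / f0_range[1]), int(fs / f0_range[0])
            return ac[lo:hi].max() / (ac[0] + 1e-12) > 0.5

        burst = np.argmax(np.abs(sig) > 0.1) / fs          # burst release: first loud sample
        hop, win = int(0.005 * fs), int(0.030 * fs)
        voice = next(i * hop / fs for i in range(len(sig) // hop)
                     if frame_is_voiced(sig[i * hop:i * hop + win], fs))
        print(f"estimated VOT: {(voice - burst) * 1000:.1f} ms "
              f"(true {(t_voice - t_burst) * 1e3:.0f} ms)")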

  2. Orientation Estimation and Signal Reconstruction of a Directional Sound Source

    DEFF Research Database (Denmark)

    Guarato, Francesco

    Previous works in the literature about one tone or broadband sound sources mainly deal with algorithms and methods developed in order to localize the source and, occasionally, estimate the source bearing angle (with respect to a global reference frame). The problem setting assumes, in these cases, omnidirectional receivers collecting the acoustic signal from the source: analysis of arrival times in the recordings together with microphone positions and source directivity cues allows one to get information about source position and bearing. Moreover, sound sources have been included into sensor systems together...... The estimated orientations, one for each call emission, were compared to those calculated through a pre-existing technique based on interpolation of sound-pressure levels at microphone locations. The application of the method to the bat calls could provide knowledge on bat behaviour that may be useful for a bat-inspired sensor......

  3. Gay- and Lesbian-Sounding Auditory Cues Elicit Stereotyping and Discrimination.

    Science.gov (United States)

    Fasoli, Fabio; Maass, Anne; Paladino, Maria Paola; Sulpizio, Simone

    2017-07-01

    The growing body of literature on the recognition of sexual orientation from voice ("auditory gaydar") is silent on the cognitive and social consequences of having a gay-/lesbian- versus heterosexual-sounding voice. We investigated this issue in four studies (overall N = 276), conducted in Italian, in which heterosexual listeners were exposed to single-sentence voice samples of gay/lesbian and heterosexual speakers. In all four studies, listeners were found to make gender-typical inferences about the traits and preferences of heterosexual speakers, but gender-atypical inferences about those of gay or lesbian speakers. Behavioral intention measures showed that listeners considered lesbian and gay speakers less suitable for a leadership position, and male (but not female) listeners distanced themselves from gay speakers. Together, this research demonstrates that having a gay-/lesbian- rather than heterosexual-sounding voice has tangible consequences for stereotyping and discrimination.

  4. A new signal development process and sound system for diverting fish from water intakes

    International Nuclear Information System (INIS)

    Klinet, D.A.; Loeffelman, P.H.; van Hassel, J.H.

    1992-01-01

    This paper reports that American Electric Power Service Corporation has explored the feasibility of using a patented signal development process and underwater sound system to divert fish away from water intake areas. The effect of water intakes on fish is being closely scrutinized as hydropower projects are re-licensed. The overall goal of this four-year research project was to develop an underwater guidance system which is biologically effective, reliable and cost-effective compared with other proposed methods of diversion, such as physical screens. Because different fish species have various listening ranges, it was essential to the success of this experiment that the sound system have a great amount of flexibility. Assuming a fish's sounds are heard by the same kind of fish, it was necessary to develop a procedure and acquire instrumentation to properly analyze the sounds that the target fish species create to communicate, as well as any artificial signals being generated for diversion.

  5. The voiced pronunciation of initial phonemes predicts the gender of names.

    Science.gov (United States)

    Slepian, Michael L; Galinsky, Adam D

    2016-04-01

    Although it is known that certain names gain popularity within a culture because of historical events, it is unknown how names become associated with different social categories in the first place. We propose that vocal cord vibration during the pronunciation of an initial phoneme plays a critical role in explaining which names are assigned to males versus females. This produces a voiced gendered name effect, whereby voiced phonemes (vibration of the vocal cords) are more associated with male names, and unvoiced phonemes (no vibration of the vocal cords) are more associated with female names. Eleven studies test this association between voiced names and gender (a) using 270 million names (more than 80,000 unique names) given to children over 75 years, (b) names across 2 cultures (the U.S. and India), and (c) hundreds of novel names. The voiced gendered name effect was mediated through how hard or soft names sounded, and moderated by gender stereotype endorsement. Although extensive work has demonstrated morphological and physical cues to gender (e.g., facial, bodily, vocal), this work provides a systematic account of name-based cues to gender. Overall, the current research extends work on sound symbolism to names; the way in which a name sounds can be symbolically related to stereotypes associated with its social category. (c) 2016 APA, all rights reserved.

  6. Reliability in perceptual analysis of voice quality.

    Science.gov (United States)

    Bele, Irene Velsvik

    2005-12-01

    This study focuses on speaking voice quality in male teachers (n = 35) and male actors (n = 36), who represent untrained and trained voice users, because we wanted to investigate normal and supranormal voices. Both substantive and methodological aspects were considered. The study includes a method for perceptual voice evaluation, and a basic issue was rater reliability. A listening group of 10 listeners, 7 experienced speech-language therapists and 3 speech-language therapist students, evaluated the voices on 15 vocal characteristics using visual analogue (VA) scales. Two sets of voice signals were investigated: text reading (at 2 loudness levels) and sustained vowel (at 3 levels). The results indicated high interrater reliability for most perceptual characteristics. Both types of voice signals were evaluated reliably, although the reliability for connected speech was somewhat higher than for vowels, especially at the normal loudness level. Experienced listeners tended to be more consistent in their ratings than the student raters. Some vocal characteristics achieved acceptable reliability even with a smaller panel of listeners. The perceptual characteristics grouped into 4 factors reflecting perceptual dimensions.

  7. Sound card based digital correlation detection of weak photoelectrical signals

    International Nuclear Information System (INIS)

    Tang Guanghui; Wang Jiangcheng

    2005-01-01

    A simple and low-cost digital correlation method is proposed for investigating weak photoelectrical signals, using a high-speed photodiode as the detector, directly connected to a programmably triggered sound card analogue-to-digital converter and a personal computer. Two test experiments were performed: autocorrelation detection of weak flickering signals from a computer monitor against a background of noisy outdoor stray light, and cross-correlation measurement of the surface velocity of a moving tape. The results show that the method is reliable and easy to implement.
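
    The cross-correlation measurement described above reduces to estimating the time lag at which two channels are most similar. A minimal sketch with synthetic data follows; the delay, noise level and 1 cm probe spacing are invented for illustration, not taken from the paper.

        import numpy as np
        from scipy.signal import correlate

        fs = 44100                       # a typical sound-card sampling rate
        rng = np.random.default_rng(2)

        # Two channels observing the same broadband fluctuation with an unknown delay,
        # loosely analogous to the moving-tape velocity measurement (synthetic data)
        true_delay = 0.0023              # seconds (invented)
        x = rng.normal(size=fs)          # 1 s of broadband signal on channel 1
        d = int(true_delay * fs)
        y = np.roll(x, d) + 0.5 * rng.normal(size=fs)   # delayed, noisier channel 2

        # Cross-correlate and locate the peak lag
        xc = correlate(y - y.mean(), x - x.mean(), mode="full", method="fft")
        lag = int(np.argmax(xc)) - (len(x) - 1)
        print(f"estimated delay: {lag / fs * 1e3:.2f} ms (true {true_delay * 1e3:.2f} ms)")

        # With a known spacing between the two observation points, the delay converts
        # directly into a surface velocity (spacing value is hypothetical)
        spacing = 0.01                   # metres
        print(f"implied velocity: {spacing * fs / lag:.2f} m/s")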

  8. The role of emotions in the development of voice

    Directory of Open Access Journals (Sweden)

    Anna Maria Disanto

    2014-06-01

    Full Text Available In this paper the authors refer to the voice as an expressive sphere of communication between two people. The voice carries a symbolic meaning whose function is to represent our feelings, and thus our emotional life. The emission of sounds weaves an unconscious communication of affection and expresses the archaic nature of the links between body and language and the presence of a strong sensorial dimension: auditory, olfactory, tactile and visual.

  9. A Signal Processing Module for the Analysis of Heart Sounds and Heart Murmurs

    International Nuclear Information System (INIS)

    Javed, Faizan; Venkatachalam, P A; H, Ahmad Fadzil M

    2006-01-01

    In this paper a Signal Processing Module (SPM) for the computer-aided analysis of heart sounds has been developed. The module reveals important information on cardiovascular disorders and can assist general physicians in reaching more accurate and reliable diagnoses at early stages. It can compensate for the shortage of expert doctors in rural as well as urban clinics and hospitals. The module has five main blocks: Data Acquisition and Pre-processing, Segmentation, Feature Extraction, Murmur Detection and Murmur Classification. The heart sounds are first acquired using an electronic stethoscope, which has the capability of transferring these signals to a nearby workstation over wireless media. The signals are then segmented into individual cycles as well as individual components using spectral analysis of the heart, without using any reference signal such as the ECG. Features are then extracted from the individual components using the spectrogram and are used as input to an MLP (multilayer perceptron) neural network that is trained to detect the presence of heart murmurs. Once a murmur is detected, it is classified into one of seven classes depending on its timing within the cardiac cycle, using the smoothed pseudo Wigner-Ville distribution. The module has been tested with real heart sounds from 40 patients and has proved to be quite efficient and robust in dealing with a large variety of pathological conditions.

  10. A Signal Processing Module for the Analysis of Heart Sounds and Heart Murmurs

    Energy Technology Data Exchange (ETDEWEB)

    Javed, Faizan; Venkatachalam, P A; H, Ahmad Fadzil M [Signal and Imaging Processing and Tele-Medicine Technology Research Group, Department of Electrical and Electronics Engineering, Universiti Teknologi PETRONAS, 31750 Tronoh, Perak (Malaysia)

    2006-04-01

    In this paper a Signal Processing Module (SPM) for the computer-aided analysis of heart sounds has been developed. The module reveals important information on cardiovascular disorders and can assist general physicians in reaching more accurate and reliable diagnoses at early stages. It can compensate for the shortage of expert doctors in rural as well as urban clinics and hospitals. The module has five main blocks: Data Acquisition and Pre-processing, Segmentation, Feature Extraction, Murmur Detection and Murmur Classification. The heart sounds are first acquired using an electronic stethoscope, which has the capability of transferring these signals to a nearby workstation over wireless media. The signals are then segmented into individual cycles as well as individual components using spectral analysis of the heart, without using any reference signal such as the ECG. Features are then extracted from the individual components using the spectrogram and are used as input to an MLP (multilayer perceptron) neural network that is trained to detect the presence of heart murmurs. Once a murmur is detected, it is classified into one of seven classes depending on its timing within the cardiac cycle, using the smoothed pseudo Wigner-Ville distribution. The module has been tested with real heart sounds from 40 patients and has proved to be quite efficient and robust in dealing with a large variety of pathological conditions.
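
    A toy version of the murmur-detection stage is sketched below: spectrogram-derived features feeding a small MLP. Everything here is a stand-in for illustration; the "heart sounds" are synthetic (decaying 50 Hz bursts for S1/S2, with a noise band added between them for "murmur" cycles), and the feature (mean log power per frequency bin) is far cruder than the module's real feature extraction.

        import numpy as np
        from scipy.signal import spectrogram
        from sklearn.neural_network import MLPClassifier

        fs = 2000                                   # Hz, plausible for heart sounds
        rng = np.random.default_rng(3)

        def spectrogram_features(x, fs):
            # Mean log-power per frequency bin, a crude stand-in for spectrogram features
            f, t, S = spectrogram(x, fs=fs, nperseg=256)
            return np.log(S + 1e-12).mean(axis=1)

        def synth_cycle(murmur):
            # One synthetic 1 s cardiac cycle: S1 and S2 as decaying 50 Hz bursts
            x = 0.05 * rng.normal(size=fs)
            for onset in (0.1, 0.5):
                i = int(onset * fs)
                tt = np.arange(int(0.06 * fs)) / fs
                x[i:i + len(tt)] += np.sin(2 * np.pi * 50 * tt) * np.exp(-tt / 0.02)
            if murmur:                              # systolic noise between S1 and S2
                i, j = int(0.18 * fs), int(0.45 * fs)
                x[i:j] += 0.3 * rng.normal(size=j - i)
            return x

        X = np.array([spectrogram_features(synth_cycle(m), fs) for m in [0, 1] * 60])
        y = np.array([0, 1] * 60)

        clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
        clf.fit(X[:100], y[:100])
        print(f"murmur detection accuracy on held-out cycles: {clf.score(X[100:], y[100:]):.2f}")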

  11. Finding your mate at a cocktail party: frequency separation promotes auditory stream segregation of concurrent voices in multi-species frog choruses.

    Directory of Open Access Journals (Sweden)

    Vivek Nityananda

    Full Text Available Vocal communication in crowded social environments is a difficult problem for both humans and nonhuman animals. Yet many important social behaviors require listeners to detect, recognize, and discriminate among signals in a complex acoustic milieu comprising the overlapping signals of multiple individuals, often of multiple species. Humans exploit a relatively small number of acoustic cues to segregate overlapping voices (as well as other mixtures of concurrent sounds, like polyphonic music). By comparison, we know little about how nonhuman animals are adapted to solve similar communication problems. One important cue enabling source segregation in human speech communication is that of frequency separation between concurrent voices: differences in frequency promote perceptual segregation of overlapping voices into separate "auditory streams" that can be followed through time. In this study, we show that frequency separation (ΔF) also enables frogs to segregate concurrent vocalizations, such as those routinely encountered in mixed-species breeding choruses. We presented female gray treefrogs (Hyla chrysoscelis) with a pulsed target signal (simulating an attractive conspecific call) in the presence of a continuous stream of distractor pulses (simulating an overlapping, unattractive heterospecific call). When the ΔF between target and distractor was small (e.g., ≤3 semitones), females exhibited low levels of responsiveness, indicating a failure to recognize the target as an attractive signal when the distractor had a similar frequency. Subjects became increasingly more responsive to the target, as indicated by shorter latencies for phonotaxis, as the ΔF between target and distractor increased (e.g., ΔF = 6-12 semitones). These results support the conclusion that gray treefrogs, like humans, can exploit frequency separation as a perceptual cue to segregate concurrent voices in noisy social environments. The ability of these frogs to segregate…

  12. Cuffless and Continuous Blood Pressure Estimation from the Heart Sound Signals

    Directory of Open Access Journals (Sweden)

    Rong-Chao Peng

    2015-09-01

    Full Text Available Cardiovascular disease, such as hypertension, is one of the leading causes of death, and early detection of cardiovascular disease is of great importance. However, traditional medical devices are often bulky and expensive, and unsuitable for home healthcare. In this paper, we propose an easy and inexpensive technique to estimate continuous blood pressure from heart sound signals acquired by the microphone of a smartphone. A cold-pressor experiment was performed on 32 healthy subjects, with a smartphone to acquire heart sound signals and a commercial device to measure continuous blood pressure. The Fourier spectrum of the second heart sound and the blood pressure were regressed using a support vector machine, and the accuracy of the regression was evaluated using 10-fold cross-validation. Statistical analysis showed that the mean correlation coefficients between the values predicted by the regression model and those measured by the commercial device were 0.707, 0.712, and 0.748 for systolic, diastolic, and mean blood pressure, respectively, and that the mean errors were less than 5 mmHg, with standard deviations less than 8 mmHg. These results suggest that this technique is of potential use for cuffless and continuous blood pressure monitoring and has promising application in home healthcare services.
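
    The regression step lends itself to a compact sketch with scikit-learn. The data below are synthetic placeholders (random "S2 spectra" with a planted linear relation to pressure); only the model family (support vector regression) and the 10-fold cross-validation mirror the study's setup.

        import numpy as np
        from sklearn.svm import SVR
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(4)

        # Placeholder dataset: Fourier-magnitude features of the second heart sound (S2).
        # Real features would come from segmented phonocardiograms; here we plant a weak
        # linear relation between spectrum and systolic pressure, plus noise.
        n, n_bins = 320, 64
        S2_spectra = rng.normal(size=(n, n_bins))
        true_w = rng.normal(size=n_bins) * 0.5
        sbp = 120 + S2_spectra @ true_w + 5 * rng.normal(size=n)   # mmHg

        # Support-vector regression from spectrum to pressure, scored by 10-fold CV
        model = SVR(kernel="rbf", C=10.0, epsilon=1.0)
        scores = cross_val_score(model, S2_spectra, sbp, cv=10,
                                 scoring="neg_mean_absolute_error")
        print(f"10-fold CV mean absolute error: {-scores.mean():.1f} mmHg")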

  13. Cognitive Bias for Learning Speech Sounds From a Continuous Signal Space Seems Nonlinguistic

    Directory of Open Access Journals (Sweden)

    Sabine van der Ham

    2015-10-01

    Full Text Available When learning language, humans have a tendency to produce more extreme distributions of speech sounds than those observed most frequently: In rapid, casual speech, vowel sounds are centralized, yet cross-linguistically, peripheral vowels occur almost universally. We investigate whether adults’ generalization behavior reveals selective pressure for communication when they learn skewed distributions of speech-like sounds from a continuous signal space. The domain-specific hypothesis predicts that the emergence of sound categories is driven by a cognitive bias to make these categories maximally distinct, resulting in more skewed distributions in participants’ reproductions. However, our participants showed more centered distributions, which goes against this hypothesis, indicating that there are no strong innate linguistic biases that affect learning these speech-like sounds. The centralization behavior can be explained by a lack of communicative pressure to maintain categories.

  14. Multilingual evaluation of voice disability index using pitch rate

    Directory of Open Access Journals (Sweden)

    Shuji Shinohara

    2017-06-01

    Full Text Available We propose the use of the pitch rate of free-form speech recorded by smartphones as an index of voice disability. This research compares the effectiveness of pitch rate, jitter, shimmer, and harmonic-to-noise ratio (HNR) as indices of voice disability in English, German, and Japanese. Normally, the evaluation of these indices is performed using long-vowel sounds; however, this study included the recitation of a set passage, which is more similar to free-form speech. The results showed that for English, jitter, shimmer, and HNR were very effective indices for long-vowel sounds, but shimmer and HNR for read speech were considerably worse. Although the effectiveness of jitter as an index was maintained for read speech, the pitch rate was better at distinguishing between healthy individuals and patients with illnesses affecting their voice. The read speech results in German, Japanese, and English were similar, and the pitch rate showed the greatest identification efficiency. Nevertheless, compared to English, the identification efficiency for the other two languages was lower.
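
    One plausible reading of "pitch rate" — and it is an assumption, since the paper's exact definition may differ — is the proportion of analysis frames in which a fundamental frequency can be detected. A minimal sketch with librosa's pYIN tracker (the file name is hypothetical):

        import numpy as np
        import librosa

        def pitch_rate(path):
            # Fraction of frames flagged as voiced by the pYIN F0 tracker
            y, sr = librosa.load(path, sr=16000)
            f0, voiced_flag, voiced_prob = librosa.pyin(
                y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr)
            return np.mean(voiced_flag)

        # Hypothetical usage: a healthy reading passage would be expected to yield a
        # higher pitch rate than a severely dysphonic one.
        # print(f"pitch rate: {pitch_rate('read_passage.wav'):.2f}")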

  15. Colour and texture associations in voice-induced synaesthesia

    Directory of Open Access Journals (Sweden)

    Anja eMoos

    2013-09-01

    Full Text Available Voice-induced synaesthesia, a form of synaesthesia in which synaesthetic perceptions are induced by the sounds of people’s voices, appears to be relatively rare and has not been systematically studied. In this study we investigated the synaesthetic colour and visual texture perceptions experienced in response to different types of voice quality (e.g. nasal, whisper, falsetto). Experiences of three different groups – self-reported voice synaesthetes, phoneticians and controls – were compared using both qualitative and quantitative analysis in a study conducted online. Whilst, in the qualitative analysis, synaesthetes used more colour and texture terms to describe voices than either phoneticians or controls, only weak differences, and many similarities, between groups were found in the quantitative analysis. Notable consistent results between groups were the matching of higher speech fundamental frequencies with lighter and redder colours, the matching of whispery voices with smoke-like textures and the matching of harsh and creaky voices with textures resembling dry cracked soil. These data are discussed in the light of current thinking about definitions and categorizations of synaesthesia, especially in cases where individuals apparently have a range of different synaesthetic inducers.

  16. Color and texture associations in voice-induced synesthesia

    Science.gov (United States)

    Moos, Anja; Simmons, David; Simner, Julia; Smith, Rachel

    2013-01-01

    Voice-induced synesthesia, a form of synesthesia in which synesthetic perceptions are induced by the sounds of people's voices, appears to be relatively rare and has not been systematically studied. In this study we investigated the synesthetic color and visual texture perceptions experienced in response to different types of “voice quality” (e.g., nasal, whisper, falsetto). Experiences of three different groups—self-reported voice synesthetes, phoneticians, and controls—were compared using both qualitative and quantitative analysis in a study conducted online. Whilst, in the qualitative analysis, synesthetes used more color and texture terms to describe voices than either phoneticians or controls, only weak differences, and many similarities, between groups were found in the quantitative analysis. Notable consistent results between groups were the matching of higher speech fundamental frequencies with lighter and redder colors, the matching of “whispery” voices with smoke-like textures, and the matching of “harsh” and “creaky” voices with textures resembling dry cracked soil. These data are discussed in the light of current thinking about definitions and categorizations of synesthesia, especially in cases where individuals apparently have a range of different synesthetic inducers. PMID:24032023

  17. [Acoustic and aerodynamic characteristics of the oesophageal voice].

    Science.gov (United States)

    Vázquez de la Iglesia, F; Fernández González, S

    2005-12-01

    The aim of the study is to determine the physiology and pathophysiology of esophageal voice according to objective aerodynamic and acoustic parameters (quantitative and qualitative). Our subjects comprised 33 laryngectomized patients (all male) who underwent an aerodynamic, acoustic and perceptual protocol. There is a statistical association between acoustic and aerodynamic qualitative parameters (phonation flow chart type, sound spectrum, perceptual analysis) and quantitative parameters (neoglottic pressure, phonation flow, phonation time, fundamental frequency, maximum sound intensity level, speech rate). Nevertheless, such observations do not always bring practical resources to clinical practice. We consider that the findings studied may enable us, pragmatically, to add new resources for more effective vocal rehabilitation of these patients. The method we have applied gives a good understanding of the physiology of esophageal voice and also serves rehabilitation, improving oral communication skills in the laryngectomee population.

  18. Voicing the Technological Body. Some Musicological Reflections on Combinations of Voice and Technology in Popular Music

    Directory of Open Access Journals (Sweden)

    Florian Heesch

    2016-05-01

    Full Text Available The article deals with interrelations of voice, body and technology in popular music from a musicological perspective. It is an attempt to outline a systematic approach to the history of music technology with regard to aesthetic aspects, taking the identity of the singing subject as the main point of departure for a hermeneutic reading of popular song. Although the argumentation is based largely on musicological research, it is also inspired by the notion of presentness as developed by theologian and media scholar Walter Ong. The variety of the relationships between voice, body, and technology with regard to musical representations of identity, in particular gender and race, is systematized along the following categories: (1) the “absence of the body,” which starts with the establishment of phonography; (2) “amplified presence,” as a signifier for uses of the microphone to enhance low sounds in certain manners; and (3) “hybridity,” including vocal identities that blend human body sounds and technological processing, with special focus on uses of the vocoder and similar technologies.

  19. A Novel Fast and Secure Approach for Voice Encryption Based on DNA Computing

    Science.gov (United States)

    Kakaei Kate, Hamidreza; Razmara, Jafar; Isazadeh, Ayaz

    2018-06-01

    Today, in the world of information communication, voice information has a particular importance. One way to preserve voice data from attacks is voice encryption. Encryption algorithms use various techniques such as hashing, chaotic maps, mixing, and many others. In this paper, an algorithm is proposed for voice encryption based on three different schemes to increase the flexibility and strength of the algorithm. The proposed algorithm uses an innovative encoding scheme, the DNA encryption technique and a permutation function to provide a secure and fast solution for voice encryption. The algorithm is evaluated on various measures including signal-to-noise ratio, peak signal-to-noise ratio, correlation coefficient, signal similarity and signal frequency content. The results demonstrate the applicability of the proposed method for secure and fast encryption of voice files.
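
    To give the flavour of the approach, the sketch below maps each audio byte to four DNA bases (two bits per base) and scrambles sample order with a keyed permutation. This is a toy illustration only: the paper's actual encoding scheme, key handling and permutation function are more elaborate, and the mapping 00→A, 01→C, 10→G, 11→T is just one common convention.

        import numpy as np

        BASES = "ACGT"                                    # 00, 01, 10, 11

        def bytes_to_dna(data: bytes) -> str:
            # Each byte becomes four bases, most-significant bit pair first
            return "".join(BASES[(b >> s) & 0b11] for b in data for s in (6, 4, 2, 0))

        def dna_to_bytes(dna: str) -> bytes:
            vals = [BASES.index(c) for c in dna]
            return bytes((vals[i] << 6) | (vals[i+1] << 4) | (vals[i+2] << 2) | vals[i+3]
                         for i in range(0, len(vals), 4))

        def encrypt(samples: bytes, key: int) -> str:
            # Keyed permutation of sample order, then DNA encoding
            perm = np.random.default_rng(key).permutation(len(samples))
            return bytes_to_dna(bytes(samples[i] for i in perm))

        def decrypt(dna: str, key: int) -> bytes:
            scrambled = dna_to_bytes(dna)
            perm = np.random.default_rng(key).permutation(len(scrambled))
            out = bytearray(len(scrambled))
            for dst, src in enumerate(perm):     # invert the permutation
                out[src] = scrambled[dst]
            return bytes(out)

        audio = bytes([12, 200, 7, 99, 255, 0])           # stand-in PCM bytes
        assert decrypt(encrypt(audio, key=42), key=42) == audio
        print(encrypt(audio, key=42))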

  20. An experimental test of noise-dependent voice amplitude regulation in Cope's grey treefrog (Hyla chrysoscelis).

    Science.gov (United States)

    Love, Elliot K; Bee, Mark A

    2010-09-01

    One strategy for coping with the constraints on acoustic signal reception posed by ambient noise is to signal louder as noise levels increase. Termed the 'Lombard effect', this reflexive behaviour is widespread among birds and mammals and occurs with a diversity of signal types, leading to the hypothesis that voice amplitude regulation represents a general vertebrate mechanism for coping with environmental noise. Support for this evolutionary hypothesis, however, remains limited due to a lack of studies in taxa other than birds and mammals. Here, we report the results of an experimental test of the hypothesis that male grey treefrogs increase the amplitude of their advertisement calls in response to increasing levels of chorus-shaped noise. We recorded spontaneously produced calls in quiet and in the presence of noise broadcast at sound pressure levels ranging between 40 dB and 70 dB. While increasing noise levels induced predictable changes in call duration and rate, males did not regulate call amplitude. These results do not support the hypothesis that voice amplitude regulation is a generic vertebrate mechanism for coping with noise. We discuss the possibility that intense sexual selection and high levels of competition for mates in choruses place some frogs under strong selection to call consistently as loudly as possible.

  1. Making Sense of Sound

    Science.gov (United States)

    Menon, Deepika; Lankford, Deanna

    2016-01-01

    From the earliest days of their lives, children are exposed to all kinds of sound, from soft, comforting voices to the frightening rumble of thunder. Consequently, children develop their own naïve explanations largely based upon their experiences with phenomena encountered every day. When new information does not support existing conceptions,…

  2. Radial Basis Function Networks for Conversion of Sound Spectra

    Directory of Open Access Journals (Sweden)

    Carlo Drioli

    2001-03-01

    Full Text Available In many advanced signal processing tasks, such as pitch shifting, voice conversion or sound synthesis, accurate spectral processing is required. Here, the use of Radial Basis Function Networks (RBFNs) is proposed for the modeling of the spectral changes (or conversions) related to the control of important sound parameters, such as pitch or intensity. The identification of such conversion functions is based on a procedure which learns the shape of the conversion from a few pairs of target spectra from a data set. The generalization properties of RBFNs provide for interpolation with respect to the pitch range. In the construction of the training set, mel-cepstral encoding of the spectrum is used to capture the perceptually most relevant spectral changes. Moreover, a singular value decomposition (SVD) approach is used to reduce the dimension of the conversion functions. The RBFN conversion functions introduced are characterized by a perceptually based fast training procedure, desirable interpolation properties and computational efficiency.
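
    A minimal RBFN regression sketch in the spirit of the paper: Gaussian radial basis functions over source frames, with output weights fit by linear least squares. The 12-dimensional "mel-cepstra" below are synthetic placeholders, and the centre count, width heuristic and planted source-target relation are all assumptions.

        import numpy as np

        rng = np.random.default_rng(5)

        # Stand-in data: pairs of 12-dim "mel-cepstra". A real system would train on
        # aligned frames of source/target sounds at different pitches or intensities.
        n, dim, n_centers = 300, 12, 20
        X = rng.normal(size=(n, dim))                     # source cepstra
        W_true = rng.normal(size=(dim, dim)) * 0.3
        Y = np.tanh(X @ W_true) + 0.05 * rng.normal(size=(n, dim))   # target cepstra

        centers = X[rng.choice(n, n_centers, replace=False)]
        width = np.median(np.linalg.norm(X[:, None] - centers[None], axis=2))

        def design(Xin):
            # Gaussian RBF activations plus a bias column
            D = np.linalg.norm(Xin[:, None] - centers[None], axis=2)
            return np.hstack([np.exp(-(D / width) ** 2), np.ones((len(Xin), 1))])

        # Output weights by linear least squares (the usual RBFN training shortcut)
        H = design(X)
        W, *_ = np.linalg.lstsq(H, Y, rcond=None)

        pred = design(X) @ W
        print(f"training RMSE: {np.sqrt(np.mean((pred - Y) ** 2)):.3f}")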

  3. Noise Source Visualization Using a Digital Voice Recorder and Low-Cost Sensors.

    Science.gov (United States)

    Cho, Yong Thung

    2018-04-03

    Accurate sound visualization of noise sources is required for optimal noise control. Typically, noise measurement systems require microphones, an analog-digital converter, cables, a data acquisition system, etc., which may not be affordable for potential users. Also, many such systems are not highly portable and may not be convenient for travel. Handheld personal electronic devices such as smartphones and digital voice recorders with relatively lower costs and higher performance have become widely available recently. Even though such devices are highly portable, directly implementing them for noise measurement may lead to erroneous results since such equipment was originally designed for voice recording. In this study, external microphones were connected to a digital voice recorder to conduct measurements and the input received was processed for noise visualization. In this way, a low cost, compact sound visualization system was designed and introduced to visualize two actual noise sources for verification with different characteristics: an enclosed loud speaker and a small air compressor. Reasonable accuracy of noise visualization for these two sources was shown over a relatively wide frequency range. This very affordable and compact sound visualization system can be used for many actual noise visualization applications in addition to educational purposes.

  4. High quality voice synthesis middleware; Kohinshitsu onsei gosei middleware

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2000-03-01

    Toshiba Corp. newly developed a natural voice synthesis system, the TOS Drive TTS (TOtally speaker Driven Text-To-Speech) system, in which the naturalness of high-quality read-aloud speech is greatly improved, and also developed a voice synthesis middleware as its application. In the newly developed system, using a narrator's voice recorded beforehand as a model, a metrical control dictionary is automatically learned that reproduces the characteristics of metrical patterns such as the intonation or rhythm of a human voice, as is a voice basis dictionary that reproduces the characteristics of voice quality, enabling natural voice synthesis that picks up human voice characteristics. The system is high quality and also very compact, while the voice synthesis middleware utilizing this technology is adaptable to various platforms such as different MPUs or OSs. The system is very suitable for audio response in the ITS field, with car navigation systems at its core; moreover, expanded application is expected to audio response systems that used to employ sound recording and reproduction. (translated by NEDO)

  5. Electrolarynx Voice Recognition Utilizing Pulse Coupled Neural Network

    Directory of Open Access Journals (Sweden)

    Fatchul Arifin

    2010-08-01

    Full Text Available The laryngectomized patient has no ability to speak normally because their vocal cords have been removed. The easiest option for the patient to speak again is by using electrolarynx speech. This tool is placed on the lower chin, and vibration of the neck while speaking is used to produce sound. Meanwhile, "voice recognition" technology has been growing very rapidly, and it is expected that it can also be used by laryngectomized patients who use an electrolarynx. This paper describes a system for electrolarynx speech recognition. The two main parts of the system are feature extraction and pattern recognition. A Pulse Coupled Neural Network (PCNN) is used to extract the features and characteristics of electrolarynx speech; varying β (one of the PCNN parameters) was also investigated. A multilayer perceptron is used to recognize the sound patterns. Two kinds of recognition were conducted in this paper: speech recognition and speaker recognition. Speech recognition recognizes specific speech from any person, while speaker recognition recognizes specific speech from a specific person. The system ran well. The electrolarynx speech recognition was tested by recognizing "A" and "not A" voices, and the results showed 94.4% validation accuracy. The electrolarynx speaker recognition was tested by recognizing the word "saya" from several different speakers, and the results showed 92.2% validation accuracy. The best β parameter of the PCNN for electrolarynx recognition was 3.
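
    The feature-extraction stage can be sketched as a one-dimensional pulse-coupled neural network whose output is the firing "time signature" (how many neurons fire at each iteration). The decay constants, linking kernel and threshold gain below are invented, and the input is a random spectrum; only the role of β as the linking strength mirrors the paper.

        import numpy as np

        def pcnn_signature(S, beta=3.0, n_iter=40):
            # 1-D pulse-coupled neural network; returns the number of neurons firing
            # at each iteration, usable as a feature vector for a classifier.
            S = S / (S.max() + 1e-12)                 # normalized stimulus (e.g. a spectrum)
            F = np.zeros_like(S); L = np.zeros_like(S)
            Y = np.zeros_like(S); T = np.ones_like(S)
            k = np.array([0.5, 0.0, 0.5])             # linking kernel: neighbours only
            sig = []
            for _ in range(n_iter):
                link = np.convolve(Y, k, mode="same")
                F = 0.7 * F + S + 0.2 * link          # feeding compartment
                L = 0.6 * L + 0.3 * link              # linking compartment
                U = F * (1.0 + beta * L)              # modulation; beta is the varied parameter
                Y = (U > T).astype(float)             # pulse output
                T = 0.8 * T + 20.0 * Y                # dynamic threshold
                sig.append(Y.sum())
            return np.array(sig)

        # Hypothetical use: the signature of an utterance's spectrum would feed the
        # MLP recognizer; here a random spectrum stands in.
        spec = np.abs(np.random.default_rng(6).normal(size=128))
        print(pcnn_signature(spec, beta=3.0)[:10])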

  6. Air conducted and body conducted sound produced by own voice

    DEFF Research Database (Denmark)

    Hansen, Mie Østergaard

    1998-01-01

    When we speak, sound reaches our ears both through the air, from the mouth to the ear, and through our body, as vibrations. The ratio between the airborne and body-conducted sound has been studied in a pilot experiment in which the airborne sound was eliminated by isolating the ear with a large attenuation box. The ratio was found to lie between -15 dB and -7 dB below 1 kHz, comparable with theoretical estimations. This work is part of a broader study of the occlusion effect, and the results provide important input data for modelling the sound pressure change between an open and an occluded ear canal.

  7. Analyzing the Pattern of L1 Sounds on L2 Sounds Produced by Javanese Students of Stkip PGRI Jombang

    Directory of Open Access Journals (Sweden)

    Daning Hentasmaka

    2015-07-01

    Full Text Available The study concerns an analysis of the tendency of first language (L1) sound patterning on second language (L2) sounds by Javanese students. Focusing on the consonant sounds, the data were collected by recording students’ pronunciation of English words during a pronunciation test. The data were then analysed through three activities: data reduction, data display, and conclusion drawing/verification. The result showed that the patterning of L1 sounds onto L2 sounds happened especially with eleven consonant sounds: the fricatives [v, θ, ð, ʃ, ʒ], the voiceless stops [p, t, k], and the voiced stops [b, d, g]. Those patterning cases emerged mostly due to differences in the existence of consonant sounds and in the rules of consonant distribution. Besides, one of the cases was caused by the difference in consonant clusters between L1 and L2.

  8. ANALYZING THE PATTERN OF L1 SOUNDS ON L2 SOUNDS PRODUCED BY JAVANESE STUDENTS OF STKIP PGRI JOMBANG

    Directory of Open Access Journals (Sweden)

    Daning Hentasmaka

    2015-07-01

    Full Text Available The study concerns an analysis of the tendency of first language (L1) sound patterning on second language (L2) sounds by Javanese students. Focusing on the consonant sounds, the data were collected by recording students’ pronunciation of English words during a pronunciation test. The data were then analysed through three activities: data reduction, data display, and conclusion drawing/verification. The result showed that the patterning of L1 sounds onto L2 sounds happened especially with eleven consonant sounds: the fricatives [v, θ, ð, ʃ, ʒ], the voiceless stops [p, t, k], and the voiced stops [b, d, g]. Those patterning cases emerged mostly due to differences in the existence of consonant sounds and in the rules of consonant distribution. Besides, one of the cases was caused by the difference in consonant clusters between L1 and L2.

  9. The Show with the Voice: An [Au]/-[o]-tophonographic Parody

    Directory of Open Access Journals (Sweden)

    David D.J. Sander Scheidt

    2008-05-01

    Full Text Available According to my claim that voice as a phenomenon cannot be materialised or located, neither in the (voice organ of the) self nor in the (ear of the) other, I coined the term [au]/[o]-tophonography for my examination of the possibilities of performing subjectivity in writing and in sound productions. Drawing on the theory of performativity in its deconstructive senses (see BUTLER, 1993, 1997, 1999/1990; DERRIDA, 1988/1972, 1997/1967, 2002/1981; SMITH, 1995), my performative epistemology reaches beyond the theoretical, including the practical and the aesthetical, aiming at questioning notions of "self", "audience", "voice", "writing" and "communication". "The show with the voice" (http://www.qualitative-research.net/fqs-texte/2-08/08-2-27_audio.mp3) is an example of this practice. It parodies the medico-scientific approach to the human voice by presenting some of its possible appearances (the "normal", the "disordered", the "homosexual" and the "transsexual" voice) in an audio collage that takes the shape of a mock tutorial. Through re-contextualising and re-compiling voice samples from different sources that are usually kept apart (e.g. the lecturer's voice, the researcher's voice, the artist's voice, the autobiographer's voice), I open a space for a multidisciplinary and creative perspective on the examination of voice. URN: urn:nbn:de:0114-fqs0802279

  10. Decoding the neural signatures of emotions expressed through sound.

    Science.gov (United States)

    Sachs, Matthew E; Habibi, Assal; Damasio, Antonio; Kaplan, Jonas T

    2018-03-01

    Effective social functioning relies in part on the ability to identify emotions from auditory stimuli and respond appropriately. Previous studies have uncovered brain regions engaged by the affective information conveyed by sound. But some of the acoustical properties of sounds that express certain emotions vary remarkably with the instrument used to produce them, for example the human voice or a violin. Do these brain regions respond in the same way to different emotions regardless of the sound source? To address this question, we had participants (N = 38, 20 females) listen to brief audio excerpts produced by the violin, clarinet, and human voice, each conveying one of three target emotions-happiness, sadness, and fear-while brain activity was measured with fMRI. We used multivoxel pattern analysis to test whether emotion-specific neural responses to the voice could predict emotion-specific neural responses to musical instruments and vice-versa. A whole-brain searchlight analysis revealed that patterns of activity within the primary and secondary auditory cortex, posterior insula, and parietal operculum were predictive of the affective content of sound both within and across instruments. Furthermore, classification accuracy within the anterior insula was correlated with behavioral measures of empathy. The findings suggest that these brain regions carry emotion-specific patterns that generalize across sounds with different acoustical properties. Also, individuals with greater empathic ability have more distinct neural patterns related to perceiving emotions. These results extend previous knowledge regarding how the human brain extracts emotional meaning from auditory stimuli and enables us to understand and connect with others effectively. Copyright © 2018 Elsevier Inc. All rights reserved.

  11. Analysis And Voice Recognition In Indonesian Language Using MFCC And SVM Method

    Directory of Open Access Journals (Sweden)

    Harvianto Harvianto

    2016-06-01

    Full Text Available Voice recognition technology is a biometric technology. The voice is a unique human attribute that makes one individual easily distinguishable from another. Voice can also provide information such as gender, emotion, and identity of the speaker. This research recorded human voices pronouncing the digits 0 to 9, with and without noise. Features of these recordings were extracted using Mel Frequency Cepstral Coefficients (MFCC). The mean, standard deviation, max, min, and combinations of them were used to construct the feature vectors. These feature vectors were then classified using a Support Vector Machine (SVM). There were two classification models: the first based on the speaker and the other based on the digit pronounced. The classification models were then validated by performing 10-fold cross-validation. The best average accuracy across the two classification models was 91.83%, achieved using mean + standard deviation + min + max as features.
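
    The pipeline above — per-recording MFCC statistics classified by an SVM with 10-fold cross-validation — can be sketched as follows. The feature set shown is the best-performing combination reported (mean + standard deviation + min + max); the file names and label scheme are hypothetical, so the data-loading part is left commented out.

        import numpy as np
        import librosa
        from sklearn.svm import SVC
        from sklearn.model_selection import cross_val_score

        def features(path):
            # Per-recording statistics of 13 MFCCs: mean + std + min + max
            y, sr = librosa.load(path, sr=16000)
            mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
            return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1),
                                   mfcc.min(axis=1), mfcc.max(axis=1)])

        # Hypothetical file list: 20 takes of each spoken digit 0-9
        # paths = [f"digit_{d}_take{t}.wav" for d in range(10) for t in range(20)]
        # labels = [d for d in range(10) for _ in range(20)]
        # X = np.array([features(p) for p in paths])
        # scores = cross_val_score(SVC(kernel="rbf", C=10.0), X, labels, cv=10)
        # print(f"10-fold CV accuracy: {scores.mean():.3f}")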

  12. Reproducibility of Dual-Microphone Voice Range Profile Equipment

    DEFF Research Database (Denmark)

    Printz, Trine; Pedersen, Ellen Raben; Juhl, Peter

    2017-01-01

    in an anechoic chamber and an office: (a) comparing sound pressure levels (SPLs) from a dual-microphone VRP device, the Voice Profiler, when given the same input repeatedly (test-retest reliability); (b) comparing SPLs from 3 devices when given the same input repeatedly (intervariation); and (c) assessing...

  13. Using the Voice to Design Ceramics

    DEFF Research Database (Denmark)

    Hansen, Flemming Tvede; Jensen, Kristoffer

    2011-01-01

    Digital technology opens new possibilities in ceramic craft. This project is about how the experiential knowledge that craftsmen gain in direct physical and tactile interaction with a responding material can be transformed and utilized in the use of digital technologies. The project presents SoundShaping, a system to create ceramics from the human voice. Based on a generic audio feature extraction system, and on principal component analysis to ensure that the pertinent information in the voice is used, a 3D shape is created using simple geometric rules. This shape is output to a 3D printer to make ceramic results. The system demonstrates the close connection between digital technology and craft practice.

  14. Noise Source Visualization Using a Digital Voice Recorder and Low-Cost Sensors

    Directory of Open Access Journals (Sweden)

    Yong Thung Cho

    2018-04-01

    Full Text Available Accurate sound visualization of noise sources is required for optimal noise control. Typically, noise measurement systems require microphones, an analog-digital converter, cables, a data acquisition system, etc., which may not be affordable for potential users. Also, many such systems are not highly portable and may not be convenient for travel. Handheld personal electronic devices such as smartphones and digital voice recorders with relatively lower costs and higher performance have become widely available recently. Even though such devices are highly portable, directly implementing them for noise measurement may lead to erroneous results since such equipment was originally designed for voice recording. In this study, external microphones were connected to a digital voice recorder to conduct measurements and the input received was processed for noise visualization. In this way, a low cost, compact sound visualization system was designed and introduced to visualize two actual noise sources for verification with different characteristics: an enclosed loud speaker and a small air compressor. Reasonable accuracy of noise visualization for these two sources was shown over a relatively wide frequency range. This very affordable and compact sound visualization system can be used for many actual noise visualization applications in addition to educational purposes.

  15. Onset and Maturation of Fetal Heart Rate Response to the Mother's Voice over Late Gestation

    Science.gov (United States)

    Kisilevsky, Barbara S.; Hains, Sylvia M. J.

    2011-01-01

    Background: Term fetuses discriminate their mother's voice from a female stranger's, suggesting recognition/learning of some property of her voice. Identification of the onset and maturation of the response would increase our understanding of the influence of environmental sounds on the development of sensory abilities and identify the period when…

  16. Imagined Voices : a poetics of Music-Text-Film

    NARCIS (Netherlands)

    Kyriakides, Y.

    2017-01-01

    Imagined Voices deals with a form of composition, music with on-screen text, in which the dynamic between sound, words and visuals is explored. The research explores the ideas around these 'music-text-films', and attempts to explain how meaning is constructed in the interplay between the different

  17. Removing the Influence of Shimmer in the Calculation of Harmonics-To-Noise Ratios Using Ensemble-Averages in Voice Signals

    OpenAIRE

    Carlos Ferrer; Eduardo González; María E. Hernández-Díaz; Diana Torres; Anesto del Toro

    2009-01-01

    Harmonics-to-noise ratios (HNRs) are affected by general aperiodicity in voiced speech signals. To specifically reflect a signal-to-additive-noise ratio, the measurement should be insensitive to other periodicity perturbations, like jitter, shimmer, and waveform variability. The ensemble averaging technique is a time-domain method which has been gradually refined in terms of its sensitivity to jitter and waveform variability and required number of pulses. In this paper, shimmer is introduced ...
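
    The core idea — the ensemble average approximates the periodic part of the signal, and the residuals approximate the additive noise — fits in a few lines. The sketch below uses already-aligned, equal-length synthetic pulses, so the jitter, shimmer and waveform-variability refinements that the paper is actually about are deliberately omitted.

        import numpy as np

        rng = np.random.default_rng(7)

        # N identical glottal-like pulses plus additive noise, already aligned
        N, P = 30, 200                                    # pulses, samples per pulse
        t = np.arange(P) / P
        pulse = np.sin(2 * np.pi * t) * np.exp(-3 * t)    # crude glottal cycle shape
        ensemble = pulse + 0.1 * rng.normal(size=(N, P))  # each row: one noisy cycle

        s = ensemble.mean(axis=0)                         # ensemble average ~ periodic part
        noise = ensemble - s                              # residual ~ additive noise
        hnr_db = 10 * np.log10(np.sum(s ** 2) / np.mean(np.sum(noise ** 2, axis=1)))
        print(f"estimated HNR: {hnr_db:.1f} dB (noise SD was 0.1)")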

  18. Broadcast sound technology

    CERN Document Server

    Talbot-Smith, Michael

    1990-01-01

    Broadcast Sound Technology provides an explanation of the underlying principles of modern audio technology. Organized into 21 chapters, the book first describes the basic sound; behavior of sound waves; aspects of hearing, harming, and charming the ear; room acoustics; reverberation; microphones; phantom power; loudspeakers; basic stereo; and monitoring of audio signal. Subsequent chapters explore the processing of audio signal, sockets, sound desks, and digital audio. Analogue and digital tape recording and reproduction, as well as noise reduction, are also explained.

  19. Propagation of sound

    DEFF Research Database (Denmark)

    Wahlberg, Magnus; Larsen, Ole Næsbye

    2017-01-01

    properties can be modified by sound absorption, refraction, and interference from multi paths caused by reflections.The path from the source to the receiver may be bent due to refraction. Besides geometrical attenuation, the ground effect and turbulence are the most important mechanisms to influence...... communication sounds for airborne acoustics and bottom and surface effects for underwater sounds. Refraction becomes very important close to shadow zones. For echolocation signals, geometric attenuation and sound absorption have the largest effects on the signals....

  20. Voice search for development

    CSIR Research Space (South Africa)

    Barnard, E

    2010-09-01

    Full Text Available of speech technology development, similar approaches are likely to be applicable in both circumstances. However, within these broad approaches there are details which are specific to certain languages (or language families) that may require solutions... to the modeling of pitch were therefore required. Similarly, it is possible that novel solutions will be required to deal with the click sounds that occur in some Southern Bantu languages, or the voicing...

  1. Image/Music/Voice: Song Dubbing in Hollywood Musicals.

    Science.gov (United States)

    Siefert, Marsha

    1995-01-01

    Uses the practice of song dubbing in the Hollywood film musical to explore the implications and consequences of the singing voice for imaging practices in the 1930s through 1960s. Discusses the ideological, technological, and socioeconomic basis for song dubbing. Discusses gender, race, and ethnicity patterns of image-sound practices. (SR)

  2. Numerical simulation of flow-induced sound in human voice production

    Czech Academy of Sciences Publication Activity Database

    Šidlof, Petr; Zörner, S.; Huppe, A.

    2013-01-01

    Roč. 61, č. 2013 (2013), s. 333-340 E-ISSN 1877-7058. [ParCFD 2013 International conference /25./. Changsha, 20.05.2013-24.05.2013] R&D Projects: GA ČR(CZ) GAP101/11/0207 Institutional support: RVO:61388998 Keywords : aeroacoustics * parallel CFD * human voice * biomechanics * vocal folds Subject RIV: BI - Acoustics

  3. Contralateral routing of signals disrupts monaural level and spectral cues to sound localisation on the horizontal plane.

    Science.gov (United States)

    Pedley, Adam J; Kitterick, Pádraig T

    2017-09-01

    Contra-lateral routing of signals (CROS) devices re-route sound between the deaf and hearing ears of unilaterally-deaf individuals. This rerouting would be expected to disrupt access to monaural level cues that can support monaural localisation in the horizontal plane. However, such a detrimental effect has not been confirmed by clinical studies of CROS use. The present study aimed to exercise strict experimental control over the availability of monaural cues to localisation in the horizontal plane and the fitting of the CROS device to assess whether signal routing can impair the ability to locate sources of sound and, if so, whether CROS selectively disrupts monaural level or spectral cues to horizontal location, or both. Unilateral deafness and CROS device use were simulated in twelve normal hearing participants. Monaural recordings of broadband white noise presented from three spatial locations (-60°, 0°, and +60°) were made in the ear canal of a model listener using a probe microphone with and without a CROS device. The recordings were presented to participants via an insert earphone placed in their right ear. The recordings were processed to disrupt either monaural level or spectral cues to horizontal sound location by roving presentation level or the energy across adjacent frequency bands, respectively. Localisation ability was assessed using a three-alternative forced-choice spatial discrimination task. Participants localised above chance levels in all conditions. Spatial discrimination accuracy was poorer when participants only had access to monaural spectral cues compared to when monaural level cues were available. CROS use impaired localisation significantly regardless of whether level or spectral cues were available. For both cues, signal re-routing had a detrimental effect on the ability to localise sounds originating from the side of the deaf ear (-60°). CROS use also impaired the ability to use level cues to localise sounds originating from

  4. Plant acoustics: in the search of a sound mechanism for sound signaling in plants.

    Science.gov (United States)

    Mishra, Ratnesh Chandra; Ghosh, Ritesh; Bae, Hanhong

    2016-08-01

    Being sessile, plants continuously deal with their dynamic and complex surroundings, identifying important cues and reacting with appropriate responses. Consequently, the sensitivity of plants has evolved to perceive a myriad of external stimuli, which ultimately ensures their successful survival. Research over past centuries has established that plants respond to environmental factors such as light, temperature, moisture, and mechanical perturbations (e.g. wind, rain, touch, etc.) by suitably modulating their growth and development. However, sound vibrations (SVs) as a stimulus have only started receiving attention relatively recently. SVs have been shown to increase the yields of several crops and strengthen plant immunity against pathogens. These vibrations can also prime the plants so as to make them more tolerant to impending drought. Plants can recognize the chewing sounds of insect larvae and the buzz of a pollinating bee, and respond accordingly. It is thus plausible that SVs may serve as a long-range stimulus that evokes ecologically relevant signaling mechanisms in plants. Studies have suggested that SVs increase the transcription of certain genes, soluble protein content, and support enhanced growth and development in plants. At the cellular level, SVs can change the secondary structure of plasma membrane proteins, affect microfilament rearrangements, produce Ca(2+) signatures, cause increases in protein kinases, protective enzymes, peroxidases, antioxidant enzymes, amylase, H(+)-ATPase / K(+) channel activities, and enhance levels of polyamines, soluble sugars and auxin. In this paper, we propose a signaling model to account for the molecular episodes that SVs induce within the cell, and in so doing we uncover a number of interesting questions that need to be addressed by future research in plant acoustics. © The Author 2016. Published by Oxford University Press on behalf of the Society for Experimental Biology. All rights reserved.

  5. The effect of singing training on voice quality for people with quadriplegia.

    Science.gov (United States)

    Tamplin, Jeanette; Baker, Felicity A; Buttifant, Mary; Berlowitz, David J

    2014-01-01

    Despite anecdotal reports of voice impairment in quadriplegia, the exact nature of these impairments is not well described in the literature. This article details objective and subjective voice assessments for people with quadriplegia at baseline and after a respiratory-targeted singing intervention. Randomized controlled trial. Twenty-four participants with quadriplegia were randomly assigned to a 12-week program of either a singing intervention or active music therapy control. Recordings of singing and speech were made at baseline, 6 weeks, 12 weeks, and 6 months postintervention. These deidentified recordings were used to measure sound pressure levels and assess voice quality using the Multidimensional Voice Profile and the Perceptual Voice Profile. Baseline voice quality data indicated deviation from normality in the areas of breathiness, strain, and roughness. A greater percentage of intervention participants moved toward more normal voice quality in terms of jitter, shimmer, and noise-to-harmonic ratio; however, the improvements failed to achieve statistical significance. Subjective and objective assessments of voice quality indicate that quadriplegia may have a detrimental effect on voice quality; in particular, causing a perception of roughness and breathiness in the voice. The results of this study suggest that singing training may have a role in ameliorating these voice impairments. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  6. A posteriori error estimates in voice source recovery

    Science.gov (United States)

    Leonov, A. S.; Sorokin, V. N.

    2017-12-01

    The inverse problem of voice source pulse recovery from a segment of a speech signal is under consideration. A special mathematical model relating these quantities is used for the solution. A variational method of solving the inverse problem of voice source recovery for a new parametric class of sources, namely piecewise-linear sources (PWL-sources), is proposed. A technique for a posteriori numerical error estimation of the obtained solutions is also presented. A computer study of the adequacy of the adopted speech production model with PWL-sources is performed by solving the inverse problem for various types of voice signals, together with a corresponding study of the a posteriori error estimates. Numerical experiments on speech signals show satisfactory properties of the proposed a posteriori error estimates, which represent upper bounds on the possible errors in solving the inverse problem. The estimate of the most probable error in determining the source-pulse shapes is about 7-8% for the investigated speech material. It is noted that a posteriori error estimates can be used as a quality criterion for the obtained voice source pulses in speaker recognition applications.

  7. Improving Robustness against Environmental Sounds for Directing Attention of Social Robots

    DEFF Research Database (Denmark)

    Thomsen, Nicolai Bæk; Tan, Zheng-Hua; Lindberg, Børge

    2015-01-01

    This paper presents a multi-modal system for finding out where to direct the attention of a social robot in a dialog scenario, which is robust against environmental sounds (door slamming, phone ringing, etc.) and short speech segments. The method is based on combining voice activity detection (VAD) and sound source localization (SSL), and furthermore applies post-processing to SSL to filter out short sounds. The system is tested against a baseline system in four different real-world experiments, where different sounds are used as interfering sounds. The results are promising and show a clear improvement....
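
    As a rough illustration of the post-processing idea (not the authors' implementation), the sketch below combines a frame-energy VAD with a minimum-duration filter, so that detections too short to be speech (a door slam, a phone ring) are discarded before localization; all thresholds are assumptions.

```python
# Sketch: energy-based VAD plus a minimum-duration filter on detections.
import numpy as np

def vad_segments(x, fs, frame=0.02, thresh_db=-35.0, min_dur=0.25):
    n = int(frame * fs)
    frames = x[: len(x) // n * n].reshape(-1, n)
    e_db = 10 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    active = e_db > thresh_db
    segs, start = [], None
    for i, a in enumerate(active):
        if a and start is None:
            start = i
        elif not a and start is not None:
            segs.append((start * frame, i * frame))
            start = None
    if start is not None:
        segs.append((start * frame, len(active) * frame))
    # discard detections too short to be speech (door slam, phone ring)
    return [(s, e) for (s, e) in segs if e - s >= min_dur]
```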

  8. Speaker comfort and increase of voice level in lecture rooms

    DEFF Research Database (Denmark)

    Brunskog, Jonas; Gade, Anders Christian; Bellester, G P

    2008-01-01

    Teachers often suffer health problems or tension related to their voice. These problems may be related to their working environment, including the room acoustics of the lecture rooms, which force them to stress their voices. The present paper describes a first effort in finding relationships between...... were also measured in the rooms and subjective impressions from about 20 persons who had experience talking in these rooms were collected as well. Analysis of the data revealed significant differences in the sound power produced by the speaker in the different rooms. It was also found...

  9. Design, development and test of the gearbox condition monitoring system using sound signal processing

    Directory of Open Access Journals (Sweden)

    M Zamani

    2016-09-01

    Full Text Available Introduction One of the ways to minimize the cost of maintenance and repair of rotating industrial equipment is condition monitoring using acoustic analysis. One of the most important issues in the application of industrial equipment is reliability. Every dynamic, electrical, hydraulic or thermal system has certain characteristics which reflect the normal condition of the machine during operation. Any change in these characteristics can signal a problem in the machine. The aim of condition monitoring is to determine the system condition by measuring such characteristic signals and to use this information to predict system failure. There are many methods for condition monitoring of different systems, but sound analysis is widely accepted and used as a method for investigating the condition of rotating machines. The aim of this research is the design and construction of the gearbox test system under consideration and the use of the obtained data in the frequency and time domains for sound analysis and fault diagnosis. Materials and Methods This research was conducted at the biosystems mechanics workshop of Aboureihan College, Tehran University, on February 15, 2015. In order to investigate the diagnosis procedure and gearbox condition, a test system was designed and then constructed. The sound of intact and damaged gearboxes was recorded with an audiometer and stored on a computer for analysis. Sound measurement was done at three pinion speeds of 749, 1050 and 1496 rpm, for an intact gearbox and for gearboxes with a fractured tooth and a worn tooth. Gearbox design and construction: For this research, a gearbox with simple gearwheels was designed according to current needs. The gearbox and its accessories were then modeled in CATIA V5-R20 software and the system was constructed. A gearbox is a machine element used for mechanical power transmission
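
    An illustrative sketch of the frequency-domain side of such a diagnosis (not the authors' code): track the sound power near the gear-mesh frequency, which typically rises, along with its sidebands, when a tooth is fractured or worn. The speed, tooth count, and bandwidth below are example values.

```python
# Sketch: band power around the gear-mesh frequency from a sound recording.
import numpy as np
from scipy.signal import welch

def mesh_band_power(x, fs, pinion_rpm=1050, n_teeth=20, bw=20.0):
    f_mesh = pinion_rpm / 60.0 * n_teeth          # gear-mesh frequency, Hz
    f, pxx = welch(x, fs=fs, nperseg=4096)        # averaged power spectrum
    band = (f > f_mesh - bw) & (f < f_mesh + bw)
    return f_mesh, np.trapz(pxx[band], f[band])   # power near the mesh frequency
```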

  10. Experience with speech sounds is not necessary for cue trading by budgerigars (Melopsittacus undulatus).

    Directory of Open Access Journals (Sweden)

    Mary Flaherty

    Full Text Available The influence of experience with human speech sounds on speech perception in budgerigars, vocal mimics whose speech exposure can be tightly controlled in a laboratory setting, was measured. Budgerigars were divided into groups that differed in auditory exposure and then tested on a cue-trading identification paradigm with synthetic speech. Phonetic cue trading is a perceptual phenomenon observed when changes on one cue dimension are offset by changes in another cue dimension while still maintaining the same phonetic percept. The current study examined whether budgerigars would trade the cues of voice onset time (VOT) and the first formant onset frequency when identifying syllable-initial stop consonants and whether this would be influenced by exposure to speech sounds. There were a total of four different exposure groups: No speech exposure (completely isolated), Passive speech exposure (regular exposure to human speech), and two Speech-trained groups. After the exposure period, all budgerigars were tested for phonetic cue trading using operant conditioning procedures. Birds were trained to peck keys in response to different synthetic speech sounds that began with "d" or "t" and varied in VOT and frequency of the first formant at voicing onset. Once training performance criteria were met, budgerigars were presented with the entire intermediate series, including ambiguous sounds. Responses on these trials were used to determine which speech cues were used, whether a trading relation between VOT and the onset frequency of the first formant was present, and whether speech exposure had an influence on perception. Cue trading was found in all birds, and these results were largely similar to those of a group of humans. Results indicated that prior speech experience was not a requirement for cue trading by budgerigars. The results are consistent with theories that explain phonetic cue trading in terms of a rich auditory encoding of the speech signal.

  11. Automated signal quality assessment of mobile phone-recorded heart sound signals.

    Science.gov (United States)

    Springer, David B; Brennan, Thomas; Ntusi, Ntobeko; Abdelrahman, Hassan Y; Zühlke, Liesl J; Mayosi, Bongani M; Tarassenko, Lionel; Clifford, Gari D

    Mobile phones, due to their audio processing capabilities, have the potential to facilitate the diagnosis of heart disease through automated auscultation. However, such a platform is likely to be used by non-experts, and hence, it is essential that such a device is able to automatically differentiate poor quality from diagnostically useful recordings since non-experts are more likely to make poor-quality recordings. This paper investigates the automated signal quality assessment of heart sound recordings performed using both mobile phone-based and commercial medical-grade electronic stethoscopes. The recordings, each 60 s long, were taken from 151 random adult individuals with varying diagnoses referred to a cardiac clinic and were professionally annotated by five experts. A mean voting procedure was used to compute a final quality label for each recording. Nine signal quality indices were defined and calculated for each recording. A logistic regression model for classifying binary quality was then trained and tested. The inter-rater agreement level for the stethoscope and mobile phone recordings was measured using Conger's kappa for multiclass sets and found to be 0.24 and 0.54, respectively. One-third of all the mobile phone-recorded phonocardiogram (PCG) signals were found to be of sufficient quality for analysis. The classifier was able to distinguish good- and poor-quality mobile phone recordings with 82.2% accuracy, and those made with the electronic stethoscope with an accuracy of 86.5%. We conclude that our classification approach provides a mechanism for substantially improving auscultation recordings by non-experts. This work is the first systematic evaluation of a PCG signal quality classification algorithm (using a separate test dataset) and assessment of the quality of PCG recordings captured by non-experts, using both a medical-grade digital stethoscope and a mobile phone.
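
    A hedged sketch of the general recipe described above: compute simple per-recording quality indices and fit a logistic regression against the expert labels. The two indices here are illustrative stand-ins, not the paper's nine.

```python
# Sketch: toy signal quality indices + logistic regression quality classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

def quality_indices(x, fs):
    x = x / (np.max(np.abs(x)) + 1e-12)
    clipping = np.mean(np.abs(x) > 0.99)             # fraction of clipped samples
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    hf_ratio = spec[freqs > 500].sum() / spec.sum()  # heart sounds lie mostly below 500 Hz
    return [clipping, hf_ratio]

def train_quality_classifier(X, y):
    # X: one row of indices per recording; y: expert labels (1 = usable)
    return LogisticRegression(max_iter=1000).fit(X, y)
```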

  12. Unmasking the effects of masking on performance: The potential of multiple-voice masking in the office environment.

    Science.gov (United States)

    Keus van de Poll, Marijke; Carlsson, Johannes; Marsh, John E; Ljung, Robert; Odelius, Johan; Schlittmeier, Sabine J; Sundin, Gunilla; Sörqvist, Patrik

    2015-08-01

    Broadband noise is often used as a masking sound to combat the negative consequences of background speech on performance in open-plan offices. As office workers generally dislike broadband noise, it is important to find alternatives that are more appreciated while being at least as effective. The purpose of Experiment 1 was to compare broadband noise with two alternatives (multiple voices and water waves) in the context of a serial short-term memory task. A single voice impaired memory in comparison with silence, but when the single voice was masked with multiple voices, performance was on a level with silence. Experiment 2 explored the benefits of multiple-voice masking in more detail (by comparing one voice, three voices, five voices, and seven voices) in the context of word-processed writing (arguably a more office-relevant task). Performance (i.e., writing fluency) increased linearly from worst performance in the one-voice condition to best performance in the seven-voice condition. The psychological mechanisms underpinning these effects are discussed.

  13. Research and Implementation of Heart Sound Denoising

    Science.gov (United States)

    Liu, Feng; Wang, Yutai; Wang, Yanxiang

    Heart sound is one of the most important physiological signals. However, the process of acquiring the heart sound signal can be disturbed by many external factors. The heart sound is a weak signal, and even weak external noise may lead to misjudgment of the pathological and physiological information carried in this signal, and thus to misdiagnosis. As a result, removing the noise mixed with the heart sound is a key task. In this paper, a systematic study and analysis of heart sound denoising based on MATLAB has been made. The noisy heart sound signals are first transformed into the wavelet domain through the wavelet transform and decomposed at multiple levels. Then soft thresholding is applied to the detail coefficients using wavelet thresholding to eliminate noise, so that signal denoising is significantly improved. The reconstructed signals are obtained by stepwise coefficient reconstruction from the processed detail coefficients. Lastly, 50 Hz power-line interference and 35 Hz electromechanical interference are removed using a notch filter.
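
    A minimal Python transcription of the described pipeline (the paper used MATLAB); the wavelet choice, decomposition depth, and threshold rule below are assumptions, not the authors' exact settings.

```python
# Sketch: wavelet soft-threshold denoising followed by a 50 Hz notch filter.
import numpy as np
import pywt
from scipy.signal import iirnotch, filtfilt

def denoise_heart_sound(x, fs, wavelet="db6", level=5):
    coeffs = pywt.wavedec(x, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745     # noise estimate from finest detail
    thr = sigma * np.sqrt(2 * np.log(len(x)))          # universal threshold
    coeffs[1:] = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    y = pywt.waverec(coeffs, wavelet)[: len(x)]
    b, a = iirnotch(w0=50.0, Q=30.0, fs=fs)            # remove 50 Hz mains interference
    return filtfilt(b, a, y)
```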

  14. Development of a voice database to aid children with hearing impairments

    International Nuclear Information System (INIS)

    Kuzman, M G; Agüero, P D; Tulli, J C; Gonzalez, E L; Cervellini, M P; Uriz, A J

    2011-01-01

    In the development of software for voice analysis or training for people with hearing impairments, a database of properly pronounced words is of paramount importance. This paper shows the advantage of building a native voice database, rather than using databases from other countries, even those sharing the same language, in the development of speech training software aimed at people with hearing impairments. This database will be used by software developers at the School of Engineering of Mar del Plata National University.

  15. Measuring the 'complexity' of sound

    Indian Academy of Sciences (India)

    Sounds in the natural environment form an important class of biologically relevant nonstationary signals. We propose a dynamic spectral measure to characterize the spectral dynamics of such non-stationary sound signals and classify them based on rate of change of spectral dynamics. We categorize sounds with slowly ...

  16. Mouth and Voice: A Relationship between Visual and Auditory Preference in the Human Superior Temporal Sulcus.

    Science.gov (United States)

    Zhu, Lin L; Beauchamp, Michael S

    2017-03-08

    Cortex in and around the human posterior superior temporal sulcus (pSTS) is known to be critical for speech perception. The pSTS responds to both the visual modality (especially biological motion) and the auditory modality (especially human voices). Using fMRI in single subjects with no spatial smoothing, we show that visual and auditory selectivity are linked. Regions of the pSTS were identified that preferred visually presented moving mouths (presented in isolation or as part of a whole face) or moving eyes. Mouth-preferring regions responded strongly to voices and showed a significant preference for vocal compared with nonvocal sounds. In contrast, eye-preferring regions did not respond to either vocal or nonvocal sounds. The converse was also true: regions of the pSTS that showed a significant response to speech or preferred vocal to nonvocal sounds responded more strongly to visually presented mouths than eyes. These findings can be explained by environmental statistics. In natural environments, humans see visual mouth movements at the same time as they hear voices, while there is no auditory accompaniment to visual eye movements. The strength of a voxel's preference for visual mouth movements was strongly correlated with the magnitude of its auditory speech response and its preference for vocal sounds, suggesting that visual and auditory speech features are coded together in small populations of neurons within the pSTS. SIGNIFICANCE STATEMENT Humans interacting face to face make use of auditory cues from the talker's voice and visual cues from the talker's mouth to understand speech. The human posterior superior temporal sulcus (pSTS), a brain region known to be important for speech perception, is complex, with some regions responding to specific visual stimuli and others to specific auditory stimuli. Using BOLD fMRI, we show that the natural statistics of human speech, in which voices co-occur with mouth movements, are reflected in the neural architecture of

  17. Second sound tracking system

    Science.gov (United States)

    Yang, Jihee; Ihas, Gary G.; Ekdahl, Dan

    2017-10-01

    It is common that a physical system resonates at a particular frequency, which depends on physical parameters that may change in time. Often, one would like to automatically track this signal as the frequency changes, measuring, for example, its amplitude. In scientific research, one would also like to use standard methods, such as lock-in amplifiers, to improve the signal-to-noise ratio. We present a complete He ii second sound system that uses positive feedback to generate a sinusoidal signal of constant amplitude via automatic gain control. This signal is used to produce temperature/entropy waves (second sound) in superfluid helium-4 (He ii). A lock-in amplifier limits the oscillation to a desirable frequency and demodulates the received sound signal. Using this tracking system, a second sound signal probed turbulent decay in He ii. We present results showing that the tracking system is more reliable than a conventional fixed-frequency method; there is less correlation with temperature (frequency) fluctuation when the tracking system is used.
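
    The demodulation step has a simple software analogue, sketched below under the assumption of a known drive frequency: multiply the received signal by quadrature references and low-pass the products to recover amplitude and phase.

```python
# Sketch: software lock-in demodulation at reference frequency f_ref.
import numpy as np
from scipy.signal import butter, filtfilt

def lock_in(x, fs, f_ref, cutoff=5.0):
    t = np.arange(len(x)) / fs
    b, a = butter(4, cutoff, fs=fs)   # low-pass keeps only the near-DC product
    i = filtfilt(b, a, x * np.cos(2 * np.pi * f_ref * t))
    q = filtfilt(b, a, x * np.sin(2 * np.pi * f_ref * t))
    return 2 * np.hypot(i, q), np.arctan2(q, i)   # amplitude, phase
```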

  18. Measuring positive and negative affect in the voiced sounds of African elephants (Loxodonta africana).

    Science.gov (United States)

    Soltis, Joseph; Blowers, Tracy E; Savage, Anne

    2011-02-01

    As in other mammals, there is evidence that the African elephant voice reflects affect intensity, but it is less clear if positive and negative affective states are differentially reflected in the voice. An acoustic comparison was made between African elephant "rumble" vocalizations produced in negative social contexts (dominance interactions), neutral social contexts (minimal social activity), and positive social contexts (affiliative interactions) by four adult females housed at Disney's Animal Kingdom®. Rumbles produced in the negative social context exhibited higher and more variable fundamental frequencies (F(0)) and amplitudes, longer durations, increased voice roughness, and higher first formant locations (F1), compared to the neutral social context. Rumbles produced in the positive social context exhibited similar shifts in most variables (F(0) variation, amplitude, amplitude variation, duration, and F1), but the magnitude of response was generally less than that observed in the negative context. Voice roughness and F(0) observed in the positive social context remained similar to that observed in the neutral context. These results are most consistent with the vocal expression of affect intensity, in which the negative social context elicited higher intensity levels than the positive context, but differential vocal expression of positive and negative affect cannot be ruled out.

  19. [The Bell Labs contributions to (singing) voice engineering].

    Science.gov (United States)

    Vincent, C

    While in «art» and «traditional» music the nimbleness of the voice and the mastering of the vocal tone are put into perspective, in «popular» music sound engineering takes the lead and relegates the vocal virtuosity of the interpreter to second place. We propose to study here three technologies with contributions to music, all developed and patented by the Bell Labs: the artificial larynx (and its derivatives, the Sonovox and the TalkBox), the vocoder, and speech synthesis. After a presentation of the source-filter theory, vital to these innovations, the principle of these three technologies is explained. A brief history is outlined and complemented by examples of films and musical selections depicting these processes. In light of these elements, we conclude: sound engineering, and in particular the modification of voice sonority, has become an indispensable component in the process of «pop» artistic musical creation.

  20. The Voice of the Heart: Vowel-Like Sound in Pulmonary Artery Hypertension

    Directory of Open Access Journals (Sweden)

    Mohamed Elgendi

    2018-04-01

    Full Text Available Increased blood pressure in the pulmonary artery is referred to as pulmonary hypertension and often is linked to loud pulmonic valve closures. For the purpose of this paper, it was hypothesized that pulmonary circulation vibrations would create sounds similar to sounds created by vocal cords during speech, and that subjects with pulmonary artery hypertension (PAH) could have unique sound signatures across four auscultatory sites. Using a digital stethoscope, heart sounds were recorded at the cardiac apex, 2nd left intercostal space (2LICS), 2nd right intercostal space (2RICS), and 4th left intercostal space (4LICS) in subjects undergoing simultaneous cardiac catheterization. From the collected heart sounds, relative power of the frequency band, energy of the sinusoid formants, and entropy were extracted. PAH subjects were differentiated by applying linear discriminant analysis with leave-one-out cross-validation. The entropy of the first sinusoid formant decreased significantly in subjects with a mean pulmonary artery pressure (mPAP) ≥ 25 mmHg versus subjects with an mPAP < 25 mmHg, with a sensitivity of 84% and specificity of 88.57%, within a 10-s optimized window length for heart sounds recorded at the 2LICS. First sinusoid formant entropy reduction of heart sounds in PAH subjects suggests the existence of a vowel-like pattern. Pattern analysis revealed a unique sound signature, which could be used in non-invasive screening tools.
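
    A sketch of the stated classification step (LDA with leave-one-out cross-validation); the feature extraction is summarized by a placeholder, since the exact band powers and entropy definitions are in the paper.

```python
# Sketch: LDA with leave-one-out cross-validation over per-subject features.
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut, cross_val_score

def loo_accuracy(X, y):
    # X: heart-sound features per subject (band powers, formant energies,
    # entropies); y: 1 if mPAP >= 25 mmHg at catheterization, else 0
    return cross_val_score(LinearDiscriminantAnalysis(), X, y,
                           cv=LeaveOneOut()).mean()
```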

  1. Enlightened Use of Passive Voice in Technical Writing

    Science.gov (United States)

    Trammell, M. K.

    1981-01-01

    The passive voice as a normal, acceptable, and established syntactic form in technical writing is defended. Passive/active verb ratios, taken from sources including 'antipassivist' textbooks, are considered. The suitability of the passive voice in technical writing which involves unknown or irrelevant agents is explored. Three 'myths' that the passive (1) utilizes an abnormal and artificial word order, (2) is lifeless, and (3) is indirect are considered. Awkward and abnormal-sounding examples encountered in textbooks are addressed in terms of original context. Unattractive or incoherent passive sentences are explained in terms of inappropriate conversion from active sentences having (1) short nominal or pronominal subjects or (2) verbs with restrictions on their passive use.

  2. Explaining the high voice superiority effect in polyphonic music: evidence from cortical evoked potentials and peripheral auditory models.

    Science.gov (United States)

    Trainor, Laurel J; Marie, Céline; Bruce, Ian C; Bidelman, Gavin M

    2014-02-01

    Natural auditory environments contain multiple simultaneously-sounding objects and the auditory system must parse the incoming complex sound wave they collectively create into parts that represent each of these individual objects. Music often similarly requires processing of more than one voice or stream at the same time, and behavioral studies demonstrate that human listeners show a systematic perceptual bias in processing the highest voice in multi-voiced music. Here, we review studies utilizing event-related brain potentials (ERPs), which support the notions that (1) separate memory traces are formed for two simultaneous voices (even without conscious awareness) in auditory cortex and (2) adults show more robust encoding (i.e., larger ERP responses) to deviant pitches in the higher than in the lower voice, indicating better encoding of the former. Furthermore, infants also show this high-voice superiority effect, suggesting that the perceptual dominance observed across studies might result from neurophysiological characteristics of the peripheral auditory system. Although musically untrained adults show smaller responses in general than musically trained adults, both groups similarly show a more robust cortical representation of the higher than of the lower voice. Finally, years of experience playing a bass-range instrument reduces but does not reverse the high voice superiority effect, indicating that although it can be modified, it is not highly neuroplastic. Results of new modeling experiments examined the possibility that characteristics of middle-ear filtering and cochlear dynamics (e.g., suppression) reflected in auditory nerve firing patterns might account for the higher-voice superiority effect. Simulations show that both place and temporal AN coding schemes well-predict a high-voice superiority across a wide range of interval spacings and registers. Collectively, we infer an innate, peripheral origin for the higher-voice superiority observed in human

  3. Waveform analysis of sound

    CERN Document Server

    Tohyama, Mikio

    2015-01-01

    What is this sound? What does that sound indicate? These are two questions frequently heard in daily conversation. Sound results from the vibrations of elastic media and in daily life provides informative signals of events happening in the surrounding environment. In interpreting auditory sensations, the human ear seems particularly good at extracting the signal signatures from sound waves. Although exploring auditory processing schemes may be beyond our capabilities, source signature analysis is a very attractive area in which signal-processing schemes can be developed using mathematical expressions. This book is inspired by such processing schemes and is oriented to signature analysis of waveforms. Most of the examples in the book are taken from data of sound and vibrations; however, the methods and theories are mostly formulated using mathematical expressions rather than by acoustical interpretation. This book might therefore be attractive and informative for scientists, engineers, researchers, and graduat...

  4. Identification of Mobile Phone and Analysis of Original Version of Videos through a Delay Time Analysis of Sound Signals from Mobile Phone Videos.

    Science.gov (United States)

    Hwang, Min Gu; Har, Dong Hwan

    2017-11-01

    This study designs a method of identifying the camera model used to take videos that are distributed through mobile phones and determines the original version of the mobile phone video for use as legal evidence. For this analysis, an experiment was conducted to find the unique characteristics of each mobile phone. The videos recorded by mobile phones were analyzed to establish the delay time of sound signals, and the differences between the delay times of sound signals for different mobile phones were traced by classifying their characteristics. Furthermore, the sound input signals for mobile phone videos used as legal evidence were analyzed to ascertain whether they have the unique characteristics of the original version. The objective of this study was to find a method for validating the use of mobile phone videos as legal evidence using mobile phones through differences in the delay times of sound input signals. © 2017 American Academy of Forensic Sciences.
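
    One plausible way to expose such a delay signature (a sketch, not the study's procedure) is cross-correlating the recorded audio against a reference signal and reading off the lag of the correlation peak.

```python
# Sketch: estimate the lag of `rec` relative to `ref` by cross-correlation.
import numpy as np
from scipy.signal import correlate

def delay_seconds(ref, rec, fs):
    c = correlate(rec, ref, mode="full")
    lag = np.argmax(c) - (len(ref) - 1)   # samples by which rec lags ref
    return lag / fs
```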

  5. Sounding the Alert: Designing an Effective Voice for Earthquake Early Warning

    Science.gov (United States)

    Burkett, E. R.; Given, D. D.

    2015-12-01

    The USGS is working with partners to develop the ShakeAlert Earthquake Early Warning (EEW) system (http://pubs.usgs.gov/fs/2014/3083/) to protect life and property along the U.S. West Coast, where the highest national seismic hazard is concentrated. EEW sends an alert that shaking from an earthquake is on its way (in seconds to tens of seconds) to allow recipients or automated systems to take appropriate actions at their location to protect themselves and/or sensitive equipment. ShakeAlert is transitioning toward a production prototype phase in which test users might begin testing applications of the technology. While a subset of uses will be automated (e.g., opening fire house doors), other applications will alert individuals by radio or cellphone notifications and require behavioral decisions to protect themselves (e.g., "Drop, Cover, Hold On"). The project needs to select and move forward with a consistent alert sound to be widely and quickly recognized as an earthquake alert. In this study we combine EEW science and capabilities with an understanding of human behavior from the social and psychological sciences to provide insight toward the design of effective sounds to help best motivate proper action by alert recipients. We present a review of existing research and literature, compiled as considerations and recommendations for alert sound characteristics optimized for EEW. We do not yet address wording of an audible message about the earthquake (e.g., intensity and timing until arrival of shaking or possible actions), although it will be a future component to accompany the sound. We consider pitch(es), loudness, rhythm, tempo, duration, and harmony. Important behavioral responses to sound to take into account include that people respond to discordant sounds with anxiety, can be calmed by harmony and softness, and are innately alerted by loud and abrupt sounds, although levels high enough to be auditory stressors can negatively impact human judgment.

  6. Singer's preferred acoustic condition in performance in an opera house and self-perception of the singer's voice

    Science.gov (United States)

    Noson, Dennis; Kato, Kosuke; Ando, Yoichi

    2004-05-01

    Solo singers have been shown to overestimate the relative sound pressure level of a delayed, external reproduction of their own voice, singing single syllables, which, in turn, appears to influence the preferred delay of simulated stage reflections [Noson, Ph.D. thesis, Kobe University, 2003]. Bone conduction is thought to be one factor separating singer versus instrumental performer judgments of stage acoustics. Using a parameter derived from the vocal signal autocorrelation function (ACF envelope), the changes in singer preference for delayed reflections are primarily explained by the ACF parameter, rather than internal bone conduction. An auditory model of a singer's preferred reflection delay is proposed, combining the effects of acoustical environment (reflection amplitude), bone conduction, and performer vocal overestimate, which may be applied to the acoustic design of reflecting elements in both upstage and forestage environments of opera stages. For example, soloists who characteristically underestimate external voice levels (or overestimate their own voice) should be provided shorter distances to reflective panels, irrespective of their singing style. Adjustable elements can be deployed to adapt opera houses intended for bel canto style performances to other styles. Additional examples will also be discussed.

  7. A long distance voice transmission system based on the white light LED

    Science.gov (United States)

    Tian, Chunyu; Wei, Chang; Wang, Yulian; Wang, Dachi; Yu, Benli; Xu, Feng

    2017-10-01

    A long distance voice transmission system based on visible light communication technology (VLCT) is proposed in this paper. Our proposed system includes a transmitter, a receiver, and single-chip-microcomputer processing of the voice signal. In the compact-sized LED transmitter, we use on-off keying with non-return-to-zero coding (OOK-NRZ) to realize high-speed modulation simply, which reduces system complexity. A voice transmission system with low noise and a wide modulation band is achieved through the design of a high-efficiency receiving optical path and the use of filters to reduce noise from the surrounding light. To improve the speed of signal processing, we use a single-chip microcomputer to code and decode the voice signal. Furthermore, a serial peripheral interface (SPI) is adopted to accurately transmit the voice signal data. The test results of our proposed system show that the transmission distance of this system is more than 100 meters, with a maximum data rate of 1.5 Mbit/s and an SNR of 30 dB. This system has many advantages, such as simple construction, low cost and strong practicality. Therefore, it has extensive application prospects in the fields of emergency communication and indoor wireless communication, etc.
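
    OOK-NRZ itself is simple to state: each bit holds the LED fully on or off for one full bit period, with no return to zero mid-bit. A toy modulator follows, with example parameters rather than the paper's hardware values.

```python
# Sketch: OOK-NRZ baseband waveform (1 = LED on, 0 = LED off).
import numpy as np

def ook_nrz(bits, bit_rate=1.5e6, fs=15e6):
    samples_per_bit = int(fs / bit_rate)
    return np.repeat(np.asarray(bits, dtype=float), samples_per_bit)

# e.g. ook_nrz([1, 0, 1, 1]) -> 40 samples: 10 on, 10 off, 20 on
```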

  8. Aeroacoustics of the swinging corrugated tube : voice of the dragon

    NARCIS (Netherlands)

    Nakiboglu, G.; Rudenko, O.; Hirschberg, A.

    2012-01-01

    When one swings a short corrugated pipe segment around one’s head, it produces a musically interesting whistling sound. As a musical toy it is called a "Hummer" and as a musical instrument, the "Voice of the Dragon." The fluid dynamics aspects of the instrument are addressed, corresponding to the

  9. Fluctuations of radio occultation signals in sounding the Earth's atmosphere

    Directory of Open Access Journals (Sweden)

    V. Kan

    2018-02-01

    Full Text Available We discuss the relationships that link the observed fluctuation spectra of the amplitude and phase of signals used for the radio occultation sounding of the Earth's atmosphere with the spectra of atmospheric inhomogeneities. Our analysis employs the approximation of the phase screen and of weak fluctuations. We make our estimates for the following characteristic inhomogeneity types: (1) the isotropic Kolmogorov turbulence and (2) the anisotropic saturated internal gravity waves. We obtain the expressions for the variances of the amplitude and phase fluctuations of radio occultation signals as well as their estimates for the typical parameters of inhomogeneity models. From the GPS/MET observations, we evaluate the spectra of the amplitude and phase fluctuations in the altitude interval from 4 to 25 km in the middle and polar latitudes. As indicated by theoretical and experimental estimates, the main contribution to the radio signal fluctuations comes from the internal gravity waves. The influence of the Kolmogorov turbulence is negligible. We derive simple relationships that link the parameters of internal gravity waves and the statistical characteristics of the radio signal fluctuations. These results may serve as the basis for the global monitoring of the wave activity in the stratosphere and upper troposphere.

  10. Aeroacoustics of the swinging corrugated tube: Voice of the Dragon

    NARCIS (Netherlands)

    Nakiboglu, G.; Rudenko, O.; Hirschberg, Abraham

    2012-01-01

    When one swings a short corrugated pipe segment around one’s head, it produces a musically interesting whistling sound. As a musical toy it is called a “Hummer” and as a musical instrument, the “Voice of the Dragon.” The fluid dynamics aspects of the instrument are addressed, corresponding to the

  11. Revisiting vocal perception in non-human animals: a review of vowel discrimination, speaker voice recognition, and speaker normalization

    Directory of Open Access Journals (Sweden)

    Buddhamas Kriengwatana

    2015-01-01

    Full Text Available The extent to which human speech perception evolved by taking advantage of predispositions and pre-existing features of vertebrate auditory and cognitive systems remains a central question in the evolution of speech. This paper reviews asymmetries in vowel perception, speaker voice recognition, and speaker normalization in non-human animals – topics that have not been thoroughly discussed in relation to the abilities of non-human animals, but are nonetheless important aspects of vocal perception. Throughout this paper we demonstrate that addressing these issues in non-human animals is relevant and worthwhile because many non-human animals must deal with similar issues in their natural environment. That is, they must also discriminate between similar-sounding vocalizations, determine signaler identity from vocalizations, and resolve signaler-dependent variation in vocalizations from conspecifics. Overall, we find that, although plausible, the current evidence is insufficiently strong to conclude that directional asymmetries in vowel perception are specific to humans, or that non-human animals can use voice characteristics to recognize human individuals. However, we do find some indication that non-human animals can normalize speaker differences. Accordingly, we identify avenues for future research that would greatly improve and advance our understanding of these topics.

  12. Intra-oral pressure-based voicing control of electrolaryngeal speech with intra-oral vibrator.

    Science.gov (United States)

    Takahashi, Hirokazu; Nakao, Masayuki; Kikuchi, Yataro; Kaga, Kimitaka

    2008-07-01

    In normal speech, coordinated activities of the intrinsic laryngeal muscles suspend the glottal sound during the utterance of voiceless consonants, automatically realizing voicing control. In electrolaryngeal speech, however, the lack of voicing control is one of the causes of unclear voice, with voiceless consonants tending to be misheard as the corresponding voiced consonants. In the present work, we developed an intra-oral vibrator with an intra-oral pressure sensor that detected the utterance of voiceless phonemes during intra-oral electrolaryngeal speech, and demonstrated that an intra-oral pressure-based voicing control could improve the intelligibility of the speech. The test voices were obtained from one electrolaryngeal speaker and one normal speaker. We first investigated, using speech analysis software, how the voice onset time (VOT) and first formant (F1) transition of the test consonant-vowel syllables contributed to voiceless/voiced contrasts, and developed an adequate voicing control strategy. We then compared the intelligibility of consonant-vowel syllables in intra-oral electrolaryngeal speech with and without online voicing control. An increase in intra-oral pressure, typically with a peak ranging from 10 to 50 gf/cm2, could reliably identify the utterance of voiceless consonants. The speech analysis and intelligibility test then demonstrated that a short VOT caused misidentification of the voiced consonants due to a clear F1 transition. Finally, taking these results together, the online voicing control, which suspended the prosthetic tone while the intra-oral pressure exceeded 2.5 gf/cm2 and during the 35 milliseconds that followed, proved effective in improving the voiceless/voiced contrast.
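
    The reported control rule reduces to a small state machine, sketched here with the thresholds quoted above (2.5 gf/cm2 and a 35 ms hold); the class and its interface are hypothetical.

```python
# Sketch: intra-oral pressure-based voicing control for an electrolarynx.
class VoicingController:
    def __init__(self, thresh_gf_cm2=2.5, hold_ms=35.0):
        self.thresh = thresh_gf_cm2
        self.hold_ms = hold_ms
        self.off_until_ms = -1.0

    def tone_on(self, pressure_gf_cm2, t_ms):
        if pressure_gf_cm2 > self.thresh:     # voiceless consonant detected
            self.off_until_ms = t_ms + self.hold_ms
        return t_ms >= self.off_until_ms      # suspend tone during the hold
```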

  13. Trends in musical theatre voice: an analysis of audition requirements for singers.

    Science.gov (United States)

    Green, Kathryn; Freeman, Warren; Edwards, Matthew; Meyer, David

    2014-05-01

    The American musical theatre industry is a multibillion dollar business in which the requirements for singers are varied and complex. This study identifies the musical genres and voice requirements that are currently most requested at professional auditions to help voice teachers, pedagogues, and physicians who work with musical theatre singers understand the demands of their clients' business. Frequency count. One thousand two hundred thirty-eight professional musical theatre audition listings were gathered over a 6-month period, and information from each listing was categorized and entered into a spreadsheet for analysis. The results indicate that four main genres of music were requested over a wide variety of styles, with more than half of auditions requesting genre categories that may not be served by traditional or classical voice technique alone. To adequately prepare young musical theatre performers for the current job market and keep them healthily making the sounds required by the industry, new singing styles may need to be studied and integrated into voice training that currently teaches only classical styles. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  14. Neuromimetic Sound Representation for Percept Detection and Manipulation

    Directory of Open Access Journals (Sweden)

    Chi Taishih

    2005-01-01

    Full Text Available The acoustic wave received at the ears is processed by the human auditory system to separate different sounds along the intensity, pitch, and timbre dimensions. Conventional Fourier-based signal processing, while endowed with fast algorithms, is unable to easily represent a signal along these attributes. In this paper, we discuss the creation of maximally separable sounds in auditory user interfaces and use a recently proposed cortical sound representation, which performs a biomimetic decomposition of an acoustic signal, to represent and manipulate sound for this purpose. We briefly overview algorithms for obtaining, manipulating, and inverting a cortical representation of a sound and describe algorithms for manipulating signal pitch and timbre separately. The algorithms are also used to create sound of an instrument between a "guitar" and a "trumpet." Excellent sound quality can be achieved if processing time is not a concern, and intelligible signals can be reconstructed in reasonable processing time (about ten seconds of computational time for a one-second signal sampled at . Work on bringing the algorithms into the real-time processing domain is ongoing.

  15. Sound response of superheated drop bubble detectors to neutrons

    International Nuclear Information System (INIS)

    Gao Size; Chen Zhe; Liu Chao; Ni Bangfa; Zhang Guiying; Zhao Changfa; Xiao Caijin; Liu Cunxiong; Nie Peng; Guan Yongjing

    2012-01-01

    The sound response of bubble detectors to neutrons from a 252Cf neutron source is described. Sound signals were captured and filtered via a sound card and a PC. The short-time signal energy, FFT spectrum, power spectrum, and decay time constant were obtained to determine the authenticity of the bubble sound signals. (authors)
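
    A sketch of the listed features for a single candidate event; the decay-constant estimate is a crude log-envelope fit and is an assumption, not the authors' method.

```python
# Sketch: short-time energy, FFT spectrum, and decay time of a sound event.
import numpy as np

def bubble_features(x, fs):
    energy = np.sum(x ** 2)                        # short-time signal energy
    spectrum = np.abs(np.fft.rfft(x))              # FFT magnitude spectrum
    env = np.abs(x)
    tail = env[np.argmax(env):] + 1e-12            # envelope after the peak
    slope = np.polyfit(np.arange(len(tail)) / fs, np.log(tail), 1)[0]
    tau = -1.0 / slope                             # exponential decay constant, s
    return energy, spectrum, tau
```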

  16. Neural processing of auditory signals and modular neural control for sound tropism of walking machines

    DEFF Research Database (Denmark)

    Manoonpong, Poramate; Pasemann, Frank; Fischer, Joern

    2005-01-01

    and a neural preprocessing system together with a modular neural controller are used to generate a sound tropism of a four-legged walking machine. The neural preprocessing network is acting as a low-pass filter and it is followed by a network which discerns between signals coming from the left or the right....... The parameters of these networks are optimized by an evolutionary algorithm. In addition, a simple modular neural controller then generates the desired different walking patterns such that the machine walks straight, then turns towards a switched-on sound source, and then stops near to it....

  17. Audiovisual integration of emotional signals from music improvisation does not depend on temporal correspondence.

    Science.gov (United States)

    Petrini, Karin; McAleer, Phil; Pollick, Frank

    2010-04-06

    In the present study we applied a paradigm often used in face-voice affect perception to solo music improvisation to examine how the emotional valence of sound and gesture are integrated when perceiving an emotion. Three brief excerpts expressing emotion produced by a drummer and three by a saxophonist were selected. From these bimodal congruent displays the audio-only, visual-only, and audiovisually incongruent conditions (obtained by combining the two signals both within and between instruments) were derived. In Experiment 1 twenty musical novices judged the perceived emotion and rated the strength of each emotion. The results indicate that sound dominated the visual signal in the perception of affective expression, though this was more evident for the saxophone. In Experiment 2 a further sixteen musical novices were asked to either pay attention to the musicians' movements or to the sound when judging the perceived emotions. The results showed no effect of visual information when judging the sound. On the contrary, when judging the emotional content of the visual information, a worsening in performance was obtained for the incongruent condition that combined different emotional auditory and visual information for the same instrument. The effect of emotionally discordant information thus became evident only when the auditory and visual signals belonged to the same categorical event despite their temporal mismatch. This suggests that the integration of emotional information may be reinforced by its semantic attributes but might be independent from temporal features. Copyright 2010 Elsevier B.V. All rights reserved.

  18. Sons fricativos surdos [Voiceless fricative sounds]

    Directory of Open Access Journals (Sweden)

    Carla Aparecida Cielo

    2008-01-01

    Full Text Available BACKGROUND: characteristics of voiceless fricative sounds. PURPOSE: to review the literature related to the acoustic, phonetic and phonological characteristics of the voiceless fricative phonemes that are part of the phonological system of Portuguese, and to describe the use of these sounds in voice therapy. CONCLUSIONS: fricatives are acute phonemes comprised between 2500 and 8000 Hz; they are fully acquired by the age of 3:7 years; the /s/ is the phoneme most affected in cases of a short lingual frenum; omission of the /s/ is one of the most frequent occurrences during literacy education; and in phonological deviation and labial-palatine fissure the entire class of fricatives is frequently affected. In voice assessment, fricative sounds are used for maximum phonation time (MPT) measurements and the s/z ratio, as well as support sounds in speech therapy.

  19. The Speaker Behind The Voice: Therapeutic Practice from the Perspective of Pragmatic Theory

    Directory of Open Access Journals (Sweden)

    Felicity Deamer

    2015-06-01

    Full Text Available Many attempts at understanding auditory verbal hallucinations (AVHs) have tried to explain why there is an auditory experience in the absence of an appropriate stimulus. We suggest that many instances of voice-hearing should be approached differently. More specifically, they could be viewed primarily as hallucinated acts of communication, rather than hallucinated sounds. We suggest that this change of perspective is reflected in, and helps to explain, the successes of two recent therapeutic techniques. These two techniques are Relating Therapy for Voices and Avatar Therapy.

  20. [Evaluation of music department students who passed the entrance exam with phonetogram (Voice Range Profile)].

    Science.gov (United States)

    Gökdoğan, Çağıl; Gökdoğan, Ozan; Şahin, Esra; Yılmaz, Metin

    2014-01-01

    This study aims to evaluate the phonetogram data of students who passed the entrance exam of a department of music. The phonetogram data of 44 individuals with good voice quality in the department of music were compared with those of age-matched individuals who were neither trained in music nor involved in music as amateurs, serving as the control group. The voices of both groups were recorded using the Voice Range Profile module of the Kay Elemetrics CSL (Model 4300B) program. There were significant differences in the voice range profile parameters, including max Fo, Fo range, Fo range (St), min dB SPL, and max dB sound pressure level (p<0.05), indicating that the voice range of the students in the department of music is greater than that of the control group, which plays a major role in their acceptance to the department of music.

  1. Voice amplification as a means of reducing vocal load for elementary music teachers.

    Science.gov (United States)

    Morrow, Sharon L; Connor, Nadine P

    2011-07-01

    Music teachers are over four times more likely than classroom teachers to develop voice disorders and greater than eight times more likely to have voice-related problems than the general public. Research has shown that the individual voice-use parameters of phonation time, fundamental frequency, and vocal intensity, as well as vocal load as calculated by cycle dose and distance dose, are significantly higher for music teachers than for their classroom teacher counterparts. Finding effective and inexpensive prophylactic measures to decrease vocal load for music teachers is an important aspect of voice preservation for this group of professional voice users. The purpose of this study was to determine the effects of voice amplification on vocal intensity and vocal load in the workplace as measured using a KayPENTAX Ambulatory Phonation Monitor (APM) (KayPENTAX, Lincoln Park, NJ). Seven music teachers were monitored for 1 workweek using an APM to determine average vocal intensity (dB sound pressure level [SPL]) and vocal load as calculated by cycle dose and distance dose. Participants were monitored a second week while using a voice amplification unit (Asyst ChatterVox; Asyst Communications Company, Inc., Indian Creek, IL). Significant decreases in mean vocal intensity of 7.00 dB SPL were found with amplification, suggesting that amplification is an effective means of reducing vocal load for music teachers in the classroom. Copyright © 2011 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
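
    For reference, the dose measures named above are commonly defined as integrals over voiced time; the definitions below follow the vocal dosimetry literature generally (an assumption here, not taken from this abstract).

```python
# Sketch: cycle dose and distance dose from frame-wise dosimetry data.
import numpy as np

def vocal_doses(f0_hz, amp_m, voiced, dt_s):
    v = np.asarray(voiced, dtype=float)                    # 1 where phonating
    cycle_dose = np.sum(v * f0_hz * dt_s)                  # total vibratory cycles
    distance_dose = np.sum(v * 4 * amp_m * f0_hz * dt_s)   # vocal-fold path length, m
    return cycle_dose, distance_dose
```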

  2. Removing the Influence of Shimmer in the Calculation of Harmonics-To-Noise Ratios Using Ensemble-Averages in Voice Signals

    Directory of Open Access Journals (Sweden)

    Carlos Ferrer

    2009-01-01

    Full Text Available Harmonics-to-noise ratios (HNRs are affected by general aperiodicity in voiced speech signals. To specifically reflect a signal-to-additive-noise ratio, the measurement should be insensitive to other periodicity perturbations, like jitter, shimmer, and waveform variability. The ensemble averaging technique is a time-domain method which has been gradually refined in terms of its sensitivity to jitter and waveform variability and required number of pulses. In this paper, shimmer is introduced in the model of the ensemble average, and a formula is derived which allows the reduction of shimmer effects in HNR calculation. The validity of the technique is evaluated using synthetically shimmered signals, and the prerequisites (glottal pulse positions and amplitudes are obtained by means of fully automated methods. The results demonstrate the feasibility and usefulness of the correction.
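
    The basic ensemble-average idea is compact enough to sketch. Below is a minimal, hypothetical Python illustration, not the paper's corrected formula: pulses are amplitude-normalised to undo shimmer, averaged to estimate the periodic component, and the residual is treated as additive noise. The inputs epochs, amps, and period are assumed to come from an upstream epoch-detection stage, and the derived shimmer-correction term that constitutes the paper's contribution is not reproduced here.

        import numpy as np

        def ensemble_hnr(x, epochs, amps, period):
            """Estimate HNR by ensemble-averaging pitch-synchronous pulses.

            x      : voiced speech signal (1-D array)
            epochs : sample indices of glottal pulse onsets (assumed given)
            amps   : per-pulse amplitude estimates, used to undo shimmer
            period : nominal pulse length in samples
            """
            # Stack amplitude-normalised pulses into an ensemble matrix
            pulses = np.array([x[e:e + period] / a
                               for e, a in zip(epochs, amps)
                               if e + period <= len(x)])
            mean_pulse = pulses.mean(axis=0)     # periodic (harmonic) estimate
            noise = pulses - mean_pulse          # additive-noise residual
            p_h = np.mean(mean_pulse ** 2)
            p_n = np.mean(noise ** 2)
            return 10.0 * np.log10(p_h / p_n)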

  3. Hidden Markov Model-based Packet Loss Concealment for Voice over IP

    DEFF Research Database (Denmark)

    Rødbro, Christoffer A.; Murthi, Manohar N.; Andersen, Søren Vang

    2006-01-01

    As voice over IP proliferates, packet loss concealment (PLC) at the receiver has emerged as an important factor in determining voice quality of service. Through the use of heuristic variations of signal and parameter repetition and overlap-add interpolation to handle packet loss, conventional PLC...
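
    For context, the conventional repetition-plus-overlap-add concealment that the paper improves upon can be sketched in a few lines of Python. This is a generic baseline under assumed inputs (a list of equal-length float frames and a per-frame received flag), not the paper's HMM-based method.

        import numpy as np

        def conceal(frames, received, overlap=32):
            """Replay the last good frame for lost packets, cross-fading at seams."""
            out = []
            last = np.zeros_like(frames[0])
            ramp = np.linspace(0.0, 1.0, overlap)
            prev_ok = True
            for frame, ok in zip(frames, received):
                cur = frame.copy() if ok else last.copy()  # repetition for losses
                if out and not (ok and prev_ok):
                    # overlap-add a short cross-fade only where concealment happened
                    cur[:overlap] = (1 - ramp) * last[-overlap:] + ramp * cur[:overlap]
                out.append(cur)
                last, prev_ok = cur, ok
            return np.concatenate(out)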

  4. Comparing the experience of voices in borderline personality disorder with the experience of voices in a psychotic disorder: A systematic review.

    Science.gov (United States)

    Merrett, Zalie; Rossell, Susan L; Castle, David J

    2016-07-01

    In clinical settings, there is substantial evidence, both clinical and empirical, to suggest that approximately 50% of individuals with borderline personality disorder experience auditory verbal hallucinations. However, there is limited research investigating the phenomenology of these voices. The aim of this study was to review and compare our current understanding of auditory verbal hallucinations in borderline personality disorder with auditory verbal hallucinations in patients with a psychotic disorder, to critically analyse existing studies investigating auditory verbal hallucinations in borderline personality disorder, and to identify gaps in current knowledge, which will help direct future research. The literature was searched using the electronic databases Scopus, PubMed, and MEDLINE. Relevant studies were included if they were written in English, were empirical studies specifically addressing auditory verbal hallucinations and borderline personality disorder, were peer reviewed, used only adult humans and a sample comprising borderline personality disorder as the primary diagnosis, and included a comparison group with a primary psychotic disorder such as schizophrenia. Our search strategy revealed a total of 16 articles investigating the phenomenology of auditory verbal hallucinations in borderline personality disorder. Some studies provided evidence to suggest that the voice experiences in borderline personality disorder are similar to those experienced by people with schizophrenia; for example, they occur inside the head and often involve persecutory voices. Other studies revealed some differences between schizophrenia and borderline personality disorder voice experiences, with the borderline personality disorder voices sounding more derogatory and self-critical in nature and the voice-hearers' responses to the voices being more emotionally resistive. Furthermore, in one study, the schizophrenia group's voices resulted in more disruption in daily functioning.

  5. Speech enhancement on smartphone voice recording

    International Nuclear Information System (INIS)

    Atmaja, Bagus Tris; Farid, Mifta Nur; Arifianto, Dhany

    2016-01-01

    Speech enhancement is a challenging task in audio signal processing: enhancing the quality of a targeted speech signal while suppressing other noises. Speech enhancement algorithms have evolved rapidly, from spectral subtraction and Wiener filtering through the spectral amplitude MMSE estimator to non-negative matrix factorization (NMF). The smartphone, now a revolutionary device used in all aspects of life including journalism, both personally and professionally, typically records voice through a single channel: although many smartphones have two microphones (main and rear), only the main microphone is widely used for voice recording, which makes single-channel algorithms such as NMF well suited to this purpose. This paper evaluates speech enhancement on smartphone voice recordings using the algorithms mentioned previously. We also extend the NMF algorithm to Kullback-Leibler NMF with supervised separation. The last algorithm shows improved results compared to the others in spectrogram and PESQ score evaluation. (paper)
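
    A minimal sketch of the supervised Kullback-Leibler NMF separation described above, assuming magnitude spectrograms V_speech, V_noise, and V_mix have been computed beforehand; the rank and the Wiener-style mask are illustrative choices, not the paper's exact configuration.

        import numpy as np

        def kl_nmf(V, rank, n_iter=200, W=None, eps=1e-10):
            """KL-divergence NMF via multiplicative updates; W may be fixed (supervised)."""
            F, T = V.shape
            rng = np.random.default_rng(0)
            learn_W = W is None
            W = rng.random((F, rank)) + eps if learn_W else W
            H = rng.random((rank, T)) + eps
            for _ in range(n_iter):
                WH = W @ H + eps
                H *= (W.T @ (V / WH)) / (W.T @ np.ones_like(V) + eps)
                if learn_W:
                    WH = W @ H + eps
                    W *= ((V / WH) @ H.T) / (np.ones_like(V) @ H.T + eps)
            return W, H

        # Supervised separation: bases learned offline from clean speech and noise,
        # then kept fixed while only the activations are inferred on the mixture.
        # W_s, _ = kl_nmf(V_speech, rank=32)
        # W_n, _ = kl_nmf(V_noise, rank=32)
        # W = np.hstack([W_s, W_n])
        # _, H = kl_nmf(V_mix, rank=64, W=W)
        # S_hat = (W_s @ H[:32]) / (W @ H + 1e-10) * V_mix   # Wiener-style mask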

  6. "Ring" in the solo child singing voice.

    Science.gov (United States)

    Howard, David M; Williams, Jenevora; Herbst, Christian T

    2014-03-01

    Listeners often describe the voices of solo child singers as being "pure" or "clear"; these terms would suggest that the voice is not only pleasant but also clearly audible. The audibility or clarity could be attributed to the presence of high-frequency partials in the sound: a "brightness" or "ring." This article aims to investigate spectrally the acoustic nature of this ring phenomenon in children's solo voices, and in particular, relating it to their "nonring" production. Additionally, this is set in the context of establishing to what extent, if any, the spectral characteristics of ring are shared with those of the singer's formant cluster associated with professional adult opera singers in the 2.5-3.5 kHz region. A group of child solo singers, acknowledged as outstanding by a singing teacher who specializes in teaching professional child singers, were recorded in a major UK concert hall performing Come unto him, all ye that labour, from the aria He shall feed his flock from The Messiah by GF Handel. Their singing was accompanied by a recording of a piano played through in-ear headphones. Sound pressure recordings were made from well within the critical distance in the hall. The singers were observed to produce notes with and without ring, and these recordings were analyzed in the frequency domain to investigate their spectra. The results indicate that there is evidence to suggest that ring in child solo singers is carried in two areas of the output spectrum: first in the singer's formant cluster region, centered around 4 kHz, which is more than 1000 Hz higher than what is observed in adults; and second in the region around 7.5-11 kHz where a significant strengthening of harmonic presence is observed. A perceptual test has been carried out demonstrating that 94% of 62 listeners label a synthesized version of the calculated overall average ring spectrum for all subjects as having ring when compared with a synthesized version of the calculated overall average nonring

  7. Implicit multisensory associations influence voice recognition.

    Directory of Open Access Journals (Sweden)

    Katharina von Kriegstein

    2006-10-01

    Full Text Available Natural objects provide partially redundant information to the brain through different sensory modalities. For example, voices and faces both give information about the speech content, age, and gender of a person. Thanks to this redundancy, multimodal recognition is fast, robust, and automatic. In unimodal perception, however, only part of the information about an object is available. Here, we addressed whether, even under conditions of unimodal sensory input, crossmodal neural circuits that have been shaped by previous associative learning become activated and underpin a performance benefit. We measured brain activity with functional magnetic resonance imaging before, while, and after participants learned to associate either sensory redundant stimuli, i.e. voices and faces, or arbitrary multimodal combinations, i.e. voices and written names, ring tones, and cell phones or brand names of these cell phones. After learning, participants were better at recognizing unimodal auditory voices that had been paired with faces than those paired with written names, and association of voices with faces resulted in an increased functional coupling between voice and face areas. No such effects were observed for ring tones that had been paired with cell phones or names. These findings demonstrate that brief exposure to ecologically valid and sensory redundant stimulus pairs, such as voices and faces, induces specific multisensory associations. Consistent with predictive coding theories, associative representations become thereafter available for unimodal perception and facilitate object recognition. These data suggest that for natural objects effective predictive signals can be generated across sensory systems and proceed by optimization of functional connectivity between specialized cortical sensory modules.

  8. Signal-to-background ratio preferences of normal-hearing listeners as a function of music

    Science.gov (United States)

    Barrett, Jillian Gallant

    The purpose of this study was to identify listeners' signal-to-background-ratio (SBR) preference levels for vocal music and to investigate whether or not SBR differences existed for different music genres. The "signal" was the singer's voice, and the "background" was the accompanying music. Three songs were each produced in two different genres (total of 6 genres represented). Each song was performed by three male and three female singers. Analyses addressed influences of musical genre, singing style, and singer timbre on listeners' SBR choices. Fifty-three normal-hearing California State University of Northridge students ranging in age from 20 to 52 years participated as subjects. Subjects adjusted the overall music loudness to a comfortable listening level, and manipulated a second gain control which affected only the singer's voice. Subjects listened to 72 stimuli and adjusted the singer's voice to the level they felt sounded appropriate in comparison to the background music. Singer and Genre were the two primary contributors to significant differences in subjects' SBR preferences, although the results clearly indicate Genre, Style and Singer interact in different combinations under different conditions. SBR differences for each song, each singer, and each subject did not occur in a predictable manner, and support the hypothesis that SBR preferences are neither fixed nor dependent merely upon music application or setting. Further investigations regarding psychoacoustical bases responsible for differences in SBR preferences are warranted.

  9. Design of Efficient Sound Systems for Low Voltage Battery Driven Applications

    DEFF Research Database (Denmark)

    Iversen, Niels Elkjær; Oortgiesen, Rien; Knott, Arnold

    2016-01-01

    The efficiency of portable battery driven sound systems is crucial as it relates to both the playback time and cost of the system. This paper presents design considerations when designing such systems. This includes loudspeaker and amplifier design. Using a low resistance voice coil realized......

  10. Voice Disorder Classification Based on Multitaper Mel Frequency Cepstral Coefficients Features

    Directory of Open Access Journals (Sweden)

    Ömer Eskidere

    2015-01-01

    Full Text Available The Mel Frequency Cepstral Coefficients (MFCCs) are widely used to extract essential information from a voice signal and have become a popular feature extractor in audio processing. However, MFCC features are usually calculated from a single window (taper), which is characterized by large variance. This study investigates reducing that variance for the classification of two different voice qualities (normal voice and disordered voice) using multitaper MFCC features. We also compare the performance of newly proposed windowing techniques with the conventional single-taper technique. The results demonstrate that the adapted weighted Thomson multitaper method distinguishes between normal voice and disordered voice better than the conventional single-taper (Hamming window) technique and the two newly proposed windowing methods. The multitaper MFCC features may be helpful in identifying voices at risk for a real pathology that has to be confirmed later.
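
    The variance-reduction step is easy to illustrate. The sketch below, under stated assumptions, replaces the single Hamming-window periodogram of a standard MFCC front end with an eigenvalue-weighted average of DPSS (Slepian) eigenspectra; the adaptive Thomson weighting evaluated in the study is approximated here by fixed eigenvalue weights, and the mel filterbank and DCT stages of the MFCC pipeline are unchanged.

        import numpy as np
        from scipy.signal.windows import dpss

        def multitaper_spectrum(frame, n_tapers=6, nw=4.0):
            """Average eigenspectra from DPSS tapers to reduce spectral variance.

            Drop-in replacement for the single-window periodogram in an MFCC
            front end; n_tapers and nw are assumed tuning choices.
            """
            tapers, ratios = dpss(len(frame), nw, Kmax=n_tapers, return_ratios=True)
            spectra = np.abs(np.fft.rfft(tapers * frame, axis=1)) ** 2
            weights = ratios / ratios.sum()   # eigenvalue-weighted combination
            return (weights[:, None] * spectra).sum(axis=0)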

  11. Enhanced Living by Assessing Voice Pathology Using a Co-Occurrence Matrix.

    Science.gov (United States)

    Muhammad, Ghulam; Alhamid, Mohammed F; Hossain, M Shamim; Almogren, Ahmad S; Vasilakos, Athanasios V

    2017-01-29

    A large number of the population around the world suffers from various disabilities. Disabilities affect not only children but also adults of different professions. Smart technology can assist the disabled population and lead to a comfortable life in an enhanced living environment (ELE). In this paper, we propose an effective voice pathology assessment system that works in a smart home framework. The proposed system takes input from various sensors, and processes the acquired voice signals and electroglottography (EGG) signals. Co-occurrence matrices in different directions and neighborhoods from the spectrograms of these signals were obtained. Several features such as energy, entropy, contrast, and homogeneity from these matrices were calculated and fed into a Gaussian mixture model-based classifier. Experiments were performed with a publicly available database, namely, the Saarbrucken voice database. The results demonstrate the feasibility of the proposed system in light of its high accuracy and speed. The proposed system can be extended to assess other disabilities in an ELE.
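
    As a rough illustration of the feature pipeline, the hypothetical Python sketch below quantises a spectrogram, builds grey-level co-occurrence matrices in several directions and neighbourhoods, and extracts energy, contrast, homogeneity, and entropy; the quantisation depth and the distance/angle sets are assumptions, and the resulting vector would feed the Gaussian mixture model classifier described in the paper.

        import numpy as np
        from skimage.feature import graycomatrix, graycoprops

        def cooccurrence_features(spec_db, levels=32):
            """Texture features from a spectrogram's co-occurrence matrices."""
            # Quantise the dB-scaled spectrogram to a small number of grey levels
            q = np.digitize(spec_db, np.linspace(spec_db.min(), spec_db.max(), levels)) - 1
            q = q.clip(0, levels - 1).astype(np.uint8)
            # GLCMs for several directions and neighbourhood distances, normalised
            glcm = graycomatrix(q, distances=[1, 2],
                                angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                                levels=levels, symmetric=True, normed=True)
            feats = [graycoprops(glcm, p).ravel()
                     for p in ("energy", "contrast", "homogeneity")]
            # Entropy is not provided by graycoprops, so compute it directly
            p = glcm / (glcm.sum(axis=(0, 1), keepdims=True) + 1e-12)
            entropy = -(p * np.log2(p + 1e-12)).sum(axis=(0, 1)).ravel()
            return np.concatenate(feats + [entropy])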

  12. Depressed mothers' infants are less responsive to faces and voices.

    Science.gov (United States)

    Field, Tiffany; Diego, Miguel; Hernandez-Reif, Maria

    2009-06-01

    A review of our recent research suggests that infants of depressed mothers appear to be less responsive to faces and voices as early as the neonatal period. At that time they showed less orienting to the live face/voice stimulus of the Brazelton scale examiner and to their own and other infants' cry sounds. This lesser responsiveness has been attributed to higher arousal, less attentiveness and less "empathy." Their delayed heart rate decelerations to instrumental and vocal music sounds have also been ascribed to their delayed attention and/or slower processing. Later, at 3-6 months, they showed less negative responding to their mothers' non-contingent and still-face behavior, suggesting that they were more accustomed to this behavior in their mothers. The less responsive behavior of the depressed mothers was further compounded by their comorbid mood states of anger and anxiety, their difficult interaction styles, including withdrawn or intrusive styles, and their later authoritarian parenting style. Pregnancy massage was effectively used to reduce prenatal depression and facilitate more optimal neonatal behavior. Interaction coaching was used during the postnatal period to help these dyads with their interactions and ultimately facilitate the infants' development.

  13. Vocal Imitations of Non-Vocal Sounds

    Science.gov (United States)

    Houix, Olivier; Voisin, Frédéric; Misdariis, Nicolas; Susini, Patrick

    2016-01-01

    Imitative behaviors are widespread in humans, in particular whenever two persons communicate and interact. Several tokens of spoken languages (onomatopoeias, ideophones, and phonesthemes) also display different degrees of iconicity between the sound of a word and what it refers to. Thus, it probably comes at no surprise that human speakers use a lot of imitative vocalizations and gestures when they communicate about sounds, as sounds are notably difficult to describe. What is more surprising is that vocal imitations of non-vocal everyday sounds (e.g. the sound of a car passing by) are in practice very effective: listeners identify sounds better with vocal imitations than with verbal descriptions, despite the fact that vocal imitations are inaccurate reproductions of a sound created by a particular mechanical system (e.g. a car driving by) through a different system (the voice apparatus). The present study investigated the semantic representations evoked by vocal imitations of sounds by experimentally quantifying how well listeners could match sounds to category labels. The experiment used three different types of sounds: recordings of easily identifiable sounds (sounds of human actions and manufactured products), human vocal imitations, and computational “auditory sketches” (created by algorithmic computations). The results show that performance with the best vocal imitations was similar to the best auditory sketches for most categories of sounds, and even to the referent sounds themselves in some cases. More detailed analyses showed that the acoustic distance between a vocal imitation and a referent sound is not sufficient to account for such performance. Analyses suggested that instead of trying to reproduce the referent sound as accurately as vocally possible, vocal imitations focus on a few important features, which depend on each particular sound category. These results offer perspectives for understanding how human listeners store and access long

  14. Sound Zones

    DEFF Research Database (Denmark)

    Møller, Martin Bo; Olsen, Martin

    2017-01-01

    Sound zones, i.e. spatially confined regions of individual audio content, can be created by appropriate filtering of the desired audio signals reproduced by an array of loudspeakers. The challenge of designing filters for sound zones is twofold: First, the filtered responses should generate...... an acoustic separation between the control regions. Secondly, the pre- and post-ringing as well as spectral deterioration introduced by the filters should be minimized. The tradeoff between acoustic separation and filter ringing is the focus of this paper. A weighted L2-norm penalty is introduced in the sound...
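
    In its simplest frequency-domain form, the filter design reduces to regularised least squares per frequency bin, as in the hypothetical sketch below; the scalar penalty lam stands in for the weighted L2-norm penalty the paper uses to trade acoustic separation against pre- and post-ringing of the filters.

        import numpy as np

        def pressure_matching(G_bright, G_dark, p_target, lam=1e-2):
            """Loudspeaker weights for one sound zone at a single frequency.

            G_bright : (Mb x L) transfer functions to bright-zone control points
            G_dark   : (Md x L) transfer functions to dark-zone control points
            p_target : (Mb,)    desired pressures in the bright zone
            lam      : L2 penalty limiting array effort (and filter ringing)
            """
            G = np.vstack([G_bright, G_dark])
            p = np.concatenate([p_target, np.zeros(G_dark.shape[0])])
            A = G.conj().T @ G + lam * np.eye(G.shape[1])
            return np.linalg.solve(A, G.conj().T @ p)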

  15. A Wireless LAN and Voice Information System for Underground Coal Mine

    Directory of Open Access Journals (Sweden)

    Yu Zhang

    2014-06-01

    Full Text Available In this paper we constructed a wireless information system and developed a wireless voice communication subsystem based on Wireless Local Area Networks (WLAN) for underground coal mines, which employs Voice over IP (VoIP) technology and the Session Initiation Protocol (SIP) to achieve wireless voice dispatching communications. The master control voice dispatching interface and call terminal software were also developed on the WLAN ground server side to manage and implement the voice dispatching communication. A testing system for voice communication was constructed in the tunnels of an underground coal mine and used to test the wireless voice communication subsystem via a network analysis tool named Clear Sight Analyzer. In the tests, the actual flow charts of registration, call establishment and call removal were analyzed by capturing the call signaling of SIP terminals, and the key performance indicators were evaluated in the coal mine, including average subjective voice quality, packet loss rate, delay jitter, packet disorder and end-to-end delay. Experimental results and analysis demonstrate that the wireless voice communication subsystem communicates well in the underground coal mine environment, achieving the designed function of voice dispatching communication.

  16. Environmental Sound Recognition Using Time-Frequency Intersection Patterns

    Directory of Open Access Journals (Sweden)

    Xuan Guo

    2012-01-01

    Full Text Available Environmental sound recognition is an important function of robots and intelligent computer systems. In this research, we use a multistage perceptron neural network system for environmental sound recognition. The input data is a combination of time-variance pattern of instantaneous powers and frequency-variance pattern with instantaneous spectrum at the power peak, referred to as a time-frequency intersection pattern. Spectra of many environmental sounds change more slowly than those of speech or voice, so the intersectional time-frequency pattern will preserve the major features of environmental sounds but with drastically reduced data requirements. Two experiments were conducted using an original database and an open database created by the RWCP project. The recognition rate for 20 kinds of environmental sounds was 92%. The recognition rate of the new method was about 12% higher than methods using only an instantaneous spectrum. The results are also comparable with HMM-based methods, although those methods need to treat the time variance of an input vector series with more complicated computations.
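
    A hypothetical sketch of the described feature: the time-variance pattern of instantaneous power and the spectrum at the power peak, resampled to fixed lengths and concatenated into one input vector for the multistage perceptron; the frame size and vector lengths are assumptions, not the paper's settings.

        import numpy as np
        from scipy.signal import stft

        def tf_intersection_pattern(x, fs, n_time=64, n_freq=64):
            """Time pattern of instantaneous power + spectrum at the power peak."""
            f, t, Z = stft(x, fs=fs, nperseg=512)
            power = (np.abs(Z) ** 2).sum(axis=0)      # instantaneous power per frame
            spectrum = np.abs(Z[:, power.argmax()])   # spectrum at the power peak
            # Resample both curves to fixed lengths and concatenate into one vector
            tp = np.interp(np.linspace(0, len(power) - 1, n_time),
                           np.arange(len(power)), power)
            fp = np.interp(np.linspace(0, len(spectrum) - 1, n_freq),
                           np.arange(len(spectrum)), spectrum)
            tp /= tp.max() + 1e-12
            fp /= fp.max() + 1e-12
            return np.concatenate([tp, fp])   # input vector for an MLP classifier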

  17. Fetus Sound Stimulation: Cilia Memristor Effect of Signal Transduction

    Directory of Open Access Journals (Sweden)

    Svetlana Jankovic-Raznatovic

    2014-01-01

    Full Text Available Background. This experimental study evaluates fetal middle cerebral artery (MCA) circulation after defined prenatal acoustical stimulation (PAS) and the role of cilia in hearing and memory, and could explain signal transduction and memory in terms of the optical-acoustical properties of cilia. Methods. PAS was performed twice on 119 no-risk term pregnancies. We analyzed fetal MCA circulation before, after the first, and after the second PAS. Results. Comparison of the basic pulsatility index (PIB) before PAS with the reactive pulsatility index after the first PAS (PIR 1) shows a highly statistically significant difference, representing a strong influence on brain circulation. Comparison of PIB with the reactive pulsatility index after the second PAS (PIR 2) shows no statistical difference. Cilia as nanoscale structures possess a magnetic flux linkage that depends on the amount of charge that has passed between the two terminals of the cilia's variable resistors. Microtubule resistance, as a function of the current through and voltage across the structure, leads to the appearance of cilia memory with the "memristor" property. Conclusion. The acoustical and optical properties of cilia play a crucial role in hearing and memory processes. We suggest that fetuses become accustomed to sound, developing a kind of memory pattern involving acoustical and electromagnetic waves, cilia, and microtubules, and we attempt to explain the signal transduction involved.

  18. Site-Specific Soundscape Design for the Creation of Sonic Architectures and the Emergent Voices of Buildings

    Directory of Open Access Journals (Sweden)

    Jordan Lacey

    2014-01-01

    Full Text Available Does a building contain its own Voice? And if so, can that Voice be discovered, transformed and augmented by soundscape design? Barry Blesser's writings on acoustic space discuss reverberation and resonant frequencies as providing architectural spaces with characteristic listening conditions related to their dimensions and materiality. The paper argues that Blesser and Salter expand such discussion into pantheistic speculation when suggesting that humanity contains the imaginative capacity to experience spaces as "living spirits". This argument is developed by building on that speculation through the discussion of a soundscape design methodology that considers space as containing pantheistic qualities. Sonic architectures are created with electroacoustic sound installations that recompose existing architectural soundscapes, to create the conditions for the emergence of the Voices of buildings. This paper describes two soundscape designs, Revoicing the Striated Soundscape and Subterranean Voices, which transformed existing architectural soundscapes for the emergence of Voices in a laneway and a building located in the City of Melbourne, Australia.

  19. Protective Strategies Against Dysphonia in Teachers: Preliminary Results Comparing Voice Amplification and 0.9% NaCl Nebulization.

    Science.gov (United States)

    Masson, Maria Lúcia Vaz; de Araújo, Tânia Maria

    2018-03-01

    This study aimed to compare the effects of two protective strategies, voice amplification (VA) and 0.9% NaCl nebulization (NEB), on teachers' voice in the work setting. An interventional evaluator-blind study was conducted, assigning 53 teachers from two public high schools to one of the two protective strategy groups (VA or NEB). Vocal function was assessed in a sound-treated booth before and after a 4-week period. Assessment included the severity of voice impairment (Consensus Auditory-Perceptual Evaluation of Voice [CAPE-V]), acoustic analysis of fundamental frequency (f0), sound pressure level (SPL), jitter, shimmer, glottal-to-noise excitation ratio (GNE), noise (VoxMetria), and the self-rated Screening Index for Voice Disorder (SIVD). Data were statistically analyzed using SPSS Statistics (version 22) with a significance level of P ≤ 0.05. Effect size was calculated using Cohen's d coefficient. There were no statistical differences between groups at baseline in terms of age, sex, time of teaching, teaching workload, and voice outcomes, except for SPL. During postintervention between groups, NEB displayed lower SIVD scores (VA = 3; NEB = 0; P = 0.018) and VA had lower acoustic irregularity (VA = 3.19; NEB = 3.69; P = 0.027), with moderate to large effect size. Postintervention within-groups decreased CAPE-V for VA (pretest = 31.97; posttest = 28.24; P = 0.021) and SIVD for NEB (pretest = 3; posttest = 0; P = 0.001). SPL decreased in both groups, NEB decreased in men only, and VA decreased in both men and women. NEB increased f0 for female participants (P ≤ 0.001). Both VA and NEB may help mitigate dysphonia in different pathways, being potential interventions for protecting teachers' voices in the work setting. An ongoing study with a control group will further support these preliminary results. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  20. Anti-voice adaptation suggests prototype-based coding of voice identity

    Directory of Open Access Journals (Sweden)

    Marianne eLatinus

    2011-07-01

    Full Text Available We used perceptual aftereffects induced by adaptation with anti-voice stimuli to investigate voice identity representations. Participants learned a set of voices and then were tested on a voice identification task with vowel stimuli morphed between identities, after different conditions of adaptation. In Experiment 1, participants chose the identity opposite to the adapting anti-voice significantly more often than the other two identities (e.g., after being adapted to anti-A, they identified the average voice as A). In Experiment 2, participants showed a bias for identities opposite to the adaptor specifically for anti-voice adaptors, but not for non-anti-voice adaptors. These results are strikingly similar to adaptation aftereffects observed for facial identity. They are compatible with a representation of individual voice identities in a multidimensional perceptual voice space referenced on a voice prototype.

  1. Songbirds use pulse tone register in two voices to generate low-frequency sound

    DEFF Research Database (Denmark)

    Jensen, Kenneth Kragh; Cooper, Brenton G.; Larsen, Ole Næsbye

    2007-01-01

    , the syrinx, is unknown. We present the first high-speed video records of the intact syrinx during induced phonation. The syrinx of anaesthetized crows shows a vibration pattern of the labia similar to that of the human vocal fry register. Acoustic pulses result from short opening of the labia, and pulse...... generation alternates between the left and right sound sources. Spontaneously calling crows can also generate similar pulse characteristics with only one sound generator. Airflow recordings in zebra finches and starlings show that pulse tone sounds can be generated unilaterally, synchronously...

  2. The auditory dorsal stream plays a crucial role in projecting hallucinated voices into external space

    NARCIS (Netherlands)

    Looijestijn, Jasper; Diederen, Kelly M. J.; Goekoop, Rutger; Sommer, Iris E. C.; Daalman, Kirstin; Kahn, Rene S.; Hoek, Hans W.; Blom, Jan Dirk

    Introduction: Verbal auditory hallucinations (VAHs) are experienced as spoken voices which seem to originate in the extracorporeal environment or inside the head. Animal and human research has identified a 'where' pathway for sound processing comprising the planum temporale, the middle frontal gyrus

  3. Collaboration and conquest: MTD as viewed by voice teacher (singing voice specialist) and speech-language pathologist.

    Science.gov (United States)

    Goffi-Fynn, Jeanne C; Carroll, Linda M

    2013-05-01

    This study was designed as a qualitative case study to demonstrate the process of diagnosis and treatment by a voice team managing a singer diagnosed with muscular tension dysphonia (MTD). Traditionally, the literature suggests that MTD is challenging to treat, and little in the literature directly addresses singers with MTD. Data collected included an initial medical screening with a laryngologist, referral to a speech-language pathologist (SLP) specializing in voice disorders among singers, and adjunctive voice training with a voice teacher trained in vocology (a singing voice specialist, or SVS). Initial target goals with the SLP included reducing extrinsic laryngeal tension, using a relaxed laryngeal posture, and effective abdominal-diaphragmatic support for all phonation events. Balance of respiratory forces, laryngeal coordination, and optimum filtering of the source signal through resonance and articulatory awareness were emphasized. Further work with the SVS addressed three main goals: a lowered breathing pattern to aid in decreasing subglottic air pressure, allowing the vertical laryngeal position to lower into a relaxed posture, and a top-down singing approach to encourage an easier, more balanced registration and better resonance. Initial results also emphasize retraining the subject toward a sensory rather than auditory mode of monitoring. Other areas of consideration include singers' training and vocal use, the psychological effects of MTD, the personalities potentially associated with it, and its relationship with stress. Finally, the results emphasize that a positive rapport with the subject and collaboration between all professionals involved in a singer's care are essential for recovery. Copyright © 2013 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  4. Comparison of acoustic voice characteristics in smoking and nonsmoking teachers

    Directory of Open Access Journals (Sweden)

    Šehović Ivana

    2012-01-01

    Full Text Available The voice of a vocal professional is exposed to great demands, with a high probability of voice alterations. Smoking, allergies and respiratory infections greatly affect the voice and can change its acoustic characteristics. In smokers, vocal fold mass increases, resulting in changes in the vocal fold vibratory cycle. Pathological changes of the vocal folds deform the acoustic signal and affect voice production. As vocal professionals, teachers are much more affected by voice disorders than average speakers. The aim of this study was to examine the differences in acoustic parameters of voice between smoking and nonsmoking teachers, in a sample of vocal professionals. The sample consisted of 60 female subjects, aged 25 to 59. For voice analysis we used the Kay Elemetrics Computerized Speech Lab (CSL), Model 4300. The statistical significance of differences in the values of acoustic parameters between smokers and nonsmokers was tested by ANOVA. Results showed that in this sample of female teachers, professional use of voice combined with the smoking habit can be linked to changes in voice parameters. Comparing smokers and nonsmokers, average values of the parameters of short-term and long-term frequency and amplitude perturbation proved to be significantly different.

  5. Using voice input and audio feedback to enhance the reality of a virtual experience

    Energy Technology Data Exchange (ETDEWEB)

    Miner, N.E.

    1994-04-01

    Virtual Reality (VR) is a rapidly emerging technology which allows participants to experience a virtual environment through stimulation of the participant's senses. Intuitive and natural interactions with the virtual world help to create a realistic experience. Typically, a participant is immersed in a virtual environment through the use of a 3-D viewer. Realistic, computer-generated environment models and accurate tracking of a participant's view are important factors for adding realism to a virtual experience. Stimulating a participant's sense of sound and providing a natural form of communication for interacting with the virtual world are equally important. This paper discusses the advantages and importance of incorporating voice recognition and audio feedback capabilities into a virtual world experience. Various approaches and levels of complexity are discussed. Examples of the use of voice and sound are presented through the description of a research application developed in the VR laboratory at Sandia National Laboratories.

  6. The Study of MSADQ/CDMA Protocol in Voice/Data Integration Packet Networks

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    A new packet medium access protocol, namely minislot signaling access based on distributed queues (MSADQ/CDMA), is proposed for voice and data integration in CDMA networks. The MSADQ protocol is based on distributed queues and a collision resolution algorithm. Through proper management of the PN codes, the number of random competition collisions reduces greatly and the multiple access interference (MAI) decreases. It has several special access signaling channels to carry the voice and data access requests. Each slot is divided into several control minislots (CMSs), in which the Data Terminals (DTs) or Voice Terminals (VTs) transmit their requests. According to the voice and data traffic characteristics, the signaling access structure is proposed. Code assignment rules and queue management rules are also proposed to ensure the QoS requirements of each traffic type. Comparisons with three other protocols are developed by simulation, which shows that the MSADQ/CDMA protocol occupies fewer PN codes but still has very good performance.

  7. Effects of melody and technique on acoustical and musical features of western operatic singing voices.

    Science.gov (United States)

    Larrouy-Maestri, Pauline; Magis, David; Morsomme, Dominique

    2014-05-01

    The operatic singing technique is frequently used in classical music. Several acoustical parameters of this specific technique have been studied but how these parameters combine remains unclear. This study aims to further characterize the Western operatic singing technique by observing the effects of melody and technique on acoustical and musical parameters of the singing voice. Fifty professional singers performed two contrasting melodies (popular song and romantic melody) with two vocal techniques (with and without operatic singing technique). The common quality parameters (energy distribution, vibrato rate, and extent), perturbation parameters (standard deviation of the fundamental frequency, signal-to-noise ratio, jitter, and shimmer), and musical features (fundamental frequency of the starting note, average tempo, and sound pressure level) of the 200 sung performances were analyzed. The results regarding the effect of melody and technique on the acoustical and musical parameters show that the choice of melody had a limited impact on the parameters observed, whereas a particular vocal profile appeared depending on the vocal technique used. This study confirms that vocal technique affects most of the parameters examined. In addition, the observation of quality, perturbation, and musical parameters contributes to a better understanding of the Western operatic singing technique. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  8. Voice Quality and Gender Stereotypes: A Study of Lebanese Women with Reinke's Edema

    Science.gov (United States)

    Matar, Nayla; Portes, Cristel; Lancia, Leonardo; Legou, Thierry; Baider, Fabienne

    2016-01-01

    Purpose Women with Reinke's edema (RW) report being mistaken for men during telephone conversations. For this reason, their masculine-sounding voices are interesting for the study of gender stereotypes. The study's objective is to verify their complaint and to understand the cues used in gender identification. Method Using a self-evaluation study,…

  9. Effects of flow gradients on directional radiation of human voice.

    Science.gov (United States)

    Pulkki, Ville; Lähivaara, Timo; Huhtakallio, Ilkka

    2018-02-01

    In voice communication in windy outdoor conditions, complex velocity gradients appear in the flow field around the source, the receiver, and in the atmosphere. It is commonly known that voice emanates more strongly towards the downstream direction than the upstream direction. In the literature, atmospheric effects are used to explain the stronger emanation in the downstream direction. This work shows that the wind also affects the directivity of the voice itself, again favouring the downstream direction. The effect is addressed by measurements and simulations. Laboratory measurements are conducted using a large pendulum with a loudspeaker mimicking the human head, whereas practical measurements utilizing the human voice are realized by placing a subject through the roof window of a moving car. The measurements and a simulation indicate congruent results in the speech frequency range: when the source faces the downstream direction, stronger radiation coinciding with the wind direction is observed, and when it faces the upstream direction, radiation is not affected notably. The simulated flow gradients show a wake region in the downstream direction, and the simulated acoustic field in the flow shows that this region causes a waveguide effect focusing the sound in the downstream direction.

  10. Comparison of voice-use profiles between elementary classroom and music teachers.

    Science.gov (United States)

    Morrow, Sharon L; Connor, Nadine P

    2011-05-01

    Among teachers, music teachers are roughly four times more likely than classroom teachers to develop voice-related problems. Although it has been established that music teachers use their voices at high intensities and durations in the course of their workday, voice-use profiles concerning the amount and intensity of vocal use and vocal load have neither been quantified nor has vocal load for music teachers been compared with classroom teachers using these same voice-use parameters. In this study, total phonation time, fundamental frequency (F0), and vocal intensity (dB SPL [sound pressure level]) were measured or estimated directly using a KayPENTAX Ambulatory Phonation Monitor (KayPENTAX, Lincoln Park, NJ). Vocal load was calculated as cycle and distance dose, as defined by Švec et al (2003), which integrates total phonation time, F0, and vocal intensity. Twelve participants (n = 7 elementary music teachers and n = 5 elementary classroom teachers) were monitored during five full teaching days of one workweek to determine the average vocal load for these two groups of teachers. Statistically significant differences in all measures were found between the two groups (P < 0.05), indicating that vocal loads for music teachers are substantially higher than those experienced by classroom teachers. Reducing vocal load may have immediate clinical and educational benefits for vocal health in music teachers. Copyright © 2011 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
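
    Following the definitions of Švec et al (2003), the cycle dose accumulates vocal-fold vibration cycles and the distance dose accumulates vocal-fold travel. A minimal sketch, assuming framewise F0, a vocal-fold amplitude estimate derived from SPL by the empirical rules in that paper (abstracted here as a given input), and a voicing mask:

        import numpy as np

        def vocal_doses(f0, amplitude, voiced, frame_dur):
            """Cycle and distance doses from framewise voice measurements.

            f0        : fundamental frequency per frame (Hz)
            amplitude : vocal-fold vibration amplitude per frame (m), assumed
                        estimated from SPL upstream
            voiced    : boolean mask of phonated frames
            frame_dur : frame duration (s)
            """
            dt = frame_dur * voiced
            cycle_dose = np.sum(f0 * dt)                       # total cycles
            distance_dose = np.sum(4.0 * amplitude * f0 * dt)  # total travel (m)
            return cycle_dose, distance_dose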

  11. A new multifeature mismatch negativity (MMN) paradigm for the study of music perception with more real-sounding stimuli

    DEFF Research Database (Denmark)

    Quiroga Martinez, David Ricardo; Hansen, Niels Christian; Højlund, Andreas

    The MMN is a brain response elicited by deviants in a series of repetitive sounds that has been valuable for the study of music perception. However, most MMN experimental designs use simple tone patterns as stimuli, failing to represent the complexity of everyday music. Our goal was to develop ... a new paradigm using more real-sounding stimuli. Concretely, we wanted to assess the perception of nonrepetitive melodies when presented alone and when embedded in two-part music. An Alberti bass used previously served both as a comparison and as the second voice in the two-part stimuli. We used MEG ... Interestingly, this reduction did not hold for mistunings and slide in the melody, probably due to interval mistuning and the high voice superiority effect. Our results indicate that it is possible to use the MMN for the study of more real-sounding music and that stimulus complexity plays a crucial role ...

  12. Particle Filter with Integrated Voice Activity Detection for Acoustic Source Tracking

    Directory of Open Access Journals (Sweden)

    Anders M. Johansson

    2007-01-01

    Full Text Available In noisy and reverberant environments, the problem of acoustic source localisation and tracking (ASLT) using an array of microphones presents a number of challenging difficulties. One of the main issues when considering real-world situations involving human speakers is the temporally discontinuous nature of speech signals: the presence of silence gaps in the speech can easily misguide the tracking algorithm, even in practical environments with low to moderate noise and reverberation levels. A natural extension of currently available sound source tracking algorithms is the integration of a voice activity detection (VAD) scheme. We describe a new ASLT algorithm based on a particle filtering (PF) approach, where VAD measurements are fused within the statistical framework of the PF implementation. Tracking accuracy results for the proposed method are presented on the basis of synthetic audio samples generated with the image method, whereas performance results obtained with a real-time implementation of the algorithm, and using real audio data recorded in a reverberant room, are published elsewhere. Compared to a previously proposed PF algorithm, the experimental results demonstrate the improved robustness of the method described in this work when tracking sources emitting real-world speech signals, which typically involve significant silence gaps between utterances.
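
    The fusion idea can be caricatured in one dimension: a bootstrap particle filter whose measurement update is gated by the VAD decision, so that during silence gaps the particles only diffuse under the motion model instead of being misguided by spurious localisation estimates. Everything below (the random-walk model, Gaussian likelihood, and scalar state) is an illustrative simplification of the paper's array-based tracker, not its actual implementation.

        import numpy as np

        rng = np.random.default_rng(1)

        def pf_track(measurements, vad, n=500, q=0.05, r=0.2):
            """Bootstrap particle filter for a 1-D source position, gated by VAD.

            measurements : noisy position estimates per frame (assumed given)
            vad          : speech-presence decision per frame
            """
            parts = rng.normal(0.0, 1.0, n)       # initial particle cloud
            w = np.full(n, 1.0 / n)
            track = []
            for z, speech in zip(measurements, vad):
                parts = parts + rng.normal(0.0, q, n)   # random-walk motion model
                if speech:   # fuse the measurement only when voice is active
                    w *= np.exp(-0.5 * ((z - parts) / r) ** 2)
                    w /= w.sum()
                    idx = rng.choice(n, n, p=w)         # multinomial resampling
                    parts, w = parts[idx], np.full(n, 1.0 / n)
                track.append(np.sum(w * parts))
            return np.array(track)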

  13. Understanding the mechanisms of familiar voice-identity recognition in the human brain.

    Science.gov (United States)

    Maguinness, Corrina; Roswandowitz, Claudia; von Kriegstein, Katharina

    2018-03-31

    Humans have a remarkable skill for voice-identity recognition: most of us can remember many voices that surround us as 'unique'. In this review, we explore the computational and neural mechanisms which may support our ability to represent and recognise a unique voice-identity. We examine the functional architecture of voice-sensitive regions in the superior temporal gyrus/sulcus, and bring together findings on how these regions may interact with each other, and additional face-sensitive regions, to support voice-identity processing. We also contrast findings from studies on neurotypicals and clinical populations which have examined the processing of familiar and unfamiliar voices. Taken together, the findings suggest that representations of familiar and unfamiliar voices might dissociate in the human brain. Such an observation does not fit well with current models for voice-identity processing, which by-and-large assume a common sequential analysis of the incoming voice signal, regardless of voice familiarity. We provide a revised audio-visual integrative model of voice-identity processing which brings together traditional and prototype models of identity processing. This revised model includes a mechanism of how voice-identity representations are established and provides a novel framework for understanding and examining the potential differences in familiar and unfamiliar voice processing in the human brain. Copyright © 2018 Elsevier Ltd. All rights reserved.

  14. Robust signal selection for lineair prediction analysis of voiced speech

    NARCIS (Netherlands)

    Ma, C.; Kamp, Y.; Willems, L.F.

    1993-01-01

    This paper investigates a weighted LPC analysis of voiced speech. In view of the speech production model, the weighting function is either chosen to be the short-time energy function of the preemphasized speech sample sequence with certain delays or is obtained by thresholding the short-time energy

  15. Time-frequency peak filtering for random noise attenuation of magnetic resonance sounding signal

    Science.gov (United States)

    Lin, Tingting; Zhang, Yang; Yi, Xiaofeng; Fan, Tiehu; Wan, Ling

    2018-05-01

    When measuring in a geomagnetic field, the method of magnetic resonance sounding (MRS) is often limited because of the notably low signal-to-noise ratio (SNR). Most current studies focus on discarding spiky noise and power-line harmonic noise cancellation. However, the effects of random noise should not be underestimated. The common method for random noise attenuation is stacking, but collecting multiple recordings merely to suppress random noise is time-consuming. Moreover, stacking is insufficient to suppress high-level random noise. Here, we propose the use of time-frequency peak filtering for random noise attenuation, which is performed after the traditional de-spiking and power-line harmonic removal method. By encoding the noisy signal with frequency modulation and estimating the instantaneous frequency using the peak of the time-frequency representation of the encoded signal, the desired MRS signal can be acquired from only one stack. The performance of the proposed method is tested on synthetic envelope signals and field data from different surveys. Good estimations of the signal parameters are obtained at different SNRs. Moreover, an attempt to use the proposed method to handle a single recording provides better results compared to 16 stacks. Our results suggest that the number of stacks can be appropriately reduced to shorten the measurement time and improve the measurement efficiency.
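
    The core of time-frequency peak filtering is short enough to sketch: the noisy record is encoded as the instantaneous frequency of a unit-amplitude analytic FM signal, the ridge of a time-frequency representation is picked per frame, and the ridge frequency is decoded back into signal amplitude. The sketch below uses an STFT as a stand-in for the windowed (pseudo-Wigner-Ville) representation common in the TFPF literature, and mu and the window length are assumed tuning choices.

        import numpy as np
        from scipy.signal import stft

        def tfpf(x, mu=0.4, nperseg=128):
            """Time-frequency peak filtering of a noisy 1-D record."""
            # Scale to [0, 1] so the instantaneous frequency mu*x stays below Nyquist
            lo, hi = x.min(), x.max()
            s = (x - lo) / (hi - lo)
            z = np.exp(2j * np.pi * mu * np.cumsum(s))   # FM encoding
            f, t, Z = stft(z, nperseg=nperseg, noverlap=nperseg - 1,
                           return_onesided=False)
            ridge = np.abs(Z).argmax(axis=0)             # spectral peak per frame
            s_hat = f[ridge] / mu                        # decode instantaneous frequency
            s_hat = np.interp(np.arange(len(s)),
                              np.linspace(0, len(s) - 1, len(s_hat)), s_hat)
            return s_hat * (hi - lo) + lo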

  16. Crowing Sound Analysis of Gaga' Chicken; Local Chicken from South Sulawesi Indonesia

    OpenAIRE

    Aprilita Bugiwati, Sri Rachma; Ashari, Fachri

    2008-01-01

    Gaga' chicken is known as a local chicken of South Sulawesi, Indonesia, which has a unique and specific crowing sound, especially at the ending of the crow, which resembles the character of human laughter, compared with the other types of singing chicken in the world. 287 Gaga' chickens at 3 districts in the centre of the Gaga' chicken habitat were separated into 2 groups (163 birds of Dangdut type and 124 birds of Slow type) based on the speed...

  17. Plethysmogram and EEG: Effects of Music and Voice Sound

    Science.gov (United States)

    Miao, Tiejun; Oyama-Higa, Mayumi; Sato, Sadaka; Kojima, Junji; Lin, Juan; Reika, Sato

    2011-06-01

    We studied the relation of the chaotic dynamics of the finger plethysmogram to the complexity of the higher cerebral centers, in both theoretical and experimental approaches. We proposed a mathematical model to describe the emergence of chaos in the fingertip pulse wave, which gave a theoretical prediction that increased chaoticity in the higher cerebral centers leads to an increase of chaotic dynamics in plethysmograms. We designed an experiment observing scalp EEG and finger plethysmograms during two mental tasks to validate this relationship. We found that the scalp EEG showed an increase of the largest Lyapunov exponent (LLE) while subjects were speaking certain voices. The topographical scalp map of the LLE showed enhanced increases around the occipital and right cerebral areas, whereas there was a decreasing tendency while listening to music, where the LLE scalp map revealed a drop around the central cerebral area. The same tendencies were found for the LLE obtained from finger plethysmograms as for the EEG under either the speaking or the listening task. The experiment gave results that agreed well with the theoretical relation derived from our proposed model.

  18. An intelligent artificial throat with sound-sensing ability based on laser induced graphene

    Science.gov (United States)

    Tao, Lu-Qi; Tian, He; Liu, Ying; Ju, Zhen-Yi; Pang, Yu; Chen, Yuan-Quan; Wang, Dan-Yang; Tian, Xiang-Guang; Yan, Jun-Chao; Deng, Ning-Qin; Yang, Yi; Ren, Tian-Ling

    2017-02-01

    Traditional sound sources and sound detectors are usually independent and discrete in the human hearing range. To minimize the device size and integrate it with wearable electronics, there is an urgent requirement of realizing the functional integration of generating and detecting sound in a single device. Here we show an intelligent laser-induced graphene artificial throat, which can not only generate sound but also detect sound in a single device. More importantly, the intelligent artificial throat will significantly assist for the disabled, because the simple throat vibrations such as hum, cough and scream with different intensity or frequency from a mute person can be detected and converted into controllable sounds. Furthermore, the laser-induced graphene artificial throat has the advantage of one-step fabrication, high efficiency, excellent flexibility and low cost, and it will open practical applications in voice control, wearable electronics and many other areas.

  19. When the face fits: recognition of celebrities from matching and mismatching faces and voices.

    Science.gov (United States)

    Stevenage, Sarah V; Neil, Greg J; Hamlin, Iain

    2014-01-01

    The results of two experiments are presented in which participants engaged in a face-recognition or a voice-recognition task. The stimuli were face-voice pairs in which the face and voice were co-presented and were either "matched" (same person), "related" (two highly associated people), or "mismatched" (two unrelated people). Analysis in both experiments confirmed that accuracy and confidence in face recognition was consistently high regardless of the identity of the accompanying voice. However accuracy of voice recognition was increasingly affected as the relationship between voice and accompanying face declined. Moreover, when considering self-reported confidence in voice recognition, confidence remained high for correct responses despite the proportion of these responses declining across conditions. These results converged with existing evidence indicating the vulnerability of voice recognition as a relatively weak signaller of identity, and results are discussed in the context of a person-recognition framework.

  20. [Fundamental frequency analysis - a contribution to the objective examination of the speaking and singing voice (author's transl)].

    Science.gov (United States)

    Schultz-Coulon, H J

    1975-07-01

    The applicability of a newly developed fundamental frequency analyzer to diagnosis in phoniatrics is reviewed. During routine voice examination, the analyzer allows a quick and accurate measurement of the fundamental frequency and sound level of the speaking voice, and of vocal range and maximum phonation time. By computing fundamental frequency histograms, the median fundamental frequency and the total pitch range can be better determined and compared. Objective studies of certain technical faculties of the singing voice, which usually are estimated subjectively by the speech therapist, may now be done by means of this analyzer. Several examples demonstrate the differences between correct and incorrect phonation. These studies compare the pitch perturbations during the crescendo and decrescendo of a swell-tone, and show typical traces of staccato, trill and yodel. The study concludes that fundamental frequency analysis is a valuable supplementary method for objective voice examination.
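
    The histogram-based measures are straightforward to reproduce with modern tools. A sketch, using librosa's pYIN tracker as a stand-in for the original hardware analyzer; the F0 search range and bin count are assumptions:

        import numpy as np
        import librosa

        def f0_statistics(path):
            """Median F0 and total pitch range from an F0 histogram."""
            y, sr = librosa.load(path, sr=None)
            f0, voiced_flag, _ = librosa.pyin(y, fmin=60.0, fmax=500.0, sr=sr)
            f0 = f0[voiced_flag & np.isfinite(f0)]         # keep voiced frames only
            counts, edges = np.histogram(f0, bins=48)      # the F0 histogram itself
            median_f0 = np.median(f0)
            range_semitones = 12 * np.log2(f0.max() / f0.min())
            return median_f0, range_semitones, counts, edges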

  1. Assessing signal-driven mechanism in neonates: brain responses to temporally and spectrally different sounds

    Directory of Open Access Journals (Sweden)

    Yasuyo eMinagawa-Kawai

    2011-06-01

    Full Text Available Past studies have found that, in adults, acoustic properties of sound signals (such as fast vs. slow temporal features) differentially activate the left and right hemispheres, and some have hypothesized that left-lateralization for speech processing may follow from left-lateralization to rapidly changing signals. Here, we tested whether newborns' brains show some evidence of signal-specific lateralization responses using near-infrared spectroscopy (NIRS) and auditory stimuli that elicit lateralized responses in adults, composed of segments that vary in duration and spectral diversity. We found significantly greater bilateral responses of oxygenated hemoglobin (oxy-Hb) in the temporal areas for stimuli with a minimum segment duration of 21 ms than for stimuli with a minimum segment duration of 667 ms. However, we found no evidence for hemispheric asymmetries dependent on the stimulus characteristics. We hypothesize that acoustic-based functional brain asymmetries may develop throughout early infancy, and discuss their possible relationship with brain asymmetries for language.

  2. Exploring expressivity and emotion with artificial voice and speech technologies.

    Science.gov (United States)

    Pauletto, Sandra; Balentine, Bruce; Pidcock, Chris; Jones, Kevin; Bottaci, Leonardo; Aretoulaki, Maria; Wells, Jez; Mundy, Darren P; Balentine, James

    2013-10-01

    Emotion in audio-voice signals, as synthesized by text-to-speech (TTS) technologies, was investigated to formulate a theory of expression for user interface design. Emotional parameters were specified with markup tags, and the resulting audio was further modulated with post-processing techniques. Software was then developed to link a selected TTS synthesizer with an automatic speech recognition (ASR) engine, producing a chatbot that could speak and listen. Using these two artificial voice subsystems, investigators explored both artistic and psychological implications of artificial speech emotion. Goals of the investigation were interdisciplinary, with interest in musical composition, augmentative and alternative communication (AAC), commercial voice announcement applications, human-computer interaction (HCI), and artificial intelligence (AI). The work-in-progress points towards an emerging interdisciplinary ontology for artificial voices. As one study output, HCI tools are proposed for future collaboration.
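
    As an illustration of the kind of markup-driven control described above, the snippet below embeds emotional prosody tags in SSML, the W3C markup accepted by many TTS engines; the specific tag values are arbitrary examples, not the settings used in the study.

        # A hypothetical SSML fragment: emotional parameters expressed as prosody
        # markup, to be handed to any SSML-capable synthesizer. The rate, pitch,
        # and volume values are illustrative only.
        ssml = """
        <speak>
          <prosody rate="slow" pitch="-15%" volume="soft">
            I am not sure this is going to work.
          </prosody>
          <break time="400ms"/>
          <prosody rate="fast" pitch="+20%" volume="loud">
            But we have to try!
          </prosody>
        </speak>
        """
        # Post-processing (e.g. filtering, reverberation) would then be applied
        # to the rendered audio, as the study describes.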

  3. Acoustic analysis of trill sounds.

    Science.gov (United States)

    Dhananjaya, N; Yegnanarayana, B; Bhaskararao, Peri

    2012-04-01

    In this paper, the acoustic-phonetic characteristics of steady apical trills--trill sounds produced by the periodic vibration of the apex of the tongue--are studied. Signal processing methods, namely, zero-frequency filtering and zero-time liftering of speech signals, are used to analyze the excitation source and the resonance characteristics of the vocal tract system, respectively. Although it is natural to expect the effect of trilling on the resonances of the vocal tract system, it is interesting to note that trilling influences the glottal source of excitation as well. The excitation characteristics derived using zero-frequency filtering of speech signals are glottal epochs, strength of impulses at the glottal epochs, and instantaneous fundamental frequency of the glottal vibration. Analysis based on zero-time liftering of speech signals is used to study the dynamic resonance characteristics of vocal tract system during the production of trill sounds. Qualitative analysis of trill sounds in different vowel contexts, and the acoustic cues that may help spotting trills in continuous speech are discussed.
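
    Zero-frequency filtering itself is a short algorithm. A sketch along the lines of the published method (Murty and Yegnanarayana), in which the trend-removal window of roughly one average pitch period is an assumed parameter; it is suitable for utterance-length signals in double precision:

        import numpy as np

        def zff_epochs(x, fs, avg_pitch_ms=8.0):
            """Glottal epochs via zero-frequency filtering.

            The signal is differenced, passed twice through a zero-frequency
            resonator (each pass is a double integrator), and the slowly growing
            trend is removed with a running mean; positive-going zero crossings
            of the residual mark the glottal epochs.
            """
            dx = np.diff(x, prepend=x[0])
            y = dx
            for _ in range(4):              # two resonator passes = four integrations
                y = np.cumsum(y)
            w = int(avg_pitch_ms * 1e-3 * fs) // 2 * 2 + 1   # odd window length
            kern = np.ones(w) / w
            for _ in range(3):              # repeated mean subtraction removes trend
                y = y - np.convolve(y, kern, mode="same")
            epochs = np.flatnonzero((y[:-1] < 0) & (y[1:] >= 0))
            return y, epochs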

  4. Robotic vehicle uses acoustic sensors for voice detection and diagnostics

    Science.gov (United States)

    Young, Stuart H.; Scanlon, Michael V.

    2000-07-01

    An acoustic sensor array that cues an imaging system on a small tele-operated robotic vehicle was used to detect human voice and activity inside a building. The advantage of acoustic sensors is that they are a non-line-of-sight (NLOS) sensing technology that can augment traditional LOS sensors such as visible and IR cameras. Acoustic energy emitted from a target, such as from a person, weapon, or radio, will travel through walls and smoke, around corners, and down corridors, whereas these obstructions would cripple an imaging detection system. The hardware developed and tested used an array of eight microphones to detect the loudest direction and automatically steer a camera's pan/tilt toward the noise centroid. This type of system has applicability for counter-sniper applications, building clearing, and search/rescue. Data presented will be time-frequency representations showing voice detected within rooms and down hallways at various ranges. Another benefit of acoustics is that it provides the tele-operator some situational awareness clues via low-bandwidth transmission of raw audio data for the operator to interpret with either headphones or through time-frequency analysis. This data can be useful to recognize familiar sounds that might indicate the presence of personnel, such as talking, equipment, movement noise, etc. The same array also detects the sounds of the robot it is mounted on, and can be useful for engine diagnostics and troubleshooting, or for self-noise emanations for stealthy travel. Data presented will characterize vehicle self-noise over various surfaces such as tiles, carpets, pavement, sidewalk, and grass. Vehicle diagnostic sounds will indicate a slipping clutch and repeated unexpected application of the emergency braking mechanism.
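
    The article does not spell out its direction-finding algorithm, but a common way to obtain the loudest direction from a microphone pair is the generalized cross-correlation with phase transform (GCC-PHAT), sketched below; mapping pairwise delays across an eight-microphone array to a pan/tilt bearing is indicated only in the closing comment and is an assumption.

        import numpy as np

        def gcc_phat(a, b, fs, max_tau=None):
            """GCC-PHAT time-delay estimate between two microphone signals."""
            n = 2 * max(len(a), len(b))
            A, B = np.fft.rfft(a, n), np.fft.rfft(b, n)
            R = A * np.conj(B)
            cc = np.fft.irfft(R / (np.abs(R) + 1e-12), n)   # phase-transform weighting
            cc = np.concatenate([cc[-n // 2:], cc[:n // 2]])  # centre zero lag
            lags = np.arange(-n // 2, n // 2)
            if max_tau is not None:
                keep = np.abs(lags) <= int(max_tau * fs)
                cc, lags = cc[keep], lags[keep]
            return lags[np.argmax(np.abs(cc))] / fs          # delay in seconds

        # With a known microphone spacing d, the delay maps to a bearing via
        # theta = arcsin(c * tau / d) with c ~ 343 m/s; comparing pairwise delays
        # across the array yields the loudest direction for the pan/tilt cue.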

  5. [Mechanism of the constant representation of the position of a sound signal source by the cricket cercal system neurons].

    Science.gov (United States)

    Rozhkova, G I; Polishcuk, N A

    1976-01-01

    Previously it has been shown that some abdominal giant neurones of the cricket have constant preferred directions of sound stimulation in relation not to the cerci (the organs bearing the sound receptors) but to the insect body (Fig. 1) [1]. It is now found that the independence of the giant neurones' directional sensitivity from cerci position disappears after cutting all structures connecting the cerci to the body, except the cercal nerves (Fig. 2). Therefore the constancy of directional sensitivity of the giant neurones is provided by proprioceptive signals about cerci position.

  6. Heart Sound Localization and Reduction in Tracheal Sounds by Gabor Time-Frequency Masking

    OpenAIRE

    SAATCI, Esra; Akan, Aydın

    2018-01-01

    Background and aim: Respiratory sounds, i.e. tracheal and lung sounds, have been of great interest due to their diagnostic values as well as the potential of their use in the estimation of the respiratory dynamics (mainly airflow). Thus the aim of the study is to present a new method to filter the heart sound interference from the tracheal sounds. Materials and methods: Tracheal sounds and airflow signals were collected by using an accelerometer from 10 healthy subjects. Tracheal sounds were then pr...
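
    The truncated abstract does not give the masking details, but Gabor time-frequency masking can be sketched with an STFT (a sampled Gabor transform): heart-sound bursts are localized by their low-frequency energy, and the corresponding time-frequency coefficients are attenuated. The window size, frequency cutoff, threshold, and attenuation factor below are assumptions:

```python
import numpy as np
from scipy.signal import stft, istft

def reduce_heart_sounds(x, fs, f_cut=300.0, thresh=4.0):
    """Sketch of time-frequency masking: detect frames whose low-frequency
    energy spikes (heart sounds) and attenuate those STFT coefficients."""
    f, t, Z = stft(x, fs, nperseg=256)
    low = f < f_cut                                   # heart-sound band
    frame_energy = np.abs(Z[low]).sum(axis=0)
    mask_frames = frame_energy > thresh * np.median(frame_energy)
    Z[np.ix_(low, mask_frames)] *= 0.1                # attenuate, don't zero
    _, y = istft(Z, fs, nperseg=256)
    return y[:len(x)]
```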

  7. Familiarity and Voice Representation: From Acoustic-Based Representation to Voice Averages

    Directory of Open Access Journals (Sweden)

    Maureen Fontaine

    2017-07-01

    Full Text Available The ability to recognize an individual from their voice is a widespread ability with a long evolutionary history. Yet, the perceptual representation of familiar voices is ill-defined. In two experiments, we explored the neuropsychological processes involved in the perception of voice identity. We specifically explored the hypothesis that familiar voices (trained-to-familiar, Experiment 1) and famous voices (Experiment 2) are represented as a whole complex pattern, well approximated by the average of multiple utterances produced by a single speaker. In Experiment 1, participants learned three voices over several sessions, and performed a three-alternative forced-choice identification task on original voice samples and several "speaker averages," created by morphing across varying numbers of different vowels (e.g., [a] and [i]) produced by the same speaker. In Experiment 2, the same participants performed the same task on voice samples produced by familiar speakers. The two experiments showed that for famous voices, but not for trained-to-familiar voices, identification performance increased and response times decreased as a function of the number of utterances in the averages. This study sheds light on the perceptual representation of familiar voices, and demonstrates the power of averaging in recognizing familiar voices. The speaker average captures the unique characteristics of a speaker, and thus retains the information essential for recognition; it acts as a prototype of the speaker.

  8. Throw Yo' Voice Out: Disability as a Desirable Practice in Hip-Hop Vocal Performance

    Directory of Open Access Journals (Sweden)

    Alex S. Porco

    2014-12-01

    Full Text Available Disabled bodies and disabling spaces—especially the recording studio—shape the sound iconicity of hip-hop vocal performances. The disabled voice is the audible sign by which hip-hop artists trouble cultural definitions of the self and other; exceptionalism and failure; the natural and techno-mediated; comedy and tragedy; and aesthetic play and seriousness. Hip-hop vocal performances also function as self-conscious acts of transvaluation that challenge the discursive dominance of ableism. A materialist approach to vocal performance resists reducing voice to a silent metaphor for race, oppositionality, or liberation; and it emphasizes, instead, the physiological and social processes that render hip-hop voices unique, particular, and audible. It emphasizes the agency hip-hop artists possess in seeking out disabled bodies and assuming disabled identities for aesthetic and political ends. Thus, the body is returned to the analysis of style.

  9. Abnormal sound detection device

    International Nuclear Information System (INIS)

    Yamada, Izumi; Matsui, Yuji.

    1995-01-01

    Only components synchronized with the rotation of pumps are sampled from the detected acoustic sounds, to judge the presence or absence of abnormality based on the magnitude of the synchronized components. A synchronized-component sampling means can remove resonance sounds and other acoustic sounds generated asynchronously with the rotation, based on the knowledge that the acoustic components generated in a normal state are a sort of resonance sound and are not precisely synchronized with the rotation rate. On the other hand, abnormal sounds of a rotating body are often caused by forces accompanying the rotation as their generation source, and such abnormal sounds can be detected by extracting only the rotation-synchronized components. Since the components of normal acoustic sounds are discriminated from the detected sounds, attenuation of the abnormal sounds by the signal processing can be avoided and, as a result, abnormal sound detection sensitivity can be improved. Further, since the device discriminates the occurrence of abnormal sound from the actually detected sounds, other frequency components which are forecast but not actually generated are not removed, which further improves detection sensitivity. (N.H.)
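
    One common way to realize such a synchronized-component sampling means is synchronous averaging against a known rotation rate: asynchronous resonance components average out across revolutions, leaving the rotation-locked part. The sketch below assumes the rotation frequency is available (e.g., from a tachometer):

```python
import numpy as np

def rotation_synchronous_component(x, fs, rot_hz):
    """Sketch of synchronous averaging: average the signal over many rotation
    periods so that only components phase-locked to the rotation survive."""
    period = fs / rot_hz                     # samples per revolution (may be fractional)
    n_rev = int(len(x) / period)
    idx = (np.arange(n_rev)[:, None] * period + np.arange(int(period))).astype(int)
    cycles = x[idx]                          # one row per revolution
    return cycles.mean(axis=0)               # rotation-synchronized waveform

# A simple abnormality indicator: RMS magnitude of the synchronized component.
# level = np.sqrt(np.mean(rotation_synchronous_component(x, fs, 25.0) ** 2))
```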

  10. 7 CFR 1735.2 - Definitions.

    Science.gov (United States)

    2010-01-01

    ... operated as one system. Mobile telecommunications service means radio communication voice service between... exchange, or within a connected system of telephone exchanges within the same exchange area operated to... communication service for the transmission or reception of voice, data, sounds, signals, pictures, writing, or...

  11. Raising voices: How sixth graders construct authority and knowledge in argumentative essays

    Science.gov (United States)

    Monahan, Mary Elizabeth

    This qualitative classroom-based study documents one teacher-researcher's response to the "voice" debate in composition studies and to the opposing views expressed by Elbow and Bartholomae. The author uses Bakhtin's principle of dialogism, Hymes's theory of communicative competence, as well as Ivanic's discussion of discoursally constructed identities to reconceptualize voice and to redesign writing instruction in her sixth grade classroom. This study shows how students, by redefining and then acting on that voice pedagogy in terms that made sense to them, shaped the author's understanding of what counts as "voiced" writing in non-narrative discourse. Based on a grounded-theory analysis of the twenty-six sixth graders' argumentative essays in science, the author explains voice, not as a property of writers or of texts, but as a process of "knowing together"---a collaborative, but not entirely congenial, exercise of establishing one's authority by talking with, against, and through other voices on the issue. As the results of this study show, the students' "I-Ness" or authorial presence within their texts, was born in a nexus of relationships with "rivals," "allies" and "readers." Given their teacher's injunctions to project confidence and authority in argumentative writing, the students assumed fairly adversarial stances toward these conversational partners throughout their essays. Exaggerating the terms for voiced writing built into the curriculum, the sixth graders produced essays that read more like caricatures than examples of argumentation. Their displays of rhetorical bravado and intellectual aggressiveness, however off-putting to the reader, still enabled these sixth graders to compose voiced essays. This study raises doubts about the value of urging students to sound like their "true selves" or to adopt the formal registers of academe. Students, it seems clear, stand to gain by experimenting with a range of textual identities. The author suggests that voice

  12. Visualization of Broadband Sound Sources

    OpenAIRE

    Sukhanov Dmitry; Erzakova Nadezhda

    2016-01-01

    In this paper a method for imaging wideband audio sources is proposed, based on 2D microphone-array measurements of the sound field taken simultaneously at all microphones. The designed microphone array consists of 160 microphones, allowing signals to be digitized at a sampling frequency of 7200 Hz. The measured signals are processed using a special algorithm that makes it possible to obtain a flat image of wideband sound sources. It is shown experimentally that the visualization is not dependent on the...
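
    The paper's "special algorithm" is not described in the truncated abstract; a standard baseline for this kind of array imaging is delay-and-sum beamforming, sketched below with an assumed planar array geometry and an image plane at distance z:

```python
import numpy as np

def das_image(sigs, mic_xy, fs, grid_x, grid_y, z=1.0, c=343.0):
    """Minimal delay-and-sum sketch: for each image pixel, align the microphone
    signals by their propagation delays and sum the power. `sigs` is
    (n_mics, n_samples); `mic_xy` is (n_mics, 2) sensor coordinates in meters."""
    img = np.zeros((len(grid_y), len(grid_x)))
    t = np.arange(sigs.shape[1]) / fs
    for iy, y in enumerate(grid_y):
        for ix, x in enumerate(grid_x):
            # Distance from each microphone to the candidate source point.
            d = np.sqrt((mic_xy[:, 0] - x) ** 2 + (mic_xy[:, 1] - y) ** 2 + z ** 2)
            acc = np.zeros(sigs.shape[1])
            for m in range(sigs.shape[0]):
                # Advance each channel by its propagation delay, then sum.
                acc += np.interp(t - d[m] / c, t, sigs[m], left=0.0, right=0.0)
            img[iy, ix] = np.mean(acc ** 2)   # beam power at this pixel
    return img
```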

  13. Sparse representation of Gravitational Sound

    Science.gov (United States)

    Rebollo-Neira, Laura; Plastino, A.

    2018-03-01

    Gravitational Sound clips produced by the Laser Interferometer Gravitational-Wave Observatory (LIGO) and the Massachusetts Institute of Technology (MIT) are considered within the particular context of data reduction. We advance a procedure to this effect and show that these types of signals can be approximated with high quality using significantly fewer elementary components than those required within the standard orthogonal basis framework. Furthermore, a local measure of sparsity is shown to render meaningful information about the variation of a signal along time, by generating a set of local sparsity values which is much smaller than the dimension of the signal. This point is further illustrated by recourse to a more complex signal, generated by Milde Science Communication to divulge Gravitational Sound in the form of a ring tone.
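
    The abstract does not specify the sparse approximation routine, but the idea of approximating a signal with few elementary components can be sketched with plain matching pursuit over a unit-norm dictionary (the dictionary choice is an assumption for illustration):

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms=50):
    """Sketch of matching pursuit: greedily approximate `signal` with a small
    number of columns of `dictionary` (assumed unit-norm atoms)."""
    residual = signal.astype(float).copy()
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(n_atoms):
        corr = dictionary.T @ residual              # correlate atoms with residual
        k = int(np.argmax(np.abs(corr)))            # best-matching atom
        coeffs[k] += corr[k]
        residual -= corr[k] * dictionary[:, k]      # subtract its contribution
    return coeffs, residual                         # sparse code + approximation error
```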

  14. From Sound Morphing to the Synthesis of Starlight. Musical experiences with the Phase Vocoder over 25 years

    Directory of Open Access Journals (Sweden)

    Trevor Wishart

    2013-08-01

    Full Text Available The article reports the author's experiences with the phase vocoder: from the first attempts during the years 1973-77 – in connection with a speculative project to morph the sounds of a speaking voice into sounds from the natural world, a project subsequently developed at Ircam in Paris between 1979 and 1986 – up to the most recent experiences in 2011-12 associated with the realization of Supernova, an 8-channel sound-surround piece, where the phase vocoder data format is used as a synthesis tool.
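
    For readers unfamiliar with the tool, a bare-bones phase vocoder time stretch, the core operation behind such sound-morphing work, can be sketched as follows; the frame size, hop, and interpolation scheme are generic choices, not Wishart's:

```python
import numpy as np
from scipy.signal import stft, istft

def phase_vocoder(x, fs, rate=1.5, nperseg=1024):
    """Bare-bones phase vocoder time stretch: resample STFT frames in time
    while accumulating phase so that partials stay coherent."""
    hop = nperseg // 4
    f, t, Z = stft(x, fs, nperseg=nperseg, noverlap=nperseg - hop)
    steps = np.arange(0, Z.shape[1] - 1, rate)         # fractional frame positions
    omega = 2 * np.pi * hop * np.arange(Z.shape[0]) / nperseg  # expected phase advance
    out = np.zeros((Z.shape[0], len(steps)), dtype=complex)
    phase = np.angle(Z[:, 0])
    for i, s in enumerate(steps):
        j = int(s)
        # Interpolate magnitude between neighboring analysis frames.
        mag = (1 - (s - j)) * np.abs(Z[:, j]) + (s - j) * np.abs(Z[:, j + 1])
        # Deviation of the measured phase advance from the bin's nominal advance.
        dphi = np.angle(Z[:, j + 1]) - np.angle(Z[:, j]) - omega
        dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))  # wrap to [-pi, pi]
        out[:, i] = mag * np.exp(1j * phase)
        phase += omega + dphi                             # accumulate true advance
    _, y = istft(out, fs, nperseg=nperseg, noverlap=nperseg - hop)
    return y
```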

  15. Evolving Spiking Neural Networks for Recognition of Aged Voices.

    Science.gov (United States)

    Silva, Marco; Vellasco, Marley M B R; Cataldo, Edson

    2017-01-01

    The aging of the voice, known as presbyphonia, is a natural process that can cause great change in vocal quality of the individual. This is a relevant problem to those people who use their voices professionally, and its early identification can help determine a suitable treatment to avoid its progress or even to eliminate the problem. This work focuses on the development of a new model for the identification of aging voices (independently of their chronological age), using as input attributes parameters extracted from the voice and glottal signals. The proposed model, named Quantum binary-real evolving Spiking Neural Network (QbrSNN), is based on spiking neural networks (SNNs), with an unsupervised training algorithm, and a Quantum-Inspired Evolutionary Algorithm that automatically determines the most relevant attributes and the optimal parameters that configure the SNN. The QbrSNN model was evaluated in a database composed of 120 records, containing samples from three groups of speakers. The results obtained indicate that the proposed model provides better accuracy than other approaches, with fewer input attributes. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  16. The Sound Quality of Cochlear Implants: Studies With Single-sided Deaf Patients.

    Science.gov (United States)

    Dorman, Michael F; Natale, Sarah Cook; Butts, Austin M; Zeitler, Daniel M; Carlson, Matthew L

    2017-09-01

    The goal of the present study was to assess the sound quality of a cochlear implant for single-sided deaf (SSD) patients fit with a cochlear implant (CI). One of the fundamental, unanswered questions in CI research is "what does an implant sound like?" Conventional CI patients must use the memory of a clean signal, often decades old, to judge the sound quality of their CIs. In contrast, SSD-CI patients can rate the similarity of a clean signal presented to the CI ear and candidate CI-like signals presented to the ear with normal hearing. For Experiment 1, four types of stimuli were created for presentation to the normal-hearing ear: noise vocoded signals, sine vocoded signals, frequency shifted, sine vocoded signals, and band-pass filtered, natural speech signals. Listeners rated the similarity of these signals to unmodified signals sent to the CI on a scale of 0 to 10, with 10 being a complete match to the CI signal. For Experiment 2, multitrack signal mixing was used to create natural speech signals that varied along multiple dimensions. In Experiment 1, for eight adult SSD-CI listeners, the best median similarity rating to the sound of the CI for noise vocoded signals was 1.9; for sine vocoded signals, 2.9; for frequency upshifted signals, 1.9; and for band-pass filtered signals, 5.5. In Experiment 2, for three young listeners, combinations of band-pass filtering and spectral smearing led to ratings of 10. The sound quality of noise and sine vocoders does not generally correspond to the sound quality of cochlear implants fit to SSD patients. Our preliminary conclusion is that natural speech signals that have been muffled to one degree or another by band-pass filtering and/or spectral smearing provide a close, but incomplete, match to CI sound quality for some patients.
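
    A noise vocoder of the kind used as a candidate CI-like signal in Experiment 1 can be sketched as a filter bank whose band envelopes are re-imposed on noise carriers. The band count, band edges, and filter order below are assumptions, not the study's parameters:

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def noise_vocode(x, fs, n_bands=8, lo=100.0, hi=7000.0):
    """Sketch of a noise vocoder: split speech into log-spaced bands, extract
    each band's envelope, and re-impose it on band-limited noise."""
    edges = np.geomspace(lo, hi, n_bands + 1)
    rng = np.random.default_rng(0)
    out = np.zeros(len(x), dtype=float)
    for k in range(n_bands):
        sos = butter(4, [edges[k], edges[k + 1]], btype="band", fs=fs, output="sos")
        band = sosfilt(sos, x)
        env = np.abs(hilbert(band))                     # band envelope
        carrier = sosfilt(sos, rng.standard_normal(len(x)))
        out += env * carrier
    return out / (np.max(np.abs(out)) + 1e-12)          # normalize to +/-1
```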

  17. Modern Methods of Voice Authentication in Mobile Devices

    Directory of Open Access Journals (Sweden)

    Vladimir Leonovich Evseev

    2016-03-01

    Full Text Available Modern methods of voice authentication in mobile devices are reviewed, and an evaluation of the probabilities of type I and type II errors is proposed for multimodal methods of voice authentication. Multimodal, multivariate methods have an advantage over methods in which authentication takes place in several stages: they are one-stage, which means convenience for customers. Further development of multimodal methods of authentication will build on the significantly increased computing power of mobile devices, the growing number and improved accuracy of built-in mobile device sensors, as well as on improved signal processing algorithms.

  18. Assessment and improvement of sound quality in cochlear implant users.

    Science.gov (United States)

    Caldwell, Meredith T; Jiam, Nicole T; Limb, Charles J

    2017-06-01

    Cochlear implants (CIs) have successfully provided speech perception to individuals with sensorineural hearing loss. Recent research has focused on more challenging acoustic stimuli such as music and voice emotion. The purpose of this review is to evaluate and describe sound quality in CI users with the purpose of summarizing novel findings and crucial information about how CI users experience complex sounds. Here we review the existing literature on PubMed and Scopus to present what is known about perceptual sound quality in CI users, discuss existing measures of sound quality, explore how sound quality may be effectively studied, and examine potential strategies of improving sound quality in the CI population. Sound quality, defined here as the perceived richness of an auditory stimulus, is an attribute of implant-mediated listening that remains poorly studied. Sound quality is distinct from appraisal, which is generally defined as the subjective likability or pleasantness of a sound. Existing studies suggest that sound quality perception in the CI population is limited by a range of factors, most notably pitch distortion and dynamic range compression. Although there are currently very few objective measures of sound quality, the CI-MUSHRA has been used as a means of evaluating sound quality. There exist a number of promising strategies to improve sound quality perception in the CI population including apical cochlear stimulation, pitch tuning, and noise reduction processing strategies. In the published literature, sound quality perception is severely limited among CI users. Future research should focus on developing systematic, objective, and quantitative sound quality metrics and designing therapies to mitigate poor sound quality perception in CI users.

  19. Applying cybernetic technology to diagnose human pulmonary sounds.

    Science.gov (United States)

    Chen, Mei-Yung; Chou, Cheng-Han

    2014-06-01

    Chest auscultation is a crucial and efficient method for diagnosing lung disease; however, it is a subjective process that relies on physician experience and the ability to differentiate between various sound patterns. Because the physiological signals composed of heart sounds and pulmonary sounds (PSs) lie mostly below 120 Hz and the human ear is not sensitive to low frequencies, successfully making diagnostic classifications is difficult. To solve this problem, we constructed various PS recognition systems for classifying six PS classes: vesicular breath sounds, bronchial breath sounds, tracheal breath sounds, crackles, wheezes, and stridor sounds. First, we used a piezoelectric microphone and data acquisition card to acquire PS signals and perform signal preprocessing. A wavelet transform was used for feature extraction, and the PS signals were decomposed into frequency subbands. Using a statistical method, we extracted 17 features that were used as the input vectors of a neural network. We proposed a 2-stage classifier combining a back-propagation (BP) neural network and a learning vector quantization (LVQ) neural network, which improves classification accuracy over using a single neural network. The receiver operating characteristic (ROC) curve verifies the high performance level of the neural network. To expand traditional auscultation methods, we constructed various PS diagnostic systems that can correctly classify the six common PSs. The proposed device overcomes the lack of human sensitivity to low-frequency sounds; various PS waveforms, characteristic values, and spectral analysis charts are provided to elucidate the design of the human-machine interface.
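
    The wavelet feature extraction stage can be sketched as follows; the paper extracted 17 statistical features, but their exact definitions are not given in the abstract, so the subband statistics below are illustrative placeholders:

```python
import numpy as np
import pywt

def ps_features(x, wavelet="db4", level=7):
    """Sketch of wavelet-based feature extraction for pulmonary sounds:
    decompose the signal into subbands and take simple statistics of each.
    The wavelet family, level, and statistics are assumptions."""
    coeffs = pywt.wavedec(x, wavelet, level=level)
    feats = []
    for c in coeffs:
        feats += [np.mean(np.abs(c)), np.std(c)]   # two statistics per subband
    return np.array(feats)                          # feature vector for a classifier
```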

  20. Singing in groups for Parkinson's disease (SING-PD): a pilot study of group singing therapy for PD-related voice/speech disorders.

    Science.gov (United States)

    Shih, Ludy C; Piel, Jordan; Warren, Amanda; Kraics, Lauren; Silver, Althea; Vanderhorst, Veronique; Simon, David K; Tarsy, Daniel

    2012-06-01

    Parkinson's disease-related speech and voice impairments have a significant impact on quality-of-life measures. LSVT(®)LOUD voice and speech therapy (Lee Silverman Voice Therapy) has demonstrated scientific efficacy and clinical effectiveness, but musically based voice and speech therapy has been underexplored as a potentially useful method of rehabilitation. We undertook a pilot, open-label study of a group-based singing intervention, consisting of twelve 90-min weekly sessions led by a voice and speech therapist/singing instructor. The primary outcome measure of vocal loudness, as measured by sound pressure level (SPL) at 50 cm during connected speech, was not significantly different one week after the intervention or at 13 weeks after the intervention. A number of secondary measures reflecting pitch range, phonation time and maximum loudness also were unchanged. Voice-related quality of life (VRQOL) and voice handicap index (VHI) also were unchanged. This study suggests that a group singing therapy intervention at this intensity and frequency does not result in significant improvement in objective and subjectively rated measures of voice and speech impairment. Copyright © 2012 Elsevier Ltd. All rights reserved.

  1. [Voice disorders in female teachers assessed by Voice Handicap Index].

    Science.gov (United States)

    Niebudek-Bogusz, Ewa; Kuzańska, Anna; Woźnicka, Ewelina; Sliwińska-Kowalska, Mariola

    2007-01-01

    The aim of this study was to assess the application of the Voice Handicap Index (VHI) in the diagnosis of occupational voice disorders in female teachers. The subjective assessment of voice by VHI was performed in fifty subjects with dysphonia diagnosed by laryngovideostroboscopic examination. The control group comprised 30 women whose jobs did not involve vocal effort. The results for the total VHI score and each of its subscales (functional, emotional and physical) were significantly worse in the study group than in controls. Teachers typically estimated their own voice problems as a moderate disability, while 12% of them reported severe voice disability. However, all non-teachers assessed their voice problems as slight; their results ranged at the lowest level of the VHI score. This study confirmed that the VHI, as a tool for self-assessment of voice, can be a significant contribution to the diagnosis of occupational dysphonia.

  2. Constraints on decay of environmental sound memory in adult rats.

    Science.gov (United States)

    Sakai, Masashi

    2006-11-27

    When adult rats are pretreated with a 48-h-long 'repetitive nonreinforced sound exposure', performance in two-sound discriminative operant conditioning transiently improves. We have already proven that this 'sound exposure-enhanced discrimination' is dependent upon enhancement of the perceptual capacity of the auditory cortex. This study investigated principles governing the decay of sound exposure-enhanced discrimination. Sound exposure-enhanced discrimination disappeared within approximately 72 h if animals were deprived of environmental sounds after sound exposure, and this shortened to less than approximately 60 h if they were exposed to environmental sounds in the animal room. Sound deprivation itself exerted no clear effects. These findings suggest that the memory of a passively exposed, behaviorally irrelevant sound signal does not merely decay over an intrinsic lifetime but is also degraded by other incoming signals.

  3. Investigation of a glottal related harmonics-to-noise ratio and spectral tilt as indicators of glottal noise in synthesized and human voice signals.

    LENUS (Irish Health Repository)

    Murphy, Peter J

    2008-03-01

    The harmonics-to-noise ratio (HNR) of the voiced speech signal has implicitly been used to infer information regarding the turbulent noise level at the glottis. However, two problems exist for inferring glottal noise attributes from the HNR of the speech wave form: (i) the measure is fundamental frequency (f0) dependent for equal levels of glottal noise, and (ii) any deviation from signal periodicity affects the ratio, not just turbulent noise. An alternative harmonics-to-noise ratio formulation [glottal related HNR (GHNR')] is proposed to overcome the former problem. In GHNR' a mean over the spectral range of interest of the HNRs at specific harmonic/between-harmonic frequencies (expressed in linear scale) is calculated. For the latter issue [(ii)] two spectral tilt measures are shown, using synthesis data, to be sensitive to glottal noise while at the same time being comparatively insensitive to other glottal aperiodicities. The theoretical development predicts that the spectral tilt measures reduce as noise levels increase. A conventional HNR estimator, GHNR' and two spectral tilt measures are applied to a data set of 13 pathological and 12 normal voice samples. One of the tilt measures and GHNR' are shown to provide statistically significant differentiating power over a conventional HNR estimator.
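
    For context, conventional HNR estimators of the kind GHNR' is compared against are often autocorrelation-based. A minimal sketch, assuming a single voiced frame and a plausible f0 search range (this is not the paper's GHNR' method), is:

```python
import numpy as np

def hnr_acf(frame, fs, f0_min=75.0, f0_max=400.0):
    """Autocorrelation-based HNR sketch: the normalized autocorrelation peak r
    at the pitch lag gives HNR = 10*log10(r / (1 - r))."""
    frame = frame - np.mean(frame)
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    acf = acf / (acf[0] + 1e-12)                 # normalize so acf[0] == 1
    lo, hi = int(fs / f0_max), int(fs / f0_min)  # lag range for the f0 search
    r = np.max(acf[lo:hi])
    r = min(max(r, 1e-6), 1 - 1e-6)              # keep the log argument finite
    return 10.0 * np.log10(r / (1.0 - r))        # HNR in dB
```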

  4. Singing voice outcomes following singing voice therapy.

    Science.gov (United States)

    Dastolfo-Hromack, Christina; Thomas, Tracey L; Rosen, Clark A; Gartner-Schmidt, Jackie

    2016-11-01

    The objectives of this study were to describe singing voice therapy (SVT), describe referred patient characteristics, and document the outcomes of SVT. Retrospective. Records of patients receiving SVT between June 2008 and June 2013 were reviewed (n = 51). All diagnoses were included. Demographic information, number of SVT sessions, and symptom severity were retrieved from the medical record. Symptom severity was measured via the 10-item Singing Voice Handicap Index (SVHI-10). Treatment outcome was analyzed by diagnosis, history of previous training, and SVHI-10. SVHI-10 scores decreased significantly following SVT (mean change = 11, a 40% decrease). Patients receiving singing lessons (n = 10) also completed an average of three SVT sessions. Primary muscle tension dysphonia (MTD1) and benign vocal fold lesion (lesion) were the most common diagnoses. Most patients (60%) had previous vocal training. The SVHI-10 decrease was not significantly different between MTD and lesion. This is the first outcome-based study of SVT in a disordered population. Diagnosis of MTD or lesion did not influence treatment outcomes. Duration of SVT was short (approximately three sessions). Voice care providers are encouraged to partner with a singing voice therapist to provide optimal care for the singing voice. This study supports the use of SVT as a tool for the treatment of singing voice disorders. 4 Laryngoscope, 126:2546-2551, 2016. © 2016 The American Laryngological, Rhinological and Otological Society, Inc.

  5. Effects of consensus training on the reliability of auditory perceptual ratings of voice quality.

    Science.gov (United States)

    Iwarsson, Jenny; Reinholt Petersen, Niels

    2012-05-01

    This study investigates the effect of consensus training of listeners on intrarater and interrater reliability and agreement of perceptual voice analysis. The use of such training, including a reference voice sample, could be assumed to make the internal standards held in memory common and more robust, which is of great importance to reduce the variability of auditory perceptual ratings. A prospective design with testing before and after training. Thirteen students of audiologopedics served as listening subjects. The ratings were made using a multidimensional protocol with four-point equal-appearing interval scales. The stimuli consisted of text reading by authentic dysphonic patients. The consensus training for each perceptual voice parameter included (1) definition, (2) underlying physiology, (3) presentation of carefully selected sound examples representing the parameter in three different grades followed by group discussions of perceived characteristics, and (4) practical exercises including imitation to make use of the listeners' proprioception. Intrarater reliability and agreement showed a marked improvement for intermittent aphonia but not for vocal fry. Interrater reliability was high for most parameters before training with a slight increase after training. Interrater agreement showed marked increases for most voice quality parameters as a result of the training. The results support the recommendation of specific consensus training, including use of a reference voice sample material, to calibrate, equalize, and stabilize the internal standards held in memory by the listeners. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  6. Auditory comprehension: from the voice up to the single word level

    OpenAIRE

    Jones, Anna Barbara

    2016-01-01

    Auditory comprehension, the ability to understand spoken language, consists of a number of different auditory processing skills. In the five studies presented in this thesis I investigated both intact and impaired auditory comprehension at different levels: voice versus phoneme perception, as well as single word auditory comprehension in terms of phonemic and semantic content. In the first study, using sounds from different continua of ‘male’-/pæ/ to ‘female’-/tæ/ and ‘male’...

  7. Cross-Modal Correspondence between Brightness and Chinese Speech Sound with Aspiration

    Directory of Open Access Journals (Sweden)

    Sachiko Hirata

    2011-10-01

    Full Text Available Phonetic symbolism is the phenomenon of speech sounds evoking images based on sensory experiences; it is often discussed together with cross-modal correspondence. Using Garner's task, Hirata, Kita, and Ukita (2009) showed a cross-modal congruence between brightness and voiced/voiceless consonants in Japanese speech sounds, a case of phonetic symbolism. In the present study, we examined the effect of the meaning of mimetics (lexical words whose sound reflects their meaning, like "ding-dong") in the Japanese language on cross-modal correspondence. We conducted an experiment with Chinese speech sounds with or without aspiration, using Chinese participants. Chinese vocabulary also contains mimetics, but the presence of aspiration is unrelated to the meaning of Chinese mimetics. As a result, Chinese speech sounds with aspiration, which resemble voiceless consonants, were matched with white, whereas those without aspiration were matched with black. This result is identical to the pattern found in Japanese speakers and consequently suggests that cross-modal correspondence occurs without the effect of the meaning of mimetics. Whether these cross-modal correspondences are based purely on the physical properties of speech sounds or are affected by phonetic properties remains a question for further study.

  8. Vocal parameters and voice-related quality of life in adult women with and without ovarian function.

    Science.gov (United States)

    Ferraz, Pablo Rodrigo Rocha; Bertoldo, Simão Veras; Costa, Luanne Gabrielle Morais; Serra, Emmeliny Cristini Nogueira; Silva, Eduardo Magalhães; Brito, Luciane Maria Oliveira; Chein, Maria Bethânia da Costa

    2013-05-01

    To identify the perceptual and acoustic parameters of voice in adult women with and without ovarian function and the impact on voice-related quality of life. Cross-sectional and analytical study with 106 women divided into two groups: G1, with ovarian function (n=43), and G2, without physiological ovarian function (n=63). The women were instructed to sustain the vowel "a" and the sounds of /s/ and /z/ at habitual pitch and loudness. They were also asked to classify their voices and answer the voice-related quality of life (V-RQOL) questionnaire. The perceptual analysis of the vocal samples was performed by three speech-language pathologists using the GRBASI (G: grade; R: roughness; B: breathiness; A: asthenia; S: strain; I: instability) scale. The acoustic analysis was carried out with the software VoxMetria 2.7h (CTS Informatica). The data were analyzed using descriptive statistics. In the perceptual analysis, both groups showed a mild deviation for the parameters roughness, strain, and instability, but only G2 showed a mild impact for the overall degree of dysphonia. The mean fundamental frequency was significantly lower for G2, with a difference of 17.41 Hz between the two groups. There was no impact on any of the V-RQOL domains for this group. With menopause, women's voices change, affecting some voice parameters. However, there is no direct impact on their voice-related quality of life. Copyright © 2013 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  9. Effects of sounds of locomotion on speech perception

    Directory of Open Access Journals (Sweden)

    Matz Larsson

    2015-01-01

    Full Text Available Human locomotion typically creates noise, a possible consequence of which is the masking of sound signals originating in the surroundings. When walking side by side, people often subconsciously synchronize their steps. The neurophysiological and evolutionary background of this behavior is unclear. The present study investigated the potential of sound created by walking to mask perception of speech and compared the masking produced by walking in step with that produced by unsynchronized walking. The masking sound (footsteps on gravel) and the target sound (speech) were presented through the same speaker to 15 normal-hearing subjects. The original recorded walking sound was modified to mimic the sound of two individuals walking in pace or walking out of synchrony. The participants were instructed to adjust the sound level of the target sound until they could just comprehend the speech signal ("just follow conversation" or JFC level) when presented simultaneously with synchronized or unsynchronized walking sound at 40 dBA, 50 dBA, 60 dBA, or 70 dBA. Synchronized walking sounds produced slightly less masking of speech than did unsynchronized sound. The median JFC threshold in the synchronized condition was 38.5 dBA, while the corresponding value for the unsynchronized condition was 41.2 dBA. Combined results at all sound pressure levels showed an improvement in the signal-to-noise ratio (SNR) for synchronized footsteps; the median difference was 2.7 dB and the mean difference was 1.2 dB [P < 0.001, repeated-measures analysis of variance (RM-ANOVA)]. The difference was significant for masker levels of 50 dBA and 60 dBA, but not for 40 dBA or 70 dBA. This study provides evidence that synchronized walking may reduce the masking potential of footsteps.

  10. Noise Reduction in Breath Sound Files Using Wavelet Transform Based Filter

    Science.gov (United States)

    Syahputra, M. F.; Situmeang, S. I. G.; Rahmat, R. F.; Budiarto, R.

    2017-04-01

    The development of science and technology in the field of healthcare increasingly provides convenience in diagnosing respiratory system problems. Recording breath sounds is one example of these developments. Breath sounds are recorded using a digital stethoscope, and then stored in a file in a sound format. These breath sounds are analyzed by health practitioners to diagnose the symptoms of disease or illness. However, the breath sounds are not free from interference signals. Therefore, a noise filter or signal interference reduction system is required so that the breath sound component which contains the information signal can be clarified. In this study, we designed a wavelet transform based filter. The filter designed in this study uses a Daubechies wavelet with four wavelet transform coefficients. Based on testing with ten types of breath sound data, the largest SNR, 74.3685 dB, was obtained for bronchial sounds.
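
    A wavelet-threshold filter of this kind can be sketched with PyWavelets. "Four wavelet transform coefficients" is ambiguous between Daubechies orders (db2 has four filter taps, db4 has eight), so the wavelet choice and the universal soft threshold below are assumptions:

```python
import numpy as np
import pywt

def wavelet_denoise(x, wavelet="db4", level=5):
    """Sketch of a wavelet-threshold noise filter: decompose, soft-threshold
    the detail coefficients, reconstruct."""
    coeffs = pywt.wavedec(x, wavelet, level=level)
    # Estimate the noise level from the finest-scale details (robust MAD rule).
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thr = sigma * np.sqrt(2 * np.log(len(x)))            # universal threshold
    coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[:len(x)]
```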

  11. Takete and Maluma in Action: A Cross-Modal Relationship between Gestures and Sounds.

    Directory of Open Access Journals (Sweden)

    Kazuko Shinohara

    Full Text Available Despite Saussure's famous observation that sound-meaning relationships are in principle arbitrary, we now have a substantial body of evidence that sounds themselves can have meanings, patterns often referred to as "sound symbolism". Previous studies have found that particular sounds can be associated with particular meanings, and also with particular static visual shapes. Less well studied is the association between sounds and dynamic movements. Using a free elicitation method, the current experiment shows that several sound symbolic associations between sounds and dynamic movements exist: (1) front vowels are more likely to be associated with small movements than with large movements; (2) front vowels are more likely to be associated with angular movements than with round movements; (3) obstruents are more likely to be associated with angular movements than with round movements; (4) voiced obstruents are more likely to be associated with large movements than with small movements. All of these results are compatible with the results of the previous studies of sound symbolism using static images or meanings. Overall, the current study supports the hypothesis that particular dynamic motions can be associated with particular sounds. Building on the current results, we discuss a possible practical application of these sound symbolic associations in sports instructions.

  12. Takete and Maluma in Action: A Cross-Modal Relationship between Gestures and Sounds.

    Science.gov (United States)

    Shinohara, Kazuko; Yamauchi, Naoto; Kawahara, Shigeto; Tanaka, Hideyuki

    Despite Saussure's famous observation that sound-meaning relationships are in principle arbitrary, we now have a substantial body of evidence that sounds themselves can have meanings, patterns often referred to as "sound symbolism". Previous studies have found that particular sounds can be associated with particular meanings, and also with particular static visual shapes. Less well studied is the association between sounds and dynamic movements. Using a free elicitation method, the current experiment shows that several sound symbolic associations between sounds and dynamic movements exist: (1) front vowels are more likely to be associated with small movements than with large movements; (2) front vowels are more likely to be associated with angular movements than with round movements; (3) obstruents are more likely to be associated with angular movements than with round movements; (4) voiced obstruents are more likely to be associated with large movements than with small movements. All of these results are compatible with the results of the previous studies of sound symbolism using static images or meanings. Overall, the current study supports the hypothesis that particular dynamic motions can be associated with particular sounds. Building on the current results, we discuss a possible practical application of these sound symbolic associations in sports instructions.

  13. Enhanced Living by Assessing Voice Pathology Using a Co-Occurrence Matrix

    OpenAIRE

    Muhammad, Ghulam; Alhamid, Mohammed F.; Hossain, M. Shamim; Almogren, Ahmad S.; Vasilakos, Athanasios V.

    2017-01-01

    A large part of the population around the world suffers from various disabilities. Disabilities affect not only children but also adults of different professions. Smart technology can assist the disabled population and lead to a comfortable life in an enhanced living environment (ELE). In this paper, we propose an effective voice pathology assessment system that works in a smart home framework. The proposed system takes input from various sensors, and processes the acquired voice signals an...

  14. Perception of acoustic scale and size in musical instrument sounds.

    Science.gov (United States)

    van Dinther, Ralph; Patterson, Roy D

    2006-10-01

    There is size information in natural sounds. For example, as humans grow in height, their vocal tracts increase in length, producing a predictable decrease in the formant frequencies of speech sounds. Recent studies have shown that listeners can make fine discriminations about which of two speakers has the longer vocal tract, supporting the view that the auditory system discriminates changes on the acoustic-scale dimension. Listeners can also recognize vowels scaled well beyond the range of vocal tracts normally experienced, indicating that perception is robust to changes in acoustic scale. This paper reports two perceptual experiments designed to extend research on acoustic scale and size perception to the domain of musical sounds: The first study shows that listeners can discriminate the scale of musical instrument sounds reliably, although not quite as well as for voices. The second experiment shows that listeners can recognize the family of an instrument sound which has been modified in pitch and scale beyond the range of normal experience. We conclude that processing of acoustic scale in music perception is very similar to processing of acoustic scale in speech perception.

  15. Can blind persons accurately assess body size from the voice?

    Science.gov (United States)

    Pisanski, Katarzyna; Oleszkiewicz, Anna; Sorokowska, Agnieszka

    2016-04-01

    Vocal tract resonances provide reliable information about a speaker's body size that human listeners use for biosocial judgements as well as speech recognition. Although humans can accurately assess men's relative body size from the voice alone, how this ability is acquired remains unknown. In this study, we test the prediction that accurate voice-based size estimation is possible without prior audiovisual experience linking low frequencies to large bodies. Ninety-one healthy congenitally or early blind, late blind and sighted adults (aged 20-65) participated in the study. On the basis of vowel sounds alone, participants assessed the relative body sizes of male pairs of varying heights. Accuracy of voice-based body size assessments significantly exceeded chance and did not differ among participants who were sighted, or congenitally blind or who had lost their sight later in life. Accuracy increased significantly with relative differences in physical height between men, suggesting that both blind and sighted participants used reliable vocal cues to size (i.e. vocal tract resonances). Our findings demonstrate that prior visual experience is not necessary for accurate body size estimation. This capacity, integral to both nonverbal communication and speech perception, may be present at birth or may generalize from broader cross-modal correspondences. © 2016 The Author(s).

  16. Similarities between the irrelevant sound effect and the suffix effect.

    Science.gov (United States)

    Hanley, J Richard; Bourgaize, Jake

    2018-03-29

    Although articulatory suppression abolishes the effect of irrelevant sound (ISE) on serial recall when sequences are presented visually, the effect persists with auditory presentation of list items. Two experiments were designed to test the claim that, when articulation is suppressed, the effect of irrelevant sound on the retention of auditory lists resembles a suffix effect. A suffix is a spoken word that immediately follows the final item in a list. Even though participants are told to ignore it, the suffix impairs serial recall of auditory lists. In Experiment 1, the irrelevant sound consisted of instrumental music. The music generated a significant ISE that was abolished by articulatory suppression. It therefore appears that, when articulation is suppressed, irrelevant sound must contain speech for it to have any effect on recall. This is consistent with what is known about the suffix effect. In Experiment 2, the effect of irrelevant sound under articulatory suppression was greater when the irrelevant sound was spoken by the same voice that presented the list items. This outcome is again consistent with the known characteristics of the suffix effect. It therefore appears that, when rehearsal is suppressed, irrelevant sound disrupts the acoustic-perceptual encoding of auditorily presented list items. There is no evidence that the persistence of the ISE under suppression is a result of interference to the representation of list items in a postcategorical phonological store.

  17. Sounds scary? Lack of habituation following the presentation of novel sounds.

    Directory of Open Access Journals (Sweden)

    Tine A Biedenweg

    Full Text Available BACKGROUND: Animals typically show less habituation to biologically meaningful sounds than to novel signals. We might therefore expect that acoustic deterrents should be based on natural sounds. METHODOLOGY: We investigated responses by western grey kangaroos (Macropus fuliginosus) towards playback of natural sounds (alarm foot stomps and Australian raven (Corvus coronoides) calls) and artificial sounds (faux snake hiss and bull whip crack). We then increased the rate of presentation to examine whether animals would habituate. Finally, we varied the frequency of playback to investigate optimal rates of delivery. PRINCIPAL FINDINGS: Nine behaviors clustered into five Principal Components. PC factors 1 and 2 (animals alert or looking, or hopping and moving out of area) accounted for 36% of variance. PC factor 3 (eating cessation, taking flight, movement out of area) accounted for 13% of variance. Factors 4 and 5 (relaxing, grooming and walking; 12 and 11% of variation, respectively) discontinued upon playback. The whip crack was most evocative; eating was reduced from 75% of time spent prior to playback to 6% following playback (post alarm stomp: 32%, raven call: 49%, hiss: 75%). Additionally, 24% of individuals took flight and moved out of the area (50 m radius) in response to the whip crack (foot stomp: 0%, raven call: 8% and 4%, hiss: 6%). Increasing the rate of presentation (12x/min for 2 min) caused 71% of animals to move out of the area. CONCLUSIONS/SIGNIFICANCE: The bull whip crack, an artificial sound, was as effective as the alarm stomp at eliciting aversive behaviors. Kangaroos did not fully habituate despite hearing the signal up to 20x/min. Highest rates of playback did not elicit the greatest responses, suggesting that 'more is not always better'. Ultimately, by utilizing both artificial and biological sounds, predictability may be masked or offset, so that habituation is delayed and more effective deterrents may be produced.

  18. Voice Habits and Behaviors: Voice Care Among Flamenco Singers.

    Science.gov (United States)

    Garzón García, Marina; Muñoz López, Juana; Y Mendoza Lara, Elvira

    2017-03-01

    The purpose of this study is to analyze the vocal behavior of flamenco singers, as compared with classical music singers, to establish a differential vocal profile of voice habits and behaviors in flamenco music. Bibliographic review was conducted, and the Singer's Vocal Habits Questionnaire, an experimental tool designed by the authors to gather data regarding hygiene behavior, drinking and smoking habits, type of practice, voice care, and symptomatology perceived in both the singing and the speaking voice, was administered. We interviewed 94 singers, divided into two groups: the flamenco experimental group (FEG, n = 48) and the classical control group (CCG, n = 46). Frequency analysis, a Likert scale, and discriminant and exploratory factor analysis were used to obtain a differential profile for each group. The FEG scored higher than the CCG in speaking voice symptomatology. The FEG scored significantly higher than the CCG in use of "inadequate vocal technique" when singing. Regarding voice habits, the FEG scored higher in "lack of practice and warm-up" and "environmental habits." A total of 92.6% of the subjects classified themselves correctly in each group. The Singer's Vocal Habits Questionnaire has proven effective in differentiating flamenco and classical singers. Flamenco singers are exposed to numerous vocal risk factors that make them more prone to vocal fatigue, mucosa dehydration, phonotrauma, and muscle stiffness than classical singers. Further research is needed in voice training in flamenco music, as a means to strengthen the voice and enable it to meet the requirements of this musical genre. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  19. Spike-timing-based computation in sound localization.

    Directory of Open Access Journals (Sweden)

    Dan F M Goodman

    2010-11-01

    Full Text Available Spike timing is precise in the auditory system and it has been argued that it conveys information about auditory stimuli, in particular about the location of a sound source. However, beyond simple time differences, the way in which neurons might extract this information is unclear and the potential computational advantages are unknown. The computational difficulty of this task for an animal is to locate the source of an unexpected sound from two monaural signals that are highly dependent on the unknown source signal. In neuron models consisting of spectro-temporal filtering and spiking nonlinearity, we found that the binaural structure induced by spatialized sounds is mapped to synchrony patterns that depend on source location rather than on source signal. Location-specific synchrony patterns would then result in the activation of location-specific assemblies of postsynaptic neurons. We designed a spiking neuron model which exploited this principle to locate a variety of sound sources in a virtual acoustic environment using measured human head-related transfer functions. The model was able to accurately estimate the location of previously unknown sounds in both azimuth and elevation (including front/back discrimination) in a known acoustic environment. We found that multiple representations of different acoustic environments could coexist as sets of overlapping neural assemblies which could be associated with spatial locations by Hebbian learning. The model demonstrates the computational relevance of relative spike timing to extract spatial information about sources independently of the source signal.
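
    By way of contrast with the spiking model, the classical non-spiking baseline for binaural localization is a cross-correlation estimate of the interaural time difference (ITD) mapped to azimuth. The head radius and the simple spherical-head ITD-to-angle model below are assumptions:

```python
import numpy as np

def itd_azimuth(left, right, fs, max_itd_s=0.7e-3, head_radius=0.0875, c=343.0):
    """Estimate azimuth from a binaural pair via cross-correlation ITD.
    Sign convention: positive lag means the sound reaches the right ear first."""
    xc = np.correlate(left, right, mode="full")
    center = len(right) - 1                       # index of zero lag
    max_lag = int(max_itd_s * fs)
    window = xc[center - max_lag:center + max_lag + 1]
    itd = (np.argmax(window) - max_lag) / fs
    # Simple spherical-head model: ITD ~ (2 r / c) * sin(azimuth).
    sin_theta = np.clip(itd * c / (2 * head_radius), -1.0, 1.0)
    return float(np.degrees(np.arcsin(sin_theta)))
```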

  20. Integrating cues of social interest and voice pitch in men's preferences for women's voices.

    Science.gov (United States)

    Jones, Benedict C; Feinberg, David R; Debruine, Lisa M; Little, Anthony C; Vukovic, Jovana

    2008-04-23

    Most previous studies of vocal attractiveness have focused on preferences for physical characteristics of voices such as pitch. Here we examine the content of vocalizations in interaction with such physical traits, finding that vocal cues of social interest modulate the strength of men's preferences for raised pitch in women's voices. Men showed stronger preferences for raised pitch when judging the voices of women who appeared interested in the listener than when judging the voices of women who appeared relatively disinterested in the listener. These findings show that voice preferences are not determined solely by physical properties of voices and that men integrate information about voice pitch and the degree of social interest expressed by women when forming voice preferences. Women's preferences for raised pitch in women's voices were not modulated by cues of social interest, suggesting that the integration of cues of social interest and voice pitch when men judge the attractiveness of women's voices may reflect adaptations that promote efficient allocation of men's mating effort.

  1. Voice Therapy Practices and Techniques: A Survey of Voice Clinicians.

    Science.gov (United States)

    Mueller, Peter B.; Larson, George W.

    1992-01-01

    Eighty-three voice disorder therapists' ratings of statements regarding voice therapy practices indicated that vocal nodules are the most frequent disorder treated; vocal abuse and hard glottal attack elimination, counseling, and relaxation were preferred treatment approaches; and voice therapy is more effective with adults than with children.…

  2. Vibrotactile Identification of Signal-Processed Sounds from Environmental Events Presented by a Portable Vibrator: A Laboratory Study

    Directory of Open Access Journals (Sweden)

    Parivash Ranjbar

    2008-09-01

    Full Text Available Objectives: To evaluate different signal-processing algorithms for tactile identification of environmental sounds in a monitoring aid for the deafblind. Subjects: Two men and three women, sensorineurally deaf or profoundly hearing impaired with experience of vibratory experiments, aged 22-36 years. Methods: A closed set of 45 representative environmental sounds was processed using two transposing (TRHA, TR1/3) and three modulating algorithms (AM, AMFM, AMMC) and presented as tactile stimuli using a portable vibrator in three experiments. The algorithms TRHA, TR1/3, AMFM and AMMC had two alternatives (with and without adaptation to vibratory thresholds). In Exp. 1, the sounds were preprocessed and directly fed to the vibrator. In Exp. 2 and 3, the sounds were presented in an acoustic test room, without or with background noise (SNR = +5 dB), and processed in real time. Results: In Exp. 1, Algorithms AMFM and AMFM(A) consistently had the lowest identification scores, and were thus excluded in Exp. 2 and 3. TRHA, AM, AMMC, and AMMC(A) showed comparable identification scores (30%-42%) and the addition of noise did not deteriorate the performance. Discussion: Algorithms TRHA, AM, AMMC, and AMMC(A) showed good performance in all three experiments and were robust in noise; they can therefore be used in further testing in real environments.

  3. Two-component network model in voice identification technologies

    Directory of Open Access Journals (Sweden)

    Edita K. Kuular

    2018-03-01

    Full Text Available Among the most important parameters of biometric systems with voice modalities that determine their effectiveness, along with reliability and noise immunity, is the speed of identification and verification of a person. This parameter is especially sensitive when processing large-scale voice databases in real time. Many research studies in this area are aimed at developing new, and improving existing, algorithms for the presentation and processing of voice records to ensure high performance of voice biometric systems. Here, it seems promising to apply a modern approach based on the complex network platform for solving massive problems with a large number of elements, taking their interrelationships into account. Thus, some known works, while solving problems of analysis and recognition of faces from photographs, transform the images into complex networks for their subsequent processing by standard techniques. Among the first applications of complex networks to sound series (musical and speech analysis) is the description of frequency characteristics by constructing network models, i.e. converting the series into networks. On the network ontology platform, a previously proposed technique of audio information representation, aimed at its automatic analysis and speaker recognition, has been developed. This implies converting the information into the form of an associative semantic (cognitive) network structure with both amplitude and frequency components. Two speaker exemplars have been recorded and transformed into pertinent networks with a consequent comparison of their topological metrics. The set of topological metrics for each of the network models (the amplitude and the frequency one) is a vector, and together these combine into a matrix, a digital "network" voiceprint. The proposed network approach, with its sensitivity to personal conditions - physiological, psychological, emotional - might be useful not only for person identification
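
    The paper's associative semantic network is more elaborate than can be reconstructed from the abstract, but the general series-to-network step can be illustrated with a natural visibility graph, whose topological metrics then serve as a "network" voiceprint vector. The graph construction and the chosen metrics below are assumptions for illustration:

```python
import numpy as np
import networkx as nx

def visibility_graph(series):
    """Natural visibility graph of a series: samples i and j are linked if the
    straight line between them clears every intermediate sample. O(n^2) sketch."""
    g = nx.Graph()
    n = len(series)
    g.add_nodes_from(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            k = np.arange(i + 1, j)
            # Visibility criterion: y_k below the i-j chord for all k between them.
            if np.all(series[k] < series[j] +
                      (series[i] - series[j]) * (j - k) / (j - i)):
                g.add_edge(i, j)
    return g

# Topological metrics as a compact "network voiceprint" vector, e.g. computed
# from an amplitude-envelope frame of a recorded speaker:
# g = visibility_graph(np.abs(frame))
# print(nx.average_clustering(g), nx.density(g))
```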

  4. Infants of Depressed Mothers Are Less Responsive To Faces and Voices: A Review

    Science.gov (United States)

    Field, Tiffany; Diego, Miguel; Hernandez-Reif, Maria

    2009-01-01

    A review of our recent research suggests that infants of depressed mothers appeared to be less responsive to faces and voices as early as the neonatal period. At that time they have shown less orienting to the live face/voice stimulus of the Brazelton scale examiner and to their own and other infants’ cry sounds. This lesser responsiveness has been attributed to higher arousal, less attentiveness and less “empathy.” Their delayed heart rate decelerations to instrumental and vocal music sounds have also been ascribed to their delayed attention and/or slower processing. Later at 3–6 months they showed less negative responding to their mothers’ non-contingent and still-face behavior, suggesting that they were more accustomed to this behavior in their mothers. The less responsive behavior of the depressed mothers was further compounded by their comorbid mood states of anger and anxiety and their difficult interaction styles including withdrawn or intrusive interaction styles and their later authoritarian parenting style. Pregnancy massage was effectively used to reduce prenatal depression and facilitate more optimal neonatal behavior. Interaction coaching was used during the postnatal period to help these dyads with their interactions and ultimately facilitate the infants’ development PMID:19439359

  5. Assessing Chronic Stress, Coping Skills, and Mood Disorders through Speech Analysis: A Self-Assessment 'Voice App' for Laptops, Tablets, and Smartphones.

    Science.gov (United States)

    Braun, Silke; Annovazzi, Chiara; Botella, Cristina; Bridler, René; Camussi, Elisabetta; Delfino, Juan P; Mohr, Christine; Moragrega, Ines; Papagno, Costanza; Pisoni, Alberto; Soler, Carla; Seifritz, Erich; Stassen, Hans H

    2016-01-01

    Computerized speech analysis (CSA) is a powerful method that allows one to assess stress-induced mood disturbances and affective disorders through repeated measurements of speaking behavior and voice sound characteristics. Over the past decades CSA has been successfully used in the clinical context to monitor the transition from 'affectively disturbed' to 'normal' among psychiatric patients under treatment. This project, by contrast, aimed to extend the CSA method in such a way that the transition from 'normal' to 'affected' can be detected among subjects of the general population through 10-20 self-assessments. Central to the project was a normative speech study of 5 major languages (English, French, German, Italian, and Spanish). Each language comprised 120 subjects stratified according to gender, age, and education with repeated assessments at 14-day intervals (total n = 697). In a first step, we developed a multivariate model to assess affective state and stress-induced bodily reactions through speaking behavior and voice sound characteristics. Secondly, we determined language-, gender-, and age-specific thresholds that draw a line between 'natural fluctuations' and 'significant changes'. Thirdly, we implemented the model along with the underlying methods and normative data in a self-assessment 'voice app' for laptops, tablets, and smartphones. Finally, a longitudinal self-assessment study of 36 subjects was carried out over 14 days to test the performance of the CSA method in home environments. The data showed that speaking behavior and voice sound characteristics can be quantified in a reproducible and language-independent way. Gender and age explained 15-35% of the observed variance, whereas the educational level had a relatively small effect in the range of 1-3%. The self-assessment 'voice app' was realized in modular form so that additional languages can simply be 'plugged in' once the respective normative data become available. Results of the longitudinal

  6. Error-dependent modulation of speech-induced auditory suppression for pitch-shifted voice feedback

    Directory of Open Access Journals (Sweden)

    Larson Charles R

    2011-06-01

    Background: The motor-driven predictions about expected sensory feedback (efference copies) have been proposed to play an important role in recognition of sensory consequences of self-produced motor actions. In the auditory system, this effect was suggested to result in suppression of sensory neural responses to self-produced voices that are predicted by the efference copies during vocal production in comparison with passive listening to the playback of the identical self-vocalizations. In the present study, event-related potentials (ERPs) were recorded in response to upward pitch shift stimuli (PSS) with five different magnitudes (0, +50, +100, +200 and +400 cents) at voice onset during active vocal production and passive listening to the playback. Results: Results indicated that the suppression of the N1 component during vocal production was largest for unaltered voice feedback (PSS: 0 cents), became smaller as the magnitude of PSS increased to 200 cents, and was almost completely eliminated in response to 400 cents stimuli. Conclusions: Findings of the present study suggest that the brain utilizes the motor predictions (efference copies) to determine the source of incoming stimuli and maximally suppresses the auditory responses to unaltered feedback of self-vocalizations. The reduction of suppression for 50, 100 and 200 cents and its elimination for 400 cents pitch-shifted voice auditory feedback support the idea that motor-driven suppression of voice feedback leads to distinctly different sensory neural processing of self vs. non-self vocalizations. This characteristic may enable the audio-vocal system to more effectively detect and correct for unexpected errors in the feedback of self-produced voice pitch compared with externally-generated sounds.

  7. Error-dependent modulation of speech-induced auditory suppression for pitch-shifted voice feedback.

    Science.gov (United States)

    Behroozmand, Roozbeh; Larson, Charles R

    2011-06-06

    The motor-driven predictions about expected sensory feedback (efference copies) have been proposed to play an important role in recognition of sensory consequences of self-produced motor actions. In the auditory system, this effect was suggested to result in suppression of sensory neural responses to self-produced voices that are predicted by the efference copies during vocal production in comparison with passive listening to the playback of the identical self-vocalizations. In the present study, event-related potentials (ERPs) were recorded in response to upward pitch shift stimuli (PSS) with five different magnitudes (0, +50, +100, +200 and +400 cents) at voice onset during active vocal production and passive listening to the playback. Results indicated that the suppression of the N1 component during vocal production was largest for unaltered voice feedback (PSS: 0 cents), became smaller as the magnitude of PSS increased to 200 cents, and was almost completely eliminated in response to 400 cents stimuli. Findings of the present study suggest that the brain utilizes the motor predictions (efference copies) to determine the source of incoming stimuli and maximally suppresses the auditory responses to unaltered feedback of self-vocalizations. The reduction of suppression for 50, 100 and 200 cents and its elimination for 400 cents pitch-shifted voice auditory feedback support the idea that motor-driven suppression of voice feedback leads to distinctly different sensory neural processing of self vs. non-self vocalizations. This characteristic may enable the audio-vocal system to more effectively detect and correct for unexpected errors in the feedback of self-produced voice pitch compared with externally-generated sounds.
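
    The pitch-shift magnitudes above are given in cents; a shift of c cents multiplies frequency by 2^(c/1200). A minimal helper makes the stimulus magnitudes concrete (the 220 Hz reference F0 is an arbitrary example, not from the study):

    ```python
    # Convert a pitch shift in cents to the resulting frequency.
    def shift_f0(f0_hz: float, cents: float) -> float:
        return f0_hz * 2.0 ** (cents / 1200.0)

    # The study's stimulus magnitudes, applied to an assumed 220 Hz voice:
    for cents in (0, 50, 100, 200, 400):
        print(f"+{cents:3d} cents: {shift_f0(220.0, cents):6.1f} Hz")
    # +400 cents is roughly a major third above the unaltered voice.
    ```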

  8. Electroglottographic analysis of actresses and nonactresses' voices in different levels of intensity.

    Science.gov (United States)

    Master, Suely; Guzman, Marco; Carlos de Miranda, Helder; Lloyd, Adam

    2013-03-01

    Previous studies with long-term average spectrum (LTAS) showed the importance of the glottal source for understanding the projected voices of actresses. In this study, electroglottographic (EGG) analysis was used to investigate the contribution of the glottal source to the projected voice, comparing actresses' and nonactresses' voices at different levels of intensity. Thirty actresses and 30 nonactresses sustained vowels at habitual, moderate, and loud intensity levels. The EGG variables were contact quotient (CQ), closing quotient (QCQ), and opening quotient (QOQ). Other variables were sound pressure level (SPL) and fundamental frequency (F0). A KayPENTAX EGG was used. Variables were entered into a general linear model. Actresses showed significantly higher values for SPL at all levels, and both groups increased SPL significantly while changing from habitual to moderate and further to loud. There were no significant differences between groups for EGG quotients. There were significant differences between the levels only for F0 and CQ for both groups. SPL was significantly higher among actresses at all intensity levels, but in the EGG analysis, no differences were found. This apparently weak contribution of the glottal source in the supposedly projected voices of actresses, contrary to previous LTAS studies, might be because of a higher subglottal pressure or perhaps a greater vocal tract contribution to SPL. Results from the present study suggest that trained subjects did not produce a significantly higher SPL than untrained individuals by increasing the cost in terms of higher vocal fold collision and hence more impact stress. Future research should explore the difference between trained and untrained voices by aerodynamic measurements to evaluate the relationship between physiologic findings and the acoustic and EGG data. Moreover, further studies should consider both types of vocal tasks, sustained vowel and running speech, for both EGG and LTAS analysis.
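
    The EGG quotients above are typically computed per glottal cycle. As an illustration only, here is a minimal threshold-based contact quotient (CQ) sketch; the 25% criterion level and the toy waveform are assumptions, and the study's exact criterion may differ:

    ```python
    # Threshold-based EGG contact quotient: the fraction of each glottal
    # cycle during which the EGG amplitude exceeds a criterion level
    # (here 25% of the cycle's peak-to-peak range above its minimum).
    import numpy as np

    def contact_quotient(egg, sr, f0_hz, level=0.25):
        period = int(round(sr / f0_hz))        # samples per glottal cycle
        cqs = []
        for start in range(0, len(egg) - period, period):
            cyc = egg[start:start + period]
            thresh = cyc.min() + level * (cyc.max() - cyc.min())
            cqs.append(np.mean(cyc > thresh))  # contact fraction of the cycle
        return float(np.mean(cqs))

    sr, f0 = 44100, 200.0
    t = np.arange(int(sr * 0.5)) / sr
    egg = np.sin(2 * np.pi * f0 * t) ** 3      # toy, EGG-like waveform
    print(f"CQ ~ {contact_quotient(egg, sr, f0):.2f}")
    ```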

  9. Speech versus singing: Infants choose happier sounds

    Directory of Open Access Journals (Sweden)

    Marieve eCorbeil

    2013-06-01

    Infants prefer speech to non-vocal sounds and to non-human vocalizations, and they prefer happy-sounding speech to neutral speech. They also exhibit an interest in singing, but there is little knowledge of their relative interest in speech and singing. The present study explored infants' attention to unfamiliar audio samples of speech and singing. In Experiment 1, infants 4-13 months of age were exposed to happy-sounding infant-directed speech versus hummed lullabies by the same woman. They listened significantly longer to the speech, which had considerably greater acoustic variability and expressiveness, than to the lullabies. In Experiment 2, infants of comparable age who heard the lyrics of a Turkish children's song spoken versus sung in a joyful/happy manner did not exhibit differential listening. Infants in Experiment 3 heard the happily sung lyrics of the Turkish children's song versus a version that was spoken in an adult-directed or affectively neutral manner. They listened significantly longer to the sung version. Overall, happy voice quality rather than vocal mode (speech or singing) was the principal contributor to infant attention, regardless of age.

  10. Evaluation of Speech Recognition of Cochlear Implant Recipients Using Adaptive, Digital Remote Microphone Technology and a Speech Enhancement Sound Processing Algorithm.

    Science.gov (United States)

    Wolfe, Jace; Morais, Mila; Schafer, Erin; Agrawal, Smita; Koch, Dawn

    2015-05-01

    Cochlear implant recipients often experience difficulty with understanding speech in the presence of noise. Cochlear implant manufacturers have developed sound processing algorithms designed to improve speech recognition in noise, and research has shown these technologies to be effective. Remote microphone technology utilizing adaptive, digital wireless radio transmission has also been shown to provide significant improvement in speech recognition in noise. There are no studies examining the potential improvement in speech recognition in noise when these two technologies are used simultaneously. The goal of this study was to evaluate the potential benefits and limitations associated with the simultaneous use of a sound processing algorithm designed to improve performance in noise (Advanced Bionics ClearVoice) and a remote microphone system that incorporates adaptive, digital wireless radio transmission (Phonak Roger). A two-by-two repeated measures design was used to compare performance without these technologies, with each technology used separately, and with both technologies used simultaneously. Eleven Advanced Bionics (AB) cochlear implant recipients, ages 11 to 68 yr, participated. AzBio sentence recognition was measured in quiet and in the presence of classroom noise ranging in level from 50 to 80 dBA in 5-dB steps. Performance was evaluated in four conditions: (1) no ClearVoice and no Roger, (2) ClearVoice enabled without the use of Roger, (3) ClearVoice disabled with Roger enabled, and (4) simultaneous use of ClearVoice and Roger. Speech recognition in quiet was better than speech recognition in noise for all conditions. Use of ClearVoice and Roger each provided significant improvement in speech recognition in noise. The best performance in noise was obtained with the simultaneous use of ClearVoice and Roger. ClearVoice and Roger technology each improves speech recognition in noise, particularly when used at the same time.

  11. Voice following radiotherapy

    International Nuclear Information System (INIS)

    Stoicheff, M.L.

    1975-01-01

    This study was undertaken to provide information on the voice of patients following radiotherapy for glottic cancer. Part I presents findings from questionnaires returned by 227 of 235 patients successfully irradiated for glottic cancer from 1960 through 1971. Part II presents preliminary findings on the speaking fundamental frequencies of 22 irradiated patients. Normal to near-normal voice was reported by 83 percent of the 227 patients; however, 80 percent did indicate persisting vocal difficulties such as fatiguing of voice with much usage, inability to sing, reduced loudness, hoarse voice quality and inability to shout. Amount of talking during treatments appeared to affect length of time for voice to recover following treatments in those cases where it took from nine to 26 weeks; also, with increasing years since treatment, patients rated their voices more favorably. Smoking habits following treatments improved significantly with only 27 percent smoking heavily as compared with 65 percent prior to radiation therapy. No correlation was found between smoking (during or after treatments) and vocal ratings or between smoking and length of time for voice to recover. There was no relationship found between reported vocal ratings and stage of the disease

  12. Voices Not Heard: Voice-Use Profiles of Elementary Music Teachers, the Effects of Voice Amplification on Vocal Load, and Perceptions of Issues Surrounding Voice Use

    Science.gov (United States)

    Morrow, Sharon L.

    2009-01-01

    Teachers represent the largest group of occupational voice users and have voice-related problems at a rate of over twice that found in the general population. Among teachers, music teachers are roughly four times more likely than classroom teachers to develop voice-related problems. Although it has been established that music teachers use their…

  13. Dimensionality in voice quality.

    Science.gov (United States)

    Bele, Irene Velsvik

    2007-05-01

    This study concerns speaking voice quality in a group of male teachers (n = 35) and male actors (n = 36); the purpose was to investigate normal and supranormal voices. The goal was the development of a method of valid perceptual evaluation for normal to supranormal and resonant voices. The voices (text reading at two loudness levels) had been evaluated by 10 listeners for 15 vocal characteristics using VA scales. In this investigation, the results of an exploratory factor analysis of the vocal characteristics used in this method are presented, reflecting four dimensions of major importance for normal and supranormal voices. Special emphasis is placed on the effects on voice quality of a change in the loudness variable, as two loudness levels are studied. Furthermore, the vocal characteristics Sonority and Ringing voice quality receive special attention, as the essence of the term "resonant voice" was a basic issue throughout the doctoral dissertation of which this study was part.

  14. [Assessment of voice acoustic parameters in female teachers with diagnosed occupational voice disorders].

    Science.gov (United States)

    Niebudek-Bogusz, Ewa; Fiszer, Marta; Sliwińska-Kowalska, Mariola

    2005-01-01

    Laryngovideostroboscopy is the method most frequently used in the assessment of voice disorders. However, the employment of quantitative methods, such as voice acoustic analysis, is essential for evaluating the effectiveness of prophylactic and therapeutic activities as well as for objective medical certification of larynx pathologies. The aim of this study was to examine voice acoustic parameters in female teachers with occupational voice diseases. Acoustic analysis (IRIS software) was performed in 66 female teachers, including 35 teachers with occupational voice diseases and 31 with functional dysphonia. The teachers with occupational voice diseases presented a lower average fundamental frequency (193 Hz) than the group with functional dysphonia (209 Hz) and the normative value (236 Hz), whereas other acoustic parameters did not differ significantly between the groups. Voice acoustic analysis, when applied separately from vocal loading, cannot be used as a testing method to verify the diagnosis of occupational voice disorders.
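
    The key figure here is the average fundamental frequency. The IRIS software's algorithm is not described in the abstract, so the sketch below uses a generic autocorrelation F0 estimator purely for illustration:

    ```python
    # Generic autocorrelation pitch estimator (not the IRIS algorithm):
    # pick the strongest autocorrelation lag within a plausible F0 range.
    import numpy as np

    def estimate_f0(x, sr, fmin=75.0, fmax=500.0):
        x = x - x.mean()
        ac = np.correlate(x, x, mode="full")[len(x) - 1:]  # non-negative lags
        lo, hi = int(sr / fmax), int(sr / fmin)            # lag search range
        lag = lo + np.argmax(ac[lo:hi])
        return sr / lag

    sr = 16000
    t = np.arange(sr) / sr
    voice = np.sin(2 * np.pi * 193 * t) + 0.3 * np.sin(2 * np.pi * 386 * t)
    print(f"F0 ~ {estimate_f0(voice, sr):.0f} Hz")  # ~193 Hz, as in the study
    ```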

  15. Communication in a noisy environment: Perception of one's own voice and speech enhancement

    Science.gov (United States)

    Le Cocq, Cecile

    Workers in noisy industrial environments are often confronted with communication problems. Many workers complain about not being able to communicate easily with their coworkers when they wear hearing protectors. As a consequence, they tend to remove their protectors, which exposes them to the risk of hearing loss. In fact this communication problem is a double one: first, the hearing protectors modify the perception of one's own voice; second, they interfere with understanding speech from others. This double problem is examined in this thesis. When wearing hearing protectors, the modification of one's own voice perception is partly due to the occlusion effect which is produced when an earplug is inserted in the ear canal. This occlusion effect has two main consequences: first, physiological noises at low frequencies are better perceived; second, the perception of one's own voice is modified. In order to gain a better understanding of this phenomenon, the results in the literature are analyzed systematically, and a new method to quantify the occlusion effect is developed. Instead of stimulating the skull with a bone vibrator or asking the subject to speak, as is usually done in the literature, it was decided to excite the buccal cavity with an acoustic wave. The experiment was designed in such a way that the acoustic wave which excites the buccal cavity does not directly excite the external ear or the rest of the body. The measurement of the hearing threshold with the ear open and occluded was used to quantify the subjective occlusion effect for an acoustic wave in the buccal cavity. These experimental results, as well as those reported in the literature, have led to a better understanding of the occlusion effect and an evaluation of the role of each internal path from the acoustic source to the internal ear. The speech intelligibility from others is altered by both the high sound levels of noisy industrial environments and the speech signal attenuation due to hearing

  16. Muscular tension and body posture in relation to voice handicap and voice quality in teachers with persistent voice complaints.

    Science.gov (United States)

    Kooijman, P G C; de Jong, F I C R S; Oudes, M J; Huinck, W; van Acht, H; Graamans, K

    2005-01-01

    The aim of this study was to investigate the relationship between extrinsic laryngeal muscular hypertonicity and deviant body posture on the one hand and voice handicap and voice quality on the other hand in teachers with persistent voice complaints and a history of voice-related absenteeism. The study group consisted of 25 female teachers. A voice therapist assessed extrinsic laryngeal muscular tension and a physical therapist assessed body posture. The assessed parameters were clustered in categories; the parameters in each category represent the same function. Further, a tension/posture index was created, which is the summation of the different parameters. The different parameters and the index were related to the Voice Handicap Index (VHI) and the Dysphonia Severity Index (DSI). The scores of the VHI and the individual parameters differ significantly, except for posterior weight bearing and tension of the sternocleidomastoid muscle. There was also a significant difference between the individual parameters and the DSI, except for tension of the cricothyroid muscle and posterior weight bearing. The score of the tension/posture index correlates significantly with both the VHI and the DSI. In a linear regression analysis, the combination of hypertonicity of the sternocleidomastoid and geniohyoid muscles and posterior weight bearing is the most important predictor of a high voice handicap. The combination of hypertonicity of the geniohyoid muscle, posterior weight bearing, a high position of the hyoid bone, hypertonicity of the cricothyroid muscle, and anteroposition of the head is the most important predictor of a low DSI score. The results of this study show that the higher the index score, the higher the voice handicap and the worse the voice quality. Moreover, the results indicate the importance of assessing muscular tension and body posture in the diagnosis of voice disorders.

  17. Mindfulness of voices, self-compassion, and secure attachment in relation to the experience of hearing voices.

    Science.gov (United States)

    Dudley, James; Eames, Catrin; Mulligan, John; Fisher, Naomi

    2018-03-01

    Developing compassion towards oneself has been linked to improvement in many areas of psychological well-being, including psychosis. Furthermore, developing a non-judgemental, accepting way of relating to voices is associated with lower levels of distress for people who hear voices. These factors have also been associated with secure attachment. This study explores associations between the constructs of mindfulness of voices, self-compassion, and distress from hearing voices, and how secure attachment style related to each of these variables. The design was a cross-sectional online survey. One hundred and twenty-eight people (73% female; M age = 37.5; 87.5% Caucasian) who currently hear voices completed the Self-Compassion Scale, Southampton Mindfulness of Voices Questionnaire, Relationships Questionnaire, and Hamilton Programme for Schizophrenia Voices Questionnaire. Results showed that mindfulness of voices mediated the relationship between self-compassion and severity of voices, and self-compassion mediated the relationship between mindfulness of voices and severity of voices. Self-compassion and mindfulness of voices were significantly positively correlated with each other and negatively correlated with distress and severity of voices. Mindful relating to voices and self-compassion are associated with reduced distress and severity of voices, which supports the proposed potential benefits of mindful relating to voices and self-compassion as therapeutic skills for people distressed by voice hearing. Greater self-compassion and mindfulness of voices were significantly associated with less distress from voices. These findings support theory underlining compassionate mind training. Mindfulness of voices mediated the relationship between self-compassion and distress from voices, indicating a synergistic relationship between the constructs. Although the current findings do not give a direction of causation, consideration is given to the potential impact of mindful and

  18. Vibrotactile Detection, Identification and Directional Perception of Signal-Processed Sounds from Environmental Events: A Pilot Field Evaluation in Five Cases

    Directory of Open Access Journals (Sweden)

    Parivash Ranjbar

    2008-09-01

    Objectives: Conducting field tests of a vibrotactile aid for deaf/deafblind persons for detection, identification and directional perception of environmental sounds. Methods: Five deaf individuals (3F/2M, 22–36 years) tested the aid separately in a home environment (kitchen) and in a traffic environment. Their eyes were blindfolded; they wore a headband and held a vibrator for sound identification. Three microphones were mounted in the headband, along with two vibrators for signalling the direction of the sound source. The sounds originated from events typical of the home environment and traffic. The subjects were inexperienced (events unknown) and experienced (events known). They identified the events in both the home and traffic environments, but perceived sound source direction only in traffic. Results: The detection scores were higher than 98% both in the home and in the traffic environment. In the home environment, identification scores varied between 25%-58% when the subjects were inexperienced and between 33%-83% when they were experienced. In traffic, identification scores varied between 20%-40% when the subjects were inexperienced and between 22%-56% when they were experienced. The directional perception scores varied between 30%-60% when inexperienced and between 61%-83% when experienced. Discussion: The vibratory aid consistently improved all participants' detection, identification and directional perception ability.

  19. Effects of Bel Canto Training on Acoustic and Aerodynamic Characteristics of the Singing Voice.

    Science.gov (United States)

    McHenry, Monica A; Evans, Joseph; Powitzky, Eric

    2016-03-01

    This study was designed to assess the impact of 2 years of operatic training on acoustic and aerodynamic characteristics of the singing voice. This is a longitudinal study. Participants were 21 graduate students and 16 undergraduate students. They completed a variety of tasks, including laryngeal videostroboscopy, audio recording of pitch range, and singing of syllable trains at full voice in chest, passaggio, and head registers. Inspiration, intraoral pressure, airflow, and sound pressure level (SPL) were captured during the syllable productions. Both graduate and undergraduate students significantly increased semitone range and SPL. The contributions to increased SPL were typically increased inspiration, increased airflow, and reduced laryngeal resistance, although there were individual differences. Two graduate students increased SPL without increased airflow and likely used supraglottal strategies to do so. Students demonstrated improvements in both acoustic and aerodynamic components of singing. Increasing SPL primarily through respiratory drive is a healthy strategy and results from intensive training. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  20. Locating and classification of structure-borne sound occurrence using wavelet transformation

    International Nuclear Information System (INIS)

    Winterstein, Martin; Thurnreiter, Martina

    2011-01-01

    For the surveillance of nuclear facilities with respect to detached or loose parts within the pressure boundary, structure-borne sound detector systems are used. The impact of a loose part on the wall transfers energy to the wall, which is measured as a so-called singular sound event. The run-time differences of the sound signals allow a rough localization of the loose part. The authors performed a finite-element-based simulation of structure-borne sound measurements using real geometries. New knowledge on sound wave propagation, signal analysis and processing, neural networks, and hidden Markov models was considered. Using the wavelet transformation, it is possible to improve the localization of structure-borne sound events.
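
    The run-time-difference idea can be sketched with plain cross-correlation between two sensor channels; the paper's contribution is to improve on this baseline with wavelet analysis. The sampling rate, burst shape, and the 3000 m/s wave speed below are assumptions for illustration:

    ```python
    # Locate a loose-part impact roughly from the arrival-time difference
    # between two structure-borne sound sensors (cross-correlation TDOA).
    import numpy as np

    def tdoa_seconds(a, b, sr):
        """Delay of channel a relative to b (positive: a hears the burst later)."""
        xc = np.correlate(a, b, mode="full")
        return (np.argmax(xc) - (len(b) - 1)) / sr

    def path_difference_m(delay_s, wave_speed=3000.0):  # assumed steel-like speed
        return delay_s * wave_speed

    sr = 100_000
    n = np.arange(400)
    burst = np.exp(-n / 60.0) * np.sin(2 * np.pi * 5000 * n / sr)  # impact burst
    a, b = np.zeros(4096), np.zeros(4096)
    b[1000:1400] = burst              # sensor B hears the impact first
    a[1250:1650] = burst              # sensor A hears it 250 samples later
    d = tdoa_seconds(a, b, sr)
    print(f"delay = {d * 1e3:.2f} ms, path difference ~ {path_difference_m(d):.2f} m")
    ```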

  1. Pitch Based Sound Classification

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch; Hansen, Lars Kai; Kjems, U

    2006-01-01

    A sound classification model is presented that can classify signals into music, noise and speech. The model extracts the pitch of the signal using the harmonic product spectrum. Based on the pitch estimate and a pitch error measure, features are created and used in a probabilistic model with a soft-max output function. Both linear and quadratic inputs are used. The model is trained on 2 hours of sound and tested on publicly available data. A test classification error below 0.05 with 1 s classification windows is achieved. Furthermore, it is shown that linear inputs perform as well as quadratic ones, and that even though classification gets marginally better, not much is achieved by increasing the window size beyond 1 s.
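
    The harmonic product spectrum mentioned above can be written in a few lines: downsample the magnitude spectrum by integer factors and multiply, so that energy accumulates at the fundamental. A minimal sketch of the estimator (the 220 Hz test tone is arbitrary):

    ```python
    # Harmonic product spectrum (HPS) pitch estimate.
    import numpy as np

    def hps_pitch(x, sr, n_harmonics=4):
        spec = np.abs(np.fft.rfft(x * np.hanning(len(x))))
        hps = spec.copy()
        for r in range(2, n_harmonics + 1):
            dec = spec[::r]                    # spectrum decimated by r
            hps[: len(dec)] *= dec             # product accumulates at F0
        k = np.argmax(hps[1:]) + 1             # skip the DC bin
        return k * sr / len(x)                 # bin index -> Hz

    sr = 16000
    t = np.arange(sr) / sr                     # 1 s of a harmonic test tone
    x = sum(np.sin(2 * np.pi * 220 * h * t) / h for h in range(1, 5))
    print(f"pitch ~ {hps_pitch(x, sr):.1f} Hz")  # ~220 Hz
    ```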

  2. Visualization of Broadband Sound Sources

    Directory of Open Access Journals (Sweden)

    Sukhanov Dmitry

    2016-01-01

    In this paper, a method for imaging wideband audio sources is proposed, based on 2D microphone-array measurements of the sound field taken at the same time at all microphones. The designed microphone array consists of 160 microphones and digitizes signals at a sampling frequency of 7200 Hz. The measured signals are processed with a special algorithm that makes it possible to obtain a flat image of wideband sound sources. It is shown experimentally that the visualization does not depend on the waveform, but is determined by the bandwidth. The developed system can visualize sources with a resolution of up to 10 cm.
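
    The paper's algorithm is not spelled out; the textbook baseline it resembles is delay-and-sum beamforming, sketched below for a line of microphones. Only the 7200 Hz rate comes from the abstract; the 8-microphone geometry, spacing, and source angle are invented for illustration:

    ```python
    # Delay-and-sum beamforming: scan azimuth, compensate each microphone's
    # far-field delay, and map the output power (speed of sound 343 m/s).
    import numpy as np

    C, SR = 343.0, 7200
    mics = np.arange(8) * 0.05                 # 8 mics, 5 cm spacing (assumed)

    def power_map(signals, angles_deg):
        powers = []
        for a in np.radians(angles_deg):
            delays = mics * np.sin(a) / C      # far-field inter-mic delays
            shifts = np.round(delays * SR).astype(int)
            summed = sum(np.roll(s, -k) for s, k in zip(signals, shifts))
            powers.append(np.mean(summed ** 2))
        return np.array(powers)

    # Simulate a wideband (noise) source at +20 degrees azimuth.
    rng = np.random.default_rng(0)
    src = rng.standard_normal(SR)
    true = np.round(mics * np.sin(np.radians(20)) / C * SR).astype(int)
    signals = [np.roll(src, k) for k in true]
    angles = np.arange(-90, 91, 2)
    print("estimated azimuth:", angles[np.argmax(power_map(signals, angles))])
    ```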

  3. The Story of a Poet Who Beat Cancer and Became a Squeak: A Sounded Narrative about Art, Education, and the Power of the Human Spirit

    Science.gov (United States)

    Gershon, Walter S.; Van Deventer, George V.

    2013-01-01

    This collaborative piece represents one of the first iterations of a methodological possibility called sounded narratives. It is also a performative piece of sound/art, a narrative about a poet and his voice, stories that are as much about himself as they are about curricular possibilities and the power of art. Based on a pair of over two-hour…

  4. "Voice Forum" The Human Voice as Primary Instrument in Music Therapy

    DEFF Research Database (Denmark)

    Pedersen, Inge Nygaard; Storm, Sanne

    2009-01-01

    Aspects will be drawn on the human voice as a tool for embodying our psychological and physiological state and for attempting the integration of feelings. Presentations and dialogues will cover different methods and techniques in therapy-related body and voice work, as well as the human voice as a tool for non

  5. Does the speaker's voice quality influence children's performance on a language comprehension test?

    Science.gov (United States)

    Lyberg-Åhlander, Viveka; Haake, Magnus; Brännström, Jonas; Schötz, Susanne; Sahlén, Birgitta

    2015-02-01

    A small number of studies have explored children's perception of speakers' voice quality and its possible influence on language comprehension. The aim of this explorative study was to investigate the relationship between the examiner's voice quality, the child's performance on a digital version of a language comprehension test, the Test for Reception of Grammar (TROG-2), and two measures of cognitive functioning. The participants were 86 mainstreamed 8-year-old children with typical language development. Two groups of children (n = 41 and n = 45) were presented with the TROG-2 through recordings of one female speaker: one group heard a typical voice and the other a simulated dysphonic voice. Significant associations were found between executive functioning and language comprehension. The results also showed that children listening to the dysphonic voice achieved significantly lower scores for more difficult sentences ("the man but not the horse jumps") and used more self-corrections on simpler sentences ("the girl is sitting"). Findings suggest that a dysphonic speaker's voice may force the child to allocate capacity to the processing of the voice signal at the expense of comprehension. The findings have implications for clinical and research settings where standardized language tests are used.

  6. The Effects of Musician's Earplugs on Acoustic and Perceptual Measures of Choral and Solo Sound.

    Science.gov (United States)

    Cook-Cunningham, Sheri L

    2017-10-25

    The purpose of this investigation was to assess the effects of earplugs on acoustical and perceptual measures of choral and solo sound. The researcher tested the effects of musician's earplugs on choral and solo timbre and singer perceptions. Members of an intact women's university choir recorded Dona Nobis Pacem under two conditions, no earplugs and with earplugs, over time. Approximately half of the choir members also participated as soloists, recording Over the Rainbow under the same two conditions. All recordings were analyzed using long-term average spectra (LTAS). After participating in each recording session, the participants responded to a questionnaire about their ability to hear themselves (solo and choral context) and their ability to hear others (choral context) under the two conditions, no earplugs and with earplugs. LTAS results revealed that wearing earplugs in a choral setting decreased mean signal energy (>1 dB), resulting in less resonant singing. LTAS results also indicated that wearing earplugs in a solo setting had less effect on mean signal energy, resulting in a smaller mean difference in the solo setting. Findings from this study could provide important information for structuring hearing conservation strategies. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
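
    LTAS is essentially a long-time average of the short-time power spectrum. A minimal sketch using Welch averaging, with a mean-level comparison in dB; the two synthetic 'recordings' are stand-ins scaled to mimic a roughly 1 dB difference:

    ```python
    # Long-term average spectrum (LTAS) via Welch averaging, plus a crude
    # mean spectral level comparison between two recording conditions.
    import numpy as np
    from scipy.signal import welch

    def ltas_db(x, sr):
        f, pxx = welch(x, fs=sr, nperseg=4096)
        return f, 10 * np.log10(pxx + 1e-12)

    def mean_level_db(f, db, lo=0.0, hi=10_000.0):
        band = (f >= lo) & (f <= hi)
        return db[band].mean()

    sr = 44100
    no_plugs   = np.random.randn(sr * 10)          # stand-in condition 1
    with_plugs = 0.85 * np.random.randn(sr * 10)   # stand-in, slightly quieter
    f, d1 = ltas_db(no_plugs, sr)
    _, d2 = ltas_db(with_plugs, sr)
    print(f"mean energy difference: {mean_level_db(f, d1) - mean_level_db(f, d2):.2f} dB")
    ```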

  7. Acoustic markers to differentiate gender in prepubescent children's speaking and singing voice.

    Science.gov (United States)

    Guzman, Marco; Muñoz, Daniel; Vivero, Martin; Marín, Natalia; Ramírez, Mirta; Rivera, María Trinidad; Vidal, Carla; Gerhard, Julia; González, Catalina

    2014-10-01

    This investigation sought to determine whether any acoustic variable can objectively differentiate gender in children with normal voices. A total of 30 children, 15 boys and 15 girls, with perceptually normal voices were examined. They were between 7 and 10 years old (mean: 8.1, SD: 0.7 years). Subjects were required to perform the following phonatory tasks: (1) to phonate sustained vowels [a:], [i:], [u:], (2) to read a phonetically balanced text, and (3) to sing a song. Acoustic analysis included long-term average spectrum (LTAS), fundamental frequency (F0), speaking fundamental frequency (SFF), equivalent continuous sound level (Leq), linear predictive coding (LPC) to obtain formant frequencies, perturbation measures, harmonic-to-noise ratio (HNR), and cepstral peak prominence (CPP). Auditory perceptual analysis was performed by four blinded judges to determine gender. No significant gender-related differences were found for most acoustic variables. Perceptual assessment showed good intra- and inter-rater reliability for gender. Cepstrum for [a:], alpha ratio in text, shimmer for [i:], F3 in [a:], and F3 in [i:] were the parameters that composed the multivariate logistic regression model that best differentiated male and female children's voices. Since perceptual assessment reliably detected gender, it is likely that other acoustic markers (not evaluated in the present study) can differentiate gender more clearly. For example, gender-specific patterns of intonation may be a more accurate feature for differentiating gender in children's voices. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
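
    Of the measures listed, cepstral peak prominence (CPP) is perhaps the least familiar: the height of the cepstral peak above a regression line fitted across the quefrency range of plausible F0. A simplified sketch (real implementations differ in windowing and smoothing details):

    ```python
    # Simplified cepstral peak prominence (CPP) on one analysis window.
    import numpy as np

    def cpp_db(x, sr, fmin=60.0, fmax=500.0):
        spec_db = 20 * np.log10(np.abs(np.fft.rfft(x * np.hanning(len(x)))) + 1e-12)
        ceps = np.fft.irfft(spec_db)          # cepstrum of the dB spectrum
        q = np.arange(len(ceps)) / sr         # quefrency axis in seconds
        sel = (q >= 1 / fmax) & (q <= 1 / fmin)
        slope, intercept = np.polyfit(q[sel], ceps[sel], 1)
        peak = np.argmax(ceps[sel])           # cepstral peak in the F0 range
        return ceps[sel][peak] - (slope * q[sel][peak] + intercept)

    sr = 16000
    t = np.arange(sr // 2) / sr
    voiced = sum(np.sin(2 * np.pi * 240 * h * t) / h for h in range(1, 8))
    noise = np.random.randn(len(t))
    print(f"CPP voiced: {cpp_db(voiced, sr):.1f} dB, noise: {cpp_db(noise, sr):.1f} dB")
    ```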

  8. Writing with Voice

    Science.gov (United States)

    Kesler, Ted

    2012-01-01

    In this Teaching Tips article, the author argues for a dialogic conception of voice, based in the work of Mikhail Bakhtin. He demonstrates a dialogic view of voice in action, using two writing examples about the same topic from his daughter, a fifth-grade student. He then provides five practical tips for teaching a dialogic conception of voice in…

  9. Marshall’s Voice

    Directory of Open Access Journals (Sweden)

    Halper Thomas

    2017-12-01

    Most judicial opinions, for a variety of reasons, do not speak with the voice of identifiable judges, but an analysis of several of John Marshall’s best known opinions reveals a distinctive voice, with its characteristic language and style of argumentation. The power of this voice helps to account for the influence of his views.

  10. How male sound pressure level influences phonotaxis in virgin female Jamaican field crickets (Gryllus assimilis)

    Directory of Open Access Journals (Sweden)

    Karen Pacheco

    2014-06-01

    Understanding female mate preference is important for determining the strength and direction of sexual trait evolution. The sound pressure level (SPL) acoustic signalers use is often an important predictor of mating success because higher sound pressure levels are detectable at greater distances. If females are more attracted to signals produced at higher sound pressure levels, then the potential fitness impacts of signalling at higher sound pressure levels should be elevated beyond what would be expected from detection distance alone. Here we manipulated the sound pressure level of cricket mate attraction signals to determine how female phonotaxis was influenced. We examined female phonotaxis using two common experimental methods: spherical treadmills and open arenas. Both methods showed similar results, with females exhibiting greatest phonotaxis towards loud sound pressure levels relative to the standard signal (69 vs. 60 dB SPL) but showing reduced phonotaxis towards very loud sound pressure level signals relative to the standard (77 vs. 60 dB SPL). Reduced female phonotaxis towards supernormal stimuli may signify an acoustic startle response, an absence of other required sensory cues, or perceived increases in predation risk.

  11. Vocal Qualities in Music Theater Voice: Perceptions of Expert Pedagogues.

    Science.gov (United States)

    Bourne, Tracy; Kenny, Dianna

    2016-01-01

    To gather qualitative descriptions of music theater vocal qualities including belt, legit, and mix from expert pedagogues to better define this voice type. This is a prospective, semistructured interview. Twelve expert teachers from United States, United Kingdom, Asia, and Australia were interviewed by Skype and asked to identify characteristics of music theater vocal qualities including vocal production, physiology, esthetics, pitch range, and pedagogical techniques. Responses were compared with published studies on music theater voice. Belt and legit were generally described as distinct sounds with differing physiological and technical requirements. Teachers were concerned that belt should be taught "safely" to minimize vocal health risks. There was consensus between teachers and published research on the physiology of the glottis and vocal tract; however, teachers were not in agreement about breathing techniques. Neither were teachers in agreement about the meaning of "mix." Most participants described belt as heavily weighted, thick folds, thyroarytenoid-dominant, or chest register; however, there was no consensus on an appropriate term. Belt substyles were named and generally categorized by weightedness or tone color. Descriptions of male belt were less clear than for female belt. This survey provides an overview of expert pedagogical perspectives on the characteristics of belt, legit, and mix qualities in the music theater voice. Although teacher responses are generally in agreement with published research, there are still many controversial issues and gaps in knowledge and understanding of this vocal technique. Breathing techniques, vocal range, mix, male belt, and vocal registers require continuing investigation so that we can learn more about efficient and healthy vocal function in music theater singing. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  12. Differing Roles of the Face and Voice in Early Human Communication: Roots of Language in Multimodal Expression

    Directory of Open Access Journals (Sweden)

    Yuna Jhang

    2017-09-01

    Seeking roots of language, we probed infant facial expressions and vocalizations. Both have roles in language, but the voice plays an especially flexible role, expressing a variety of functions and affect conditions with the same vocal categories—a word can be produced with many different affective flavors. This requirement of language is seen in very early infant vocalizations. We examined the extent to which affect is transmitted by early vocal categories termed “protophones” (squeals, vowel-like sounds, and growls) and by their co-occurring facial expressions, and similarly the extent to which vocal type is transmitted by the voice and co-occurring facial expressions. Our coder agreement data suggest infant affect during protophones was most reliably transmitted by the face (judged in video-only), while vocal type was transmitted most reliably by the voice (judged in audio-only). Voice alone transmitted negative affect more reliably than neutral or positive affect, suggesting infant protophones may be used especially to call for attention when the infant is in distress. By contrast, the face alone provided no significant information about protophone categories. Indeed coders in VID could scarcely recognize the difference between silence and voice when coding protophones in VID. The results suggest that partial decoupling of communicative roles for face and voice occurs even in the first months of life. Affect in infancy appears to be transmitted in a way that audio and video aspects are flexibly interwoven, as in mature language.

  13. Differing Roles of the Face and Voice in Early Human Communication: Roots of Language in Multimodal Expression.

    Science.gov (United States)

    Jhang, Yuna; Franklin, Beau; Ramsdell-Hudock, Heather L; Oller, D Kimbrough

    2017-01-01

    Seeking roots of language, we probed infant facial expressions and vocalizations. Both have roles in language, but the voice plays an especially flexible role, expressing a variety of functions and affect conditions with the same vocal categories-a word can be produced with many different affective flavors. This requirement of language is seen in very early infant vocalizations. We examined the extent to which affect is transmitted by early vocal categories termed "protophones" (squeals, vowel-like sounds, and growls) and by their co-occurring facial expressions, and similarly the extent to which vocal type is transmitted by the voice and co-occurring facial expressions. Our coder agreement data suggest infant affect during protophones was most reliably transmitted by the face (judged in video-only), while vocal type was transmitted most reliably by the voice (judged in audio-only). Voice alone transmitted negative affect more reliably than neutral or positive affect, suggesting infant protophones may be used especially to call for attention when the infant is in distress. By contrast, the face alone provided no significant information about protophone categories. Indeed coders in VID could scarcely recognize the difference between silence and voice when coding protophones in VID. The results suggest that partial decoupling of communicative roles for face and voice occurs even in the first months of life. Affect in infancy appears to be transmitted in a way that audio and video aspects are flexibly interwoven, as in mature language.

  14. SIBYLLE: an expert system for the interpretation in real time of mono-dimensional signals; application to vocal signal

    International Nuclear Information System (INIS)

    Minault, Sophie

    1987-01-01

    This report presents an interactive tool for the computer-aided construction of signal processing and interpretation systems. The tool includes three main parts: an expert system, a rule compiler, and a real-time procedural system. The expert system allows the acquisition of knowledge about the signal. Knowledge has to be formalized as a set of rewriting rules (or syntactic rules) and is introduced through an interactive interface. The compiler compiles the knowledge base (the set of rules) and generates a procedural system equivalent to the expert system. The generated procedural system is fixed, but it is much faster than the expert system and can work in real time. The expert system is used during the experimental phase on a small corpus of data: the knowledge base is tested and, if necessary, modified through the interactive interface. Once the knowledge base is stable enough, the procedural system is generated and tested on a bigger data corpus. This allows significant statistical studies to be performed, which generally induce some corrections at the expert system level. The whole constitutes a tool that combines the flexibility of expert systems with the speed of procedural systems. It has been used to build a set of recognition rule modules for the vocal signal: a module for sound-silence detection, a module for voiced-unvoiced segmentation, and a module for synchronous pitch detection. Its possibilities are not limited to the vocal signal but extend to the processing of any mono-dimensional signal. A feasibility study has been carried out for an electrocardiogram application. (author) [fr]
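
    The generated modules (sound-silence detection, voiced-unvoiced segmentation) can be imagined as compiled decision rules; below is a toy hand-written equivalent using frame energy and zero-crossing rate. The thresholds are illustrative and are not SIBYLLE's actual rules:

    ```python
    # Toy rule-based frame labeller: silence / voiced / unvoiced.
    import numpy as np

    def label_frames(x, sr, frame_ms=20):
        n = int(sr * frame_ms / 1000)
        labels = []
        for i in range(0, len(x) - n, n):
            fr = x[i:i + n]
            energy = float(np.mean(fr ** 2))
            zcr = float(np.mean(np.abs(np.diff(np.sign(fr))) > 0))
            if energy < 1e-4:            # rule 1: low energy -> silence
                labels.append("silence")
            elif zcr > 0.3:              # rule 2: many crossings -> unvoiced
                labels.append("unvoiced")
            else:                        # rule 3: otherwise -> voiced
                labels.append("voiced")
        return labels

    sr = 16000
    t = np.arange(sr // 4) / sr
    clip = np.concatenate([np.zeros(sr // 4),                  # silence
                           0.5 * np.sin(2 * np.pi * 150 * t),  # voiced tone
                           0.1 * np.random.randn(sr // 4)])    # noisy fricative
    print(label_frames(clip, sr)[::4])   # coarse view of the three regions
    ```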

  15. A pneumatic Bionic Voice prosthesis-Pre-clinical trials of controlling the voice onset and offset.

    Science.gov (United States)

    Ahmadi, Farzaneh; Noorian, Farzad; Novakovic, Daniel; van Schaik, André

    2018-01-01

    Despite emergent progress in many fields of bionics, a functional Bionic Voice prosthesis for laryngectomy patients (larynx amputees) has not yet been achieved, leading to a lifetime of vocal disability for these patients. This study introduces a novel framework of Pneumatic Bionic Voice Prostheses as an electronic adaptation of the Pneumatic Artificial Larynx (PAL) device. The PAL is a non-invasive mechanical voice source, driven exclusively by respiration with an exceptionally high voice quality, comparable to the existing gold standard of Tracheoesophageal (TE) voice prosthesis. Following PAL design closely as the reference, Pneumatic Bionic Voice Prostheses seem to have a strong potential to substitute the existing gold standard by generating a similar voice quality while remaining non-invasive and non-surgical. This paper designs the first Pneumatic Bionic Voice prosthesis and evaluates its onset and offset control against the PAL device through pre-clinical trials on one laryngectomy patient. The evaluation on a database of more than five hours of continuous/isolated speech recordings shows a close match between the onset/offset control of the Pneumatic Bionic Voice and the PAL with an accuracy of 98.45 ±0.54%. When implemented in real-time, the Pneumatic Bionic Voice prosthesis controller has an average onset/offset delay of 10 milliseconds compared to the PAL. Hence it addresses a major disadvantage of previous electronic voice prostheses, including myoelectric Bionic Voice, in meeting the short time-frames of controlling the onset/offset of the voice in continuous speech.
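
    A respiration-driven onset/offset controller of the kind evaluated here can be pictured as a hysteresis comparator on a pressure or airflow envelope. Everything below (sensor rate, thresholds, synthetic breath signal) is an assumption for illustration, not the published controller:

    ```python
    # Hysteresis-based voice onset/offset detection on an airflow envelope.
    import numpy as np

    def onsets_offsets(env, on_thresh, off_thresh):
        """Return (onset, offset) sample indices using hysteresis."""
        events, voicing, start = [], False, 0
        for i, v in enumerate(env):
            if not voicing and v > on_thresh:
                voicing, start = True, i          # voice onset
            elif voicing and v < off_thresh:
                voicing = False                   # voice offset
                events.append((start, i))
        return events

    sr = 1000                                     # 1 kHz sensor (assumed)
    t = np.arange(2 * sr) / sr
    breath = np.clip(np.sin(2 * np.pi * 0.75 * t), 0, None)  # exhalation pulses
    for on, off in onsets_offsets(breath, 0.4, 0.2):
        print(f"voice on at {on / sr:.3f} s, off at {off / sr:.3f} s")
    ```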

  16. Voice similarity in identical twins.

    Science.gov (United States)

    Van Gysel, W D; Vercammen, J; Debruyne, F

    2001-01-01

    If people are asked to visually discriminate the two individuals of a monozygotic twin (MT) pair, they mostly get into trouble. Does this problem also exist when listening to twin voices? Twenty female and 10 male MT voice pairs were randomly assembled with one "strange" voice to form voice trios. The listeners (10 female students in Speech and Language Pathology) were asked to label the twins (voices 1-2, 1-3 or 2-3) in two conditions: two standard sentences read aloud and a 2.5-second midsection of a sustained /a/. The proportion of correctly labelled twins was 82% and 63% for female voices and 74% and 52% for male voices, for the sentences and the sustained /a/ respectively, both being significantly greater than chance (33%). The acoustic analysis revealed a high intra-twin correlation for the speaking fundamental frequency (SFF) of the sentences and the fundamental frequency (F0) of the sustained /a/. So the voice pitch could have been a useful characteristic in the perceptual identification of the twins. We conclude that there is a greater perceptual resemblance between the voices of identical twins than between voices without genetic relationship. The identification however is not perfect. The voice pitch possibly contributes to the correct twin identifications.

  17. Tips for Healthy Voices

    Science.gov (United States)

    ... prevent voice problems and maintain a healthy voice: Drink water (stay well hydrated): Keeping your body well hydrated by drinking plenty of water each day (6-8 glasses) is essential to maintaining a healthy voice. The ...

  18. Exploring the perceived harshness of cello sounds by morphing and synthesis techniques.

    Science.gov (United States)

    Rozé, Jocelyn; Aramaki, Mitsuko; Kronland-Martinet, Richard; Ystad, Sølvi

    2017-03-01

    Cello bowing requires a very fine control of the musicians' gestures to ensure the quality of the perceived sound. When the interaction between the bow hair and the string is optimal, the sound is perceived as broad and round. On the other hand, when the gestural control becomes more approximate, the sound quality deteriorates and often becomes harsh, shrill, and quavering. In this study, such a timbre degradation, often described by French cellists as harshness (décharnement), is investigated from both signal and perceptual perspectives. Harsh sounds were obtained from experienced cellists subjected to a postural constraint. A signal approach based on Gabor masks enabled us to capture the main dissimilarities between round and harsh sounds. Two complementary methods perceptually validated these signal features: First, a predictive regression model of the perceived harshness was built from sound continua obtained by a morphing technique. Next, the signal structures identified by the model were validated within a perceptual timbre space, obtained by multidimensional scaling analysis on pairs of synthesized stimuli controlled in harshness. The results revealed that the perceived harshness was due to a combination between a more chaotic harmonic behavior, a formantic emergence, and a weaker attack slope.

  19. Affective state and voice: cross-cultural assessment of speaking behavior and voice sound characteristics--a normative multicenter study of 577 + 36 healthy subjects.

    Science.gov (United States)

    Braun, Silke; Botella, Cristina; Bridler, René; Chmetz, Florian; Delfino, Juan Pablo; Herzig, Daniela; Kluckner, Viktoria J; Mohr, Christine; Moragrega, Ines; Schrag, Yann; Seifritz, Erich; Soler, Carla; Stassen, Hans H

    2014-01-01

    Human speech is greatly influenced by the speakers' affective state, such as sadness, happiness, grief, guilt, fear, anger, aggression, faintheartedness, shame, sexual arousal, love, amongst others. Attentive listeners discover a lot about the affective state of their dialog partners with no great effort, and without having to talk about it explicitly during a conversation or on the phone. On the other hand, speech dysfunctions, such as slow, delayed or monotonous speech, are prominent features of affective disorders. This project comprised four studies with healthy volunteers from Bristol (English: n = 117), Lausanne (French: n = 128), Zurich (German: n = 208), and Valencia (Spanish: n = 124). All samples were stratified according to gender, age, and education. The specific study design with different types of spoken text along with repeated assessments at 14-day intervals allowed us to estimate the 'natural' variation of speech parameters over time, and to analyze the sensitivity of speech parameters with respect to form and content of spoken text. Additionally, our project included a longitudinal self-assessment study with university students from Zurich (n = 18) and unemployed adults from Valencia (n = 18) in order to test the feasibility of the speech analysis method in home environments. The normative data showed that speaking behavior and voice sound characteristics can be quantified in a reproducible and language-independent way. The high resolution of the method was verified by a computerized assignment of speech parameter patterns to languages at a success rate of 90%, while the correct assignment to texts was 70%. In the longitudinal self-assessment study we calculated individual 'baselines' for each test person along with deviations thereof. The significance of such deviations was assessed through the normative reference data. Our data provided gender-, age-, and language-specific thresholds that allow one to reliably distinguish between 'natural

  20. Voice-band Modems: A Device to Transmit Data Over Telephone ...

    Indian Academy of Sciences (India)

    Voice-band Modems: A Device to Transmit Data Over Telephone Networks. 1. Basic Principles of Data Transmission. V U Reddy is with the Electrical Communication Engineering Department, Indian Institute of Science. His research areas are adaptive signal processing, multirate filtering and wavelets, and multi-

  1. Perceiving a stranger's voice as being one's own: a 'rubber voice' illusion?

    Directory of Open Access Journals (Sweden)

    Zane Z Zheng

    2011-04-01

    We describe an illusion in which a stranger's voice, when presented as the auditory concomitant of a participant's own speech, is perceived as a modified version of their own voice. When the congruence between utterance and feedback breaks down, the illusion is also broken. Compared to a baseline condition in which participants heard their own voice as feedback, hearing a stranger's voice induced robust changes in the fundamental frequency (F0) of their production. Moreover, the shift in F0 appears to be feedback dependent, since shift patterns depended reliably on the relationship between the participant's own F0 and the stranger-voice F0. The shift in F0 was evident both when the illusion was present and after it was broken, suggesting that auditory feedback from production may be used separately for self-recognition and for vocal motor control. Our findings indicate that self-recognition of voices, like other body attributes, is malleable and context dependent.

  2. The din of gunfire: Rethinking the role of sound in World War II newsreels

    Directory of Open Access Journals (Sweden)

    Masha Shpolberg

    2014-12-01

    French film historian Laurent Véray has famously called World War I ‘the first media war of the twentieth century’. Newsreels, which first appeared in 1910, brought the war to movie theaters across Europe and the U.S., screening combat for those on the ‘home front’. However, while the audience could see the action it could not hear it – sometimes only live music would accompany the movements of the troops. The arrival of sound newsreels in 1929 radically transformed moviegoers’ experiences of the news, and, by necessity, of armed conflict. Drawing on examples of World War II newsreels from British Pathé’s archive that was recently made available online, this article seeks to delineate the logic governing the combination of voice-over commentary, music, sound effects, and field-recorded sound, and argues that it can be traced directly to the treatment of sound in the ‘Great War’ fiction films of the preceding decade.

  3. Unfamiliar voice identification: Effect of post-event information on accuracy and voice ratings

    Directory of Open Access Journals (Sweden)

    Harriet Mary Jessica Smith

    2014-04-01

    This study addressed the effect of misleading post-event information (PEI) on voice ratings, identification accuracy, and confidence, as well as the link between verbal recall and accuracy. Participants listened to a dialogue between male and female targets, then read misleading information about voice pitch. Participants engaged in verbal recall, rated voices on a feature checklist, and made a lineup decision. Accuracy rates were low, especially on target-absent lineups. Confidence and accuracy were unrelated, but the number of facts recalled about the voice predicted later lineup accuracy. There was a main effect of misinformation on ratings of target voice pitch, but there was no effect on identification accuracy or confidence ratings. As voice lineup evidence from earwitnesses is used in courts, the findings have potential applied relevance.

  4. A pneumatic Bionic Voice prosthesis-Pre-clinical trials of controlling the voice onset and offset.

    Directory of Open Access Journals (Sweden)

    Farzaneh Ahmadi

    Despite emergent progress in many fields of bionics, a functional Bionic Voice prosthesis for laryngectomy patients (larynx amputees) has not yet been achieved, leading to a lifetime of vocal disability for these patients. This study introduces a novel framework of Pneumatic Bionic Voice Prostheses as an electronic adaptation of the Pneumatic Artificial Larynx (PAL) device. The PAL is a non-invasive mechanical voice source, driven exclusively by respiration with an exceptionally high voice quality, comparable to the existing gold standard of Tracheoesophageal (TE) voice prosthesis. Following PAL design closely as the reference, Pneumatic Bionic Voice Prostheses seem to have a strong potential to substitute the existing gold standard by generating a similar voice quality while remaining non-invasive and non-surgical. This paper designs the first Pneumatic Bionic Voice prosthesis and evaluates its onset and offset control against the PAL device through pre-clinical trials on one laryngectomy patient. The evaluation on a database of more than five hours of continuous/isolated speech recordings shows a close match between the onset/offset control of the Pneumatic Bionic Voice and the PAL with an accuracy of 98.45 ±0.54%. When implemented in real-time, the Pneumatic Bionic Voice prosthesis controller has an average onset/offset delay of 10 milliseconds compared to the PAL. Hence it addresses a major disadvantage of previous electronic voice prostheses, including myoelectric Bionic Voice, in meeting the short time-frames of controlling the onset/offset of the voice in continuous speech.

  5. A pneumatic Bionic Voice prosthesis—Pre-clinical trials of controlling the voice onset and offset

    Science.gov (United States)

    Ahmadi, Farzaneh; Noorian, Farzad; Novakovic, Daniel; van Schaik, André

    2018-01-01

    Despite emergent progress in many fields of bionics, a functional Bionic Voice prosthesis for laryngectomy patients (larynx amputees) has not yet been achieved, leading to a lifetime of vocal disability for these patients. This study introduces a novel framework of Pneumatic Bionic Voice Prostheses as an electronic adaptation of the Pneumatic Artificial Larynx (PAL) device. The PAL is a non-invasive mechanical voice source, driven exclusively by respiration with an exceptionally high voice quality, comparable to the existing gold standard of Tracheoesophageal (TE) voice prosthesis. Following PAL design closely as the reference, Pneumatic Bionic Voice Prostheses seem to have a strong potential to substitute the existing gold standard by generating a similar voice quality while remaining non-invasive and non-surgical. This paper designs the first Pneumatic Bionic Voice prosthesis and evaluates its onset and offset control against the PAL device through pre-clinical trials on one laryngectomy patient. The evaluation on a database of more than five hours of continuous/isolated speech recordings shows a close match between the onset/offset control of the Pneumatic Bionic Voice and the PAL with an accuracy of 98.45 ±0.54%. When implemented in real-time, the Pneumatic Bionic Voice prosthesis controller has an average onset/offset delay of 10 milliseconds compared to the PAL. Hence it addresses a major disadvantage of previous electronic voice prostheses, including myoelectric Bionic Voice, in meeting the short time-frames of controlling the onset/offset of the voice in continuous speech. PMID:29466455

  6. Swallowing sound detection using hidden markov modeling of recurrence plot features

    International Nuclear Information System (INIS)

    Aboofazeli, Mohammad; Moussavi, Zahra

    2009-01-01

    Automated detection of swallowing sounds in swallowing and breath sound recordings is of importance for monitoring purposes in which the recording durations are long. This paper presents a novel method for swallowing sound detection using hidden Markov modeling of recurrence plot features. Tracheal sound recordings of 15 healthy and nine dysphagic subjects were studied. The multidimensional state space trajectory of each signal was reconstructed using the Takens method of delays. The sequences of three recurrence plot features of the reconstructed trajectories (which have shown discriminating capability between swallowing and breath sounds) were modeled by three hidden Markov models. The Viterbi algorithm was used for swallowing sound detection. The results were validated manually by inspection of the simultaneously recorded airflow signal and spectrogram of the sounds, and also by auditory means. The experimental results suggested that the performance of the proposed method using hidden Markov modeling of recurrence plot features was superior to the previous swallowing sound detection methods.
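
    The record names the two signal-processing stages but shows no detail; the following is an illustrative sketch (not the authors' code) of the first stage: a Takens delay embedding of a sound frame and two classic recurrence-plot features. Sequences of features like these would then be fed to per-class hidden Markov models. Embedding dimension, delay, and threshold are assumed values.

```python
import numpy as np

def delay_embed(x, dim=3, tau=5):
    """Reconstruct the state-space trajectory by the Takens method of delays."""
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau:i * tau + n] for i in range(dim)])

def recurrence_features(x, dim=3, tau=5, eps=0.1):
    traj = delay_embed(x, dim, tau)
    # Recurrence plot: thresholded pairwise distances between trajectory points.
    d = np.linalg.norm(traj[:, None, :] - traj[None, :, :], axis=-1)
    rp = (d < eps * d.max()).astype(float)
    rec_rate = rp.mean()                      # density of recurrence points
    # Crude determinism proxy: mean recurrence on short diagonals.
    diag = np.mean([np.diag(rp, k).mean() for k in range(1, 10)])
    return rec_rate, diag

frame = np.random.randn(400)                  # stand-in for a tracheal sound frame
print(recurrence_features(frame))
```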

  7. Swallowing sound detection using hidden markov modeling of recurrence plot features

    Energy Technology Data Exchange (ETDEWEB)

    Aboofazeli, Mohammad [Faculty of Engineering, Department of Electrical and Computer Engineering, University of Manitoba, Winnipeg, Manitoba, R3T 5V6 (Canada)], E-mail: umaboofa@cc.umanitoba.ca; Moussavi, Zahra [Faculty of Engineering, Department of Electrical and Computer Engineering, University of Manitoba, Winnipeg, Manitoba, R3T 5V6 (Canada)], E-mail: mousavi@ee.umanitoba.ca

    2009-01-30

    Automated detection of swallowing sounds in swallowing and breath sound recordings is of importance for monitoring purposes in which the recording durations are long. This paper presents a novel method for swallowing sound detection using hidden Markov modeling of recurrence plot features. Tracheal sound recordings of 15 healthy and nine dysphagic subjects were studied. The multidimensional state space trajectory of each signal was reconstructed using the Takens method of delays. The sequences of three recurrence plot features of the reconstructed trajectories (which have shown discriminating capability between swallowing and breath sounds) were modeled by three hidden Markov models. The Viterbi algorithm was used for swallowing sound detection. The results were validated manually by inspection of the simultaneously recorded airflow signal and spectrogram of the sounds, and also by auditory means. The experimental results suggested that the performance of the proposed method using hidden Markov modeling of recurrence plot features was superior to the previous swallowing sound detection methods.

  8. Auditory and Visual Modulation of Temporal Lobe Neurons in Voice-Sensitive and Association Cortices

    Science.gov (United States)

    Perrodin, Catherine; Kayser, Christoph; Logothetis, Nikos K.

    2014-01-01

    Effective interactions between conspecific individuals can depend upon the receiver forming a coherent multisensory representation of communication signals, such as merging voice and face content. Neuroimaging studies have identified face- or voice-sensitive areas (Belin et al., 2000; Petkov et al., 2008; Tsao et al., 2008), some of which have been proposed as candidate regions for face and voice integration (von Kriegstein et al., 2005). However, it was unclear how multisensory influences occur at the neuronal level within voice- or face-sensitive regions, especially compared with classically defined multisensory regions in temporal association cortex (Stein and Stanford, 2008). Here, we characterize auditory (voice) and visual (face) influences on neuronal responses in a right-hemisphere voice-sensitive region in the anterior supratemporal plane (STP) of Rhesus macaques. These results were compared with those in the neighboring superior temporal sulcus (STS). Within the STP, our results show auditory sensitivity to several vocal features, which was not evident in STS units. We also newly identify a functionally distinct neuronal subpopulation in the STP that appears to carry the area's sensitivity to voice identity related features. Audiovisual interactions were prominent in both the STP and STS. However, visual influences modulated the responses of STS neurons with greater specificity and were more often associated with congruent voice-face stimulus pairings than STP neurons. Together, the results reveal the neuronal processes subserving voice-sensitive fMRI activity patterns in primates, generate hypotheses for testing in the visual modality, and clarify the position of voice-sensitive areas within the unisensory and multisensory processing hierarchies. PMID:24523543

  9. Auditory and visual modulation of temporal lobe neurons in voice-sensitive and association cortices.

    Science.gov (United States)

    Perrodin, Catherine; Kayser, Christoph; Logothetis, Nikos K; Petkov, Christopher I

    2014-02-12

    Effective interactions between conspecific individuals can depend upon the receiver forming a coherent multisensory representation of communication signals, such as merging voice and face content. Neuroimaging studies have identified face- or voice-sensitive areas (Belin et al., 2000; Petkov et al., 2008; Tsao et al., 2008), some of which have been proposed as candidate regions for face and voice integration (von Kriegstein et al., 2005). However, it was unclear how multisensory influences occur at the neuronal level within voice- or face-sensitive regions, especially compared with classically defined multisensory regions in temporal association cortex (Stein and Stanford, 2008). Here, we characterize auditory (voice) and visual (face) influences on neuronal responses in a right-hemisphere voice-sensitive region in the anterior supratemporal plane (STP) of Rhesus macaques. These results were compared with those in the neighboring superior temporal sulcus (STS). Within the STP, our results show auditory sensitivity to several vocal features, which was not evident in STS units. We also newly identify a functionally distinct neuronal subpopulation in the STP that appears to carry the area's sensitivity to voice identity related features. Audiovisual interactions were prominent in both the STP and STS. However, visual influences modulated the responses of STS neurons with greater specificity and were more often associated with congruent voice-face stimulus pairings than STP neurons. Together, the results reveal the neuronal processes subserving voice-sensitive fMRI activity patterns in primates, generate hypotheses for testing in the visual modality, and clarify the position of voice-sensitive areas within the unisensory and multisensory processing hierarchies.

  10. Voice Activity Detection Using Fuzzy Entropy and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    R. Johny Elton

    2016-08-01

    Full Text Available This paper proposes support vector machine (SVM)-based voice activity detection using FuzzyEn to improve detection performance under noisy conditions. The proposed voice activity detection (VAD) uses fuzzy entropy (FuzzyEn) as a feature extracted from noise-reduced speech signals to train an SVM model for speech/non-speech classification. The proposed VAD method was tested by conducting various experiments in which real background noises at different signal-to-noise ratios (SNR), ranging from −10 dB to 10 dB, were added to actual speech signals collected from the TIMIT database. The analysis shows that the FuzzyEn feature gives better results in discriminating noise from corrupted noisy speech. The efficacy of the SVM classifier was validated using 10-fold cross-validation. Furthermore, the results obtained by the proposed method were compared with those of previous standardized VAD algorithms as well as recently developed methods. The performance comparison suggests that the proposed method is more efficient in detecting speech under various noisy environments, with an accuracy of 93.29%, and that the FuzzyEn feature detects speech efficiently even at low SNR levels.
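
    A minimal sketch of the feature/classifier pairing described above, assuming common FuzzyEn parameter choices (m = 2, r = 0.2·std, exponential membership of order 2) rather than the paper's exact setup; the training data here are invented placeholders.

```python
import numpy as np
from sklearn.svm import SVC

def fuzzy_entropy(x, m=2, r=0.2, n=2):
    r = r * np.std(x)
    def phi(dim):
        # Baseline-removed embedding vectors of length `dim`.
        vecs = np.array([x[i:i + dim] for i in range(len(x) - dim)])
        vecs = vecs - vecs.mean(axis=1, keepdims=True)
        # Chebyshev distances and fuzzy (exponential) similarity degrees.
        d = np.abs(vecs[:, None, :] - vecs[None, :, :]).max(axis=-1)
        sim = np.exp(-(d ** n) / r)
        np.fill_diagonal(sim, 0.0)
        return sim.sum() / (len(vecs) * (len(vecs) - 1))
    return np.log(phi(m)) - np.log(phi(m + 1))

# Hypothetical training data: FuzzyEn per frame, labels 1=speech, 0=non-speech.
feats = np.array([[fuzzy_entropy(np.random.randn(200))] for _ in range(40)])
labels = np.array([0, 1] * 20)
clf = SVC(kernel="rbf").fit(feats, labels)
```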

  11. Human vocal attractiveness as signaled by body size projection.

    Directory of Open Access Journals (Sweden)

    Yi Xu

    Full Text Available Voice, as a secondary sexual characteristic, is known to affect the perceived attractiveness of human individuals. But the underlying mechanism of vocal attractiveness has remained unclear. Here, we presented human listeners with acoustically altered natural sentences and fully synthetic sentences with systematically manipulated pitch, formants and voice quality based on a principle of body size projection reported for animal calls and emotional human vocal expressions. The results show that male listeners preferred a female voice that signals a small body size, with relatively high pitch, wide formant dispersion and breathy voice, while female listeners preferred a male voice that signals a large body size with low pitch and narrow formant dispersion. Interestingly, however, male vocal attractiveness was also enhanced by breathiness, which presumably softened the aggressiveness associated with a large body size. These results, together with the additional finding that the same vocal dimensions also affect emotion judgment, indicate that humans still employ a vocal interaction strategy used in animal calls despite the development of complex language.

  12. A virtual auditory environment for investigating the auditory signal processing of realistic sounds

    DEFF Research Database (Denmark)

    Favrot, Sylvain Emmanuel; Buchholz, Jörg

    2008-01-01

    In the present study, a novel multichannel loudspeaker-based virtual auditory environment (VAE) is introduced. The VAE aims at providing a versatile research environment for investigating auditory signal processing in real environments, i.e., considering multiple sound sources and room reverberation. The environment is based on the ODEON room acoustic simulation software to render the acoustical scene. ODEON outputs are processed using a combination of different-order Ambisonic techniques to calculate multichannel room impulse responses (mRIR). Auralization is then obtained by the convolution … During the VAE development, special care was taken in order to achieve a realistic auditory percept and to avoid "artifacts" such as unnatural coloration. The performance of the VAE has been evaluated and optimized on a 29-loudspeaker setup using both objective and subjective measurement techniques.

  13. Ultrasound sounding in air by fast-moving receiver

    Science.gov (United States)

    Sukhanov, D.; Erzakova, N.

    2018-05-01

    A method of ultrasound imaging in air with a fast-moving receiver is considered, for the case when the speed of movement of the receiver cannot be neglected with respect to the speed of sound. In this case the Doppler effect is significant, making matched filtering of the backscattered signal difficult. The proposed method does not use a continuous repetitive noise sounding signal; a generalized approach applies spatial matched filtering in the time domain to recover the ultrasonic tomographic images.

  14. Rainforests as concert halls for birds: Are reverberations improving sound transmission of long song elements?

    DEFF Research Database (Denmark)

    Nemeth, Erwin; Dabelsteen, Torben; Pedersen, Simon Boel

    2006-01-01

    In forests reverberations have probably detrimental and beneficial effects on avian communication. They constrain signal discrimination by masking fast repetitive sounds and they improve signal detection by elongating sounds. This ambivalence of reflections for animal signals in forests is similar to the influence of reverberations on speech or music in indoor sound transmission. Since comparisons of sound fields of forests and concert halls have demonstrated that reflections can contribute in both environments a considerable part to the energy of a received sound, it is here assumed that reverberations … that longer sounds are less attenuated. The results indicate that higher sound pressure level is caused by superimposing reflections. It is suggested that this beneficial effect of reverberations explains interspecific birdsong differences in element length. Transmission paths with stronger reverberations …

  15. Toward Inverse Control of Physics-Based Sound Synthesis

    Science.gov (United States)

    Pfalz, A.; Berdahl, E.

    2017-05-01

    Long Short-Term Memory networks (LSTMs) can be trained to realize inverse control of physics-based sound synthesizers. Physics-based sound synthesizers simulate the laws of physics to produce output sound according to input gesture signals. When a user's gestures are measured in real time, she or he can use them to control physics-based sound synthesizers, thereby creating simulated virtual instruments. An intriguing question is how to program a computer to learn to play such physics-based models. This work demonstrates that LSTMs can be trained to accomplish this inverse control task with four physics-based sound synthesizers.
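
    A conceptual sketch of the inverse-control idea only, not the authors' implementation: an LSTM is trained to map synthesizer output sound (represented here as feature frames) back to the gesture signal that produced it. All shapes, sizes, and training data are invented for illustration.

```python
import torch
import torch.nn as nn

class InverseController(nn.Module):
    def __init__(self, n_audio=64, n_gesture=1, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(n_audio, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_gesture)

    def forward(self, sound_frames):          # (batch, time, n_audio)
        h, _ = self.lstm(sound_frames)
        return self.out(h)                    # predicted gesture per time step

model = InverseController()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
sound = torch.randn(8, 100, 64)               # synthesizer output features
gesture = torch.randn(8, 100, 1)              # gestures that produced them
for _ in range(10):                            # toy training loop
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(sound), gesture)
    loss.backward()
    opt.step()
```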

  16. Sound Cross-synthesis and Morphing Using Dictionary-based Methods

    DEFF Research Database (Denmark)

    Collins, Nick; Sturm, Bob L.

    2011-01-01

    Dictionary-based methods (DBMs) provide rich possibilities for new sound transformations; as the analysis dual to granular synthesis, audio signals are decomposed into 'atoms', allowing interesting manipulations. We present various approaches to audio signal cross-synthesis and cross-analysis via atomic decomposition using scale-time-frequency dictionaries. DBMs naturally provide high-level descriptions of a signal and its content, which can allow for greater control over what is modified and how. Through these models, we can make one signal decomposition influence that of another to create cross-synthesized sounds. We present several examples of these techniques both theoretically and practically, and present on-going and further work.
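
    A sketch of the core DBM operation under simplifying assumptions: greedy matching pursuit of a signal over a dictionary (here a tiny random unit-norm dictionary standing in for scale-time-frequency atoms). Cross-synthesis would reuse the atom indices and weights of one signal's decomposition to drive another dictionary or signal.

```python
import numpy as np

def matching_pursuit(x, D, n_iters=20):
    """D: (n_atoms, len(x)) matrix of unit-norm atoms."""
    residual = x.astype(float).copy()
    atoms = []
    for _ in range(n_iters):
        corr = D @ residual                    # inner products with all atoms
        k = np.argmax(np.abs(corr))
        atoms.append((k, corr[k]))
        residual -= corr[k] * D[k]             # subtract best-matching atom
    return atoms, residual

rng = np.random.default_rng(0)
D = rng.standard_normal((256, 512))
D /= np.linalg.norm(D, axis=1, keepdims=True)  # unit-norm atoms
x = rng.standard_normal(512)
atoms, res = matching_pursuit(x, D)
print(len(atoms), np.linalg.norm(res) / np.linalg.norm(x))
```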

  17. Real-Time Detection of Important Sounds with a Wearable Vibration Based Device for Hearing-Impaired People

    Directory of Open Access Journals (Sweden)

    Mete Yağanoğlu

    2018-04-01

    Full Text Available Hearing-impaired people do not hear indoor and outdoor environmental sounds, which are important for them both at home and outside. By means of a wearable device that we have developed, a hearing-impaired person is informed of important sounds through vibrations and can thereby understand what kind of sound it is. Our system, which operates in real time, achieves a success rate of 98% when identifying a doorbell ringing, 99% for an alarm sound, 99% for a phone ringing, 91% for honking, 93% for brake sounds, 96% for dog sounds, 97% for the human voice, and 96% for other sounds, using the audio fingerprint method. An audio fingerprint is a brief summary of an audio file, perceptively summarizing a piece of audio content. In this study, our wearable device was tested 100 times a day for 100 days on five deaf persons and 50 persons with normal hearing whose ears were covered by earphones that provided wind sounds. This study aims to improve the quality of life of deaf persons and to enrich their daily lives. In the questionnaire performed, deaf people rated the clarity of the system at 90%, its usefulness at 97%, and the likelihood of using this device again at 100%.
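
    The record does not detail its fingerprint algorithm, so the following is a generic sketch in the same spirit: local spectrogram peaks are hashed per frame so that short sound events can be matched against stored summaries. Parameters and the toy "database" are invented.

```python
import numpy as np
from scipy.signal import stft

def fingerprint(x, fs=16000, n_peaks=3):
    f, t, Z = stft(x, fs=fs, nperseg=512)
    mag = np.abs(Z)
    hashes = []
    for ti in range(mag.shape[1]):
        peaks = np.argsort(mag[:, ti])[-n_peaks:]   # strongest bins per frame
        hashes.append(tuple(sorted(peaks)))
    return hashes

db = {"doorbell": fingerprint(np.random.randn(16000))}   # toy stored print
query = fingerprint(np.random.randn(16000))
overlap = len(set(db["doorbell"]) & set(query))
print("matching frame hashes:", overlap)
```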

  18. Optical Reading and Playing of Sound Signals from Vinyl Records

    OpenAIRE

    Hensman, Arnold; Casey, Kevin

    2007-01-01

    While advanced digital music systems such as compact disk players and MP3 have become the standard in sound reproduction technology, critics claim that conversion to digital often results in a loss of sound quality and richness. For this reason, vinyl records remain the medium of choice for many audiophiles involved in specialist areas. The waveform cut into a vinyl record is an exact replica of the analogue version from the original source. However, while some perceive this media as reproduc...

  19. Developing a reference of normal lung sounds in healthy Peruvian children.

    Science.gov (United States)

    Ellington, Laura E; Emmanouilidou, Dimitra; Elhilali, Mounya; Gilman, Robert H; Tielsch, James M; Chavez, Miguel A; Marin-Concha, Julio; Figueroa, Dante; West, James; Checkley, William

    2014-10-01

    Lung auscultation has long been a standard of care for the diagnosis of respiratory diseases. Recent advances in electronic auscultation and signal processing have yet to find clinical acceptance; however, computerized lung sound analysis may be ideal for pediatric populations in settings where skilled healthcare providers are commonly unavailable. We described features of normal lung sounds in young children using a novel signal processing approach to lay a foundation for identifying pathologic respiratory sounds. 186 healthy children with normal pulmonary exams and without respiratory complaints were enrolled at a tertiary care hospital in Lima, Peru. Lung sounds were recorded at eight thoracic sites using a digital stethoscope. 151 (81%) of the recordings were eligible for further analysis. Heavy-crying segments were automatically rejected, and features extracted from spectral and temporal signal representations contributed to the profiling of lung sounds. Mean age, height, and weight among study participants were 2.2 years (SD 1.4), 84.7 cm (SD 13.2), and 12.0 kg (SD 3.6), respectively, and 47% were boys. We identified ten distinct spectral and spectro-temporal signal parameters; most demonstrated linear relationships with age, height, and weight, while no differences between genders were noted. Older children had a faster-decaying spectrum than younger ones. Features like spectral peak width, lower-frequency Mel-frequency cepstral coefficients, and spectro-temporal modulations also showed variations with recording site. Extracted lung sound features varied significantly with child characteristics and lung site. A comparison with adult studies revealed differences in the extracted features for children. While sound-reduction techniques will improve analysis, we offer a novel, reproducible tool for sound analysis in real-world environments.
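
    A hedged sketch of two of the kinds of features the study profiles: Mel-frequency cepstral coefficients and a simple spectral-peak-width proxy. The exact feature definitions are the study's own; the signal here is synthetic noise standing in for a lung-sound recording.

```python
import numpy as np
import librosa

sr = 8000
y = np.random.randn(5 * sr).astype(np.float32)      # stand-in for a lung sound
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # (13, n_frames)
spec = np.abs(librosa.stft(y))
mean_spec = spec.mean(axis=1)                       # time-averaged spectrum
peak = mean_spec.argmax()
half = mean_spec >= mean_spec[peak] / 2             # bins above half maximum
print("mean lower-order MFCCs:", mfcc.mean(axis=1)[:4])
print("spectral peak width proxy (bins):", half.sum())
```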

  20. The Role of Occupational Voice Demand and Patient-Rated Impairment in Predicting Voice Therapy Adherence.

    Science.gov (United States)

    Ebersole, Barbara; Soni, Resha S; Moran, Kathleen; Lango, Miriam; Devarajan, Karthik; Jamal, Nausheen

    2018-05-01

    Examine the relationship among the severity of patient-perceived voice impairment, perceptual dysphonia severity, occupational voice demand, and voice therapy adherence. Identify clinical predictors of increased risk for therapy nonadherence. A retrospective cohort study was done of patients presenting with a chief complaint of persistent dysphonia at an interdisciplinary voice center. The Voice Handicap Index-10 (VHI-10) and Voice-Related Quality of Life (V-RQOL) survey scores, clinician rating of dysphonia severity using the Grade score from the Grade, Roughness, Breathiness, Asthenia, and Strain scale, occupational voice demand, and patient demographics were tested for associations with therapy adherence, defined as completion of the treatment plan. Classification and Regression Tree (CART) analysis was performed to establish thresholds for nonadherence risk. Of 166 patients evaluated, 111 were recommended for voice therapy. The therapy nonadherence rate was 56%. Occupational voice demand category, VHI-10, and V-RQOL scores were the only factors significantly correlated with therapy adherence: patients with low occupational voice demand were significantly more likely to be nonadherent with therapy than those with high occupational voice demand. Occupational voice demand and patient perception of impairment are significantly and independently correlated with therapy adherence. A VHI-10 score of ≤9 or a V-RQOL score of >40 is a significant cutoff point for predicting nonadherence risk. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  1. Diagnostic value of voice acoustic analysis in assessment of occupational voice pathologies in teachers.

    Science.gov (United States)

    Niebudek-Bogusz, Ewa; Fiszer, Marta; Kotylo, Piotr; Sliwinska-Kowalska, Mariola

    2006-01-01

    It has been shown that teachers are at risk of developing occupational dysphonia, which accounts for over 25% of all occupational diseases diagnosed in Poland. The most frequently used method of diagnosing voice diseases is videostroboscopy. However, to facilitate objective evaluation of voice efficiency, as well as medical certification of occupational voice disorders, it is crucial to implement quantitative methods of voice assessment, particularly voice acoustic analysis. The aim of the study was to assess the results of acoustic analysis in 66 female teachers (aged 40-64 years), including 35 subjects with occupational voice pathologies (e.g., vocal nodules) and 31 subjects with functional dysphonia. The acoustic analysis was performed using the IRIS software, before and after a 30-minute vocal loading test. All participants were also subjected to laryngological and videostroboscopic examinations. After the vocal effort, the acoustic parameters displayed statistically significant abnormalities, mostly a lowered fundamental frequency (F0) and incorrect values of shimmer and noise-to-harmonic ratio. To conclude, quantitative voice acoustic analysis using the IRIS software seems to be an effective complement to voice examinations and is particularly helpful in diagnosing occupational dysphonia.
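
    Crude illustrative versions (not the IRIS algorithms) of the measures this record relies on: fundamental frequency via autocorrelation, and jitter/shimmer as cycle-to-cycle variation of period and peak amplitude. The example period and amplitude sequences are invented.

```python
import numpy as np

def f0_autocorr(frame, fs, fmin=75, fmax=500):
    """Pick the autocorrelation peak within the plausible pitch-lag range."""
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + np.argmax(ac[lo:hi])
    return fs / lag

def jitter_shimmer(periods, amps):
    periods, amps = np.asarray(periods, float), np.asarray(amps, float)
    jit = np.abs(np.diff(periods)).mean() / periods.mean()   # local jitter
    shim = np.abs(np.diff(amps)).mean() / amps.mean()        # local shimmer
    return jit, shim

fs = 16000
t = np.arange(0, 0.04, 1 / fs)
frame = np.sin(2 * np.pi * 200 * t)                          # synthetic phonation
print("F0 ~", f0_autocorr(frame, fs), "Hz")
print(jitter_shimmer([5.00, 5.02, 4.98, 5.01], [1.0, 0.97, 1.02, 0.99]))
```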

  2. Co dokáže náš hlas? Fonetický pohled na variabilitu řečové produkce // What are our voices capable of ? Phonetic perspective on the variability of speech production

    Directory of Open Access Journals (Sweden)

    Radek Skarnitzl

    2016-12-01

    Full Text Available The paper surveys the plasticity of the speech production mechanism. At the level of phonatory behaviour, a distinction is made between the frequency of vocal fold vibration, which is reflected in the pitch of the voice, and the manner in which the vocal folds vibrate, which lends our voice different qualities. The main types of phonatory modifications are described, and some of their uses in everyday communication, as well as their perceptual effects, are documented in the literature. Modifications of the primary makeup of speech sounds in the supraglottal vocal tract, such as rounding or spreading of the lips, hyper- or hyponasality, and palatalization, are discussed in the following section. The two levels of description, phonatory and articulatory, are formally anchored in Nolan's model of the sources of variability in speech. The final part of the paper examines speech variability from the perspective of the listener, regarding one's speech as an auditory face that signals biologically, psychologically, and socially conditioned information about the speaker.

  3. Emotionally conditioning the target-speech voice enhances recognition of the target speech under "cocktail-party" listening conditions.

    Science.gov (United States)

    Lu, Lingxi; Bao, Xiaohan; Chen, Jing; Qu, Tianshu; Wu, Xihong; Li, Liang

    2018-05-01

    Under a noisy "cocktail-party" listening condition with multiple people talking, listeners can use various perceptual/cognitive unmasking cues to improve recognition of the target speech against informational speech-on-speech masking. One potential unmasking cue is the emotion expressed in a speech voice, by means of certain acoustical features. However, it was unclear whether emotionally conditioning a target-speech voice that has none of the typical acoustical features of emotions (i.e., an emotionally neutral voice) can be used by listeners for enhancing target-speech recognition under speech-on-speech masking conditions. In this study we examined the recognition of target speech against a two-talker speech masker both before and after the emotionally neutral target voice was paired with a loud female screaming sound that has a marked negative emotional valence. The results showed that recognition of the target speech (especially the first keyword in a target sentence) was significantly improved by emotionally conditioning the target speaker's voice. Moreover, the emotional unmasking effect was independent of the unmasking effect of the perceived spatial separation between the target speech and the masker. Also, (skin conductance) electrodermal responses became stronger after emotional learning when the target speech and masker were perceptually co-located, suggesting an increase of listening efforts when the target speech was informationally masked. These results indicate that emotionally conditioning the target speaker's voice does not change the acoustical parameters of the target-speech stimuli, but the emotionally conditioned vocal features can be used as cues for unmasking target speech.

  4. Different Types of Sounds and Their Relationship With the Electrocardiographic Signals and the Cardiovascular System – Review

    Directory of Open Access Journals (Sweden)

    Ennio H. Idrobo-Ávila

    2018-05-01

    Full Text Available Background: For some time now, the effects of sound, noise, and music on the human body have been studied. However, despite research done through time, it is still not completely clear what influence, interaction, and effects sounds have on the human body. That is why it is necessary to conduct new research on this topic. Thus, in this paper, a systematic review is undertaken in order to integrate research related to several types of sound, both pleasant and unpleasant, specifically noise and music. In addition, it includes as much research as possible to give stakeholders a more general vision of relevant elements regarding methodologies, study subjects, stimuli, analysis, and experimental designs in general. This study has been conducted in order to make a genuine contribution to this area and perhaps to raise the quality of future research about sound and its effects on ECG signals. Methods: This review was carried out by independent researchers, through three search equations, in four different databases covering engineering, medicine, and psychology. Inclusion and exclusion criteria were applied, and studies published between 1999 and 2017 were considered. The selected documents were read and analyzed independently by each group of researchers, and conclusions were subsequently established among all of them. Results: Despite the differences between the outcomes of the selected studies, some common factors were found among them. Thus, in noise studies where both BP and HR increased or tended to increase, it was noted that HRV (HF and LF/HF) changes with both sound and noise stimuli, whereas GSR changes with sound and musical stimuli. Furthermore, LF also showed changes with exposure to noise. Conclusion: In many cases, samples displayed a limitation in experimental design, and in diverse studies there was a lack of a control group. There was a lot of variability in the presented stimuli, providing a wide overview of the effects they could …

  5. Robust segmentation and retrieval of environmental sounds

    Science.gov (United States)

    Wichern, Gordon

    The proliferation of mobile computing has provided much of the world with the ability to record any sound of interest, or possibly every sound heard in a lifetime. The technology to continuously record the auditory world has applications in surveillance, biological monitoring of non-human animal sounds, and urban planning. Unfortunately, the ability to record anything has led to an audio data deluge, where there are more recordings than time to listen. Thus, access to these archives depends on efficient techniques for segmentation (determining where sound events begin and end), indexing (storing sufficient information with each event to distinguish it from other events), and retrieval (searching for and finding desired events). While many such techniques have been developed for speech and music sounds, the environmental and natural sounds that compose the majority of our aural world are often overlooked. The process of analyzing audio signals typically begins with the process of acoustic feature extraction where a frame of raw audio (e.g., 50 milliseconds) is converted into a feature vector summarizing the audio content. In this dissertation, a dynamic Bayesian network (DBN) is used to monitor changes in acoustic features in order to determine the segmentation of continuously recorded audio signals. Experiments demonstrate effective segmentation performance on test sets of environmental sounds recorded in both indoor and outdoor environments. Once segmented, every sound event is indexed with a probabilistic model, summarizing the evolution of acoustic features over the course of the event. Indexed sound events are then retrieved from the database using different query modalities. Two important query types are sound queries (query-by-example) and semantic queries (query-by-text). By treating each sound event and semantic concept in the database as a node in an undirected graph, a hybrid (content/semantic) network structure is developed. This hybrid network can

  6. Pedagogic Voice: Student Voice in Teaching and Engagement Pedagogies

    Science.gov (United States)

    Baroutsis, Aspa; McGregor, Glenda; Mills, Martin

    2016-01-01

    In this paper, we are concerned with the notion of "pedagogic voice" as it relates to the presence of student "voice" in teaching, learning and curriculum matters at an alternative, or second chance, school in Australia. This school draws upon many of the principles of democratic schooling via its utilisation of student voice…

  7. Efficient voice activity detection in reverberant enclosures using far field microphones

    DEFF Research Database (Denmark)

    Petsatodis, Theodore; Boukis, Christos

    2009-01-01

    An algorithm suitable for voice activity detection under reverberant conditions is proposed in this paper. Due to the use of far-field microphones, the proposed solution processes speech signals of highly varying intensity and signal-to-noise ratio that are contaminated with several echoes. The core of the system is a pair of Hidden Markov Models that effectively model the speech-presence and speech-absence situations. To minimise mis-detections an adaptive threshold is used, while a hang-over scheme caters for the intra-frame correlation of speech signals. Experimental results conducted …
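
    A minimal sketch of the hang-over idea mentioned above: once speech is detected, the speech decision is held for a few extra frames, exploiting the temporal correlation of speech. The frame counts and the raw decision sequence are assumed values, not the paper's.

```python
import numpy as np

def apply_hangover(raw_decisions, hang_frames=8):
    out, hang = [], 0
    for d in raw_decisions:
        if d:                      # detector says "speech": reset the counter
            hang = hang_frames
            out.append(1)
        elif hang > 0:             # recently speech: keep labeling as speech
            hang -= 1
            out.append(1)
        else:
            out.append(0)
    return np.array(out)

raw = np.array([0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1])
print(apply_hangover(raw, hang_frames=3))
```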

  8. Sound source measurement by using a passive sound insulation and a statistical approach

    Science.gov (United States)

    Dragonetti, Raffaele; Di Filippo, Sabato; Mercogliano, Francesco; Romano, Rosario A.

    2015-10-01

    This paper describes a measurement technique developed by the authors that allows acoustic measurements to be carried out inside noisy environments while reducing background noise effects. The proposed method is based on the integration of a traditional passive noise insulation system with a statistical approach, the latter applied to signals picked up by the usual sensors (microphones and accelerometers) equipping the passive sound insulation system. The statistical approach improves the sound insulation provided by the passive system alone at low frequency. The developed measurement technique has been validated by means of numerical simulations and measurements carried out inside a real noisy environment. For the case studies reported here, an average improvement of about 10 dB has been obtained in a frequency range up to about 250 Hz. Considerations on the lowest sound pressure level that can be measured by applying the proposed method, and on the measurement error related to its application, are reported as well.

  9. Voice Savers for Music Teachers

    Science.gov (United States)

    Cookman, Starr

    2012-01-01

    Music teachers are in a class all their own when it comes to voice use. These elite vocal athletes require stamina, strength, and flexibility from their voices day in, day out for hours at a time. Voice rehabilitation clinics and research show that music education ranks high among the professionals most commonly affected by voice problems.…

  10. Dementias show differential physiological responses to salient sounds.

    Science.gov (United States)

    Fletcher, Phillip D; Nicholas, Jennifer M; Shakespeare, Timothy J; Downey, Laura E; Golden, Hannah L; Agustus, Jennifer L; Clark, Camilla N; Mummery, Catherine J; Schott, Jonathan M; Crutch, Sebastian J; Warren, Jason D

    2015-01-01

    Abnormal responsiveness to salient sensory signals is often a prominent feature of dementia diseases, particularly the frontotemporal lobar degenerations, but has been little studied. Here we assessed processing of one important class of salient signals, looming sounds, in canonical dementia syndromes. We manipulated tones using intensity cues to create percepts of salient approaching ("looming") or less salient withdrawing sounds. Pupil dilatation responses and behavioral rating responses to these stimuli were compared in patients fulfilling consensus criteria for dementia syndromes (semantic dementia, n = 10; behavioral variant frontotemporal dementia, n = 16, progressive nonfluent aphasia, n = 12; amnestic Alzheimer's disease, n = 10) and a cohort of 26 healthy age-matched individuals. Approaching sounds were rated as more salient than withdrawing sounds by healthy older individuals but this behavioral response to salience did not differentiate healthy individuals from patients with dementia syndromes. Pupil responses to approaching sounds were greater than responses to withdrawing sounds in healthy older individuals and in patients with semantic dementia: this differential pupil response was reduced in patients with progressive nonfluent aphasia and Alzheimer's disease relative both to the healthy control and semantic dementia groups, and did not correlate with nonverbal auditory semantic function. Autonomic responses to auditory salience are differentially affected by dementias and may constitute a novel biomarker of these diseases.

  11. Digital servo control of random sound fields

    Science.gov (United States)

    Nakich, R. B.

    1973-01-01

    It is necessary to place a number of sensors at different positions in the sound field to determine the actual sound intensities to which the test object is subjected, making it possible to determine whether the specification is being met adequately or exceeded. Since the excitation is of a random nature, the signals are essentially coherent, and it is impossible to obtain a true average.

  12. Sounds of Education

    DEFF Research Database (Denmark)

    Koch, Anette Boye

    2017-01-01

    Voice is a basic tool in communication between adults. However, in early educational settings, adult professionals use their voices in different paralinguistic ways when they communicate with children. A teacher's use of voice is important because it serves to communicate attitudes and emotions in ways that are often ignored in early childhood classroom research. When teachers take different roles in relation to children, they use their voice with different pitch, melody, and loudness. This research examined how various acoustic elements in teachers' voices are associated with different teaching … and evaluating educational practice …

  13. A noise reduction technique based on nonlinear kernel function for heart sound analysis.

    Science.gov (United States)

    Mondal, Ashok; Saxena, Ishan; Tang, Hong; Banerjee, Poulami

    2017-02-13

    The main difficulty encountered in interpreting cardiac sounds is interference from noise. The contaminating noise obscures relevant information that is useful for recognizing heart disease. The unwanted signals are produced mainly by the lungs and the surrounding environment. In this paper, a novel heart sound de-noising technique is introduced, based on a combined framework of the wavelet packet transform (WPT) and singular value decomposition (SVD). The most informative node of the wavelet tree is selected on the criterion of a mutual information measurement. Next, the coefficients corresponding to the selected node are processed by the SVD technique to suppress the noisy component of the heart sound signal. To justify the efficacy of the proposed technique, several experiments were conducted with a heart sound dataset, including normal and pathological cases at different signal-to-noise ratios. The significance of the method is validated by statistical analysis of the results. The biological information preserved in the de-noised heart sound (HS) signal is evaluated by a k-means clustering algorithm and Fit Factor calculation. The overall results show that the proposed method is superior to the baseline methods.
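
    A sketch of the WPT + SVD pipeline under stated assumptions: the wavelet-packet node is selected here by an energy proxy (the paper uses a mutual-information criterion), and noise is suppressed by rank-truncated SVD of a Hankel matrix built from the node coefficients. Wavelet, level, and rank are assumed values.

```python
import numpy as np
import pywt

def wpt_svd_denoise(x, wavelet="db4", level=3, rank=2):
    wp = pywt.WaveletPacket(data=x, wavelet=wavelet, maxlevel=level)
    nodes = wp.get_level(level)
    best = max(nodes, key=lambda n: np.sum(np.square(n.data)))  # energy proxy
    c = best.data
    # Hankel matrix of the selected coefficients.
    m = len(c) // 2
    H = np.array([c[i:i + m] for i in range(len(c) - m + 1)])
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    s[rank:] = 0.0                                # keep dominant components
    Hd = (U * s) @ Vt
    # Average anti-diagonals back to a 1-D coefficient sequence.
    den = np.array([np.mean(Hd[::-1, :].diagonal(k))
                    for k in range(-Hd.shape[0] + 1, Hd.shape[1])])
    wp[best.path] = den
    return wp.reconstruct(update=True)[:len(x)]

x = np.random.randn(1024)                         # stand-in for a heart sound
print(wpt_svd_denoise(x).shape)
```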

  14. Developing a Reference of Normal Lung Sounds in Healthy Peruvian Children

    Science.gov (United States)

    Ellington, Laura E.; Emmanouilidou, Dimitra; Elhilali, Mounya; Gilman, Robert H.; Tielsch, James M.; Chavez, Miguel A.; Marin-Concha, Julio; Figueroa, Dante; West, James

    2018-01-01

    Purpose: Lung auscultation has long been a standard of care for the diagnosis of respiratory diseases. Recent advances in electronic auscultation and signal processing have yet to find clinical acceptance; however, computerized lung sound analysis may be ideal for pediatric populations in settings where skilled healthcare providers are commonly unavailable. We described features of normal lung sounds in young children using a novel signal processing approach to lay a foundation for identifying pathologic respiratory sounds. Methods: 186 healthy children with normal pulmonary exams and without respiratory complaints were enrolled at a tertiary care hospital in Lima, Peru. Lung sounds were recorded at eight thoracic sites using a digital stethoscope. 151 (81%) of the recordings were eligible for further analysis. Heavy-crying segments were automatically rejected, and features extracted from spectral and temporal signal representations contributed to the profiling of lung sounds. Results: Mean age, height, and weight among study participants were 2.2 years (SD 1.4), 84.7 cm (SD 13.2), and 12.0 kg (SD 3.6), respectively, and 47% were boys. We identified ten distinct spectral and spectro-temporal signal parameters; most demonstrated linear relationships with age, height, and weight, while no differences between genders were noted. Older children had a faster-decaying spectrum than younger ones. Features like spectral peak width, lower-frequency Mel-frequency cepstral coefficients, and spectro-temporal modulations also showed variations with recording site. Conclusions: Extracted lung sound features varied significantly with child characteristics and lung site. A comparison with adult studies revealed differences in the extracted features for children. While sound-reduction techniques will improve analysis, we offer a novel, reproducible tool for sound analysis in real-world environments. PMID:24943262

  15. Voice - How humans communicate?

    Science.gov (United States)

    Tiwari, Manjul; Tiwari, Maneesha

    2012-01-01

    Voices are important things for humans. They are the medium through which we do a lot of communicating with the outside world: our ideas, of course, and also our emotions and our personality. The voice is the very emblem of the speaker, indelibly woven into the fabric of speech. In this sense, each of our utterances of spoken language carries not only its own message but also, through accent, tone of voice and habitual voice quality, an audible declaration of our membership of particular social and regional groups, of our individual physical and psychological identity, and of our momentary mood. Voices are also one of the media through which we (successfully, most of the time) recognize other humans who are important to us: members of our family, media personalities, our friends, and enemies. Although evidence from DNA analysis is potentially vastly more eloquent in its power than evidence from voices, DNA cannot talk. It cannot be recorded planning, carrying out or confessing to a crime. It cannot be so apparently directly incriminating. As will quickly become evident, voices are extremely complex things, and some of the inherent limitations of the forensic-phonetic method are in part a consequence of the interaction between their complexity and the real world in which they are used. It is one of the aims of this article to explain how this comes about. This subject has unsolved questions, but there is no direct way to present the information that is necessary to understand how voices can be related, or not, to their owners.

  16. Speaker's voice as a memory cue.

    Science.gov (United States)

    Campeanu, Sandra; Craik, Fergus I M; Alain, Claude

    2015-02-01

    Speaker's voice occupies a central role as the cornerstone of auditory social interaction. Here, we review the evidence suggesting that speaker's voice constitutes an integral context cue in auditory memory. Investigation into the nature of voice representation as a memory cue is essential to understanding auditory memory and the neural correlates which underlie it. Evidence from behavioral and electrophysiological studies suggest that while specific voice reinstatement (i.e., same speaker) often appears to facilitate word memory even without attention to voice at study, the presence of a partial benefit of similar voices between study and test is less clear. In terms of explicit memory experiments utilizing unfamiliar voices, encoding methods appear to play a pivotal role. Voice congruency effects have been found when voice is specifically attended at study (i.e., when relatively shallow, perceptual encoding takes place). These behavioral findings coincide with neural indices of memory performance such as the parietal old/new recollection effect and the late right frontal effect. The former distinguishes between correctly identified old words and correctly identified new words, and reflects voice congruency only when voice is attended at study. Characterization of the latter likely depends upon voice memory, rather than word memory. There is also evidence to suggest that voice effects can be found in implicit memory paradigms. However, the presence of voice effects appears to depend greatly on the task employed. Using a word identification task, perceptual similarity between study and test conditions is, like for explicit memory tests, crucial. In addition, the type of noise employed appears to have a differential effect. While voice effects have been observed when white noise is used at both study and test, using multi-talker babble does not confer the same results. In terms of neuroimaging research modulations, characterization of an implicit memory effect

  17. Two models of the sound-signal frequency dependence on the animal body size as exemplified by the ground squirrels of Eurasia (mammalia, rodentia).

    Science.gov (United States)

    Nikol'skii, A A

    2017-11-01

    Dependence of the sound-signal frequency on animal body length was studied in 14 ground squirrel species (genus Spermophilus) of Eurasia. Regression analysis of the total sample yielded a low determination coefficient (R² = 26%), because the total sample proved to be heterogeneous in terms of signal frequency within the size classes of animals. When the total sample was divided into two groups according to signal frequency, two statistically significant models (regression equations) were obtained in which signal frequency depended on body size with high determination coefficients (R² = 73% and 94%, versus 26% for the total sample). Thus, the problem of correlation between animal body size and the frequency of their vocal signals does not have a unique solution.
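
    A worked sketch of this style of analysis with invented numbers (the paper's data and exact model form are not given here): log signal frequency is regressed on log body length separately for the two frequency groups, which is what raises R² relative to pooling all species.

```python
import numpy as np

def loglog_fit(body_len_mm, freq_khz):
    x, y = np.log(body_len_mm), np.log(freq_khz)
    b, a = np.polyfit(x, y, 1)                 # log f = a + b * log L
    r2 = np.corrcoef(x, y)[0, 1] ** 2
    return a, b, r2

# Hypothetical per-species means for the high- and low-frequency groups.
hi = loglog_fit([190, 210, 230, 260], [9.8, 9.1, 8.4, 7.6])
lo = loglog_fit([220, 250, 280, 310], [4.6, 4.1, 3.6, 3.2])
print("high-frequency group: a=%.2f b=%.2f R2=%.2f" % hi)
print("low-frequency group:  a=%.2f b=%.2f R2=%.2f" % lo)
```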

  18. Analysis of adventitious lung sounds originating from pulmonary tuberculosis.

    Science.gov (United States)

    Becker, K W; Scheffer, C; Blanckenberg, M M; Diacon, A H

    2013-01-01

    Tuberculosis is a common and potentially deadly infectious disease, usually affecting the respiratory system and causing the sound properties of symptomatic infected lungs to differ from those of non-infected lungs. Auscultation is often ruled out as a reliable diagnostic technique for TB due to the random distribution of the infection and the varying severity of damage to the lungs. However, advancements in signal processing techniques for respiratory sounds can improve the potential of auscultation far beyond the capabilities of the conventional mechanical stethoscope. Though computer-based signal analysis of respiratory sounds has produced a significant body of research, there have not been any recent investigations into the computer-aided analysis of lung sounds associated with pulmonary tuberculosis (TB), despite the severity of the disease in many countries. In this paper, respiratory sounds were recorded from 14 locations around the posterior and anterior chest walls of healthy volunteers and patients infected with pulmonary TB. The most significant signal features in both the time and frequency domains associated with the presence of TB were identified using the statistical overlap factor (SOF). These features were then employed to train a neural network to automatically classify the auscultation recordings into their respective healthy or TB-origin categories. The neural network yielded a diagnostic accuracy of 73%, but it is believed that automated filtering of the noise in the clinics, more training samples, and perhaps other signal processing methods can improve the results of future studies. This work demonstrates the potential of computer-aided auscultation as an aid for the diagnosis and treatment of TB.

  19. An exploratory study of voice change associated with healthy speakers after transcutaneous electrical stimulation to laryngeal muscles.

    Science.gov (United States)

    Fowler, Linda P; Gorham-Rowan, Mary; Hapner, Edie R

    2011-01-01

    The purpose of this study was to determine if measurable changes in fundamental frequency (F0) and relative sound level (RSL) occurred in healthy speakers after transcutaneous electrical stimulation (TES) as applied via VitalStim (Chattanooga Group, Chattanooga, TN). A prospective, repeated-measures design was used. Ten healthy female and 10 healthy male speakers, 20-53 years of age, participated in the study. All participants were nonsmokers and reported a negative history of voice disorders. Participants received 1 hour of TES while engaged in eating, drinking, and conversation to simulate a typical dysphagia therapy protocol. Voice recordings were obtained before and immediately after TES. The voice samples consisted of a sustained vowel task and a reading of the Rainbow Passage. Measurements of F0 and RSL were obtained using TF32 (Milenkovic, 2005, University of Wisconsin). The participants also reported any sensations 5 minutes and 24 hours after TES. Measurable changes in F0 and RSL were found for both tasks but were variable in direction and magnitude; these changes were not statistically significant. Subjective comments ranged from reports of a vocal warm-up feeling to delayed-onset muscle soreness. These findings demonstrate that application of TES produces measurable changes in F0 and RSL; however, the direction and magnitude of these changes are highly variable. Further research is needed to determine factors that may affect the extent to which TES contributes to significant changes in voice. Copyright © 2011 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  20. Integrating cues of social interest and voice pitch in men's preferences for women's voices

    OpenAIRE

    Jones, Benedict C; Feinberg, David R; DeBruine, Lisa M; Little, Anthony C; Vukovic, Jovana

    2008-01-01

    Most previous studies of vocal attractiveness have focused on preferences for physical characteristics of voices such as pitch. Here we examine the content of vocalizations in interaction with such physical traits, finding that vocal cues of social interest modulate the strength of men's preferences for raised pitch in women's voices. Men showed stronger preferences for raised pitch when judging the voices of women who appeared interested in the listener than when judging the voices of women ...

  1. Acoustic signal analysis in the creeping discharge

    International Nuclear Information System (INIS)

    Nakamiya, T; Sonoda, Y; Tsuda, R; Ebihara, K; Ikegami, T

    2008-01-01

    We have previously succeeded in measuring the acoustic signal due to the dielectric barrier discharge and discriminating its dominant frequency components, which appear above 20 kHz. Recently, surface-discharge control technology has attracted attention for practical applications such as ozonizers, NOx reactors, light sources, and displays. Fundamental experiments were carried out to examine the creeping discharge using the acoustic signal. When a high voltage (6 kV, f = 10 kHz) is applied to the electrode, a discharge current flows and an acoustic sound is generated. The current and voltage waveforms of the creeping discharge and the sound signal detected by a condenser microphone are stored in a digital memory scope. In this scheme, the Continuous Wavelet Transform (CWT) is applied to discriminate the acoustic sound of the micro discharge, and the dominant frequency components are studied. CWT results of the sound signal show a wideband frequency spectrum extending up to 100 kHz. In addition, the energy distributions of the acoustic signal are examined by CWT.
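
    A sketch of the analysis step only, assuming a sampled microphone signal: a continuous wavelet transform of a discharge-like sound to expose energy up to ~100 kHz. The Morlet wavelet, sampling rate, and test tones are assumptions, not taken from the paper.

```python
import numpy as np
import pywt

fs = 500_000                                    # 500 kHz sampling (assumed)
t = np.arange(0, 0.002, 1 / fs)
sig = np.sin(2 * np.pi * 30e3 * t) + 0.5 * np.sin(2 * np.pi * 80e3 * t)

scales = np.arange(2, 64)
coef, freqs = pywt.cwt(sig, scales, "morl", sampling_period=1 / fs)
energy = np.square(np.abs(coef)).sum(axis=1)    # energy per frequency band
for f, e in zip(freqs[::8], energy[::8]):
    print(f"{f/1e3:7.1f} kHz  energy {e:9.1f}")
```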

  2. FonaDyn - A system for real-time analysis of the electroglottogram, over the voice range

    Science.gov (United States)

    Ternström, Sten; Johansson, Dennis; Selamtzis, Andreas

    2018-01-01

    From soft to loud and low to high, the mechanisms of human voice have many degrees of freedom, making it difficult to assess phonation from the acoustic signal alone. FonaDyn is a research tool that combines acoustics with electroglottography (EGG). It characterizes and visualizes in real time the dynamics of EGG waveforms, using statistical clustering of the cycle-synchronous EGG Fourier components, and their sample entropy. The prevalence and stability of different EGG waveshapes are mapped as colored regions into a so-called voice range profile, without needing pre-defined thresholds or categories. With appropriately 'trained' clusters, FonaDyn can classify and map voice regimes. This is of potential scientific, clinical and pedagogical interest.
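
    An offline illustration of the clustering idea described above, not the real-time FonaDyn tool: each EGG cycle is described by its first few normalized Fourier components, and the cycle shapes are then clustered. Cycle length, harmonic count, and the two synthetic waveshape families are invented.

```python
import numpy as np
from sklearn.cluster import KMeans

def cycle_descriptor(cycle, n_harm=4):
    spec = np.fft.rfft(cycle - cycle.mean())
    h = spec[1:n_harm + 1] / (np.abs(spec[1]) + 1e-12)   # normalize to H1
    return np.concatenate([np.abs(h), np.angle(h)])

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 128, endpoint=False)
cycles = [np.sin(2 * np.pi * t) + a * np.sin(4 * np.pi * t)   # two waveshape
          for a in rng.choice([0.1, 0.6], size=200)]          # families
X = np.array([cycle_descriptor(c) for c in cycles])
labels = KMeans(n_clusters=2, n_init=10).fit_predict(X)
print(np.bincount(labels))                     # cycles per EGG cluster
```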

  3. Snoring classified: The Munich-Passau Snore Sound Corpus.

    Science.gov (United States)

    Janott, Christoph; Schmitt, Maximilian; Zhang, Yue; Qian, Kun; Pandit, Vedhas; Zhang, Zixing; Heiser, Clemens; Hohenhorst, Winfried; Herzog, Michael; Hemmert, Werner; Schuller, Björn

    2018-03-01

    Snoring can be excited at different locations within the upper airways during sleep. It was hypothesised that the excitation locations are correlated with distinct acoustic characteristics of the snoring noise. To verify this hypothesis, a database of snore sounds was developed, labelled with the location of sound excitation. Video and audio recordings taken during drug-induced sleep endoscopy (DISE) examinations from three medical centres were semi-automatically screened for snore events, which subsequently were classified by ENT experts into four classes based on the VOTE classification. The resulting dataset, containing 828 snore events from 219 subjects, was split into Train, Development, and Test sets. An SVM classifier was trained using low-level descriptors (LLDs) related to energy, spectral features, mel frequency cepstral coefficients (MFCC), formants, voicing, harmonic-to-noise ratio (HNR), spectral harmonicity, pitch, and microprosodic features. An unweighted average recall (UAR) of 55.8% could be achieved using the full set of LLDs including formants. The best-performing subset is the MFCC-related set of LLDs. A strong difference in performance could be observed between the permutations of the Train, Development, and Test partitions, which may be caused by the relatively low number of subjects included in the smaller classes of the strongly unbalanced dataset. A database of snoring sounds is presented in which events are classified according to their sound excitation location based on objective criteria and verifiable video material. With the database, it could be demonstrated that machine classifiers can distinguish different excitation locations of snoring sounds in the upper airway based on acoustic parameters. Copyright © 2018 Elsevier Ltd. All rights reserved.
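
    The headline metric above, unweighted average recall (UAR), is recall computed per class and then averaged without class weighting, which matters for a strongly unbalanced dataset like this one. It is equivalent to macro-averaged recall; the labels below are invented VOTE-class placeholders.

```python
from sklearn.metrics import recall_score

y_true = ["V", "V", "O", "T", "E", "V", "O", "T"]    # VOTE-class ground truth
y_pred = ["V", "O", "O", "T", "V", "V", "O", "E"]
uar = recall_score(y_true, y_pred, average="macro")  # macro recall == UAR
print(f"UAR = {uar:.3f}")
```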

  4. Vocal Noise Cancellation From Respiratory Sounds

    National Research Council Canada - National Science Library

    Moussavi, Zahra

    2001-01-01

    Background noise cancellation for speech or electrocardiographic recordings is well established; however, when the background noise contains vocal noises and the main signal is a breath sound …

  5. The effect of frequency-specific sound signals on the germination of maize seeds.

    Science.gov (United States)

    Vicient, Carlos M

    2017-07-25

    The effects of sound treatments on the germination of maize seeds were determined. White noise and bass sounds (300 Hz) had a positive effect on the germination rate: a treatment of only 3 h produced an increase of about 8%, and 5 h increased germination by about 10%. Fast-green staining shows that at least part of the effect of sound is due to a physical alteration of the integrity of the pericarp, increasing its porosity and facilitating water and oxygen uptake.

  6. Maximum likelihood approach to “informed” Sound Source Localization for Hearing Aid applications

    DEFF Research Database (Denmark)

    Farmani, Mojtaba; Pedersen, Michael Syskind; Tan, Zheng-Hua

    2015-01-01

    Most state-of-the-art Sound Source Localization (SSL) algorithms have been proposed for applications that are "uninformed" about the target sound content; however, utilizing a wireless microphone worn by a target talker enables recent Hearing Aid Systems (HASs) to access an almost noise-free sound signal of the target talker at the HAS via the wireless connection. Therefore, in this paper, we propose a maximum likelihood (ML) approach, which we call MLSSL, to estimate the Direction of Arrival (DoA) of the target signal, given access to the target signal content. Compared with other "informed" …
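
    A simplified sketch of the "informed" idea only, not the paper's ML estimator: with the clean target available from the wireless microphone, each hearing-aid microphone is cross-correlated with it, and the inter-microphone delay difference gives a direction of arrival. Microphone spacing, sampling rate, and delays are assumed.

```python
import numpy as np

def doa_informed(mic_l, mic_r, ref, fs, mic_dist=0.16, c=343.0):
    def delay(mic):
        corr = np.correlate(mic, ref, mode="full")
        return np.argmax(corr) - (len(ref) - 1)     # lag of best alignment
    tdoa = (delay(mic_l) - delay(mic_r)) / fs       # seconds between ears
    arg = np.clip(tdoa * c / mic_dist, -1.0, 1.0)
    return np.degrees(np.arcsin(arg))

fs = 16000
ref = np.random.randn(1600)                          # clean wireless target
mic_l = np.concatenate([np.zeros(3), ref]) + 0.1 * np.random.randn(1603)
mic_r = np.concatenate([np.zeros(1), ref, np.zeros(2)]) + 0.1 * np.random.randn(1603)
print("DoA ~ %.1f degrees" % doa_informed(mic_l, mic_r, ref, fs))
```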

  7. Deviant vocal fold vibration as observed during videokymography : the effect on voice quality

    NARCIS (Netherlands)

    Verdonck-de Leeuw, I M; Festen, J.M.; Mahieu, H.F.

    Videokymographic images of deviant or irregular vocal fold vibration, including diplophonia, the transition from falsetto to modal voice, irregular vibration onset and offset, and phonation following partial laryngectomy were compared with the synchronously recorded acoustic speech signals. A clear

  8. Understanding the 'Anorexic Voice' in Anorexia Nervosa.

    Science.gov (United States)

    Pugh, Matthew; Waller, Glenn

    2017-05-01

    In common with individuals experiencing a number of disorders, people with anorexia nervosa report experiencing an internal 'voice'. The anorexic voice comments on the individual's eating, weight and shape and instructs the individual to restrict or compensate. However, the core characteristics of the anorexic voice are not known. This study aimed to develop a parsimonious model of the voice characteristics that are related to key features of eating disorder pathology and to determine whether patients with anorexia nervosa fall into groups with different voice experiences. The participants were 49 women with full diagnoses of anorexia nervosa. Each completed validated measures of the power and nature of their voice experience and of their responses to the voice. Different voice characteristics were associated with current body mass index, duration of disorder and eating cognitions. Two subgroups emerged, with 'weaker' and 'stronger' voice experiences. Those with stronger voices were characterized by having more negative eating attitudes, more severe compensatory behaviours, a longer duration of illness and a greater likelihood of having the binge-purge subtype of anorexia nervosa. The findings indicate that the anorexic voice is an important element of the psychopathology of anorexia nervosa. Addressing the anorexic voice might be helpful in enhancing outcomes of treatments for anorexia nervosa, but that conclusion might apply only to patients with more severe eating psychopathology. Copyright © 2016 John Wiley & Sons, Ltd. Experiences of an internal 'anorexic voice' are common in anorexia nervosa. Clinicians should consider the role of the voice when formulating eating pathology in anorexia nervosa, including how individuals perceive and relate to that voice. Addressing the voice may be beneficial, particularly in more severe and enduring forms of anorexia nervosa. When working with the voice, clinicians should aim to address both the content of the voice and how

  9. Bottom-up influences of voice continuity in focusing selective auditory attention.

    Science.gov (United States)

    Bressler, Scott; Masud, Salwa; Bharadwaj, Hari; Shinn-Cunningham, Barbara

    2014-01-01

    Selective auditory attention causes a relative enhancement of the neural representation of important information and suppression of the neural representation of distracting sound, which enables a listener to analyze and interpret information of interest. Some studies suggest that in both vision and in audition, the "unit" on which attention operates is an object: an estimate of the information coming from a particular external source out in the world. In this view, which object ends up in the attentional foreground depends on the interplay of top-down, volitional attention and stimulus-driven, involuntary attention. Here, we test the idea that auditory attention is object based by exploring whether continuity of a non-spatial feature (talker identity, a feature that helps acoustic elements bind into one perceptual object) also influences selective attention performance. In Experiment 1, we show that perceptual continuity of target talker voice helps listeners report a sequence of spoken target digits embedded in competing reversed digits spoken by different talkers. In Experiment 2, we provide evidence that this benefit of voice continuity is obligatory and automatic, as if voice continuity biases listeners by making it easier to focus on a subsequent target digit when it is perceptually linked to what was already in the attentional foreground. Our results support the idea that feature continuity enhances streaming automatically, thereby influencing the dynamic processes that allow listeners to successfully attend to objects through time in the cacophony that assails our ears in many everyday settings.

  10. Voice application development for Android

    CERN Document Server

    McTear, Michael

    2013-01-01

    This book will give beginners an introduction to building voice-based applications on Android. It will begin by covering the basic concepts and will build up to creating a voice-based personal assistant. By the end of this book, you should be in a position to create your own voice-based applications on Android from scratch in next to no time. Voice Application Development for Android is for all those who are interested in speech technology and for those who, as owners of Android devices, are keen to experiment with developing voice apps for their devices. It will also be useful as a starting po

  11. Neuroanatomic organization of sound memory in humans.

    Science.gov (United States)

    Kraut, Michael A; Pitcock, Jeffery A; Calhoun, Vince; Li, Juan; Freeman, Thomas; Hart, John

    2006-11-01

    The neural interface between sensory perception and memory is a central issue in neuroscience, particularly initial memory organization following perceptual analyses. We used functional magnetic resonance imaging to identify anatomic regions extracting initial auditory semantic memory information related to environmental sounds. Two distinct anatomic foci were detected in the right superior temporal gyrus when subjects identified sounds representing either animals or threatening items. Threatening animal stimuli elicited signal changes in both foci, suggesting a distributed neural representation. Our results demonstrate both category- and feature-specific responses to nonverbal sounds in early stages of extracting semantic memory information from these sounds. This organization allows these category-feature detection nodes to extract early semantic memory information for efficient processing of transient sound stimuli. Neural regions selective for threatening sounds are similar to those of nonhuman primates, demonstrating that semantic memory organization for basic biological/survival primitives is present across species.

  12. DolphinAttack: Inaudible Voice Commands

    OpenAIRE

    Zhang, Guoming; Yan, Chen; Ji, Xiaoyu; Zhang, Taimin; Zhang, Tianchen; Xu, Wenyuan

    2017-01-01

    Speech recognition (SR) systems such as Siri or Google Now have become an increasingly popular human-computer interaction method, and have turned various systems into voice controllable systems(VCS). Prior work on attacking VCS shows that the hidden voice commands that are incomprehensible to people can control the systems. Hidden voice commands, though hidden, are nonetheless audible. In this work, we design a completely inaudible attack, DolphinAttack, that modulates voice commands on ultra...

  13. Binaural Processing of Multiple Sound Sources

    Science.gov (United States)

    2016-08-18

    AFRL-AFOSR-VA-TR-2016-0298: final performance report, "Binaural Processing of Multiple Sound Sources," William Yost, Arizona State University, Tempe, AZ. Report type: Final Performance. Dates covered: 15 Jul 2012 to 14 Jul 2016. Subject terms: binaural hearing, sound localization, interaural signal.

  14. Silence–breathing–snore classification from snore-related sounds

    International Nuclear Information System (INIS)

    Karunajeewa, Asela S; Abeyratne, Udantha R; Hukins, Craig

    2008-01-01

    Obstructive sleep apnea (OSA) is a highly prevalent disease in which the upper airway collapses during sleep, leading to serious consequences. Snoring is the earliest symptom of OSA, but its potential in clinical diagnosis is not yet fully recognized. The first task in the automatic analysis of snore-related sounds (SRS) is to segment the SRS data as accurately as possible into three main classes: snoring (voiced non-silence), breathing (unvoiced non-silence) and silence. SRS data are generally contaminated with background noise. In this paper, we present the classification performance of a new segmentation algorithm based on pattern recognition. We considered four features derived from SRS to classify samples of SRS into three classes. The features—number of zero crossings, energy of the signal, normalized autocorrelation coefficient at 1 ms delay and the first predictor coefficient of linear predictive coding (LPC) analysis—in combination were able to achieve a classification accuracy of 90.74% on a set of test data. We also investigated the performance of the algorithm when three noise reduction (NR) techniques commonly used in speech processing—amplitude spectral subtraction (ASS), power spectral subtraction (PSS) and short time spectral amplitude (STSA) estimation—are used for noise reduction. We found that noise reduction together with a proper choice of features could improve the classification accuracy to 96.78%, making automated analysis a possibility.
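
    For concreteness, the four features listed in this record can be computed as follows. A minimal numpy sketch, assuming a mono SRS frame and its sampling rate; the order-1 shortcut for the first LPC predictor coefficient is an illustrative choice, not the paper's exact implementation.

```python
import numpy as np

def srs_features(frame, fs):
    """The four features used for silence/breathing/snore classification:
    zero-crossing count, frame energy, normalized autocorrelation at a
    1 ms delay, and the first linear-prediction (LPC) coefficient."""
    zcr = np.count_nonzero(np.diff(np.signbit(frame)))   # sign changes
    energy = np.sum(np.asarray(frame, dtype=float) ** 2)

    lag = int(round(1e-3 * fs))                  # 1 ms expressed in samples
    x = frame - np.mean(frame)
    r0 = np.dot(x, x)
    nacf = np.dot(x[:-lag], x[lag:]) / r0 if r0 > 0 else 0.0

    # First LPC predictor coefficient from the order-1 normal equations:
    # a1 = R(1) / R(0), with R the autocorrelation sequence of the frame.
    R = np.correlate(x, x, mode="full")[len(x) - 1:]
    a1 = R[1] / R[0] if R[0] > 0 else 0.0
    return zcr, energy, nacf, a1
```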

  15. Portable system for auscultation and lung sound analysis.

    Science.gov (United States)

    Nabiev, Rustam; Glazova, Anna; Olyinik, Valery; Makarenkova, Anastasiia; Makarenkov, Anatolii; Rakhimov, Abdulvosid; Felländer-Tsai, Li

    2014-01-01

    A portable system for auscultation and lung sound analysis has been developed, comprising an original electronic stethoscope coupled with mobile devices and dedicated algorithms for automated analysis of pulmonary sound signals. The system is intended for monitoring the health status of patients with various pulmonary diseases.

  16. Infants' long-term memory for the sound patterns of words and voices.

    Science.gov (United States)

    Houston, Derek M; Jusczyk, Peter W

    2003-12-01

    Infants' long-term memory for the phonological patterns of words versus the indexical properties of talkers' voices was examined in 3 experiments using the Headturn Preference Procedure (D. G. Kemler Nelson et al., 1995). Infants were familiarized with repetitions of 2 words and tested on the next day for their orientation times to 4 passages--2 of which included the familiarized words. At 7.5 months of age, infants oriented longer to passages containing familiarized words when these were produced by the original talker. At 7.5 and 10.5 months of age, infants did not recognize words in passages produced by a novel female talker. In contrast, 7.5-month-olds demonstrated word recognition in both talker conditions when presented with passages produced by both the original and the novel talker. The findings suggest that talker-specific information can prime infants' memory for words and facilitate word recognition across talkers. ((c) 2003 APA, all rights reserved)

  17. The influence of ski helmets on sound perception and sound localisation on the ski slope

    Directory of Open Access Journals (Sweden)

    Lana Ružić

    2015-04-01

    Full Text Available Objectives: The aim of the study was to investigate whether a ski helmet interferes with sound localization and the time of sound perception in the frontal plane. Material and Methods: Twenty-three participants (age 30.7±10.2 years) were tested on the slope under 2 conditions, with and without a ski helmet, using 6 spatially distributed sound stimuli per condition. Each subject had to react as soon as possible upon hearing a sound and to signal the side from which it arrived. Results: The results showed a significant difference in the ability to localize the specific ski sounds: 72.5±15.6% correct answers without a helmet vs. 61.3±16.2% with a helmet (p < 0.01). However, performance on this test did not depend on whether the subjects were used to wearing a helmet (p = 0.89). In identifying the point at which the sound was first perceived, the results were also in favor of the subjects not wearing a helmet. The subjects reported hearing the ski sound cues at 73.4±5.56 m without a helmet vs. 60.29±6.34 m with a helmet (p < 0.001). In that case the results did depend on previous helmet use (p < 0.05), meaning that regular use of helmets might help to diminish the attenuation of sound identification that occurs because of the helmet. Conclusions: Ski helmets might limit the ability of a skier to localize the direction of sounds of danger and might delay the moment at which a sound is first heard.

  18. Dementias show differential physiological responses to salient sounds

    Directory of Open Access Journals (Sweden)

    Phillip David Fletcher

    2015-03-01

    Full Text Available Abnormal responsiveness to salient sensory signals is often a prominent feature of dementia diseases, particularly the frontotemporal lobar degenerations, but has been little studied. Here we assessed processing of one important class of salient signals, looming sounds, in canonical dementia syndromes. We manipulated tones using intensity cues to create percepts of salient approaching (‘looming’) or less salient withdrawing sounds. Pupil dilatation responses and behavioural rating responses to these stimuli were compared in patients fulfilling consensus criteria for dementia syndromes (semantic dementia, n=10; behavioural variant frontotemporal dementia, n=16; progressive non-fluent aphasia, n=12; amnestic Alzheimer’s disease, n=10) and a cohort of 26 healthy age-matched individuals. Approaching sounds were rated as more salient than withdrawing sounds by healthy older individuals but this behavioural response to salience did not differentiate healthy individuals from patients with dementia syndromes. Pupil responses to approaching sounds were greater than responses to withdrawing sounds in healthy older individuals and in patients with semantic dementia: this differential pupil response was reduced in patients with progressive nonfluent aphasia and Alzheimer’s disease relative both to the healthy control and semantic dementia groups, and did not correlate with nonverbal auditory semantic function. Autonomic responses to auditory salience are differentially affected by dementias and may constitute a novel biomarker of these diseases.

  19. Dementias show differential physiological responses to salient sounds

    Science.gov (United States)

    Fletcher, Phillip D.; Nicholas, Jennifer M.; Shakespeare, Timothy J.; Downey, Laura E.; Golden, Hannah L.; Agustus, Jennifer L.; Clark, Camilla N.; Mummery, Catherine J.; Schott, Jonathan M.; Crutch, Sebastian J.; Warren, Jason D.

    2015-01-01

    Abnormal responsiveness to salient sensory signals is often a prominent feature of dementia diseases, particularly the frontotemporal lobar degenerations, but has been little studied. Here we assessed processing of one important class of salient signals, looming sounds, in canonical dementia syndromes. We manipulated tones using intensity cues to create percepts of salient approaching (“looming”) or less salient withdrawing sounds. Pupil dilatation responses and behavioral rating responses to these stimuli were compared in patients fulfilling consensus criteria for dementia syndromes (semantic dementia, n = 10; behavioral variant frontotemporal dementia, n = 16, progressive nonfluent aphasia, n = 12; amnestic Alzheimer's disease, n = 10) and a cohort of 26 healthy age-matched individuals. Approaching sounds were rated as more salient than withdrawing sounds by healthy older individuals but this behavioral response to salience did not differentiate healthy individuals from patients with dementia syndromes. Pupil responses to approaching sounds were greater than responses to withdrawing sounds in healthy older individuals and in patients with semantic dementia: this differential pupil response was reduced in patients with progressive nonfluent aphasia and Alzheimer's disease relative both to the healthy control and semantic dementia groups, and did not correlate with nonverbal auditory semantic function. Autonomic responses to auditory salience are differentially affected by dementias and may constitute a novel biomarker of these diseases. PMID:25859194

  20. Voice-to-Phoneme Conversion Algorithms for Voice-Tag Applications in Embedded Platforms

    Directory of Open Access Journals (Sweden)

    Yan Ming Cheng

    2008-08-01

    Full Text Available We describe two voice-to-phoneme conversion algorithms for speaker-independent voice-tag creation specifically targeted at applications on embedded platforms. These algorithms (batch mode and sequential) are compared in speech recognition experiments where they are first applied in a same-language context in which both acoustic model training and voice-tag creation and application are performed on the same language. Then, their performance is tested in a cross-language setting where the acoustic models are trained on a particular source language while the voice-tags are created and applied on a different target language. In the same-language environment, both algorithms either perform comparably to or significantly better than the baseline where utterances are manually transcribed by a phonetician. In the cross-language context, the voice-tag performances vary depending on the source-target language pair, with the variation reflecting predicted phonological similarity between the source and target languages. Among the most similar languages, performance nears that of the native-trained models and surpasses the native reference baseline.

  1. Effect of singing on respiratory function, voice, and mood after quadriplegia: a randomized controlled trial.

    Science.gov (United States)

    Tamplin, Jeanette; Baker, Felicity A; Grocke, Denise; Brazzale, Danny J; Pretto, Jeffrey J; Ruehland, Warren R; Buttifant, Mary; Brown, Douglas J; Berlowitz, David J

    2013-03-01

    To explore the effects of singing training on respiratory function, voice, mood, and quality of life for people with quadriplegia. Randomized controlled trial. Large, university-affiliated public hospital, Victoria, Australia. Participants (N=24) with chronic quadriplegia (C4-8, American Spinal Injury Association grades A and B). The experimental group (n=13) received group singing training 3 times weekly for 12 weeks. The control group (n=11) received group music appreciation and relaxation for 12 weeks. Assessments were conducted pre, mid-, immediately post-, and 6-months postintervention. Standard respiratory function testing, surface electromyographic activity from accessory respiratory muscles, sound pressure levels during vocal tasks, assessments of voice quality (Perceptual Voice Profile, Multidimensional Voice Profile), and Voice Handicap Index, Profile of Mood States, and Assessment of Quality of Life instruments. The singing group increased projected speech intensity (P=.028) and maximum phonation length (P=.007) significantly more than the control group. Trends for improvements in respiratory function, muscle strength, and recruitment were also evident for the singing group. These effects were limited by small sample sizes with large intersubject variability. Both groups demonstrated an improvement in mood (P=.002), which was maintained in the music appreciation and relaxation group after 6 months (P=.017). Group music therapy can have a positive effect on not only physical outcomes, but also can improve mood, energy, social participation, and quality of life for an at-risk population, such as those with quadriplegia. Specific singing therapy can augment these general improvements by improving vocal intensity. Copyright © 2013 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  2. Risk factors for voice problems in teachers.

    NARCIS (Netherlands)

    Kooijman, P.G.C.; Jong, F.I.C.R.S. de; Thomas, G.; Huinck, W.J.; Donders, A.R.T.; Graamans, K.; Schutte, H.K.

    2006-01-01

    In order to identify factors that are associated with voice problems and voice-related absenteeism in teachers, 1,878 questionnaires were analysed. The questionnaires inquired about personal data, voice complaints, voice-related absenteeism from work and conditions that may lead to voice complaints

  3. Risk factors for voice problems in teachers

    NARCIS (Netherlands)

    Kooijman, P. G. C.; de Jong, F. I. C. R. S.; Thomas, G.; Huinck, W.; Donders, R.; Graamans, K.; Schutte, H. K.

    2006-01-01

    In order to identify factors that are associated with voice problems and voice-related absenteeism in teachers, 1,878 questionnaires were analysed. The questionnaires inquired about personal data, voice complaints, voice-related absenteeism from work and conditions that may lead to voice complaints

  4. Active structural acoustic control for reduction of radiated sound from structure

    International Nuclear Information System (INIS)

    Hong, Jin Seok; Oh, Jae Eung

    2001-01-01

    Active control of sound radiation from a rectangular plate vibrating under a steady-state harmonic point-force disturbance is experimentally studied. Structural excitation is achieved by two piezoceramic actuators mounted on the panel, and two accelerometers are implemented as error sensors. Estimated radiated sound signals, obtained through a vibro-acoustic path transfer function, are used as error signals; the vibro-acoustic path transfer function represents the system between the accelerometers and the microphones. The approach is based on a multi-channel filtered-x LMS algorithm. The results show that attenuations of the sound level of 11 dB and 10 dB are achieved
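
    The multi-channel filtered-x LMS algorithm named above has a simple single-channel core. A minimal sketch, assuming an FIR estimate s_hat of the secondary (actuator-to-error-sensor) path is already available; names and the step size are illustrative, not the experiment's actual configuration.

```python
import numpy as np

def fxlms(reference, disturbance, s_hat, n_taps=64, mu=1e-3):
    """Single-channel filtered-x LMS sketch (the experiment above uses a
    multi-channel variant). `reference` drives the adaptive filter,
    `disturbance` is the primary noise at the error sensor, and `s_hat`
    is an FIR model of the secondary (actuator-to-sensor) path."""
    w = np.zeros(n_taps)                        # adaptive control filter
    x_hist = np.zeros(n_taps)                   # reference history
    fx_hist = np.zeros(n_taps)                  # filtered-reference history
    y_hist = np.zeros(len(s_hat))               # actuator output history
    x_filt = np.convolve(reference, s_hat)[:len(reference)]  # filtered-x
    errors = np.empty(len(reference))
    for n in range(len(reference)):
        x_hist = np.roll(x_hist, 1); x_hist[0] = reference[n]
        fx_hist = np.roll(fx_hist, 1); fx_hist[0] = x_filt[n]
        y = w @ x_hist                          # control (anti-noise) signal
        y_hist = np.roll(y_hist, 1); y_hist[0] = y
        e = disturbance[n] - s_hat @ y_hist     # residual at the error sensor
        w += mu * e * fx_hist                   # filtered-x LMS update
        errors[n] = e
    return w, errors
```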

  5. Development of an Amplifier for Electronic Stethoscope System and Heart Sound Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Kim, D. J.; Kang, D. K. [Chongju University, Chongju (Korea)

    2001-05-01

    The conventional stethoscope cannot store its stethoscopic sounds. A doctor therefore diagnoses a patient from the instantaneous stethoscopic sounds heard at that moment and cannot recall the state of the patient's stethoscopic sounds later, which prevents accurate and objective diagnosis. If an electronic stethoscope that can store the stethoscopic sound is developed, auscultation will be greatly improved. This study describes an amplifier for an electronic stethoscope system that can extract heart sounds of fetuses as well as adults and allows us to hear and record the sounds. Using the developed stethoscopic amplifier, clean heart sounds of fetus and adult can be heard in noisy environments, such as a consultation room of a university hospital or a university laboratory. Surprisingly, the heart sound of a 22-week fetus was heard through the developed electronic stethoscope. Pitch detection experiments using the detected heart sounds showed that the signal exhibits distinct periodicity. It can be expected that the developed electronic stethoscope can substitute for conventional stethoscopes, and if a proper analysis method for the stethoscopic signal is developed, a good electronic stethoscope system can be produced. (author). 17 refs., 6 figs.

  6. [Acoustic characteristics of adductor spasmodic dysphonia].

    Science.gov (United States)

    Yang, Yang; Wang, Li-Ping

    2008-06-01

    To explore the acoustic characteristics of adductor spasmodic dysphonia. The acoustic characteristics, including the acoustic signal of recorded voice, three-dimensional sonogram patterns and subjective assessment of voice, were compared between 10 patients (7 women, 3 men) with adductor spasmodic dysphonia and 10 healthy volunteers (5 women, 5 men). The main clinical manifestations of adductor spasmodic dysphonia were disorders of sound quality, rhyme and fluency: tension dysphonia when reading, acoustic jitter, momentary fluctuation of frequency and volume, voice squeezing, interruption, voice prolongation, and loss of normal chime. Among the 10 patients, there were 1 case of mild dysphonia (abnormal syllable number < 25%), 6 of moderate dysphonia (abnormal syllable number 25%-49%), 1 of severe dysphonia (abnormal syllable number 50%-74%) and 2 of extremely severe dysphonia (abnormal syllable number ≥ 75%). The average reading time in the 10 patients was 49 s, with reading-time extension and aphasia-area interruption in the acoustic signals, whereas the average reading time in the healthy control group was 30 s, without voice interruption. The aphasia ratio averaged 42%. The respective symptom syllables of different patients were demonstrated in the three-dimensional sonogram: there were voice onset time prolongation and irregular, interrupted or even absent vowel formants. The consonants of symptom syllables occasionally displayed absence or prolongation of the friction murmur in the block-friction murmur. The acoustic characteristics of adductor spasmodic dysphonia are disorders of sound quality, rhyme and fluency; the three-dimensional sonograms of the symptom syllables show distinctive changes in the affected vowel or consonant phonemes.

  7. Voice quality in relation to voice complaints and vocal fold condition during the screening of female student teachers.

    Science.gov (United States)

    Meulenbroek, Leo F P; de Jong, Felix I C R S

    2011-07-01

    The purpose of this study was to compare the perceptual examination of voice quality with the condition of the vocal folds and voice complaints during voice screening in female student teachers. This was a cross-sectional study of 214 starting student teachers using the four-point grade scale of the GRBAS and laryngostroboscopic assessment of the vocal folds. Voice quality was assessed by speech pathologists using the ordinal 4-point G-scale (overall dysphonia) of the GRBAS method in a running speech sample. Glottal closure and vocal fold lesions were recorded, and a questionnaire was used to assess voice complaints. More students with insufficient glottal closure (89%) were rated dysphonic than students with sufficient glottal closure (80%). Students with sufficient glottal closure had a significantly lower mean G-score (1.21) than the group with insufficient glottal closure (1.52) (P = 0.038). A larger percentage of students with vocal fold lesions (96%) were rated as having a dysphonic voice than students with no vocal fold lesions (81%). Students with no vocal fold lesions had a significantly lower mean G-score (1.20) than the group with vocal fold lesions (2.05) (P = 0.002). A dysphonic voice (G≥1) was rated in 76% of the students without voice complaints compared with 86% of the students with voice complaints. Students with no voice complaints had a lower mean G-score (1.07) than the group with voice complaints (1.41) (P = 0.090). The present study showed that perceptual assessment of the voice and voice complaints is not sufficient to determine whether the future professional is at risk. Preventive measures are therefore needed to detect students at risk early in their education, and this depends on a broader assessment: on the one hand, assessing voice quality and voice complaints, and on the other, examining the vocal folds of all starting students. Copyright © 2011 The Voice Foundation

  8. VOICE QUALITY BEFORE AND AFTER THYROIDECTOMY

    Directory of Open Access Journals (Sweden)

    Dora CVELBAR

    2016-04-01

    Full Text Available Introduction: Voice disorders are a well-known complication often associated with thyroid gland disease, and because voice is still the basic means of communication it is very important to maintain its quality. Objectives: The aims of this study were to determine whether there is a statistically significant difference between the results of voice self-assessment, perceptual voice assessment and acoustic voice analysis before and after thyroidectomy, and whether there are statistically significant correlations between the variables of voice self-assessment, perceptual assessment and acoustic analysis before and after thyroidectomy. Methods: The study included 12 participants aged between 41 and 76. Voice self-assessment was conducted with the Croatian version of the Voice Handicap Index (VHI). Recorded reading samples were used for perceptual assessment and were evaluated by two clinical speech and language therapists. Recorded samples of phonation were used for acoustic analysis, which was conducted with the acoustic program Praat. All of the data were processed through descriptive statistics and nonparametric statistical methods. Results: The results showed statistically significant differences between the results of voice self-assessment and the results of acoustic analysis before and after thyroidectomy. Statistically significant correlations were found between variables of perceptual assessment and acoustic analysis. Conclusion: The obtained results indicate the importance of multidimensional, preoperative and postoperative assessment. Such assessment allows the clinician to describe all of the voice features and to provide the patient with appropriate recommendations for further rehabilitation in order to optimize voice outcomes.

  9. A basic study on universal design of auditory signals in automobiles.

    Science.gov (United States)

    Yamauchi, Katsuya; Choi, Jong-dae; Maiguma, Ryo; Takada, Masayuki; Iwamiya, Shin-ichiro

    2004-11-01

    In this paper, the impressions made by various kinds of auditory signals currently used in automobiles, together with a comprehensive evaluation, were measured by a semantic differential method, and the desirable acoustic characteristics were examined for each type of auditory signal. Sharp sounds with dominant high-frequency components were not suitable for auditory signals in automobiles; this finding is expedient for the aged, whose auditory sensitivity in the high-frequency region is lower. When intermittent sounds were used, a longer OFF time was suitable. Generally, "dull (not sharp)" and "calm" sounds were appropriate for auditory signals. Furthermore, a comparison between the frequency spectrum of interior noise in automobiles and that of sounds rated suitable for various auditory signals indicates that the suitable sounds are not easily masked. Choosing suitable auditory signals for the various purposes is thus a good solution from the viewpoint of universal design.

  10. Enhancement and Noise Statistics Estimation for Non-Stationary Voiced Speech

    DEFF Research Database (Denmark)

    Nørholm, Sidsel Marie; Jensen, Jesper Rindom; Christensen, Mads Græsbøll

    2016-01-01

    In this paper, single channel speech enhancement in the time domain is considered. We address the problem of modelling non-stationary speech by describing the voiced speech parts by a harmonic linear chirp model instead of using the traditional harmonic model. This means that the speech signal...... through simulations on synthetic and speech signals, that the chirp versions of the filters perform better than their harmonic counterparts in terms of output signal-to-noise ratio (SNR) and signal reduction factor. For synthetic signals, the output SNR for the harmonic chirp APES based filter...... is increased 3 dB compared to the harmonic APES based filter at an input SNR of 10 dB, and at the same time the signal reduction factor is decreased. For speech signals, the increase is 1.5 dB along with a decrease in the signal reduction factor of 0.7. As an implicit part of the APES filter, a noise...
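
    For reference, the harmonic linear chirp model mentioned above lets every partial track a fundamental that varies linearly in time, rather than the constant fundamental of the traditional harmonic model. A minimal generator sketch; the symbols f0 (start frequency) and k (chirp rate) are illustrative, not the paper's notation.

```python
import numpy as np

def harmonic_chirp(f0, k, n_harmonics, amps, fs, duration):
    """Harmonic linear chirp model: every partial is an integer multiple
    of a fundamental whose frequency varies linearly, f(t) = f0 + k*t,
    so the l-th harmonic has instantaneous phase 2*pi*l*(f0*t + k*t**2/2)."""
    t = np.arange(int(fs * duration)) / fs
    x = np.zeros_like(t)
    for l in range(1, n_harmonics + 1):
        x += amps[l - 1] * np.cos(2 * np.pi * l * (f0 * t + 0.5 * k * t**2))
    return x

# e.g. a 150 Hz voiced segment whose pitch rises 40 Hz over 50 ms:
sig = harmonic_chirp(f0=150.0, k=800.0, n_harmonics=5,
                     amps=[1, .7, .5, .3, .2], fs=8000, duration=0.05)
```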

  11. Experiment and practice on signal processing

    International Nuclear Information System (INIS)

    2002-11-01

    This book covers basic practice with CEM Tool; experiments and practice on discrete-time signals and systems; experiments and practice on discrete-time signal sampling; practice in frequency analysis; experiments in digital filter design; applications of digital signal processing; a project related to voice; basic principles of signal processing; techniques of basic image signal processing; applications of image signal processing to biology, astronomy and robot soccer; video signal control; and a project related to images. An introduction to CEM Linker I/O is included at the end.

  12. Experiment and practice on signal processing

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2002-11-15

    This book covers basic practice with CEM Tool; experiments and practice on discrete-time signals and systems; experiments and practice on discrete-time signal sampling; practice in frequency analysis; experiments in digital filter design; applications of digital signal processing; a project related to voice; basic principles of signal processing; techniques of basic image signal processing; applications of image signal processing to biology, astronomy and robot soccer; video signal control; and a project related to images. An introduction to CEM Linker I/O is included at the end.

  13. Crossing Cultures with Multi-Voiced Journals

    Science.gov (United States)

    Styslinger, Mary E.; Whisenant, Alison

    2004-01-01

    In this article, the authors discuss the benefits of using multi-voiced journals as a teaching strategy in reading instruction. Multi-voiced journals, an adaptation of dual-voiced journals, encourage responses to reading in the varied, cultured voices of characters. They are similar to reading journals in that they prod students to connect to the lives…

  14. [Applicability of Voice Handicap Index to the evaluation of voice therapy effectiveness in teachers].

    Science.gov (United States)

    Niebudek-Bogusz, Ewa; Kuzańska, Anna; Błoch, Piotr; Domańska, Maja; Woźnicka, Ewelina; Politański, Piotr; Sliwińska-Kowalska, Mariola

    2007-01-01

    The aim of this study was to assess the applicability of the Voice Handicap Index (VHI) to the evaluation of the effectiveness of treatment of functional voice disorders in teachers. The subjects were 45 female teachers with functional dysphonia who evaluated their voice problems on the subjective VHI scale before and after phoniatric management. Group I (29 patients) underwent vocal training, whereas group II (16 patients) received only voice hygiene instructions. The results demonstrated that the differences between mean VHI scores before and after phoniatric treatment were significantly greater in group I than in group II (p < 0.05), supporting the applicability of the VHI to evaluating the effectiveness of treatment of teachers' dysphonia.

  15. Voice parameters and videonasolaryngoscopy in children with vocal nodules: a longitudinal study, before and after voice therapy.

    Science.gov (United States)

    Valadez, Victor; Ysunza, Antonio; Ocharan-Hernandez, Esther; Garrido-Bustamante, Norma; Sanchez-Valerio, Araceli; Pamplona, Ma C

    2012-09-01

    Vocal Nodules (VN) are a functional voice disorder associated with voice misuse and abuse in children. There are few reports addressing vocal parameters in children with VN, especially after a period of vocal rehabilitation. The purpose of this study is to describe measurements of vocal parameters, including Fundamental Frequency (FF), Shimmer (S), and Jitter (J), videonasolaryngoscopy examination and clinical perceptual assessment, before and after voice therapy in children with VN. Voice therapy was provided using visual support through Speech-Viewer software. Twenty patients with VN were studied. An acoustical analysis of voice was performed and compared with data from subjects from a control group matched by age and gender. Also, clinical perceptual assessment of voice and videonasolaryngoscopy were performed for all patients with VN. After a period of voice therapy, provided with visual support using Speech Viewer-III (SV-III-IBM) software, new acoustical analyses, perceptual assessments and videonasolaryngoscopies were performed. Before the onset of voice therapy, there was a significant difference (p < 0.05) in acoustic parameters between patients and controls. After the voice therapy period, a significant improvement (p < 0.05) was observed, and the vocal nodules were no longer discernible on the vocal folds in any of the cases. SV-III software seems to be a safe and reliable method for providing voice therapy in children with VN. Acoustic voice parameters, perceptual data and videonasolaryngoscopy were significantly improved after the speech therapy period was completed. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  16. Interactive Augmentation of Voice Quality and Reduction of Breath Airflow in the Soprano Voice.

    Science.gov (United States)

    Rothenberg, Martin; Schutte, Harm K

    2016-11-01

    In 1985, at a conference sponsored by the National Institutes of Health, Martin Rothenberg first described a form of nonlinear source-tract acoustic interaction mechanism that some sopranos, singing in their high range, can use to reduce the total airflow, allowing a note to be held longer, while simultaneously enriching the quality of the voice without straining it. (M. Rothenberg, "Source-Tract Acoustic Interaction in the Soprano Voice and Implications for Vocal Efficiency," Fourth International Conference on Vocal Fold Physiology, New Haven, Connecticut, June 3-6, 1985.) In this paper, we describe additional evidence for this type of nonlinear source-tract interaction in some soprano singing and describe an analogous interaction phenomenon in communication engineering. We also present some implications for voice research and pedagogy. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  17. Interventions for preventing voice disorders in adults.

    Science.gov (United States)

    Ruotsalainen, J H; Sellman, J; Lehto, L; Jauhiainen, M; Verbeek, J H

    2007-10-17

    Poor voice quality due to a voice disorder can lead to a reduced quality of life. In occupations where voice use is substantial it can lead to periods of absence from work. To evaluate the effectiveness of interventions to prevent voice disorders in adults. We searched MEDLINE (PubMed, 1950 to 2006), EMBASE (1974 to 2006), CENTRAL (The Cochrane Library, Issue 2 2006), CINAHL (1983 to 2006), PsychINFO (1967 to 2006), Science Citation Index (1986 to 2006) and the Occupational Health databases OSH-ROM (to 2006). The date of the last search was 05/04/06. Randomised controlled clinical trials (RCTs) of interventions evaluating the effectiveness of treatments to prevent voice disorders in adults. For work-directed interventions interrupted time series and prospective cohort studies were also eligible. Two authors independently extracted data and assessed trial quality. Meta-analysis was performed where appropriate. We identified two randomised controlled trials including a total of 53 participants in intervention groups and 43 controls. One study was conducted with teachers and the other with student teachers. Both trials were of poor quality. Interventions were grouped into 1) direct voice training, 2) indirect voice training and 3) direct and indirect voice training combined. 1) Direct voice training: One study did not find a significant decrease of the Voice Handicap Index for direct voice training compared to no intervention. 2) Indirect voice training: One study did not find a significant decrease of the Voice Handicap Index for indirect voice training when compared to no intervention. 3) Direct and indirect voice training combined: One study did not find a decrease of the Voice Handicap Index for direct and indirect voice training combined when compared to no intervention. The same study did however find an improvement in maximum phonation time (Mean Difference -3.18 sec; 95 % CI -4.43 to -1.93) for direct and indirect voice training combined when compared to no

  18. Designing a Voice Controlled Interface For Radio : Guidelines for The First Generation of Voice Controlled Public Radio

    OpenAIRE

    Päärni, Anna

    2017-01-01

    From being a fictional element in sci-fi, voice control has become a reality, with inventions such as Apple's Siri and interactive voice response (IVR) when calling your doctor's office. The combination of radio's strength as a hands-free medium, public radio's mission to reach across all platforms, and the rise of voice makes up a relevant intersection: voice-controlled public radio in Sweden. This thesis has aimed to investigate how radio listeners wish to interact using voice control to li...

  19. Synthesis of vibroarthrographic signals in knee osteoarthritis diagnosis training.

    Science.gov (United States)

    Shieh, Chin-Shiuh; Tseng, Chin-Dar; Chang, Li-Yun; Lin, Wei-Chun; Wu, Li-Fu; Wang, Hung-Yu; Chao, Pei-Ju; Chiu, Chien-Liang; Lee, Tsair-Fwu

    2016-07-19

    Vibroarthrographic (VAG) signals are used as useful indicators of knee osteoarthritis (OA) status. The objective was to build a template database of knee crepitus sounds; interns can practice with the template database to shorten the time of training for the diagnosis of OA. Knee sound signals were obtained using an innovative stethoscope device with a goniometer, and each knee sound signal was recorded with a Kellgren-Lawrence (KL) grade. The sound signal was segmented according to the goniometer data, Fourier transformed on the correlated frequency segment, and inverse Fourier transformed to obtain the time-domain signal. A Haar wavelet transform was then applied. The median and the mean of the wavelet coefficients were used to inverse transform a synthesized signal for each KL category. The quality of the synthesized signals was assessed by a clinician, and the sample signals were evaluated using the two algorithms (median and mean). The accuracy rate of the median-coefficient algorithm (93 %) was better than that of the mean-coefficient algorithm (88 %) in cross-validation by a clinician using the synthesized VAG signals. The artificial signals we synthesized have the potential to support a learning system for medical students, interns and paramedical personnel for the diagnosis of OA. Our method therefore provides a feasible way to evaluate crepitus sounds that may assist in the diagnosis of knee OA.
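
    The median-coefficient synthesis step described above can be sketched with PyWavelets. A minimal sketch, assuming the VAG segments of one KL grade are already extracted, aligned, and of equal length; the function name and decomposition level are illustrative, not the paper's exact configuration.

```python
import numpy as np
import pywt

def synthesize_template(segments, level=4):
    """Template synthesis as described in the record: Haar-wavelet
    transform each equal-length VAG segment of one KL grade, take the
    median of the coefficients across segments (the better-performing
    variant per the abstract), and inverse transform the result."""
    coeff_sets = [pywt.wavedec(s, "haar", level=level) for s in segments]
    median_coeffs = [
        np.median(np.stack([c[band] for c in coeff_sets]), axis=0)
        for band in range(level + 1)            # [cA_n, cD_n, ..., cD_1]
    ]
    return pywt.waverec(median_coeffs, "haar")  # synthesized template signal
```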

  20. Application of computer voice input/output

    International Nuclear Information System (INIS)

    Ford, W.; Shirk, D.G.

    1981-01-01

    The advent of microprocessors and other large-scale integration (LSI) circuits is making voice input and output for computers and instruments practical; specialized LSI chips for speech processing are appearing on the market. Voice can be used to input data or to issue instrument commands; this allows the operator to engage in other tasks, move about, and use standard data entry systems. Voice synthesizers can generate audible, easily understood instructions. Using voice characteristics, a control system can verify speaker identity for security purposes. Two simple voice-controlled systems have been designed at Los Alamos for nuclear safeguards applications. Each can easily be expanded as time allows. The first system is for instrument control; it accepts voice commands and issues audible operator prompts. The second system is for access control: the speaker's voice is used to verify his identity and to actuate external devices

  1. Presidential, But Not Prime Minister, Candidates With Lower Pitched Voices Stand a Better Chance of Winning the Election in Conservative Countries.

    Science.gov (United States)

    Banai, Benjamin; Laustsen, Lasse; Banai, Irena Pavela; Bovan, Kosta

    2018-01-01

    Previous studies have shown that voters rely on sexually dimorphic traits that signal masculinity and dominance when they choose political leaders. For example, voters show strong preferences for candidates with lower pitched voices because these candidates are perceived as stronger and more competent. Moreover, experimental studies demonstrate that conservative voters, more than liberals, prefer political candidates with traits that signal dominance, probably because conservatives are more likely to perceive the world as a threatening place and to be more attentive to dangerous and threatening contexts. In light of these findings, this study investigates whether country-level ideology influences the relationship between candidate voice pitch and the outcomes of real elections. Specifically, we collected voice pitch data for presidential and prime minister candidates, aggregate national ideology for the countries in which the candidates were nominated, and measures of electoral outcomes for 69 elections held across the world. In line with previous studies, we found that candidates with lower pitched voices received more votes and had a greater likelihood of winning the elections. Furthermore, regression analysis revealed an interaction between candidate voice pitch, national ideology, and election type (presidential or parliamentary). That is, having a lower pitched voice was a particularly valuable asset for presidential candidates in conservative and right-leaning countries (in comparison to presidential candidates in liberal and left-leaning countries and to parliamentary elections). We discuss the practical implications of these findings, and how they relate to existing research on candidates' voices, voting preferences, and democratic elections in general.

  2. A system for heart sounds classification.

    Directory of Open Access Journals (Sweden)

    Grzegorz Redlarski

    Full Text Available The future of quick and efficient disease diagnosis lies in the development of reliable non-invasive methods. For cardiac diseases - one of the major causes of death around the globe - the concept of an electronic stethoscope equipped with an automatic heart tone identification system appears to be the best solution. Thanks to advances in technology, the quality of phonocardiography signals is no longer an issue. However, appropriate algorithms for auto-diagnosis systems of heart diseases capable of distinguishing most known pathological states have not yet been developed. The main issues are the non-stationary character of phonocardiography signals and the wide range of distinguishable pathological heart sounds. In this paper a new heart sound classification technique, which might find use in medical diagnostic systems, is presented. It is shown that by combining Linear Predictive Coding coefficients, used for feature extraction, with a classifier built upon a combination of a Support Vector Machine and the Modified Cuckoo Search algorithm, an improvement in the performance of the diagnostic system, in terms of accuracy, complexity and the range of distinguishable heart sounds, can be made. The developed system achieved accuracy above 93% for all considered cases, including simultaneous identification of twelve different heart sound classes. The system is compared with four major classification methods, demonstrating its reliability.
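
    The feature-and-classifier pipeline described above can be outlined as follows. A minimal sketch in which a plain grid search stands in for the Modified Cuckoo Search hyper-parameter optimization, and the heart-sound frames and labels are synthetic placeholders rather than real phonocardiogram data.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

def lpc_coefficients(frame, order=10):
    """LPC coefficients via the Levinson-Durbin recursion on the
    autocorrelation sequence of one phonocardiogram frame."""
    x = frame - np.mean(frame)
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]
    a = np.zeros(order + 1)
    a[0], err = 1.0, r[0]
    for i in range(1, order + 1):
        if err <= 0:                    # degenerate (e.g. silent) frame
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                  # reflection coefficient
        new = a.copy()
        new[i] = k
        for j in range(1, i):
            new[j] = a[j] + k * a[i - j]
        a, err = new, err * (1.0 - k * k)
    return a[1:]

# Placeholder data for illustration only; real use would substitute
# labelled, pre-segmented heart-sound frames.
rng = np.random.default_rng(0)
frames = rng.standard_normal((40, 2000))
labels = rng.integers(0, 2, 40)

X = np.array([lpc_coefficients(f) for f in frames])
# Grid search stands in for the paper's Modified Cuckoo Search tuning.
clf = GridSearchCV(SVC(kernel="rbf"),
                   {"C": [1, 10, 100], "gamma": ["scale", 0.01, 0.001]})
clf.fit(X, labels)
```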

  3. Voice and silence in organizations

    Directory of Open Access Journals (Sweden)

    Moaşa, H.

    2011-01-01

    Full Text Available Unlike previous research on voice and silence, this article breaks the distance between the two and declines to treat them as opposites. Voice and silence are interrelated and intertwined strategic forms of communication which presuppose each other in such a way that the absence of one would minimize completely the other's presence. Social actors are not voice or silence. Social actors can have voice or silence, and they can do both because they operate at multiple levels and deal with multiple issues at different moments in time.

  4. Moth hearing and sound communication

    DEFF Research Database (Denmark)

    Nakano, Ryo; Takanashi, Takuma; Surlykke, Annemarie

    2015-01-01

    Active echolocation enables bats to orient and hunt the night sky for insects. As a counter-measure against the severe predation pressure many nocturnal insects have evolved ears sensitive to ultrasonic bat calls. In moths bat-detection was the principal purpose of hearing, as evidenced by comparable hearing physiology with best sensitivity in the bat echolocation range, 20–60 kHz, across moths in spite of diverse ear morphology. Some eared moths subsequently developed sound-producing organs to warn/startle/jam attacking bats and/or to communicate intraspecifically with sound. Not only the sounds for interaction with bats, but also mating signals are within the frequency range where bats echolocate, indicating that sound communication developed after hearing by "sensory exploitation". Recent findings on moth sound communication reveal that close-range (~ a few cm) communication with low...

  5. Validity of Mind Monitoring System as a Mental Health Indicator using Voice

    Directory of Open Access Journals (Sweden)

    Naoki Hagiwara

    2017-05-01

    Full Text Available We have been developing a method of evaluating a person's mental health condition based on the sound of their voice. We have applied this technology in a smartphone application that reports vitality and mental activity as mental health condition indices. Using voice to measure one's mental health condition is a non-invasive method, and the application can be used continually through smartphone calls: unlike a periodic yearly checkup, it can be used for monitoring on a daily basis. The purpose of this study was to compare the vitality index to the widely used Beck Depression Inventory (BDI) and to evaluate its validity. The experiment was conducted at the Center of Innovation Program of the University of Tokyo with 50 employees of one corporation as participants between early December 2015 and early February 2016. Each participant was given a smartphone with our application, which recorded his/her voice automatically during calls; in addition, the participants had to read and record a fixed phrase daily. The BDI test was conducted at the beginning of the experimental period. The vitality index was calculated from the voice data collected during the first two weeks of the experiment and was treated as the vitality index at the time the BDI test was conducted. When the vitality and mental activity indices were compared to the BDI score, we found a negative correlation between the BDI score and these indices. Additionally, these indices were useful for identifying participants at high risk of disease with high BDI scores, and the mental activity index showed higher performance than the vitality index.

  6. Sound generating flames of a gas turbine burner observed by laser-induced fluorescence

    Energy Technology Data Exchange (ETDEWEB)

    Hubschmid, W.; Inauen, A.; Bombach, R.; Kreutner, W.; Schenker, S.; Zajadatz, M.; Motz, C.; Haffner, K.; Paschereit, C.O. [Alstom (Switzerland)]

    2002-03-01

    We performed 2-D OH LIF measurements to investigate the sound emission of a gas turbine combustor. The measured LIF signal was averaged over pulses at constant phase of the dominant acoustic oscillation. A periodic variation in intensity and position of the signal is observed and it is related to the measured sound intensity. (author)

  7. A framework for automatic heart sound analysis without segmentation

    Directory of Open Access Journals (Sweden)

    Tungpimolrut Kanokvate

    2011-02-01

    Full Text Available Background: A new framework for heart sound analysis is proposed. One of the most difficult processes in heart sound analysis is segmentation, due to interference from murmurs. Method: An equal number of cardiac cycles were extracted from heart sounds with different heart rates using information from the envelopes of autocorrelation functions, without the need to label individual fundamental heart sounds (FHS). The complete method consists of envelope detection, calculation of cardiac cycle lengths using autocorrelation of the envelope signals, feature extraction using the discrete wavelet transform, principal component analysis, and classification using neural network bagging predictors. Results: The proposed method was tested on a set of heart sounds obtained from several on-line databases and recorded with an electronic stethoscope. The geometric mean was used as the performance index. Average classification performance using ten-fold cross-validation was 0.92 for the noise-free case, 0.90 under white noise with 10 dB signal-to-noise ratio (SNR), and 0.90 under impulse noise of up to 0.3 s duration. Conclusion: The proposed method showed promising results and high noise robustness for a wide range of heart sounds. However, more tests are needed to address any bias that may have been introduced by the different sources of heart sounds in the current training set, and to concretely validate the method. Further work includes building a new training set recorded from actual patients, and further evaluating the method on this new training set.
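
    The cycle-length step of this framework admits a compact sketch: take the envelope of the recording, autocorrelate it, and read the cardiac cycle length off the first prominent non-zero-lag peak. A minimal numpy/scipy version; the Hilbert envelope and the thresholds are illustrative choices, not the paper's exact settings.

```python
import numpy as np
from scipy.signal import hilbert, find_peaks

def cardiac_cycle_length(pcg, fs):
    """Estimate the cardiac cycle length (in samples) from a heart-sound
    recording without labelling S1/S2: compute the signal envelope,
    autocorrelate it, and locate the first prominent non-zero-lag peak."""
    env = np.abs(hilbert(pcg))                 # amplitude envelope
    env = env - env.mean()
    ac = np.correlate(env, env, mode="full")[len(env) - 1:]
    ac /= ac[0]                                # normalize to zero-lag value
    min_lag = int(0.25 * fs)                   # ignore cycles above 240 bpm
    peaks, _ = find_peaks(ac[min_lag:], height=0.1)
    return min_lag + peaks[0] if len(peaks) else None
```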

  8. Audio-visual identification of place of articulation and voicing in white and babble noise.

    Science.gov (United States)

    Alm, Magnus; Behne, Dawn M; Wang, Yue; Eg, Ragnhild

    2009-07-01

    Research shows that noise and phonetic attributes influence the degree to which auditory and visual modalities are used in audio-visual speech perception (AVSP). Research has, however, mainly focused on white noise and single phonetic attributes, thus neglecting the more common babble noise and possible interactions between phonetic attributes. This study explores whether white and babble noise differentially influence AVSP and whether these differences depend on phonetic attributes. White and babble noise of 0 and -12 dB signal-to-noise ratio were added to congruent and incongruent audio-visual stop consonant-vowel stimuli. The audio (A) and video (V) of incongruent stimuli differed either in place of articulation (POA) or voicing. Responses from 15 young adults show that, compared to white noise, babble resulted in more audio responses for POA stimuli, and fewer for voicing stimuli. Voiced syllables received more audio responses than voiceless syllables. Results can be attributed to discrepancies in the acoustic spectra of both the noise and speech target. Voiced consonants may be more auditorily salient than voiceless consonants which are more spectrally similar to white noise. Visual cues contribute to identification of voicing, but only if the POA is visually salient and auditorily susceptible to the noise type.

  9. Method for measuring violin sound radiation based on bowed glissandi and its application to sound synthesis.

    Science.gov (United States)

    Perez Carrillo, Alfonso; Bonada, Jordi; Patynen, Jukka; Valimaki, Vesa

    2011-08-01

    This work presents a method for measuring and computing violin-body directional frequency responses, which are used for violin sound synthesis. The approach is based on a frame-weighted deconvolution of excitation and response signals. The excitation, consisting of bowed glissandi, is measured with piezoelectric transducers built into the bridge. Radiation responses are recorded in an anechoic chamber with multiple microphones placed at different angles around the violin. The proposed deconvolution algorithm computes impulse responses that, when convolved with any source signal (captured with the same transducer), produce a highly realistic violin sound very similar to that of a microphone recording. The use of motion sensors allows for tracking violin movements. Combining this information with the directional responses and using a dynamic convolution algorithm, helps to improve the listening experience by incorporating the violinist motion effect in stereo.
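
    A frame-weighted deconvolution in the spirit described above can be sketched as a per-frame spectral division in which each frame is weighted by its excitation energy (this reduces to the classic H1 cross-spectral estimator). A simplified single-microphone sketch that omits the directional, multi-microphone aspect and the motion tracking; all names and parameters are illustrative.

```python
import numpy as np

def frame_weighted_deconvolution(excitation, response, n_fft=4096,
                                 hop=2048, eps=1e-8):
    """Estimate a body/radiation impulse response from a bridge-transducer
    excitation signal and a microphone response. Each frame contributes a
    spectral ratio, weighted by the excitation energy of that frame (a
    simplified reading of the frame-weighted deconvolution idea)."""
    win = np.hanning(n_fft)
    num = np.zeros(n_fft // 2 + 1, dtype=complex)
    den = np.zeros(n_fft // 2 + 1)
    for start in range(0, len(excitation) - n_fft, hop):
        X = np.fft.rfft(win * excitation[start:start + n_fft])
        Y = np.fft.rfft(win * response[start:start + n_fft])
        num += Y * np.conj(X)          # accumulate the cross-spectrum
        den += np.abs(X) ** 2          # excitation energy acts as the weight
    H = num / (den + eps)              # regularized spectral division
    return np.fft.irfft(H)             # impulse-response estimate
```

    Convolving any new bridge-transducer signal with the estimated impulse response then approximates the microphone recording, which is the synthesis use described in the record.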

  10. Clinical voice analysis of Carnatic singers.

    Science.gov (United States)

    Arunachalam, Ravikumar; Boominathan, Prakash; Mahalingam, Shenbagavalli

    2014-01-01

    Carnatic singing is a classical South Indian style of music that involves rigorous training to produce an "open throated" loud, predominantly low-pitched singing, embedded with vocal nuances in higher pitches. Voice problems in singers are not uncommon. The objective was to report the nature of voice problems and to apply a routine protocol to assess the voice. Forty-five trained performing singers (females: 36, males: 9) who reported to a tertiary care hospital with voice problems underwent voice assessment. The study analyzed their problems and the clinical findings. Voice change, difficulty in singing higher pitches, and voice fatigue were the major complaints. Most of the singers suffered from laryngopharyngeal reflux that coexisted with muscle tension dysphonia and chronic laryngitis. Speaking voices were rated predominantly as showing "moderate deviation" on GRBAS (Grade, Rough, Breathy, Asthenia, and Strain). Maximum phonation time ranged from 4 to 29 seconds (females: 10.2, standard deviation [SD]: 5.28; males: 15.7, SD: 5.79). The singing frequency range was reduced (females: 21.3 semitones; males: 23.99 semitones). Dysphonia severity index (DSI) scores ranged from -3.5 to 4.91 (females: 0.075; males: 0.64). Singing frequency range and DSI did not show a significant difference between sexes or across clinical diagnoses. Self-perception using the Voice Disorder Outcome Profile revealed an overall severity score of 5.1 (SD: 2.7). Findings are discussed from a clinical intervention perspective. The study highlighted the nature of the voice problems (hyperfunctional) and the modifications required in the assessment protocol for Carnatic singers. The need for regular assessment and vocal hygiene education to maintain good vocal health is emphasized. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
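
    The Dysphonia Severity Index reported here is conventionally computed from four measures (Wuyts et al., 2000). Assuming the authors used the standard formulation, it reads:

```latex
% Standard DSI (Wuyts et al., 2000); MPT = maximum phonation time,
% F0-high = highest attainable fundamental frequency,
% I-low = softest attainable intensity.
\mathrm{DSI} = 0.13 \times \mathrm{MPT}\,[\mathrm{s}]
             + 0.0053 \times F_{0,\mathrm{high}}\,[\mathrm{Hz}]
             - 0.26 \times I_{\mathrm{low}}\,[\mathrm{dB}]
             - 1.18 \times \mathrm{Jitter}\,[\%] + 12.4
```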

  11. Voice Biometrics for Information Assurance Applications

    National Research Council Canada - National Science Library

    Kang, George

    2002-01-01

    .... The ultimate goal of voice biometrics is to enable the use of voice as a password. Voice biometrics are "man-in-the-loop" systems in which system performance is significantly dependent on human performance...

  12. Sound localization with head movement: implications for 3-d audio displays.

    Directory of Open Access Journals (Sweden)

    Ken Ian McAnally

    2014-08-01

    Full Text Available Previous studies have shown that the accuracy of sound localization is improved if listeners are allowed to move their heads during signal presentation. This study describes the function relating localization accuracy to the extent of head movement in azimuth. Sounds that are difficult to localize were presented in the free field from sources at a wide range of azimuths and elevations. Sounds remained active until the participants' heads had rotated through windows 2°, 4°, 8°, 16°, 32°, or 64° of azimuth in width. Error in determining sound-source elevation and the rate of front/back confusion were found to decrease with increases in azimuth window width. Error in determining sound-source lateral angle was not found to vary with azimuth window width. Implications for 3-d audio displays: The utility of a 3-d audio display for imparting spatial information is likely to be improved if operators are able to move their heads during signal presentation. Head movement may compensate in part for a paucity of spectral cues to sound-source location resulting from limitations in either the audio signals presented or the directional filters (i.e., head-related transfer functions) used to generate a display. However, head movements of a moderate size (i.e., through around 32° of azimuth) may be required to ensure that spatial information is conveyed with high accuracy.

  13. Effects of musical expertise on oscillatory brain activity in response to emotional sounds.

    Science.gov (United States)

    Nolden, Sophie; Rigoulot, Simon; Jolicoeur, Pierre; Armony, Jorge L

    2017-08-01

    Emotions can be conveyed through a variety of channels in the auditory domain, be it via music, non-linguistic vocalizations, or speech prosody. Moreover, recent studies suggest that expertise in one sound category can impact the processing of emotional sounds in other sound categories, as musicians have been found to process emotional musical and vocal sounds more efficiently than non-musicians. However, the neural correlates of these modulations, especially their time course, are not very well understood. Consequently, we focused here on how the neural processing of emotional information varies as a function of sound category and expertise of participants. The electroencephalogram (EEG) of 20 non-musicians and 17 musicians was recorded while they listened to vocal (speech and vocalizations) and musical sounds. The amplitude of EEG-oscillatory activity in the theta, alpha, beta, and gamma bands was quantified, and Independent Component Analysis (ICA) was used to identify underlying components of brain activity in each band. Category differences were found in theta and alpha bands, due to larger responses to music and speech than to vocalizations, and in posterior beta, mainly due to differential processing of speech. In addition, we observed greater activation in frontal theta and alpha for musicians than for non-musicians, as well as an interaction between expertise and emotional content of sounds in frontal alpha. The results reflect musicians' expertise in recognition of emotion-conveying music, which seems to also generalize to emotional expressions conveyed by the human voice, in line with previous accounts of effects of expertise on musical and vocal sounds processing. Copyright © 2017 Elsevier Ltd. All rights reserved.
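
    Band-limited power of the kind quantified here can be sketched with a Welch PSD; the band edges and sampling rate below are conventional assumptions, not the study's exact pipeline (which also used ICA).

```python
# A minimal sketch of per-band EEG power quantification for one channel;
# band edges and sampling rate are conventional, not the paper's settings.
import numpy as np
from scipy.signal import welch

BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}

def band_power(eeg, fs=250.0):
    """Return mean spectral power in each band for a 1-D EEG signal."""
    freqs, psd = welch(eeg, fs=fs, nperseg=int(2 * fs))  # 2 s windows
    return {name: psd[(freqs >= lo) & (freqs < hi)].mean()
            for name, (lo, hi) in BANDS.items()}

# Example with simulated data (10 s of noise at 250 Hz)
rng = np.random.default_rng(0)
print(band_power(rng.standard_normal(250 * 10)))
```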

  14. The relation of vocal fold lesions and voice quality to voice handicap and psychosomatic well-being

    NARCIS (Netherlands)

    Smits, R.; Marres, H.A.; de Jong, F.

    2012-01-01

    BACKGROUND: Voice disorders have a multifactorial genesis and may be present in various ways. They can cause a significant communication handicap and impaired quality of life. OBJECTIVE: To assess the effect of vocal fold lesions and voice quality on voice handicap and psychosomatic well-being.

  15. Time course of the influence of musical expertise on the processing of vocal and musical sounds.

    Science.gov (United States)

    Rigoulot, S; Pell, M D; Armony, J L

    2015-04-02

    Previous functional magnetic resonance imaging (fMRI) studies have suggested that different cerebral regions preferentially process human voice and music. Yet, little is known about the temporal course of the brain processes that decode the category of sounds, or about how expertise in one sound category can impact these processes. To address this question, we recorded the electroencephalogram (EEG) of 15 musicians and 18 non-musicians while they were listening to short musical excerpts (piano and violin) and vocal stimuli (speech and non-linguistic vocalizations). The task of the participants was to detect noise targets embedded within the stream of sounds. Event-related potentials revealed an early differentiation of sound category, within the first 100 ms after the onset of the sound, with mostly increased responses to musical sounds. Importantly, this effect was modulated by the musical background of participants, as musicians were more responsive to music sounds than non-musicians, consistent with the notion that musical training increases sensitivity to music. In late temporal windows, brain responses were enhanced in response to vocal stimuli, but musicians were still more responsive to music. These results shed new light on the temporal course of neural dynamics of auditory processing and reveal how it is impacted by the stimulus category and the expertise of participants. Copyright © 2015 IBRO. Published by Elsevier Ltd. All rights reserved.

  16. Detection System of Sound Noise Level (SNL) Based on Condenser Microphone Sensor

    Science.gov (United States)

    Rajagukguk, Juniastel; Eka Sari, Nurdieni

    2018-03-01

    The research aims to measure noise levels by using an Arduino Uno to process input from a sensor, in a system called the Sound Noise Level (SNL) detector. The instrument works as a noise detector that reports the measured noise level on an LCD and through audiovisual notifications. Noise is detected with a condenser microphone sensor and an LM567 IC, assembled so that sounds captured by the sensor are converted into a sinusoidal electrical signal (an alternating current) that the Arduino Uno can process. The instrument is equipped with a set of indicator LEDs and a sounder, as well as a text notification on a 16x2 LCD. The indicators are set so that if the measured noise exceeds 75 dB, the sounder beeps and the red LED lights up, indicating danger. If the measured value is higher than 56 dB, the sounder beeps and the yellow LED lights up, indicating a noisy environment. If the measured noise is below 55 dB, the sounder stays quiet, indicating a peaceful environment. The results show that the SNL is capable of detecting and displaying noise levels over a measuring range of 50-100 dB and of delivering audiovisual noise notifications.
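
    The indicator logic reduces to two thresholds; a minimal sketch in Python of the decision rule (on the real device this runs on the Arduino with LED and buzzer outputs):

```python
# A minimal sketch of the SNL indicator logic described above; thresholds
# follow the text (>75 dB danger, >56 dB noisy, below that peaceful).
def classify_noise(spl_db):
    """Map a dB reading to (status, beep) per the thresholds above."""
    if spl_db > 75:
        return "DANGER (red LED)", True
    if spl_db > 56:
        return "NOISY (yellow LED)", True
    return "PEACEFUL", False

for level in (50, 60, 80):
    status, beep = classify_noise(level)
    print(f"{level} dB -> {status}, beep={beep}")
```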

  17. Acoustic Measures of Voice and Physiologic Measures of Autonomic Arousal during Speech as a Function of Cognitive Load.

    Science.gov (United States)

    MacPherson, Megan K; Abur, Defne; Stepp, Cara E

    2017-07-01

    This study aimed to determine the relationship among cognitive load condition, measures of autonomic arousal, and voice production in healthy adults. A prospective study design was used. Sixteen healthy young adults (eight men, eight women) produced a sentence containing an embedded Stroop task in each of two cognitive load conditions: congruent and incongruent. In both conditions, participants said the font color of the color words instead of the word text. In the incongruent condition, font color differed from the word text, creating an increase in cognitive load relative to the congruent condition, in which font color and word text matched. Three physiologic measures of autonomic arousal (pulse volume amplitude, pulse period, and skin conductance response amplitude) and four acoustic measures of voice (sound pressure level, fundamental frequency, cepstral peak prominence, and low-to-high spectral energy ratio) were analyzed for eight sentence productions in each cognitive load condition per participant. A logistic regression model was constructed to predict the cognitive load condition (congruent or incongruent) using subject as a categorical predictor and the three autonomic measures and four acoustic measures as continuous predictors. The model revealed that skin conductance response amplitude, cepstral peak prominence, and low-to-high spectral energy ratio were significantly associated with cognitive load condition. During speech produced under increased cognitive load, healthy young adults show changes in physiologic markers of heightened autonomic arousal and acoustic measures of voice quality. Future work is necessary to examine these measures in older adults and individuals with voice disorders. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
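
    The classification model described, cognitive-load condition predicted from subject identity plus three autonomic and four acoustic measures, can be sketched as follows; the column names and simulated data are placeholders, not the study's data.

```python
# A minimal sketch of the logistic model described above: predict cognitive
# load condition from subject identity plus autonomic and acoustic measures.
# Column names and simulated data are placeholders, not the study's data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
rows = []
for subj in range(16):
    for cond in (0, 1):
        for _ in range(8):                      # 8 sentences per condition
            rows.append({
                "subject": subj, "condition": cond,
                # toy effects: a few measures shift with cognitive load
                "scr_amp": rng.normal(0.4 * cond, 1),
                "pulse_amp": rng.normal(0, 1), "pulse_period": rng.normal(0, 1),
                "spl": rng.normal(0, 1), "f0": rng.normal(0, 1),
                "cpp": rng.normal(-0.4 * cond, 1),
                "lh_ratio": rng.normal(-0.4 * cond, 1),
            })
df = pd.DataFrame(rows)

model = smf.logit("condition ~ C(subject) + pulse_amp + pulse_period + scr_amp"
                  " + spl + f0 + cpp + lh_ratio", data=df).fit(disp=False)
print(model.summary())
```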

  18. A note on measurement of sound pressure with intensity probes

    DEFF Research Database (Denmark)

    Juhl, Peter; Jacobsen, Finn

    2004-01-01

    The effect of scattering and diffraction on measurement of sound pressure with "two-microphone" sound intensity probes is examined using an axisymmetric boundary element model of the probe. Whereas it has been shown a few years ago that the sound intensity estimated with a two-microphone probe is reliable up to 10 kHz when using 0.5 in. microphones in the usual face-to-face arrangement separated by a 12 mm spacer, the sound pressure measured with the same instrument will typically be underestimated at high frequencies. It is shown in this paper that the estimate of the sound pressure can be improved under a variety of realistic sound field conditions by applying a different weighting of the two pressure signals from the probe. The improved intensity probe can measure the sound pressure more accurately at high frequencies than an ordinary sound intensity probe or an ordinary sound level meter.

  19. Subjective Evaluation of Audiovisual Signals

    Directory of Open Access Journals (Sweden)

    F. Fikejz

    2010-01-01

    Full Text Available This paper deals with subjective evaluation of audiovisual signals, with emphasis on the interaction between acoustic and visual quality. The subjective test is realized by a simple rating method. The audiovisual signal used in this test is a combination of images compressed by the JPEG compression codec and sound samples compressed by MPEG-1 Layer III. Images and sounds have various contents. The test simulates a real situation in which the subject listens to compressed music and watches compressed pictures without access to the original, i.e., uncompressed, signals.

  20. Voice Onset Time in Azerbaijani Consonants

    Directory of Open Access Journals (Sweden)

    Ali Jahan

    2009-10-01

    Full Text Available Objective: Voice onset time is known to be a cue for the distinction between voiced and voiceless stops, and it can be used to describe or categorize a range of developmental, neuromotor, and linguistic disorders. The aim of this study is the determination of standard values of voice onset time for the Azerbaijani language (Tabriz dialect). Materials & Methods: In this descriptive-analytical study, 30 Azeri speakers, recruited by convenience sampling, uttered 46 monosyllabic words beginning with the 6 Azerbaijani stops, twice each. Using Praat software, the voice onset time values were measured in milliseconds from the waveform and wideband spectrogram. The vowel effect, sex differences, and the effect of place of articulation on VOT were evaluated, and data were analyzed by one-way ANOVA test. Results: There was no significant difference in voice onset time between male and female Azeri speakers (P>0.05). Vowel and place of articulation had a significant correlation with voice onset time (P<0.001). Voice onset time values for /b/, /p/, /d/, /t/, /g/, /k/, and the [c], [ɟ] allophones were 10.64, 86.88, 13.35, 87.09, 26.25, 100.62, 131.19, and 63.18 ms, respectively. Conclusion: Voice onset time values are the same for Azerbaijani men and women. However, as in many other languages, back and high vowels and a back place of articulation lengthen VOT. Also, voiceless stops are aspirated in this language, and voiced stops have positive VOT values.
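
    VOT is the interval from the stop burst to the onset of voicing. The study measured it manually in Praat from the waveform and spectrogram; a rough automatic approximation is sketched below for illustration only, with all thresholds assumed.

```python
# A minimal sketch of automatic VOT approximation: take the burst as the
# first large amplitude spike and voicing onset as the first frame with
# sustained energy. Thresholds are illustrative assumptions.
import numpy as np

def measure_vot(x, fs, burst_thresh=0.5, voicing_thresh=0.1):
    """Return VOT in ms for a mono signal x normalized to [-1, 1]."""
    burst = np.argmax(np.abs(x) > burst_thresh * np.max(np.abs(x)))
    frame = int(0.005 * fs)                       # 5 ms frames
    rms = np.array([np.sqrt(np.mean(x[i:i + frame] ** 2))
                    for i in range(burst, len(x) - frame, frame)])
    voiced = np.argmax(rms > voicing_thresh)      # first frame with energy
    return voiced * frame / fs * 1000.0

# Synthetic test: burst at 50 ms, voicing (120 Hz) starting at 80 ms
fs = 16000
t = np.arange(int(0.2 * fs)) / fs
x = np.zeros_like(t)
x[int(0.05 * fs)] = 1.0
x[int(0.08 * fs):] = 0.3 * np.sin(2 * np.pi * 120 * t[: len(x) - int(0.08 * fs)])
print(f"VOT ~ {measure_vot(x, fs):.1f} ms")       # ~30 ms
```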

  1. Does CPAP treatment affect the voice?

    Science.gov (United States)

    Saylam, Güleser; Şahin, Mustafa; Demiral, Dilek; Bayır, Ömer; Yüceege, Melike Bağnu; Çadallı Tatar, Emel; Korkmaz, Mehmet Hakan

    2016-12-20

    The aim of this study was to investigate alterations in voice parameters among patients using continuous positive airway pressure (CPAP) for the treatment of obstructive sleep apnea syndrome. Patients with an indication for CPAP treatment without any voice problems and with normal laryngeal findings were included, and voice parameters were evaluated before and 1 and 6 months after CPAP. Videolaryngostroboscopic findings, a self-rated scale (Voice Handicap Index-10, VHI-10), perceptual voice quality assessment (GRBAS: grade, roughness, breathiness, asthenia, strain), and acoustic parameters were compared. Data from 70 subjects (48 men and 22 women) with a mean age of 44.2 ± 6.0 years were evaluated. When compared with the pre-CPAP treatment period, there was a significant increase in the VHI-10 score after 1 month of treatment, and in VHI-10 and total GRBAS scores, jitter percent (P = 0.01), shimmer percent, noise-to-harmonic ratio, and voice turbulence index after 6 months of treatment. Vague negative effects on voice parameters after the first month of CPAP treatment became more evident after 6 months. We demonstrated nonsevere alterations in the voice quality of patients under CPAP treatment. Given that CPAP is a long-term treatment, it is important to keep these alterations in mind.
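
    Jitter, shimmer, and noise-to-harmonic measures of the kind reported here are typically extracted with Praat; a minimal sketch using the parselmouth Python interface, assuming a hypothetical recording voice.wav and Praat's default analysis parameters (the study's exact settings are not given; HNR is the inverse counterpart of the noise-to-harmonic ratio):

```python
# A minimal sketch of jitter/shimmer/HNR extraction with parselmouth,
# assuming a recording "voice.wav"; parameter values are Praat defaults.
import parselmouth
from parselmouth.praat import call

snd = parselmouth.Sound("voice.wav")
pp = call(snd, "To PointProcess (periodic, cc)", 75, 500)   # pitch range (Hz)
jitter = call(pp, "Get jitter (local)", 0, 0, 0.0001, 0.02, 1.3)
shimmer = call([snd, pp], "Get shimmer (local)", 0, 0, 0.0001, 0.02, 1.3, 1.6)
harm = call(snd, "To Harmonicity (cc)", 0.01, 75, 0.1, 1.0)
hnr_db = call(harm, "Get mean", 0, 0)
print(f"jitter={jitter:.2%} shimmer={shimmer:.2%} HNR={hnr_db:.1f} dB")
```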

  2. Managing dysphonia in occupational voice users.

    Science.gov (United States)

    Behlau, Mara; Zambon, Fabiana; Madazio, Glaucya

    2014-06-01

    Recent advances with regard to occupational voice disorders are highlighted, with emphasis on issues warranting consideration when assessing, training, and treating professional voice users. Findings include the many particularities among the various categories of professional voice users, the concept that the environment plays a major role in occupational voice disorders, and the view that biopsychosocial influences should be analyzed on an individual basis. Assessment via self-evaluation protocols to quantify the impact of these disorders is mandatory, both as a component of an evaluation and to document treatment outcomes. Discomfort, or odynophonia, has emerged as a critical symptom in this population. Clinical trials are limited, and the complexity of the environment may be a limitation in experiment design. This review reinforced the need for large population studies of professional voice users; new data highlighted important factors specific to each group of voice users. Interventions directed at student teachers are necessary not only to improve the quality of future professionals, but also to avoid the frustration and limitations associated with chronic voice problems. The causative relationship between the work environment and voice disorders has not yet been established. Randomized controlled trials are lacking and must be a focus to enhance treatment paradigms for this population.

  3. Human and animal sounds influence recognition of body language.

    Science.gov (United States)

    Van den Stock, Jan; Grèzes, Julie; de Gelder, Beatrice

    2008-11-25

    In naturalistic settings emotional events have multiple correlates and are simultaneously perceived by several sensory systems. Recent studies have shown that recognition of facial expressions is biased towards the emotion expressed by a simultaneously presented emotional expression in the voice, even if attention is directed to the face only. So far, no study has examined whether this phenomenon also applies to whole body expressions, although there is no obvious reason why this crossmodal influence would be specific to faces. Here we investigated whether perception of emotions expressed in whole body movements is influenced by affective information provided by human and by animal vocalizations. Participants were instructed to attend to the action displayed by the body and to categorize the expressed emotion. The results indicate that recognition of body language is biased towards the emotion expressed by the simultaneously presented auditory information, whether it consists of human or of animal sounds. Our results show that a crossmodal influence from auditory to visual emotional information obtains for whole body video images with the facial expression blanked, and that it includes human as well as animal sounds.

  4. Review of sound card photogates

    International Nuclear Information System (INIS)

    Gingl, Zoltan; Mingesz, Robert; Mellar, Janos; Makra, Peter

    2011-01-01

    Photogates are probably the most commonly used electronic instruments to aid experiments in the field of mechanics. Although they are offered by many manufacturers, they can be too expensive to be widely used in all classrooms, in multiple experiments or even at home experimentation. Today all computers have a sound card - an interface for analogue signals. It is possible to make very simple yet highly accurate photogates for cents, while much more sophisticated solutions are also available at a still very low cost. In our paper we show several experimentally tested ways of implementing sound card photogates in detail, and we also provide full-featured, free, open-source photogate software as a much more efficient experimentation tool than the usually used sound recording programs. Further information is provided on a dedicated web page, www.noise.physx.u-szeged.hu/edudev.
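
    The timing idea can be sketched in a few lines: the gate's output arrives as an ordinary audio stream, and event times reduce to threshold crossings counted in samples. The recording name below is a hypothetical stand-in; the authors' open-source software is the full-featured option.

```python
# A minimal sketch of sound-card photogate timing: find where the gate
# signal crosses a threshold and convert sample indices to times.
# Assumes a hypothetical mono recording "gate.wav" of the gate's output.
import numpy as np
from scipy.io import wavfile

fs, x = wavfile.read("gate.wav")
x = x.astype(float) / np.max(np.abs(x))                    # normalize
blocked = x > 0.5                                          # beam interrupted
edges = np.flatnonzero(np.diff(blocked.astype(int)) == 1)  # rising edges
times = edges / fs
print("gate events (s):", times)
print("intervals (s):", np.diff(times))                    # e.g. pendulum period
```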

  5. Epidemiology of Voice Disorders in Latvian School Teachers.

    Science.gov (United States)

    Trinite, Baiba

    2017-07-01

    The prevalence of voice disorders in the teacher population in Latvia has not been studied so far, and this is the first epidemiological study whose goal is to investigate the prevalence of voice disorders and their risk factors in this professional group. A wide cross-sectional study using stratified sampling methodology was implemented in the general education schools of Latvia. The self-administered voice risk factor questionnaire and the Voice Handicap Index were completed by 522 teachers. Two teacher groups were formed: the voice disorders group, which included 235 teachers with current voice problems or problems during the last 9 months; and the control group, which included 174 teachers without voice disorders. Sixty-six percent of teachers gave a positive answer to the following question: Have you ever had problems with your voice? Voice problems are more often found in female than in male teachers (68.2% vs 48.8%). Music teachers suffer from voice disorders more often than teachers of other subjects. Eighty-two percent of teachers first faced voice problems in their professional career. The odds of voice disorders increase if the following risk factors exist: extra vocal load, shouting, throat clearing, neglect of personal health, background noise, chronic illnesses of the upper respiratory tract, allergy, job dissatisfaction, and regular stress in the workplace. The study findings indicated a high risk of voice disorders among Latvian teachers. The study confirmed data concerning the multifactorial etiology of voice disorders. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  6. Physiological phenotyping of dementias using emotional sounds.

    Science.gov (United States)

    Fletcher, Phillip D; Nicholas, Jennifer M; Shakespeare, Timothy J; Downey, Laura E; Golden, Hannah L; Agustus, Jennifer L; Clark, Camilla N; Mummery, Catherine J; Schott, Jonathan M; Crutch, Sebastian J; Warren, Jason D

    2015-06-01

    Emotional behavioral disturbances are hallmarks of many dementias but their pathophysiology is poorly understood. Here we addressed this issue using the paradigm of emotionally salient sounds. Pupil responses and affective valence ratings for nonverbal sounds of varying emotional salience were assessed in patients with behavioral variant frontotemporal dementia (bvFTD) (n = 14), semantic dementia (SD) (n = 10), progressive nonfluent aphasia (PNFA) (n = 12), and Alzheimer's disease (AD) (n = 10) versus healthy age-matched individuals (n = 26). Referenced to healthy individuals, overall autonomic reactivity to sound was normal in AD but reduced in the other syndromes. Patients with bvFTD, SD, and AD showed altered coupling between pupillary and affective behavioral responses to emotionally salient sounds. Emotional sounds are a useful model system for analyzing how dementias affect the processing of salient environmental signals, with implications for defining pathophysiological mechanisms and novel biomarker development.

  7. Measuring the speed of sound in air using smartphone applications

    Science.gov (United States)

    Yavuz, A.

    2015-05-01

    This study presents a revised version of an old experiment available in many textbooks for measuring the speed of sound in air. A signal-generator application in a smartphone is used to produce the desired sound frequency. Nodes of sound waves in a glass pipe, of which one end is immersed in water, are more easily detected, so results can be obtained more quickly than from traditional acoustic experiments using tuning forks.
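
    For reference, the underlying calculation: the air column closed at the water surface resonates when its length grows by half a wavelength, so the speed of sound follows from two adjacent resonance lengths at a known frequency. A minimal sketch with illustrative values:

```python
# A minimal sketch of the resonance-tube calculation: for a tube closed at
# the water surface, adjacent resonant column lengths differ by half a
# wavelength, so v = 2 * f * (L2 - L1). The numbers are illustrative.
f = 1000.0             # Hz, tone from the smartphone signal generator
L1, L2 = 0.085, 0.257  # m, first two resonant air-column lengths
v = 2 * f * (L2 - L1)
print(f"speed of sound ~ {v:.0f} m/s")  # ~344 m/s
```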

  8. Quantifying sound quality in loudspeaker reproduction

    NARCIS (Netherlands)

    Beerends, John G.; van Nieuwenhuizen, Kevin; van den Broek, E.L.

    2016-01-01

    We present PREQUEL: Perceptual Reproduction Quality Evaluation for Loudspeakers. Instead of quantifying the loudspeaker system itself, PREQUEL quantifies the overall loudspeakers' perceived sound quality by assessing their acoustic output using a set of music signals. This approach introduces a

  9. Identifying hidden voice and video streams

    Science.gov (United States)

    Fan, Jieyan; Wu, Dapeng; Nucci, Antonio; Keralapura, Ram; Gao, Lixin

    2009-04-01

    Given the rising popularity of voice and video services over the Internet, accurately identifying voice and video traffic that traverse their networks has become a critical task for Internet service providers (ISPs). As the number of proprietary applications that deliver voice and video services to end users increases over time, the search for the one methodology that can accurately detect such services while being application independent still remains open. This problem becomes even more complicated when voice and video service providers like Skype, Microsoft, and Google bundle their voice and video services with other services like file transfer and chat. For example, a bundled Skype session can contain both a voice stream and a file transfer stream in the same layer-3/layer-4 flow. In this context, traditional techniques to identify voice and video streams do not work. In this paper, we propose a novel self-learning classifier, called VVS-I, that detects the presence of voice and video streams in flows with minimum manual intervention. Our classifier works in two phases: a training phase and a detection phase. In the training phase, VVS-I first extracts the relevant features, and subsequently constructs a fingerprint of a flow using power spectral density (PSD) analysis. In the detection phase, it compares the fingerprint of a flow to the existing fingerprints learned during the training phase, and subsequently classifies the flow. Our classifier is not only capable of detecting voice and video streams that are hidden in different flows, but is also capable of detecting different applications (like Skype, MSN, etc.) that generate these voice/video streams. We show that our classifier can achieve close to 100% detection rate while keeping the false positive rate to less than 1%.
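
    The PSD fingerprinting step can be pictured as follows; the per-flow rate signals are simulated stand-ins, and the real VVS-I features are richer than this sketch.

```python
# A minimal sketch of PSD-based flow fingerprinting: a voice stream's
# near-constant packet cadence produces a spectral peak that bulk transfers
# lack. The rate signals are simulated; VVS-I's actual features are richer.
import numpy as np
from scipy.signal import welch

def fingerprint(rate_signal, fs=100.0):
    """Normalized PSD of a flow's bytes-per-interval time series."""
    _, psd = welch(rate_signal, fs=fs, nperseg=256)
    return psd / psd.sum()

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(2)
t = np.arange(3000) / 100.0                        # 30 s in 10 ms bins
voice = 160 + 20 * np.sin(2 * np.pi * 10 * t) + rng.normal(0, 5, t.size)
bulk = rng.exponential(500, t.size)                # bursty file transfer
print(cosine(fingerprint(voice), fingerprint(bulk)))  # low similarity
```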

  10. Occupational risk factors and voice disorders.

    Science.gov (United States)

    Vilkman, E

    1996-01-01

    From the point of view of occupational health, the field of voice disorders is very poorly developed as compared, for instance, to the prevention and diagnostics of occupational hearing disorders. In fact, voice disorders have not even been recognized in the field of occupational medicine. Hence, it is obviously very rare in most countries that the voice disorder of a professional voice user, e.g. a teacher, a singer or an actor, is accepted as an occupational disease by insurance companies. However, occupational voice problems do not lack significance from the point of view of the patient. We also know from questionnaires and clinical studies that voice complaints are very common. Another example of job-related health problems, which has proved more successful in terms of its occupational health status, is the repetition strain injury of the elbow, i.e. the "tennis elbow". Its textbook definition could be used as such to describe an occupational voice disorder ("dysphonia professionalis"). In the present paper the effects of such risk factors as vocal loading itself, background noise and room acoustics, and low relative humidity of the air are discussed. Due to individual factors underlying the development of professional voice disorders, recommendations rather than regulations are called for. There are many simple and even relatively low-cost methods available for the prevention of vocal problems as well as for supporting rehabilitation.

  11. Voice Response Systems Technology.

    Science.gov (United States)

    Gerald, Jeanette

    1984-01-01

    Examines two methods of generating synthetic speech in voice response systems, which allow computers to communicate in human terms (speech), using human interface devices (ears): phoneme and reconstructed voice systems. Considerations prior to implementation, current and potential applications, glossary, directory, and introduction to Input Output…

  12. Clinical Voices - an update

    DEFF Research Database (Denmark)

    Fusaroli, Riccardo; Weed, Ethan

    Anomalous aspects of speech and voice, including pitch, fluency, and voice quality, are reported to characterise many mental disorders. However, it has proven difficult to quantify and explain this oddness of speech by employing traditional statistical methods. In this talk we will show how...

  13. Changes after voice therapy in objective and subjective voice measurements of pediatric patients with vocal nodules.

    Science.gov (United States)

    Tezcaner, Ciler Zahide; Karatayli Ozgursoy, Selmin; Sati, Isil; Dursun, Gursel

    2009-12-01

    The aim of this study was to analyze the efficiency of voice therapy in children with vocal nodules by using acoustic analysis and subjective assessment. Thirty-nine patients with vocal fold nodules, aged between 7 and 14, were included in the study. Each subject had voice therapy led by an experienced voice therapist once a week. All diagnostic and follow-up work-ups were performed before the voice therapy and after the third or the sixth month. Transoral and/or transnasal videostroboscopic examination was performed, acoustic analysis was carried out using the Multi-Dimensional Voice Program (MDVP), and subjective analysis used the GRBAS scale. As for the perceptual assessment, the difference was significant for four parameters out of five. A significant improvement was found in the acoustic analysis parameters of jitter, shimmer, and noise-to-harmonic ratio. The voice therapy, which was planned according to patients' needs, age, compliance, and response to therapy, had positive effects on pediatric patients with vocal nodules. Acoustic analysis and GRBAS may be used successfully in the follow-up of pediatric vocal nodule treatment.

  14. Effects of a three-week vocal exercise program using the Finnish Kuukka exercises on the speaking voice of Norwegian broadcast journalism students.

    Science.gov (United States)

    Bele, Irene; Laukkanen, Anne-Maria; Sipilä, Laura

    2010-12-01

    Nine broadcast journalism students attended 10 hours of Kuukka vocal exercises, which aim at producing a ringing vocal quality. Nine control subjects received no training. A text was read at habitual loudness before and after the course. Five speech specialists evaluated the text samples for perceptual voice quality, and mean fundamental frequency (F0), equivalent sound level (Leq), and long-term average spectrum (LTAS) were analyzed. For the training group, voice quality improved and correlated negatively with firmness and timbre (less firm and darker qualities being considered more desirable), and F0 increased slightly. Leq increased significantly in both groups. The results show positive and perceivable differences after the course. However, the targeted ring was not achieved, possibly because the training period was too short.

  15. Playful Interaction with Voice Sensing Modular Robots

    DEFF Research Database (Denmark)

    Heesche, Bjarke; MacDonald, Ewen; Fogh, Rune

    2013-01-01

    This paper describes a voice sensor, suitable for modular robotic systems, which estimates the energy and fundamental frequency, F0, of the user’s voice. Through a number of example applications and tests with children, we observe how the voice sensor facilitates playful interaction between children and two different robot configurations. In future work, we will investigate if such a system can motivate children to improve voice control and explore how to extend the sensor to detect emotions in the user’s voice.
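
    The two quantities the sensor estimates, energy and F0, can be sketched with frame-wise autocorrelation; all parameter values below are illustrative, not taken from the paper.

```python
# A minimal sketch of frame-wise energy and F0 estimation by autocorrelation,
# as a voice sensor like the one above might compute; parameters are
# illustrative assumptions.
import numpy as np

def energy_and_f0(frame, fs, f0_min=75.0, f0_max=500.0):
    frame = frame - frame.mean()
    energy = float(np.mean(frame ** 2))
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(fs / f0_max), int(fs / f0_min)
    lag = lo + int(np.argmax(ac[lo:hi]))          # strongest periodicity
    return energy, fs / lag

fs = 16000
t = np.arange(int(0.04 * fs)) / fs                # one 40 ms frame
frame = np.sin(2 * np.pi * 220 * t)               # 220 Hz test tone
print(energy_and_f0(frame, fs))                   # ~ (0.5, 219)
```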

  16. Binaural loudness for artificial-head measurements in directional sound fields

    DEFF Research Database (Denmark)

    Sivonen, Ville Pekka; Ellermeier, Wolfgang

    2008-01-01

    The effect of the sound incidence angle on loudness was investigated for fifteen listeners who matched the loudness of sounds coming from five different incidence angles in the horizontal plane to that of the same sound with frontal incidence. The stimuli were presented via binaural synthesis by using head-related transfer functions measured for an artificial head. The results, which exhibited marked individual differences, show that loudness depends on the direction from which a sound reaches the listener. The average results suggest a relatively simple rule for combining the two signals at the ears of an artificial head for binaural loudness predictions.

  17. Voice Quality Estimation in Wireless Networks

    Directory of Open Access Journals (Sweden)

    Petr Zach

    2015-01-01

    Full Text Available This article deals with the impact of wireless (Wi-Fi) networks on the perceived quality of voice services. The Quality of Service (QoS) metrics must be monitored in the computer network during voice data transmission to ensure the proper voice service quality the end-user has paid for, especially in wireless networks. In addition to QoS, the research area called Quality of Experience (QoE) provides metrics and methods for quality evaluation from the end-user’s perspective. This article focuses on QoE estimation of Voice over IP (VoIP) calls in wireless networks using a network simulator. The results contribute to voice quality estimation based on characteristics of the wireless network and the location of a wireless client.
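
    A common way to estimate VoIP QoE from network characteristics is the ITU-T G.107 E-model, in which a rating factor R is mapped to a MOS score. The R-to-MOS mapping below is the standard formula; the simplified delay and loss impairment terms are illustrative assumptions, not the full model.

```python
# A minimal sketch of E-model-style MOS estimation (ITU-T G.107). The R-to-MOS
# mapping is the standard formula; the simplified impairment terms for delay
# and packet loss are illustrative assumptions, not the complete model.
def r_to_mos(r: float) -> float:
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + 7e-6 * r * (r - 60) * (100 - r)

def estimate_r(delay_ms: float, loss_pct: float) -> float:
    r = 93.2                  # default base rating
    r -= 0.024 * delay_ms     # simplified delay impairment (valid < ~177 ms)
    r -= 2.5 * loss_pct       # simplified loss impairment (codec-dependent)
    return r

print(f"MOS ~ {r_to_mos(estimate_r(delay_ms=150, loss_pct=1.0)):.2f}")
```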

  18. The Influence of Sleep Disorders on Voice Quality.

    Science.gov (United States)

    Rocha, Bruna Rainho; Behlau, Mara

    2017-09-19

    To verify the influence of sleep quality on the voice. Descriptive and analytical cross-sectional study. Data were collected by an online or printed survey divided in three parts: (1) demographic data and vocal health aspects; (2) self-assessment of sleep and vocal quality, and the influence that sleep has on voice; and (3) sleep and voice self-assessment inventories: the Epworth Sleepiness Scale (ESS), the Pittsburgh Sleep Quality Index (PSQI), and the Voice Handicap Index reduced version (VHI-10). A total of 862 people were included (493 women, 369 men), with a mean age of 32 years (range 18-79). The perception of the influence that sleep has on voice differed significantly between groups. The factors that influence a voice handicap are vocal self-assessment, ESS total score, and self-assessment of the influence that sleep has on voice. The absence of daytime sleepiness is a protective factor (odds ratio [OR] > 1) against perceived voice handicap, whereas the presence of daytime sleepiness is a damaging factor. Sleep influences voice: perceived poor sleep quality is related to perceived poor vocal quality, and individuals with a voice handicap observe a greater influence of sleep on voice than those without. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  19. Voice preprocessing system incorporating a real-time spectrum analyzer with programmable switched-capacitor filters

    Science.gov (United States)

    Knapp, G.

    1984-01-01

    As part of a speaker verification program for BISS (Base Installation Security System), a test system is being designed with a flexible preprocessing system for the evaluation of problems related to the voice spectrum and verification algorithms. The main part of this report covers the design, construction, and testing of a voice analyzer with 16 integrating real-time frequency channels ranging from 300 Hz to 3 kHz. The bandpass filter response of each channel is programmable by NMOS switched-capacitor quad filter arrays. Presently, the accuracy of these units is limited to moderate precision by the finite steps of programming. However, repeatability of characteristics between filter units and sections appears excellent for the implemented fourth-order Butterworth bandpass responses. We obtained a 0.1 dB linearity error of signal detection and measured a signal-to-noise ratio of approximately 70 dB. The preprocessing system discussed includes the preemphasis filter design, gain normalizer design, and data acquisition system design, as well as test results.
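
    A software analogue of this 16-channel analyzer can be sketched with fourth-order Butterworth bandpass sections; the log-spaced channel edges are an assumption, since the report does not list the exact center frequencies.

```python
# A minimal sketch of a 16-channel fourth-order Butterworth bandpass bank
# spanning 300 Hz - 3 kHz, mimicking in software what the switched-capacitor
# hardware does; log-spaced channel edges are an assumption.
import numpy as np
from scipy.signal import butter, sosfilt

fs = 8000.0
edges = np.geomspace(300.0, 3000.0, 17)           # 16 log-spaced channels

def filter_bank(x):
    """Return per-channel integrated energy of signal x."""
    out = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # order-2 design -> 4th-order bandpass overall
        sos = butter(2, [lo, hi], btype="bandpass", fs=fs, output="sos")
        out.append(float(np.mean(sosfilt(sos, x) ** 2)))
    return out

t = np.arange(int(0.5 * fs)) / fs
print(np.argmax(filter_bank(np.sin(2 * np.pi * 1000 * t))))  # channel with 1 kHz
```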

  20. Understanding Animal Detection of Precursor Earthquake Sounds.

    Science.gov (United States)

    Garstang, Michael; Kelley, Michael C

    2017-08-31

    We use recent research to provide an explanation of how animals might detect earthquakes before they occur. While the intrinsic value of such warnings is immense, we show that the complexity of the process may result in inconsistent responses of animals to the possible precursor signal. Using the results of our research, we describe a logical but complex sequence of geophysical events triggered by precursor earthquake crustal movements that ultimately result in a sound signal detectable by animals. The sound heard by animals occurs only when metal or other surfaces (glass) respond to vibrations produced by electric currents induced by distortions of the earth's electric fields caused by the crustal movements. A combination of existing measurement systems combined with more careful monitoring of animal response could nevertheless be of value, particularly in remote locations.

  1. Masking release by combined spatial and masker-fluctuation effects in the open sound field.

    Science.gov (United States)

    Middlebrooks, John C

    2017-12-01

    In a complex auditory scene, signals of interest can be distinguished from masking sounds by differences in source location [spatial release from masking (SRM)] and by differences between masker-alone and masker-plus-signal envelopes. This study investigated interactions between those factors in release of masking of 700-Hz tones in an open sound field. Signal and masker sources were colocated in front of the listener, or the signal source was shifted 90° to the side. In Experiment 1, the masker contained a 25-Hz-wide on-signal band plus flanking bands having envelopes that were either mutually uncorrelated or were comodulated. Comodulation masking release (CMR) was largely independent of signal location at a higher masker sound level, but at a lower level CMR was reduced for the lateral signal location. In Experiment 2, a brief signal was positioned at the envelope maximum (peak) or minimum (dip) of a 50-Hz-wide on-signal masker. Masking was released in dip more than in peak conditions only for the 90° signal. Overall, open-field SRM was greater in magnitude than binaural masking release reported in comparable closed-field studies, and envelope-related release was somewhat weaker. Mutual enhancement of masking release by spatial and envelope-related effects tended to increase with increasing masker level.

  2. Noise detection during heart sound recording using periodicity signatures

    International Nuclear Information System (INIS)

    Kumar, D; Carvalho, P; Paiva, R P; Henriques, J; Antunes, M

    2011-01-01

    Heart sound is a valuable biosignal for diagnosis of a large set of cardiac diseases. Ambient and physiological noise interference is one of the most usual and highly probable incidents during heart sound acquisition. It tends to change the morphological characteristics of heart sound that may carry important information for heart disease diagnosis. In this paper, we propose a new method applicable in real time to detect ambient and internal body noises manifested in heart sound during acquisition. The algorithm is developed on the basis of the periodic nature of heart sounds and physiologically inspired criteria. A small segment of uncontaminated heart sound exhibiting periodicity in time as well as in the time-frequency domain is first detected and applied as a reference signal in discriminating noise from the sound. The proposed technique has been tested with a database of heart sounds collected from 71 subjects with several types of heart disease inducing several noises during recording. The achieved average sensitivity and specificity are 95.88% and 97.56%, respectively

  3. Zero sound and quasiwave: separation in the magnetic field

    International Nuclear Information System (INIS)

    Bezuglyj, E.V.; Bojchuk, A.V.; Burma, N.G.; Fil', V.D.

    1995-01-01

    Theoretical and experimental results on the behavior of the longitudinal and transverse electron sound in a weak magnetic field are presented. It is shown theoretically that the effects of the magnetic field on zero sound velocity and ballistic transfer are opposite in sign and have sufficiently different dependences on the sample width, excitation frequency and relaxation time. This permits us to separate experimentally the Fermi-liquid and ballistic contributions in the electron sound signals. For the first time the ballistic transfer of the acoustic excitation by the quasiwave has been observed in zero magnetic field

  4. Sustained Magnetic Responses in Temporal Cortex Reflect Instantaneous Significance of Approaching and Receding Sounds.

    Directory of Open Access Journals (Sweden)

    Dominik R Bach

    Full Text Available Rising sound intensity often signals an approaching sound source and can serve as a powerful warning cue, eliciting phasic attention, perception biases, and emotional responses. How the evaluation of approaching sounds unfolds over time remains elusive. Here, we capitalised on the temporal resolution of magnetoencephalography (MEG) to investigate the dynamic encoding of approaching and receding sounds in humans. We compared magnetic responses to intensity envelopes of complex sounds with those of white noise sounds, in which intensity change is not perceived as approaching. Sustained magnetic fields over temporal sensors tracked intensity change in complex sounds in an approximately linear fashion, an effect not seen for intensity change in white noise sounds, or for overall intensity. Hence, these fields are likely to track approach/recession, but not the apparent (instantaneous) distance of the sound source, or its intensity as such. As a likely source of this activity, the bilateral inferior temporal gyrus and right temporo-parietal junction emerged. Our results indicate that discrete temporal cortical areas parametrically encode behavioural significance in moving sound sources, where the signal unfolded in a manner reminiscent of evidence accumulation. This may aid an understanding of how acoustic percepts are evaluated as behaviourally relevant, and our results highlight a crucial role of cortical areas.

  5. Foley Sounds vs Real Sounds

    DEFF Research Database (Denmark)

    Trento, Stefano; Götzen, Amalia De

    2011-01-01

    This paper is an initial attempt to study the world of sound effects for motion pictures, also known as Foley sounds. Throughout several audio and audio-video tests we have compared both Foley and real sounds originated by an identical action. The main purpose was to evaluate if sound effects...

  6. Central voice production and pathophysiology of spasmodic dysphonia.

    Science.gov (United States)

    Mor, Niv; Simonyan, Kristina; Blitzer, Andrew

    2018-01-01

    Our ability to speak is complex, and the role of the central nervous system in controlling speech production is often overlooked in the field of otolaryngology. In this brief review, we present an integrated overview of speech production with a focus on the role of the central nervous system. The role of central control of voice production is then further discussed in relation to the potential pathophysiology of spasmodic dysphonia (SD). Peer-reviewed articles on central laryngeal control and SD were identified from a PubMed search. Selected articles were augmented with designated relevant publications. Publications that discussed central and peripheral nervous system control of voice production and the central pathophysiology of laryngeal dystonia were chosen. Our ability to speak is regulated by specialized complex mechanisms coordinated by high-level cortical signaling, brainstem reflexes, peripheral nerves, muscles, and mucosal actions. Recent studies suggest that SD results from a primary central disturbance associated with dysfunction at our highest levels of central voice control. The efficacy of botulinum toxin in treating SD may not be limited solely to its local effect on laryngeal muscles; it may also modulate the disorder at the level of the central nervous system. Future therapeutic options that target the central nervous system may help modulate the underlying disorder in SD and allow clinicians to better understand the principal pathophysiology. NA. Laryngoscope, 128:177-183, 2018. © 2017 The American Laryngological, Rhinological and Otological Society, Inc.

  7. [Diagnostics and therapy in professional voice-users].

    Science.gov (United States)

    Richter, B; Echternach, M

    2010-04-01

    Voice is one of the most important instruments for expression and communication in humans, and dysphonia remains very frequent. Generally, people in voice-intensive professions, such as teachers, call center employees, singers, and actors, suffer from these complaints. In recent years, methods have been developed which facilitate appropriate diagnosis and therapy, based on the criteria of evidence-based medicine, for voice patients according to their degree of disease. The basic protocol of the European Laryngological Society offers a standardized evaluation of multidimensional voice parameters. In our own patient collective, there were statistically significant improvements in voice quality, according to a pre/post mean value comparison, in both phonomicrosurgical (n=45) and voice therapy (n=30) patients in relation to RBH, DSI, and VHI.

  8. Mathematical pattern, smoothing and digital filtering of a speech signal

    International Nuclear Information System (INIS)

    Razzam, Mohamed Habib

    1979-01-01

    After a presentation of speech synthesis methods, characterized either by processing of pre-recorded natural signals or by analog simulation of the vocal tract, we present a new synthesis method based on a mathematical model of the signal, as a development of M. Rodet's method. Owing to their physiological origin, these signals are partially or totally voiced, or aleatory (noise-like). For the voiced parts of phonemes, we compute the formant curves, whose sum constitutes the wave, directly in the time domain by applying a specific envelope (operating as a time-window analysis) to a sinusoidal wave. The sinusoidal wave is computed at the beginning of each pseudo-period of the signal. The transition between successive periods is ensured by polynomial smoothing followed by digital filtering. For the aleatory parts, we present a method for the aleatory computation of formant curves. Each signal is subjected to a melodic diagram computed in accordance with the nature of the phoneme (vowel or consonant) and its context (isolated or not). (author) [fr
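
    This is essentially formant-wave-function synthesis in the Rodet tradition: at the start of each pseudo-period, an enveloped sinusoid is emitted per formant and the output is their sum. A minimal sketch, with illustrative formant and pitch values:

```python
# A minimal sketch of formant-wave-function style synthesis: at each
# pseudo-period an exponentially enveloped sinusoid is emitted per formant
# and summed into the output. All parameter values are illustrative.
import numpy as np

fs = 16000
f0 = 110.0                                       # pitch (one grain per period)
formants = [(700, 80), (1220, 90), (2600, 120)]  # (center Hz, bandwidth Hz)

def synthesize(duration=0.5):
    n = int(duration * fs)
    out = np.zeros(n)
    grain_len = int(0.02 * fs)                   # 20 ms grains
    t = np.arange(grain_len) / fs
    for start in range(0, n - grain_len, int(fs / f0)):
        for fc, bw in formants:
            grain = np.exp(-np.pi * bw * t) * np.sin(2 * np.pi * fc * t)
            out[start:start + grain_len] += grain
    return out / np.max(np.abs(out))

wave = synthesize()
print(wave.shape, wave.min(), wave.max())
```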

  9. 78 FR 13869 - Puget Sound Energy, Inc.; Puget Sound Energy, Inc.; Puget Sound Energy, Inc.; Puget Sound Energy...

    Science.gov (United States)

    2013-03-01

    ...-123-LNG; 12-128-NG; 12-148-NG; 12- 158-NG] Puget Sound Energy, Inc.; Puget Sound Energy, Inc.; Puget Sound Energy, Inc.; Puget Sound Energy, Inc.; Puget Sound Energy, Inc.; CE FLNG, LLC; Consolidated...-NG Puget Sound Energy, Inc Order granting long- term authority to import/export natural gas from/to...

  10. Surface return direction-of-arrival analysis for radar ice sounding surface clutter suppression

    DEFF Research Database (Denmark)

    Nielsen, Ulrik; Dall, Jørgen

    2015-01-01

    Airborne radar ice sounding is challenged by surface clutter masking the depth signal of interest. Surface clutter may even be prohibitive for potential space-based ice sounding radars. To some extent the radar antenna suppresses the surface clutter, and a multi-phase-center antenna in combination with coherent signal processing techniques can improve the suppression, in particular if the direction of arrival (DOA) of the clutter signal is estimated accurately. This paper deals with data-driven DOA estimation. By using P-band data from the ice shelf in Antarctica it is demonstrated that a varying...
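
    As a sketch of the underlying principle, with two phase centers separated by a baseline, the clutter DOA follows from the interferometric phase between the channels; the geometry and carrier values below are illustrative assumptions, not the system's parameters.

```python
# A minimal sketch of two-channel interferometric DOA estimation, the basic
# principle behind multi-phase-center clutter suppression; geometry and
# carrier values are illustrative assumptions.
import numpy as np

c = 3e8
f = 435e6                  # P-band carrier (Hz)
lam = c / f
d = 1.5                    # phase-center baseline (m)

def estimate_doa(ch1, ch2):
    """DOA (radians) from the mean interferometric phase of two channels."""
    phase = np.angle(np.mean(ch1 * np.conj(ch2)))
    return np.arcsin(phase * lam / (2 * np.pi * d))

theta_true = np.deg2rad(10.0)
rng = np.random.default_rng(3)
s = np.exp(1j * rng.uniform(0, 2 * np.pi, 1000))   # random clutter signal
ch1 = s
ch2 = s * np.exp(-1j * 2 * np.pi * d * np.sin(theta_true) / lam)
print(np.rad2deg(estimate_doa(ch1, ch2)))          # ~10 degrees
```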

  11. [Design of standard voice sample text for subjective auditory perceptual evaluation of voice disorders].

    Science.gov (United States)

    Li, Jin-rang; Sun, Yan-yan; Xu, Wen

    2010-09-01

    To design a speech voice sample text containing all the phonemes in Mandarin for subjective auditory perceptual evaluation of voice disorders. The principles for the design of such a text are that the short text should include the 21 initials and 39 finals, which covers all the phonemes in Mandarin, and that the short text should be meaningful. A short text was composed. It had 155 Chinese words and included 21 initials and 38 finals (the final ê was not included because it is rarely used in Mandarin). The text also covered 17 light tones and one "Erhua". The constituent ratios of the initials and finals presented in this short text were statistically similar to those in Mandarin according to the method of similarity of the sample and population (r = 0.742), whereas the constituent ratios of the tones presented in the text were statistically not similar to those in Mandarin (r = 0.731, P > 0.05). A speech voice sample text with all the phonemes in Mandarin was produced. The constituent ratios of the initials and finals presented in this short text are similar to those in Mandarin. Its value for subjective auditory perceptual evaluation of voice disorders needs further study.

  12. High frequency components of tracheal sound are emphasized during prolonged flow limitation

    International Nuclear Information System (INIS)

    Tenhunen, M; Huupponen, E; Saastamoinen, A; Kulkas, A; Himanen, S-L; Rauhala, E

    2009-01-01

    A nasal pressure transducer, which is used to study nocturnal airflow, also provides information about the inspiratory flow waveform. A round flow shape is seen during normal breathing. A flattened, non-round shape is found during hypopneas, and it can also appear in prolonged episodes. The significance of this prolonged flow limitation is still not established. The tracheal sound spectrum was analyzed further in order to obtain additional information about breathing during sleep. Increased sound frequencies above 500 Hz have been connected to obstruction of the upper airway. The aim of the present study was to examine the tracheal sound signal content of prolonged flow limitation and to find out whether prolonged flow limitation involves abundant high-frequency activity. Sleep recordings of 36 consecutive patients were examined. The tracheal sound spectral analysis was performed on 10-min episodes of prolonged flow limitation, normal breathing, and periodic apnea-hypopnea breathing. The highest total spectral amplitude, indicating the loudest sounds, occurred during flow-limited breathing, which also presented the loudest sounds in all frequency bands above 100 Hz. In addition, the tracheal sound signal during flow-limited breathing contained proportionally more high-frequency activity compared with normal breathing and even periodic apnea-hypopnea breathing.

  13. Voice pedagogy-what do we need?

    Science.gov (United States)

    Gill, Brian P; Herbst, Christian T

    2016-12-01

    The final keynote panel of the 10th Pan-European Voice Conference (PEVOC) was concerned with the topic 'Voice pedagogy-what do we need?' In this communication the panel discussion is summarized, and the authors provide a deepening discussion on one of the key questions, addressing the roles and tasks of people working with voice students. In particular, a distinction is made between (1) voice building (derived from the German term 'Stimmbildung'), primarily comprising the functional and physiological aspects of singing; (2) coaching, mostly concerned with performance skills; and (3) singing voice rehabilitation. Both public and private educators are encouraged to apply this distinction to their curricula, in order to arrive at more efficient singing teaching and to reduce the risk of vocal injury to the singers concerned.

  14. 33 CFR 83.36 - Signals to attract attention (Rule 36).

    Science.gov (United States)

    2010-07-01

    ... 33 Navigation and Navigable Waters 1 2010-07-01 2010-07-01 false Signals to attract attention... SECURITY INLAND NAVIGATION RULES RULES Sound and Light Signals § 83.36 Signals to attract attention (Rule 36). If necessary to attract the attention of another vessel, any vessel may make light or sound...

  15. Exploring science with sound: sonification and the use of sonograms as data analysis tool

    CERN Multimedia

    CERN. Geneva; Williams, Genevieve

    2017-01-01

    Resonances, periodicity, patterns and spectra are well-known notions that play crucial roles in particle physics, and that have always been at the junction between sound/music analysis and scientific exploration. Detecting the shape of a particular energy spectrum, studying the stability of a particle beam in a synchrotron, and separating signals from a noisy background are just a few examples where the connection with sound can be very strong, all sharing the same concepts of oscillations, cycles and frequency. This seminar will focus on analysing data and their relations by translating measurements into audible signals and using the natural capability of the ear to distinguish, characterise and analyse waveform shapes, amplitudes and relations. This process is called data sonification, and one of the main tools to investigate the structure of the sound is the sonogram (sometimes also called a spectrogram). A sonogram is a visual representation of how the spectrum of a certain sound signal changes with time...
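
    In practice, producing such a sonogram takes only a few lines; a minimal sketch with a synthetic chirp standing in for measurement data (all parameters are illustrative):

```python
# A minimal sketch of a sonogram (spectrogram): how a signal's spectrum
# evolves over time. A synthetic chirp stands in for real measurement data.
import numpy as np
from scipy.signal import chirp, spectrogram

fs = 8000
t = np.arange(0, 2.0, 1 / fs)
x = chirp(t, f0=100, f1=2000, t1=2.0, method="linear")  # rising tone

f, seg_t, Sxx = spectrogram(x, fs=fs, nperseg=256)
print(Sxx.shape)  # (frequency bins, time frames)
# With matplotlib: plt.pcolormesh(seg_t, f, 10 * np.log10(Sxx))
```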

  16. Time domain acoustic contrast control implementation of sound zones for low-frequency input signals

    DEFF Research Database (Denmark)

    Schellekens, Daan H. M.; Møller, Martin Bo; Olsen, Martin

    2016-01-01

    Sound zones are two or more regions within a listening space where listeners are provided with personal audio. Acoustic contrast control (ACC) is a sound zoning method that maximizes the average squared sound pressure in one zone constrained to constant pressure in other zones. State-of-the-art time domain broadband acoustic contrast control (BACC) methods are designed for anechoic environments. These methods are not able to realize a flat frequency response in a limited frequency range within a reverberant environment. Sound field control in a limited frequency range is a requirement to accommodate the effective working range of the loudspeakers. In this paper, a new BACC method is proposed which results in an implementation realizing a flat frequency response in the target zone. This method is applied in a bandlimited low-frequency scenario where the loudspeaker layout surrounds two...

  17. Visual feedback of tongue movement for novel speech sound learning

    Directory of Open Access Journals (Sweden)

    William F Katz

    2015-11-01

    Full Text Available Pronunciation training studies have yielded important information concerning the processing of audiovisual (AV) information. Second language (L2) learners show increased reliance on bottom-up, multimodal input for speech perception (compared to monolingual individuals). However, little is known about the role of viewing one’s own speech articulation processes during speech training. The current study investigated whether real-time visual feedback for tongue movement can improve a speaker’s learning of non-native speech sounds. An interactive 3D tongue visualization system based on electromagnetic articulography (EMA) was used in a speech training experiment. Native speakers of American English produced a novel speech sound (/ɖ̠/, a voiced, coronal, palatal stop) before, during, and after trials in which they viewed their own speech movements using the 3D model. Talkers’ productions were evaluated using kinematic (tongue-tip spatial positioning) and acoustic (burst spectra) measures. The results indicated a rapid gain in accuracy associated with visual feedback training. The findings are discussed with respect to neural models for multimodal speech processing.

  18. Adaptive RD Optimized Hybrid Sound Coding

    NARCIS (Netherlands)

    Schijndel, N.H. van; Bensa, J.; Christensen, M.G.; Colomes, C.; Edler, B.; Heusdens, R.; Jensen, J.; Jensen, S.H.; Kleijn, W.B.; Kot, V.; Kövesi, B.; Lindblom, J.; Massaloux, D.; Niamut, O.A.; Nordén, F.; Plasberg, J.H.; Vafin, R.; Virette, D.; Wübbolt, O.

    2008-01-01

    Traditionally, sound codecs have been developed with a particular application in mind, their performance being optimized for specific types of input signals, such as speech or audio (music), and application constraints, such as low bit rate, high quality, or low delay. There is, however, an

  19. Voice Quality in Mobile Telecommunication System

    Directory of Open Access Journals (Sweden)

    Evaldas Stankevičius

    2013-05-01

    Full Text Available The article deals with methods for measuring the quality of voice transmitted over a mobile network, together with related problems, algorithms, and options. It presents the voice quality measurement system the authors created and discusses its adequacy and efficiency. The authors also present the results of applying the system under the optimal hardware configuration. Under almost ideal conditions, the system evaluates voice quality with an average MOS estimate of 3.85, while the standardized TEMS Investigation 9.0 yields an average MOS estimate of 4.05. The article then discusses the implementation of a voice quality predictor and investigates nonlinear and linear prediction of the dependence of voice quality on mobile network settings. Nonlinear prediction using an artificial neural network resulted in a correlation coefficient of 0.62, while linear prediction using least mean squares resulted in a correlation coefficient of 0.57. An analytical expression for voice quality as a function of the three network parameters BER, C/I, and RSSI is given as well. Article in Lithuanian.
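
    The linear predictor described, voice quality as a function of BER, C/I, and RSSI, can be sketched as an ordinary least-squares fit; the data and coefficients below are simulated stand-ins, not the paper's measurements.

```python
# A minimal sketch of the linear (least-squares) MOS predictor idea: fit
# MOS ~ BER + C/I + RSSI on measured samples. Data here are simulated.
import numpy as np

rng = np.random.default_rng(4)
n = 200
X = np.column_stack([
    rng.uniform(0, 0.05, n),      # BER
    rng.uniform(5, 25, n),        # C/I (dB)
    rng.uniform(-100, -60, n),    # RSSI (dBm)
])
mos = (4.0 - 20 * X[:, 0] + 0.03 * X[:, 1]
       + 0.01 * (X[:, 2] + 100) + rng.normal(0, 0.2, n))  # toy ground truth

A = np.column_stack([np.ones(n), X])                      # add intercept
coef, *_ = np.linalg.lstsq(A, mos, rcond=None)
pred = A @ coef
print("correlation:", np.corrcoef(pred, mos)[0, 1])
```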

  20. The voice conveys specific emotions: Evidence from vocal burst displays

    OpenAIRE

    Simon-Thomas, E.; Keltner, D.; Sauter, D.; Sinicropi-Yao, L.; Abramson, A.

    2009-01-01

    Studies of emotion signaling inform claims about the taxonomic structure, evolutionary origins, and physiological correlates of emotions. Emotion vocalization research has tended to focus on a limited set of emotions: anger, disgust, fear, sadness, surprise, happiness, and for the voice, also tenderness. Here, we examine how well brief vocal bursts can communicate 22 different emotions: 9 negative (Study 1) and 13 positive (Study 2), and whether prototypical vocal bursts convey emotions more ...

  1. Suppression of sound radiation to far field of near-field acoustic communication system using evanescent sound field

    Science.gov (United States)

    Fujii, Ayaka; Wakatsuki, Naoto; Mizutani, Koichi

    2016-01-01

    A method of suppressing sound radiation to the far field of a near-field acoustic communication system using an evanescent sound field is proposed. The amplitude of the evanescent sound field generated by an infinite vibrating plate attenuates exponentially with increasing distance from the surface of the vibrating plate. In practice, however, a discontinuity of the sound field exists at the edge of a finite vibrating plate, which broadens the wavenumber spectrum. A sound wave radiates beyond the evanescent sound field because of this broadening of the wavenumber spectrum. Therefore, we calculated the optimum distribution of the particle velocity on the vibrating plate to reduce the broadening of the wavenumber spectrum. We focused on a window function that is utilized in the field of signal analysis for reducing the broadening of the frequency spectrum. An optimization calculation is necessary to design a window function suitable for suppressing sound radiation while securing a spatial area for data communication. In addition, a wide frequency bandwidth is required to increase the data transmission speed. Therefore, we investigated a suitable method for calculating the sound pressure level at the far field to confirm how the distribution of the sound pressure level varies with the window shape and frequency. The distribution of the sound pressure level at a finite distance was in good agreement with that obtained at an infinite far field under the condition generating the evanescent sound field. Consequently, the window function was optimized by the method used to calculate the distribution of the sound pressure level at an infinite far field using the wavenumber spectrum on the vibrating plate. According to the result of comparing the distributions of the sound pressure level in the cases with and without the window function, it was confirmed that the area whose sound pressure level was reduced from the maximum level to -50 dB was

  2. Smartphone App for Voice Disorders

    Science.gov (United States)

    Feature (Past Issues / Fall 2013): ... developed a mobile monitoring device that relies on smartphone technology to gather a week's worth of talking, ...

  3. Hearing Voices and Seeing Things

    Science.gov (United States)

    Facts for Families, No. 102 (updated October ...): ... delusions (a fixed, false, and often bizarre belief). Hearing voices or seeing things that are not there ...

  4. Measurement and classification of heart and lung sounds by using LabView for educational use.

    Science.gov (United States)

    Altrabsheh, B

    2010-01-01

    This study presents the design, development and implementation of a simple low-cost method of phonocardiography signal detection. Human heart and lung signals are detected with a simple microphone through a personal computer; the signals are recorded and analysed using LabView software. Amplitude and frequency analyses are carried out for various pathological phonocardiography cases. Methods for the automatic classification of normal and abnormal heart sounds, murmurs and lung sounds are presented. Various cases of heart and lung sound measurement are recorded and analysed, and the measurements can be saved for further analysis. The method in this study can be used by doctors as a detection aid and may be useful for teaching purposes at medical and nursing schools.

  5. Blast noise classification with common sound level meter metrics.

    Science.gov (United States)

    Cvengros, Robert M; Valente, Dan; Nykaza, Edward T; Vipperman, Jeffrey S

    2012-08-01

    A common set of signal features measurable by a basic sound level meter is analyzed, and the quality of the information carried in subsets of these features is examined for its ability to discriminate military blast and non-blast sounds. The analysis is based on over 120 000 human-classified signals compiled from seven different datasets. The study implements linear and Gaussian radial basis function (RBF) support vector machines (SVM) to classify blast sounds. Using the orthogonal centroid dimension reduction technique, intuition is developed about the distribution of blast and non-blast feature vectors in high-dimensional space. Recursive feature elimination (SVM-RFE) is then used to eliminate features containing redundant information and to rank features according to their ability to separate blasts from non-blasts. Finally, the accuracy of the linear and RBF SVM classifiers is listed for each of the experiments in the dataset, and the weights are given for the linear SVM classifier.
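
    SVM-RFE, as used in the record above, ranks features by repeatedly training a linear SVM and discarding the feature with the smallest weight magnitude. A minimal sketch with scikit-learn on synthetic stand-in data (the real sound-level-meter features and labels are not available here):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score

# Hypothetical stand-in data: rows are sounds, columns are sound-level-meter
# features; labels are blast (1) / non-blast (0).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 12))
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=500) > 0).astype(int)

# Linear SVM with recursive feature elimination (SVM-RFE): repeatedly drop the
# feature with the smallest weight magnitude; ranking_ records elimination order.
svm = SVC(kernel="linear", C=1.0)
rfe = RFE(estimator=svm, n_features_to_select=1, step=1).fit(X, y)
print("feature ranking (1 = most discriminative):", rfe.ranking_)

# Accuracy of the linear classifier on the top-ranked subset.
top = np.argsort(rfe.ranking_)[:4]
print("CV accuracy, top 4 features:", cross_val_score(svm, X[:, top], y, cv=5).mean())
```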

  6. Clinical Features of Psychogenic Voice Disorder and the Efficiency of Voice Therapy and Psychological Evaluation.

    Science.gov (United States)

    Tezcaner, Zahide Çiler; Gökmen, Muhammed Fatih; Yıldırım, Sibel; Dursun, Gürsel

    2017-11-06

    The aim of this study was to define the clinical features of psychogenic voice disorder (PVD) and to explore the treatment efficiency of voice therapy and psychological evaluation. Fifty-eight patients who received treatment following a PVD diagnosis and had no organic or other functional voice disorders were assessed retrospectively based on laryngoscopic examinations and subjective and objective assessments. Epidemiological characteristics, accompanying organic and psychological disorders, preferred methods of treatment, and previous treatment outcomes were examined for each patient. Voice disorders and responses to treatment were compared between patients who received psychotherapy and patients who did not. Participants comprised 58 patients, 10 male and 48 female. Voice therapy was applied in all patients, 54 (93.1%) of whom had improvement in their voice. Although all patients were advised to undergo psychological assessment, only 60.3% (35/58) of them did so. No statistically significant difference in treatment response was found between patients who received psychological support and patients who did not. Relapse occurred in 14.7% (5/34) of the patients who applied for psychological assessment and in 50% (10/20) of those who did not, a statistically significant difference in relapse rates. Voice therapy is an efficient treatment method for PVD. However, in the long-term follow-up, relapse of the disease is observed to be higher among patients who failed to follow up on the recommendation for psychological assessment. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  7. Background noise exerts diverse effects on the cortical encoding of foreground sounds.

    Science.gov (United States)

    Malone, B J; Heiser, Marc A; Beitel, Ralph E; Schreiner, Christoph E

    2017-08-01

    In natural listening conditions, many sounds must be detected and identified in the context of competing sound sources, which function as background noise. Traditionally, noise is thought to degrade the cortical representation of sounds by suppressing responses and increasing response variability. However, recent studies of neural network models and brain slices have shown that background synaptic noise can improve the detection of signals. Because acoustic noise affects the synaptic background activity of cortical networks, it may improve the cortical responses to signals. We used spike train decoding techniques to determine the functional effects of a continuous white noise background on the responses of clusters of neurons in auditory cortex to foreground signals, specifically frequency-modulated sweeps (FMs) of different velocities, directions, and amplitudes. Whereas the addition of noise progressively suppressed the FM responses of some cortical sites in the core fields with decreasing signal-to-noise ratios (SNRs), the stimulus representation remained robust or was even significantly enhanced at specific SNRs in many others. Even though the background noise level was typically not explicitly encoded in cortical responses, significant information about noise context could be decoded from cortical responses on the basis of how the neural representation of the foreground sweeps was affected. These findings demonstrate significant diversity in signal-in-noise processing even within the core auditory fields that could support noise-robust hearing across a wide range of listening conditions. NEW & NOTEWORTHY The ability to detect and discriminate sounds in background noise is critical for our ability to communicate. The neural basis of robust perceptual performance in noise is not well understood. We identified neuronal populations in core auditory cortex of squirrel monkeys that differ in how they process foreground signals in background noise and that may

  8. [Hearing voices does not always constitute a psychosis].

    Science.gov (United States)

    Sommer, I E C; van der Spek, D W

    2016-01-01

    Hearing voices (i.e. auditory verbal hallucinations) is mainly known as part of schizophrenia and other psychotic disorders. However, hearing voices is a symptom that can occur in many psychiatric, neurological and general medical conditions. We present three cases of non-psychotic patients with auditory verbal hallucinations caused by different disorders. The first patient is a 74-year-old male with voices due to hearing loss, the second is a 20-year-old woman with voices due to traumatisation. The third patient is a 27-year-old woman with voices caused by temporal lobe epilepsy. Hearing voices is a phenomenon that occurs in a variety of disorders. Therefore, identification of the underlying disorder is essential to indicate treatment. Improvement of coping with the voices can reduce their impact on a patient. Antipsychotic drugs are especially effective when hearing voices is accompanied by delusions or disorganization. When this is not the case, the efficacy of antipsychotic drugs will probably not outweigh the side-effects.

  9. Permanent Quadriplegia Following Replacement of Voice Prosthesis.

    Science.gov (United States)

    Ozturk, Kayhan; Erdur, Omer; Kibar, Ertugrul

    2016-11-01

    The authors present a patient with quadriplegia caused by a cervical spine abscess following voice prosthesis replacement, the first reported case of permanent quadriplegia resulting from this procedure. They emphasize that life-threatening complications may be encountered during voice prosthesis replacement. Care should be taken during the procedure, and if problems are encountered, patients must be followed closely.

  10. Second sound scattering in superfluid helium

    International Nuclear Information System (INIS)

    Rosgen, T.

    1985-01-01

    Focusing cavities are used to study the scattering of second sound in liquid helium II. The special geometries reduce wall interference effects and allow measurements in very small test volumes. In a first experiment, a double elliptical cavity is used to focus a second sound wave onto a small wire target. A thin film bolometer measures the side-scattered wave component. The agreement with a theoretical estimate is reasonable, although some problems arise from the small measurement volume and the associated alignment requirements. A second cavity is based on confocal parabolas, thus enabling the use of large planar sensors. A cylindrical heater again produces a focused second sound wave. Three sensors monitor the transmitted wave component as well as the side scatter in two different directions. The side-looking sensors have very high sensitivities due to their large size and resistance. Specially developed cryogenic amplifiers are used to match them to the signal cables. In one case, a second auxiliary heater is used to set up a strong counterflow in the focal region; the second sound wave then scatters from the induced fluid disturbances.

  11. An open access database for the evaluation of heart sound algorithms.

    Science.gov (United States)

    Liu, Chengyu; Springer, David; Li, Qiao; Moody, Benjamin; Juan, Ricardo Abad; Chorro, Francisco J; Castells, Francisco; Roig, José Millet; Silva, Ikaro; Johnson, Alistair E W; Syed, Zeeshan; Schmidt, Samuel E; Papadaniil, Chrysa D; Hadjileontiadis, Leontios; Naseri, Hosein; Moukadem, Ali; Dieterlen, Alain; Brandt, Christian; Tang, Hong; Samieinasab, Maryam; Samieinasab, Mohammad Reza; Sameni, Reza; Mark, Roger G; Clifford, Gari D

    2016-12-01

    In the past few decades, analysis of heart sound signals (i.e. the phonocardiogram or PCG), especially for automated heart sound segmentation and classification, has been widely studied and has been reported to have the potential value to detect pathology accurately in clinical applications. However, comparative analyses of algorithms in the literature have been hindered by the lack of high-quality, rigorously validated, and standardized open databases of heart sound recordings. This paper describes a public heart sound database, assembled for an international competition, the PhysioNet/Computing in Cardiology (CinC) Challenge 2016. The archive comprises nine different heart sound databases sourced from multiple research groups around the world. It includes 2435 heart sound recordings in total collected from 1297 healthy subjects and patients with a variety of conditions, including heart valve disease and coronary artery disease. The recordings were collected from a variety of clinical or nonclinical (such as in-home visits) environments and equipment. The length of recording varied from several seconds to several minutes. This article reports detailed information about the subjects/patients including demographics (number, age, gender), recordings (number, location, state and time length), associated synchronously recorded signals, sampling frequency and sensor type used. We also provide a brief summary of the commonly used heart sound segmentation and classification methods, including open source code provided concurrently for the Challenge. A description of the PhysioNet/CinC Challenge 2016, including the main aims, the training and test sets, the hand corrected annotations for different heart sound states, the scoring mechanism, and associated open source code are provided. In addition, several potential benefits from the public heart sound database are discussed.

  12. Voice amplification for primary school teachers with voice disorders: a randomized clinical trial.

    Science.gov (United States)

    Bovo, Roberto; Trevisi, Patrizia; Emanuelli, Enzo; Martini, Alessandro

    2013-06-01

    Several studies have demonstrated a high prevalence of voice disorders in teachers, together with the personal, professional and economic consequences of the problem. Good primary prevention should be based on 3 aspects: 1) amelioration of classroom acoustics, 2) voice care programs for future professional voice users, including teachers, and 3) classroom or portable amplification systems. The aim of the study was to assess the benefit obtained from the use of portable amplification systems by female primary school teachers in their occupational setting. Forty female primary school teachers attended a course about professional voice care, which comprised two theoretical lectures, each 60 min long. Thereafter, they were randomized into 2 groups: the teachers of the first group were asked to use a portable vocal amplifier for 3 months, until the end of the school year. The other 20 teachers formed the control group, matched for age and years of employment. All subjects had a grade 1 dysphonia with no significant organic lesion of the vocal folds. Most teachers in the experimental group used the amplifier consistently for the whole duration of the experiment and found it very useful in reducing the symptoms of vocal fatigue. In fact, after 3 months, Voice Handicap Index (VHI) scores in the "course + amplifier" group demonstrated a significant amelioration (p = 0.003). The perceptual grade of dysphonia also improved significantly (p = 0.0005). The same parameters also changed favourably in the "course only" group, but the results were not statistically significant (p = 0.4 for VHI and p = 0.03 for perceptual grade). In teachers, and particularly in those with a constitutionally weak voice and/or those who are prone to vocal fold pathology, vocal amplifiers may be an effective and low-cost intervention to decrease potentially damaging vocal loads and may represent a necessary form of prevention.

  13. Digital servo control of random sound test excitation. [in reverberant acoustic chamber

    Science.gov (United States)

    Nakich, R. B. (Inventor)

    1974-01-01

    A digital servocontrol system for random noise excitation of a test object in a reverberant acoustic chamber employs a plurality of sensors spaced in the sound field to produce signals in separate channels which are decorrelated and averaged. The average signal is divided into a plurality of adjacent frequency bands cyclically sampled by a time division multiplex system, converted into digital form, and compared to a predetermined spectrum value stored in digital form. The results of the comparisons are used to control a time-shared up-down counter to develop gain control signals for the respective frequency bands in the spectrum of random sound energy picked up by the microphones.
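
    The control loop in this record (averaged multi-sensor measurement, per-band comparison against a stored reference spectrum, and an up-down counter stepping each band's gain) can be sketched as a simple simulation. A minimal sketch; the chamber model, step size, and channel counts are assumptions, not values from the patent:

```python
import numpy as np

rng = np.random.default_rng(1)
n_bands = 8                                  # adjacent frequency bands (assumed count)
target_db = np.full(n_bands, 90.0)           # stored reference spectrum [dB] (assumed)
gain_db = np.zeros(n_bands)                  # per-band gain control words
step_db = 0.5                                # one up/down-counter step [dB] (assumed)

def measure_band_levels(gain_db):
    """Simulated chamber: four decorrelated microphone channels are averaged,
    and each band level follows the applied gain plus measurement noise."""
    sensors = 78.0 + rng.normal(scale=1.0, size=(4, n_bands))
    return sensors.mean(axis=0) + gain_db

# Servo loop: compare each band to the stored spectrum and step the counter.
for _ in range(200):
    error = target_db - measure_band_levels(gain_db)
    gain_db += step_db * np.sign(error)      # up-down counter: fixed step, sign only

print("final band levels [dB]:", measure_band_levels(gain_db).round(1))
```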

  14. Heart Sound Biometric System Based on Marginal Spectrum Analysis

    Science.gov (United States)

    Zhao, Zhidong; Shen, Qinqin; Ren, Fangqin

    2013-01-01

    This work presents a heart sound biometric system based on marginal spectrum analysis, which is a new feature extraction technique for identification purposes. This heart sound identification system is comprised of signal acquisition, pre-processing, feature extraction, training, and identification. Experiments on the selection of the optimal values for the system parameters are conducted. The results indicate that the new spectrum coefficients result in a significant increase in the recognition rate of 94.40% compared with that of the traditional Fourier spectrum (84.32%) based on a database of 280 heart sounds from 40 participants. PMID:23429515
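
    A marginal spectrum integrates a Hilbert (time-frequency) spectrum over time, yielding amplitude as a function of instantaneous frequency. The sketch below is a simplification that skips the empirical mode decomposition step of the full Hilbert-Huang procedure and uses a synthetic stand-in for a heart sound; it only illustrates the accumulation step:

```python
import numpy as np
from scipy.signal import hilbert

fs = 2000                                    # sampling rate [Hz] (assumed)
t = np.arange(0, 1.0, 1 / fs)
# Stand-in "heart sound": two decaying tone bursts (S1-like and S2-like).
x = (np.exp(-30 * t) * np.sin(2 * np.pi * 50 * t)
     + np.exp(-40 * np.clip(t - 0.35, 0, None)) * np.sin(2 * np.pi * 80 * t) * (t > 0.35))

# Analytic signal -> instantaneous amplitude and frequency.
z = hilbert(x)
amp = np.abs(z)
inst_freq = np.diff(np.unwrap(np.angle(z))) * fs / (2 * np.pi)

# Marginal spectrum: accumulate instantaneous amplitude into frequency bins,
# i.e. integrate the time-frequency spectrum over time.
bins = np.arange(0, 200, 5.0)
marginal, _ = np.histogram(inst_freq, bins=bins, weights=amp[1:])
peak_bin = bins[np.argmax(marginal)]
print(f"marginal-spectrum peak near {peak_bin:.0f}-{peak_bin + 5:.0f} Hz")
```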

  15. Acoustic cues for the recognition of self-voice and other-voice

    Directory of Open Access Journals (Sweden)

    Mingdi Xu

    2013-10-01

    Full Text Available Self-recognition, being indispensable for successful social communication, has become a major focus in current social neuroscience. The physical aspects of the self are most typically manifested in the face and voice. Compared with the wealth of studies on self-face recognition, self-voice recognition (SVR) has not gained much attention. Converging evidence has suggested that the fundamental frequency (F0) and formant structures serve as the key acoustic cues for other-voice recognition (OVR). However, little is known about which, and how, acoustic cues are utilized for SVR as opposed to OVR. To address this question, we independently manipulated the F0 and formant information of recorded voices and investigated their contributions to SVR and OVR. Japanese participants were presented with recorded vocal stimuli and were asked to identify the speaker: either themselves or one of their peers. Six groups of 5 peers of the same sex participated in the study. Under conditions where the formant information was fully preserved and where only the frequencies lower than the third formant (F3) were retained, the accuracy of SVR deteriorated significantly with the modulation of the F0, and the results were comparable for OVR. By contrast, under a condition where only the frequencies higher than F3 were retained, the accuracy of SVR was significantly higher than that of OVR throughout the range of F0 modulations, and the F0 scarcely affected the accuracies of SVR and OVR. Our results indicate that while both F0 and formant information are involved in SVR, as well as in OVR, the advantage of SVR is manifested only when major formant information for speech intelligibility is absent. These findings imply the robustness of self-voice representation, possibly by virtue of auditory familiarity and other factors such as its association with motor/articulatory representation.

  16. Multidimensional assessment of strongly irregular voices such as in substitution voicing and spasmodic dysphonia: a compilation of own research.

    Science.gov (United States)

    Moerman, Mieke; Martens, Jean-Pierre; Dejonckere, Philippe

    2015-04-01

    This article is a compilation of the authors' own research performed during the European COoperation in Science and Technology (COST) action 2103: 'Advanced Voice Function Assessment', an initiative of voice and speech processing teams consisting of physicists, engineers, and clinicians. The manuscript concerns the analysis of strongly irregular voicing types, namely substitution voicing (SV) and adductor spasmodic dysphonia (AdSD). A specific perceptual rating scale (IINFVo) was developed, and the Auditory Model Based Pitch Extractor (AMPEX), a piece of software that automatically analyses running speech and generates pitch values in background noise, was applied. The IINFVo perceptual rating scale has been shown to be useful in evaluating SV. The analysis of strongly irregular voices stimulated a modification of the European Laryngological Society's assessment protocol, which was originally designed for the common types of (less severe) dysphonia. Acoustic analysis with AMPEX demonstrates that the most informative features are, for SV, the voicing-related acoustic features and, for AdSD, the perturbation measures. Poor correlations between self-assessment and the acoustic and perceptual dimensions in the assessment of highly irregular voices argue for a multidimensional approach.

  17. Voice and choice by delegation.

    Science.gov (United States)

    van de Bovenkamp, Hester; Vollaard, Hans; Trappenburg, Margo; Grit, Kor

    2013-02-01

    In many Western countries, options for citizens to influence public services are increased to improve the quality of services and democratize decision making. Possibilities to influence are often cast into Albert Hirschman's taxonomy of exit (choice), voice, and loyalty. In this article we identify delegation as an important addition to this framework. Delegation gives individuals the chance to practice exit/choice or voice without all the hard work that is usually involved in these options. Empirical research shows that not many people use their individual options of exit and voice, which could lead to inequality between users and nonusers. We identify delegation as a possible solution to this problem, using Dutch health care as a case study to explore this option. Notwithstanding various advantages, we show that voice and choice by delegation also entail problems of inequality and representativeness.

  18. Voice stress analysis and evaluation

    Science.gov (United States)

    Haddad, Darren M.; Ratley, Roy J.

    2001-02-01

    Voice Stress Analysis (VSA) systems are marketed as computer-based systems capable of measuring stress in a person's voice as an indicator of deception. They are advertised as being less expensive, easier to use, less invasive in use, and less constrained in their operation than polygraph technology. The National Institute of Justice has asked the Air Force Research Laboratory for assistance in evaluating voice stress analysis technology. Law enforcement officials have also been asking questions about this technology. If VSA technology proves to be effective, its value for military and law enforcement applications is tremendous.

  19. Effects of Medications on Voice

    Science.gov (United States)

    Patient health information (Effects of Medications on Voice): ... replacement therapy post-menopause may have a variable effect. An inadequate level of thyroid replacement medication in ...

  20. Social and emotional values of sounds influence human (Homo sapiens) and non-human primate (Cercopithecus campbelli) auditory laterality.

    Directory of Open Access Journals (Sweden)

    Muriel Basile

    Full Text Available The last decades have provided evidence of auditory laterality in vertebrates, offering important new insights for understanding the origin of human language. Factors such as the social (e.g. specificity, familiarity) and emotional value of sounds have been shown to influence hemispheric specialization. However, little is known about the crossed effect of these two factors in animals, and human-animal comparative studies using the same methodology are rare. In our study, we adapted the head-turn paradigm, a widely used non-invasive method, to 8-9-year-old schoolgirls and to adult female Campbell's monkeys, focusing on head and/or eye orientations in response to sound playbacks. We broadcast communicative signals (monkeys: calls; humans: speech) emitted by familiar individuals presenting distinct degrees of social value (female monkeys: conspecific group members vs heterospecific neighbours; human girls: from the same vs a different classroom) and emotional value (monkeys: contact vs threat calls; humans: friendly vs aggressive intonation). We found a crossed effect of social and emotional values in both species, since only "negative" voices from same class/group members elicited significant auditory laterality (Wilcoxon tests: monkeys, T = 0, p = 0.03; girls: T = 4.5, p = 0.03). Moreover, we found differences between species, as a left and a right hemisphere preference were found in humans and monkeys, respectively. Furthermore, while monkeys almost exclusively responded by turning their head, girls sometimes also just moved their eyes. This study supports theories defending differential roles played by the two hemispheres in primates' auditory laterality and shows that more systematic species comparisons are needed before proposing evolutionary scenarios. Moreover, the choice of sound stimuli and behavioural measures in such studies should be the focus of careful attention.

  1. The Voices of the Documentarist

    Science.gov (United States)

    Utterback, Ann S.

    1977-01-01

    Discusses T. S. Eliot's essay, "The Three Voices of Poetry," which conceptualizes the position taken by the poet or creator. Suggests that an examination of documentary film within the three-voices concept expands the critical framework of the film genre. (MH)

  2. Obligatory and facultative brain regions for voice-identity recognition

    Science.gov (United States)

    Roswandowitz, Claudia; Kappes, Claudia; Obrig, Hellmuth; von Kriegstein, Katharina

    2018-01-01

    Abstract Recognizing the identity of others by their voice is an important skill for social interactions. To date, it remains controversial which parts of the brain are critical structures for this skill. Based on neuroimaging findings, standard models of person-identity recognition suggest that the right temporal lobe is the hub for voice-identity recognition. Neuropsychological case studies, however, reported selective deficits of voice-identity recognition in patients predominantly with right inferior parietal lobe lesions. Here, our aim was to work towards resolving the discrepancy between neuroimaging studies and neuropsychological case studies to find out which brain structures are critical for voice-identity recognition in humans. We performed a voxel-based lesion-behaviour mapping study in a cohort of patients (n = 58) with unilateral focal brain lesions. The study included a comprehensive behavioural test battery on voice-identity recognition of newly learned (voice-name, voice-face association learning) and familiar voices (famous voice recognition) as well as visual (face-identity recognition) and acoustic control tests (vocal-pitch and vocal-timbre discrimination). The study also comprised clinically established tests (neuropsychological assessment, audiometry) and high-resolution structural brain images. The three key findings were: (i) a strong association between voice-identity recognition performance and right posterior/mid temporal and right inferior parietal lobe lesions; (ii) a selective association between right posterior/mid temporal lobe lesions and voice-identity recognition performance when face-identity recognition performance was factored out; and (iii) an association of right inferior parietal lobe lesions with tasks requiring the association between voices and faces but not voices and names. The results imply that the right posterior/mid temporal lobe is an obligatory structure for voice-identity recognition, while the inferior parietal

  3. The effect of voice quality on hiring decisions

    Directory of Open Access Journals (Sweden)

    Lea Tylečková

    2017-09-01

    Full Text Available This paper examines the effect of voice quality on hiring decisions. Considering voice quality an important tool in an individual's self-presentation on the job market, it may very well enhance his/her job prospects, while some voice qualities may affect employers' judgments in a negative way. Five men and five women were recorded reading four different utterances representing answers to job interviewers' questions in four different phonation guises: modal, breathy, creaky and pressed. 38 professional employment interviewers rated the speakers' hireability and personality (likeability, self-confidence and trustworthiness) on 7-point semantic differential scales based on the speakers' voices. The results revealed a significant effect of the phonation guises on the speakers' ratings, with the modal voice being superior to the cluster of non-modal voices. Interestingly, the non-modal guises were evaluated in a very similar way, except for the self-confidence category, with the breathy voice getting the lowest scores on the one hand and the pressed voice correlating with high self-confidence ratings on the other.

  4. Can a voice disorder be an occupational disease?

    Directory of Open Access Journals (Sweden)

    Daša Gluvajić

    2012-11-01

    Full Text Available Voice disorders are all changes in voice quality that can be detected by hearing. Some etiological factors that contribute to the development of voice disorders are related to occupation, working environment and working conditions. In modern societies, one third of the labour force works in professions with vocal loading. In such professions, voice disorders influence work ability and quality of life. For an occupational disease, exposure to harmful factors in the workplace is essential and causes the development of a disorder in a previously healthy individual. In some European countries, voice disorders in teachers that do not improve after proper treatment are recognized as occupational diseases. In Slovenia, no organic or functional voice disorder is listed on the current list of occupational diseases. Prevention and treatment of occupational voice disorders can contribute to better safety at the workplace and improve workers' health. Voice professionals must also know that they are responsible for their own health and that they must actively take care of it.

  5. The voice conveys specific emotions: evidence from vocal burst displays.

    Science.gov (United States)

    Simon-Thomas, Emiliana R; Keltner, Dacher J; Sauter, Disa; Sinicropi-Yao, Lara; Abramson, Anna

    2009-12-01

    Studies of emotion signaling inform claims about the taxonomic structure, evolutionary origins, and physiological correlates of emotions. Emotion vocalization research has tended to focus on a limited set of emotions: anger, disgust, fear, sadness, surprise, happiness, and for the voice, also tenderness. Here, we examine how well brief vocal bursts can communicate 22 different emotions: 9 negative (Study 1) and 13 positive (Study 2), and whether prototypical vocal bursts convey emotions more reliably than heterogeneous vocal bursts (Study 3). Results show that vocal bursts communicate emotions like anger, fear, and sadness, as well as seldom-studied states like awe, compassion, interest, and embarrassment. Ancillary analyses reveal family-wise patterns of vocal burst expression. Errors in classification were more common within emotion families (e.g., 'self-conscious,' 'pro-social') than between emotion families. The three studies reported highlight the voice as a rich modality for emotion display that can inform fundamental constructs about emotion.

  6. Sound signatures and production mechanisms of three species of pipefishes (Family: Syngnathidae)

    Directory of Open Access Journals (Sweden)

    Adam Chee Ooi Lim

    2015-12-01

    Full Text Available Background. Syngnathid fishes produce three kinds of sounds, named click, growl and purr. These sounds are generated by different mechanisms to give a consistent signal pattern or signature, which is believed to play a role in intraspecific and interspecific communication. Commonly known sounds are produced when the fish feeds (click, purr) or is under duress (growl). While there are more acoustic studies on seahorses, pipefishes have not received much attention. Here we document the differences in feeding click signals between three species of pipefishes and relate them to cranial morphology and kinesis, or the sound-producing mechanism. Methods. The feeding clicks of two species of freshwater pipefishes, Doryichthys martensii and Doryichthys deokhathoides, and one species of estuarine pipefish, Syngnathoides biaculeatus, were recorded by a hydrophone in acoustically dampened tanks. The acoustic signals were analysed using a time-scale distribution (or scalogram) based on the wavelet transform. A detailed time-varying analysis of the spectral contents of the localized acoustic signal was obtained by jointly interpreting the oscillogram, scalogram and power spectrum. The heads of both Doryichthys species were prepared for microtomographical scans, which were analysed using 3D imaging software. Additionally, the cranial bones of all three species were examined using a clearing and double-staining method for histological studies. Results. The sound characteristics of the feeding click of the pipefish are species-specific, appearing to depend on three bones: the supraoccipital, 1st postcranial plate and 2nd postcranial plate. The sounds are generated when the head of the Doryichthys pipefishes flexes backward during the feeding strike, as the supraoccipital slides backwards, striking and pushing the 1st postcranial plate against (and striking) the 2nd postcranial plate. In the Syngnathoides pipefish, in the absence of the 1st postcranial plate, the
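
    The scalogram analysis used in this record can be reproduced in outline with a continuous wavelet transform. A minimal sketch using PyWavelets on a synthetic click; the sampling rate, wavelet choice, and signal model are assumptions, not the study's hydrophone data:

```python
import numpy as np
import pywt  # PyWavelets

fs = 44100                                   # sampling rate [Hz] (assumed)
t = np.arange(0, 0.05, 1 / fs)
# Stand-in feeding click: a short, sharply damped oscillation.
click = np.exp(-400 * t) * np.sin(2 * np.pi * 3000 * t)

# Continuous wavelet transform -> scalogram (|coefficients| over time and scale).
scales = np.arange(1, 64)
coefs, freqs = pywt.cwt(click, scales, "morl", sampling_period=1 / fs)
scalogram = np.abs(coefs)

# Locate the dominant time-frequency component of the click.
i, j = np.unravel_index(np.argmax(scalogram), scalogram.shape)
print(f"peak energy near {freqs[i]:.0f} Hz at t = {t[j] * 1000:.2f} ms")
```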

  7. The Voice as Computer Interface: A Look at Tomorrow's Technologies.

    Science.gov (United States)

    Lange, Holley R.

    1991-01-01

    Discussion of voice as the communications device for computer-human interaction focuses on voice recognition systems for use within a library environment. Voice technologies are described, including voice response and voice recognition; examples of voice systems in use in libraries are examined; and further possibilities, including use with…

  8. Hear where we are sound, ecology, and sense of place

    CERN Document Server

    Stocker, Michael

    2013-01-01

    Throughout history, hearing and sound perception have typically been framed in the context of how sound conveys information and how that information influences the listener. Hear Where We Are inverts this premise and examines how humans and other hearing animals use sound to establish acoustical relationships with their surroundings. This simple inversion reveals a panoply of possibilities by which we can re-evaluate how hearing animals use, produce, and perceive sound. Nuances in vocalizations become signals of enticement or boundary setting; silence becomes a field ripe in auditory possibilities; predator/prey relationships are infused with acoustic deception, and sounds that have been considered territorial cues become the fabric of cooperative acoustical communities. This inversion also expands the context of sound perception into a larger perspective that centers on biological adaptation within acoustic habitats. Here, the rapid synchronized flight patterns of flocking birds and the tight maneuvering of s...

  9. Selection of individual features of a speech signal using genetic algorithms

    Directory of Open Access Journals (Sweden)

    Kamil Kamiński

    2016-03-01

    Full Text Available The paper presents an automatic speaker recognition system, implemented in the Matlab environment, and demonstrates how to achieve and optimize various elements of the system. The main emphasis was put on the selection of features of a speech signal using a genetic algorithm that takes into account the synergy of features. The results of optimizing selected elements of a classifier are also shown, including the number of Gaussian distributions used to model each of the voices. In addition, a universal voice model has been used for creating the voice models. Keywords: biometrics, automatic speaker recognition, genetic algorithms, feature selection
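
    Genetic feature selection of this kind evaluates whole feature subsets at once, which is what lets it capture synergy between features that single-feature ranking would miss. A minimal sketch in Python on synthetic stand-in data; the population size, mutation rate, and k-NN fitness classifier are all assumptions, not the paper's Matlab implementation:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical stand-in data: rows are utterances, columns are candidate speech
# features; labels are speaker identities. The first 3 features are informative.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 20))
y = rng.integers(0, 5, size=300)
X[:, :3] += y[:, None]

def fitness(mask):
    """Cross-validated accuracy of a classifier on the selected feature subset."""
    if not mask.any():
        return 0.0
    return cross_val_score(KNeighborsClassifier(5), X[:, mask], y, cv=3).mean()

# Minimal GA over binary feature masks: tournament selection, uniform crossover,
# bit-flip mutation. Scoring whole subsets lets the search reward feature synergy.
pop = rng.integers(0, 2, size=(20, X.shape[1])).astype(bool)
for _ in range(15):
    scores = np.array([fitness(m) for m in pop])
    winners = [max(rng.integers(0, 20, size=2), key=lambda i: scores[i])
               for _ in range(20)]
    parents = pop[winners]
    cross = rng.random(parents.shape) < 0.5
    children = np.where(cross, parents, np.roll(parents, 1, axis=0))
    pop = children ^ (rng.random(children.shape) < 0.02)   # mutate

best = max(pop, key=fitness)
print("selected features:", np.flatnonzero(best), "accuracy:", round(fitness(best), 3))
```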

  10. Probing echoic memory with different voices.

    Science.gov (United States)

    Madden, D J; Bastian, J

    1977-05-01

    Considerable evidence has indicated that some acoustical properties of spoken items are preserved in an "echoic" memory for approximately 2 sec. However, some of this evidence has also shown that changing the voice speaking the stimulus items has a disruptive effect on memory which persists longer than that of other acoustical variables. The present experiment examined the effect of voice changes on response bias as well as on accuracy in a recognition memory task. The task involved judging recognition probes as being present in or absent from sets of dichotically presented digits. Recognition of probes spoken in the same voice as that of the dichotic items was more accurate than recognition of different-voice probes at each of three retention intervals of up to 4 sec. Different-voice probes increased the likelihood of "absent" responses, but only up to a 1.4-sec delay. These shifts in response bias may represent a property of echoic memory which should be investigated further.

  11. Voice disorders in teachers. A review.

    Science.gov (United States)

    Martins, Regina Helena Garcia; Pereira, Eny Regina Bóia Neves; Hidalgo, Caio Bosque; Tavares, Elaine Lara Mendes

    2014-11-01

    Voice disorders are very prevalent among teachers, and their consequences are serious. Although the literature is extensive, there are differences in the concepts and methodology related to voice problems; most studies are restricted to analyzing teachers' responses to questionnaires, and only a few studies include vocal assessments and videolaryngoscopic examinations to obtain a definitive diagnosis. The aim was to review demographic studies related to vocal disorders in teachers in order to analyze the diverse methodologies, the prevalence rates reported by the authors, the main risk factors, the most prevalent laryngeal lesions, and the repercussions of dysphonia on professional activities. The available literature (from 1997 to 2013) was narratively reviewed based on the Medline, PubMed, Lilacs, SciELO, and Cochrane library databases. Excluded were articles that specifically analyzed treatment modalities and those that did not make their abstracts available in those databases. The keywords included were teacher, dysphonia, voice disorders, and professional voice. Copyright © 2014 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  12. Facilitated auditory detection for speech sounds

    Directory of Open Access Journals (Sweden)

    Carine Signoret

    2011-07-01

    Full Text Available While it is well known that knowledge facilitates higher cognitive functions, such as visual and auditory word recognition, little is known about the influence of knowledge on detection, particularly in the auditory modality. Our study tested the influence of phonological and lexical knowledge on auditory detection. Words, pseudo-words and complex non-phonological sounds, energetically matched as closely as possible, were presented at a range of presentation levels from subthreshold to clearly audible. The participants performed a detection task (Experiments 1 and 2) that was followed by a two-alternative forced-choice recognition task in Experiment 2. The results of this second task in Experiment 2 suggest correct recognition of words in the absence of detection with a subjective threshold approach. In the detection task of both experiments, phonological stimuli (words and pseudo-words) were better detected than non-phonological stimuli (complex sounds) presented close to the auditory threshold. This finding suggests an advantage of speech for signal detection. An additional advantage of words over pseudo-words was observed in Experiment 2, suggesting that lexical knowledge could also improve auditory detection when listeners had to recognize the stimulus in a subsequent task. Two simulations of detection performance performed on the sound signals confirmed that the advantage of speech over non-speech processing could not be attributed to energetic differences in the stimuli.

  13. Optimal dose-response relationships in voice therapy.

    Science.gov (United States)

    Roy, Nelson

    2012-10-01

    Like other areas of speech-language pathology, the behavioural management of voice disorders lacks precision regarding optimal dose-response relationships. In voice therapy, dosing can presumably vary from no measurable effect (i.e., no observable benefit or adverse effect), to ideal dose (maximum benefit with no adverse effects), to doses that produce toxic or harmful effects on voice production. Practicing specific vocal exercises will inevitably increase vocal load. At ideal doses, these exercises may be non-toxic and beneficial, while at intermediate or high doses, the same exercises may actually be toxic or damaging to vocal fold tissues. In pharmacology, toxicity is a critical concept, yet it is rarely considered in voice therapy, with little known regarding "effective" concentrations of specific voice therapies vs "toxic" concentrations. The potential for vocal fold tissue damage related to overdosing on specific vocal exercises has been under-studied. In this commentary, the issue of dosing will be explored within the context of voice therapy, with particular emphasis placed on possible "overdosing".

  14. Work-related voice disorder

    Directory of Open Access Journals (Sweden)

    Paulo Eduardo Przysiezny

    2015-04-01

    Full Text Available INTRODUCTION: Dysphonia is the main symptom of disorders of oral communication. However, voice disorders also present with other symptoms, such as difficulty in maintaining the voice (asthenia), vocal fatigue, variation in the habitual vocal fundamental frequency, hoarseness, lack of vocal volume and projection, loss of vocal efficiency, and weakness when speaking. There are several proposals for the etiologic classification of dysphonia: functional, organofunctional, organic, and work-related voice disorder (WRVD). OBJECTIVE: To conduct a literature review on WRVD and on the current Brazilian labor legislation. METHODS: This was a review article with bibliographical research conducted on the PubMed and Bireme databases, using the terms "work-related voice disorder", "occupational dysphonia", and "dysphonia and labor legislation", and a review of relevant labor and social security laws. CONCLUSION: WRVD is frequently listed as a reason for work absenteeism, functional rehabilitation, or prolonged absence from work. Currently, forensic physicians have no comparative parameters to help with the analysis of vocal disorders. In certain situations, WRVD may cause work disability. This disorder may be labor-related, or be an adjuvant factor in work-related diseases.

  15. Aerodynamic findings and Voice Handicap Index in Parkinson's disease.

    Science.gov (United States)

    Motta, Sergio; Cesari, Ugo; Paternoster, Mariano; Motta, Giovanni; Orefice, Giuseppe

    2018-04-23

    To verify possible relations between vocal disability and aerodynamic measures in selected Parkinson's disease (PD) patients with low/moderate-grade dysphonia. Fifteen idiopathic dysphonic PD male patients were examined and compared with 15 euphonic subjects. Testing included the following measures: Voice Handicap Index (VHI), maximum phonation time (MPT), mean estimated subglottal pressure (MESGP), mean sound pressure level (MSPL), mean phonatory power (MPP), mean phonatory efficiency (MPE) and mean phonatory resistance (MPR). Statistical analysis showed: a significant reduction in MPR and MSPL in PD subjects compared to healthy ones; a significant positive correlation between VHI score and MSPL, MPR, MPP and MESGP; and a significant negative correlation between VHI and MPT within PD subjects. A test for multiple linear regression showed a significant correlation between VHI score, MPT, MPR and MSPL. A relationship between VHI and aerodynamic measures was shown in the present study. Compensatory mechanisms may aggravate vocal disability in PD subjects.

  16. Human voice perception.

    Science.gov (United States)

    Latinus, Marianne; Belin, Pascal

    2011-02-22

    We are all voice experts. First and foremost, we can produce and understand speech, and this makes us a unique species. But in addition to speech perception, we routinely extract from voices a wealth of socially relevant information in what constitutes a more primitive, and probably more universal, non-linguistic mode of communication. Consider the following example: you are sitting in a plane, and you can hear a conversation in a foreign language in the row behind you. You do not see the speakers' faces, and you cannot understand the speech content because you do not know the language. Yet, an amazing amount of information is available to you. You can evaluate the physical characteristics of the different protagonists, including their gender, approximate age and size, and associate an identity with the different voices. You can form a good idea of each speaker's mood and affective state, as well as more subtle cues such as the perceived attractiveness or dominance of the protagonists. In brief, you can form a fairly detailed picture of the type of social interaction unfolding, which a brief glance backwards can on occasion help refine - sometimes surprisingly so. What are the acoustical cues that carry these different types of vocal information? How does our brain process and analyse this information? Here we briefly review an emerging field and the main tools used in voice perception research. Copyright © 2011 Elsevier Ltd. All rights reserved.

  17. Using K-Nearest Neighbor Classification to Diagnose Abnormal Lung Sounds

    Directory of Open Access Journals (Sweden)

    Chin-Hsing Chen

    2015-06-01

    Full Text Available A reported 30% of people worldwide have abnormal lung sounds, including crackles, rhonchi, and wheezes. To date, the traditional stethoscope remains the most popular tool used by physicians to diagnose such abnormal lung sounds; however, many problems arise with the use of a stethoscope, including the effects of environmental noise, the inability to record and store lung sounds for follow-up or tracking, and the physician's subjective diagnostic experience. This study developed a digital stethoscope to help physicians overcome these problems when diagnosing abnormal lung sounds. In this digital system, mel-frequency cepstral coefficients (MFCCs) were used to extract the features of lung sounds, and then the K-means algorithm was used for feature clustering, to reduce the amount of data for computation. Finally, the K-nearest neighbor method was used to classify the lung sounds. The proposed system can also be used for home care: if the percentage of abnormal lung sound frames is > 30% of the whole test signal, the system can automatically warn the user to visit a physician for diagnosis. We also used bend sensors together with an amplification circuit, Bluetooth, and a microcontroller to implement a respiration detector. The respiratory signal extracted by the bend sensors can be transmitted to the computer via Bluetooth to calculate the respiratory cycle, for real-time assessment. If an abnormal status is detected, the device will warn the user automatically. Experimental results indicated that the error in respiratory cycles between measured and actual values was only 6.8%, illustrating the potential of our detector for home care applications.
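
    The MFCC / K-means / K-nearest-neighbor pipeline described above can be outlined in a few lines of Python. A minimal sketch using librosa and scikit-learn; the synthetic noise recordings and class labels are placeholders, since the study's actual lung sound data are not available here:

```python
import numpy as np
import librosa
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

def lung_sound_features(y, sr, n_clusters=8):
    """MFCC frames -> K-means cluster centres, flattened to one fixed-length vector."""
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T          # (frames, 13)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(mfcc)
    return km.cluster_centers_.flatten()

# Hypothetical stand-in recordings: synthetic noise in place of real normal /
# crackle / wheeze lung sounds, which this sketch has no access to.
rng = np.random.default_rng(3)
sr = 8000
recordings, labels = [], []
for label, scale in [("normal", 0.1), ("crackle", 0.3), ("wheeze", 0.5)]:
    for _ in range(4):
        recordings.append(rng.normal(scale=scale, size=5 * sr))
        labels.append(label)

X = np.array([lung_sound_features(y, sr) for y in recordings])
knn = KNeighborsClassifier(n_neighbors=3).fit(X, labels)
print("predicted:", knn.predict(X[:3]))
```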

  18. Sound stream segregation: a neuromorphic approach to solve the "cocktail party problem" in real-time.

    Science.gov (United States)

    Thakur, Chetan Singh; Wang, Runchun M; Afshar, Saeed; Hamilton, Tara J; Tapson, Jonathan C; Shamma, Shihab A; van Schaik, André

    2015-01-01

    The human auditory system has the ability to segregate complex auditory scenes into a foreground component and a background, allowing us to listen to specific speech sounds from a mixture of sounds. Selective attention plays a crucial role in this process, colloquially known as the "cocktail party effect." It has not been possible to build a machine that can emulate this human ability in real-time. Here, we have developed a framework for the implementation of a neuromorphic sound segregation algorithm in a Field Programmable Gate Array (FPGA). This algorithm is based on the principles of temporal coherence and uses an attention signal to separate a target sound stream from background noise. Temporal coherence implies that auditory features belonging to the same sound source are coherently modulated and evoke highly correlated neural response patterns. The basis for this form of sound segregation is that responses from pairs of channels that are strongly positively correlated belong to the same stream, while channels that are uncorrelated or anti-correlated belong to different streams. In our framework, we have used a neuromorphic cochlea as a frontend sound analyser to extract spatial information of the sound input, which then passes through band pass filters that extract the sound envelope at various modulation rates. Further stages include feature extraction and mask generation, which is finally used to reconstruct the targeted sound. Using sample tonal and speech mixtures, we show that our FPGA architecture is able to segregate sound sources in real-time. The accuracy of segregation is indicated by the high signal-to-noise ratio (SNR) of the segregated stream (90, 77, and 55 dB for simple tone, complex tone, and speech, respectively) as compared to the SNR of the mixture waveform (0 dB). This system may be easily extended for the segregation of complex speech signals, and may thus find various applications in electronic devices such as for sound segregation and
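
    The temporal-coherence principle at the heart of this record (envelopes of channels driven by the same source are positively correlated, so correlation plus an attention signal selects the target stream) can be sketched with a simple filterbank. A minimal numpy/scipy sketch; the filterbank, modulation rates, and the 0.7 grouping threshold are assumptions, not the FPGA implementation:

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

fs = 16000
t = np.arange(0, 1.0, 1 / fs)
# Two "sources" with different (incoherent) amplitude modulations, mixed together.
target = np.sin(2 * np.pi * 500 * t) * (1 + np.sin(2 * np.pi * 4 * t))
masker = np.sin(2 * np.pi * 2000 * t) * (1 + np.sin(2 * np.pi * 7 * t))
mix = target + masker

# Cochlea-like frontend: a small bank of band-pass channels + envelope extraction.
centers = [400, 500, 600, 1800, 2000, 2200]
envelopes = []
for fc in centers:
    sos = butter(4, [fc * 0.9, fc * 1.1], btype="band", fs=fs, output="sos")
    envelopes.append(np.abs(hilbert(sosfilt(sos, mix))))
E = np.array(envelopes)

# Temporal coherence: channels whose envelopes are strongly positively correlated
# are assigned to the same stream; an "attention" channel picks the target stream.
C = np.corrcoef(E)
attend = 1                                   # attend to the 500 Hz channel
mask = C[attend] > 0.7                       # channels coherent with the target
print("channels grouped with target:", [c for c, m in zip(centers, mask) if m])
```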

  19. How to help teachers' voices.

    Science.gov (United States)

    Saatweber, Margarete

    2008-01-01

    It has been shown that teachers are at high risk of developing occupational dysphonia, and it has been widely accepted that the vocal characteristics of a speaker play an important role in determining the reactions of listeners. The functions of breathing, breathing movement, breathing tonus, voice vibrations and articulation tonus are transmitted to the listener. So we may conclude that listening to the teacher's voice at school influences children's behavior and the perception of spoken language. This paper presents the concept of Schlaffhorst-Andersen including exercises to help teachers improve their voice, breathing, movement and their posture. Copyright 2008 S. Karger AG, Basel.

  20. Towards parameter-free classification of sound effects in movies

    Science.gov (United States)

    Chu, Selina; Narayanan, Shrikanth; Kuo, C.-C. J.

    2005-08-01

    The problem of identifying intense events via multimedia data mining in films is investigated in this work. Movies are mainly characterized by dialog, music, and sound effects. We begin our investigation with detecting interesting events through sound effects. Sound effects are neither speech nor music, but are closely associated with interesting events such as car chases and gun shots. In this work, we utilize low-level audio features, including MFCC and energy, to identify sound effects. It was shown in previous work that the hidden Markov model (HMM) works well for speech/audio signals. However, this technique requires a careful choice in designing the model and choosing the correct parameters. In this work, we introduce a framework that avoids this necessity and works well with semi- and non-parametric learning algorithms.

  1. Efficiency Investigation of Switch Mode Power Amplifier Driving Low Impedance Transducers

    DEFF Research Database (Denmark)

    Iversen, Niels Elkjær; Schneider, Henrik; Knott, Arnold

    2015-01-01

    the amplifier rail voltage requirement as a function of the voice coil nominal resistance is presented. The method is based on a crest factor analysis of music signals and an estimation of the electrical power requirement from a specific target for the sound pressure level. Experimental measurements confirm a huge... performance leap in terms of efficiency compared to a conventional battery-driven sound system. Future optimization of low-voltage, high-current amplifiers for low-impedance loudspeaker drivers is discussed....
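
    The sizing logic the record describes (estimate average electrical power from a target SPL and the driver's sensitivity, then scale the rail voltage by the crest factor of music) can be sketched numerically. A minimal sketch; every figure below (resistance, sensitivity, target SPL, the synthetic "music") is an assumption for illustration, not a value from the paper:

```python
import numpy as np

re_nominal = 3.2          # voice coil nominal resistance [ohm] (assumed)
sensitivity_db = 87.0     # SPL at 1 W / 1 m [dB] (assumed)
target_spl_db = 102.0     # desired average SPL at 1 m [dB] (assumed)

def crest_factor(signal):
    """Peak-to-RMS ratio; music is far more 'peaky' than a sine wave."""
    return np.max(np.abs(signal)) / np.sqrt(np.mean(signal ** 2))

# Stand-in "music": sparse transients over a quieter noise floor.
rng = np.random.default_rng(4)
music = 0.1 * rng.normal(size=48000)
music[rng.integers(0, 48000, 200)] += rng.normal(scale=0.5, size=200)

cf = crest_factor(music)
p_avg = 10 ** ((target_spl_db - sensitivity_db) / 10)   # average electrical power [W]
v_rms = np.sqrt(p_avg * re_nominal)                     # RMS voltage across the coil
v_rail = cf * v_rms                                     # rail must pass the peaks
print(f"crest factor {cf:.1f} ({20 * np.log10(cf):.1f} dB), rail ~ {v_rail:.0f} V")
```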

  2. Musculoskeletal Pain and Occupational Variables in Teachers With Voice Disorders and in Those With Healthy Voices-A Pilot Study.

    Science.gov (United States)

    da Silva Vitor, Jhonatan; Siqueira, Larissa Thaís Donalonso; Ribeiro, Vanessa Veis; Ramos, Janine Santos; Brasolotto, Alcione Ghedini; Silverio, Kelly Cristina Alves

    2017-07-01

    This study aimed to compare musculoskeletal pain perception in teachers with voice disorders and in those with healthy voices, and to investigate the relationship between musculoskeletal pain and occupational variables (i.e., work journey per week and working period). Forty-three classroom teachers were divided into two groups: the dysphonic group (DG), 32 classroom teachers with voice complaints and voice disorders, and the non-DG, 11 classroom teachers without voice complaints and who are vocally healthy. The musculoskeletal pain investigation survey was used to investigate the frequency and intensity of pain. Occupational variables, such as work journey per week and working period, were investigated with the Voice Production Condition-Teacher questionnaire. The statistical tests used were the Spearman correlation (P ≤ 0.05) and the Mann-Whitney U test (P ≤ 0.05). There was no difference in the frequency or intensity of musculoskeletal pain with respect to dysphonia. Work journey per week was positively related to the frequency and intensity of laryngeal pain in the DG. The working period had a negative relationship with the frequency and intensity of musculoskeletal pain in the submandibular region in the DG. Classroom teachers with voice disorders and those with healthy voices do not differ in the frequency and intensity of musculoskeletal pain. Besides dysphonia, pain is an important symptom to be considered in classroom teachers. The occupational variables contributed to the presence of musculoskeletal pain in the region near the larynx, which appears to be directly proportional to work journey per week and inversely proportional to the working period. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  3. Politeness, emotion, and gender: A sociophonetic study of voice pitch modulation

    Science.gov (United States)

    Yuasa, Ikuko

    The present dissertation is a cross-gender and cross-cultural sociophonetic exploration of voice pitch characteristics utilizing speech data derived from Japanese and American speakers in natural conversations. The roles of voice pitch modulation in terms of the concepts of politeness and emotion as they pertain to culture and gender are investigated herein. The research interprets the significance of the findings based on acoustic measurements of speech data presented on the ERB-rate scale (the most appropriate scale for human speech perception). The investigation reveals that the pitch range modulation displayed by Japanese informants in two types of conversations is closely linked to the types of politeness adopted by those informants. The degree of the informants' emotional involvement and expression, reflected in differing pitch range widths, plays an important role in determining the relationship between pitch range modulation and politeness. The study further correlates the Japanese cultural concept of enryo ("self-restraint") with this phenomenon. When median values were examined, male and female pitch ranges across cultures did not differ conspicuously. However, sporadically occurring women's pitch characteristics, which differ culturally in the width and height of pitch ranges, may create an 'emotional' perception of women's speech style. The salience of these pitch characteristics appears to be the source of the stereotype that links the sound of women's speech, identified as 'swoopy' or 'shrill', with being 'emotional'. Such salient voice characteristics in women are interpreted in light of camaraderie/positive politeness: women's use of conspicuous paralinguistic features helps to create an atmosphere of camaraderie, since these voice pitch characteristics emphasize feelings such as concern, support, and comfort towards addressees. Moreover, men's wide pitch ranges are discussed in view

  4. Sound

    CERN Document Server

    Robertson, William C

    2003-01-01

    Muddled about what makes music? Stuck on the study of harmonics? Dumbfounded by how sound gets around? Now you no longer have to struggle to teach concepts you really don't grasp yourself. Sound takes an intentionally light touch to help out all those adults - science teachers, parents wanting to help with homework, home-schoolers seeking the necessary scientific background to teach middle school physics with confidence. The book introduces sound waves and uses that model to explain sound-related occurrences. Starting with the basics of what causes sound and how it travels, you'll learn how musical instruments work, how sound waves add and subtract, how the human ear works, and even why you can sound like a Munchkin when you inhale helium. Sound is the fourth book in the award-winning Stop Faking It! Series, published by NSTA Press. Like the other popular volumes, it is written by irreverent educator Bill Robertson, who offers this Sound recommendation: One of the coolest activities is whacking a spinning metal rod...

  5. A Neural Network Model for Prediction of Sound Quality

    DEFF Research Database (Denmark)

    Nielsen, Lars Bramsløw

    An artificial neural network structure has been specified, implemented and optimized for the purpose of predicting the perceived sound quality for normal-hearing and hearing-impaired subjects. The network was implemented by means of commercially available software and optimized to predict results obtained in subjective sound quality rating experiments based on input data from an auditory model. Various types of input data and data representations from the auditory model were used as input data for the chosen network structure, which was a three-layer perceptron. This network was trained by means of … No simple objective-subjective relationship between the physical signal parameters and the subjectively perceived sound quality was evident from this analysis.
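
    As a rough, self-contained illustration of the architecture described, the sketch below trains a three-layer perceptron regressor with scikit-learn on synthetic stand-ins for the auditory-model features and subjective ratings; the original work used commercial software and real rating-experiment data:

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 16))                 # stand-in auditory-model features
    y = X @ rng.normal(size=16) + rng.normal(scale=0.1, size=200)  # stand-in ratings

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    # One hidden layer -> input, hidden, output: a three-layer perceptron
    net = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
    net.fit(X_tr, y_tr)
    print(f"held-out R^2: {net.score(X_te, y_te):.2f}")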

  6. Obligatory and facultative brain regions for voice-identity recognition.

    Science.gov (United States)

    Roswandowitz, Claudia; Kappes, Claudia; Obrig, Hellmuth; von Kriegstein, Katharina

    2018-01-01

    Recognizing the identity of others by their voice is an important skill for social interactions. To date, it remains controversial which parts of the brain are critical structures for this skill. Based on neuroimaging findings, standard models of person-identity recognition suggest that the right temporal lobe is the hub for voice-identity recognition. Neuropsychological case studies, however, reported selective deficits of voice-identity recognition in patients predominantly with right inferior parietal lobe lesions. Here, our aim was to work towards resolving the discrepancy between neuroimaging studies and neuropsychological case studies to find out which brain structures are critical for voice-identity recognition in humans. We performed a voxel-based lesion-behaviour mapping study in a cohort of patients (n = 58) with unilateral focal brain lesions. The study included a comprehensive behavioural test battery on voice-identity recognition of newly learned (voice-name, voice-face association learning) and familiar voices (famous voice recognition) as well as visual (face-identity recognition) and acoustic control tests (vocal-pitch and vocal-timbre discrimination). The study also comprised clinically established tests (neuropsychological assessment, audiometry) and high-resolution structural brain images. The three key findings were: (i) a strong association between voice-identity recognition performance and right posterior/mid temporal and right inferior parietal lobe lesions; (ii) a selective association between right posterior/mid temporal lobe lesions and voice-identity recognition performance when face-identity recognition performance was factored out; and (iii) an association of right inferior parietal lobe lesions with tasks requiring the association between voices and faces but not voices and names. The results imply that the right posterior/mid temporal lobe is an obligatory structure for voice-identity recognition, while the inferior parietal lobe is a facultative one, recruited when voices must be associated with faces.

  7. Covert Channels in SIP for VoIP Signalling

    Science.gov (United States)

    Mazurczyk, Wojciech; Szczypiorski, Krzysztof

    In this paper, we evaluate available steganographic techniques for SIP (Session Initiation Protocol) that can be used for creating covert channels during the signalling phase of a VoIP (Voice over IP) call. Apart from characterizing existing steganographic methods, we provide new insights by introducing new techniques. We also estimate the amount of data that can be transferred in signalling messages for a typical IP telephony call.
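
    As a toy example of the general idea, covert bits can ride in degrees of freedom that SIP endpoints ignore, such as the letter case of an arbitrary token in a Call-ID header. The sketch below illustrates this generic steganographic trick only; it is not one of the paper's specific techniques:

    import secrets

    LOW, HIGH = "abcdef", "ABCDEF"   # hex letters in two cases

    def embed_bits(bits: str) -> str:
        # One covert bit per character: uppercase hex letter = 1, lowercase = 0.
        return "".join(secrets.choice(HIGH if b == "1" else LOW) for b in bits)

    def extract_bits(token: str) -> str:
        return "".join("1" if c.isupper() else "0" for c in token)

    covert = "10110010"
    call_id = embed_bits(covert)
    print(f"Call-ID: {call_id}@host.invalid")   # looks like an ordinary random token
    print("recovered:", extract_bits(call_id))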

  8. Voice amplification for primary school teachers with voice disorders: A randomized clinical trial

    Directory of Open Access Journals (Sweden)

    Roberto Bovo

    2013-06-01

    Full Text Available Objectives: Several studies have demonstrated a high prevalence of voice disorders in teachers, together with the personal, professional and economic consequences of the problem. Good primary prevention should be based on three aspects: 1) amelioration of classroom acoustics, 2) voice care programs for future professional voice users, including teachers, and 3) classroom or portable amplification systems. The aim of the study was to assess the benefit obtained from the use of portable amplification systems by female primary school teachers in their occupational setting. Materials and Methods: Forty female primary school teachers attended a course about professional voice care, which comprised two theoretical lectures, each 60 min long. Thereafter, they were randomized into 2 groups: the teachers of the first group were asked to use a portable vocal amplifier for 3 months, until the end of the school year. The other 20 teachers were part of the control group, matched for age and years of employment. All subjects had grade 1 dysphonia with no significant organic lesion of the vocal folds. Results: Most teachers of the experimental group used the amplifier consistently for the whole duration of the experiment and found it very useful in reducing the symptoms of vocal fatigue. In fact, after 3 months, Voice Handicap Index (VHI) scores in the "course + amplifier" group demonstrated a significant amelioration (p = 0.003). The perceptual grade of dysphonia also improved significantly (p = 0.0005). The same parameters changed favourably also in the "course only" group, but the results were not statistically significant (p = 0.4 for VHI and p = 0.03 for perceptual grade). Conclusions: In teachers, and particularly in those with a constitutionally weak voice and/or those who are prone to vocal fold pathology, vocal amplifiers may be an effective and low-cost intervention to decrease potentially damaging vocal loads and may represent a necessary form of prevention.

  9. Phase vocoder and beyond

    Directory of Open Access Journals (Sweden)

    Marco Liuni

    2013-08-01

    Full Text Available For a broad range of sound transformations, quality is measured against a common expectation about the result: if a male voice has to be changed into a female one, there exists a common reference for the perceptual evaluation of the result; the same holds if an instrumental sound has to be made longer or shorter. Following the argument in Röbel, "Between Physics and Perception: Signal Models for High Level Audio Processing", a fundamental requirement for these transformation algorithms is their reliance on signal models that are strongly linked to perceptually relevant physical properties of the sound source. This paper is a short survey of the phase vocoder technique, together with its extensions and improvements relying on appropriate sound models, which have led to high-level audio processing algorithms.
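
    A bare-bones time-stretching phase vocoder, the starting point of the survey, can be sketched as follows; this minimal version (no magnitude interpolation, no transient or shape-invariant refinements of the kind the paper discusses) is an illustration, not any specific published implementation:

    import numpy as np

    def stft(x, win, hop):
        # Frame the signal, window each frame, take a one-sided FFT per frame.
        n = len(win)
        frames = [x[i:i + n] * win for i in range(0, len(x) - n, hop)]
        return np.array([np.fft.rfft(f) for f in frames])

    def phase_vocoder(x, rate, n_fft=2048, hop=512):
        # rate < 1 stretches (slower playback), rate > 1 compresses (faster).
        win = np.hanning(n_fft)
        S = stft(x, win, hop)
        omega = 2 * np.pi * hop * np.arange(n_fft // 2 + 1) / n_fft
        phase = np.angle(S[0])
        steps = np.arange(0, len(S) - 1, rate)     # fractional analysis positions
        y = np.zeros(len(steps) * hop + n_fft)
        for k, pos in enumerate(steps):
            i = int(pos)
            mag = np.abs(S[i])                     # magnitude, no interpolation
            dphi = np.angle(S[i + 1]) - np.angle(S[i]) - omega
            dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))   # wrap to [-pi, pi]
            y[k * hop:k * hop + n_fft] += np.fft.irfft(mag * np.exp(1j * phase)) * win
            phase += omega + dphi                  # keep partials phase-coherent
        return y

    The phase accumulation step is what distinguishes the vocoder from naive frame resampling: each bin's phase advances by its expected per-hop rotation plus the measured deviation, so sinusoidal partials stay coherent across synthesis frames.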

  10. Voiced Excitations

    National Research Council Canada - National Science Library

    Holzricher, John

    2004-01-01

    To more easily obtain a voiced excitation function for speech characterization, measurements of skin, tracheal-tube, and vocal-fold motions were made and compared to EM sensor-glottal derived...

  11. Physically based sound synthesis and control of jumping sounds on an elastic trampoline

    DEFF Research Database (Denmark)

    Turchet, Luca; Pugliese, Roberto; Takala, Tapio

    2013-01-01

    This paper describes a system to interactively sonify the foot-floor contacts resulting from jumping on an elastic trampoline. The sonification was achieved by means of a synthesis engine based on physical models reproducing the sounds of jumping on several surface materials. The engine was controlled in real-time by processing the signal captured by a contact microphone which was attached to the membrane of the trampoline in order to detect each jump. A user study was conducted to evaluate the quality of the interactive sonification. Results proved the success of the proposed algorithms...
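
    The control path described, detecting each jump from the contact-microphone signal, can be approximated with a short-time energy threshold. A sketch under assumed parameter values (the abstract does not specify the authors' detector); the synthetic test signal stands in for a microphone recording:

    import numpy as np

    def detect_jumps(x, sr, frame=512, hop=256, k=4.0, refractory=0.25):
        # Flag a jump when frame RMS exceeds k times the median RMS; the
        # refractory period keeps one landing from being reported twice.
        rms = np.array([np.sqrt(np.mean(x[i:i + frame] ** 2))
                        for i in range(0, len(x) - frame, hop)])
        floor = np.median(rms) + 1e-12
        onsets, last = [], -np.inf
        for j, r in enumerate(rms):
            t = j * hop / sr
            if r > k * floor and t - last > refractory:
                onsets.append(t)    # here one would trigger the synthesis engine
                last = t
        return onsets

    sr = 16000
    sig = np.random.default_rng(5).normal(scale=0.01, size=3 * sr)  # sensor noise
    for t0 in (0.5, 1.4, 2.3):                     # three synthetic "jumps"
        i = int(t0 * sr)
        sig[i:i + 800] += np.random.default_rng(6).normal(scale=0.5, size=800)
    print(detect_jumps(sig, sr))                   # approximately [0.5, 1.4, 2.3]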

  12. Voice and Speech Quality Perception Assessment and Evaluation

    CERN Document Server

    Jekosch, Ute

    2005-01-01

    Foundations of Voice and Speech Quality Perception starts out with the fundamental question: "How do listeners perceive voice and speech quality, and how can these processes be modeled?" Any quantitative answer requires measurements. This is natural for physical quantities but harder to imagine for perceptual measurands. This book approaches the problem by identifying major perceptual dimensions of voice and speech quality perception, defining units wherever possible, and offering paradigms to position these dimensions within a structural skeleton of perceptual speech and voice quality. The emphasis is placed on voice and speech quality assessment of systems in artificial scenarios. Many scientific fields are involved. This book bridges the gap between two quite diverse fields, engineering and humanities, and establishes the new research area of Voice and Speech Quality Perception.

  13. Voice Disorders in Occupations with Vocal Load in Slovenia.

    Science.gov (United States)

    Boltežar, Lučka; Šereg Bahar, Maja

    2014-12-01

    The aim of this paper is to compare the prevalence of voice disorders, and the risk factors for them, across occupations with a vocal load in Slovenia. A meta-analysis of six different Slovenian studies involving teachers, physicians, salespeople, Catholic priests, nurses and speech-and-language therapists (SLTs) was performed. All six studies included similar questions about the prevalence of voice disorders and their causes. The comparison of the six studies showed that more than 82% of the 2347 included subjects had voice problems at some time during their career. The teachers were the most affected by voice problems. The most prevalent cause of voice problems was vocal load in teachers and salespeople, and respiratory-tract infections in all the other occupational groups. When the occupational groups were compared, the teachers had more voice problems and showed less care for their voices than the priests, while the physicians had more voice problems and showed better observance of vocal hygiene rules than the SLTs. The majority of all the included subjects did not receive instructions about voice care during their education. In order to decrease the prevalence of voice disorders in vocal professionals, a screening program is recommended before the beginning of their studies. Regular courses on voice care and proper vocal technique should be obligatory for all professional voice users during their career. The inclusion of dysphonia in the list of occupational diseases should be considered in Slovenia, as it is in some European countries.

  14. Artificial intelligence techniques used in respiratory sound analysis--a systematic review.

    Science.gov (United States)

    Palaniappan, Rajkumar; Sundaraj, Kenneth; Sundaraj, Sebastian

    2014-02-01

    Artificial intelligence (AI) has recently been established as an alternative to many conventional analysis methods. The implementation of AI techniques for respiratory sound analysis can assist medical professionals in the diagnosis of lung pathologies. This article highlights the importance of AI techniques in the implementation of computer-based respiratory sound analysis. Articles on computer-based respiratory sound analysis using AI techniques were identified through searches of various electronic resources, such as the IEEE, Springer, Elsevier, PubMed, and ACM digital library databases. Brief descriptions of the types of respiratory sounds and their respective characteristics are provided. We then analyzed each of the previous studies to determine the specific respiratory sounds/pathology analyzed, the number of subjects, the signal processing method used, the AI techniques used, and the performance of the AI technique in the analysis of respiratory sounds. A detailed description of each of these studies is provided. In conclusion, this article provides recommendations for further advancements in respiratory sound analysis.
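
    Most of the pipelines such reviews survey share one shape: extract spectral features from a sound segment, then train a classifier. A generic sketch with synthetic data, standing in for no particular reviewed study (features, labels, and band edges are illustrative assumptions):

    import numpy as np
    from scipy.signal import welch
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    def band_energies(x, sr, bands=((100, 400), (400, 800), (800, 1600))):
        # Crude spectral features: summed PSD energy in a few frequency bands.
        f, pxx = welch(x, fs=sr, nperseg=1024)
        return [pxx[(f >= lo) & (f < hi)].sum() for lo, hi in bands]

    rng = np.random.default_rng(2)
    sr = 8000
    X = np.array([band_energies(rng.normal(size=sr), sr) for _ in range(60)])
    y = rng.integers(0, 2, size=60)              # placeholder labels (e.g. wheeze/no wheeze)
    print(cross_val_score(SVC(), X, y, cv=5))    # chance-level on random data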

  15. A description of externally recorded womb sounds in human subjects during gestation.

    Science.gov (United States)

    Parga, Joanna J; Daland, Robert; Kesavan, Kalpashri; Macey, Paul M; Zeltzer, Lonnie; Harper, Ronald M

    2018-01-01

    Reducing environmental noise benefits premature infants in neonatal intensive care units (NICU), but excessive reduction may lead to sensory deprivation, compromising development. Instead of minimal noise levels, environments that mimic intrauterine soundscapes may facilitate infant development by providing a sound environment reflecting fetal life. This soundscape may support autonomic and emotional development in preterm infants. We aimed to assess the efficacy and feasibility of external non-invasive recordings in pregnant women, endeavoring to capture intra-abdominal or womb sounds during pregnancy with electronic stethoscopes and build a womb sound library to assess sound trends with gestational development. We also compared these sounds to popular commercial womb sounds marketed to new parents. Intra-abdominal sounds from 50 mothers in their second and third trimester (13 to 40 weeks) of pregnancy were recorded for 6 minutes in a quiet clinic room with 4 electronic stethoscopes, placed in the right upper and lower quadrants, and left upper and lower quadrants of the abdomen. These recordings were partitioned into 2-minute intervals in three different positions: standing, sitting and lying supine. Maternal and gestational age, Body Mass Index (BMI) and time since last meal were collected during recordings. Recordings were analyzed using long-term average spectral and waveform analysis, and compared to sounds from non-pregnant abdomens and commercially-marketed womb sounds selected for their availability, popularity, and claims that they mimic the intrauterine environment. Maternal sounds shared certain common characteristics, but varied with gestational age. With fetal development, the maternal abdomen filtered high (500-5,000 Hz) and mid-frequency (100-500 Hz) energy bands, but no change appeared in contributions from low-frequency signals (10-100 Hz) with gestational age. Variation appeared between mothers, suggesting a resonant chamber role for intra…
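
    The long-term average spectral analysis described can be approximated with a Welch power-spectral-density estimate summed over the three frequency bands the authors report; the input below is synthetic noise standing in for a womb recording, and the sample rate and FFT length are illustrative assumptions:

    import numpy as np
    from scipy.signal import welch

    sr = 16000
    x = np.random.default_rng(3).normal(size=2 * 60 * sr)   # 2-minute stand-in signal
    f, pxx = welch(x, fs=sr, nperseg=8192)                   # long-term average spectrum

    # Energy in the low, mid, and high bands named in the abstract
    for name, lo, hi in (("low", 10, 100), ("mid", 100, 500), ("high", 500, 5000)):
        band = pxx[(f >= lo) & (f < hi)].sum()
        print(f"{name} band ({lo}-{hi} Hz): {10 * np.log10(band):.1f} dB (rel.)")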

  16. Voice, Schooling, Inequality, and Scale

    Science.gov (United States)

    Collins, James

    2013-01-01

    The rich studies in this collection show that the investigation of voice requires analysis of "recognition" across layered spatial-temporal and sociolinguistic scales. I argue that the concepts of voice, recognition, and scale provide insight into contemporary educational inequality and that their study benefits, in turn, from paying attention to…

  17. Categorization of common sounds by cochlear implanted and normal hearing adults.

    Science.gov (United States)

    Collett, E; Marx, M; Gaillard, P; Roby, B; Fraysse, B; Deguine, O; Barone, P

    2016-05-01

    Auditory categorization involves the grouping of acoustic events along one or more shared perceptual dimensions, which can relate to both semantic and physical attributes. This process involves both high-level cognitive processes (categorization) and low-level perceptual encoding of the acoustic signal, both of which are affected by the use of a cochlear implant (CI) device. The goal of this study was twofold: (i) to compare the categorization strategies of CI users and normal-hearing listeners (NHL); (ii) to investigate whether any characteristics of the raw acoustic signal could explain the results. Sixteen experienced CI users and 20 NHL were tested using a Free-Sorting Task of 16 common sounds divided into 3 predefined categories of environmental, musical and vocal sounds. Multiple Correspondence Analysis (MCA) and Hierarchical Clustering based on Principal Components (HCPC) show that CI users followed a similar categorization strategy to that of NHL and were able to discriminate between the three different types of sounds. However, results for CI users were more varied and showed less inter-participant agreement. Acoustic analysis also highlighted the average pitch salience and average autocorrelation peak as being important for the perception and categorization of the sounds. The results therefore show that, at a broad level of categorization, CI users may not have as many difficulties as previously thought in discriminating certain kinds of sound; however, the perception of individual sounds remains challenging.
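
    The two acoustic cues highlighted can be estimated in a few lines: the height of the largest non-zero-lag peak of the normalized autocorrelation is a common pitch-salience proxy, and its lag gives a period estimate. A sketch with an illustrative test tone (parameter choices are assumptions, not the study's settings):

    import numpy as np

    def autocorr_peak(x, sr, fmin=60.0, fmax=500.0):
        # Largest non-zero-lag autocorrelation peak (pitch-salience proxy)
        # and the fundamental frequency implied by its lag.
        x = x - x.mean()
        ac = np.correlate(x, x, mode="full")[len(x) - 1:]
        ac = ac / (ac[0] + 1e-12)                # normalize so lag 0 == 1
        lo, hi = int(sr / fmax), int(sr / fmin)  # plausible pitch-period lags
        lag = lo + int(np.argmax(ac[lo:hi]))
        return ac[lag], sr / lag

    sr = 8000
    t = np.arange(sr // 2) / sr                  # 0.5 s test signal
    tone = np.sin(2 * np.pi * 220 * t)           # voiced-like periodic sound
    salience, f0 = autocorr_peak(tone, sr)
    print(f"peak height {salience:.2f}, f0 about {f0:.0f} Hz")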

  18. Voice restoration following total laryngectomy by tracheoesophageal prosthesis: Effect on patients' quality of life and voice handicap in Jordan

    Directory of Open Access Journals (Sweden)

    Wreikat Mahmoud M

    2008-03-01

    Full Text Available Background: Little has been reported about the impact of tracheoesophageal (TE) speech on individuals in the Middle East, where the procedure has been gaining in popularity. After total laryngectomy, individuals in Europe and North America have rated their quality of life as being lower than that of non-laryngectomized individuals. The purpose of this study was to evaluate changes in quality of life and degree of voice handicap reported by laryngectomized speakers from Jordan before and after establishment of TE speech. Methods: Twelve male Jordanian laryngectomees completed the University of Michigan Head & Neck Quality of Life instrument and the Voice Handicap Index pre- and post-TE puncture. Results: All subjects showed significant improvements in their quality of life following successful prosthetic voice restoration. In addition, voice handicap scores were significantly reduced from pre- to post-TE puncture. Conclusion: Tracheoesophageal speech significantly improved the quality of life and limited the voice handicap imposed by total laryngectomy. This method of voice restoration has been used for a number of years in other countries and now appears to be a viable alternative within Jordan.

  19. Self-Reported Acute and Chronic Voice Disorders in Teachers.

    Science.gov (United States)

    Rossi-Barbosa, Luiza Augusta Rosa; Barbosa, Mirna Rossi; Morais, Renata Martins; de Sousa, Kamilla Ferreira; Silveira, Marise Fagundes; Gama, Ana Cristina Côrtes; Caldeira, Antônio Prates

    2016-11-01

    The present study aimed to identify factors associated with self-reported acute and chronic voice disorders among municipal elementary school teachers in the city of Montes Claros, in the State of Minas Gerais, Brazil. The dependent variable, self-reported dysphonia, was determined via a single question, "Have you noticed changes in your voice quality?"; if so, a follow-up question queried the duration of this change, acute or chronic. The independent variables were dichotomized and divided into five categories: sociodemographic and economic data; lifestyle; organizational and environmental data; health-disease processes; and voice. Analyses of associated factors were performed via a hierarchical multiple logistic regression model. The present study included 226 teachers, of whom 38.9% reported no voice disorders, 35.4% reported an acute disorder, and 25.7% reported a chronic disorder. Excessive daily voice use, consuming more than one alcoholic drink per occasion, and seeking medical treatment because of voice disorders were factors associated with both acute and chronic voice disorders. Consuming up to three glasses of water per day was associated with acute voice disorders. Among teachers who reported chronic voice disorders, teaching for over 15 years and the perception of disturbing or unbearable noise outside the school were both associated factors. Identification of organizational, environmental, and predisposing risk factors for voice disorders is critical; a vocal health promotion program may address these issues.
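
    The hierarchical (blockwise) logistic regression strategy named here enters groups of variables in stages and tracks model fit at each step. A sketch with synthetic placeholders for the variable categories (three of the five shown, data made up), not the study's actual model:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    n = 226
    blocks = {
        "sociodemographic": rng.normal(size=(n, 2)),
        "lifestyle":        rng.normal(size=(n, 2)),
        "occupational":     rng.normal(size=(n, 2)),
    }
    y = rng.integers(0, 2, size=n)           # placeholder outcome: dysphonia yes/no

    X = np.ones((n, 1))                      # start from an intercept-only model
    for name, block in blocks.items():       # enter one variable block per step
        X = np.hstack([X, block])
        fit = sm.Logit(y, X).fit(disp=False)
        print(f"+ {name}: pseudo-R2 = {fit.prsquared:.3f}")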

  20. Signalling design and architecture for a proposed mobile satellite network

    Science.gov (United States)

    Yan, T.-Y.; Cheng, U.; Wang, C.

    1990-01-01

    In a frequency-division/demand-assigned multiple-access (FD/DAMA) architecture, each mobile subscriber must make a connection request to the Network Management Center before transmission for either open-end or closed-end services. Open-end services are for voice calls and long file transfers and are processed on a blocked-call-cleared basis. Closed-end services are for transmitting burst data and are processed on a first-come, first-served basis. This paper presents the signalling design and architecture for the non-voice services of an FD/DAMA mobile satellite network. Connection requests are made through the recently proposed multiple-channel collision resolution scheme, which provides a significantly higher throughput than the traditional slotted ALOHA scheme. For non-voice services, it is well known that retransmissions are necessary to ensure the delivery of a message in its entirety from source to destination. Retransmission protocols for open-end and closed-end data transfer are investigated. The signal structure for the proposed network is derived from the X.25 standard with appropriate modifications. The packet types and their usages are described in this paper.
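
    For context, the slotted ALOHA baseline that the proposed scheme is compared against has the textbook throughput S = G * exp(-G) for offered load G, peaking at 1/e (about 0.368 successful packets per slot) at G = 1; a quick worked check:

    import math

    def slotted_aloha_throughput(G: float) -> float:
        # Textbook slotted ALOHA: a slot succeeds if exactly one station
        # transmits, giving S = G * e^(-G) under a Poisson load model.
        return G * math.exp(-G)

    for G in (0.25, 0.5, 1.0, 2.0):
        print(f"G={G:.2f}: S={slotted_aloha_throughput(G):.3f}")
    print(f"max at G=1: {slotted_aloha_throughput(1.0):.3f} (= 1/e)")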