WorldWideScience

Sample records for ac-3 audio system

  1. 3D Audio System

    Science.gov (United States)

    1992-01-01

    Ames Research Center research into virtual reality led to the development of the Convolvotron, a high-speed digital audio processing system that delivers three-dimensional sound over headphones. It consists of a two-card set designed for use with a personal computer. The Convolvotron's primary application is the presentation of 3D audio signals over headphones. Four independent sound sources are filtered with large time-varying filters that compensate for motion, so the perceived location of each sound remains constant. Possible applications include air traffic control towers, airplane cockpits, hearing and perception research, and virtual reality development.
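
    As a rough illustration of the kind of processing described above, the sketch below convolves a mono source with a left/right head-related impulse response (HRIR) pair to produce a binaural headphone signal. It is a minimal sketch, not the Convolvotron's actual algorithm: the HRIRs here are random placeholders, and a real system would interpolate measured HRIRs and update the filters as the listener's head moves.

```python
import numpy as np
from scipy.signal import fftconvolve

def render_binaural(mono, hrir_left, hrir_right):
    """Convolve a mono source with an HRIR pair; return an (N, 2) stereo array."""
    left = fftconvolve(mono, hrir_left, mode="full")
    right = fftconvolve(mono, hrir_right, mode="full")
    n = max(len(left), len(right))
    out = np.zeros((n, 2))
    out[:len(left), 0] = left
    out[:len(right), 1] = right
    return out

# Example with dummy data: 1 s of noise and 128-tap placeholder HRIRs.
fs = 44100
source = np.random.randn(fs)
hrir_l = np.random.randn(128) * np.hanning(128)
hrir_r = np.random.randn(128) * np.hanning(128)
stereo = render_binaural(source, hrir_l, hrir_r)
print(stereo.shape)
```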

  2. Survey of error concealment schemes for real-time audio transmission systems

    OpenAIRE

    Robles Moya, Aránzazu

    2012-01-01

    This thesis presents an overview of the main strategies employed for error detection and error concealment in different real-time transmission systems for digital audio. The “Adaptive Differential Pulse-Code Modulation (ADPCM)”, the “Audio Processing Technology Apt-x100”, the “Extended Adaptive Multi-Rate Wideband (AMR-WB+)”, the “Advanced Audio Coding (AAC)”, the “MPEG-1 Audio Layer II (MP2)”, the “MPEG-1 Audio Layer III (MP3)” and finally the “Adaptive Transform Coder 3 (AC3)” are considere...

  3. A listening test system for automotive audio

    DEFF Research Database (Denmark)

    Christensen, Flemming; Martin, Geoff; Minnaar, Pauli

    2005-01-01

    This paper describes a system for simulating automotive audio through headphones for the purposes of conducting listening experiments in the laboratory. The system is based on binaural technology and consists of a component for reproducing the sound of the audio system itself and a component...

  4. Augmenting Environmental Interaction in Audio Feedback Systems

    Directory of Open Access Journals (Sweden)

    Seunghun Kim

    2016-04-01

    Audio feedback is defined as positive feedback of acoustic signals, where an audio input and output form a loop, and it may be utilized artistically. This article presents new context-based controls over audio feedback, leading to the generation of desired sonic behaviors by enriching the influence of existing acoustic information such as room response and ambient noise. This ecological approach to audio feedback emphasizes mutual sonic interaction between signal processing and the acoustic environment. Mappings from analyses of the received signal to signal-processing parameters are designed to emphasize this specificity as an aesthetic goal. Our feedback system presents four types of mappings: approximate analyses of room reverberation to tempo-scale characteristics, ambient noise to amplitude, and two different approximations of resonances to timbre. These mappings are validated computationally and evaluated experimentally in different acoustic conditions.
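
    For illustration only, the toy sketch below shows one plausible form of such a mapping: the ambient noise level of an incoming frame is estimated and mapped to the feedback gain, so the loop reacts to its acoustic environment. The mapping constants and the gain law are invented, not taken from the article.

```python
import numpy as np

def noise_to_gain(frame, floor_db=-60.0, ceil_db=-20.0):
    """Map the RMS level of an input frame to a feedback gain in [0, 1]."""
    rms_db = 20 * np.log10(np.sqrt(np.mean(frame ** 2)) + 1e-12)
    gain = (ceil_db - rms_db) / (ceil_db - floor_db)  # louder ambience -> lower gain
    return float(np.clip(gain, 0.0, 1.0))

quiet = 0.001 * np.random.randn(1024)
loud = 0.2 * np.random.randn(1024)
print("gain in quiet room: %.2f, in loud room: %.2f"
      % (noise_to_gain(quiet), noise_to_gain(loud)))
```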

  5. Personalized Audio Systems - a Bayesian Approach

    DEFF Research Database (Denmark)

    Nielsen, Jens Brehm; Jensen, Bjørn Sand; Hansen, Toke Jansen

    2013-01-01

    Modern audio systems are typically equipped with several user-adjustable parameters unfamiliar to most users listening to the system. To obtain the best possible setting, the user is forced into multi-parameter optimization with respect to the user's own objective and preference. To address this......, the present paper presents a general interactive framework for personalization of such audio systems. The framework builds on Bayesian Gaussian process regression in which a model of the user's objective function is updated sequentially. The parameter setting to be evaluated in a given trial is selected......
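
    The hedged sketch below illustrates the general idea of sequential Bayesian personalization with a Gaussian process surrogate: fit a GP to the ratings gathered so far and pick the next setting to evaluate with a simple explore/exploit rule. The single parameter, the synthetic user_rating() function, and the upper-confidence selection rule are illustrative assumptions, not the authors' actual model or experimental setup.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def user_rating(x):
    """Hypothetical listener feedback for a parameter value in [0, 1]."""
    return np.exp(-((x - 0.3) ** 2) / 0.02) + 0.05 * np.random.randn()

candidates = np.linspace(0, 1, 101).reshape(-1, 1)
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-2)

X, y = [], []
x_next = np.array([[0.5]])                           # arbitrary first trial
for trial in range(15):
    X.append(x_next.ravel())
    y.append(user_rating(x_next.item()))
    gp.fit(np.array(X), np.array(y))
    mu, sd = gp.predict(candidates, return_std=True)
    x_next = candidates[[np.argmax(mu + 1.0 * sd)]]  # explore/exploit selection

best = candidates[np.argmax(gp.predict(candidates))]
print("personalized setting: %.2f" % best.item())
```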

  6. Non Audio-Video gesture recognition system

    DEFF Research Database (Denmark)

    Craciunescu, Razvan; Mihovska, Albena Dimitrova; Kyriazakos, Sofoklis

    2016-01-01

    recognition from the face and hand gesture recognition. Gesture recognition enables humans to communicate with the machine and interact naturally without any mechanical devices. This paper investigates the possibility of using non-audio/video sensors in order to design a low-cost gesture recognition device...... that can be connected to any computer on the market. The paper proposes an equation that relates distance and voltage for the Sharp GP2Y0A21 and GP2D120 sensors in the situation where a hand is used as the reflective object. In the end, the presented system is compared with other audio/video system...

  7. Efficient audio signal processing for embedded systems

    Science.gov (United States)

    Chiu, Leung Kin

    As mobile platforms continue to pack on more computational power, electronics manufacturers start to differentiate their products by enhancing the audio features. However, consumers also demand smaller devices that can operate for a longer time, hence imposing design constraints. In this research, we investigate two design strategies that would allow us to efficiently process audio signals on embedded systems such as mobile phones and portable electronics. In the first strategy, we exploit properties of the human auditory system to process audio signals. We designed a sound enhancement algorithm to make piezoelectric loudspeakers sound "richer" and "fuller." Piezoelectric speakers have a small form factor but exhibit poor response in the low-frequency region. In the algorithm, we combine psychoacoustic bass extension and dynamic range compression to improve the perceived bass coming out of the tiny speakers. We also developed an audio energy reduction algorithm for loudspeaker power management. The perceptually transparent algorithm extends the battery life of mobile devices and prevents thermal damage in speakers. This method is similar to audio compression algorithms, which encode audio signals in such a way that the compression artifacts are not easily perceivable. Instead of reducing the storage space, however, we suppress the audio content that is below the hearing threshold, thereby reducing the signal energy. In the second strategy, we use low-power analog circuits to process the signal before digitizing it. We designed an analog front-end for sound detection and implemented it on a field programmable analog array (FPAA). The system is an example of an analog-to-information converter. The sound classifier front-end can be used in a wide range of applications because programmable floating-gate transistors are employed to store classifier weights. Moreover, we incorporated a feature selection algorithm to simplify the analog front-end. A machine
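
    As an aside, the snippet below sketches a plain static dynamic range compressor of the kind mentioned above (a hard-knee gain curve applied sample by sample). The threshold and ratio are arbitrary example values, not the parameters used in the thesis, and the psychoacoustic bass-extension stage is omitted.

```python
import numpy as np

def compress(x, threshold_db=-20.0, ratio=4.0):
    """Apply a hard-knee static gain curve sample by sample."""
    eps = 1e-12
    level_db = 20 * np.log10(np.abs(x) + eps)
    over = np.maximum(level_db - threshold_db, 0.0)
    gain_db = -over * (1.0 - 1.0 / ratio)
    return x * 10 ** (gain_db / 20.0)

fs = 44100
t = np.arange(fs) / fs
sig = 0.8 * np.sin(2 * np.pi * 100 * t)   # loud low-frequency tone
print("peak before: %.2f, after: %.2f"
      % (np.max(np.abs(sig)), np.max(np.abs(compress(sig)))))
```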

  8. Hierarchical system for content-based audio classification and retrieval

    Science.gov (United States)

    Zhang, Tong; Kuo, C.-C. Jay

    1998-10-01

    A hierarchical system for audio classification and retrieval based on audio content analysis is presented in this paper. The system consists of three stages. The audio recordings are first classified and segmented into speech, music, several types of environmental sounds, and silence, based on morphological and statistical analysis of temporal curves of the energy function, the average zero-crossing rate, and the fundamental frequency of audio signals. The first stage is called the coarse-level audio classification and segmentation. Then, environmental sounds are classified into finer classes such as applause, rain, birds' sounds, etc., which is called the fine-level audio classification. The second stage is based on time-frequency analysis of audio signals and the use of the hidden Markov model (HMM) for classification. In the third stage, query-by-example audio retrieval is implemented, where similar sounds can be found according to the input sample audio. The way of modeling audio features with the hidden Markov model, the procedures of audio classification and retrieval, and the experimental results are described. It is shown that, with the proposed new system, audio recordings can be automatically segmented and classified into basic types in real time with an accuracy higher than 90%. Examples of audio fine classification and audio retrieval with the proposed HMM-based method are also provided.
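
    A minimal sketch of the kind of short-time features the coarse-level stage relies on (frame energy and average zero-crossing rate) is given below; the frame and hop sizes are arbitrary illustrative choices, and the actual system additionally uses the fundamental frequency.

```python
import numpy as np

def short_time_features(x, frame=1024, hop=512):
    """Return per-frame energy and zero-crossing rate."""
    energies, zcrs = [], []
    for start in range(0, len(x) - frame, hop):
        f = x[start:start + frame]
        energies.append(np.mean(f ** 2))
        zcrs.append(np.mean(np.abs(np.diff(np.sign(f)))) / 2.0)
    return np.array(energies), np.array(zcrs)

fs = 16000
noise = np.random.randn(fs)                            # noise-like: high ZCR
tone = np.sin(2 * np.pi * 220 * np.arange(fs) / fs)    # tonal: low ZCR
for name, sig in [("noise", noise), ("tone", tone)]:
    e, z = short_time_features(sig)
    print(name, "mean energy %.3f" % e.mean(), "mean ZCR %.3f" % z.mean())
```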

  9. Personalized Audio Systems - a Bayesian Approach

    DEFF Research Database (Denmark)

    Nielsen, Jens Brehm; Jensen, Bjørn Sand; Hansen, Toke Jansen

    2013-01-01

    , the present paper presents a general interactive framework for personalization of such audio systems. The framework builds on Bayesian Gaussian process regression in which a model of the user's objective function is updated sequentially. The parameter setting to be evaluated in a given trial is selected...... are optimized using the proposed framework. Twelve test subjects obtain a personalized setting with the framework, and these settings are significantly preferred to those obtained with random experimentation....

  10. Audio Books in the Nigerian Higher Educational System: To be ...

    African Journals Online (AJOL)

    This study discusses audio books from the point of view of an innovation. It discusses the advantages and disadvantages of audio books. It examined students' familiarity with audio books and their perception of its introduction into the school system. It was found that Nigerian students are already familiar ...

  11. 3D Audio Acquisition and Reproduction Systems

    OpenAIRE

    Evrard, Marc; André, Cédric; Embrechts, Jean-Jacques; Verly, Jacques

    2011-01-01

    This presentation introduces two different research projects dealing with 3D audio for 3D-stereoscopic movies. The first project “3D audio acquisition for real time applications” studies the best method for acquiring a full 3D audio soundscape on location and for processing it in real-time for further reproduction. The second project “Adding 3D sound to 3D cinema” is aimed towards the study of reproducing a 3D soundscape consistent with the visual content of a 3D-stereoscopic movie. ...

  12. Differences in Human Audio Localization Performance between a HRTF- and a non-HRTF Audio System

    DEFF Research Database (Denmark)

    Larsen, Camilla Horne; Lauritsen, David Skødt; Larsen, Jacob Junker

    2013-01-01

    Spatial audio solutions have been around for a long time in real-time applications, but yielding spatial cues that more closely simulate real-life accuracy has been a computational issue and has often been solved by hardware solutions. This has long been a restriction, but now with more powerful...... computers this is becoming a lesser and lesser concern and software solutions are now applicable. Most current virtual environment applications do not take advantage of these implementations of accurate spatial cues, however. This paper compares a common implementation of spatial audio and a head-related transfer function (HRTF) system implementation in a study in relation to precision, speed, and navigational performance in localizing audio sources in a virtual environment. We found that a system using HRTFs is significantly better at all three performance tasks than a system using panning....
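
    For context, the non-HRTF baseline referred to above is typically plain amplitude panning; the sketch below shows a constant-power stereo panner under that assumption (the paper does not specify the exact panning law used).

```python
import numpy as np

def constant_power_pan(mono, azimuth_deg):
    """azimuth_deg in [-90, 90] (left to right); returns an (N, 2) stereo array."""
    theta = (azimuth_deg + 90.0) / 180.0 * (np.pi / 2.0)
    gains = np.array([np.cos(theta), np.sin(theta)])   # left, right gains
    return np.outer(mono, gains)

signal = np.random.randn(44100)
stereo = constant_power_pan(signal, azimuth_deg=30.0)  # source 30 degrees to the right
print(stereo.shape, stereo[:, 0].std(), stereo[:, 1].std())
```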

  13. [Development of Audio Indicator System for Respiratory Dynamic CT Imaging].

    Science.gov (United States)

    Muramatsu, Shun; Moriya, Hiroshi; Tsukagoshi, Shinsuke; Yamada, Norikazu

    We created a device that can convey a radiological technologist's voice to a subject during CT scanning. Dynamic respiratory CT was performed for 149 lung cancer cases; 92 cases were scanned using this device and the others without it. The respiratory cycle and respiratory amplitude were analyzed from the lung density. A stable respiratory cycle was obtained by using the audio indicator system. The audio indicator system is useful for respiratory dynamic CT.

  14. Aurally Aided Visual Search Performance Comparing Virtual Audio Systems

    DEFF Research Database (Denmark)

    Larsen, Camilla Horne; Lauritsen, David Skødt; Larsen, Jacob Junker

    2014-01-01

    Due to increased computational power, reproducing binaural hearing in real-time applications through the use of head-related transfer functions (HRTFs) is now possible. This paper addresses the differences in aurally-aided visual search performance between an HRTF-enhanced audio system (3D...... with white dots. The results indicate that 3D audio yields faster search latencies than panning audio, especially with larger numbers of distractors. The applications of this research could fit virtual environments such as video games or virtual simulations....

  15. Aurally Aided Visual Search Performance Comparing Virtual Audio Systems

    DEFF Research Database (Denmark)

    Larsen, Camilla Horne; Lauritsen, David Skødt; Larsen, Jacob Junker

    2014-01-01

    Due to increased computational power, reproducing binaural hearing in real-time applications through the use of head-related transfer functions (HRTFs) is now possible. This paper addresses the differences in aurally-aided visual search performance between an HRTF-enhanced audio system (3D...... with white dots. The results indicate that 3D audio yields faster search latencies than panning audio, especially with larger numbers of distractors. The applications of this research could fit virtual environments such as video games or virtual simulations....

  16. Subjective and Objective Assessment of Perceived Audio Quality of Current Digital Audio Broadcasting Systems and Web-Casting Applications

    NARCIS (Netherlands)

    Pocta, P.; Beerends, J.G.

    2015-01-01

    This paper investigates the impact of different audio codecs typically deployed in current digital audio broadcasting (DAB) systems and web-casting applications, which represent a main source of quality impairment in these systems and applications, on the quality perceived by the end user. Both

  17. Perceived Audio Quality Analysis in Digital Audio Broadcasting Plus System Based on PEAQ

    Directory of Open Access Journals (Sweden)

    K. Ulovec

    2018-04-01

    Broadcasters need to decide on the bitrates of the services in a multiplex transmitted via the Digital Audio Broadcasting Plus system. The bitrate should be set as low as possible to allow a maximal number of services, but the quality should remain high, not lower than in conventional analog systems. In this paper, the objective method Perceptual Evaluation of Audio Quality is used to analyze the perceived audio quality for the relevant codecs: MP2 and AAC, the latter offering three profiles. The main aim is to determine how the quality depends on the type of signal (music or speech), the number of channels (stereo or mono), and the bitrate. The results indicate that only the MP2 codec and the AAC Low Complexity profile reach imperceptible quality loss. The MP2 codec needs a higher bitrate than the AAC Low Complexity profile for the same quality. For both versions of the AAC High-Efficiency profile, limit bitrates are determined above which the less complex profiles outperform the more complex ones, so bitrates above these limits are not worth using. It is shown that stereo music generally has worse quality than stereo speech, whereas for mono the dependence varies with the codec/profile. Furthermore, the numbers of services satisfying various quality criteria are presented.

  18. Music Identification System Using MPEG-7 Audio Signature Descriptors

    Science.gov (United States)

    You, Shingchern D.; Chen, Wei-Hwa; Chen, Woei-Kae

    2013-01-01

    This paper describes a multiresolution system based on MPEG-7 audio signature descriptors for music identification. Such an identification system may be used to detect illegally copied music circulated over the Internet. In the proposed system, low-resolution descriptors are used to search likely candidates, and then full-resolution descriptors are used to identify the unknown (query) audio. With this arrangement, the proposed system achieves both high speed and high accuracy. To deal with the problem that a piece of query audio may not be inside the system's database, we suggest two different methods to find the decision threshold. Simulation results show that the proposed method II can achieve an accuracy of 99.4% for query inputs both inside and outside the database. Overall, it is highly possible to use the proposed system for copyright control. PMID:23533359

  19. Music Identification System Using MPEG-7 Audio Signature Descriptors

    Directory of Open Access Journals (Sweden)

    Shingchern D. You

    2013-01-01

    This paper describes a multiresolution system based on MPEG-7 audio signature descriptors for music identification. Such an identification system may be used to detect illegally copied music circulated over the Internet. In the proposed system, low-resolution descriptors are used to search for likely candidates, and then full-resolution descriptors are used to identify the unknown (query) audio. With this arrangement, the proposed system achieves both high speed and high accuracy. To deal with the problem that a piece of query audio may not be inside the system's database, we suggest two different methods to find the decision threshold. Simulation results show that the proposed method II can achieve an accuracy of 99.4% for query inputs both inside and outside the database. Overall, it is highly feasible to use the proposed system for copyright control.
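
    The hedged sketch below illustrates the two-stage search idea in miniature: coarse (low-resolution) fingerprints prune the database, full-resolution fingerprints confirm the match, and a distance threshold decides whether the query is in the database at all. The toy spectral fingerprint and the threshold value are invented stand-ins for the MPEG-7 audio signature descriptors and the paper's threshold-selection methods.

```python
import numpy as np

def toy_fingerprint(x, bands):
    """Toy spectral fingerprint: mean magnitude in `bands` frequency bands."""
    spec = np.abs(np.fft.rfft(x))
    edges = np.linspace(0, len(spec), bands + 1, dtype=int)
    return np.array([spec[a:b].mean() for a, b in zip(edges[:-1], edges[1:])])

def identify(query, database, threshold):
    # Stage 1: coarse (low-resolution) fingerprints prune the database.
    coarse_q = toy_fingerprint(query, bands=8)
    candidates = sorted(
        database,
        key=lambda item: np.linalg.norm(toy_fingerprint(item["audio"], 8) - coarse_q),
    )[:3]
    # Stage 2: full-resolution comparison plus threshold decision.
    fine_q = toy_fingerprint(query, bands=64)
    best = min(candidates,
               key=lambda item: np.linalg.norm(toy_fingerprint(item["audio"], 64) - fine_q))
    dist = np.linalg.norm(toy_fingerprint(best["audio"], 64) - fine_q)
    return best["title"] if dist < threshold else None   # None = "not in database"

rng = np.random.default_rng(1)
db = [{"title": f"track{i}", "audio": rng.standard_normal(4096)} for i in range(10)]
query = db[3]["audio"] + 0.01 * rng.standard_normal(4096)   # slightly degraded copy
print(identify(query, db, threshold=5.0))
```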

  20. Robustness evaluation of transactional audio watermarking systems

    Science.gov (United States)

    Neubauer, Christian; Steinebach, Martin; Siebenhaar, Frank; Pickel, Joerg

    2003-06-01

    Distribution via the Internet is of increasing importance. Easy access, transmission, and consumption of digitally represented music is very attractive to the consumer but has also led directly to a growing problem of illegal copying. To cope with this problem, watermarking is a promising concept, since it provides a useful mechanism to track illicit copies by persistently attaching property-rights information to the material. For online music distribution in particular, the use of so-called transaction watermarking, also known as bitstream watermarking, is beneficial, since it offers the opportunity to embed watermarks directly into perceptually encoded material without the need for full decompression/compression. Earlier publications presented the concept of bitstream watermarking along with its complexity, audio quality, and detection performance. These results are now extended by an assessment of the robustness of such schemes. The detection performance before and after applying selected attacks is presented for MPEG-1/2 Layer 3 (MP3) and MPEG-2/4 AAC bitstream watermarking, contrasted with the performance of PCM spread spectrum watermarking.

  1. Integrated Spacesuit Audio System Enhances Speech Quality and Reduces Noise

    Science.gov (United States)

    Huang, Yiteng Arden; Chen, Jingdong; Chen, Shaoyan Sharyl

    2009-01-01

    A new approach has been proposed for improving astronaut comfort and speech capture. Currently, the special design of a spacesuit creates an extreme acoustic environment, making it difficult to capture clear speech without compromising comfort. The proposed Integrated Spacesuit Audio (ISA) system incorporates the microphones into the helmet and uses software to extract voice signals from background noise.

  2. Audio-Visual Perception System for a Humanoid Robotic Head

    Science.gov (United States)

    Viciana-Abad, Raquel; Marfil, Rebeca; Perez-Lorenzo, Jose M.; Bandera, Juan P.; Romero-Garces, Adrian; Reche-Lopez, Pedro

    2014-01-01

    One of the main issues within the field of social robotics is to endow robots with the ability to direct attention to people with whom they are interacting. Different approaches follow bio-inspired mechanisms, merging audio and visual cues to localize a person using multiple sensors. However, most of these fusion mechanisms have been used in fixed systems, such as those used in video-conference rooms, and thus they may incur difficulties when constrained to the sensors with which a robot can be equipped. Besides, within the scope of interactive autonomous robots, the benefits of audio-visual attention mechanisms, compared with audio-only or visual-only approaches, have rarely been evaluated in real scenarios: most of the tests conducted have been within controlled environments, at short distances, and/or with off-line performance measurements. With the goal of demonstrating the benefit of fusing sensory information with Bayes inference for interactive robotics, this paper presents a system for localizing a person by processing visual and audio data. Moreover, the performance of this system is evaluated and compared by considering the technical limitations of unimodal systems. The experiments show the promise of the proposed approach for the proactive detection and tracking of speakers in a human-robot interactive framework. PMID:24878593
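
    To make the fusion step concrete, the toy sketch below combines an audio likelihood and a visual likelihood over a grid of candidate azimuths with Bayes' rule; the Gaussian likelihoods are made-up stand-ins for the real audio and visual detectors, and the actual system also handles tracking over time.

```python
import numpy as np

azimuths = np.linspace(-90, 90, 181)                 # candidate directions (degrees)
prior = np.ones_like(azimuths) / len(azimuths)

def likelihood(measured_deg, sigma_deg):
    """Made-up Gaussian likelihood standing in for a real detector."""
    return np.exp(-0.5 * ((azimuths - measured_deg) / sigma_deg) ** 2)

audio_lik = likelihood(measured_deg=22.0, sigma_deg=15.0)   # broad audio cue
visual_lik = likelihood(measured_deg=18.0, sigma_deg=4.0)   # sharper visual cue

posterior = prior * audio_lik * visual_lik
posterior /= posterior.sum()
print("fused estimate: %.1f degrees" % azimuths[np.argmax(posterior)])
```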

  3. A listening test system for automotive audio - listeners

    DEFF Research Database (Denmark)

    Choisel, Sylvain; Hegarty, Patrick; Christensen, Flemming

    2007-01-01

    A series of experiments was conducted in order to validate an experimental procedure to perform listening tests on car audio systems in a simulation of the car environment in a laboratory, using binaural synthesis with head-tracking. Seven experts and 40 non-expert listeners rated a range of stim...

  4. A listening test system for automotive audio

    DEFF Research Database (Denmark)

    Bech, Søren; Gulbol, Mehmet-Ali; Martin, Geoff

    2005-01-01

    This paper describes two listening tests that were performed to provide initial validation of an auralisation system (see Part 1) to mimic the acoustics of a car interior. The validation is based on a comparison of results from an in-car listening test and another test using the auralisation syst...

  5. Prototype of speech translation system for audio effective communication

    OpenAIRE

    Rojas Bello, Richard; Araya Araya, Erick; Vidal Vidal, Luis

    2006-01-01

    This document presents the development of a prototype translation system as a thesis project. It consists basically of capturing a voice stream from the sender and integrating advanced voice recognition technologies, instantaneous translation, and communication over the Internet protocols RTP/RTCP (Real-time Transport Protocol) to send information in real time to the receiver. This prototype does not transmit images; it covers only the audio stage. Finally, the project besides emb...

  6. A Smart Audio on Demand Application on Android Systems

    Directory of Open Access Journals (Sweden)

    Ing-Jr Ding

    2015-05-01

    This paper describes a study of the realization of intelligent Audio on Demand (AOD) processing in the embedded system environment. This study describes the development of innovative Android software that will enhance the user experience of the increasingly popular smart mobile devices now available on the market. The application we developed can accumulate records of the songs that are played and automatically analyze the favorite song types of a user. The application also provides sound-control playback functions to make operation more convenient. A large number of different music genres were collected to create a sound database and build an intelligent AOD processing mechanism. Formant analysis was used to extract voice features, and the K-means clustering method and the acoustic modeling technology of the Gaussian mixture model (GMM) were used to study and develop the application mechanism. The processes we developed run smoothly on the embedded Android platform.
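
    A hedged sketch of the modeling step follows: one Gaussian mixture model is fitted per music category and a new clip is assigned to the category with the highest likelihood. The two-dimensional synthetic feature vectors stand in for the formant features described above; the number of mixture components and the categories are illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic 2-D "formant-like" feature vectors for two example categories.
train = {
    "pop":  rng.normal([500.0, 1500.0], 80.0, size=(200, 2)),
    "jazz": rng.normal([700.0, 1100.0], 80.0, size=(200, 2)),
}
models = {name: GaussianMixture(n_components=3, random_state=0).fit(feats)
          for name, feats in train.items()}

clip_features = rng.normal([690.0, 1120.0], 80.0, size=(50, 2))  # unknown clip
scores = {name: model.score(clip_features) for name, model in models.items()}
print("predicted category:", max(scores, key=scores.get))
```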

  7. Interactive video audio system: communication server for INDECT portal

    Science.gov (United States)

    Mikulec, Martin; Voznak, Miroslav; Safarik, Jakub; Partila, Pavol; Rozhon, Jan; Mehic, Miralem

    2014-05-01

    The paper deals with the presentation of the IVAS system within the EU FP7 INDECT project. The INDECT project aims at developing tools for enhancing the security of citizens and protecting the confidentiality of recorded and stored information. It is part of the Seventh Framework Programme of the European Union. We participate in the INDECT portal and the Interactive Video Audio System (IVAS). The IVAS system provides a communication gateway between police officers working in a dispatching centre and police officers in the field. The officers in the dispatching centre can obtain information about all online police officers in the field, command officers in the field via text messages, voice, or video calls, and manage multimedia files from CCTV cameras or other sources that may be of interest to officers in the field. The police officers in the field are equipped with smartphones or tablets. Besides common communication, they can view pictures or videos sent by the commander in the office and can respond to the command via text or multimedia messages taken by their devices. Our IVAS system is unique because we are developing it according to special requirements from the Police of the Czech Republic. The IVAS communication system is designed to use modern Voice over Internet Protocol (VoIP) services. The whole solution is based on open-source software, including the Linux and Android operating systems. The technical details of our solution are presented in the paper.

  8. Reducing audio stimulus presentation latencies across studies, laboratories, and hardware and operating system configurations.

    Science.gov (United States)

    Babjack, Destiny L; Cernicky, Brandon; Sobotka, Andrew J; Basler, Lee; Struthers, Devon; Kisic, Richard; Barone, Kimberly; Zuccolotto, Anthony P

    2015-09-01

    Using differing computer platforms and audio output devices to deliver audio stimuli often introduces (1) substantial variability across labs and (2) variable time between the intended and actual sound delivery (the sound onset latency). Fast, accurate audio onset latencies are particularly important when audio stimuli need to be delivered precisely as part of studies that depend on accurate timing (e.g., electroencephalographic, event-related potential, or multimodal studies), or in multisite studies in which standardization and strict control over the computer platforms used is not feasible. This research describes the variability introduced by using differing configurations and introduces a novel approach to minimizing audio sound latency and variability. A stimulus presentation and latency assessment approach is presented using E-Prime and Chronos (a new multifunction, USB-based data presentation and collection device). The present approach reliably delivers audio stimuli with low latencies that vary by ≤1 ms, independent of hardware and Windows operating system (OS)/driver combinations. The Chronos audio subsystem adopts a buffering, aborting, querying, and remixing approach to the delivery of audio, to achieve a consistent 1-ms sound onset latency for single-sound delivery, and precise delivery of multiple sounds that achieves standard deviations of 1/10th of a millisecond without the use of advanced scripting. Chronos's sound onset latencies are small, reliable, and consistent across systems. Testing of standard audio delivery devices and configurations highlights the need for careful attention to consistency between labs, experiments, and multiple study sites in their hardware choices, OS selections, and adoption of audio delivery systems designed to sidestep the audio latency variability issue.

  9. An Analog I/O Interface Board for Audio Arduino Open Sound Card System

    DEFF Research Database (Denmark)

    Dimitrov, Smilen; Serafin, Stefania

    2011-01-01

    AudioArduino [1] is a system consisting of an ALSA (Advanced Linux Sound Architecture) audio driver and corresponding microcontroller code; that can demonstrate full-duplex, mono, 8-bit, 44.1 kHz soundcard behavior on an FTDI based Arduino. While the basic operation as a soundcard can be demonstr...

  10. Robot Command Interface Using an Audio-Visual Speech Recognition System

    Science.gov (United States)

    Ceballos, Alexánder; Gómez, Juan; Prieto, Flavio; Redarce, Tanneguy

    In recent years, audio-visual speech recognition has emerged as an active field of research thanks to advances in pattern recognition, signal processing, and machine vision. Its ultimate goal is to allow human-computer communication using voice, taking into account the visual information contained in the audio-visual speech signal. This document presents an automatic command-recognition system using audio-visual information. The system is intended to control the laparoscopic robot da Vinci. The audio signal is processed using the Mel Frequency Cepstral Coefficients parametrization method. In addition, features based on the points that define the mouth's outer contour, according to the MPEG-4 standard, are used in order to extract the visual speech information.

  11. Design and implementation of a two-way real-time communication system for audio over CATV networks

    Science.gov (United States)

    Cho, Choong Sang; Oh, Yoo Rhee; Lee, Young Han; Kim, Hong Kook

    2007-09-01

    In this paper, we design and implement a two-way real-time communication system for audio over cable television (CATV) networks to provide audio-based interaction between the CATV broadcasting station and CATV subscribers. The two-way real-time communication system consists of a real-time audio encoding/decoding module, a payload formatter based on the transmission control protocol/Internet protocol (TCP/IP), and a cable network. At the broadcasting station, audio signals from a microphone are encoded by an audio codec implemented on a digital signal processor (DSP), where the MPEG-2 Layer II audio codec is used and a TMS320C6416 serves as the DSP. Next, a payload formatter constructs a TCP/IP packet from the audio bitstream for transmission to a cable modem. Another payload formatter at the subscriber unpacks the TCP/IP packet decoded by the cable modem into an audio bitstream. This bitstream is decoded by the MPEG-2 Layer II audio decoder. Finally, the decoded audio signals are played out through the speaker. We confirmed that the system worked in real time, with a measured delay of around 150 ms including the algorithmic and processing time delays.
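
    As a simple illustration of what a payload formatter has to do, the sketch below frames encoded audio chunks with a length header so the receiver can rebuild the bitstream from a TCP byte stream. The 4-byte big-endian length header is an assumed illustrative format, not the one used in the paper.

```python
import struct

def pack_frame(encoded_chunk: bytes) -> bytes:
    """Prefix an encoded audio chunk with a 4-byte big-endian length."""
    return struct.pack(">I", len(encoded_chunk)) + encoded_chunk

def unpack_frames(buffer: bytes):
    """Split a received byte stream back into complete frames."""
    frames, offset = [], 0
    while offset + 4 <= len(buffer):
        (length,) = struct.unpack_from(">I", buffer, offset)
        if offset + 4 + length > len(buffer):
            break                                  # incomplete frame: wait for more data
        frames.append(buffer[offset + 4:offset + 4 + length])
        offset += 4 + length
    return frames, buffer[offset:]

stream = pack_frame(b"\x01\x02\x03") + pack_frame(b"\xaa" * 10)
frames, pending = unpack_frames(stream)
print(len(frames), "frames,", len(pending), "bytes pending")
```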

  12. Collusion-Resistant Audio Fingerprinting System in the Modulated Complex Lapped Transform Domain

    Science.gov (United States)

    Garcia-Hernandez, Jose Juan; Feregrino-Uribe, Claudia; Cumplido, Rene

    2013-01-01

    The collusion-resistant fingerprinting paradigm seems to be a practical solution to the piracy problem, as it allows media owners to detect any unauthorized copy and trace it back to the dishonest users. Despite the billion-dollar losses in the music industry, most collusion-resistant fingerprinting systems are devoted to digital images and very few to audio signals. In this paper, state-of-the-art collusion-resistant fingerprinting ideas are extended to audio signals, and the corresponding parameters and operation conditions are proposed. Moreover, in order to carry out fingerprint detection using just a fraction of the pirated audio clip, block-based embedding and its corresponding detector are proposed. Extensive simulations show the robustness of the proposed system against the average collusion attack. Moreover, by using an efficient Fast Fourier Transform core and standard computers, it is shown that the proposed system is suitable for real-world scenarios. PMID:23762455

  13. Collusion-resistant audio fingerprinting system in the modulated complex lapped transform domain.

    Directory of Open Access Journals (Sweden)

    Jose Juan Garcia-Hernandez

    The collusion-resistant fingerprinting paradigm seems to be a practical solution to the piracy problem, as it allows media owners to detect any unauthorized copy and trace it back to the dishonest users. Despite the billion-dollar losses in the music industry, most collusion-resistant fingerprinting systems are devoted to digital images and very few to audio signals. In this paper, state-of-the-art collusion-resistant fingerprinting ideas are extended to audio signals, and the corresponding parameters and operation conditions are proposed. Moreover, in order to carry out fingerprint detection using just a fraction of the pirated audio clip, block-based embedding and its corresponding detector are proposed. Extensive simulations show the robustness of the proposed system against the average collusion attack. Moreover, by using an efficient Fast Fourier Transform core and standard computers, it is shown that the proposed system is suitable for real-world scenarios.

  14. Audio-magnetotelluric survey to characterize the Sunnyside porphyry copper system in the Patagonia Mountains, Arizona

    Science.gov (United States)

    Sampson, Jay A.; Rodriguez, Brian D.

    2010-01-01

    The Sunnyside porphyry copper system is part of the concealed San Rafael Valley porphyry system located in the Patagonia Mountains of Arizona. The U.S. Geological Survey is conducting a series of multidisciplinary studies as part of the Assessment Techniques for Concealed Mineral Resources project. To help characterize the size, resistivity, and skin depth of the polarizable mineral deposit concealed beneath thick overburden, a regional east-west audio-magnetotelluric sounding profile was acquired. The purpose of this report is to release the audio-magnetotelluric sounding data collected along that east-west profile. No interpretation of the data is included.

  15. A method for Perceptual Assessment of Automotive Audio Systems and Cabin Acoustics

    DEFF Research Database (Denmark)

    Kaplanis, Neofytos; Bech, Søren; Sakari, Tervo

    2016-01-01

    This paper reports the design and implementation of a method to perceptually assess the acoustical properties of a car cabin and the subsequent sound reproduction properties of automotive audio systems. Here, we combine the Spatial Decomposition Method and Rapid Sensory Analysis techniques. The for...

  16. System Level Power Optimization of Digital Audio Back End for Hearing Aids

    DEFF Research Database (Denmark)

    Pracny, Peter; Jørgensen, Ivan Harald Holger; Bruun, Erik

    2017-01-01

    This work deals with power optimization of the audio processing back end for hearing aids - the interpolation filter (IF), the sigma-delta (SD) modulator, and the Class D power amplifier (PA) - as a whole. Specifications are derived and insight into the tradeoffs involved is used to optimize the inte...... to track the hardware and power demands as the tradeoffs of the system-level parameters are investigated. The result is the digital part of the back end optimized with respect to power, which provides audio performance comparable to the state of the art. A combination of system-level parameters leading...

  17. Frequency dependent loss analysis and minimization of system losses in switchmode audio power amplifiers

    DEFF Research Database (Denmark)

    Yamauchi, Akira; Knott, Arnold; Jørgensen, Ivan Harald Holger

    2014-01-01

    In this paper, frequency-dependent losses in switch-mode audio power amplifiers are analyzed, and a loss model is improved by taking the voltage dependence of the parasitic capacitance of the MOSFETs into account. The estimated power losses are compared to measurements and good accuracy is achieved...... By choosing the optimal switching frequency based on the proposed analysis, the experimental results show that the system power losses of the reference design are minimized and an efficiency improvement of up to 8% is achieved without compromising audio performance....

  18. Back to basics audio

    CERN Document Server

    Nathan, Julian

    1998-01-01

    Back to Basics Audio is a thorough, yet approachable handbook on audio electronics theory and equipment. The first part of the book discusses electrical and audio principles. Those principles form a basis for understanding the operation of equipment and systems, covered in the second section. Finally, the author addresses planning and installation of a home audio system.Julian Nathan joined the audio service and manufacturing industry in 1954 and moved into motion picture engineering and production in 1960. He installed and operated recording theaters in Sydney, Austra

  19. Audio Papers

    DEFF Research Database (Denmark)

    Groth, Sanne Krogh; Samson, Kristine

    2016-01-01

    With this special issue of Seismograf we are happy to present a new format of articles: audio papers. Audio papers resemble the regular essay or academic text in that they deal with a certain topic of interest, but they are presented in the form of an audio production. The audio paper is an extension...

  20. Audio system using binaural synthesis for multimodal telepresence applications

    DEFF Research Database (Denmark)

    Madsen, Esben; Markovic, Milos; Olesen, Søren Krarup

    2013-01-01

    of microphones, headphones and loudspeakers as well as measurements of network latency and bandwidth requirements of the system. Furthermore, measurements were made to determine whether the level of echo and cross talk cause any issues. The overall system employs multiple modalities to virtually transport...... are implemented in a distributed manner. Body-tracking of all participants is provided through the system for the purpose of using binaural synthesis for directional sound. Head-worn microphones are used to capture sound, and the visitor is provided with directional sound through headphones. The visitor...

  1. Audio Key Finding: Considerations in System Design and Case Studies on Chopin's 24 Preludes

    Directory of Open Access Journals (Sweden)

    Elaine Chew

    2007-01-01

    We systematically analyze audio key finding to determine factors important to system design and to the selection and evaluation of solutions. First, we present a basic system, the fuzzy analysis spiral array center of effect generator algorithm, with three key determination policies: nearest-neighbor (NN), relative distance (RD), and average distance (AD). AD achieved a 79% accuracy rate in an evaluation on 410 classical pieces, more than 8% higher than RD and NN. We show why audio key finding sometimes outperforms symbolic key finding. We next propose three extensions to the basic key finding system—the modified spiral array (mSA), fundamental frequency identification (F0), and post-weight balancing (PWB)—to improve performance, with evaluations using Chopin's Preludes (the Romantic repertoire was the most challenging). F0 provided the greatest improvement in the first 8 seconds, while mSA gave the best performance after 8 seconds. Case studies examine when all systems were correct, or all incorrect.

  2. Software open system for MPEG video and audio transmission

    Science.gov (United States)

    Cabitza, Gabriella; Setzu, Maria G.; Fregonese, Giulio

    1998-02-01

    This paper describes our experience in developing multimedia applications within a distributed architecture. The requirements of such an environment are the following: (1) to support high-quality multimedia data; (2) to be independent of the hardware platform and of the transport protocol; (3) to be cheap on the client side. The goals were twofold: to develop a client/server system for the management and transmission of MPEG bitstreams, and to optimize the transmission of MPEG over ATM networks. The first goal was reached using standardized technologies in the implementation of the system components: a DSM-CC server has been realized based on CORBA Services, which is able to manage MPEG streams as specified in the ISO DSM-CC document. The client module has been realized using Java. The system, originally written for Sun Solaris, has been successfully tested on different Unix and NT platforms. The evaluation of the quality and performance of MPEG transmission over ATM was made using three types of signaling: classical IP over ATM, LAN emulation, and native ATM implemented over the FORE API. Both classical IP over ATM and LAN emulation allow the transparent use of ATM for applications written using the widespread TCP/IP family of protocols, but they introduce a data overhead that may not be suitable for real-time transmission. Of course, native ATM achieves better performance than classical IP over ATM and LAN emulation, but it binds the application to run only on ATM networks and, at the moment, it is hardware dependent.

  3. Automatic Speech Acquisition and Recognition for Spacesuit Audio Systems

    Science.gov (United States)

    Ye, Sherry

    2015-01-01

    NASA has a widely recognized but unmet need for novel human-machine interface technologies that can facilitate communication during astronaut extravehicular activities (EVAs), when loud noises and strong reverberations inside spacesuits make communication challenging. WeVoice, Inc., has developed a multichannel signal-processing method for speech acquisition in noisy and reverberant environments that enables automatic speech recognition (ASR) technology inside spacesuits. The technology reduces noise by exploiting differences between the statistical nature of signals (i.e., speech) and noise that exists in the spatial and temporal domains. As a result, ASR accuracy can be improved to the level at which crewmembers will find the speech interface useful. System components and features include beam forming/multichannel noise reduction, single-channel noise reduction, speech feature extraction, feature transformation and normalization, feature compression, and ASR decoding. Arithmetic complexity models were developed and will help designers of real-time ASR systems select proper tasks when confronted with constraints in computational resources. In Phase I of the project, WeVoice validated the technology. The company further refined the technology in Phase II and developed a prototype for testing and use by suited astronauts.
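
    The sketch below illustrates one of the listed components, delay-and-sum beamforming, in its simplest form; the microphone geometry, look direction, and placeholder signals are assumptions for illustration and are unrelated to WeVoice's proprietary implementation.

```python
import numpy as np

fs, c = 16000, 343.0                       # sample rate (Hz), speed of sound (m/s)
mic_x = np.array([0.00, 0.05, 0.10])       # assumed linear array positions (m)
look_angle = np.deg2rad(30.0)              # assumed look direction

def delay_and_sum(channels):
    """Align each channel toward the look direction and average."""
    delays = mic_x * np.sin(look_angle) / c          # seconds, relative to mic 0
    shifts = np.round(delays * fs).astype(int)
    n = channels.shape[1] - shifts.max()
    aligned = [ch[s:s + n] for ch, s in zip(channels, shifts)]
    return np.mean(aligned, axis=0)

mics = np.random.randn(3, fs)              # placeholder 3-channel recording
enhanced = delay_and_sum(mics)
print(enhanced.shape)
```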

  4. Measuring 3D Audio Localization Performance and Speech Quality of Conferencing Calls for a Multiparty Communication System

    Directory of Open Access Journals (Sweden)

    Mansoor Hyder

    2013-07-01

    Communication systems which support 3D (three-dimensional) audio offer a couple of advantages to the users/customers. Firstly, within the virtual acoustic environments all participants can easily be recognized through their placement/sitting positions. Secondly, all participants can turn their focus to any particular talker when multiple participants start talking at the same time, by taking advantage of the natural listening tendency known as the cocktail party effect. On the other hand, 3D audio is known to decrease overall speech quality because of the introduction of reverberation and echoes within the listening environment. In this article, we study the tradeoff between speech quality and the human natural ability to localize audio events or talkers within our three-dimensional audio supported telephony and teleconferencing solution. Further, we performed subjective user studies by incorporating two different HRTFs (head-related transfer functions), different placements of the teleconferencing participants, and different layouts of the virtual environments. Moreover, subjective user study results for audio event localization and subjective speech quality are presented in this article. This subjective user study should help the research community optimize existing 3D audio systems and design new 3D audio supported teleconferencing solutions based on the quality-of-experience requirements of the users/customers, for agricultural personnel in particular and for all potential users in general.

  5. Measuring 3D Audio Localization Performance and Speech Quality of Conferencing Calls for a Multiparty Communication System

    International Nuclear Information System (INIS)

    Hyder, M.; Menghwar, G.D.; Qureshi, A.

    2013-01-01

    Communication systems which support 3D (three-dimensional) audio offer a couple of advantages to the users/customers. Firstly, within the virtual acoustic environments all participants can easily be recognized through their placement/sitting positions. Secondly, all participants can turn their focus to any particular talker when multiple participants start talking at the same time, by taking advantage of the natural listening tendency known as the cocktail party effect. On the other hand, 3D audio is known to decrease overall speech quality because of the introduction of reverberation and echoes within the listening environment. In this article, we study the tradeoff between speech quality and the human natural ability to localize audio events or talkers within our three-dimensional audio supported telephony and teleconferencing solution. Further, we performed subjective user studies by incorporating two different HRTFs (head-related transfer functions), different placements of the teleconferencing participants, and different layouts of the virtual environments. Moreover, subjective user study results for audio event localization and subjective speech quality are presented in this article. This subjective user study should help the research community optimize existing 3D audio systems and design new 3D audio supported teleconferencing solutions based on the quality-of-experience requirements of the users/customers, for agricultural personnel in particular and for all potential users in general. (author)

  6. Advances in Audio-Based Systems to Monitor Patient Adherence and Inhaler Drug Delivery.

    Science.gov (United States)

    Taylor, Terence E; Zigel, Yaniv; De Looze, Céline; Sulaiman, Imran; Costello, Richard W; Reilly, Richard B

    2018-03-01

    Hundreds of millions of people worldwide have asthma and COPD. Current medications to control these chronic respiratory diseases can be administered using inhaler devices, such as the pressurized metered dose inhaler and the dry powder inhaler. Provided that they are used as prescribed, inhalers can improve patient clinical outcomes and quality of life. Poor patient inhaler adherence (both time of use and user technique) is, however, a major clinical concern and is associated with poor disease control, increased hospital admissions, and increased mortality rates, particularly in low- and middle-income countries. There are currently limited methods available to health-care professionals to objectively and remotely monitor patient inhaler adherence. This review describes recent sensor-based technologies that use audio-based approaches that show promising opportunities for monitoring inhaler adherence in clinical practice. This review discusses how one form of sensor-based technology, audio-based monitoring systems, can provide clinically pertinent information regarding patient inhaler use over the course of treatment. Audio-based monitoring can provide health-care professionals with quantitative measurements of the drug delivery of inhalers, signifying a clear clinical advantage over other methods of assessment. Furthermore, objective audio-based adherence measures can improve the predictability of patient outcomes to treatment compared with current standard methods of adherence assessment used in clinical practice. Objective feedback on patient inhaler adherence can be used to personalize treatment to the patient, which may enhance precision medicine in the treatment of chronic respiratory diseases. Copyright © 2017 American College of Chest Physicians. Published by Elsevier Inc. All rights reserved.

  7. Blind speech separation system for humanoid robot with FastICA for audio filtering and separation

    Science.gov (United States)

    Budiharto, Widodo; Santoso Gunawan, Alexander Agung

    2016-07-01

    Nowadays, there are many developments in building intelligent humanoid robots, mainly in order to handle voice and image. In this research, we propose a blind speech separation system using FastICA for audio filtering and separation that can be used in education or entertainment. Our main problem is to separate multiple speech sources and also to filter out irrelevant noise. After the speech separation step, the results are integrated with our previous speech and face recognition system, which is based on a Bioloid GP robot and a Raspberry Pi 2 as controller. The experimental results show that the accuracy of our blind speech separation system is about 88% in command and query recognition cases.
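
    A minimal sketch of the separation step is shown below, using scikit-learn's FastICA on two synthetic mixtures; the signals and mixing matrix are invented, and the robot's microphone capture and recognition stages are omitted.

```python
import numpy as np
from sklearn.decomposition import FastICA

fs = 8000
t = np.arange(2 * fs) / fs
s1 = np.sin(2 * np.pi * 150 * t)                 # stand-in for speaker 1
s2 = np.sign(np.sin(2 * np.pi * 220 * t))        # stand-in for speaker 2
S = np.c_[s1, s2]
A = np.array([[1.0, 0.6], [0.4, 1.0]])           # unknown mixing matrix
X = S @ A.T                                      # two "microphone" signals

ica = FastICA(n_components=2, random_state=0)
estimated = ica.fit_transform(X)                 # separated sources (up to scale/order)
print(estimated.shape)
```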

  8. An Assessment of the Audio Codec Performance in Voice over WLAN (VoWLAN) Systems

    OpenAIRE

    Narbutt, Miroslaw; Davis, Mark

    2005-01-01

    In this paper we present the results of an experimental investigation into the performance of three audio codecs (ITU-T G.711, G.723.1 and G.729A) under varying load conditions on a Voice over WLAN system utilizing the IEEE 802.11b wireless LAN standard. The analysis is based upon a new technique for estimating user satisfaction with speech quality, calculated from packet delay and packet loss/late measurements. We also demonstrate the importance of the de-jitter buffer playout scheme for ensuring spee...

  9. Distortion Analysis Toolkit—A Software Tool for Easy Analysis of Nonlinear Audio Systems

    Directory of Open Access Journals (Sweden)

    Pakarinen Jyri

    2010-01-01

    Several audio effects devices deliberately add nonlinear distortion to the processed signal in order to create a desired sound. When creating virtual analog models of nonlinearly distorting devices, it would be very useful to carefully analyze the type of distortion, so that the model could be made as realistic as possible. While traditional system analysis tools such as the frequency response give detailed information on the operation of linear and time-invariant systems, they are less useful for analyzing nonlinear devices. Furthermore, although there do exist separate algorithms for nonlinear distortion analysis, there is currently no unified, easy-to-use tool for rapid analysis of distorting audio systems. This paper offers a remedy by introducing a new software tool for easy analysis of distorting effects. A comparison between a well-known guitar tube amplifier and two commercial software simulations is presented as a case study. This freely available software is written in Matlab language, but the analysis tool can also run as a standalone program, so the user does not need to have Matlab installed in order to perform the analysis.

  10. Distortion Analysis Toolkit—A Software Tool for Easy Analysis of Nonlinear Audio Systems

    Science.gov (United States)

    Pakarinen, Jyri

    2010-12-01

    Several audio effects devices deliberately add nonlinear distortion to the processed signal in order to create a desired sound. When creating virtual analog models of nonlinearly distorting devices, it would be very useful to carefully analyze the type of distortion, so that the model could be made as realistic as possible. While traditional system analysis tools such as the frequency response give detailed information on the operation of linear and time-invariant systems, they are less useful for analyzing nonlinear devices. Furthermore, although there do exist separate algorithms for nonlinear distortion analysis, there is currently no unified, easy-to-use tool for rapid analysis of distorting audio systems. This paper offers a remedy by introducing a new software tool for easy analysis of distorting effects. A comparison between a well-known guitar tube amplifier and two commercial software simulations is presented as a case study. This freely available software is written in Matlab language, but the analysis tool can also run as a standalone program, so the user does not need to have Matlab installed in order to perform the analysis.
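
    To give a flavor of the kind of measurement such a tool automates, the sketch below drives a memoryless nonlinearity (tanh, standing in for a tube stage) with a sine and estimates total harmonic distortion from the spectrum. This is a generic illustration, not part of the Distortion Analysis Toolkit itself.

```python
import numpy as np

fs, f0 = 48000, 1000                       # 1 s of signal -> 1 Hz bin spacing
t = np.arange(fs) / fs
x = 0.9 * np.sin(2 * np.pi * f0 * t)
y = np.tanh(3.0 * x)                       # toy distorting device

spec = np.abs(np.fft.rfft(y * np.hanning(len(y))))
fundamental = spec[f0]
harmonics = [spec[k * f0] for k in range(2, 6)]
thd = np.sqrt(sum(h ** 2 for h in harmonics)) / fundamental
print("THD = %.1f %%" % (100 * thd))
```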

  11. Audio Twister

    DEFF Research Database (Denmark)

    Cermak, Daniel; Moreno Garcia, Rodrigo; Monastiridis, Stefanos

    2015-01-01

    Daniel Cermak-Sassenrath, Rodrigo Moreno Garcia, Stefanos Monastiridis. Audio Twister. Installation. P-Hack Copenhagen 2015, Copenhagen, DK, Apr 24, 2015.

  12. Safety of the HyperSound® Audio System in subjects with normal hearing

    Directory of Open Access Journals (Sweden)

    Ritvik P. Mehta

    2015-11-01

    The objective of the study was to assess the safety of the HyperSound® Audio System (HSS), a novel audio system using ultrasound technology, in normal-hearing subjects under normal use conditions, using a pre-exposure and post-exposure test design. We investigated primary and secondary outcome measures: (i) temporary threshold shift (TTS), defined as a >10 dB shift in pure tone air conduction thresholds and/or a decrement in distortion product otoacoustic emissions (DPOAEs) >10 dB at two or more frequencies; (ii) presence of new-onset otologic symptoms after exposure. Twenty adult subjects with normal hearing underwent a pre-exposure assessment (pure tone air conduction audiometry, tympanometry, DPOAEs, and an otologic symptoms questionnaire) followed by exposure to a 2-h movie with sound delivered through the HSS emitter, followed by a post-exposure assessment. No TTS or new-onset otological symptoms were identified. HSS demonstrates excellent safety in normal-hearing subjects under normal use conditions.

  13. An Analysis/Synthesis System of Audio Signal with Utilization of an SN Model

    Directory of Open Access Journals (Sweden)

    G. Rozinaj

    2004-12-01

    An SN (sinusoids plus noise) model is a spectral model in which the periodic components of the sound are represented by sinusoids with time-varying frequencies, amplitudes, and phases. The remaining non-periodic components are represented by filtered noise. The sinusoidal model exploits physical properties of musical instruments, and the noise model exploits the human inability to perceive the exact spectral shape or the phase of stochastic signals. SN modeling can be applied in compression, transformation, separation of sounds, etc. The designed system is based on methods used in SN modeling. We have proposed a model that achieves good results in audio perception. Although many systems do not save the phases of the sinusoids, they are important for better modeling of transients, for the computation of the residual, and, last but not least, for stereo signals. One of the fundamental properties of the proposed system is its ability to reconstruct the signal not only in amplitude but also in phase.
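
    A minimal resynthesis sketch under the SN model assumptions is shown below: a sum of sinusoids with given frequencies, amplitudes, and phases plus low-pass-filtered noise for the stochastic residual. The partial parameters and the residual filter are illustrative values, not output of the authors' analysis stage.

```python
import numpy as np
from scipy.signal import butter, lfilter

fs, dur = 44100, 1.0
t = np.arange(int(fs * dur)) / fs

# (frequency in Hz, amplitude, phase in radians) for each partial
partials = [(220.0, 0.5, 0.0), (440.0, 0.25, 1.2), (660.0, 0.12, 0.4)]
deterministic = sum(amp * np.sin(2 * np.pi * freq * t + ph)
                    for freq, amp, ph in partials)

# Stochastic residual: white noise shaped by a low-pass filter.
b, a = butter(4, 4000 / (fs / 2))
stochastic = 0.05 * lfilter(b, a, np.random.randn(len(t)))

resynthesized = deterministic + stochastic
print(resynthesized.shape)
```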

  14. Audio Fingerprint Untuk Identifikasi File Audio

    OpenAIRE

    Yuanto, Stefanus Irwan; Tampubolon, Junius Karel; Restyandito, Restyandito

    2007-01-01

    Identifying audio files at the binary level is not very effective because audio files are stored in different formats and in different ways. By applying the audio fingerprint concept, an audio signal is identified by comparing a small unique code that represents the signal, so that differences in format and storage method have little influence on the audio identification process.

  15. The Language System of Audio Description: An Investigation as a Discursive Process

    Science.gov (United States)

    Piety, Philip J.

    2004-01-01

    This study investigated the language used in a selection of films containing audio description and developed a set of definitions that allow productions containing it to be more fully defined, measured, and compared. It also highlights some challenging questions related to audio description as a discursive practice and provides a basis for future…

  16. A Low-Cost Audio Prescription Labeling System Using RFID for Thai Visually-Impaired People.

    Science.gov (United States)

    Lertwiriyaprapa, Titipong; Fakkheow, Pirapong

    2015-01-01

    This research aims to develop a low-cost audio prescription labeling (APL) system for visually-impaired people by using the RFID system. The developed APL system includes the APL machine and APL software. The APL machine is for visually-impaired people while APL software allows caregivers to record all important information into the APL machine. The main objective of the development of the APL machine is to reduce costs and size by designing all of the electronic devices to fit into one print circuit board. Also, it is designed so that it is easy to use and can become an electronic aid for daily living. The developed APL software is based on Java and MySQL, both of which can operate on various operating platforms and are easy to develop as commercial software. The developed APL system was first evaluated by 5 experts. The APL system was also evaluated by 50 actual visually-impaired people (30 elders and 20 blind individuals) and 20 caregivers, pharmacists and nurses. After using the APL system, evaluations were carried out, and it can be concluded from the evaluation results that this proposed APL system can be effectively used for helping visually-impaired people in terms of self-medication.

  17. Calibration of Clinical Audio Recording and Analysis Systems for Sound Intensity Measurement.

    Science.gov (United States)

    Maryn, Youri; Zarowski, Andrzej

    2015-11-01

    Sound intensity is an important acoustic feature of voice/speech signals. Yet recordings are performed with different microphone, amplifier, and computer configurations, and it is therefore crucial to calibrate sound intensity measures of clinical audio recording and analysis systems on the basis of output of a sound-level meter. This study was designed to evaluate feasibility, validity, and accuracy of calibration methods, including audiometric speech noise signals and human voice signals under typical speech conditions. Calibration consisted of 3 comparisons between data from 29 measurement microphone-and-computer systems and data from the sound-level meter: signal-specific comparison with audiometric speech noise at 5 levels, signal-specific comparison with natural voice at 3 levels, and cross-signal comparison with natural voice at 3 levels. Intensity measures from recording systems were then linearly converted into calibrated data on the basis of these comparisons, and validity and accuracy of calibrated sound intensity were investigated. Very strong correlations and quasisimilarity were found between calibrated data and sound-level meter data across calibration methods and recording systems. Calibration of clinical sound intensity measures according to this method is feasible, valid, accurate, and representative for a heterogeneous set of microphones and data acquisition systems in real-life circumstances with distinct noise contexts.
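
    The core of the calibration described here is a per-system linear mapping from uncalibrated intensity readings to paired sound-level-meter values. A minimal Python sketch of fitting and applying such a mapping is given below; the numbers are hypothetical and the least-squares line is only one reasonable choice.

        # Sketch of the core calibration step: fit a linear mapping from a recording
        # system's uncalibrated intensity readings to paired sound-level-meter values,
        # then convert new measurements with that mapping. Data below are hypothetical.
        import numpy as np

        system_db = np.array([52.1, 57.8, 63.2, 68.9, 74.5])   # uncalibrated dB from the recording chain
        slm_db    = np.array([60.0, 65.0, 70.0, 75.0, 80.0])   # reference sound-level-meter readings

        slope, intercept = np.polyfit(system_db, slm_db, deg=1)  # least-squares line

        def calibrate(measured_db):
            return slope * measured_db + intercept

        print(round(calibrate(60.0), 1))   # calibrated SPL estimate for a new measurement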

  18. A technology prototype system for rating therapist empathy from audio recordings in addiction counseling.

    Science.gov (United States)

    Xiao, Bo; Huang, Chewei; Imel, Zac E; Atkins, David C; Georgiou, Panayiotis; Narayanan, Shrikanth S

    2016-04-01

    Scaling up psychotherapy services such as for addiction counseling is a critical societal need. One challenge is ensuring quality of therapy, due to the heavy cost of manual observational assessment. This work proposes a speech technology-based system to automate the assessment of therapist empathy (a key therapy quality index) from audio recordings of the psychotherapy interactions. We designed a speech processing system that includes voice activity detection and diarization modules, and an automatic speech recognizer plus a speaker role matching module to extract the therapist's language cues. We employed Maximum Entropy models, Maximum Likelihood language models, and a Lattice Rescoring method to characterize high vs. low empathic language. We estimated therapy-session level empathy codes using utterance level evidence obtained from these models. Our experiments showed that the fully automated system achieved a correlation of 0.643 between expert annotated empathy codes and machine-derived estimations, and an accuracy of 81% in classifying high vs. low empathy, in comparison to a 0.721 correlation and 86% accuracy in the oracle setting using manual transcripts. The results show that the system provides useful information that can contribute to automatic quality assurance and therapist training.

  19. A technology prototype system for rating therapist empathy from audio recordings in addiction counseling

    Directory of Open Access Journals (Sweden)

    Bo Xiao

    2016-04-01

    Full Text Available Scaling up psychotherapy services such as for addiction counseling is a critical societal need. One challenge is ensuring quality of therapy, due to the heavy cost of manual observational assessment. This work proposes a speech technology-based system to automate the assessment of therapist empathy—a key therapy quality index—from audio recordings of the psychotherapy interactions. We designed a speech processing system that includes voice activity detection and diarization modules, and an automatic speech recognizer plus a speaker role matching module to extract the therapist’s language cues. We employed Maximum Entropy models, Maximum Likelihood language models, and a Lattice Rescoring method to characterize high vs. low empathic language. We estimated therapy-session level empathy codes using utterance level evidence obtained from these models. Our experiments showed that the fully automated system achieved a correlation of 0.643 between expert annotated empathy codes and machine-derived estimations, and an accuracy of 81% in classifying high vs. low empathy, in comparison to a 0.721 correlation and 86% accuracy in the oracle setting using manual transcripts. The results show that the system provides useful information that can contribute to automatic quality assurance and therapist training.

  20. Audio-visual imposture

    Science.gov (United States)

    Karam, Walid; Mokbel, Chafic; Greige, Hanna; Chollet, Gerard

    2006-05-01

    A GMM based audio visual speaker verification system is described and an Active Appearance Model with a linear speaker transformation system is used to evaluate the robustness of the verification. An Active Appearance Model (AAM) is used to automatically locate and track a speaker's face in a video recording. A Gaussian Mixture Model (GMM) based classifier (BECARS) is used for face verification. GMM training and testing is accomplished on DCT based extracted features of the detected faces. On the audio side, speech features are extracted and used for speaker verification with the GMM based classifier. Fusion of both audio and video modalities for audio visual speaker verification is compared with face verification and speaker verification systems. To improve the robustness of the multimodal biometric identity verification system, an audio visual imposture system is envisioned. It consists of an automatic voice transformation technique that an impostor may use to assume the identity of an authorized client. Features of the transformed voice are then combined with the corresponding appearance features and fed into the GMM based system BECARS for training. An attempt is made to increase the acceptance rate of the impostor and to analyze the robustness of the verification system. Experiments are being conducted on the BANCA database, with a prospect of experimenting on the newly developed PDAtabase developed within the scope of the SecurePhone project.
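
    The verification decision in a GMM-based system of this kind is typically a log-likelihood ratio between a client model and a background ("world") model, compared against a threshold. The sketch below illustrates that decision rule with scikit-learn's GaussianMixture on synthetic feature vectors; it stands in for, and is not, the BECARS classifier used in the paper.

        # Sketch of GMM-based verification (log-likelihood-ratio test). Features are
        # assumed to be per-frame vectors (e.g., DCT face features or speech cepstra);
        # the data, model sizes and threshold below are illustrative.
        import numpy as np
        from sklearn.mixture import GaussianMixture

        rng = np.random.default_rng(0)
        client_train = rng.normal(0.0, 1.0, size=(500, 12))     # hypothetical client features
        world_train  = rng.normal(0.5, 1.5, size=(2000, 12))    # hypothetical background features

        client_gmm = GaussianMixture(n_components=8, covariance_type='diag', random_state=0).fit(client_train)
        world_gmm  = GaussianMixture(n_components=16, covariance_type='diag', random_state=0).fit(world_train)

        def verify(features, threshold=0.0):
            # Average per-frame log-likelihood ratio: accept if the client model explains
            # the test features better than the background ("world") model.
            llr = client_gmm.score(features) - world_gmm.score(features)
            return llr > threshold, llr

        accepted, score = verify(rng.normal(0.0, 1.0, size=(300, 12)))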

  1. Effect of audio instruction on tracking errors using a four‐dimensional image‐guided radiotherapy system

    Science.gov (United States)

    Sawada, Akira; Mukumoto, Nobutaka; Takahashi, Kunio; Mizowaki, Takashi; Kokubo, Masaki; Hiraoka, Masahiro

    2013-01-01

    The Vero4DRT (MHI-TM2000) is capable of performing X-ray image-based tracking (X-ray Tracking) that directly tracks the target or fiducial markers under continuous kV X-ray imaging. Previously, we have shown that irregular respiratory patterns increased X-ray Tracking errors. Thus, we assumed that audio instruction, which generally improves the periodicity of respiration, should reduce tracking errors. The purpose of this study was to assess the effect of audio instruction on X-ray Tracking errors. Anterior-posterior abdominal skin-surface displacements obtained from ten lung cancer patients under free breathing and simple audio instruction were used as an alternative to tumor motion in the superior-inferior direction. First, a sequential predictive model based on the Levinson-Durbin algorithm was created to estimate the future three-dimensional (3D) target position under continuous kV X-ray imaging while moving a steel ball target of 9.5 mm in diameter. After creating the predictive model, the future 3D target position was sequentially calculated from the current and past 3D target positions based on the predictive model every 70 ms under continuous kV X-ray imaging. Simultaneously, the system controller of the Vero4DRT calculated the corresponding pan and tilt rotational angles of the gimbaled X-ray head, which then adjusted its orientation to the target. The calculated and current rotational angles of the gimbaled X-ray head were recorded every 5 ms. The target position measured by the laser displacement gauge was synchronously recorded every 10 ms. Total tracking system errors (ET) were compared between free breathing and audio instruction. Audio instruction significantly improved breathing regularity (p < 0.01). The mean ± standard deviation of the 95th percentile of ET (E95T) was 1.7 ± 0.5 mm (range: 1.1-2.6 mm) under free breathing (E95T,FB) and 1.9 ± 0.5 mm (range: 1.2-2.7 mm) under audio instruction (E95T,AI). E95T,AI was larger than E95T,FB for five patients; no significant difference was found between E95T,FB and E95T,AI (p = 0.21). Correlation analysis revealed that the rapid respiratory velocity ...

  2. Effect of audio instruction on tracking errors using a four-dimensional image-guided radiotherapy system.

    Science.gov (United States)

    Nakamura, Mitsuhiro; Sawada, Akira; Mukumoto, Nobutaka; Takahashi, Kunio; Mizowaki, Takashi; Kokubo, Masaki; Hiraoka, Masahiro

    2013-09-06

    The Vero4DRT (MHI-TM2000) is capable of performing X-ray image-based tracking (X-ray Tracking) that directly tracks the target or fiducial markers under continuous kV X-ray imaging. Previously, we have shown that irregular respiratory patterns increased X-ray Tracking errors. Thus, we assumed that audio instruction, which generally improves the periodicity of respiration, should reduce tracking errors. The purpose of this study was to assess the effect of audio instruction on X-ray Tracking errors. Anterior-posterior abdominal skin-surface displacements obtained from ten lung cancer patients under free breathing and simple audio instruction were used as an alternative to tumor motion in the superior-inferior direction. First, a sequential predictive model based on the Levinson-Durbin algorithm was created to estimate the future three-dimensional (3D) target position under continuous kV X-ray imaging while moving a steel ball target of 9.5 mm in diameter. After creating the predictive model, the future 3D target position was sequentially calculated from the current and past 3D target positions based on the predictive model every 70 ms under continuous kV X-ray imaging. Simultaneously, the system controller of the Vero4DRT calculated the corresponding pan and tilt rotational angles of the gimbaled X-ray head, which then adjusted its orientation to the target. The calculated and current rotational angles of the gimbaled X-ray head were recorded every 5 ms. The target position measured by the laser displacement gauge was synchronously recorded every 10 ms. Total tracking system errors (ET) were compared between free breathing and audio instruction. Audio instruction significantly improved breathing regularity (p < 0.01). The mean ± standard deviation of the 95th percentile of ET (E95T) was 1.7 ± 0.5 mm (range: 1.1-2.6 mm) under free breathing (E95T,FB) and 1.9 ± 0.5 mm (range: 1.2-2.7 mm) under audio instruction (E95T,AI). E95T,AI was larger than E95T,FB for five ...
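
    Both versions of this abstract describe a predictor built with the Levinson-Durbin algorithm, i.e., an autoregressive model whose coefficients are obtained from the autocorrelation sequence of the observed motion. The Python sketch below shows that recursion and a one-step-ahead prediction on a synthetic respiratory trace; the model order and sampling interval are illustrative, not the Vero4DRT implementation.

        # Sketch of an AR(p) predictor fitted with the Levinson-Durbin recursion, the kind
        # of sequential model described for estimating the future target position from
        # current and past positions. Order and data are illustrative.
        import numpy as np

        def levinson_durbin(r, order):
            """Solve the Yule-Walker equations for AR coefficients a[1..order] from autocorrelation r."""
            a = np.zeros(order + 1); a[0] = 1.0
            err = r[0]
            for i in range(1, order + 1):
                k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err   # reflection coefficient
                a[1:i + 1] = a[1:i + 1] + k * a[i - 1::-1][:i]
                err *= (1.0 - k * k)
            return a   # predictor: x[n] ~= -sum_j a[j] * x[n - j]

        # Hypothetical quasi-periodic respiratory motion trace sampled every 70 ms:
        t = np.arange(0, 60, 0.07)
        x = 5.0 * np.sin(2 * np.pi * t / 4.0) + 0.2 * np.random.randn(len(t))

        order = 8
        r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)]) / len(x)
        a = levinson_durbin(r, order)
        x_next = -np.dot(a[1:], x[-1:-order - 1:-1])   # one-step-ahead prediction of the next position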

  3. Mixxing Audio Menggunakan FL Studio

    OpenAIRE

    Prawira, Yanheri

    2011-01-01

    This study aims to simplify the audio mixing process and to reduce its cost by using only a laptop or desktop computer as the main platform, running Windows 7 with applications comprising FL Studio 9 and ASIO 4 ALL, without any additional hardware. The purpose of building this system is to make DJ audio mixing easier using only a laptop or computer, without incurring large costs.

  4. Automatic Detection and Recognition of Pig Wasting Diseases Using Sound Data in Audio Surveillance Systems

    Directory of Open Access Journals (Sweden)

    Yongwha Chung

    2013-09-01

    Full Text Available Automatic detection of pig wasting diseases is an important issue in the management of group-housed pigs. Further, respiratory diseases are one of the main causes of mortality among pigs and loss of productivity in intensive pig farming. In this study, we propose an efficient data mining solution for the detection and recognition of pig wasting diseases using sound data in audio surveillance systems. In this method, we extract the Mel Frequency Cepstrum Coefficients (MFCC) from sound data with an automatic pig sound acquisition process, and use a hierarchical two-level structure: the Support Vector Data Description (SVDD) and the Sparse Representation Classifier (SRC) as an early anomaly detector and a respiratory disease classifier, respectively. Our experimental results show that this new method can be used to detect pig wasting diseases both economically (even a cheap microphone can be used) and accurately (94% detection and 91% classification accuracy), either as a standalone solution or to complement known methods to obtain a more accurate solution.
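
    The detection stage described here (MFCC features followed by a one-class anomaly detector) can be sketched compactly. The example below uses librosa for MFCC extraction and scikit-learn's OneClassSVM as a stand-in for the paper's SVDD; the file names, parameters and the second-level SRC classifier are not taken from the paper.

        # Sketch of the detection stage: MFCC features per clip, then a one-class model
        # (OneClassSVM standing in for SVDD) trained on normal pig sounds to flag anomalies.
        import numpy as np
        import librosa
        from sklearn.svm import OneClassSVM

        def clip_features(path, sr=16000, n_mfcc=13):
            y, sr = librosa.load(path, sr=sr)
            mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)        # (n_mfcc, n_frames)
            return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])  # one vector per clip

        normal_clips = ["normal_001.wav", "normal_002.wav"]               # hypothetical training clips
        X_train = np.stack([clip_features(p) for p in normal_clips])

        detector = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(X_train)

        def is_anomalous(path):
            return detector.predict(clip_features(path).reshape(1, -1))[0] == -1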

  5. MRI-compatible audio/visual system: impact on pediatric sedation

    International Nuclear Information System (INIS)

    Harned, R.K. II; Strain, J.D.

    2001-01-01

    Background. While sedation is necessary for much pediatric imaging, there are new alternatives that may help patients hold still without medication. Objective. We examined the effect of an audio/visual system consisting of video goggles and earphones on the need for sedation during magnetic resonance imaging (MRI). Materials and methods. All MRI examinations from May 1999 to October 1999 performed after installation of the MRVision 2000 (Resonance Technology, Inc.) were compared to the same 6-month period in 1998. Imaging and sedation protocols remained constant. Data collected included: patient age, type of examination, use of intravenous contrast enhancement, and need for sedation. The average supply charge and nursing cost per sedated patient were calculated. Results. The 955 patients from 1998 and 1,112 patients from 1999 were similar in demographics and examination distribution. There was an overall reduction in the percent of patients requiring sedation in the group using the video goggle system from 49 to 40 % (P < 0.001). There was no significant change for 0-2 years (P = 0.805), but there was a reduction from 53 to 40 % for age 3-10 years (P < 0.001) and 16 to 8 % for those older than 10 years (P < 0.001). There was a 17 % decrease in MRI room time for those patients whose examinations could be performed without sedation. Sedation costs per patient were $80 for nursing and $29 for supplies. Conclusion. The use of this video system reduced the number of children requiring sedation for MRI examination by 18 %. In addition to reducing patient risk, this can potentially reduce cost. (orig.)

  6. 372 Audio Books in the Nigerian Higher Educational System: To be ...

    African Journals Online (AJOL)

    Nekky Umera

    In a developing country where most homes are still not connected to the internet, students now have to queue up at the cybercafés. Lastly, having a study at home is becoming old-fashioned. In the developed countries like America and Britain however, people of all ages are reported to have been switching over to audio books ...

  7. WAVE : a virtual audio environment: an immersive musical instrument using a low cost technological system

    OpenAIRE

    Valbom, Leonel; Forni, Christophe; Marcos, Adérito

    2004-01-01

    The WAVE project proposes a multidisciplinary investigation in order to create a model-prototype of a virtual immersive instrument using audio, visual technologies, and virtual reality. This model will open up new horizons for the processes involved in music making by dealing not only with relevant technological issues, but especially with meaningful research in the areas of human-machine interaction and sound.

  8. Balancing Audio

    DEFF Research Database (Denmark)

    Walther-Hansen, Mads

    2016-01-01

    This paper explores the concept of balance in music production and examines the role of conceptual metaphors in reasoning about audio editing. Balance may be the most central concept in record production; however, the way we cognitively understand and respond meaningfully to a mix requiring balance is not thoroughly understood. In this paper I treat balance as a metaphor that we use to reason about several different actions in music production, such as adjusting levels, editing the frequency spectrum or the spatiality of the recording. This study is based on an exploration of a linguistic corpus of sound ...

  9. Semantic Audio Track Mixer

    OpenAIRE

    Uhle, C.; Herre, J.; Ridderbusch, F.; Popp, H.

    2011-01-01

    An audio mixer for mixing a plurality of audio tracks to a mixture signal comprises a semantic command interpreter (30; 35) for receiving a semantic mixing command and for deriving a plurality of mixing parameters for the plurality of audio tracks from the semantic mixing command; an audio track processor (70; 75) for processing the plurality of audio tracks in accordance with the plurality of mixing parameters; and an audio track combiner (76) for combining the plurality of audio tracks proc...

  10. Audio-Visual Technician | IDRC - International Development ...

    International Development Research Centre (IDRC) Digital Library (Canada)

    Occasionally records on audio and/or video media, conferences, seminars, lectures and other events. Edits and duplicates audio and video tapes ... Participates in the planning and design of new or updated audio-visual systems by providing technical input on system needs. Based on current and emerging requirements as ...

  11. 372 Audio Books in the Nigerian Higher Educational System: To be ...

    African Journals Online (AJOL)

    Nekky Umera

    audio books as a book for the visually impaired or the blind, but whether we like it or not, it is part of the new media. Printed books have existed for a very long time. One of the challenges facing the use of textbooks however is the exorbitant prices at which they are selling. This has caused students to fall back on the internet ...

  12. Audio Restoration

    Science.gov (United States)

    Esquef, Paulo A. A.

    The first reproducible recording of human voice was made in 1877 on a tinfoil cylinder phonograph devised by Thomas A. Edison. Since then, much effort has been expended to find better ways to record and reproduce sounds. By the mid-1920s, the first electrical recordings appeared and gradually took over purely acoustic recordings. The development of electronic computers, in conjunction with the ability to record data onto magnetic or optical media, culminated in the standardization of compact disc format in 1980. Nowadays, digital technology is applied to several audio applications, not only to improve the quality of modern and old recording/reproduction techniques, but also to trade off sound quality for less storage space and less taxing transmission capacity requirements.

  13. Imaging hydrothermal systems at Furnas caldera (Azores, Portugal): Insights from Audio-Magnetotelluric data

    Science.gov (United States)

    Hogg, Colin; Kiyan, Duygu; Rath, Volker; Byrdina, Svetlana; Vandemeulebrouck, Jean; Silva, Catarina; Viveiros, Maria FB; Ferreira, Teresa

    2016-04-01

    The Furnas volcano is the eastern-most of the three active central volcanoes of Sao Miguel Island. The main caldera formed about 30 ka BP, followed by a younger eruption at 10-12 ka BP, which forms the steep topography of more than 200 m in the measuring area. It contains several very young eruptive centers, and a shallow caldera lake. Tectonic features of varying directions have been identified in the caldera and its vicinity. In the northern part of the caldera, containing the fumarole field of Caldeiras das Furnas, a detailed map of surface CO2 emissions was recently made available. In 2015, a pilot survey of 13 audio-magnetotelluric (AMT) soundings and Electrical Resistivity Tomography (ERT) data were collected along two profiles in the eastern part of Furnas caldera in order to image the electrical conductivity of the subsurface. The data quality achieved by both techniques is extraordinary, and first results indicate a general correlation between regions of elevated conductivity and the mapped surface CO2 emissions, suggesting that they may both be caused by the presence of hydrothermal fluids. Tensor decomposition analysis using the Groom-Bailey approach produces a generalised geo-electric strike direction of 72° east of north for the AMT data, compared to the surface geological strike of 105° derived from the major mapped fault crossing the profiles. An analysis of the real induction arrows at certain frequencies (at depths greater than 350 m) infers that an extended conductor at depth does not exactly correspond to the degassing structures at the surface and extends outside the area of investigation. The geometry of the most conductive regions, with electrical resistivities of less than 1 Ωm, found at various depths differs from what was expected from earlier geologic and tectonic studies and possibly may not be directly related to the mapped fault systems at the surface. On the eastern profile, which seemed to be more appropriate for 2-D modelling with a 72° strike ...

  14. The perceptual influence of the cabin acoustics on the reproduced sound of a car audio system

    DEFF Research Database (Denmark)

    Kaplanis, Neofytos; Bech, Søren; Sakari, Tervo

    2015-01-01

    In this study, a sensory evaluation methodology [Lokki et al., J. Acoust. Soc. Am. 132, 3148–3161 (2012)] was employed to identify the most relevant attributes that characterize the influence of the physical properties of a car cabin on the reproduced sound field. A series of in-situ measurements of a high ... a previous review [Kaplanis et al., in 55th Int. Conf. Aud. Eng. Soc. (2014)] and possible links to the acoustical properties of the car cabin are discussed. [This study is a part of the Marie Curie Network on Dereverberation and Reverberation of Audio, Music, and Speech. EU-FP7 under agreement ITN-GA-2012-316969.]

  15. ∑∆ Modulator System-Level Considerations for Hearing-Aid Audio Class-D Output Stage Application

    DEFF Research Database (Denmark)

    Pracný, Peter; Bruun, Erik

    2012-01-01

    This paper deals with the system-level design of a digital sigma-delta (∑∆) modulator for a hearing-aid audio Class D output stage application. The aim of the paper is to provide a thorough discussion of the possibilities and tradeoffs of ∑∆ modulator system-level design parameter combinations - order, oversampling ratio (OSR) and number of bits in the quantizer - including their impact on the interpolation filter design as well. The system is kept in the digital domain up to the input of the Class D power stage, including the digital pulse width modulation (DPWM) block. Notes on the impact of the DPWM block on the modulated spectrum are provided.
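
    The design parameters discussed (modulator order, oversampling ratio, quantizer bits) can be made concrete with a minimal first-order ∑∆ modulator: a 1-bit quantizer inside a feedback loop shapes quantization noise towards high frequencies, which oversampling then keeps out of the audio band. The Python sketch below is illustrative only and is not the hearing-aid design of the paper.

        # Minimal first-order digital sigma-delta modulator sketch, to make the
        # order/OSR/quantizer-bits trade-off concrete. A 1-bit quantizer in a feedback
        # loop pushes quantization noise towards high frequencies.
        import numpy as np

        def sigma_delta_1st_order(x):
            """x: input samples in [-1, 1] at the oversampled rate; returns a +/-1 bitstream."""
            y = np.empty_like(x)
            integrator = 0.0
            for n, sample in enumerate(x):
                integrator += sample - (y[n - 1] if n else 0.0)    # feedback of previous output
                y[n] = 1.0 if integrator >= 0.0 else -1.0          # 1-bit quantizer
            return y

        osr = 64                                  # oversampling ratio (one typical design parameter)
        fs_audio, f_tone = 48000, 1000
        fs = osr * fs_audio
        t = np.arange(fs // 100) / fs             # 10 ms of signal
        x = 0.5 * np.sin(2 * np.pi * f_tone * t)
        bitstream = sigma_delta_1st_order(x)
        # A crude moving-average decimation filter recovers an approximation of the tone:
        recovered = np.convolve(bitstream, np.ones(osr) / osr, mode="same")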

  16. Realtime Audio with Garbage Collection

    OpenAIRE

    Matheussen, Kjetil Svalastog

    2010-01-01

    Two non-moving concurrent garbage collectors tailored for realtime audio processing are described. Both collectors work on copies of the heap to avoid cache misses and audio-disruptive synchronizations. Both collectors are targeted at multiprocessor personal computers. The first garbage collector works in uncooperative environments, and can replace Hans Boehm's conservative garbage collector for C and C++. The collector does not access the virtual memory system. Neither doe...

  17. A centralized audio presentation manager

    Energy Technology Data Exchange (ETDEWEB)

    Papp, A.L. III; Blattner, M.M.

    1994-05-16

    The centralized audio presentation manager addresses the problems which occur when multiple programs running simultaneously attempt to use the audio output of a computer system. Time dependence of sound means that certain auditory messages must be scheduled simultaneously, which can lead to perceptual problems due to psychoacoustic phenomena. Furthermore, the combination of speech and nonspeech audio is examined; each presents its own problems of perceptibility in an acoustic environment composed of multiple auditory streams. The centralized audio presentation manager receives abstract parameterized message requests from the currently running programs, and attempts to create and present a sonic representation in the most perceptible manner through the use of a theoretically and empirically designed rule set.

  18. Intelligent audio analysis

    CERN Document Server

    Schuller, Björn W

    2013-01-01

    This book provides the reader with the knowledge necessary for comprehension of the field of Intelligent Audio Analysis. It first introduces standard methods and discusses the typical Intelligent Audio Analysis chain going from audio data to audio features to audio recognition. Further, an introduction to audio source separation, and to enhancement and robustness, is given. After the introductory parts, the book shows several applications for the three types of audio: speech, music, and general sound. Each task is briefly introduced, followed by a description of the specific data and methods applied, experiments and results, and a conclusion for this specific task. The book provides benchmark results and standardized test-beds for a broader range of audio analysis tasks. The main focus thereby lies on the parallel advancement of realism in audio analysis, as too often today's results are overly optimistic owing to idealized testing conditions, and it serves to stimulate synergies arising from transfer of ...

  19. Small signal audio design

    CERN Document Server

    Self, Douglas

    2014-01-01

    Learn to use inexpensive and readily available parts to obtain state-of-the-art performance in all the vital parameters of noise, distortion, crosstalk and so on. With ample coverage of preamplifiers and mixers and a new chapter on headphone amplifiers, this practical handbook provides an extensive repertoire of circuits that can be put together to make almost any type of audio system. A resource packed full of valuable information, with virtually every page revealing nuggets of specialized knowledge not found elsewhere. Essential points of theory that bear on practical performance are lucidly ...

  20. Making the Switch to Digital Audio

    Directory of Open Access Journals (Sweden)

    Shannon Gwin Mitchell

    2004-12-01

    Full Text Available In this article, the authors describe the process of converting from analog to digital audio data. They address the step-by-step decisions that they made in selecting hardware and software for recording and converting digital audio, issues of system integration, and cost considerations. The authors present a brief description of how digital audio is being used in their current research project and how it has enhanced the “quality” of their qualitative research.

  1. Evaluation of a Smartphone-based audio-biofeedback system for improving balance in older adults--a pilot study.

    Science.gov (United States)

    Fleury, A; Mourcou, Q; Franco, C; Diot, B; Demongeot, J; Vuillerme, N

    2013-01-01

    This study was designed to assess the effectiveness of a Smartphone-based audio-biofeedback (ABF) system for improving balance in older adults. This so-called "iBalance-ABF" system that we recently developed is "all-inclusive" in the sense that its three main components of a balance prosthesis, (i) the sensory input unit, (ii) the processing unit, and (iii) the sensory output unit, are entirely embedded into the Smartphone. The underlying principle of this system is to supply the user with supplementary information about the medial-lateral (ML) trunk tilt relative to a predetermined adjustable "dead zone" through sound generation in earphones. Six healthy older adults voluntarily participated in this pilot study. Eyes closed, they were asked to stand upright and to sway as little as possible in two (parallel and tandem) stance conditions executed without and with the use of the iBalance-ABF system. Results showed that, without any visual information, the use of the Smartphone-based ABF allowed the older healthy adults to significantly decrease their ML trunk sway in the tandem stance posture and to mitigate the destabilizing effect induced by this particular stance. Although an extended study including a larger number of participants is needed to confirm these data, the present results are encouraging. They do suggest that Smartphone-based ABF system could be used for balance training and rehabilitation therapy in older adults.
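
    The feedback principle described (sound only when the medial-lateral trunk tilt leaves an adjustable dead zone) is easy to express. The Python sketch below maps a tilt sample to stereo gains and a pitch; the dead-zone width and the mappings are illustrative choices, not the iBalance-ABF settings.

        # Sketch of the audio-biofeedback principle: sound is generated only when the
        # medial-lateral trunk tilt leaves an adjustable "dead zone", and the side and
        # intensity of the sound tell the user which way and how far they lean.

        def abf_feedback(ml_tilt_deg, dead_zone_deg=1.0, max_tilt_deg=5.0):
            """Return (left_gain, right_gain, pitch_hz) for one tilt sample; silence inside the dead zone."""
            if abs(ml_tilt_deg) <= dead_zone_deg:
                return 0.0, 0.0, 0.0                                   # inside the dead zone: no sound
            excess = min(abs(ml_tilt_deg) - dead_zone_deg, max_tilt_deg - dead_zone_deg)
            gain = excess / (max_tilt_deg - dead_zone_deg)             # louder the further out you lean
            pitch = 440.0 + 440.0 * gain                               # higher pitch for larger tilt
            return (gain, 0.0, pitch) if ml_tilt_deg < 0 else (0.0, gain, pitch)

        for tilt in (-3.2, 0.4, 2.5):                                  # hypothetical tilt samples in degrees
            print(tilt, abf_feedback(tilt))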

  2. Reviews on Technology and Standard of Spatial Audio Coding

    Directory of Open Access Journals (Sweden)

    Ikhwana Elfitri

    2017-03-01

    Full Text Available Market demands for more impressive entertainment media have motivated the delivery of three-dimensional (3D) audio content to home consumers through Ultra High Definition TV (UHDTV), the next generation of TV broadcasting, where spatial audio coding plays a fundamental role. This paper reviews fundamental concepts of spatial audio coding, including technology, standards, and applications. The basic principle of object-based audio reproduction will also be elaborated and compared to the traditional channel-based system, to provide a good understanding of this popular interactive audio reproduction approach, which gives end users the flexibility to render their own preferred audio composition.
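
    The contrast drawn between channel-based and object-based reproduction can be illustrated with a tiny renderer: each object carries metadata (here just a pan position and a user-adjustable gain) and is mixed to the playback layout at the receiver, so the listener can re-balance the scene. The metadata fields and pan law below are illustrative, not those of any particular standard.

        # Tiny illustration of object-based rendering: objects plus metadata are rendered
        # to the playback layout (stereo here) at the receiver, so the end user can
        # re-balance the mix by changing per-object gains.
        import numpy as np

        def render_objects_to_stereo(objects):
            """objects: list of dicts with 'samples' (1-D array), 'pan' in [-1, 1], 'gain' (linear)."""
            n = max(len(o["samples"]) for o in objects)
            out = np.zeros((2, n))
            for o in objects:
                theta = (o["pan"] + 1) * np.pi / 4          # constant-power pan law
                gains = np.array([np.cos(theta), np.sin(theta)]) * o["gain"]
                out[:, :len(o["samples"])] += gains[:, None] * o["samples"]
            return out

        # Hypothetical scene: dialogue in the centre, music slightly left, user turns music down.
        t = np.arange(48000) / 48000
        dialogue = {"samples": 0.3 * np.sin(2 * np.pi * 220 * t), "pan": 0.0, "gain": 1.0}
        music    = {"samples": 0.3 * np.sin(2 * np.pi * 330 * t), "pan": -0.5, "gain": 0.5}
        stereo = render_objects_to_stereo([dialogue, music])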

  3. Digital signal processor for silicon audio playback devices; Silicon audio saisei kikiyo digital signal processor

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2000-03-01

    The digital audio signal processor (DSP) TC9446F series has been developed for silicon audio playback devices with a memory medium such as flash memory, for DVD players, and for AV devices such as TV sets. It supports AAC (Advanced Audio Coding, 2ch) and MP3 (MPEG-1 Layer 3), the audio compression techniques used for transmitting music over the internet. It also supports compression formats such as Dolby Digital, DTS (Digital Theater Systems) and MPEG-2 audio, which are adopted for DVDs. It can carry built-in audio signal processing programs, e.g., Dolby Pro Logic, equalizer, sound field control, and 3D sound. The TC9446XB has been newly added to the lineup; it adopts an FBGA (fine-pitch ball grid array) package for portable audio devices. (translated by NEDO)

  4. Three-Dimensional Audio Client Library

    Science.gov (United States)

    Rizzi, Stephen A.

    2005-01-01

    The Three-Dimensional Audio Client Library (3DAudio library) is a group of software routines written to facilitate development of both stand-alone (audio only) and immersive virtual-reality application programs that utilize three-dimensional audio displays. The library is intended to enable the development of three-dimensional audio client application programs by use of a code base common to multiple audio server computers. The 3DAudio library calls vendor-specific audio client libraries and currently supports the AuSIM Gold-Server and Lake Huron audio servers. 3DAudio library routines contain common functions for (1) initiation and termination of a client/audio server session, (2) configuration-file input, (3) positioning functions, (4) coordinate transformations, (5) audio transport functions, (6) rendering functions, (7) debugging functions, and (8) event-list-sequencing functions. The 3DAudio software is written in the C++ programming language and currently operates under the Linux, IRIX, and Windows operating systems.

  5. Service provider perceptions of transitioning from audio to video capability in a telehealth system: a qualitative evaluation.

    Science.gov (United States)

    Clay-Williams, Robyn; Baysari, Melissa; Taylor, Natalie; Zalitis, Dianne; Georgiou, Andrew; Robinson, Maureen; Braithwaite, Jeffrey; Westbrook, Johanna

    2017-08-14

    Telephone consultation and triage services are increasingly being used to deliver health advice. Availability of high speed internet services in remote areas allows healthcare providers to move from telephone to video telehealth services. Current approaches for assessing video services have limitations. This study aimed to identify the challenges for service providers associated with transitioning from audio to video technology. Using a mixed-method, qualitative approach, we observed training of service providers who were required to switch from telephone to video, and conducted pre- and post-training interviews with 15 service providers and their trainers on the challenges associated with transitioning to video. Two full days of simulation training were observed. Data were transcribed and analysed using an inductive approach; a modified constant comparative method was employed to identify common themes. We found three broad categories of issues likely to affect implementation of the video service: social, professional, and technical. Within these categories, eight sub-themes were identified; they were: enhanced delivery of the health service, improved health advice for people living in remote areas, safety concerns, professional risks, poor uptake of video service, system design issues, use of simulation for system testing, and use of simulation for system training. This study identified a number of unexpected potential barriers to successful transition from telephone to the video system. Most prominent were technical and training issues, and personal safety concerns about transitioning from telephone to video media. Addressing identified issues prior to implementation of a new video telehealth system is likely to improve effectiveness and uptake.

  6. Digital Augmented Reality Audio Headset

    Directory of Open Access Journals (Sweden)

    Jussi Rämö

    2012-01-01

    Full Text Available Augmented reality audio (ARA) combines virtual sound sources with the real sonic environment of the user. An ARA system can be realized with a headset containing binaural microphones. Ideally, the ARA headset should be acoustically transparent, that is, it should not cause audible modification to the surrounding sound. A practical implementation of an ARA mixer requires a low-latency headphone reproduction system with additional equalization to compensate for the attenuation and the modified ear canal resonances caused by the headphones. This paper proposes digital IIR filters to realize the required equalization and evaluates a real-time prototype ARA system. Measurements show that the throughput latency of the digital prototype ARA system can be less than 1.4 ms, which is sufficiently small in practice. When the direct and processed sounds are combined in the ear, a comb filtering effect is brought about and appears as notches in the frequency response. The comb filter effect in speech and music signals was studied in a listening test and it was found to be inaudible when the attenuation is 20 dB. Insert ARA headphones have a sufficient attenuation at frequencies above about 1 kHz. The proposed digital ARA system enables several immersive audio applications, such as a virtual audio tourist guide and audio teleconferencing.
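
    The comb-filtering effect mentioned here has a simple closed form: when a leaked direct sound with relative gain g sums with the processed sound delayed by the system latency τ, the magnitude response is |1 + g·exp(-j2πfτ)|. The worked Python example below uses the roughly 1.4 ms latency and 20 dB attenuation figures quoted in the abstract to show why the notches become shallow enough to be inaudible.

        # Worked example of the comb-filtering effect: the leaked direct sound (attenuated
        # by the headphone) sums with the processed sound delayed by the system latency,
        # giving the magnitude response |1 + g*exp(-j*2*pi*f*tau)|.
        import numpy as np

        tau = 1.4e-3                      # throughput latency of the prototype (s), per the abstract
        attenuation_db = 20.0             # headphone attenuation of the leaked direct path
        g = 10 ** (-attenuation_db / 20)  # linear gain of the weaker path

        f = np.linspace(20, 20000, 2000)
        magnitude_db = 20 * np.log10(np.abs(1 + g * np.exp(-2j * np.pi * f * tau)))

        print(round(magnitude_db.max(), 2), "dB peak")     # about +0.8 dB
        print(round(magnitude_db.min(), 2), "dB notch")    # about -0.9 dB: shallow notches, hence inaudible
        print(round(1 / tau), "Hz spacing between notches")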

  7. Recommending audio mixing workflows

    OpenAIRE

    Sauer, Christian; Roth-Berghofer, Thomas; Auricchio, Nino; Proctor, Sam

    2013-01-01

    This paper describes our work on Audio Advisor, a workflow recommender for audio mixing. We examine the process of eliciting, formalising and modelling the domain knowledge and experts' experience. We also describe the effects and problems associated with the knowledge formalisation process. We decided to employ structured case-based reasoning, using myCBR 3 to capture the vagueness encountered in the audio domain. We detail how we used extensive similarity measure modelling to ...

  8. A Real-Time Semiautonomous Audio Panning System for Music Mixing

    Directory of Open Access Journals (Sweden)

    Perez_Gonzalez Enrique

    2010-01-01

    Full Text Available A real-time semiautonomous stereo panning system for music mixing has been implemented. The system uses spectral decomposition, constraint rules, and cross-adaptive algorithms to perform real-time placement of sources in a stereo mix. A subjective evaluation test was devised to evaluate its quality against human panning. It was shown that the automatic panning technique performed better than a nonexpert and showed no significant statistical difference to the performance of a professional mixing engineer.

  9. A Real-Time Semiautonomous Audio Panning System for Music Mixing

    Science.gov (United States)

    Perez Gonzalez, Enrique; Reiss, JoshuaD

    2010-12-01

    A real-time semiautonomous stereo panning system for music mixing has been implemented. The system uses spectral decomposition, constraint rules, and cross-adaptive algorithms to perform real-time placement of sources in a stereo mix. A subjective evaluation test was devised to evaluate its quality against human panning. It was shown that the automatic panning technique performed better than a nonexpert and showed no significant statistical difference to the performance of a professional mixing engineer.

  10. Evaluating Author and User Experience for an Audio-Haptic System for Annotation of Physical Models.

    Science.gov (United States)

    Coughlan, James M; Miele, Joshua

    2017-01-01

    We describe three usability studies involving a prototype system for creation and haptic exploration of labeled locations on 3D objects. The system uses a computer, webcam, and fiducial markers to associate a physical 3D object in the camera's view with a predefined digital map of labeled locations ("hotspots"), and to do real-time finger tracking, allowing a blind or visually impaired user to explore the object and hear individual labels spoken as each hotspot is touched. This paper describes: (a) a formative study with blind users exploring pre-annotated objects to assess system usability and accuracy; (b) a focus group of blind participants who used the system and, through structured and unstructured discussion, provided feedback on its practicality, possible applications, and real-world potential; and (c) a formative study in which a sighted adult used the system to add labels to on-screen images of objects, demonstrating the practicality of remote annotation of 3D models. These studies and related literature suggest potential for future iterations of the system to benefit blind and visually impaired users in educational, professional, and recreational contexts.

  11. Categorizing Video Game Audio

    DEFF Research Database (Denmark)

    Westerberg, Andreas Rytter; Schoenau-Fog, Henrik

    2015-01-01

    This paper dives into the subject of video game audio and how it can be categorized in order to deliver a message to a player in the most precise way. A new categorization, with a new take on the diegetic spaces, can be used as a tool of inspiration for sound- and game-designers to rethink how they can use audio in video games. The conclusion of this study is that the current models' view of the diegetic spaces, used to categorize video game audio, is not fit to categorize all sounds. This can however possibly be changed through a rethinking of how the player interprets audio.

  12. Audio Cues to Assist Visual Search in Robotic System Operator Control Unit Displays

    Science.gov (United States)

    2005-12-01

    Ellen C. Haas, Ramakrishna S. Pillalamarri, Christopher C. Stachowiak, and Michael A. Lattin (all of ARL). Audio Cues to Assist Visual Search in Robotic System Operator Control Unit Displays. ARL-TR-3632, December 2005.

  13. Design and fuel fabrication processes for the AC-3 mixed-carbide irradiation test

    International Nuclear Information System (INIS)

    Latimer, T.W.; Chidester, K.M.; Stratton, R.W.; Ledergerber, G.; Ingold, F.

    1992-01-01

    The AC-3 test was a cooperative U.S./Swiss irradiation test of 91 wire-wrapped helium-bonded U-20% Pu carbide fuel pins irradiated to 8.3 at % peak burnup in the Fast Flux Test Facility. The test consisted of 25 pins that contained spherepac fuel fabricated by the Paul Scherrer Institute (PSI) and 66 pins that contained pelletized fuel fabricated by the Los Alamos National Laboratory. Design of AC-3 by LANL and PSI was begun in 1981, the fuel pins were fabricated from 1983 to 1985, and the test was irradiated from 1986 to 1988. The principal objective of the AC-3 test was to compare the irradiation performance of mixed-carbide fuel pins that contained either pelletized or sphere-pac fuel at prototypic fluence and burnup levels for a fast breeder reactor

  14. Speech Acquisition and Automatic Speech Recognition for Integrated Spacesuit Audio Systems

    Science.gov (United States)

    Huang, Yiteng; Chen, Jingdong; Chen, Shaoyan

    2010-01-01

    A voice-command human-machine interface system has been developed for spacesuit extravehicular activity (EVA) missions. A multichannel acoustic signal processing method has been created for distant speech acquisition in noisy and reverberant environments. This technology reduces noise by exploiting differences in the statistical nature of signal (i.e., speech) and noise that exists in the spatial and temporal domains. As a result, the automatic speech recognition (ASR) accuracy can be improved to the level at which crewmembers would find the speech interface useful. The developed speech human/machine interface will enable both crewmember usability and operational efficiency. It offers a fast rate of data/text entry, a small overall size, and light weight. In addition, this design will free the hands and eyes of a suited crewmember. The system components and steps include beam forming/multi-channel noise reduction, single-channel noise reduction, speech feature extraction, feature transformation and normalization, feature compression, model adaptation, ASR HMM (Hidden Markov Model) training, and ASR decoding. A state-of-the-art phoneme recognizer can obtain an accuracy rate of 65 percent when the training and testing data are free of noise. When it is used in spacesuits, the rate drops to about 33 percent. With the developed microphone array speech-processing technologies, the performance is improved and the phoneme recognition accuracy rate rises to 44 percent. The recognizer can be further improved by combining the microphone array and HMM model adaptation techniques and using speech samples collected from inside spacesuits. In addition, arithmetic complexity models for the major HMM-based ASR components were developed. They can help real-time ASR system designers select proper tasks when in the face of constraints in computational resources.
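
    The front end described starts with beamforming/multichannel noise reduction. As an illustration of that first step only, the Python sketch below implements a basic delay-and-sum beamformer for a linear microphone array in the frequency domain; the array geometry, look direction and sampling rate are assumed, and the actual spacesuit system uses more elaborate statistical processing.

        # Minimal delay-and-sum beamformer sketch for a linear microphone array:
        # align the channels towards the look direction with fractional-sample delays
        # applied in the frequency domain, then average them.
        import numpy as np

        def delay_and_sum(mic_signals, mic_positions_m, steer_deg, sr=16000, c=343.0):
            """mic_signals: (n_mics, n_samples); mic_positions_m: x positions of a linear array;
            steer_deg: look direction measured from broadside (0 = perpendicular to the array)."""
            delays_s = mic_positions_m * np.sin(np.deg2rad(steer_deg)) / c
            n = mic_signals.shape[1]
            freqs = np.fft.rfftfreq(n, d=1.0 / sr)
            out = np.zeros(n)
            for sig, d in zip(mic_signals, delays_s):
                out += np.fft.irfft(np.fft.rfft(sig) * np.exp(-2j * np.pi * freqs * d), n=n)
            return out / len(mic_signals)

        # Hypothetical 4-microphone array with 2 cm spacing, steered 30 degrees off broadside:
        rng = np.random.default_rng(1)
        mics = rng.normal(size=(4, 16000))
        enhanced = delay_and_sum(mics, mic_positions_m=np.arange(4) * 0.02, steer_deg=30.0)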

  15. Roundtable Audio Discussion

    Directory of Open Access Journals (Sweden)

    Chris Bigum

    2007-01-01

    Full Text Available RoundTable on Technology, Teaching and Tools. This is a roundtable audio interview conducted by James Farmer, founder of Edublogs, with Anne Bartlett-Bragg (University of Technology Sydney) and Chris Bigum (Deakin University). Skype was used to make and record the audio conference and the resulting sound file was edited by Andrew McLauchlan.

  16. Irradiation and examination results of the AC-3 mixed-carbide test

    International Nuclear Information System (INIS)

    Mason, R.E.; Hoth, C.W.; Stratton, R.W.; Botta, F.

    1992-01-01

    The AC-3 test was a cooperative Swiss/US irradiation test of mixed-carbide, (U,Pu)C, fuel pins in the Fast Flux Test Facility. The test included 25 Swiss-fabricated sphere-pac-type fuel pins and 66 U.S.-fabricated pellet-type fuel pins. The test was designed to operate at prototypical fast reactor conditions to provide a direct comparison of the irradiation performance of the two fuel types. The test design and fuel fabrication processes used for the AC-3 test are presented.

  17. Audio Mining with emphasis on Music Genre Classification

    DEFF Research Database (Denmark)

    Meng, Anders

    2004-01-01

    etc. is receiving quite a lot of attention. The first breakthrough in audio mining was created by MuscleFish in 1996, a simple audio retrieval system. With the increasing amount of audio material being accessible through the web, e.g. Apple's iTunes (700,000+ songs), Sony, Amazon, new methods ...

  18. Audio Technology and Mobile Human Computer Interaction

    DEFF Research Database (Denmark)

    Chamberlain, Alan; Bødker, Mads; Hazzard, Adrian

    2017-01-01

    Audio-based mobile technology is opening up a range of new interactive possibilities. This paper brings some of those possibilities to light by offering a range of perspectives based in this area. It is not only the technical systems that are developing; novel approaches to the design and understanding of audio-based mobile systems are also evolving to offer new perspectives on interaction and design, and to support such systems being applied in areas such as the humanities.

  19. Implementing Audio-CASI on Windows’ Platforms

    Science.gov (United States)

    Cooley, Philip C.; Turner, Charles F.

    2011-01-01

    Audio computer-assisted self interviewing (Audio-CASI) technologies have recently been shown to provide important and sometimes dramatic improvements in the quality of survey measurements. This is particularly true for measurements requiring respondents to divulge highly sensitive information such as their sexual, drug use, or other sensitive behaviors. However, DOS-based Audio-CASI systems that were designed and adopted in the early 1990s have important limitations. Most salient is the poor control they provide for manipulating the video presentation of survey questions. This article reports our experiences adapting Audio-CASI to Microsoft Windows 3.1 and Windows 95 platforms. Overall, our Windows-based system provided the desired control over video presentation and afforded other advantages, including compatibility with a much wider array of audio devices than our DOS-based Audio-CASI technologies. These advantages came at the cost of increased system requirements, including the need for both more RAM and larger hard disks. While these costs will be an issue for organizations converting large inventories of PCs to Windows Audio-CASI today, this will not be a serious constraint for organizations and individuals with small inventories of machines to upgrade or those purchasing new machines today. PMID:22081743

  20. Portable Audio Design

    DEFF Research Database (Denmark)

    Groth, Sanne Krogh

    2014-01-01

    The chapter presents a methodological approach to the early process of producing portable audio design. The chapter highlights audio walks and audio guides, but can also be of inspiration when working with graphical and video production for portable devices. The final products can be presented within online and physical institutional contexts. The approach focuses especially on the relationship to specific sites, and how an awareness of the relationship between the site and the production can be part of the design process. Such awareness entails several approaches: the necessity of paying ...

  1. Concept for audio encoding and decoding for audio channels and audio objects

    OpenAIRE

    Adami, Alexander; Borss, Christian; Dick, Sascha; Ertel, Christian; Füg, Simone; Herre, Jürgen; Hilpert, Johannes; Hölzer, Andreas; Kratschmer, Michael; Küch, Fabian; Kuntz, Achim; Murtaza, Adrian; Plogsties, Jan; Silzle, Andreas; Stenzel, Hanne

    2015-01-01

    Audio encoder for encoding audio input data (101) to obtain audio output data (501) comprises an input interface (100) for receiving a plurality of audio channels, a plurality of audio objects and metadata related to one or more of the plurality of audio objects; a mixer (200) for mixing the plurality of objects and the plurality of channels to obtain a plurality of pre-mixed channels, each pre-mixed channel comprising audio data of a channel and audio data of at least one object; a core enco...

  2. COMINT Audio Interface

    National Research Council Canada - National Science Library

    Morgans, D

    1999-01-01

    .... Demonstrations conducted under this effort concluded that 3D audio localization techniques on their own have not been developed to the point where they achieve the fidelity necessary for the military work environment...

  3. Progressive Syntax-Rich Coding of Multichannel Audio Sources

    Directory of Open Access Journals (Sweden)

    Dai Yang

    2003-09-01

    Full Text Available Being able to transmit the audio bitstream progressively is a highly desirable property for network transmission. MPEG-4 version 2 audio supports fine grain bit rate scalability in the generic audio coder (GAC). It has a bit-sliced arithmetic coding (BSAC) tool, which provides scalability in the step of 1 Kbps per audio channel. There are also several other scalable audio coding methods, which have been proposed in recent years. However, these scalable audio tools are only available for mono and stereo audio material. Little work has been done on progressive coding of multichannel audio sources. MPEG advanced audio coding (AAC) is one of the most distinguished multichannel digital audio compression systems. Based on AAC, we develop in this work a progressive syntax-rich multichannel audio codec (PSMAC). It not only supports fine grain bit rate scalability for the multichannel audio bitstream but also provides several other desirable functionalities. A formal subjective listening test shows that the proposed algorithm achieves an excellent performance at several different bit rates when compared with MPEG AAC.

  4. Progressive Syntax-Rich Coding of Multichannel Audio Sources

    Science.gov (United States)

    Yang, Dai; Ai, Hongmei; Kyriakakis, Chris; Kuo, C.-C. Jay

    2003-12-01

    Being able to transmit the audio bitstream progressively is a highly desirable property for network transmission. MPEG-4 version 2 audio supports fine grain bit rate scalability in the generic audio coder (GAC). It has a bit-sliced arithmetic coding (BSAC) tool, which provides scalability in the step of 1 Kbps per audio channel. There are also several other scalable audio coding methods, which have been proposed in recent years. However, these scalable audio tools are only available for mono and stereo audio material. Little work has been done on progressive coding of multichannel audio sources. MPEG advanced audio coding (AAC) is one of the most distinguished multichannel digital audio compression systems. Based on AAC, we develop in this work a progressive syntax-rich multichannel audio codec (PSMAC). It not only supports fine grain bit rate scalability for the multichannel audio bitstream but also provides several other desirable functionalities. A formal subjective listening test shows that the proposed algorithm achieves an excellent performance at several different bit rates when compared with MPEG AAC.

  5. Structure Learning in Audio

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch

    By having information about the setting a user is in, a computer is able to make decisions proactively to facilitate tasks for the user. Two approaches are taken in this thesis to achieve more information about an audio environment. One approach is that of classifying audio, and a new approach us... ...-Gaussian source distributions allowing a much wider use of the method. All methods use a variety of classification models and model selection algorithms, which is a common theme of the thesis.

  6. Museum audio description

    OpenAIRE

    Martins, Cláudia Susana Nunes

    2011-01-01

    Audio description for the blind and visually impaired has been around since people have described what is seen. Throughout time, it has evolved and developed within different media, starting with reality and daily life, moving into the cinema and television, then across other performing arts, museums and art galleries, and public places. Thus, academics and entertainment providers have developed a growing interest in audio description, especially in what concerns the best methods and strateg...

  7. CERN automatic audio-conference service

    CERN Document Server

    Sierra Moral, R

    2010-01-01

    Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first Euro...

  8. CERN automatic audio-conference service

    CERN Multimedia

    Sierra Moral, R

    2009-01-01

    Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first Euro...

  9. High-Fidelity Piezoelectric Audio Device

    Science.gov (United States)

    Woodward, Stanley E.; Fox, Robert L.; Bryant, Robert G.

    2003-01-01

    ModalMax is a very innovative means of harnessing the vibration of a piezoelectric actuator to produce an energy efficient low-profile device with high-bandwidth high-fidelity audio response. The piezoelectric audio device outperforms many commercially available speakers made using speaker cones. The piezoelectric device weighs substantially less (4 g) than the speaker cones which use magnets (10 g). ModalMax devices have extreme fabrication simplicity. The entire audio device is fabricated by lamination. The simplicity of the design lends itself to lower cost. The piezoelectric audio device can be used without its acoustic chambers and thereby resulting in a very low thickness of 0.023 in. (0.58 mm). The piezoelectric audio device can be completely encapsulated, which makes it very attractive for use in wet environments. Encapsulation does not significantly alter the audio response. Its small size (see Figure 1) is applicable to many consumer electronic products, such as pagers, portable radios, headphones, laptop computers, computer monitors, toys, and electronic games. The audio device can also be used in automobile or aircraft sound systems.

  10. Virtual Microphones for Multichannel Audio Resynthesis

    Directory of Open Access Journals (Sweden)

    Athanasios Mouchtaris

    2003-09-01

    Full Text Available Multichannel audio offers significant advantages for music reproduction, including the ability to provide better localization and envelopment, as well as reduced imaging distortion. On the other hand, multichannel audio is a demanding media type in terms of transmission requirements. Often, bandwidth limitations prohibit transmission of multiple audio channels. In such cases, an alternative is to transmit only one or two reference channels and recreate the rest of the channels at the receiving end. Here, we propose a system capable of synthesizing the required signals from a smaller set of signals recorded in a particular venue. These synthesized “virtual” microphone signals can be used to produce multichannel recordings that accurately capture the acoustics of that venue. Applications of the proposed system include transmission of multichannel audio over the current Internet infrastructure and, as an extension of the methods proposed here, remastering existing monophonic and stereophonic recordings for multichannel rendering.

  11. Spatial audio reproduction with primary ambient extraction

    CERN Document Server

    He, JianJun

    2017-01-01

    This book first introduces the background of spatial audio reproduction, with different types of audio content and for different types of playback systems. A literature study on the classical and emerging Primary Ambient Extraction (PAE) techniques is presented. The emerging techniques aim to improve the extraction performance and also enhance the robustness of PAE approaches in dealing with more complex signals encountered in practice. The in-depth theoretical study helps readers to understand the rationales behind these approaches. Extensive objective and subjective experiments validate the feasibility of applying PAE in spatial audio reproduction systems. These experimental results, together with some representative audio examples and MATLAB codes of the key algorithms, illustrate clearly the differences among various approaches and also help readers gain insights on selecting different approaches for different applications.

  12. Tomato leaf curl Kerala virus (ToLCKeV AC3 protein forms a higher order oligomer and enhances ATPase activity of replication initiator protein (Rep/AC1

    Directory of Open Access Journals (Sweden)

    Mukherjee Sunil K

    2010-06-01

    Full Text Available Abstract. Background: Geminiviruses are emerging plant viruses that infect a wide variety of vegetable crops, ornamental plants and cereal crops. They undergo recombination during co-infections by different species of geminiviruses and give rise to more virulent species. Antiviral strategies targeting a broad range of viruses necessitate a detailed understanding of the basic biology of the viruses. ToLCKeV, a virus prevalent in the tomato crop of Kerala state of India and a member of the genus Begomovirus, has been used as a model system in this study. Results: AC3 is a geminiviral protein conserved across all the begomoviral species and is postulated to enhance viral DNA replication. In this work we have successfully expressed and purified the AC3 fusion proteins from E. coli. We demonstrated the higher-order oligomerization of AC3 using sucrose gradient ultra-centrifugation and gel-filtration experiments. In addition, we also established that the ToLCKeV AC3 protein interacted with the cognate AC1 protein and enhanced the AC1-mediated ATPase activity in vitro. Conclusions: The highly hydrophobic viral protein AC3 can be purified as a fusion protein with either MBP or GST. The purification method broadens the scope for the biochemical characterization of the viral protein. The enhancement of AC1-mediated ATPase activity might lead to increased viral DNA replication.

  13. TNO at TRECVID 2008, Combining Audio and Video Fingerprinting for Robust Copy Detection

    NARCIS (Netherlands)

    Doets, P.J.; Eendebak, P.T.; Ranguelova, E.; Kraaij, W.

    2009-01-01

    TNO has evaluated a baseline audio and a video fingerprinting system based on robust hashing for the TRECVID 2008 copy detection task. We participated in the audio, the video and the combined audio-video copy detection task. The audio fingerprinting implementation clearly outperformed the video

  14. Web Audio/Video Streaming Tool

    Science.gov (United States)

    Guruvadoo, Eranna K.

    2003-01-01

    In order to promote a NASA-wide educational outreach program to educate and inform the public about space exploration, NASA, at Kennedy Space Center, is seeking efficient ways to add more content to the web by streaming audio/video files. This project proposes a high-level overview of a framework for the creation, management, and scheduling of audio/video assets over the web. To support short-term goals, the prototype of a web-based tool is designed and demonstrated to automate the process of streaming audio/video files. The tool provides web-enabled user interfaces to manage video assets, create publishable schedules of video assets for streaming, and schedule the streaming events. These operations are performed on user-defined and system-derived metadata of audio/video assets stored in a relational database, while the assets reside in a separate repository. The prototype tool is designed using ColdFusion 5.0.

  15. Design of progressive syntax-rich multichannel audio codec

    Science.gov (United States)

    Yang, Dai; Ai, Hongmei; Kyriakakis, Christos; Kuo, C.-C. Jay

    2001-12-01

    Being able to transmit the audio bitstream progressively is a highly desirable property for network transmission. MPEG-4 version-2 audio supports fine grain bit rate scalability in the Generic Audio Coder (GAC). It has a Bit-Sliced Arithmetic Coding (BSAC) tool, which provides scalability in steps of 1 kbit/s per audio channel. However, this fine grain scalability tool is only available for mono and stereo audio material. Not much work has been done on progressively transmitting multichannel audio sources. MPEG Advanced Audio Coding (AAC) is one of the most distinguished multichannel digital audio compression systems. Based on AAC, we develop a progressive syntax-rich multichannel audio codec in this work. It not only supports fine grain bit rate scalability for the multichannel audio bitstream, but also provides several other desirable functionalities. A formal subjective listening test shows that the proposed algorithm achieves a better performance at several different bit rates when compared with MPEG-4 BSAC for the mono audio sources.

  16. Detection Of Alterations In Audio Files Using Spectrograph Analysis

    Directory of Open Access Journals (Sweden)

    Anandha Krishnan G

    2015-08-01

    Full Text Available The corresponding study was carried out to detect changes in audio files using a spectrograph. An audio file format is a file format for storing digital audio data on a computer system. A sound spectrograph is a laboratory instrument that displays a graphical representation of the strengths of the various component frequencies of a sound as time passes. The objectives of the study were to find the changes in the spectrograph of audio files after altering them, to compare those changes with the spectrographs of the original files, and to check for similarities and differences between MP3 and WAV. Five different alterations were carried out on each audio file to analyze the differences between the original and the altered file. For a cut-copy alteration, the MP3 or WAV file was opened in Audacity and a different audio segment was pasted into the file; this new file was then analyzed to view the differences. Noise was reduced by adjusting the necessary parameters, and the differences between the new file and the original file were analyzed. After making the necessary changes through the dialog box, the edited audio file was opened in the software named Spek, which produces a graph (spectrograph) of that particular file; this graph was saved for further analysis. The graph of the original audio was then combined with the graph of the edited audio file to reveal the alterations.
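
    A minimal sketch of this kind of spectrogram comparison, assuming two WAV files at the same sample rate (the file names original.wav and edited.wav are placeholders) and using NumPy/SciPy instead of the Spek application named above; it flags the time frames where the two log-magnitude spectrograms differ most.

    import numpy as np
    from scipy.io import wavfile
    from scipy.signal import spectrogram

    def load_mono(path):
        fs, x = wavfile.read(path)           # PCM WAV assumed
        if x.ndim > 1:                       # mix multi-channel down to mono
            x = x.mean(axis=1)
        return fs, x.astype(np.float64)

    fs_a, original = load_mono("original.wav")   # hypothetical file names
    fs_b, suspect = load_mono("edited.wav")
    assert fs_a == fs_b, "files must share a sample rate for a direct comparison"

    n = min(len(original), len(suspect))         # align lengths before differencing
    f, t, s_orig = spectrogram(original[:n], fs_a, nperseg=1024)
    _, _, s_edit = spectrogram(suspect[:n], fs_a, nperseg=1024)

    # Large localized differences point to spliced or pasted regions, while a
    # broadband global shift suggests re-encoding or noise reduction.
    diff_db = 10 * np.log10(s_edit + 1e-12) - 10 * np.log10(s_orig + 1e-12)
    frame_score = np.abs(diff_db).mean(axis=0)   # one score per time frame
    print("most altered frames (s):", t[np.argsort(frame_score)[-5:]])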

  17. Content-based classification and retrieval of audio

    Science.gov (United States)

    Zhang, Tong; Kuo, C.-C. Jay

    1998-10-01

    An on-line audio classification and segmentation system is presented in this research, where audio recordings are classified and segmented into speech, music, several types of environmental sounds and silence based on audio content analysis. This is the first step of our continuing work towards a general content-based audio classification and retrieval system. The extracted audio features include temporal curves of the energy function, the average zero-crossing rate, the fundamental frequency of audio signals, as well as statistical and morphological features of these curves. The classification result is achieved through a threshold-based heuristic procedure. The audio database that we have built, details of feature extraction, classification and segmentation procedures, and experimental results are described. It is shown that, with the proposed new system, audio recordings can be automatically segmented and classified into basic types in real time with an accuracy of over 90 percent. Outlines of further classification of audio into finer types and a query-by-example audio retrieval system on top of the coarse classification are also introduced.
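
    The sketch below is only an illustration of the kind of frame-level analysis described above (short-time energy plus zero-crossing rate with hand-set thresholds); the thresholds and frame sizes are invented, not the paper's values.

    import numpy as np

    def frame_features(x, frame_len=400, hop=160):
        """Short-time energy and zero-crossing rate per frame (x: mono float signal)."""
        feats = []
        for start in range(0, len(x) - frame_len, hop):
            frame = x[start:start + frame_len]
            energy = float(np.mean(frame ** 2))
            zcr = float(np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0)
            feats.append((energy, zcr))
        return feats

    def classify_frame(energy, zcr, silence_thr=1e-4, zcr_speech_thr=0.12):
        # Silence: almost no energy. Speech tends to show higher and more
        # variable zero-crossing rates (voiced/unvoiced alternation) than music.
        if energy < silence_thr:
            return "silence"
        return "speech" if zcr > zcr_speech_thr else "music"

    # toy usage with a noise-like test signal
    rng = np.random.default_rng(0)
    signal = 0.1 * rng.standard_normal(16000)
    labels = [classify_frame(e, z) for e, z in frame_features(signal)]
    print(labels[:10])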

  18. Perceptual Audio Hashing Functions

    Directory of Open Access Journals (Sweden)

    Emin Anarım

    2005-07-01

    Full Text Available Perceptual hash functions provide a tool for fast and reliable identification of content. We present new audio hash functions based on summarization of the time-frequency spectral characteristics of an audio document. The proposed hash functions are based on the periodicity series of the fundamental frequency and on singular-value description of the cepstral frequencies. They are found, on one hand, to perform very satisfactorily in identification and verification tests, and on the other hand, to be very resilient to a large variety of attacks. Moreover, we address the issue of security of hashes and propose a keying technique, and thereby a key-dependent hash function.
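
    For illustration only, the sketch below computes a classic band-energy perceptual hash in the spirit of Haitsma and Kalker's scheme, not the fundamental-frequency/cepstral-SVD hash proposed in this paper; it shows the general idea of summarizing time-frequency characteristics into compact, robust bits that can then be compared by Hamming distance.

    import numpy as np
    from scipy.signal import spectrogram

    def band_energy_hash(x, fs, n_bands=33, nperseg=2048, noverlap=1024):
        f, t, s = spectrogram(x, fs, nperseg=nperseg, noverlap=noverlap)
        edges = np.geomspace(300.0, 2000.0, n_bands + 1)   # log-spaced band edges
        bands = np.zeros((n_bands, s.shape[1]))
        for b in range(n_bands):
            sel = (f >= edges[b]) & (f < edges[b + 1])
            bands[b] = s[sel].sum(axis=0)
        # One bit per band and frame: the sign of the energy difference along
        # both frequency and time, robust to level changes and mild coding.
        d = np.diff(bands, axis=0)
        bits = (d[:, 1:] - d[:, :-1]) > 0
        return bits.astype(np.uint8)                       # shape (n_bands-1, n_frames-1)

    def bit_error_rate(h1, h2):
        n = min(h1.shape[1], h2.shape[1])
        return float(np.mean(h1[:, :n] != h2[:, :n]))      # small value => same content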

  19. DAFX Digital Audio Effects

    CERN Document Server

    2011-01-01

    The rapid development in various fields of Digital Audio Effects, or DAFX, has led to new algorithms and this second edition of the popular book, DAFX: Digital Audio Effects has been updated throughout to reflect progress in the field. It maintains a unique approach to DAFX with a lecture-style introduction into the basics of effect processing. Each effect description begins with the presentation of the physical and acoustical phenomena, an explanation of the signal processing techniques to achieve the effect, followed by a discussion of musical applications and the control of effect parameter

  20. Audio asymmetric watermarking technique

    OpenAIRE

    Furon, Teddy; Moreau, Nicolas; Duhamel, Pierre

    2000-01-01

    This paper presents the application of the promising public key watermarking method¹ to the audio domain. Its detection process does not need the original content nor the secret key used in the embedding process. It is the translation, in the watermarking domain, of a public key pair cryptosystem [1]. We start to build the detector with some basic assumptions. This leads to a hypothesis test based on probability likelihood. But real audio signals do not satisfy the assumption of a Gaussia...

  1. Evaluation of Performance With an Adaptive Digital Remote Microphone System and a Digital Remote Microphone Audio-Streaming Accessory System.

    Science.gov (United States)

    Wolfe, Jace; Duke, Mila Morais; Schafer, Erin; Jones, Christine; Mülder, Hans E; John, Andrew; Hudson, Mary

    2015-09-01

    One purpose of this study was to evaluate the improvement in speech recognition obtained with use of 2 different remote microphone technologies. Another purpose of this study was to determine whether a battery of audiometric measures could predict benefit from use of these technologies. Sentence recognition was evaluated while 17 adults used each of 2 different hearing aids. Performance was evaluated with and without 2 different remote microphone systems. A variety of audiologic measures were administered to determine whether prefitting assessment may predict benefit from remote microphone technology. Use of both remote microphone systems resulted in improvement in speech recognition in quiet and in noise. There were no differences in performance obtained with the 2 different remote microphone technologies in quiet and at low competing noise levels, but use of the digital adaptive remote microphone system provided better speech recognition in the presence of moderate- to high-level noise. The Listening in Spatialized Noise–Sentence Test Prescribed Gain Amplifier (Cameron & Dillon, 2010) measure served as a good predictor of benefit from remote microphone technology. Each remote microphone system improved sentence recognition in noise, but greater improvement was obtained with the digital adaptive system. The Listening in Spatialized Noise–Sentence Test Prescribed Gain Amplifier may serve as a good indicator of benefit from remote microphone technology.

  2. Audio Feedback -- Better Feedback?

    Science.gov (United States)

    Voelkel, Susanne; Mello, Luciane V.

    2014-01-01

    National Student Survey (NSS) results show that many students are dissatisfied with the amount and quality of feedback they get for their work. This study reports on two case studies in which we tried to address these issues by introducing audio feedback to one undergraduate (UG) and one postgraduate (PG) class, respectively. In case study one…

  3. Circuit Bodging : Audio Multiplexer

    NARCIS (Netherlands)

    Roeling, E.; Allen, B.

    2010-01-01

    Audio amplifiers usually come with a single, glaring design flaw: Not enough auxiliary inputs. Not only that, but you’re usually required to press a button to switch between the amplifier’s limited number of inputs. This is unacceptable - we have better things to do than change input channels! In

  4. Embedded Audio Without Beeps

    DEFF Research Database (Denmark)

    Overholt, Daniel; Møbius, Nikolaj Friis

    2014-01-01

    software environments for audio processing) via innovative interfaces that send real-time inputs to such software running on a laptop, mobile device, or small Linux board (e.g., Raspberry Pi or Beagleboard). Basic hardware will be provided, but participants are also encouraged to bring related equipment...

  5. The audio expert everything you need to know about audio

    CERN Document Server

    Winer, Ethan

    2012-01-01

    The Audio Expert is a comprehensive reference that covers all aspects of audio, with many practical, as well as theoretical, explanations. Providing in-depth descriptions of how audio really works, using common sense plain-English explanations and mechanical analogies with minimal math, the book is written for people who want to understand audio at the deepest, most technical level, without needing an engineering degree. It's presented in an easy-to-read, conversational tone, and includes more than 400 figures and photos augmenting the text.The Audio Expert takes th

  6. Efectos digitales de audio con Web Audio API

    OpenAIRE

    GARCÍA CHAPARRO, SAMUEL

    2015-01-01

    The present work consists of a study of the capability of the Web Audio API for processing audio effects in real time. Of all the possible audio effects, the wah-wah, the flanger and the chorus were chosen, effects widely used with the electric guitar. JavaScript functions are created that model the behaviour of the chosen audio effects, running them on an HTML5 web platform. García Chaparro, S. (2015). Efectos digitales de audio con W...

  7. ENERGY STAR Certified Audio Video

    Science.gov (United States)

    Certified models meet all ENERGY STAR requirements as listed in the Version 3.0 ENERGY STAR Program Requirements for Audio Video Equipment that are effective as of May 1, 2013. A detailed listing of key efficiency criteria are available at http://www.energystar.gov/index.cfm?c=audio_dvd.pr_crit_audio_dvd

  8. Newnes audio and Hi-Fi engineer's pocket book

    CERN Document Server

    Capel, Vivian

    2013-01-01

    Newnes Audio and Hi-Fi Engineer's Pocket Book, Second Edition provides concise discussion of several audio topics. The book is comprised of 10 chapters that cover different audio equipment. The coverage of the text includes microphones, gramophones, compact discs, and tape recorders. The book also covers high-quality radio, amplifiers, and loudspeakers. The book then reviews the concepts of sound and acoustics, and presents some facts and formulas relevant to audio. The text will be useful to sound engineers and other professionals whose work involves sound systems.

  9. Audio Signal Decoder, Method for Decoding an Audio Signal and Computer Program Using Cascaded Audio Object Processing Stages

    OpenAIRE

    Hellmuth, O.; Falch, C.; Herre, J.; Hilpert, J.; Ridderbusch, F.; Terentiev, L.

    2010-01-01

    An audio signal decoder for providing an upmix signal representation in dependence on a downmix signal representation and an object-related parametric information comprises an object separator configured to decompose the downmix signal representation, to provide a first audio information describing a first set of one or more audio objects of a first audio object type and a second audio information describing a second set of one or more audio objects of a second audio object type, in dependenc...

  10. Evaluation of Perceived Spatial Audio Quality

    Directory of Open Access Journals (Sweden)

    Jan Berg

    2006-04-01

    Full Text Available The increased use of audio applications capable of conveying enhanced spatial quality puts focus on how such a quality should be evaluated. Different approaches to evaluation of perceived quality are briefly discussed and a new technique is introduced. In a series of experiments, attributes were elicited from subjects, tested and subsequently used for derivation of evaluation scales that were feasible for subjective evaluation of the spatial quality of certain multichannel stimuli. The findings of these experiments led to the development of a novel method for evaluation of spatial audio in surround sound systems. Parts of the method were subsequently implemented in the OPAQUE software prototype designed to facilitate the elicitation process. The prototype was successfully tested in a pilot experiment. The experiments show that attribute scales derived from subjects' personal constructs are functional for evaluation of perceived spatial audio quality. Finally, conclusions on the importance of spatial quality evaluation of new applications are made.

  11. Improving audio chord transcription by exploiting harmonic and metric knowledge

    NARCIS (Netherlands)

    de Haas, W.B.; Rodrigues Magalhães, J.P.; Wiering, F.

    2012-01-01

    We present a new system for chord transcription from polyphonic musical audio that uses domain-specific knowledge about tonal harmony and metrical position to improve chord transcription performance. Low-level pulse and spectral features are extracted from an audio source using the Vamp plugin
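
    As a generic illustration of the low-level matching step in audio chord transcription (not the knowledge-based system described above), the sketch below labels a 12-bin chroma vector with the best-fitting major or minor triad; the paper's contribution is the harmonic and metrical reasoning layered on top of such features.

    import numpy as np

    NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

    def triad_templates():
        major = np.zeros(12)
        major[[0, 4, 7]] = 1.0          # root, major third, perfect fifth
        minor = np.zeros(12)
        minor[[0, 3, 7]] = 1.0          # root, minor third, perfect fifth
        labels, templates = [], []
        for root in range(12):
            labels.append(NOTE_NAMES[root])
            templates.append(np.roll(major, root))
            labels.append(NOTE_NAMES[root] + "m")
            templates.append(np.roll(minor, root))
        return labels, np.array(templates)

    def best_chord(chroma):
        labels, templates = triad_templates()
        chroma = chroma / (np.linalg.norm(chroma) + 1e-12)
        templates = templates / np.linalg.norm(templates, axis=1, keepdims=True)
        return labels[int(np.argmax(templates @ chroma))]

    # a chroma vector with energy at C, E and G is labelled "C"
    print(best_chord(np.array([1.0, 0, 0, 0, 0.9, 0, 0, 0.8, 0, 0, 0, 0.1])))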

  12. Advanced Audio Interface for Phonetic Speech Recognition in a High Noise Environment

    National Research Council Canada - National Science Library

    2000-01-01

    Standard Object Systems, Inc. (SOS) has used its existing technology in phonetic speech recognition, audio signal processing, and multilingual language translation to design and demonstrate an advanced audio interface for speech...

  13. Tag Based Audio Search Engine

    OpenAIRE

    Parameswaran Vellachu; Sunitha Abburu

    2012-01-01

    The volume of the music database is increasing day by day. Getting the required song as per the choice of the listener is a big challenge. Hence, it is really hard to manage this huge quantity in terms of searching and filtering the music database. It is surprising to see that the audio and music industry still relies on very simplistic metadata to describe music files. However, when searching audio resources, an efficient "Tag Based Audio Search Engine" is necessary. The current researc...

  14. Parametric Coding of Stereo Audio

    Directory of Open Access Journals (Sweden)

    Erik Schuijers

    2005-06-01

    Full Text Available Parametric-stereo coding is a technique to efficiently code a stereo audio signal as a monaural signal plus a small amount of parametric overhead to describe the stereo image. The stereo properties are analyzed, encoded, and reinstated in a decoder according to spatial psychoacoustical principles. The monaural signal can be encoded using any (conventional) audio coder. Experiments show that the parameterized description of spatial properties enables a highly efficient, high-quality stereo audio representation.
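
    A toy sketch of the analysis/synthesis idea described above: the encoder produces a mono downmix plus one level-difference parameter per frame, and the decoder reinstates the stereo image from them. Real parametric-stereo coders work per frequency band and also transmit phase and coherence cues; the single full-band parameter and the frame size here are simplifications.

    import numpy as np

    def ps_encode(left, right, frame=1024):
        n_frames = min(len(left), len(right)) // frame
        mono, ild_db = [], []
        for i in range(n_frames):
            l = left[i * frame:(i + 1) * frame]
            r = right[i * frame:(i + 1) * frame]
            mono.append(0.5 * (l + r))                       # mono downmix
            el = np.sum(l ** 2) + 1e-12
            er = np.sum(r ** 2) + 1e-12
            ild_db.append(10.0 * np.log10(el / er))          # inter-channel level difference
        return np.concatenate(mono), np.array(ild_db)

    def ps_decode(mono, ild_db, frame=1024):
        left, right = [], []
        for i, ild in enumerate(ild_db):
            m = mono[i * frame:(i + 1) * frame]
            c = 10.0 ** (ild / 20.0)                         # target left/right amplitude ratio
            gl = c / np.sqrt(1.0 + c ** 2)                   # energy-preserving panning gains
            gr = 1.0 / np.sqrt(1.0 + c ** 2)
            left.append(2.0 * gl * m)                        # factor 2 roughly undoes the 0.5 downmix
            right.append(2.0 * gr * m)
        return np.concatenate(left), np.concatenate(right)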

  15. Audio/Visual Ratios in Commercial Filmstrips.

    Science.gov (United States)

    Gulliford, Nancy L.

    Developed by the Westinghouse Electric Corporation, Video Audio Compressed (VIDAC) is a compressed time, variable rate, still picture television system. This technology made it possible for a centralized library of audiovisual materials to be transmitted over a television channel in very short periods of time. In order to establish specifications…

  16. CERN automatic audio-conference service

    Science.gov (United States)

    Sierra Moral, Rodrigo

    2010-04-01

    Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first European pilot and several improvements (such as billing, security, redundancy...) were implemented based on CERN's recommendations. The new automatic conference system has been operational since the second half of 2006. It is very popular for the users and has doubled the number of conferences in the past two years.

  17. CERN automatic audio-conference service

    International Nuclear Information System (INIS)

    Sierra Moral, Rodrigo

    2010-01-01

    Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first European pilot and several improvements (such as billing, security, redundancy...) were implemented based on CERN's recommendations. The new automatic conference system has been operational since the second half of 2006. It is very popular for the users and has doubled the number of conferences in the past two years.

  18. Class D audio amplifiers for high voltage capacitive transducers

    DEFF Research Database (Denmark)

    Nielsen, Dennis

    Audio reproduction systems contain two key components, the amplifier and the loudspeaker. In the last 20-30 years the technology of audio amplifiers has undergone a fundamental shift of paradigm. Class D audio amplifiers have replaced the linear amplifiers, which suffer from the well-known issues of high volume, weight, and cost. Highly efficient class D amplifiers are now widely available, offering power densities that their linear counterparts cannot match. Unlike the technology of audio amplifiers, the loudspeaker is still based on the traditional electrodynamic transducer invented by C.W. Rice and E.W. Kellog in 1925 [1]. The poor efficiency of the electrodynamic transducer remains a key issue, and a significant limit on the efficiency of complete audio reproduction systems. The geometric limits of the electrodynamic transducer also impose significant limits on the design of loudspeakers...

  19. The Lowdown on Audio Downloads

    Science.gov (United States)

    Farrell, Beth

    2010-01-01

    First offered to public libraries in 2004, downloadable audiobooks have grown by leaps and bounds. According to the Audio Publishers Association, their sales today account for 21% of the spoken-word audio market. It hasn't been easy, however. WMA. DRM. MP3. AAC. File extensions small on letters but very big on consequences for librarians,…

  20. Efficient audio power amplification - challenges

    Energy Technology Data Exchange (ETDEWEB)

    Andersen, Michael A.E.

    2005-07-01

    For more than a decade efficient audio power amplification has evolved and today switch-mode audio power amplification in various forms are the state-of-the-art. The technical steps that lead to this evolution are described and in addition many of the challenges still to be faced and where extensive research and development are needed is covered. (au)

  1. Efficient Audio Power Amplification - Challenges

    DEFF Research Database (Denmark)

    Andersen, Michael Andreas E.

    2005-01-01

    For more than a decade efficient audio power amplification has evolved and today switch-mode audio power amplification in various forms are the state-of-the-art. The technical steps that lead to this evolution are described and in addition many of the challenges still to be faced and where extensive research and development are needed is covered.

  2. Dynamic Bayesian Networks for Audio-Visual Speech Recognition

    Directory of Open Access Journals (Sweden)

    Liang Luhong

    2002-01-01

    Full Text Available The use of visual features in audio-visual speech recognition (AVSR) is justified by both the speech generation mechanism, which is essentially bimodal in audio and visual representation, and by the need for features that are invariant to acoustic noise perturbation. As a result, current AVSR systems demonstrate significant accuracy improvements in environments affected by acoustic noise. In this paper, we describe the use of two statistical models for audio-visual integration, the coupled HMM (CHMM) and the factorial HMM (FHMM), and compare the performance of these models with the existing models used in speaker-dependent audio-visual isolated word recognition. The statistical properties of both the CHMM and FHMM make it possible to model the state asynchrony of the audio and visual observation sequences while preserving their natural correlation over time. In our experiments, the CHMM performs best overall, outperforming all the existing models and the FHMM.

  3. Documentary management of the sport audio-visual information in the generalist televisions

    OpenAIRE

    Jorge Caldera Serrano; Felipe Alonso

    2007-01-01

    The management of sports audio-visual documentation in the information systems of national, regional and local television channels is analyzed. To this end, the documentary chain through which sports audio-visual information passes is traced, with the purpose of analyzing each of its parameters and thereby offering a series of recommendations and norms for the preparation of the sports audio-visual record. Evidently, sports audio-visual documentation differs i...

  4. Low-delay predictive audio coding for the HIVITS HDTV codec

    Science.gov (United States)

    McParland, A. K.; Gilchrist, N. H. C.

    1995-01-01

    The status of work relating to predictive audio coding, as part of the European project on High Quality Video Telephone and HD(TV) Systems (HIVITS), is reported. The predictive coding algorithm is developed, along with six-channel audio coding and decoding hardware. Demonstrations of the audio codec operating in conjunction with the video codec are given.

  5. The Evaluation of Science Learning Program, Technology and Society Application of Audio Bio Harmonic System with Solar Energy to Improve Crop Productivity

    Directory of Open Access Journals (Sweden)

    D. Rosana

    2017-04-01

    Full Text Available One of the greatest challenges in science learning is how to integrate a wide range of basic scientific concepts of physics, chemistry, and biology into an integrated learning material. Research-based teaching material in this area is still very scarce and rarely involves students of science education in its implementation as part of the science, technology and society (STS) learning program. The purpose of this study is to present the evaluation results of STS teaching and learning carried out as a public service in Kulon Progo, Yogyakarta. The program to improve crop productivity through the application of the Audio Bio Harmonic System (ABHS) with solar energy was selected because it utilizes natural animal sounds to open leaf stomata during foliar fertilization, making it suitable for integrated science lessons. The evaluation component used is Stufflebeam's CIPP model. The CIPP evaluation of these activities yielded results in two aspects: the first aspect was improved skills of students and farmers in using ABHS, and the second was food crop productivity: (1) cayenne pepper increased by 76.4%, (2) red onion by 56.3% and (3) maize by 67.8%. In addition, the application of ABHS also affected the rate of plant growth. The outcomes of this study are the STS teaching materials and the ABHS appropriate technology with solar energy.

  6. All About Audio Equalization: Solutions and Frontiers

    Directory of Open Access Journals (Sweden)

    Vesa Välimäki

    2016-05-01

    Full Text Available Audio equalization is a vast and active research area. The extent of research means that one often cannot identify the preferred technique for a particular problem. This review paper bridges those gaps, systemically providing a deep understanding of the problems and approaches in audio equalization, their relative merits and applications. Digital signal processing techniques for modifying the spectral balance in audio signals and applications of these techniques are reviewed, ranging from classic equalizers to emerging designs based on new advances in signal processing and machine learning. Emphasis is placed on putting the range of approaches within a common mathematical and conceptual framework. The application areas discussed herein are diverse, and include well-defined, solvable problems of filter design subject to constraints, as well as newly emerging challenges that touch on problems in semantics, perception and human computer interaction. Case studies are given in order to illustrate key concepts and how they are applied in practice. We also recommend preferred signal processing approaches for important audio equalization problems. Finally, we discuss current challenges and the uncharted frontiers in this field. The source code for methods discussed in this paper is made available at https://code.soundsoftware.ac.uk/projects/allaboutaudioeq.
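
    As a small worked example of one classic equalizer building block covered by such reviews, the sketch below computes second-order (biquad) peaking-filter coefficients using the widely known Audio EQ Cookbook formulas and applies the filter with SciPy; the test signal and parameter values are arbitrary.

    import numpy as np
    from scipy.signal import lfilter

    def peaking_eq(fs, f0, gain_db, q):
        """Biquad peaking EQ: boost/cut of gain_db at centre frequency f0 with width Q."""
        a_lin = 10.0 ** (gain_db / 40.0)
        w0 = 2.0 * np.pi * f0 / fs
        alpha = np.sin(w0) / (2.0 * q)
        b = np.array([1.0 + alpha * a_lin, -2.0 * np.cos(w0), 1.0 - alpha * a_lin])
        a = np.array([1.0 + alpha / a_lin, -2.0 * np.cos(w0), 1.0 - alpha / a_lin])
        return b / a[0], a / a[0]

    # boost 6 dB around 1 kHz on a two-tone test signal sampled at 48 kHz
    fs = 48000
    t = np.arange(fs) / fs
    x = np.sin(2 * np.pi * 1000 * t) + np.sin(2 * np.pi * 100 * t)
    b, a = peaking_eq(fs, f0=1000.0, gain_db=6.0, q=1.0)
    y = lfilter(b, a, x)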

  7. Perancangan Sistem Audio Mobil Berbasiskan Sistem Pakar dan Web

    Directory of Open Access Journals (Sweden)

    Djunaidi Santoso

    2011-12-01

    Full Text Available Designing car audio that fits the user's needs is a fun activity. However, the design often consumes time and money, since the experts have to be consulted several times. For easy access to information in designing a car audio system, as well as for error prevention, a web- and expert-system-based car audio design system is proposed for those who do not have sufficient time or budget to consult the experts directly. The system consists of tutorial modules designed using the HyperText Preprocessor (PHP) and MySQL as the database. The car audio system design is evaluated using the black-box testing method, which focuses on the functional needs of the application. Tests are performed by providing inputs and checking that the outputs correspond to the function of each module. The test results confirm the correspondence between input and output, which means that the program meets the initial goals of the design.

  8. Electrophysiological evidence for Audio-visuo-lingual speech integration.

    Science.gov (United States)

    Treille, Avril; Vilain, Coriandre; Schwartz, Jean-Luc; Hueber, Thomas; Sato, Marc

    2018-01-31

    Recent neurophysiological studies demonstrate that audio-visual speech integration partly operates through temporal expectations and speech-specific predictions. From these results, one common view is that the binding of auditory and visual, lipread, speech cues relies on their joint probability and prior associative audio-visual experience. The present EEG study examined whether visual tongue movements integrate with relevant speech sounds, despite little associative audio-visual experience between the two modalities. A second objective was to determine possible similarities and differences of audio-visual speech integration between unusual audio-visuo-lingual and classical audio-visuo-labial modalities. To this aim, participants were presented with auditory, visual, and audio-visual isolated syllables, with the visual presentation related to either a sagittal view of the tongue movements or a facial view of the lip movements of a speaker, with lingual and facial movements previously recorded by an ultrasound imaging system and a video camera. In line with previous EEG studies, our results revealed an amplitude decrease and a latency facilitation of P2 auditory evoked potentials in both audio-visual-lingual and audio-visuo-labial conditions compared to the sum of unimodal conditions. These results argue against the view that auditory and visual speech cues solely integrate based on prior associative audio-visual perceptual experience. Rather, they suggest that dynamic and phonetic informational cues are sharable across sensory modalities, possibly through a cross-modal transfer of implicit articulatory motor knowledge. Copyright © 2017 Elsevier Ltd. All rights reserved.

  9. Spatial Analysis and Synthesis of Car Audio System and Car Cabin Acoustics with a Compact Microphone Array

    DEFF Research Database (Denmark)

    Sakari, Tervo; Pätynen, Jukka; Kaplanis, Neofytos

    2015-01-01

    This research proposes a spatial sound analysis and synthesis approach for automobile sound systems, where the acquisition of the measurement data is much faster than with the Binaural Car Scanning method. This approach avoids the problems that are typically found with binaural reproduction...

  10. Musical examination to bridge audio data and sheet music

    Science.gov (United States)

    Pan, Xunyu; Cross, Timothy J.; Xiao, Liangliang; Hei, Xiali

    2015-03-01

    The digitalization of audio is commonly implemented for the purpose of convenient storage and transmission of music and songs in today's digital age. Analyzing digital audio for an insightful look at a specific musical characteristic, however, can be quite challenging for various types of applications. Many existing musical analysis techniques can examine a particular piece of audio data. For example, the frequency of digital sound can be easily read and identified at a specific section in an audio file. Based on this information, we could determine the musical note being played at that instant, but what if you want to see a list of all the notes played in a song? While most existing methods help to provide information about a single piece of the audio data at a time, few of them can analyze the available audio file on a larger scale. The research conducted in this work considers how to further utilize the examination of audio data by storing more information from the original audio file. In practice, we develop a novel musical analysis system Musicians Aid to process musical representation and examination of audio data. Musicians Aid solves the previous problem by storing and analyzing the audio information as it reads it rather than tossing it aside. The system can provide professional musicians with an insightful look at the music they created and advance their understanding of their work. Amateur musicians could also benefit from using it solely for the purpose of obtaining feedback about a song they were attempting to play. By comparing our system's interpretation of traditional sheet music with their own playing, a musician could ensure what they played was correct. More specifically, the system could show them exactly where they went wrong and how to adjust their mistakes. In addition, the application could be extended over the Internet to allow users to play music with one another and then review the audio data they produced. This would be particularly
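
    A minimal sketch of the note-identification step the record describes informally, using the standard MIDI note-number formula to map a detected frequency to the nearest equal-tempered note and its deviation in cents; the tuning reference of 440 Hz is an assumption.

    import math

    NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

    def freq_to_note(freq_hz, a4=440.0):
        """Return (note name with octave, deviation in cents) for a frequency in Hz."""
        midi = 69 + 12 * math.log2(freq_hz / a4)   # fractional MIDI note number
        nearest = round(midi)
        cents_off = 100.0 * (midi - nearest)       # how far out of tune the note is
        name = NOTE_NAMES[nearest % 12] + str(nearest // 12 - 1)
        return name, cents_off

    print(freq_to_note(442.3))   # roughly ('A4', +9 cents)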

  11. Real-Time Perceptual Model for Distraction in Interfering Audio-on-Audio Scenarios

    DEFF Research Database (Denmark)

    Rämö, Jussi; Bech, Søren; Jensen, Søren Holdt

    2017-01-01

    This letter proposes a real-time perceptual model predicting the experienced distraction occurring in interfering audio-on-audio situations. The proposed model improves the computational efficiency of a previous distraction model, which cannot provide predictions in real time. The chosen approach ... model. Thus, while providing similar accuracy as the previous model, the proposed model can be run in real time. The proposed distraction model can be used as a tool for evaluating and optimizing sound-zone systems. Furthermore, the real-time capability of the model introduces new possibilities...

  12. Portable audio electronics for impedance-based measurements in microfluidics

    International Nuclear Information System (INIS)

    Wood, Paul; Sinton, David

    2010-01-01

    We demonstrate the use of audio electronics-based signals to perform on-chip electrochemical measurements. Cell phones and portable music players are examples of consumer electronics that are easily operated and are ubiquitous worldwide. Audio output (play) and input (record) signals are voltage based and contain frequency and amplitude information. A cell phone, laptop soundcard and two compact audio players are compared with respect to frequency response; the laptop soundcard provides the most uniform frequency response, while the cell phone performance is found to be insufficient. The audio signals in the common portable music players and laptop soundcard operate in the range of 20 Hz to 20 kHz and are found to be applicable, as voltage input and output signals, to impedance-based electrochemical measurements in microfluidic systems. Validated impedance-based measurements of concentration (0.1–50 mM), flow rate (2–120 µL min⁻¹) and particle detection (32 µm diameter) are demonstrated. The prevailing, lossless, wave audio file format is found to be suitable for data transmission to and from external sources, such as a centralized lab, and the cost of all hardware (in addition to audio devices) is ∼10 USD. The utility demonstrated here, in combination with the ubiquitous nature of portable audio electronics, presents new opportunities for impedance-based measurements in portable microfluidic systems. (technical note)
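
    A minimal sketch of the underlying measurement principle, assuming the played and recorded buffers are already available as NumPy arrays (the actual sound-card I/O and the reference network that converts the result into an impedance are omitted); the response data here are synthetic placeholders.

    import numpy as np

    fs = 44100                         # audio sample rate
    f0 = 1000.0                        # excitation frequency inside the 20 Hz - 20 kHz band
    t = np.arange(fs) / fs             # one second of signal
    excitation = 0.5 * np.sin(2 * np.pi * f0 * t)

    # placeholder "recorded" signal: attenuated, phase-shifted tone plus noise
    rng = np.random.default_rng(1)
    response = 0.21 * np.sin(2 * np.pi * f0 * t - 0.4) + 0.01 * rng.standard_normal(fs)

    # lock-in style demodulation: multiply by quadrature references and average,
    # so components at other frequencies (noise, drift) average towards zero
    i_part = 2.0 * np.mean(response * np.sin(2 * np.pi * f0 * t))
    q_part = 2.0 * np.mean(response * np.cos(2 * np.pi * f0 * t))
    amplitude = np.hypot(i_part, q_part)       # magnitude of the response at f0
    phase = np.arctan2(q_part, i_part)         # phase shift relative to the excitation
    print(amplitude, phase)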

  13. Instrumental Landing Using Audio Indication

    Science.gov (United States)

    Burlak, E. A.; Nabatchikov, A. M.; Korsun, O. N.

    2018-02-01

    The paper proposes an audio indication method for presenting to a pilot the information regarding the relative position of an aircraft in precision piloting tasks. The implementation of the method is presented, and the use of such audio signal parameters as loudness, frequency and modulation is discussed. To confirm the operability of the audio indication channel, experiments using a modern aircraft simulation facility were carried out. Simulated instrument landings were performed using the proposed audio method to indicate the aircraft's deviations from the glide path. The results proved compatible with simulated instrument landings using the traditional glideslope pointers. This encourages further development of the method for other precision piloting tasks.
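
    A minimal sketch of such a mapping, with invented constants rather than the parameters used in the study: pitch encodes the direction of the glide-path deviation and loudness encodes its magnitude.

    import numpy as np

    def deviation_to_tone(dev_deg, fs=44100, dur=0.2,
                          f_center=600.0, f_span=300.0, max_dev=0.7):
        """dev_deg > 0 means above the glide path, dev_deg < 0 means below it."""
        d = float(np.clip(dev_deg / max_dev, -1.0, 1.0))
        freq = f_center + f_span * d       # pitch encodes the direction of the error
        gain = 0.2 + 0.8 * abs(d)          # loudness encodes the size of the error
        t = np.arange(int(fs * dur)) / fs
        return gain * np.sin(2 * np.pi * freq * t)

    # a pilot flying 0.3 degrees above the glide path hears a higher, louder beep
    beep = deviation_to_tone(0.3)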

  14. WLAN Technologies for Audio Delivery

    Directory of Open Access Journals (Sweden)

    Nicolas-Alexander Tatlas

    2007-01-01

    Full Text Available Audio delivery and reproduction for home or professional applications may greatly benefit from the adoption of digital wireless local area network (WLAN) technologies. The most challenging aspect of such integration relates to the synchronized and robust real-time streaming of multiple audio channels to multipoint receivers, for example, wireless active speakers. Here, it is shown that current WLAN solutions are susceptible to transmission errors. A detailed study of the IEEE802.11e protocol (currently under ratification) is also presented and all relevant distortions are assessed via an analytical and experimental methodology. A novel synchronization scheme is also introduced, allowing optimized playback for multiple receivers. The perceptual audio performance is assessed for both stereo and 5-channel applications based on either PCM or compressed audio signals.

  15. Definición de audio

    OpenAIRE

    Montañez, Luis A.; Cabrera, Juan G.

    2015-01-01

    A description of the meaning of Audio as an object of study by different authors, and its differentiation from the meaning of Sound. Audio is defined as an electrical signal whose waveform is similar to that of a sound signal. The sound signal corresponds to pressure in a physical medium, whereas the Audio signal is a voltage defined as an analogue signal. Thus, Audio is conceived as an electrical, analogue signal, as opposed to a s...

  16. Definición de audio

    OpenAIRE

    Montañez Carrillo, Luis A.; Cabrera, Juan G.

    2015-01-01

    A description of the meaning of Audio as an object of study by different authors, and its differentiation from the meaning of Sound. In this way, Audio is defined as an electrical signal whose waveform is similar to that of a sound signal, bearing in mind that the sound signal corresponds to pressure in a physical medium, whereas the Audio signal is a voltage defined as an analogue signal. In this line of thought, Audio is conceived as a s...

  17. ENERGY STAR Certified Audio Video

    Data.gov (United States)

    U.S. Environmental Protection Agency — Certified models meet all ENERGY STAR requirements as listed in the Version 3.0 ENERGY STAR Program Requirements for Audio Video Equipment that are effective as of...

  18. Tourism research and audio methods

    DEFF Research Database (Denmark)

    Jensen, Martin Trandberg

    2016-01-01

    • Audio methods enrich sensuous tourism ethnographies. • The note suggests five research avenues for future auditory scholarship. • Sensuous tourism research has neglected the role of sounds in embodied tourism experiences.

  19. Audio Steganography with Embedded Text

    Science.gov (United States)

    Teck Jian, Chua; Chai Wen, Chuah; Rahman, Nurul Hidayah Binti Ab.; Hamid, Isredza Rahmi Binti A.

    2017-08-01

    Audio steganography is about hiding a secret message inside audio. It is a technique used to secure the transmission of secret information or to hide its existence. It may also provide confidentiality to the secret message if the message is encrypted. To date, most steganography software, such as Mp3Stego and DeepSound, uses a block cipher such as the Advanced Encryption Standard or the Data Encryption Standard to encrypt the secret message. This is good practice for security. However, the encrypted message may become too long to embed in the audio and cause distortion of the cover audio if the secret message is too long. Hence, there is a need to encrypt the message with a stream cipher before embedding it into the audio. This is because a stream cipher provides bit-by-bit encryption, whereas a block cipher encrypts fixed-length blocks of bits, which results in a longer output compared to a stream cipher. Hence, an audio steganography scheme that embeds text encrypted with the Rivest Cipher 4 stream cipher is designed, developed and tested in this project.
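
    A minimal sketch of the scheme outlined above, assuming the cover audio is available as an array of 16-bit PCM samples: the message is encrypted with the RC4 stream cipher and the ciphertext bits overwrite the least significant bits of the samples. This is an illustration of the design choice (stream encryption keeps the payload as long as the message itself), not the project's actual implementation, and RC4 is shown only for fidelity to the text.

    import numpy as np

    def rc4_keystream(key: bytes, n: int) -> bytes:
        s = list(range(256))
        j = 0
        for i in range(256):                              # key-scheduling algorithm
            j = (j + s[i] + key[i % len(key)]) % 256
            s[i], s[j] = s[j], s[i]
        out, i, j = bytearray(), 0, 0
        for _ in range(n):                                # pseudo-random generation
            i = (i + 1) % 256
            j = (j + s[i]) % 256
            s[i], s[j] = s[j], s[i]
            out.append(s[(s[i] + s[j]) % 256])
        return bytes(out)

    def embed(samples: np.ndarray, message: bytes, key: bytes) -> np.ndarray:
        cipher = bytes(m ^ k for m, k in zip(message, rc4_keystream(key, len(message))))
        bits = np.unpackbits(np.frombuffer(cipher, dtype=np.uint8))
        assert len(bits) <= len(samples), "cover audio too short for this message"
        stego = samples.copy()
        stego[:len(bits)] = (stego[:len(bits)] & ~1) | bits   # overwrite the LSBs
        return stego

    def extract(stego: np.ndarray, n_bytes: int, key: bytes) -> bytes:
        bits = (stego[:n_bytes * 8] & 1).astype(np.uint8)
        cipher = np.packbits(bits).tobytes()
        return bytes(c ^ k for c, k in zip(cipher, rc4_keystream(key, n_bytes)))

    # toy usage on a silent int16 "cover" signal
    cover = np.zeros(8000, dtype=np.int16)
    stego = embed(cover, b"hidden", b"secret key")
    print(extract(stego, 6, b"secret key"))               # b'hidden'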

  20. An inconclusive digital audio authenticity examination: a unique case.

    Science.gov (United States)

    Koenig, Bruce E; Lacey, Douglas S

    2012-01-01

    This case report sets forth an authenticity examination of 35 encrypted, proprietary-format digital audio files containing recorded telephone conversations between two codefendants in a criminal matter. The codefendant who recorded the conversations did so on a recording system he developed; additionally, he was both a forensic audio authenticity examiner, who had published and presented in the field, and was the head of a professional audio society's writing group for authenticity standards. The authors conducted the examination of the recordings following nine laboratory steps of the peer-reviewed and published 11-step digital audio authenticity protocol. Based considerably on the codefendant's direct involvement with the development of the encrypted audio format, his experience in the field of forensic audio authenticity analysis, and the ease with which the audio files could be accessed, converted, edited in the gap areas, and reconstructed in such a way that the processes were undetected, the authors concluded that the recordings could not be scientifically authenticated through accepted forensic practices. © 2011 American Academy of Forensic Sciences.

  1. Quality Enhancement of Compressed Audio Based on Statistical Conversion

    Directory of Open Access Journals (Sweden)

    Mouchtaris Athanasios

    2008-01-01

    Full Text Available Most audio compression formats are based on the idea of low bit rate transparent encoding. As these types of audio signals are starting to migrate from portable players with inexpensive headphones to higher quality home audio systems, it is becoming evident that higher bit rates may be required to maintain transparency. We propose a novel method that enhances low bit rate encoded audio segments by applying multiband audio resynthesis methods in a postprocessing stage. Our algorithm employs the highly flexible Generalized Gaussian mixture model which offers a more accurate representation of audio features than the Gaussian mixture model. A novel residual conversion technique is applied which proves to significantly improve the enhancement performance without excessive overhead. In addition, both cepstral and residual errors are dramatically decreased by a feature-alignment scheme that employs a sorting transformation. Some improvements regarding the quantization step are also described that enable us to further reduce the algorithm overhead. Signal enhancement examples are presented and the results show that the overhead size incurred by the algorithm is a fraction of the uncompressed signal size. Our results show that the resulting audio quality is comparable to that of a standard perceptual codec operating at approximately the same bit rate.

  2. Quality Enhancement of Compressed Audio Based on Statistical Conversion

    Directory of Open Access Journals (Sweden)

    Chris Kyriakakis

    2008-07-01

    Full Text Available Most audio compression formats are based on the idea of low bit rate transparent encoding. As these types of audio signals are starting to migrate from portable players with inexpensive headphones to higher quality home audio systems, it is becoming evident that higher bit rates may be required to maintain transparency. We propose a novel method that enhances low bit rate encoded audio segments by applying multiband audio resynthesis methods in a postprocessing stage. Our algorithm employs the highly flexible Generalized Gaussian mixture model which offers a more accurate representation of audio features than the Gaussian mixture model. A novel residual conversion technique is applied which proves to significantly improve the enhancement performance without excessive overhead. In addition, both cepstral and residual errors are dramatically decreased by a feature-alignment scheme that employs a sorting transformation. Some improvements regarding the quantization step are also described that enable us to further reduce the algorithm overhead. Signal enhancement examples are presented and the results show that the overhead size incurred by the algorithm is a fraction of the uncompressed signal size. Our results show that the resulting audio quality is comparable to that of a standard perceptual codec operating at approximately the same bit rate.

  3. Modeling Audio Fingerprints : Structure, Distortion, Capacity

    NARCIS (Netherlands)

    Doets, P.J.O.

    2010-01-01

    An audio fingerprint is a compact low-level representation of a multimedia signal. An audio fingerprint can be used to identify audio files or fragments in a reliable way. The use of audio fingerprints for identification consists of two phases. In the enrollment phase known content is fingerprinted,

  4. An introduction to audio content analysis applications in signal processing and music informatics

    CERN Document Server

    Lerch, Alexander

    2012-01-01

    "With the proliferation of digital audio distribution over digital media, audio content analysis is fast becoming a requirement for designers of intelligent signal-adaptive audio processing systems. Written by a well-known expert in the field, this book provides quick access to different analysis algorithms and allows comparison between different approaches to the same task, making it useful for newcomers to audio signal processing and industry experts alike. A review of relevant fundamentals in audio signal processing, psychoacoustics, and music theory, as well as downloadable MATLAB files are also included"--

  5. Introduction to audio analysis a MATLAB approach

    CERN Document Server

    Giannakopoulos, Theodoros

    2014-01-01

    Introduction to Audio Analysis serves as a standalone introduction to audio analysis, providing theoretical background to many state-of-the-art techniques. It covers the essential theory necessary to develop audio engineering applications, but also uses programming techniques, notably MATLAB®, to take a more applied approach to the topic. Basic theory and reproducible experiments are combined to demonstrate theoretical concepts from a practical point of view and provide a solid foundation in the field of audio analysis. Audio feature extraction, audio classification, audio segmentation, au

  6. Sharing Annotated Audio Recordings of Clinic Visits With Patients-Development of the Open Recording Automated Logging System (ORALS): Study Protocol.

    Science.gov (United States)

    Barr, Paul J; Dannenberg, Michelle D; Ganoe, Craig H; Haslett, William; Faill, Rebecca; Hassanpour, Saeed; Das, Amar; Arend, Roger; Masel, Meredith C; Piper, Sheryl; Reicher, Haley; Ryan, James; Elwyn, Glyn

    2017-07-06

    Providing patients with recordings of their clinic visits enhances patient and family engagement, yet few organizations routinely offer recordings. Challenges exist for organizations and patients, including data safety and navigating lengthy recordings. A secure system that allows patients to easily navigate recordings may be a solution. The aim of this project is to develop and test an interoperable system to facilitate routine recording, the Open Recording Automated Logging System (ORALS), with the aim of increasing patient and family engagement. ORALS will consist of (1) technically proficient software using automated machine learning technology to enable accurate and automatic tagging of in-clinic audio recordings (tagging involves identifying elements of the clinic visit most important to patients [eg, treatment plan] on the recording) and (2) a secure, easy-to-use Web interface enabling the upload and accurate linkage of recordings to patients, which can be accessed at home. We will use a mixed methods approach to develop and formatively test ORALS in 4 iterative stages: case study of pioneer clinics where recordings are currently offered to patients, ORALS design and user experience testing, ORALS software and user interface development, and rapid cycle testing of ORALS in a primary care clinic, assessing impact on patient and family engagement. Dartmouth's Informatics Collaboratory for Design, Development and Dissemination team, patients, patient partners, caregivers, and clinicians will assist in developing ORALS. We will implement a publication plan that includes a final project report and articles for peer-reviewed journals. In addition to this work, we will regularly report on our progress using popular relevant Tweet chats and online using our website, www.openrecordings.org. We will disseminate our work at relevant conferences (eg, Academy Health, Health Datapalooza, and the Institute for Healthcare Improvement Quality Forums). Finally, Iora Health, a

  7. The circadian neuropeptide PDF signals preferentially through a specific adenylate cyclase isoform AC3 in M pacemakers of Drosophila.

    Directory of Open Access Journals (Sweden)

    Laura B Duvall

    Full Text Available The neuropeptide Pigment Dispersing Factor (PDF) is essential for normal circadian function in Drosophila. It synchronizes the phases of M pacemakers, while in E pacemakers it decelerates their cycling and supports their amplitude. The PDF receptor (PDF-R) is present in both M and subsets of E cells. Activation of PDF-R stimulates cAMP increases in vitro and in M cells in vivo. The present study asks: What is the identity of downstream signaling components that are associated with PDF receptor in specific circadian pacemaker neurons? Using live imaging of intact fly brains and transgenic RNAi, we show that adenylate cyclase AC3 underlies PDF signaling in M cells. Genetic disruptions of AC3 specifically disrupt PDF responses: they do not affect other Gs-coupled GPCR signaling in M cells, they can be rescued, and they do not represent developmental alterations. Knockdown of the Drosophila AKAP-like scaffolding protein Nervy also reduces PDF responses. Flies with AC3 alterations show behavioral syndromes consistent with known roles of M pacemakers as mediated by PDF. Surprisingly, disruption of AC3 does not alter PDF responses in E cells, the PDF-R(+) LNd. Within M pacemakers, PDF-R couples preferentially to a single AC, but PDF-R association with a different AC(s) is needed to explain PDF signaling in the E pacemakers. Thus critical pathways of circadian synchronization are mediated by highly specific second messenger components. These findings support a hypothesis that PDF signaling components within target cells are sequestered into "circadian signalosomes," whose compositions differ between E and M pacemaker cell types.

  8. Location audio simplified: capturing your audio and your audience

    CERN Document Server

    Miles, Dean

    2014-01-01

    From the basics of using camera, handheld, lavalier, and shotgun microphones to camera calibration and mixer set-ups, Location Audio Simplified unlocks the secrets to clean and clear broadcast quality audio no matter what challenges you face. Author Dean Miles applies his twenty-plus years of experience as a professional location operator to teach the skills, techniques, tips, and secrets needed to produce high-quality production sound on location. Humorous and thoroughly practical, the book covers a wide array of topics, such as location selection, field mixing, boo

  9. A Joint Audio-Visual Approach to Audio Localization

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Christensen, Mads Græsbøll

    2015-01-01

    Localization of audio sources is an important research problem, e.g., to facilitate noise reduction. In the recent years, the problem has been tackled using distributed microphone arrays (DMA). A common approach is to apply direction-of-arrival (DOA) estimation on each array (denoted as nodes...... time-of-flight cameras. Moreover, we propose an optimal method for weighting such DOA and range information for audio localization. Our experiments on both synthetic and real data show that there is a clear, potential advantage of using the joint audiovisual localization framework....

  10. Investigating the impact of audio instruction and audio-visual biofeedback for lung cancer radiation therapy

    Science.gov (United States)

    George, Rohini

    function could be approximated to a normal distribution function. A statistical analysis was also performed to investigate if a patient's physical, tumor or general characteristics played a role in identifying whether he/she responded positively to the coaching type---signified by a reduction in the variability of respiratory motion. The analysis demonstrated that, although there were some characteristics like disease type and dose per fraction that were significant with respect to time-independent analysis, there were no significant time trends observed for the inter-session or intra-session analysis. Based on patient feedback with the existing audio-visual biofeedback system used for the study and research performed on other feedback systems, an improved audio-visual biofeedback system was designed. It is hoped the widespread clinical implementation of audio-visual biofeedback for radiotherapy will improve the accuracy of lung cancer radiotherapy.

  11. Can audio recording improve patients' recall of outpatient consultations?

    DEFF Research Database (Denmark)

    Wolderslund, Maiken; Kofoed, Poul-Erik; Axboe, Mette

    Introduction In order to give patients the possibility to listen to their consultation again, we have designed a system which gives the patients access to digital audio recordings of their consultations. An Interactive Voice Response platform enables the audio recording and gives the patients access...... to replay their consultation. The intervention is evaluated in a randomised controlled trial with 5,460 patients in order to determine whether providing patients with a digital audio recording of the consultation affects the patients’ overall perception of their consultation. In addition to this primary...... objective we want to investigate if replay of the consultations improves the patients’ recall of the information given. Methods Interviews are carried out with 40 patients whose consultations have been audio recorded. Patients are divided into two groups, those who have listened to their consultation...

  12. Hydrothermal system beneath the crater of Tarumai volcano, Japan : 3-D resistivity structure revealed using audio-magnetotellurics and induction vector

    OpenAIRE

    Yamaya, Yusuke; Mogi, Toru; Hashimoto, Takeshi; Ichihara, Hiroshi

    2009-01-01

    Audio-magnetotelluric (AMT) measurements were recorded in the crater area of Tarumai volcano, northeastern Japan. This survey revealed specific structures beneath the lava dome of Tarumai volcano, enabling us to interpret the relationship between the subsurface structure and fumarolic activity in the vicinity of a lava dome. Three-dimensional resistivity modeling was performed to achieve this purpose. The measured induction vectors pointed toward the center of the dome, implying the topogr...

  13. Realization of guitar audio effects using methods of digital signal processing

    Science.gov (United States)

    Buś, Szymon; Jedrzejewski, Konrad

    2015-09-01

    The paper is devoted to studies on the possibilities of realizing guitar audio effects by means of digital signal processing methods. As a result of this research, selected audio effects corresponding to the specifics of guitar sound were realized as a real-time system called Digital Guitar Multi-effect. Before implementation in the system, the selected effects were investigated using a dedicated application with a graphical user interface created in the Matlab environment. In the second stage, the real-time system based on a microcontroller and an audio codec was designed and realized. The system is designed to perform audio effects on the output signal of an electric guitar.
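
    As a toy illustration of the kind of effect such a system implements, the sketch below applies a feedback delay (echo) to a signal in Python/NumPy. It is a hypothetical, simplified example, not the Digital Guitar Multi-effect firmware; the delay time, feedback gain, and wet/dry mix are assumed values.

        import numpy as np

        def feedback_delay(x, sr, delay_s=0.3, feedback=0.4, mix=0.5):
            """Apply a simple feedback delay (echo) effect to a mono signal x."""
            d = int(delay_s * sr)                    # delay length in samples
            y = np.copy(x).astype(np.float64)
            for n in range(d, len(x)):
                y[n] = x[n] + feedback * y[n - d]    # feed delayed output back in
            return (1.0 - mix) * x + mix * y         # blend dry and wet signals

        # usage sketch: apply the effect to one second of a 440 Hz test tone
        sr = 44100
        t = np.arange(sr) / sr
        guitar_like = 0.5 * np.sin(2 * np.pi * 440 * t)
        processed = feedback_delay(guitar_like, sr)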

  14. Audio power amplifier design handbook

    CERN Document Server

    Self, Douglas

    2013-01-01

    This book is essential for audio power amplifier designers and engineers for one simple reason...it enables you as a professional to develop reliable, high-performance circuits. The author, Douglas Self, covers the major issues of distortion and linearity, power supplies, overload, DC-protection and reactive loading. He also tackles unusual forms of compensation and distortion produced by capacitors and fuses. This completely updated fifth edition includes four NEW chapters including one on The XD Principle, invented by the author, and used by Cambridge Audio. Cro

  15. Engaging Students with Audio Feedback

    Science.gov (United States)

    Cann, Alan

    2014-01-01

    Students express widespread dissatisfaction with academic feedback. Teaching staff perceive a frequent lack of student engagement with written feedback, much of which goes uncollected or unread. Published evidence shows that audio feedback is highly acceptable to students but is underused. This paper explores methods to produce and deliver audio…

  16. Haptic and Audio Interaction Design

    DEFF Research Database (Denmark)

    This book constitutes the refereed proceedings of the 5th International Workshop on Haptic and Audio Interaction Design, HAID 2010 held in Copenhagen, Denmark, in September 2010. The 21 revised full papers presented were carefully reviewed and selected for inclusion in the book. The papers are or...

  17. Audio watermark: a comprehensive foundation using Matlab

    CERN Document Server

    Lin, Yiqing

    2015-01-01

    This book illustrates the commonly used and novel approaches to audio watermarking for copyright protection. The author provides a theoretical and practical step-by-step guide to data hiding in audio signals such as music, speech and broadcasts. New techniques developed by the authors are fully explained, together with MATLAB programs for audio watermarking and audio quality assessment, and methods for objectively predicting the perceptual quality of watermarked audio signals are also discussed. The book explains the theoretical basics of the commonly used audio watermarking techniques, discusses the methods used to objectively and subjectively assess the quality of the audio signals, and provides comprehensive, well-tested MATLAB programs that can be used efficiently to watermark any audio media.

  18. Bit rates in audio source coding

    NARCIS (Netherlands)

    Veldhuis, Raymond N.J.

    1992-01-01

    The goal is to introduce and solve the audio coding optimization problem. Psychoacoustic results such as masking and excitation pattern models are combined with results from rate distortion theory to formulate the audio coding optimization problem. The solution of the audio optimization problem is a

  19. Audio Frequency Analysis in Mobile Phones

    Science.gov (United States)

    Aguilar, Horacio Munguía

    2016-01-01

    A new experiment using mobile phones is proposed in which the phone's audio frequency response is analyzed using the audio port to input an external signal and obtain a measurable output. This experiment shows how the limited audio bandwidth used in mobile telephony is the main cause of the poor speech quality in this service. A brief discussion is…

  20. 50 CFR 27.72 - Audio equipment.

    Science.gov (United States)

    2010-10-01

    50 CFR 27.72 (Wildlife and Fisheries; United States Fish and Wildlife Service, Department of the Interior): Audio equipment. The operation or use of audio devices including radios, recording and playback devices...

  1. Audio Satellites – Overhearing Everyday Life

    DEFF Research Database (Denmark)

    Breinbjerg, Morten; Højlund, Marie Koldkjær; Riis, Morten S.

    2016-01-01

    The project “Audio Satellites – overhearing everyday life” consists of a number of mobile listening devices (audio satellites) from which sound is distributed in real time to a server and made available for listening and mixing through a web interface. The audio satellites can either be carried...

  2. 36 CFR 2.12 - Audio disturbances.

    Science.gov (United States)

    2010-07-01

    36 CFR 2.12 (Parks, Forests, and Public Property; Resource Protection, Public Use and Recreation): Audio disturbances. (a) The following are prohibited..., motorized toy, or an audio device, such as a radio, television set, tape deck or musical instrument, in a...

  3. Evaluating Visual Information Provided by Audio Description.

    Science.gov (United States)

    Peli, E.; And Others

    1996-01-01

    The video and standard audio portions of 2 television programs were presented to 25 adults with low vision and 24 adults with normal vision; 29 additional subjects only heard the standard audio portions. Subjects then answered questions based on audio descriptions (AD) provided by Descriptive Video Service. Results indicated that some AD…

  4. Audio Format Change From Analog to Digital Audio Using the Sony Sound Forge 9.0

    OpenAIRE

    Faisal Safrudin; Yulina Yulina, SKom, MMSI

    2007-01-01

    Converting analog audio to digital audio is useful not only for journalists but also for general audiences. With earlier technology, almost everyone used analog audio in the form of cassettes. As technology has developed, the analog audio format is rarely used now that digital audio is available; analog recordings can be converted to digital audio using Sony Sound Forge 9.0. The author will discuss ...

  5. Effects for augmented reality audio headsets

    OpenAIRE

    Martí i Rabadán, Miquel

    2014-01-01

    Augmented reality is a real-time combination of real and virtual worlds. In augmented reality audio (ARA) real surrounding sounds are mixed with virtual sound sources. In this bachelor’s degree thesis a digital, real-time hear-through system (HTS) is implemented for the acoustical transparency of an ARA headset. It is achieved by adding back the sounds that have been attenuated by the isolation characteristics of the headphone itself. The surrounding sounds are recorded on both ears...

  6. Analog Audio Format Changes From Being Digital Audio Using Sony Sound Forge 9.0

    OpenAIRE

    Faisal Safrudin; Yulina Yulina

    2010-01-01

    Converting analog audio to digital audio is useful not only for journalists but also for the general public. With earlier technology, almost everyone used analog audio in the form of cassettes. As technology has developed, the analog audio format is rarely used now that digital audio is available; analog recordings can be converted to digital audio using Sony Sound Forge 9.0. The author...

  7. Virtual environment display for a 3D audio room simulation

    Science.gov (United States)

    Chapin, William L.; Foster, Scott H.

    1992-01-01

    The development of a virtual environment simulation system integrating a 3D acoustic audio model with an immersive 3D visual scene is discussed. The system complements the acoustic model and is specified to: allow the listener to freely move about the space, a room of manipulable size, shape, and audio character, while interactively relocating the sound sources; reinforce the listener's feeling of telepresence in the acoustical environment with visual and proprioceptive sensations; enhance the audio with the graphic and interactive components, rather than overwhelm or reduce it; and serve as a research testbed and technology transfer demonstration. The hardware/software design of two demonstration systems, one installed and one portable, are discussed through the development of four iterative configurations.

  8. Audio-visual interactions in environment assessment.

    Science.gov (United States)

    Preis, Anna; Kociński, Jędrzej; Hafke-Dys, Honorata; Wrzosek, Małgorzata

    2015-08-01

    The aim of the study was to examine how visual and audio information influences audio-visual environment assessment. Original audio-visual recordings were made at seven different places in the city of Poznań. Participants of the psychophysical experiments were asked to rate, on a numerical standardized scale, the degree of comfort they would feel if they were in such an environment. The assessments of audio-visual comfort were carried out in a laboratory in four different conditions: (a) audio samples only, (b) original audio-visual samples, (c) video samples only, and (d) mixed audio-visual samples. The general results of this experiment showed a significant difference between the investigated conditions, but not for all the investigated samples. There was a significant improvement in comfort assessment when visual information was added (in only three out of seven cases) when conditions (a) and (b) were compared. On the other hand, the results show that the comfort assessment of audio-visual samples could be changed by manipulating the audio rather than the video part of the audio-visual sample. Finally, it seems that people could differentiate audio-visual representations of a given place in the environment based on the sound sources' compositions rather than on the sound level. Object identification is responsible for both landscape and soundscape grouping.

  9. Conflicting audio-haptic feedback in physically based simulation of walking sounds

    DEFF Research Database (Denmark)

    Turchet, Luca; Serafin, Stefania; Dimitrov, Smilen

    2010-01-01

    We describe an audio-haptic experiment conducted using a system which simulates in real-time the auditory and haptic sensation of walking on different surfaces. The system is based on physical models, that drive both the haptic and audio synthesizers, and a pair of shoes enhanced with sensors...

  10. Highlight summarization in golf videos using audio signals

    Science.gov (United States)

    Kim, Hyoung-Gook; Kim, Jin Young

    2008-01-01

    In this paper, we present an automatic summarization of highlights in golf videos based on audio information alone, without video information. The proposed highlight summarization is carried out based on semantic audio segmentation and the detection of action units from audio signals. Studio speech, field speech, music, and applause are segmented by means of sound classification. Swings are detected by means of impulse onset detection. Swing and applause sounds together form a complete action unit, while studio speech and music parts are used to anchor the program structure. With the advantage of highly precise detection of applause, highlights are extracted effectively. Our experimental results show high classification precision on 18 golf games, which proves that the proposed system is effective and computationally efficient enough to apply the technology to embedded consumer electronic devices.
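
    The impulse-onset idea used for swing detection can be caricatured as looking for sudden jumps in short-term energy. The Python/NumPy sketch below is a hypothetical simplification, not the authors' detector; the frame length and threshold are assumed values.

        import numpy as np

        def detect_onsets(x, sr, frame_len=1024, threshold=4.0):
            """Return frame times (seconds) where short-term energy jumps sharply."""
            n_frames = len(x) // frame_len
            frames = x[:n_frames * frame_len].reshape(n_frames, frame_len)
            energy = (frames ** 2).mean(axis=1)           # short-term energy per frame
            ratio = energy[1:] / (energy[:-1] + 1e-12)    # energy jump between frames
            onset_frames = np.where(ratio > threshold)[0] + 1
            return onset_frames * frame_len / sr

        # usage sketch: silence followed by a sharp burst should yield one onset
        sr = 16000
        x = np.concatenate([np.zeros(sr), 0.8 * np.random.randn(sr // 10)])
        print(detect_onsets(x, sr))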

  11. AudioRegent: Exploiting SimpleADL and SoX for Digital Audio Delivery

    Directory of Open Access Journals (Sweden)

    Nitin Arora

    2010-06-01

    Full Text Available AudioRegent is a command-line Python script currently being used by the University of Alabama Libraries’ Digital Services to create web-deliverable MP3s from regions within archival audio files. In conjunction with a small-footprint XML file called SimpleADL and SoX, an open-source command-line audio editor, AudioRegent batch processes archival audio files, allowing for one or many user-defined regions, particular to each audio file, to be extracted with additional audio processing in a transparent manner that leaves the archival audio file unaltered. Doing so has alleviated many of the tensions of cumbersome workflows, complicated documentation, preservation concerns, and reliance on expensive closed-source GUI audio applications.
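
    As a rough sketch of the underlying approach (not the AudioRegent code itself), a Python script can hand region boundaries, for example parsed from a SimpleADL-style XML file, to SoX on the command line. The file names, region times, and output rate below are hypothetical examples.

        import subprocess

        def extract_region(archival_wav, out_mp3, start_s, duration_s, rate=44100):
            """Use SoX to cut one region from an archival WAV into a web-deliverable MP3.

            Requires a SoX build with MP3 (LAME) support; the archival file is only
            read, never modified, mirroring the workflow described above.
            """
            subprocess.run(
                ["sox", archival_wav, "-r", str(rate), out_mp3,
                 "trim", str(start_s), str(duration_s)],
                check=True)

        # hypothetical usage: extract a 90-second region starting at 12.5 s
        extract_region("archive/interview_001.wav", "web/interview_001_part1.mp3", 12.5, 90)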

  12. Audio-guided audiovisual data segmentation, indexing, and retrieval

    Science.gov (United States)

    Zhang, Tong; Kuo, C.-C. Jay

    1998-12-01

    While current approaches for video segmentation and indexing are mostly focused on visual information, audio signals may actually play a primary role in video content parsing. In this paper, we present an approach for automatic segmentation, indexing, and retrieval of audiovisual data, based on audio content analysis. The accompanying audio signal of audiovisual data is first segmented and classified into basic types, i.e., speech, music, environmental sound, and silence. This coarse-level segmentation and indexing step is based upon morphological and statistical analysis of several short-term features of the audio signals. Then, environmental sounds are classified into finer classes, such as applause, explosions, bird sounds, etc. This fine-level classification and indexing step is based upon time- frequency analysis of audio signals and the use of the hidden Markov model as the classifier. On top of this archiving scheme, an audiovisual data retrieval system is proposed. Experimental results show that the proposed approach has an accuracy rate higher than 90 percent for the coarse-level classification, and higher than 85 percent for the fine-level classification. Examples of audiovisual data segmentation and retrieval are also provided.
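
    The fine-level step above uses hidden Markov models as classifiers. A minimal sketch of that general idea, one Gaussian HMM per sound class with classification by highest log-likelihood, is shown below using the hmmlearn package; the feature matrices and class names are random placeholders, not the authors' features or data.

        import numpy as np
        from hmmlearn.hmm import GaussianHMM

        def train_class_models(train_data, n_states=3):
            """train_data: dict mapping class name -> (n_frames, n_features) array."""
            models = {}
            for label, feats in train_data.items():
                m = GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=20)
                m.fit(feats)
                models[label] = m
            return models

        def classify(models, feats):
            """Return the class whose HMM gives the highest log-likelihood."""
            return max(models, key=lambda label: models[label].score(feats))

        # placeholder usage with random features standing in for e.g. applause vs. bird sounds
        rng = np.random.default_rng(0)
        train = {"applause": rng.normal(0, 1, (200, 13)), "birds": rng.normal(2, 1, (200, 13))}
        models = train_class_models(train)
        print(classify(models, rng.normal(2, 1, (50, 13))))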

  13. A Method to Detect AAC Audio Forgery

    Directory of Open Access Journals (Sweden)

    Qingzhong Liu

    2015-08-01

    Full Text Available Advanced Audio Coding (AAC), a standardized lossy compression scheme for digital audio designed to be the successor of the MP3 format, generally achieves better sound quality than MP3 at similar bit rates. AAC is also the default or standard audio format on many devices, and AAC audio files may be presented as important digital evidence, yet methods for authenticating such files are highly needed and largely missing. In this paper, we propose a scheme to expose tampered AAC audio streams that are encoded at the same encoding bit-rate. Specifically, we design a shift-recompression based method to retrieve the differential features between the re-encoded audio stream at each shift and the original audio stream; a learning classifier is employed to recognize the different patterns of differential features of doctored forgery files and original (untouched) audio files. Experimental results show that our approach is very promising and effective at detecting same-bit-rate forgery of AAC audio streams. Our study also shows that shift-recompression-based differential analysis is also very effective for detecting MP3 forgery at the same bit rate.

  14. An Interactive Concert Program Based on Infrared Watermark and Audio Synthesis

    Science.gov (United States)

    Wang, Hsi-Chun; Lee, Wen-Pin Hope; Liang, Feng-Ju

    The objective of this research is to propose a video/audio system which allows the user to listen the typical music notes in the concert program under infrared detection. The system synthesizes audio with different pitches and tempi in accordance with the encoded data in a 2-D barcode embedded in the infrared watermark. The digital halftoning technique has been used to fabricate the infrared watermark composed of halftone dots by both amplitude modulation (AM) and frequency modulation (FM). The results show that this interactive system successfully recognizes the barcode and synthesizes audio under infrared detection of a concert program which is also valid for human observation of the contents. This interactive video/audio system has greatly expanded the capability of the printout paper to audio display and also has many potential value-added applications.

  15. Design of a WAV audio player based on K20

    Directory of Open Access Journals (Sweden)

    Xu Yu

    2016-01-01

    Full Text Available The designed player uses the Freescale MK20DX128VLH7 as its core control chip, and its hardware platform is equipped with a VS1003 audio decoder, an OLED display interface, a USB interface and an SD card slot. The player uses the open-source embedded real-time operating system μC/OS-II, the Freescale USB Stack V4.1.1 and FATFS, and a graphical user interface based on CGUI is developed to improve the user experience. In general, the designed WAV audio player has strong applicability and good practical value.

  16. Audio-visual gender recognition

    Science.gov (United States)

    Liu, Ming; Xu, Xun; Huang, Thomas S.

    2007-11-01

    Combining different modalities for a pattern recognition task is a very promising field. Humans always fuse information from different modalities to recognize objects, perform inference, and so on. Audio-visual gender recognition is one of the most common tasks in human social communication. Humans can identify gender by facial appearance, by speech and also by body gait. Indeed, human gender recognition is a multi-modal data acquisition and processing procedure. However, computational multimodal gender recognition has not been extensively investigated in the literature. In this paper, speech and facial images are fused to perform multi-modal gender recognition and to explore the improvement gained by combining different modalities.

  17. Digital audio watermarking: fundamentals, techniques and challenges

    CERN Document Server

    Xiang, Yong; Yan, Bin

    2017-01-01

    This book offers comprehensive coverage on the most important aspects of audio watermarking, from classic techniques to the latest advances, from commonly investigated topics to emerging research subdomains, and from the research and development achievements to date, to current limitations, challenges, and future directions. It also addresses key topics such as reversible audio watermarking, audio watermarking with encryption, and imperceptibility control methods. The book sets itself apart from the existing literature in three main ways. Firstly, it not only reviews classical categories of audio watermarking techniques, but also provides detailed descriptions, analysis and experimental results of the latest work in each category. Secondly, it highlights the emerging research topic of reversible audio watermarking, including recent research trends, unique features, and the potentials of this subdomain. Lastly, the joint consideration of audio watermarking and encryption is also reviewed. With the help of this...

  18. Modified BTC Algorithm for Audio Signal Coding

    Directory of Open Access Journals (Sweden)

    TOMIC, S.

    2016-11-01

    Full Text Available This paper describes a modification of a well-known image coding algorithm, named Block Truncation Coding (BTC), and its application in audio signal coding. The BTC algorithm was originally designed for black and white image coding. Since black and white images and audio signals have different statistical characteristics, the application of this image coding algorithm to audio signals presents a novelty and a challenge. Several implementation modifications are described in this paper, while the original idea of the algorithm is preserved. The main modifications are performed in the area of signal quantization, by designing more adequate quantizers for audio signal processing. The result is a novel audio coding algorithm, whose performance is presented and analyzed in this research. The performance analysis indicates that this novel algorithm can be successfully applied in audio signal coding.
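
    For context, the classic (unmodified) BTC step on one block of samples keeps only the block mean, standard deviation, and a 1-bit map, and reconstructs two levels from them. The Python/NumPy sketch below illustrates that textbook step, not the modified quantizers proposed in the paper; the block length is an assumed value.

        import numpy as np

        def btc_encode_block(block):
            """Classic BTC: keep the block mean, standard deviation, and a 1-bit map."""
            mean, std = block.mean(), block.std()
            bitmap = block >= mean                      # 1 bit per sample
            return mean, std, bitmap

        def btc_decode_block(mean, std, bitmap):
            """Reconstruct two levels that preserve the block mean and variance."""
            n = bitmap.size
            q = bitmap.sum()                            # number of samples above the mean
            if q in (0, n):                             # flat block: no variance to restore
                return np.full(n, mean)
            high = mean + std * np.sqrt((n - q) / q)
            low = mean - std * np.sqrt(q / (n - q))
            return np.where(bitmap, high, low)

        # usage sketch on one 16-sample block of a sine wave
        block = np.sin(2 * np.pi * np.arange(16) / 16)
        print(btc_decode_block(*btc_encode_block(block)))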

  19. Presence and the utility of audio spatialization

    DEFF Research Database (Denmark)

    Bormann, Karsten

    2005-01-01

    The primary concern of this paper is whether the utility of audio spatialization, as opposed to the fidelity of audio spatialization, impacts presence. An experiment is reported that investigates the presence-performance relationship by decoupling spatial audio fidelity (realism) from task...... performance by varying the spatial fidelity of the audio independently of its relevance to performance on the search task that subjects were to perform. This was achieved by having conditions in which subjects searched for a music-playing radio (an active sound source) and having conditions in which...... supplied only nonattenuated audio was detrimental to performance. Even so, this group of subjects consistently had the largest increase in presence scores over the baseline experiment. Further, the Witmer and Singer (1998) presence questionnaire was more sensitive to whether the audio source was active...

  20. Design And Construction Of 300W Audio Power Amplifier For Classroom

    Directory of Open Access Journals (Sweden)

    Shune Lei Aung

    2015-07-01

    Full Text Available This paper describes the design and construction of a 300W audio power amplifier for classroom use. The construction of this amplifier includes a microphone preamplifier, tone preamplifier, equalizer, line amplifier, output power amplifier and sound level indicator. The output power amplifier is designed as an O.C.L. system and constructed using Class B operation, one among many amplifier classes. There are two types of O.C.L. system, the quasi system and the complementary system; the complementary system is used in the construction of the 300W audio power amplifier. The Multisim software is utilized in the construction of the audio power amplifier.

  1. Subband coding of digital audio signals without loss of quality

    NARCIS (Netherlands)

    Veldhuis, Raymond N.J.; Breeuwer, Marcel; van de Waal, Robbert

    1989-01-01

    A subband coding system for high quality digital audio signals is described. To achieve low bit rates at a high quality level, it exploits the simultaneous masking effect of the human ear. It is shown how this effect can be used in an adaptive bit-allocation scheme. The proposed approach has been

  2. Dynamically-Loaded Hardware Libraries (HLL) Technology for Audio Applications

    DEFF Research Database (Denmark)

    Esposito, A.; Lomuscio, A.; Nunzio, L. Di

    2016-01-01

    In this work, we apply hardware acceleration to embedded systems running audio applications. We present a new framework, Dynamically-Loaded Hardware Libraries or HLL, to dynamically load hardware libraries on reconfigurable platforms (FPGAs). Provided a library of application-specific processors,...

  3. A Model of Distraction in an Audio-on-Audio Interference Situation with Music Program Material

    DEFF Research Database (Denmark)

    Francombe, J.; Mason, R.; Dewhirst, M.

    2015-01-01

    by a qualitative analysis of subject responses. Distraction ratings were collected for one hundred randomly created audio-on-audio interference situations with music target and interferer programs. The selected features were related to the overall loudness, loudness ratio, perceptual evaluation of audio source...

  4. Real-Time Audio-Visual Analysis for Multiperson Videoconferencing

    Directory of Open Access Journals (Sweden)

    Petr Motlicek

    2013-01-01

    Full Text Available We describe the design of a system consisting of several state-of-the-art real-time audio and video processing components enabling multimodal stream manipulation (e.g., automatic online editing) for multiparty videoconferencing applications in open, unconstrained environments. The underlying algorithms are designed to allow multiple people to enter, interact, and leave the observable scene with no constraints. They comprise continuous localisation of audio objects and its application for spatial audio object coding, detection, and tracking of faces, estimation of head poses and visual focus of attention, detection and localisation of verbal and paralinguistic events, and the association and fusion of these different events. Combined all together, they represent multimodal streams with audio objects and semantic video objects and provide semantic information for stream manipulation systems (like a virtual director). Various experiments have been performed to evaluate the performance of the system. The obtained results demonstrate the effectiveness of the proposed design, the various algorithms, and the benefit of fusing different modalities in this scenario.

  5. Modular Sensor Environment : Audio Visual Industry Monitoring Applications

    OpenAIRE

    Guillot, Calvin

    2017-01-01

    This work was carried out for Electro Waves Oy. The company specializes in audio-visual services and interactive systems. The purpose of this work is to design and implement a modular sensor environment for the company, which will be used for developing automated systems. This thesis begins with an introduction to sensor systems and their different topologies, followed by an introduction to the technologies used in this project. The system is divided into three parts. The client, tha...

  6. Distortion Estimation in Compressed Music Using Only Audio Fingerprints

    NARCIS (Netherlands)

    Doets, P.J.O.; Lagendijk, R.L.

    2008-01-01

    An audio fingerprint is a compact yet very robust representation of the perceptually relevant parts of an audio signal. It can be used for content-based audio identification, even when the audio is severely distorted. Audio compression changes the fingerprint slightly. We show that these small

  7. Secondary Audio Activities for Maintaining the Alertness of Indonesian Car Drivers

    Directory of Open Access Journals (Sweden)

    Iftikar Zahedi Sutalaksana

    2013-03-01

    the awake, alert, and able to process all the stimulus well. The results of this study produce a form of audio response test that is integrated with the driving system in the car. The sound source is played at a constant intensity between 80-85 dB. The sound stops when the driver responds to the sound stimulus. The response test is designed to be capable of monitoring the driver's level of alertness while driving. Its application is expected to help reduce the rate of traffic accidents in Indonesia. Keywords: driving, secondary activities, audio, alertness, response test

  8. Relationship between volcanic activity and shallow hydrothermal system at Meakandake volcano, Japan, inferred from geomagnetic and audio-frequency magnetotelluric measurements

    Science.gov (United States)

    Takahashi, Kosuke; Takakura, Shinichi; Matsushima, Nobuo; Fujii, Ikuko

    2018-01-01

    Hydrothermal activity at Meakandake volcano, Japan, from 2004 to 2014 was investigated by using long-term geomagnetic field observations and audio-frequency magnetotelluric (AMT) surveys. The total intensity of the geomagnetic field has been measured around the summit crater Ponmachineshiri since 1992 by Kakioka Magnetic Observatory. We reanalyzed an 11-year dataset of the geomagnetic total intensity distribution and used it to estimate the thermomagnetic source models responsible for the surface geomagnetic changes during four time periods (2004-2006, 2006-2008, 2008-2009 and 2013-2014). The modeled sources suggest that the first two periods correspond to a cooling phase after a phreatic eruption in 1998, the third one to a heating phase associated with a phreatic eruption in 2008, and the last one to a heating phase accompanying minor internal activity in 2013. All of the thermomagnetic sources were beneath a location on the south side of Ponmachineshiri crater. In addition, we conducted AMT surveys in 2013 and 2014 at Meakandake and constructed a two-dimensional model of the electrical resistivity structure across the volcano. Combined, the resistivity information and thermomagnetic models revealed that the demagnetization source associated with the 2008 eruptive activity, causing a change in magnetic moment about 30 to 50 times greater than the other sources, was located about 1000 m beneath Ponmachineshiri crater, within or below a zone of high conductivity (a few ohm meters), whereas the other three sources were near each other and above this zone. We interpret the conductive zone as either a hydrothermal reservoir or an impermeable clay-rich layer acting as a seal above the hydrothermal reservoir. Along with other geophysical observations, our models suggest that the 2008 phreatic eruption was triggered by a rapid influx of heat into the hydrothermal reservoir through fluid-rich fractures developed during recent seismic swarms. The hydrothermal reservoir

  9. The relationship between basic audio quality and overall listening experience.

    Science.gov (United States)

    Schoeffler, Michael; Herre, Jürgen

    2016-09-01

    Basic audio quality (BAQ) is a well-known perceptual attribute, which is rated in various listening test methods to measure the performance of audio systems. Unfortunately, when it comes to purchasing audio systems, BAQ might not have a significant influence on the customers' buying decisions since other factors, like brand loyalty, might be more important. In contrast to BAQ, overall listening experience (OLE) is an affective attribute which incorporates all aspects that are important to an individual assessor, including his or her preference for music genre and audio quality. In this work, the relationship between BAQ and OLE is investigated in more detail. To this end, an experiment was carried out, in which participants rated the BAQ and the OLE of music excerpts with different timbral and spatial degradations. In a between-group-design procedure, participants were assigned into two groups, in each of which a different set of stimuli was rated. The results indicate that rating of both attributes, BAQ and OLE, leads to similar rankings, even if a different set of stimuli is rated. In contrast to the BAQ ratings, which were more influenced by timbral than spatial degradations, the OLE ratings were almost equally influenced by timbral and spatial degradations.

  10. Turkish Music Genre Classification using Audio and Lyrics Features

    Directory of Open Access Journals (Sweden)

    Önder ÇOBAN

    2017-05-01

    Full Text Available Music Information Retrieval (MIR) has become a popular research area in recent years. In this context, researchers have developed music information systems to find solutions for such major problems as automatic playlist creation, hit song detection, and music genre or mood classification. Meta-data, lyrics, or the melodic content of music are used as feature resources in previous works. However, lyrics are not often used in MIR systems, and the number of works in this field is not sufficient, especially for Turkish. In this paper, firstly, we have extended our previously created Turkish MIR (TMIR) dataset, which comprises Turkish lyrics, by including the audio file of each song. Secondly, we have investigated the effect of using audio and textual features together or separately on automatic Music Genre Classification (MGC). We have extracted textual features from lyrics using different feature extraction models such as word2vec and the traditional Bag of Words. We have conducted our experiments with the Support Vector Machine (SVM) algorithm and analysed the impact of feature selection and different feature groups on MGC. We have considered lyrics-based MGC as a text classification task and also investigated the effect of the term weighting method. Experimental results show that textual features can be as effective as audio features for Turkish MGC, especially when a supervised term weighting method is employed. We achieved the highest success rate of 99.12% by using both audio and textual features together.
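
    A minimal sketch of the general recipe, concatenating bag-of-words lyrics features with per-song audio features and training a linear SVM, is shown below with scikit-learn. The lyrics snippets, audio feature vectors, and genre labels are tiny placeholders, not the TMIR dataset or the authors' exact pipeline.

        import numpy as np
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.svm import SVC

        # placeholder data: lyrics strings, 2-D audio features (e.g. mean MFCCs), genre labels
        lyrics = ["seni seviyorum ...", "yagmur yagiyor ...", "hey hey rock ...", "dans dans ..."]
        audio_feats = np.array([[0.1, 2.3], [0.2, 2.1], [1.5, 0.3], [1.4, 0.4]])
        genres = ["arabesk", "arabesk", "rock", "pop"]

        # textual features: TF-IDF weighted bag of words over the lyrics
        vectorizer = TfidfVectorizer()
        text_feats = vectorizer.fit_transform(lyrics).toarray()

        # combine textual and audio features into one matrix and train a linear SVM
        X = np.hstack([text_feats, audio_feats])
        clf = SVC(kernel="linear").fit(X, genres)

        # classify a new song given its lyrics and an audio feature vector
        new_text = vectorizer.transform(["yagmur ve dans ..."]).toarray()
        new_x = np.hstack([new_text, np.array([[0.3, 2.0]])])
        print(clf.predict(new_x))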

  11. Estimation of macro sleep stages from whole night audio analysis.

    Science.gov (United States)

    Dafna, E; Halevi, M; Ben Or, D; Tarasiuk, A; Zigel, Y

    2016-08-01

    During routine sleep diagnostic procedure, sleep is broadly divided into three states: rapid eye movement (REM), non-REM (NREM) states, and wake, frequently named macro-sleep stages (MSS). In this study, we present a pioneering attempt for MSS detection using full night audio analysis. Our working hypothesis is that there might be differences in sound properties within each MSS due to breathing efforts (or snores) and body movements in bed. In this study, audio signals of 35 patients referred to a sleep laboratory were recorded and analyzed. An additional 178 subjects were used to train a probabilistic time-series model for MSS staging across the night. The audio-based system was validated on 20 out of the 35 subjects. System accuracy for estimating (detecting) epoch-by-epoch wake/REM/NREM states for a given subject is 74% (69% for wake, 54% for REM, and 79% NREM). Mean error (absolute difference) was 36±34 min for detecting total sleep time, 17±21 min for sleep latency, 5±5% for sleep efficiency, and 7±5% for REM percentage. These encouraging results indicate that audio-based analysis can provide a simple and comfortable alternative method for ambulatory evaluation of sleep and its disorders.

  12. Presence and the utility of audio spatialization

    DEFF Research Database (Denmark)

    Bormann, Karsten

    2005-01-01

    or not, while the presence questionnaire used by Slater and coworkers (see Tromp et al., 1998) was more sensitive to whether audio was fully spatialized or not. Finally, having the sound source active positively impacts the assessment of the audio while negatively impacting subjects' assessment...

  13. Audio Classification from Time-Frequency Texture

    OpenAIRE

    Yu, Guoshen; Slotine, Jean-Jacques

    2008-01-01

    Time-frequency representations of audio signals often resemble texture images. This paper derives a simple audio classification algorithm based on treating sound spectrograms as texture images. The algorithm is inspired by an earlier visual classification scheme particularly efficient at classifying textures. While solely based on time-frequency texture features, the algorithm achieves surprisingly good performance in musical instrument classification experiments.

  14. Synchronization and comparison of Lifelog audio recordings

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch; Hansen, Lars Kai

    2008-01-01

    We investigate concurrent ‘Lifelog’ audio recordings to locate segments from the same environment. We compare two techniques earlier proposed for pattern recognition in extended audio recordings, namely cross-correlation and a fingerprinting technique. If successful, such alignment can be used...
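
    The cross-correlation technique mentioned above can be sketched in a few lines: the lag that maximizes the correlation between two recordings of the same environment gives their relative time offset. The Python/SciPy code below is a simplified illustration on synthetic signals, not the Lifelog processing itself.

        import numpy as np
        from scipy.signal import correlate

        def estimate_lag(a, b, sr):
            """Estimate how much recording b is delayed relative to recording a, in seconds."""
            corr = correlate(b, a, mode="full")       # cross-correlate the two recordings
            lag = np.argmax(corr) - (len(a) - 1)      # sample lag at the correlation peak
            return lag / sr

        # synthetic check: b is a 0.5 s delayed, noisy copy of a
        sr = 8000
        a = np.random.randn(5 * sr)
        b = np.concatenate([np.zeros(sr // 2), a])[:5 * sr] + 0.1 * np.random.randn(5 * sr)
        print(estimate_lag(a, b, sr))                 # approximately 0.5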

  15. Prediction of perceptual audio reproduction characteristics

    DEFF Research Database (Denmark)

    Volk, Christer Peter

    affects perception. In this project a number of audio metrics are presented, which describe perceptual characteristics in terms of properties of the physical acoustical output of headphones and loudspeakers. The audio metrics rely on perceptual models for estimation of how these acoustical outputs...

  16. Semantic Analysis of Multimedial Information Using Both Audio and Visual Clues

    Directory of Open Access Journals (Sweden)

    Andrej Lukac

    2008-01-01

    Full Text Available Nowadays, there is a lot of information in databases (text, audio/video, etc.). It is important to be able to describe these data for better orientation in them. It is necessary to apply audio/video properties, which are used for metadata management, segmenting the document into semantically meaningful units, classifying each unit into a predefined scene type, and indexing and summarizing the document for efficient retrieval and browsing. The data can be used for a system that automatically searches for a specific person in a sequence, and also for special video sequences. Audio/video properties are represented by descriptors and description schemes. There are many features that can be used to characterize multimedial signals. We can analyze audio and video sequences jointly or consider them completely separately. Our aim is oriented to the possibilities of combining multimedial features. The focus is directed at discussion programs, because they offer more decisions on how to combine audio features with video sequences.

  17. Effect of Audio Coaching on Correlation of Abdominal Displacement With Lung Tumor Motion

    International Nuclear Information System (INIS)

    Nakamura, Mitsuhiro; Narita, Yuichiro; Matsuo, Yukinori; Narabayashi, Masaru; Nakata, Manabu; Sawada, Akira; Mizowaki, Takashi; Nagata, Yasushi; Hiraoka, Masahiro

    2009-01-01

    Purpose: To assess the effect of audio coaching on the time-dependent behavior of the correlation between abdominal motion and lung tumor motion and the corresponding lung tumor position mismatches. Methods and Materials: Six patients who had a lung tumor with a motion range >8 mm were enrolled in the present study. Breathing-synchronized fluoroscopy was performed initially without audio coaching, followed by fluoroscopy with recorded audio coaching for multiple days. Two different measurements, anteroposterior abdominal displacement using the real-time positioning management system and superoinferior (SI) lung tumor motion by X-ray fluoroscopy, were performed simultaneously. Their sequential images were recorded using one display system. The lung tumor position was automatically detected with a template matching technique. The relationship between the abdominal and lung tumor motion was analyzed with and without audio coaching. Results: The mean SI tumor displacement was 10.4 mm without audio coaching and increased to 23.0 mm with audio coaching (p < .01). The correlation coefficients ranged from 0.89 to 0.97 with free breathing. Applying audio coaching, the correlation coefficients improved significantly (range, 0.93-0.99; p < .01), and the SI lung tumor position mismatches became larger in 75% of all sessions. Conclusion: Audio coaching served to increase the degree of correlation and make it more reproducible. In addition, the phase shifts between tumor motion and abdominal displacement were improved; however, all patients breathed more deeply, and the SI lung tumor position mismatches became slightly larger with audio coaching than without audio coaching.

  18. Impact of audio narrated animation on students' understanding and learning environment based on gender

    Science.gov (United States)

    Nasrudin, Ajeng Ratih; Setiawan, Wawan; Sanjaya, Yayan

    2017-05-01

    This study, titled the impact of audio narrated animation on students' understanding in learning the human respiratory system based on gender, was conducted in the eighth grade of a junior high school. The study aims to investigate the difference in students' understanding and learning environment between boys' and girls' classes when learning the human respiratory system using audio narrated animation. The research method used is a quasi-experiment with a matching pre-test post-test comparison group design. The procedures of the study were: (1) preliminary study and learning habituation using audio narrated animation; (2) implementation of learning using audio narrated animation and data collection; (3) analysis and discussion. The results of the analysis show that there is a significant difference in students' understanding and learning environment between the boys' and girls' classes when learning the human respiratory system using audio narrated animation, both in general and specifically in achieving the learning indicators. The discussion relates to the impact of audio narrated animation, gender characteristics, and the constructivist learning environment. It can be concluded that there is a significant difference in students' understanding between boys' and girls' classes in learning the human respiratory system using audio narrated animation. Additionally, based on the interpretation of students' responses, there is a difference in the increase of the agreement level regarding the learning environment.

  19. Draft genome sequence of Mesotoga strain PhosAC3, a mesophilic member of the bacterial order Thermotogales, isolated from a digestor treating phosphogypsum in Tunisia.

    Science.gov (United States)

    Ben Hania, Wajdi; Fadhlaoui, Khaled; Brochier-Armanet, Céline; Persillon, Cécile; Postec, Anne; Hamdi, Moktar; Dolla, Alain; Ollivier, Bernard; Fardeau, Marie-Laure; Le Mer, Jean; Erauso, Gaël

    2015-01-01

    Mesotoga strain PhosAc3 was the first mesophilic cultivated member of the order Thermotogales. This genus currently contains two described species, M. prima and M. infera. Strain PhosAc3, isolated from a Tunisian digestor treating phosphogypsum, is phylogenetically closely related to M. prima strain MesG1.Ag.4.2(T). Strain PhosAc3 has a genome of 3.1 Mb with a G+C content of 45.2%. It contains 3,051 protein-coding genes, of which 74.6% have their best reciprocal BLAST hit in the genome of the type species, strain MesG1.Ag.4.2(T). For this reason we propose to assign strain PhosAc3 as a novel ecotype of the Mesotoga prima species. However, in contrast with the M. prima type strain, (i) it does not ferment sugars but uses them only in the presence of elemental sulfur as terminal electron acceptor, (ii) it produces only acetate and CO2 from sugars, whereas strain MesG1.Ag.4.2(T) produces acetate, butyrate, isobutyrate, isovalerate and 2-methyl-butyrate, and (iii) sulfides are also end products of elemental sulfur reduction under these growth conditions.

  20. Aerosol hygroscopicity and cloud condensation nuclei activity during the AC3Exp campaign: implications for cloud condensation nuclei parameterization

    Science.gov (United States)

    Zhang, F.; Li, Y.; Li, Z.; Sun, L.; Li, R.; Zhao, C.; Wang, P.; Sun, Y.; Liu, X.; Li, J.; Li, P.; Ren, G.; Fan, T.

    2014-12-01

    Aerosol hygroscopicity and cloud condensation nuclei (CCN) activity under background conditions and during pollution events are investigated during the Aerosol-CCN-Cloud Closure Experiment (AC3Exp) campaign conducted at Xianghe, China in summer 2013. A gradual increase in size-resolved activation ratio (AR) with particle diameter (Dp) suggests that aerosol particles have different hygroscopicities. During pollution events, the activation diameter (Da) measured at low supersaturation (SS) was significantly increased compared to background conditions. An increase was not observed when SS was > 0.4%. The hygroscopicity parameter (κ) was ~ 0.31-0.38 for particles in accumulation mode under background conditions, ~ 20% higher in magnitude than the κ derived under polluted conditions. For particles in nucleation or Aitken mode, κ ranged from 0.20-0.34 for background and polluted cases. Larger particles were on average more hygroscopic than smaller particles. The situation was more complex for heavy pollution particles because of the diversity in particle composition and mixing state. A non-parallel observation CCN closure test showed that uncertainties in CCN number concentration estimates ranged from 30-40%, which are associated with changes in particle composition as well as measurement uncertainties associated with bulk and size-resolved CCN methods. A case study showed that bulk CCN activation ratios increased as total condensation nuclei (CN) number concentrations (NCN) increased on background days. The background case also showed that bulk AR correlated well with the hygroscopicity parameter calculated from chemical volume fractions. On the contrary, bulk AR decreased with increasing total NCN during pollution events, but was closely related to the fraction of the total organic mass signal at m/z 44 (f44), which is usually associated with the particle's organic oxidation level. Our study highlights the importance of chemical composition in
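
    The "hygroscopicity parameter calculated from chemical volume fractions" mentioned above commonly refers to a volume-weighted mixing rule of the form κ_mix = Σ ε_i κ_i. The short Python sketch below illustrates that rule with assumed example volume fractions and typical per-species κ values; it is a generic illustration, not the campaign's actual retrieval.

        def kappa_mix(volume_fractions, kappas):
            """Volume-weighted hygroscopicity of an internally mixed particle.

            volume_fractions: dict of species -> volume fraction (should sum to ~1)
            kappas:           dict of species -> single-species hygroscopicity parameter
            """
            return sum(volume_fractions[s] * kappas[s] for s in volume_fractions)

        # assumed example: sulfate/organic/black-carbon mixture with typical kappa values
        eps = {"ammonium_sulfate": 0.45, "organics": 0.50, "black_carbon": 0.05}
        kap = {"ammonium_sulfate": 0.61, "organics": 0.10, "black_carbon": 0.0}
        print(kappa_mix(eps, kap))   # approximately 0.32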

  1. Near-field Localization of Audio

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Christensen, Mads Græsbøll

    2014-01-01

    Localization of audio sources using microphone arrays has been an important research problem for more than two decades. Many traditional methods for solving the problem are based on a two-stage procedure: first, information about the audio source, such as time differences-of-arrival (TDOAs......) and gain ratios-of-arrival (GROAs) between microphones is estimated, and, second, this knowledge is used to localize the audio source. These methods often have a low computational complexity, but this comes at the cost of a limited estimation accuracy. Therefore, we propose a new localization approach...
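
    As an illustration of the traditional first stage mentioned above (TDOA estimation between a pair of microphones), the sketch below uses the widely known GCC-PHAT cross-correlation in Python/NumPy. It is a generic textbook method applied to synthetic signals, not the new localization approach proposed by the authors.

        import numpy as np

        def gcc_phat_tdoa(sig, ref, sr):
            """Estimate the time difference of arrival (seconds) of sig relative to ref."""
            n = len(sig) + len(ref)
            SIG = np.fft.rfft(sig, n)
            REF = np.fft.rfft(ref, n)
            cross = SIG * np.conj(REF)
            cross /= np.abs(cross) + 1e-12            # PHAT weighting: keep phase only
            cc = np.fft.irfft(cross, n)
            shift = np.argmax(np.concatenate((cc[-(n // 2):], cc[:n // 2 + 1])))
            return (shift - n // 2) / sr

        # synthetic check: the second microphone receives the source 1 ms later
        sr = 16000
        src = np.random.randn(sr)
        mic1 = src
        mic2 = np.concatenate([np.zeros(16), src])[:sr]   # 16 samples = 1 ms delay
        print(gcc_phat_tdoa(mic2, mic1, sr))              # approximately 0.001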

  2. Musical Audio Synthesis Using Autoencoding Neural Nets

    OpenAIRE

    Sarroff, Andy; Casey, Michael A.

    2014-01-01

    With an optimal network topology and tuning of hyperparameters, artificial neural networks (ANNs) may be trained to learn a mapping from low level audio features to one or more higher-level representations. Such artificial neural networks are commonly used in classification and regression settings to perform arbitrary tasks. In this work we suggest repurposing autoencoding neural networks as musical audio synthesizers. We offer an interactive musical audio synt...
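
    A minimal sketch of the idea, compressing magnitude-spectrum frames through a bottleneck and decoding from the latent space to "synthesize" new frames, is given below in PyTorch. The layer sizes, random training data, and loss are assumed placeholders, not the network described in the paper.

        import torch
        from torch import nn, optim

        # toy "audio features": random magnitude-spectrum frames (513 FFT bins each)
        frames = torch.rand(256, 513)

        # a small fully connected autoencoder with an 8-dimensional bottleneck
        encoder = nn.Sequential(nn.Linear(513, 128), nn.ReLU(), nn.Linear(128, 8))
        decoder = nn.Sequential(nn.Linear(8, 128), nn.ReLU(), nn.Linear(128, 513), nn.ReLU())
        model = nn.Sequential(encoder, decoder)

        opt = optim.Adam(model.parameters(), lr=1e-3)
        loss_fn = nn.MSELoss()
        for epoch in range(200):                      # train to reconstruct the input frames
            opt.zero_grad()
            loss = loss_fn(model(frames), frames)
            loss.backward()
            opt.step()

        # "synthesis": decode a hand-chosen point in the 8-dimensional latent space
        with torch.no_grad():
            new_frame = decoder(torch.rand(1, 8))     # one synthetic magnitude frame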

  3. Audio-Visual Classification of Sports Types

    DEFF Research Database (Denmark)

    Gade, Rikke; Abou-Zleikha, Mohamed; Christensen, Mads Græsbøll

    2015-01-01

    In this work we propose a method for classification of sports types from combined audio and visual features extracted from thermal video. From audio, Mel Frequency Cepstral Coefficients (MFCC) are extracted, and PCA is applied to reduce the feature space to 10 dimensions. From the visual modality short trajectories are constructed to represent the motion of players. From these, four motion features are extracted and combined directly with the audio features for classification. A k-nearest neighbour classifier is applied for classification of 180 1-minute video sequences from three sports types...
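
    A rough sketch of the audio side of such a pipeline (MFCC extraction, PCA dimensionality reduction, k-nearest-neighbour classification) is given below using librosa and scikit-learn. The clip paths, labels, and neighbour count are hypothetical placeholders, not the thermal-video dataset used in the paper.

        import numpy as np
        import librosa
        from sklearn.decomposition import PCA
        from sklearn.neighbors import KNeighborsClassifier

        def mfcc_vector(path):
            """Mean MFCC vector of one audio clip (a common clip-level summary)."""
            y, sr = librosa.load(path, sr=None, mono=True)
            return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20).mean(axis=1)

        # hypothetical training clips and sport labels
        train_paths = ["clips/soccer_01.wav", "clips/handball_01.wav", "clips/basket_01.wav"]
        train_labels = ["soccer", "handball", "basketball"]

        X = np.vstack([mfcc_vector(p) for p in train_paths])
        pca = PCA(n_components=min(10, X.shape[0]))     # reduce the feature space
        knn = KNeighborsClassifier(n_neighbors=1).fit(pca.fit_transform(X), train_labels)

        # classify a new 1-minute clip
        print(knn.predict(pca.transform(mfcc_vector("clips/unknown.wav").reshape(1, -1))))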

  4. High quality scalable audio codec

    Science.gov (United States)

    Kim, Miyoung; Oh, Eunmi; Kim, JungHoe

    2007-09-01

    The MPEG-4 BSAC (Bit Sliced Arithmetic Coding) is a fine-grain scalable codec with a layered structure which consists of a single base layer and several enhancement layers. The scalable functionality allows us to decode subsets of a full bitstream and to deliver audio content adaptively under conditions of heterogeneous networks and devices, and user interaction. This bitrate scalability comes at the cost of high frequency components: the decoded output of BSAC sounds muffled as fewer and fewer layers are transmitted due to degraded network and device conditions. The goal of the proposed technology is to compensate for the missing high frequency components while maintaining the fine-grain scalability of BSAC. This paper describes the integration of the SBR (Spectral Bandwidth Replication) tool into the existing MPEG-4 BSAC. Listening test results show that the sound quality of BSAC is improved when the full bitstream is truncated for lower bitrates, and this quality is comparable to that of BSAC using the SBR tool without truncation at the same bitrate.

  5. Real-Time Audio Processing on the T-CREST Multicore Platform

    DEFF Research Database (Denmark)

    Ausin, Daniel Sanz; Pezzarossa, Luca; Schoeberl, Martin

    2017-01-01

    of the audio signal. This paper presents a real-time multicore audio processing system based on the T-CREST platform. T-CREST is a time-predictable multicore processor for real-time embedded systems. Multiple audio effect tasks have been implemented, which can be connected together in different configurations...... forming sequential and parallel effect chains, and using a network-on-chip for intercommunication between processors. The evaluation of the system shows that real-time processing of multiple effect configurations is possible, and that the estimation and control of latency ensures real-time behavior.......Multicore platforms are nowadays widely used for audio processing applications, due to the improvement of computational power that they provide. However, some of these systems are not optimized for temporally constrained environments, which often leads to an undesired increase in the latency

  6. Analysis of musical expression in audio signals

    Science.gov (United States)

    Dixon, Simon

    2003-01-01

    In western art music, composers communicate their work to performers via a standard notation which specifies the musical pitches and relative timings of notes. This notation may also include some higher level information such as variations in the dynamics, tempo and timing. Famous performers are characterised by their expressive interpretation, the ability to convey structural and emotive information within the given framework. The majority of work on audio content analysis focusses on retrieving score-level information; this paper reports on the extraction of parameters describing the performance, a task which requires a much higher degree of accuracy. Two systems are presented: BeatRoot, an off-line beat tracking system which finds the times of musical beats and tracks changes in tempo throughout a performance, and the Performance Worm, a system which provides a real-time visualisation of the two most important expressive dimensions, tempo and dynamics. Both of these systems are being used to process data for a large-scale study of musical expression in classical and romantic piano performance, which uses artificial intelligence (machine learning) techniques to discover fundamental patterns or principles governing expressive performance.
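
    The sketch below is not BeatRoot or the Performance Worm, just a rough illustration of the beat-tracking step using librosa; the audio file name is a placeholder.

        import librosa

        # load a performance recording (placeholder path) and track its beats
        y, sr = librosa.load("performance.wav", sr=None, mono=True)
        tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
        beat_times = librosa.frames_to_time(beat_frames, sr=sr)

        print(f"estimated tempo: {float(tempo):.1f} BPM")
        print("first beats (s):", beat_times[:8])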

  7. TECHNICAL NOTE: Portable audio electronics for impedance-based measurements in microfluidics

    Science.gov (United States)

    Wood, Paul; Sinton, David

    2010-08-01

    We demonstrate the use of audio electronics-based signals to perform on-chip electrochemical measurements. Cell phones and portable music players are examples of consumer electronics that are easily operated and are ubiquitous worldwide. Audio output (play) and input (record) signals are voltage based and contain frequency and amplitude information. A cell phone, laptop soundcard and two compact audio players are compared with respect to frequency response; the laptop soundcard provides the most uniform frequency response, while the cell phone performance is found to be insufficient. The audio signals in the common portable music players and laptop soundcard operate in the range of 20 Hz to 20 kHz and are found to be applicable, as voltage input and output signals, to impedance-based electrochemical measurements in microfluidic systems. Validated impedance-based measurements of concentration (0.1-50 mM), flow rate (2-120 µL min-1) and particle detection (32 µm diameter) are demonstrated. The prevailing, lossless, wave audio file format is found to be suitable for data transmission to and from external sources, such as a centralized lab, and the cost of all hardware (in addition to audio devices) is ~10 USD. The utility demonstrated here, in combination with the ubiquitous nature of portable audio electronics, presents new opportunities for impedance-based measurements in portable microfluidic systems.
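
    On the software side, this kind of audio-band excitation and capture can be prototyped with a laptop soundcard in Python using the sounddevice package. The sketch below is a hedged illustration, not the authors' setup; the test frequency, duration, and the use of a single response magnitude as a stand-in for an impedance-derived quantity are assumptions.

        import numpy as np
        import sounddevice as sd

        fs = 44100                      # soundcard sample rate (Hz)
        f0 = 1000                       # test tone inside the 20 Hz - 20 kHz audio band
        t = np.arange(int(0.5 * fs)) / fs
        excitation = (0.5 * np.sin(2 * np.pi * f0 * t)).astype(np.float32)

        # play the tone on the output channel while recording the returned signal
        recorded = sd.playrec(excitation, samplerate=fs, channels=1)
        sd.wait()

        # relative response amplitude at f0 (a stand-in for an impedance-derived quantity)
        spectrum = np.fft.rfft(recorded[:, 0])
        bin_f0 = int(round(f0 * len(t) / fs))
        print("response magnitude at 1 kHz:", abs(spectrum[bin_f0]))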

  8. Sotto Voce: Exploring the Interplay of Conversation and Mobile Audio Spaces

    OpenAIRE

    Aoki, Paul M.; Grinter, Rebecca E.; Hurst, Amy; Szymanski, Margaret H.; Thornton, James D.; Woodruff, Allison

    2002-01-01

    In addition to providing information to individual visitors, electronic guidebooks have the potential to facilitate social interaction between visitors and their companions. However, many systems impede visitor interaction. By contrast, our electronic guidebook, Sotto Voce, has social interaction as a primary design goal. The system enables visitors to share audio information - specifically, they can hear each other's guidebook activity using a technologically mediated audio eavesdropping mec...

  9. EVALUASI KEPUASAN PENGGUNA TERHADAP APLIKASI AUDIO BOOKS

    Directory of Open Access Journals (Sweden)

    Raditya Maulana Anuraga

    2017-02-01

    Full Text Available Listeno is the first audio-book application in Indonesia, allowing users to obtain books in audio form and listen to them like music. Listeno faces several problems: a requested offline mode that has not yet been released, the security of its mp3 files, which must be considered, and the fact that Listeno has not yet reached its target of 100,000 active users. This research aims to evaluate user satisfaction with the Audio Books application using Nielsen's approach. The analysis in this study uses Importance Performance Analysis (IPA) combined with a User Satisfaction Index (IKP), based on the following indicators: usefulness, utility, usability, learnability, efficiency, memorability, errors, and satisfaction. The results show that users of the Audio Books application are quite satisfied, with a calculated IKP of 69.58%.

  10. Parametric time-frequency domain spatial audio

    CERN Document Server

    Delikaris-Manias, Symeon; Politis, Archontis

    2018-01-01

    This book provides readers with the principles and best practices in spatial audio signal processing. It describes how sound fields and their perceptual attributes are captured and analyzed within the time-frequency domain, how essential representation parameters are coded, and how such signals are efficiently reproduced for practical applications. The book is split into four parts starting with an overview of the fundamentals. It then goes on to explain the reproduction of spatial sound before offering an examination of signal-dependent spatial filtering. The book finishes with coverage of both current and future applications and the direction that spatial audio research is heading in. Parametric Time-frequency Domain Spatial Audio focuses on applications in entertainment audio, including music, home cinema, and gaming--covering the capturing and reproduction of spatial sound as well as its generation, transduction, representation, transmission, and perception. This book will teach readers the tools needed...

  11. Audio production principles practical studio applications

    CERN Document Server

    Elmosnino, Stephane

    2018-01-01

    A new and fully practical guide to all of the key topics in audio production, this book covers the entire workflow from pre-production, to recording all kinds of instruments, to mixing theories and tools, and finally to mastering.

  12. Design of an audio advertisement dataset

    Science.gov (United States)

    Fu, Yutao; Liu, Jihong; Zhang, Qi; Geng, Yuting

    2015-12-01

    Since more and more advertisements flood radio broadcasts, it is necessary to establish an audio advertising dataset that can be used to analyze and classify advertisements. A method for establishing a complete audio advertising dataset is presented in this paper. The dataset is divided into four different kinds of advertisements. Each advertisement sample is given in *.wav file format and annotated with a txt file containing its file name, sampling frequency, channel number, broadcasting time and class. The soundness of the classification of the advertisements in this dataset is demonstrated by clustering the different advertisements using Principal Component Analysis (PCA). The experimental results show that this audio advertisement dataset offers a reliable set of samples for related audio advertisement studies.
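
    As a sketch of the PCA-based sanity check described above (not the authors' code), the following Python snippet computes a few crude spectral features per file, projects them onto two principal components and clusters them into the four assumed advertisement classes; the feature choice and the ads/*.wav layout are illustrative assumptions.

```python
# Minimal sketch: project audio-advertisement feature vectors with PCA and
# cluster them, to check whether the four advertisement classes separate.
# Feature choice and file layout are assumptions, not the paper's method.
import glob
import numpy as np
from scipy.io import wavfile
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def simple_features(path):
    """Crude per-file features: RMS level, spectral centroid and rolloff."""
    fs, x = wavfile.read(path)
    x = x.astype(np.float64)
    if x.ndim > 1:                      # mix down multi-channel files
        x = x.mean(axis=1)
    x /= (np.max(np.abs(x)) + 1e-12)
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    centroid = np.sum(freqs * spec) / (np.sum(spec) + 1e-12)
    rolloff = freqs[np.searchsorted(np.cumsum(spec), 0.85 * np.sum(spec))]
    rms = np.sqrt(np.mean(x**2))
    return [rms, centroid, rolloff]

files = sorted(glob.glob("ads/*.wav"))          # hypothetical dataset layout
X = np.array([simple_features(f) for f in files])

X2 = PCA(n_components=2).fit_transform(X)       # 2-D PCA projection
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X2)
for f, lab in zip(files, labels):
    print(lab, f)
```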

  13. Method for reading sensors and controlling actuators using audio interfaces of mobile devices.

    Science.gov (United States)

    Aroca, Rafael V; Burlamaqui, Aquiles F; Gonçalves, Luiz M G

    2012-01-01

    This article presents a novel closed loop control architecture based on audio channels of several types of computing devices, such as mobile phones and tablet computers, but not restricted to them. The communication is based on an audio interface that relies on the exchange of audio tones, allowing sensors to be read and actuators to be controlled. As an application example, the presented technique is used to build a low cost mobile robot, but the system can also be used in a variety of mechatronics applications and sensor networks, where smartphones are the basic building blocks.
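
    A minimal sketch of the decoding side of such a tone-based interface is given below, assuming (hypothetically) that a sensor value in [0, 1] is mapped linearly onto a tone frequency between 1 kHz and 5 kHz; the dominant frequency of a recorded block is recovered with an FFT peak search. It illustrates only the tone-exchange idea, not the authors' actual protocol.

```python
# Minimal sketch: recover a sensor value encoded as a pure tone whose
# frequency varies linearly between F_MIN and F_MAX (an assumed mapping).
import numpy as np

FS = 44100
F_MIN, F_MAX = 1000.0, 5000.0     # hypothetical frequency range of the encoder

def decode_block(block, fs=FS):
    """Return the sensor value in [0, 1] encoded in one audio block."""
    windowed = block * np.hanning(len(block))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(block), d=1.0 / fs)
    peak = freqs[np.argmax(spectrum)]           # dominant tone frequency
    return (peak - F_MIN) / (F_MAX - F_MIN)

# Self-test with a synthetic tone for a sensor value of 0.25.
t = np.arange(4096) / FS
test_tone = np.sin(2 * np.pi * (F_MIN + 0.25 * (F_MAX - F_MIN)) * t)
print(round(decode_block(test_tone), 3))        # approximately 0.25
```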

  15. Pengaruh layanan informasi bimbingan konseling berbantuan media audio visual terhadap empati siswa

    Directory of Open Access Journals (Sweden)

    Rita Kumalasari

    2017-05-01

    The results of the research show that the audio-visual counseling technique is effective and practical for increasing students' empathy. The model comprises a rationale, key concepts, definitions, objectives, model content, the expected role and qualifications of the tutor (counselor), procedures or steps for implementing the audio-visual media, evaluation, follow-up, and a support system. The research proved effective in improving student behavior: students' empathy increased by 28.9%, from 45.08% to 73.98%, and this increase occurred in all aspects of empathy. Keywords: Effective, Audio visual, Empathy

  16. Watermarking-Based Digital Audio Data Authentication

    Directory of Open Access Journals (Sweden)

    Jana Dittmann

    2003-09-01

    Full Text Available Digital watermarking has become an accepted technology for enabling multimedia protection schemes. While most efforts concentrate on user authentication, recently interest in data authentication to ensure data integrity has been increasing. Existing concepts address mainly image data. Depending on the necessary security level and the sensitivity to detect changes in the media, we differentiate between fragile, semifragile, and content-fragile watermarking approaches for media authentication. Furthermore, invertible watermarking schemes exist where each bit change can be recognized by the watermark, which can be extracted, and the original data can be reproduced for high-security applications. The latter approaches can be extended with cryptographic approaches like digital signatures. As we see from the literature, only a few audio approaches exist, and the audio domain requires additional strategies for time flow protection and resynchronization. To allow different security levels, we have to identify relevant audio features that can be used to determine content manipulations. Furthermore, in the field of invertible schemes, there are many publications for image and video data but no approaches for digital audio to ensure data authentication for high-security applications. In this paper, we introduce and evaluate two watermarking algorithms for digital audio data, addressing content integrity protection. In our first approach, we discuss possible features for a content-fragile watermarking scheme to allow several postproduction modifications. The second approach is designed for high-security applications to detect each bit change and reconstruct the original audio by introducing an invertible audio watermarking concept. Based on the invertible audio scheme, we combine digital signature schemes and digital watermarking to provide a publicly verifiable data authentication and a reproduction of the original, protected with a secret key.
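
    To make the fragile end of that spectrum concrete, the sketch below hides a keyed hash of the perceptually relevant bits of 16-bit samples in their least significant bits, so that any later bit change breaks verification. This is a generic illustration of fragile watermarking, not the content-fragile or invertible schemes proposed in the paper, and the key handling is purely illustrative.

```python
# Minimal sketch of a fragile watermark: a keyed hash of the perceptually
# relevant bits is hidden in sample LSBs; any tampering breaks verification.
# This is a generic illustration, not the paper's content-fragile or
# invertible schemes.
import hashlib
import hmac
import numpy as np

def _digest_bits(samples_msb, key):
    mac = hmac.new(key, samples_msb.tobytes(), hashlib.sha256).digest()
    return np.unpackbits(np.frombuffer(mac, dtype=np.uint8))

def embed(samples, key):
    """samples: int16 array. Returns a watermarked copy."""
    out = samples.copy()
    msb = (out.astype(np.int32) & ~1).astype(np.int16)   # zero the LSBs
    bits = _digest_bits(msb, key)                         # 256 hash bits
    out[:len(bits)] = (msb[:len(bits)] | bits).astype(np.int16)
    out[len(bits):] = msb[len(bits):]
    return out

def verify(samples, key):
    msb = (samples.astype(np.int32) & ~1).astype(np.int16)
    expected = _digest_bits(msb, key)
    stored = (samples[:len(expected)] & 1).astype(np.uint8)
    return np.array_equal(stored, expected)

key = b"secret-key"                       # illustrative key handling only
audio = np.random.default_rng(0).integers(-2000, 2000, 48000).astype(np.int16)
marked = embed(audio, key)
print(verify(marked, key))                # True
marked[100] ^= 1                          # flip one bit anywhere
print(verify(marked, key))                # False: tampering detected
```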

  17. Audio Description as a Pedagogical Tool

    OpenAIRE

    Georgina Kleege; Scott Wallin

    2015-01-01

    Audio description is the process of translating visual information into words for people who are blind or have low vision. Typically such description has focused on films, museum exhibitions, images and video on the internet, and live theater. Because it allows people with visual impairments to experience a variety of cultural and educational texts that would otherwise be inaccessible, audio description is a mandated aspect of disability inclusion, although it remains markedly underdeveloped ...

  18. Audio description as an accessibility enhancer

    OpenAIRE

    Martins, Cláudia Susana Nunes

    2012-01-01

    Audio description for the blind and visually-impaired has been around since people have described what is seen. Throughout time, it has evolved and developed in different contexts, starting with daily life, moving into the cinema and television, then across other performing arts, museums and galleries, historical sites and public places. Audio description is above all an issue of accessibility and of providing visually-impaired people with the same rights to have access to culture, e...

  19. The Effect Of 3D Audio And Other Audio Techniques On Virtual Reality Experience.

    Science.gov (United States)

    Brinkman, Willem-Paul; Hoekstra, Allart R D; van Egmond, René

    2015-01-01

    Three studies were conducted to examine the effect of audio on people's experience in a virtual world. The first study showed that people could distinguish between mono, stereo, Dolby surround and 3D audio of a wasp. The second study found significant effects for audio techniques on people's self-reported anxiety, presence, and spatial perception. The third study found that adding sound to a visual virtual world had a significant effect on people's experience (including heart rate), while it found no difference in experience between stereo and 3D audio.

  20. PENGGUNAAN MEDIA AUDIO DALAM PEMBELAJARAN STENOGRAFI

    Directory of Open Access Journals (Sweden)

    S Martono

    2011-06-01

    Full Text Available The objective of this study is to determine the effectiveness of using audio media in learning stenography typing. The population of this research was 30 students, divided into two groups: an experimental and a control group of 15 students each. Based on their initial scores in the stenography subject, the two groups had the same ability but were given different treatments: the experimental group was taught with audio media, whereas the control group was not. The data were collected through documentation and experiment, and the instrument was a stenography speed-typing test. The final results showed that the use of audio media was more effective and improved learning outcomes more than in the control group. This result is expected to encourage stenography teachers to apply audio media in their teaching, and to remind students that stenography is not a memorization subject but a skill that must be trained by participating in the lessons. Thus, people can use stenography typing to record each talk. Keywords: Learning, Audio Media, Stenografi

  2. Design and implementation of an audio indicator

    Science.gov (United States)

    Zheng, Shiyong; Li, Zhao; Li, Biqing

    2017-04-01

    This paper proposes an audio level indicator designed around a C9014 transistor, an operational-amplifier LED level indicator, and a CD4017 decade counter/distributor. The circuit can audibly control neon and holiday lights through the audio signal. The input audio signal is first amplified by the C9014-based operational amplifier power stage; an adjustment potentiometer sets the amplified signal voltage fed to the CD4017 distributor, which drives the counter, and the connected LEDs display the running state of the circuit. This simple audio indicator uses only a single IC (U1) and produces a two-color LED chase effect that follows the audio signal, so the general behavior of the audio signal, the variation of its frequency, and the corresponding level can be read from the LED display. The light can run in four display modes, including jumping and slowly changing patterns, and can be used in homes, hotels, discos, theaters, advertising and other fields, with a wide range of uses in modern life.

  3. Design of batch audio/video conversion platform based on JavaEE

    Science.gov (United States)

    Cui, Yansong; Jiang, Lianpin

    2018-03-01

    With the rapid development of the digital publishing industry, audio/video publishing shows significant features such as a diversity of coding standards for audio and video files and massive data volumes. Faced with massive and diverse data, quickly and efficiently converting it to a unified coding format has brought great difficulties to digital publishing organizations. In view of this demand and the present situation, this paper proposes a distributed online audio and video format conversion platform with a B/S structure, based on the Spring+SpringMVC+MyBatis development architecture combined with the open-source FFMPEG format conversion tool. Based on the Java language, the key technologies and strategies in the design of the platform architecture are analyzed, and an efficient audio and video format conversion system is designed and developed, composed of a "front display system", a "core scheduling server" and a "conversion server". The test results show that, compared with an ordinary audio and video conversion scheme, the batch audio and video format conversion platform can effectively improve the conversion efficiency of audio and video files and reduce the complexity of the work. Practice has proved that the key technology discussed in this paper can be applied in the field of large-batch file processing and has practical application value.
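
    The conversion servers described above wrap FFMPEG; as a language-neutral illustration of that core step (the paper's platform is Java/Spring, the sketch below is Python), the snippet walks a directory and shells out to the ffmpeg command line to transcode every file into a single target format. Directory names, codecs and bitrates are assumptions.

```python
# Minimal sketch: batch-transcode a directory of media files to a single
# target format by shelling out to the ffmpeg CLI. The paper's platform is a
# Java/Spring system; this only illustrates the underlying conversion step.
import pathlib
import subprocess

SRC_DIR = pathlib.Path("incoming")        # hypothetical input directory
DST_DIR = pathlib.Path("converted")       # hypothetical output directory
DST_DIR.mkdir(exist_ok=True)

for src in SRC_DIR.iterdir():
    if not src.is_file():
        continue
    dst = DST_DIR / (src.stem + ".mp4")
    cmd = [
        "ffmpeg", "-y",                   # overwrite existing outputs
        "-i", str(src),                   # input file
        "-c:v", "libx264", "-b:v", "2M",  # assumed video codec and bitrate
        "-c:a", "aac", "-b:a", "192k",    # assumed audio codec and bitrate
        str(dst),
    ]
    result = subprocess.run(cmd, capture_output=True)
    status = "ok" if result.returncode == 0 else "failed"
    print(f"{src.name}: {status}")
```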

  4. The Audio Description as a Physics Teaching Tool

    Science.gov (United States)

    Cozendey, Sabrina; Costa, Maria da Piedade

    2016-01-01

    This study analyses the use of audio description in teaching physics concepts, aiming to determine the variables that influence the understanding of the concepts. One educational resource was audio described; to produce the audio description, the screen was frozen. The video, with and without audio description, should be presented to students, so that…

  5. Voice activity detection using audio-visual information

    DEFF Research Database (Denmark)

    Petsatodis, Theodore; Pnevmatikakis, Aristodemos; Boukis, Christos

    2009-01-01

    -decision scheme. The Mel-Frequency Cepstral Coefficients and the vertical mouth opening are the chosen audio and visual features respectively, both augmented with their first-order derivatives. The proposed system is assessed using far-field recordings from four different speakers and under various levels...... of additive white Gaussian noise, to obtain performance superior to that which each unimodal component alone can achieve....

  6. Virtual environment display for a 3D audio room simulation

    Science.gov (United States)

    Chapin, William L.; Foster, Scott

    1992-06-01

    Recent developments in virtual 3D audio and synthetic aural environments have produced a complex acoustical room simulation. The acoustical simulation models a room with walls, ceiling, and floor of selected sound reflecting/absorbing characteristics and unlimited independent localizable sound sources. This non-visual acoustic simulation, implemented with 4 audio Convolvotrons™ by Crystal River Engineering and coupled to the listener with a Polhemus Isotrak™ tracking the listener's head position and orientation, and stereo headphones returning binaural sound, is quite compelling to most listeners with eyes closed. This immersive effect should be reinforced when properly integrated into a full, multi-sensory virtual environment presentation. This paper discusses the design of an interactive, visual virtual environment, complementing the acoustic model and specified to: 1) allow the listener to freely move about the space, a room of manipulable size, shape, and audio character, while interactively relocating the sound sources; 2) reinforce the listener's feeling of telepresence into the acoustical environment with visual and proprioceptive sensations; 3) enhance the audio with the graphic and interactive components, rather than overwhelm or reduce it; and 4) serve as a research testbed and technology transfer demonstration. The hardware/software design of two demonstration systems, one installed and one portable, is discussed through the development of four iterative configurations. The installed system implements a head-coupled, wide-angle, stereo-optic tracker/viewer and multi-computer simulation control. The portable demonstration system implements a head-mounted wide-angle, stereo-optic display, separate head and pointer electro-magnetic position trackers, a heterogeneous parallel graphics processing system, and object oriented C++ program code.

  7. Video equipment of tele dosimetry and audio

    International Nuclear Information System (INIS)

    Ojeda R, M.A.; Padilla C, I.

    2007-01-01

    Carrying out work in a high-radiation area requires detailed knowledge of the work surroundings, effective communication and vision, and close dosimetric control. Where the work involves confined spaces and restricted access, noise that hinders communication, demanding operating conditions, radiation fields and decision making, tools are needed that allow full control of the environment so that timely and effective decisions can be made at the place where the task is performed. Under this elementary concept, a project was developed at the Laguna Verde plant that provided an interactive control mechanism for complex spaces: to see, to hear, to speak, to measure. This concept led to the creation of a system equipped with closed-circuit television, wireless communication systems, wireless tele-dosimetry systems, VHS and DVD recording equipment, and uninterruptible power supplies. The system requires an electric power socket and the installation of two cables per CCTV camera. It can be moved by one person and put into operation in 5 minutes using a checklist. The concept was developed in the project named VETA-1 (Video Equipment of Tele-dosimetry and Audio). The objective of this work is to present the development of the VETA-1 tool, whose first prototype was completed in May of the present year. The VETA-1 project arose from the need to optimize dose; it is an ALARA tool with countless applications, as was proven during the 12th refueling outage of Unit 1. The VETA-1 project integrates a recording system with the primary aim of analyzing, at the place where the task is performed, the details needed for effective and timely decisions; the resulting information is also useful for personnel training and the planning of future work. The VETA-1 system is an ALARA tool for quick-response control. (Author)

  8. Extraction of ions and electrons from audio frequency plasma source

    Directory of Open Access Journals (Sweden)

    N. A. Haleem

    2016-09-01

    Full Text Available Herein, the extraction of a high ion/electron current from an audio frequency (AF) nitrogen gas discharge (10–100 kHz) is studied. The system is characterized by its small size (L = 20 cm, inner diameter = 3.4 cm), its capacitive discharge electrodes inside the tube, and its relatively high discharge pressure of ∼0.3 Torr, without the need for a high-vacuum system or magnetic fields. The extraction system for the ion/electron current from the plasma is a very simple electrode that allows self-focusing of the beam by adjusting its position relative to the source exit. The working discharge conditions were a frequency from 10 to 100 kHz, a power from 50 to 500 W, and a gap distance between the plasma meniscus surface and the extractor electrode extending from 3 to 13 mm. The extracted ion/electron current is found to depend mainly on the discharge power, the extraction gap width, and the frequency of the audio supply. The SIMION 3D program, version 7.0, is used to simulate ion trajectories as a reference to compare with and optimize the experimental extraction beam from the present audio frequency plasma source under identical operating conditions. The focal point as well as the beam diameter at the collector area is deduced. The simulations show respectable agreement with the experimental results, and together they provide the basis for optimizing the extraction electrode construction and its parameters for beam production.

  9. Analytical Features: A Knowledge-Based Approach to Audio Feature Generation

    Directory of Open Access Journals (Sweden)

    Pachet François

    2009-01-01

    Full Text Available We present a feature generation system designed to create audio features for supervised classification tasks. The main contribution to feature generation studies is the notion of analytical features (AFs), a construct designed to support the representation of knowledge about audio signal processing. We describe the most important aspects of AFs, in particular their dimensional type system, on which are based pattern-based random generators, heuristics, and rewriting rules. We show how AFs generalize or improve previous approaches used in feature generation. We report on several projects using AFs for difficult audio classification tasks, demonstrating their advantage over standard audio features. More generally, we propose analytical features as a paradigm to bring raw signals into the world of symbolic computation.

  10. Investigation des correlations existant entre la perception de qualite audio et les reactions physiologiques d'un auditeur

    Science.gov (United States)

    Baudot, Matthias

    Subjective listening tests are used to evaluate the reproduction fidelity of audio coding systems (codecs). The project presented here aims to assess the possibility of using physiological reactions (electrodermal, cardiac, muscular and cerebral activity) instead of a score given by the listener in order to characterize the performance of a codec. This would provide an evaluation method closer to the subject's actual perception of audio quality. Listening tests involving well-known audio degradations, combined with the measurement of physiological reactions, were carried out with 4 listeners. The analysis of the results shows that certain physiological characteristics provide reliable information about the perceived audio quality, in a repeatable way for nearly 70% of the audio signals tested within one subject, and for nearly 60% of the audio sequences tested across all subjects. This supports the feasibility of such a method for the subjective evaluation of audio codecs. Keywords: subjective listening test, audio codec evaluation, physiological measurements, perceived audio quality, electrodermal conductance, photoplethysmography, electromyogram, electroencephalogram

  11. Implementation of Audio signal by using wavelet transform

    OpenAIRE

    Chakresh kumar,; Chandra Shekhar; Ashu Soni; Bindu Thakral

    2010-01-01

    Audio coding is the technology to represent audio in digital form with as few bits as possible while maintaining the intelligibility and quality required for a particular application. Interest in audio coding is motivated by the evolution to digital communications and the requirement to minimize bit rate, and hence conserve bandwidth. There is always a tradeoff between compression ratio and maintaining the delivered audio quality and intelligibility. Audio coding is widely used in applications...

  12. Big Data Analytics: Challenges And Applications For Text, Audio, Video, And Social Media Data

    OpenAIRE

    Jai Prakash Verma; Smita Agrawal; Bankim Patel; Atul Patel

    2016-01-01

    All types of automated machine systems generate large amounts of data in different forms, such as statistical, text, audio, video, sensor, and biometric data, giving rise to the term Big Data. In this paper we discuss the issues, challenges, and applications of these types of Big Data with consideration of the big data dimensions. We discuss social media data analytics, content-based analytics, text data analytics, and audio and video data analytics, their issues and expected applica...

  13. Single conversion audio amplifier and DC-AC converters with high performance and low complexity control scheme

    DEFF Research Database (Denmark)

    Poulsen, Søren; Andersen, Michael Andreas E.

    2004-01-01

    This paper proposes a novel control topology for a mains isolated single conversion audio amplifier and DC-AC converters. The topology is made for use in audio applications, and differs from prior art in terms of significantly reduced distortion as well as lower system complexity. The topology can...

  14. Audio Arduino - an ALSA (Advanced Linux Sound Architecture) audio driver for FTDI-based Arduinos

    DEFF Research Database (Denmark)

    Dimitrov, Smilen; Serafin, Stefania

    2011-01-01

    A contemporary PC user typically expects a sound card to be a piece of hardware that: can be manipulated by 'audio' software (most typically exemplified by 'media players'); and allows interfacing of the PC to audio reproduction and/or recording equipment. As such, a 'sound card' can be conside...

  15. Introduction of audio gating to further reduce organ motion in breathing synchronized radiotherapy

    International Nuclear Information System (INIS)

    Kubo, H. Dale; Wang Lili

    2002-01-01

    With breathing synchronized radiotherapy (BSRT), a voltage signal derived from an organ displacement detector is usually displayed on the vertical axis whereas the elapsed time is shown on the horizontal axis. The voltage gate window is set on the breathing voltage signal. Whenever the breathing signal falls between the two gate levels, a gate pulse is produced to enable the treatment machine. In this paper a new gating mechanism, audio (or time-sequence) gating, is introduced and is integrated into the existing voltage gating system. The audio gating takes advantage of the repetitive nature of the breathing signal when repetitive audio instruction is given to the patient. The audio gating is aimed at removing the regions of sharp rises and falls in the breathing signal that cannot be removed by the voltage gating. When the breathing signal falls between voltage gate levels as well as between audio-gate levels, the voltage- and audio-gated radiotherapy (ART) system will generate an AND gate pulse. When this gate pulse is received by a linear accelerator, the linear accelerator becomes 'enabled' for beam delivery and will deliver the beam when all other interlocks are removed. This paper describes a new gating mechanism and a method of recording the beam-on signal, both of which are configured into a laptop computer. The paper also presents evidence of some clinical advantages achieved with the ART system.

  16. Audio-vocal interaction in single neurons of the monkey ventrolateral prefrontal cortex.

    Science.gov (United States)

    Hage, Steffen R; Nieder, Andreas

    2015-05-06

    Complex audio-vocal integration systems depend on a strong interconnection between the auditory and the vocal motor system. To gain cognitive control over audio-vocal interaction during vocal motor control, the PFC needs to be involved. Neurons in the ventrolateral PFC (VLPFC) have been shown to separately encode the sensory perceptions and motor production of vocalizations. It is unknown, however, whether single neurons in the PFC reflect audio-vocal interactions. We therefore recorded single-unit activity in the VLPFC of rhesus monkeys (Macaca mulatta) while they produced vocalizations on command or passively listened to monkey calls. We found that 12% of randomly selected neurons in VLPFC modulated their discharge rate in response to acoustic stimulation with species-specific calls. Almost three-fourths of these auditory neurons showed an additional modulation of their discharge rates either before and/or during the monkeys' motor production of vocalization. Based on these audio-vocal interactions, the VLPFC might be well positioned to combine higher order auditory processing with cognitive control of the vocal motor output. Such audio-vocal integration processes in the VLPFC might constitute a precursor for the evolution of complex learned audio-vocal integration systems, ultimately giving rise to human speech. Copyright © 2015 the authors 0270-6474/15/357030-11$15.00/0.

  17. Audio stream classification for multimedia database search

    Science.gov (United States)

    Artese, M.; Bianco, S.; Gagliardi, I.; Gasparini, F.

    2013-03-01

    Search and retrieval of huge archives of multimedia data is a challenging task. A classification step is often used to reduce the number of entries on which to perform the subsequent search. In particular, when new entries of the database are continuously added, a fast classification based on simple threshold evaluation is desirable. In this work we present a CART-based (Classification And Regression Tree [1]) classification framework for audio streams belonging to multimedia databases. The database considered is the Archive of Ethnography and Social History (AESS) [2], which is mainly composed of popular songs and other audio records describing the popular traditions handed down generation by generation, such as traditional fairs and customs. The peculiarities of this database are that it is continuously updated; the audio recordings are acquired in unconstrained environments; and it is difficult for a non-expert human user to create the ground-truth labels. In our experiments, half of all the available audio files have been randomly extracted and used as the training set. The remaining ones have been used as the test set. The classifier has been trained to distinguish among three different classes: speech, music, and song. All the audio files in the dataset have been previously manually labeled into the three classes defined above by domain experts.
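
    As a compact illustration of the CART-style classification step described above (not the AESS system itself), the sketch below trains a decision tree on a random half of a feature matrix to separate speech, music and song, mirroring the train/test split in the abstract; the feature matrix here is a random placeholder for real audio descriptors.

```python
# Minimal sketch: CART-style classification of audio streams into
# speech / music / song, using a random 50/50 train-test split as in the
# abstract. Features here are random placeholders for real audio descriptors.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
n_files, n_features = 300, 20
X = rng.normal(size=(n_files, n_features))        # placeholder audio features
y = rng.integers(0, 3, size=n_files)              # 0=speech, 1=music, 2=song

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)          # half for training

tree = DecisionTreeClassifier(max_depth=5, random_state=0)  # CART-style tree
tree.fit(X_train, y_train)

print(classification_report(
    y_test, tree.predict(X_test),
    labels=[0, 1, 2], target_names=["speech", "music", "song"]))
```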

  18. Automatic summarization of soccer highlights using audio-visual descriptors.

    Science.gov (United States)

    Raventós, A; Quijada, R; Torres, Luis; Tarrés, Francesc

    2015-01-01

    Automatic summarization generation of sports video content has been an object of great interest for many years. Although semantic description techniques have been proposed, many of the approaches still rely on low-level video descriptors that render quite limited results due to the complexity of the problem and to the low capability of the descriptors to represent semantic content. In this paper, a new approach for automatic highlights summarization generation of soccer videos using audio-visual descriptors is presented. The approach is based on the segmentation of the video sequence into shots that will be further analyzed to determine their relevance and interest. Of special interest in the approach is the use of the audio information, which provides additional robustness to the overall performance of the summarization system. For every video shot a set of low- and mid-level audio-visual descriptors are computed and subsequently combined in order to obtain different relevance measures based on empirical knowledge rules. The final summary is generated by selecting those shots with the highest interest according to the specifications of the user and the results of the relevance measures. A variety of results are presented with real soccer video sequences that prove the validity of the approach.

  19. Audio Description as a Pedagogical Tool

    Directory of Open Access Journals (Sweden)

    Georgina Kleege

    2015-05-01

    Full Text Available Audio description is the process of translating visual information into words for people who are blind or have low vision. Typically such description has focused on films, museum exhibitions, images and video on the internet, and live theater. Because it allows people with visual impairments to experience a variety of cultural and educational texts that would otherwise be inaccessible, audio description is a mandated aspect of disability inclusion, although it remains markedly underdeveloped and underutilized in our classrooms and in society in general. Along with increasing awareness of disability, audio description pushes students to practice close reading of visual material, deepen their analysis, and engage in critical discussions around the methodology, standards and values, language, and role of interpretation in a variety of academic disciplines. We outline a few pedagogical interventions that can be customized to different contexts to develop students' writing and critical thinking skills through guided description of visual material.

  20. Frequency Hopping Method for Audio Watermarking

    Directory of Open Access Journals (Sweden)

    A. Anastasijević

    2012-11-01

    Full Text Available This paper evaluates the degradation of audio content for a perceptible removable watermark. Two different approaches to embedding the watermark in the spectral domain were investigated. The frequencies for watermark embedding are chosen according to a pseudorandom sequence, making the methods robust. Consequently, the lower quality audio can be used for promotional purposes. For a fee, the watermark can be removed with a secret watermarking key. Objective and subjective testing was conducted in order to measure the degradation level for the watermarked music samples and to examine residual distortion for different parameters of the watermarking algorithm and different music genres.
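
    A rough sketch of the pseudorandom frequency-hopping idea is shown below: a secret seed selects FFT bins in each frame whose magnitudes are attenuated, and knowing the seed the attenuation can be undone. It is a simplified stand-in for the two methods evaluated in the paper; frame size, bin range and attenuation depth are arbitrary assumptions.

```python
# Minimal sketch of pseudorandom frequency-hopping watermarking: a secret
# seed selects FFT bins in each frame whose magnitudes are attenuated.
# Knowing the seed, the attenuation can be undone (approximately). Frame
# size, bin range and attenuation depth are illustrative assumptions.
import numpy as np

FRAME = 2048
LOW_BIN, HIGH_BIN = 40, 400     # assumed band in which to hop
BINS_PER_FRAME = 8
ATTENUATION = 0.3               # perceptible on purpose (removable watermark)

def process(signal, seed, factor):
    """Scale pseudorandomly chosen bins of every frame by `factor`."""
    rng = np.random.default_rng(seed)
    out = signal.astype(np.float64).copy()
    for start in range(0, len(out) - FRAME + 1, FRAME):
        bins = rng.choice(np.arange(LOW_BIN, HIGH_BIN), BINS_PER_FRAME,
                          replace=False)
        spec = np.fft.rfft(out[start:start + FRAME])
        spec[bins] *= factor
        out[start:start + FRAME] = np.fft.irfft(spec, n=FRAME)
    return out

# Embed the perceptible watermark, then remove it with the secret seed.
rng = np.random.default_rng(1)
audio = rng.normal(size=FRAME * 10)
marked = process(audio, seed=1234, factor=ATTENUATION)
restored = process(marked, seed=1234, factor=1.0 / ATTENUATION)
print(np.max(np.abs(restored - audio)) < 1e-9)    # True: same hop sequence
```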

  1. Combining multiple observations of audio signals

    Science.gov (United States)

    Bayram, Ilker

    2013-09-01

    We consider the problem of reconstructing an audio signal from multiple observations, each of which is contaminated with time-varying noise. Assuming that the time-variation is different for each observation, we propose an estimation formulation that can adapt to these changes. Specifically, we postulate a parametric reconstruction and choose the parameters so that the reconstruction minimizes a cost function. The cost function is selected so that audio signals are penalized less compared to arbitrary signals with the same energy. As cost functions, we experiment with a recently proposed prior as well as mixed norms placed on the short time Fourier coefficients.

  2. Enhancing Navigation Skills through Audio Gaming.

    Science.gov (United States)

    Sánchez, Jaime; Sáenz, Mauricio; Pascual-Leone, Alvaro; Merabet, Lotfi

    2010-01-01

    We present the design, development and initial cognitive evaluation of an Audio-based Environment Simulator (AbES). This software allows a blind user to navigate through a virtual representation of a real space for the purposes of training orientation and mobility skills. Our findings indicate that users feel satisfied and self-confident when interacting with the audio-based interface, and the embedded sounds allow them to correctly orient themselves and navigate within the virtual world. Furthermore, users are able to transfer spatial information acquired through virtual interactions into real world navigation and problem solving tasks.

  4. Overview of the audio description in spanish DTT channels

    Directory of Open Access Journals (Sweden)

    Francisco José González

    2014-09-01

    Full Text Available This paper presents an analysis of current practices in audio description in Spanish TV channels. The results of this research show that in some channels the audio description is broadcast as 'receiver-mix audio description', while in other channels the alternative used is 'broadcaster-mix audio description'. The problems detected with the activation of audio description in users' TVs can be solved by applying some enhancements to the signalling information used by broadcasters in their DVB TV channels. Finally, some recommendations for users are included, presenting the key aspects of audio description activation in their TVs.

  5. Audio wiring guide how to wire the most popular audio and video connectors

    CERN Document Server

    Hechtman, John

    2012-01-01

    Whether you're a pro or an amateur, a musician or into multimedia, you can't afford to guess about audio wiring. The Audio Wiring Guide is a comprehensive, easy-to-use guide that explains exactly what you need to know. No matter the size of your wiring project or installation, this handy tool provides you with the essential information you need and the techniques to use it. Using The Audio Wiring Guide is like having an expert at your side. By following the clear, step-by-step directions, you can do professional-level work at a fraction of the cost.

  6. Mobile video-to-audio transducer and motion detection for sensory substitution

    Directory of Open Access Journals (Sweden)

    Maxime eAmbard

    2015-10-01

    Full Text Available Visuo-auditory sensory substitution systems are augmented reality devices that translate a video stream into an audio stream in order to help the blind in daily tasks requiring visuo-spatial information. In this work, we present both a new mobile device and a transcoding method specifically designed to sonify moving objects. Frame differencing is used to extract spatial features from the video stream and two-dimensional spatial information is converted into audio cues using pitch, interaural time difference and interaural level difference. Using numerical methods, we attempt to reconstruct visuo-spatial information based on audio signals generated from various video stimuli. We show that despite a contrasted visual background and a highly lossy encoding method, the information in the audio signal is sufficient to allow object localization, object trajectory evaluation, object approach detection, and spatial separation of multiple objects. We also show that this type of audio signal can be interpreted by human users by asking ten subjects to discriminate trajectories based on generated audio signals.
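
    The snippet below is a toy version of the transcoding described above for a single moving object: frame differencing picks the most active pixel, whose column sets an interaural level difference and whose row sets the pitch (the interaural time difference is omitted). It is a schematic reconstruction for illustration only, not the authors' device.

```python
# Minimal sketch of video-to-audio sensory substitution: frame differencing
# locates motion, then the (x, y) position is sonified as interaural level
# difference (x) and pitch (y). The mapping constants are assumptions.
import numpy as np

FS = 22050
TONE_DUR = 0.1                  # seconds of audio per video frame
F_LOW, F_HIGH = 300.0, 3000.0   # assumed pitch range for vertical position

def sonify_motion(prev_frame, frame):
    """Return a stereo audio block (N, 2) encoding the strongest motion."""
    diff = np.abs(frame.astype(np.float64) - prev_frame.astype(np.float64))
    y, x = np.unravel_index(np.argmax(diff), diff.shape)
    h, w = diff.shape

    # Vertical position -> pitch (top of the image = high pitch).
    freq = F_HIGH - (y / (h - 1)) * (F_HIGH - F_LOW)
    # Horizontal position -> interaural level difference (pan).
    pan = x / (w - 1)                        # 0 = left, 1 = right
    t = np.arange(int(FS * TONE_DUR)) / FS
    tone = np.sin(2 * np.pi * freq * t)
    left = (1.0 - pan) * tone
    right = pan * tone
    return np.stack([left, right], axis=1)

# Toy frames: a bright pixel appears near the top right of the image.
a = np.zeros((120, 160), dtype=np.uint8)
b = a.copy(); b[30, 140] = 255
block = sonify_motion(a, b)
print(block.shape, "louder on right:", block[:, 1].max() > block[:, 0].max())
```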

  7. The complete guide to high-end audio

    CERN Document Server

    Harley, Robert

    2015-01-01

    An updated edition of what many consider the "bible of high-end audio"   In this newly revised and updated fifth edition, Robert Harley, editor in chief of the Absolute Sound magazine, tells you everything you need to know about buying and enjoying high-quality hi-fi. With this book, discover how to get the best sound for your money, how to identify the weak links in your system and upgrade where it will do the most good, how to set up and tweak your system for maximum performance, and how to become a more perceptive and appreciative listener. Just a few of the secrets you will learn cover hi

  8. Quantitative characterisation of audio data by ordinal symbolic dynamics

    Science.gov (United States)

    Aschenbrenner, T.; Monetti, R.; Amigó, J. M.; Bunk, W.

    2013-06-01

    Ordinal symbolic dynamics has developed into a valuable method to describe complex systems. Recently, using the concept of transcripts, the coupling behaviour of systems was assessed, combining the properties of the symmetric group with information theoretic ideas. In this contribution, methods from the field of ordinal symbolic dynamics are applied to the characterisation of audio data. Coupling complexity between frequency bands of solo violin music, as a fingerprint of the instrument, is used for classification purposes within a support vector machine scheme. Our results suggest that coupling complexity is able to capture essential characteristics, sufficient to distinguish among different violins.
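
    For readers unfamiliar with the ordinal approach, the sketch below computes a basic normalized permutation (ordinal-pattern) entropy for a signal; the transcript-based coupling complexity used in the paper builds on the same symbolisation step. The order and delay values are arbitrary choices here.

```python
# Minimal sketch of ordinal symbolic dynamics: map a signal to ordinal
# patterns of a given order and compute the normalized permutation entropy.
# The paper's coupling complexity (via transcripts) builds on this step;
# order and delay below are arbitrary choices.
import itertools
import numpy as np

def permutation_entropy(x, order=4, delay=1):
    patterns = list(itertools.permutations(range(order)))
    index = {p: i for i, p in enumerate(patterns)}
    counts = np.zeros(len(patterns))
    for i in range(len(x) - (order - 1) * delay):
        window = x[i:i + order * delay:delay]
        counts[index[tuple(np.argsort(window))]] += 1
    probs = counts[counts > 0] / counts.sum()
    return -np.sum(probs * np.log2(probs)) / np.log2(len(patterns))

rng = np.random.default_rng(0)
noise = rng.normal(size=5000)                            # maximally irregular
tone = np.sin(2 * np.pi * 5 * np.arange(5000) / 1000)    # very regular
print("noise:", round(permutation_entropy(noise), 3))    # close to 1
print("tone: ", round(permutation_entropy(tone), 3))     # much smaller
```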

  9. A listening test system for automotive audio

    DEFF Research Database (Denmark)

    Christensen, Flemming; Martin, Geoff; Minnaar, Pauli

    2005-01-01

    A selection procedure was devised in order to select listeners for experiments in which their main task will be to judge multi-channel reproduced sound. 91 participants filled in a web-based questionnaire. 78 of them took part in an assessment of their hearing thresholds, their spatial hearing, a...

  10. Predistortion of a Bidirectional Cuk Audio Amplifier

    DEFF Research Database (Denmark)

    Birch, Thomas Hagen; Nielsen, Dennis; Knott, Arnold

    2014-01-01

    using predistortion. This paper suggests linearizing a nonlinear bidirectional Cuk audio amplifier using an analog predistortion approach. A prototype power stage was built and results show that a voltage gain of up to 9 dB and reduction in THD from 6% down to 3% was obtainable using this approach....

  11. Spatial audio quality perception (part 2)

    DEFF Research Database (Denmark)

    Conetta, R.; Brookes, T.; Rumsey, F.

    2015-01-01

    location, envelopment, coverage angle, ensemble width, and spaciousness. They can also impact timbre, and changes to timbre can then influence spatial perception. Previously obtained data was used to build a regression model of perceived spatial audio quality in terms of spatial and timbral metrics...

  12. Audio Signal Quantization Companding Laws Comparative Analysis

    Directory of Open Access Journals (Sweden)

    Aleksei A. Matskaniuk

    2012-05-01

    Full Text Available We describe the results of research on the effectiveness of the quantization scale that is optimal in the sense of minimum error variance for audio playback (the Lloyd-Max algorithm), and of scales based on A-law and Mu-law companding.
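
    To make the comparison concrete, the sketch below quantizes a test signal both uniformly and through Mu-law companding at the same bit depth and reports the resulting error variance; the Lloyd-Max scale itself is not reproduced, and the bit depth, Mu value and test-signal statistics are common illustrative choices rather than the paper's conditions.

```python
# Minimal sketch: compare the quantization error variance of a uniform scale
# against a Mu-law companded scale at the same bit depth. The Lloyd-Max
# (minimum-error-variance) scale from the paper is not reproduced here.
import numpy as np

MU = 255.0        # standard Mu-law constant
BITS = 8
LEVELS = 2 ** BITS

def mu_compress(x, mu=MU):
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def mu_expand(y, mu=MU):
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(mu)) / mu

def uniform_quantize(x, levels=LEVELS):
    step = 2.0 / levels                      # input assumed in [-1, 1)
    return np.clip(np.round(x / step) * step, -1.0, 1.0 - step)

rng = np.random.default_rng(0)
# Speech/audio-like test signal: Laplacian amplitudes, mostly small values.
x = np.clip(rng.laplace(scale=0.05, size=100_000), -1.0, 1.0)

err_uniform = x - uniform_quantize(x)
err_mu = x - mu_expand(uniform_quantize(mu_compress(x)))

print("uniform error variance:", np.var(err_uniform))
print("mu-law  error variance:", np.var(err_mu))   # smaller for small signals
```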

  13. Utilization of Nonlinear Converters for Audio Amplification

    DEFF Research Database (Denmark)

    Iversen, Niels; Birch, Thomas; Knott, Arnold

    2012-01-01

    . The introduction of non-linear converters for audio amplification defeats this limitation. A Cuk converter, designed to deliver an AC peak output voltage twice the supply voltage, is presented in this paper. A 3V prototype has been developed to prove the concept. The prototype shows that it is possible to achieve...

  14. Providing Students with Formative Audio Feedback

    Science.gov (United States)

    Brearley, Francis Q.; Cullen, W. Rod

    2012-01-01

    The provision of timely and constructive feedback is increasingly challenging for busy academics. Ensuring effective student engagement with feedback is equally difficult. Increasingly, studies have explored provision of audio recorded feedback to enhance effectiveness and engagement with feedback. Few, if any, of these focus on purely formative…

  15. An ESL Audio-Script Writing Workshop

    Science.gov (United States)

    Miller, Carla

    2012-01-01

    The roles of dialogue, collaborative writing, and authentic communication have been explored as effective strategies in second language writing classrooms. In this article, the stages of an innovative, multi-skill writing method, which embeds students' personal voices into the writing process, are explored. A 10-step ESL Audio Script Writing Model…

  16. Agency Video, Audio and Imagery Library

    Science.gov (United States)

    Grubbs, Rodney

    2015-01-01

    The purpose of this presentation was to inform the ISS International Partners of the new NASA Agency Video, Audio and Imagery Library (AVAIL) website. AVAIL is a new resource for the public to search for and download NASA-related imagery, and is not intended to replace the current process by which the International Partners receive their Space Station imagery products.

  17. Frequency Compensation of an Audio Power Amplifier

    NARCIS (Netherlands)

    van der Zee, Ronan A.R.; van Heeswijk, R.

    2006-01-01

    A car audio power amplifier is presented that uses a frequency compensation scheme which avoids large compensation capacitors around the MOS power transistors, while retaining the bandwidth and stable load range of nested miller compensation. THD is 0.005%@(1kHz, 10W), SNR is 108dB, and the

  18. Audio Journal in an ELT Context

    Directory of Open Access Journals (Sweden)

    Neşe Aysin Siyli

    2012-09-01

    Full Text Available It is widely acknowledged that one of the most serious problems students of English as a foreign language face is their lack of opportunity to practice the language outside the classroom. Generally, the classroom is the sole environment where they can practice English, which by its nature does not provide a rich setting to help students develop their competence by putting the language into practice. Motivated by this need, this descriptive study investigated the impact of audio dialog journals on students' speaking skills. It also aimed to gain insights into students' and the teacher's opinions on keeping audio dialog journals outside the class. The data of the study were drawn from student and teacher audio dialog journals, students' written feedback, interviews held with the students, and teacher observations. The descriptive analysis of the data revealed that audio dialog journals served a number of functions ranging from cognitive to linguistic, from pedagogical to psychological, and social. The findings and pedagogical implications of the study are discussed in detail.

  19. Consuming audio: an introduction to Tweak Theory

    NARCIS (Netherlands)

    Perlman, Marc

    2014-01-01

    Audio technology is a medium for music, and when we pay attention to it we tend to speculate about its effects on the music it transmits. By now there are well-established traditions of commentary (many of them critical) about the impact of musical reproduction on musical production.

  20. Audible Aliasing Distortion in Digital Audio Synthesis

    Directory of Open Access Journals (Sweden)

    J. Schimmel

    2012-04-01

    Full Text Available This paper deals with aliasing distortion in the digital audio synthesis of classic periodic waveforms with infinite Fourier series, for electronic musical instruments. When these waveforms are generated in the digital domain, aliasing appears due to their unlimited bandwidth. There are several techniques for the synthesis of these signals that have been designed to avoid or reduce the aliasing distortion. However, these techniques have high computing demands. One can say that today's computers have enough computing power to use these methods. However, we have to realize that today's computer-aided music production requires tens of multi-timbre voices generated simultaneously by software synthesizers, and most of the computing power must be reserved for the hard-disc recording subsystem and real-time audio processing of many audio channels with a lot of audio effects. Trivially generated classic analog synthesizer waveforms are therefore still effective for sound synthesis. We cannot avoid the aliasing distortion, but spectral components produced by the aliasing can be masked with harmonic components and thus made inaudible if a sufficient oversampling ratio is used. This paper deals with the assessment of audible aliasing distortion with the help of a psychoacoustic model of simultaneous masking and compares the computing demands of trivial generation using oversampling with those of other methods.
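
    The oversampling argument can be demonstrated directly: the sketch below trivially generates a sawtooth both at the target rate and at eight times that rate, low-pass filters and decimates the latter, and compares the strongest spectral component that does not sit near a harmonic. The filter, oversampling ratio and measurement details are simple illustrative choices, not the paper's psychoacoustic assessment.

```python
# Minimal sketch: trivially generated sawtooth with and without oversampling.
# Generating at 8x the target rate, low-pass filtering, and decimating moves
# most aliased components far below the harmonic components that mask them.
# Filter and ratio choices are illustrative, not the paper's parameters.
import numpy as np
from scipy.signal import resample_poly

FS = 44100
F0 = 1234.5          # non-integer-divisor fundamental to provoke aliasing
N = 1 << 16
OVERSAMPLE = 8

def trivial_saw(f0, fs, n):
    phase = (f0 * np.arange(n) / fs) % 1.0
    return 2.0 * phase - 1.0

# Direct trivial synthesis at the target rate (aliased).
saw_aliased = trivial_saw(F0, FS, N)

# Trivial synthesis at 8x the rate, then filtered decimation back to FS.
saw_hi = trivial_saw(F0, FS * OVERSAMPLE, N * OVERSAMPLE)
saw_oversampled = resample_poly(saw_hi, up=1, down=OVERSAMPLE)[:N]

def spectrum_db(x):
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    return 20 * np.log10(spec / spec.max() + 1e-12)

# Compare the level of the strongest component that is NOT near a harmonic.
freqs = np.fft.rfftfreq(N, d=1.0 / FS)
near_harmonic = (np.abs((freqs / F0) - np.round(freqs / F0)) * F0) < 30.0
for name, x in [("trivial", saw_aliased), ("8x oversampled", saw_oversampled)]:
    worst_alias = spectrum_db(x)[~near_harmonic].max()
    print(f"{name}: strongest non-harmonic component {worst_alias:.1f} dBFS")
```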

  1. Restoration of Local Degradations in Audio Signals

    Directory of Open Access Journals (Sweden)

    M. Brejl

    1996-09-01

    Full Text Available The paper presents an algorithm for restoration of local degradations in audio signals. The theoretical foundations and basic suggestions of this algorithm were published in [1]. A complete description of restoration process and some improvements are presented here.

  2. Extracting meaning from audio signals - a machine learning approach

    DEFF Research Database (Denmark)

    Larsen, Jan

    2007-01-01

    * Machine learning framework for sound search * Genre classification * Music and audio separation * Wind noise suppression

  3. Software tools for object-based audio production using the Audio Definition Model

    OpenAIRE

    Matthias , Geier; Carpentier , Thibaut; Noisternig , Markus; Warusfel , Olivier

    2017-01-01

    International audience; We present a publicly available set of tools for the integration of the Audio Definition Model (ADM) in production workflows. ADM is an open metadata model for the description of channel-, scene-, and object-based media within a Broadcast Wave Format (BWF) container. The software tools were developed within the European research project ORPHEUS (https://orpheus-audio.eu/) that aims at developing new end-to-end object-based media chains for broadcast. These tools allow ...

  4. 47 CFR 10.520 - Common audio attention signal.

    Science.gov (United States)

    2010-10-01

    47 CFR 10.520 (Telecommunication, Equipment Requirements), Common audio attention signal: A Participating CMS Provider and equipment manufacturers may only market devices for public use under part 10 that include an audio attention signal that...

  5. Debugging of Class-D Audio Power Amplifiers

    DEFF Research Database (Denmark)

    Crone, Lasse; Pedersen, Jeppe Arnsdorf; Mønster, Jakob Døllner

    2012-01-01

    Determining and optimizing the performance of a Class-D audio power amplifier can be very difficult without knowledge of the use of audio performance measuring equipment and of how the various noise and distortion sources influence the audio performance. This paper gives an introduction on how to measure...

  6. Tools for signal compression applications to speech and audio coding

    CERN Document Server

    Moreau, Nicolas

    2013-01-01

    This book presents tools and algorithms required to compress/uncompress signals such as speech and music. These algorithms are largely used in mobile phones, DVD players, HDTV sets, etc. In a first rather theoretical part, this book presents the standard tools used in compression systems: scalar and vector quantization, predictive quantization, transform quantization, entropy coding. In particular we show the consistency between these different tools. The second part explains how these tools are used in the latest speech and audio coders. The third part gives Matlab programs simulating t

  7. Using Audio-Derived Affective Offset to Enhance TV Recommendation

    DEFF Research Database (Denmark)

    Shepstone, Sven Ewan; Tan, Zheng-Hua; Jensen, Søren Holdt

    2014-01-01

    . First a user's mood profile is determined using 12-class audio-based emotion classifications. An initial TV content item is then displayed to the user based on the extracted mood profile. The user has the option to either accept the recommendation, or to critique the item once or several times......, by navigating the emotion space to request an alternative match. The final match is then compared to the initial match, in terms of the difference in the items' affective parameterization. This offset is then utilized in future recommendation sessions. The system was evaluated by eliciting three different...

  8. Audio coding in wireless acoustic sensor networks

    DEFF Research Database (Denmark)

    Zahedi, Adel; Østergaard, Jan; Jensen, Søren Holdt

    2015-01-01

    In this paper, we consider the problem of source coding for a wireless acoustic sensor network where each node in the network makes its own noisy measurement of the sound field, and communicates with other nodes in the network by sending and receiving encoded versions of the measurements. To make use of the correlation between the sources available at the nodes, we consider the possibility of combining the measurement and the received messages into one single message at each node instead of forwarding the received messages and separate encoding of the measurement. Moreover, to exploit...... ) for the resulting remote DSC problem under covariance matrix distortion constraints. We further show that for this problem, the Gaussian source is the worst to code. Thus, the Gaussian RDF provides an upper bound to other sources such as audio signals. We then turn our attention to audio signals. We consider......

  9. Enlace optoelectrónico de audio

    OpenAIRE

    García Lozano, Jesús

    2012-01-01

    In this project, a system capable of transmitting audio by means of infrared light is designed and implemented. Two main parts of the project can be distinguished: the transmitter module and the receiver module. The signal is fed into the transmitter module from any audio player. This signal undergoes FM modulation to improve the communication between transmitter and receiver, since transmission of the signal in baseband is more vulnerable to noise. Once modulated...

  10. Basic Concepts in Augmented Reality Audio

    OpenAIRE

    Lemordant, Jacques

    2010-01-01

    International audience; The basic difference between real and virtual sound environments is that virtual sounds are originating from another environment or are artificially created, whereas the real sounds are the natural existing sounds in the user's own environment. Augmented Reality Audio combines these aspects in a way where real and virtual sound scenes are mixed so that virtual sounds are perceived as an extension or a complement to the natural ones.

  11. New musical organology : the audio-games

    OpenAIRE

    Zénouda , Hervé

    2012-01-01

    International audience; This article aims to shed light on a new and emerging creative field: "Audio Games," a crossroad between video games and computer music. Today, a plethora of tiny applications, which propose entertaining audiovisual experiences with a preponderant sound dimension, are available for game consoles, computers, and mobile phones. These experiences represent a new universe where the gameplay of video games is applied to musical composition, hence creating new links betwee...

  12. Emerging topics in translation: Audio description

    OpenAIRE

    Perego, Elisa

    2012-01-01

    The volume deals with several aspects of audio description for the blind and sight impaired which came to the surface during the AD session of the conference Emerging topics in translation and interpreting held at the Department of Language, Translation and Interpreting Studies of the University of Trieste, 16-18 June 2010. The topics dealt with in the volume range from the more established (linguistic analysis of ADs in various languages, strategies to overcome possible obs...

  13. Digitisation of the CERN Audio Archives

    CERN Multimedia

    Maximilien Brice

    2006-01-01

    From the creation of CERN in 1954 until the mid-1980s, the audiovisual service recorded hundreds of hours of moments of life at CERN on audio tapes. These moments range from inaugurations of new facilities to VIP speeches and general-interest cultural seminars. The preservation process started in June 2005. In these pictures, we see Waltraud Hug working on an open-reel tape.

  14. Comparing audio and video data for rating communication.

    Science.gov (United States)

    Williams, Kristine; Herman, Ruth; Bontempo, Daniel

    2013-09-01

    Video recording has become increasingly popular in nursing research, adding rich nonverbal, contextual, and behavioral information. However, the benefits of video over audio data have not been well established. We compared communication ratings of audio versus video data using the Emotional Tone Rating Scale. Twenty raters watched video clips of nursing care and rated staff communication on 12 descriptors that reflect dimensions of person-centered and controlling communication. Another group rated audio-only versions of the same clips. Interrater consistency was high within each group, with an Intraclass Correlation Coefficient (ICC) (2,1) of .91 for audio and .94 for video. Interrater consistency for both groups combined was also high, with ICC (2,1) = .95 for audio and video. Communication ratings using audio and video data were highly correlated. The added value of video over audio-recorded data should be weighed when designing studies evaluating nursing care.

  15. AudioMUD: a multiuser virtual environment for blind people.

    Science.gov (United States)

    Sánchez, Jaime; Hassler, Tiago

    2007-03-01

    A number of virtual environments have been developed in recent years. Among them are some applications for blind people based on different types of audio, from simple sounds to 3-D audio. In this study, we pursued a different approach. We designed AudioMUD by using spoken text to describe the environment, navigation, and interaction. We have also introduced some collaborative features into the interaction between blind users. The core of a multiuser MUD game is a networked textual virtual environment. We developed AudioMUD by adding some collaborative features to the basic idea of a MUD and placed a simulated virtual environment inside the human body. This paper presents the design and usability evaluation of AudioMUD. Blind learners were motivated when interacting with AudioMUD and helped to improve the interaction through audio and interface design elements.

  16. Perceptually controlled doping for audio source separation

    Science.gov (United States)

    Mahé, Gaël; Nadalin, Everton Z.; Suyama, Ricardo; Romano, João MT

    2014-12-01

    The separation of an underdetermined audio mixture can be performed through sparse component analysis (SCA), which relies however on the strong hypothesis that the source signals are sparse in some domain. To overcome this difficulty in the case where the original sources are available before the mixing process, informed source separation (ISS) embeds in the mixture a watermark whose information can help a subsequent separation. Though powerful, this technique is generally specific to a particular mixing setup and may be compromised by an additional bitrate compression stage. Thus, instead of watermarking, we propose a `doping' method that makes the time-frequency representation of each source more sparse, while preserving its audio quality. This method is based on an iterative decrease of the distance between the distribution of the signal and a target sparse distribution, under a perceptual constraint. We aim to show that the proposed approach is robust to audio coding and that the use of the sparsified signals improves the source separation, in comparison with the original sources. In this work, the analysis is restricted to instantaneous mixtures and focused on voice sources.

  17. Securing Digital Audio using Complex Quadratic Map

    Science.gov (United States)

    Suryadi, MT; Satria Gunawan, Tjandra; Satria, Yudi

    2018-03-01

    In this digital era, exchanging data is common and easy to do, and therefore data are vulnerable to attack and manipulation by unauthorized parties. One data type that is vulnerable to attack is digital audio. We therefore need a data-securing method that is both robust and fast. One class of methods matching these criteria secures the data using a chaos function. The chaos function used in this research is the complex quadratic map (CQM). For certain parameter values, the key stream generated by the CQM passes all 15 NIST tests, which means that the key stream generated by this CQM is proven to be random. In addition, samples of encrypted digital sound, when tested using a goodness-of-fit test, are proven to be uniform, so securing digital audio using this method is not vulnerable to frequency-analysis attacks. The key space is very large, about 8.1 × 10^31 possible keys, and the key sensitivity is very small, about 10^-10, so this method is also not vulnerable to brute-force attack. Finally, the encryption and decryption processes run on average about 450 times faster than real time (the duration of the digital audio).
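
    As a rough illustration of the keystream idea described above (the abstract does not give the exact construction, so the iteration z <- z^2 + c, the byte-extraction rule and the key parameters below are all assumptions), a Python sketch of XOR-ing 8-bit audio with a complex-quadratic-map keystream could look like:

      import numpy as np

      def cqm_keystream(seed: complex, c: complex, n_bytes: int) -> np.ndarray:
          """Byte keystream from the complex quadratic map z <- z*z + c.
          The re-normalization and byte-folding rules are illustrative choices."""
          z = seed
          out = np.empty(n_bytes, dtype=np.uint8)
          for i in range(n_bytes):
              z = z * z + c
              if abs(z) > 2.0:              # re-inject to keep the orbit bounded
                  z = z / abs(z)
              out[i] = int((abs(z.real) + abs(z.imag)) * 1e6) & 0xFF
          return out

      def xor_cipher(samples: np.ndarray, seed: complex, c: complex) -> np.ndarray:
          """XOR 8-bit audio samples with the keystream (decryption is identical)."""
          return samples.astype(np.uint8) ^ cqm_keystream(seed, c, samples.size)

      # Round-trip on a stand-in for 8-bit audio, with hypothetical key parameters
      audio = (np.random.rand(1000) * 255).astype(np.uint8)
      cipher = xor_cipher(audio, seed=0.3 + 0.4j, c=-0.70176 - 0.3842j)
      assert np.array_equal(xor_cipher(cipher, 0.3 + 0.4j, -0.70176 - 0.3842j), audio)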

  18. Audio Spatial Representation Around the Body.

    Science.gov (United States)

    Aggius-Vella, Elena; Campus, Claudio; Finocchietti, Sara; Gori, Monica

    2017-01-01

    Studies have found that portions of the space around our body are coded differently by our brain. Numerous works have investigated visual and auditory spatial representation, focusing mostly on the spatial representation of stimuli presented at head level, especially in the frontal space. Only a few studies have investigated spatial representation around the entire body and its relationship with motor activity. Moreover, it is still not clear whether the space surrounding us is represented as a unitary dimension or whether it is split up into different portions, differently shaped by our senses and motor activity. To clarify these points, we investigated audio localization of dynamic and static sounds at different body levels. In order to understand the role of a motor action in auditory space representation, we asked subjects to localize sounds by pointing with the hand or the foot, or by giving a verbal answer. We found that audio localization differed depending on the body part considered. Moreover, a different pattern of response was observed when subjects were asked to perform actions, compared with the verbal responses. These results suggest that the audio space around our body is split into various spatial portions, which are perceived differently: front, back, around the chest, and around the foot, suggesting that these four areas could be differently modulated by our senses and our actions.

  19. Reduction in time-to-sleep through EEG based brain state detection and audio stimulation.

    Science.gov (United States)

    Zhuo Zhang; Cuntai Guan; Ti Eu Chan; Juanhong Yu; Aung Aung Phyo Wai; Chuanchu Wang; Haihong Zhang

    2015-08-01

    We developed an EEG- and audio-based sleep sensing and enhancing system, called iSleep (interactive Sleep enhancement apparatus). The system adopts a closed-loop approach which optimizes the audio recording selection based on the user's sleep status detected through our online EEG computing algorithm. The iSleep prototype comprises two major parts: 1) a sleeping mask integrated with a single-channel EEG electrode and amplifier, a pair of stereo earphones and a microcontroller with wireless circuitry for control and data streaming; 2) a mobile app to receive EEG signals for online sleep monitoring and audio playback control. In this study we attempt to validate our hypothesis that appropriate audio stimulation in relation to brain state can induce faster onset of sleep and improve the quality of a nap. We conducted experiments on 28 healthy subjects, each undergoing two nap sessions - one with a quiet background and one with our audio stimulation. We compared the time-to-sleep in both sessions between two groups of subjects, i.e., fast and slow sleep-onset groups. The p-value obtained from the Wilcoxon signed-rank test is 1.22e-04 for the slow-onset group, which demonstrates that iSleep can significantly reduce the time-to-sleep for people with difficulty falling asleep.
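
    The within-subject comparison described above reduces to a Wilcoxon signed-rank test on paired time-to-sleep values; a minimal Python sketch with placeholder numbers (not the study's data) is:

      from scipy.stats import wilcoxon

      # Hypothetical paired time-to-sleep values in minutes (quiet vs. audio-stimulated naps)
      quiet = [22.0, 30.5, 18.2, 27.9, 35.1, 24.3, 29.0, 31.8]
      audio = [15.4, 21.0, 16.9, 19.2, 26.5, 18.0, 22.7, 24.1]

      stat, p_value = wilcoxon(quiet, audio)   # two-sided by default
      print(f"W = {stat:.1f}, p = {p_value:.4f}")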

  20. Audio-visual speech experience with age influences perceived audio-visual asynchrony in speech.

    Science.gov (United States)

    Alm, Magnus; Behne, Dawn

    2013-10-01

    Previous research indicates that perception of audio-visual (AV) synchrony changes in adulthood. Possible explanations for these age differences include a decline in hearing acuity, a decline in cognitive processing speed, and increased experience with AV binding. The current study aims to isolate the effect of AV experience by comparing synchrony judgments from 20 young adults (20 to 30 yrs) and 20 normal-hearing middle-aged adults (50 to 60 yrs), an age range for which a decline of cognitive processing speed is expected to be minimal. When presented with AV stop consonant syllables with asynchronies ranging from 440 ms audio-lead to 440 ms visual-lead, middle-aged adults showed significantly less tolerance for audio-lead than young adults. Middle-aged adults also showed a greater shift in their point of subjective simultaneity than young adults. Natural audio-lead asynchronies are arguably more predictable than natural visual-lead asynchronies, and this predictability may render audio-lead thresholds more prone to experience-related fine-tuning.

  1. Performance comparison of audio codecs for high-quality color ring-back-tone services over CDMA

    Science.gov (United States)

    Lee, Young Han; Kim, Hong Kook; Yu, Jaehwang; Park, SeongSoo; Lee, Dong-Hahk; Woo, Daesic

    2006-10-01

    In this paper, we investigate the use of existing audio codecs for the purpose of a high-quality color ring-back-tone service. First of all, we examine the limitations of the enhanced variable rate codec (EVRC) from the viewpoint of music quality, because EVRC is the standard speech coder employed in code division multiple access (CDMA) systems. In order to figure out which existing audio codec is suitable for delivering music over CDMA or wideband CDMA (W-CDMA), several audio codecs, such as two different versions of MPEG AAC and the Enhanced AAC+ codec, are reviewed. Next, the music quality of the audio codecs is compared with that of EVRC, where the bit-rates of the audio codecs are set to around 10 kbit/s, because the color ring-back-tone service using one of the audio codecs should be realized by replacing EVRC with it. The quality comparison is performed by an informal listening test as well as an objective quality test. It is shown from the experiments that the audio codecs provide better music quality than EVRC and that, among them, the Enhanced AAC+ codec operated at a bit-rate of 10 kbit/s with a sampling rate of 32 kHz can be considered a new candidate for the high-quality color ring-back-tone service.

  2. Audio-vestibular function in human immunodeficiency virus infected patients in India.

    Science.gov (United States)

    Mathews, Suma Susan; Albert, Rita Ruby; Job, Anand

    2012-07-01

    As the acquired immunodeficiency syndrome (AIDS) epidemic shows no signs of abating, the impact of AIDS is felt more in developing countries due to socioeconomic reasons. The possibility of drug-induced ototoxicity also adds to the risk of audio-vestibular dysfunction. We sought to determine if there was a difference between the audio-vestibular function of asymptomatic human immunodeficiency virus (HIV) infected patients and that of patients with AIDS. In a prospective, cross-sectional study at a tertiary care center in South India, the audio-vestibular systems of 30 asymptomatic HIV-positive subjects (group 1), 30 subjects with AIDS (group 2), and 30 age-matched healthy controls (group 3) were assessed using pure tone audiometry and the cold caloric test. Sixteen patients each in group 1 and group 2, and four subjects in the control group, were found to have a hearing loss, indicating that significantly more HIV-infected individuals (groups 1 and 2) had hearing loss (P=0.001). Kobrak's (modified) test showed that 27% of patients in group 1, 33% of patients in group 2, and none in group 3 had a hypofunctioning labyrinth (P=0.001). It seems that the human immunodeficiency virus does affect the audio-vestibular pathway. There was a significant incidence of audio-vestibular dysfunction among the HIV-infected patients as compared to the control population (P=0.001), and no significant difference between the asymptomatic HIV-seropositive patients and the AIDS patients. The majority of the patients had no otological symptoms.

  3. A high efficiency PWM CMOS class-D audio power amplifier

    Science.gov (United States)

    Zhangming, Zhu; Lianxi, Liu; Yintang, Yang; Han, Lei

    2009-02-01

    Based on a differential closed-loop feedback technique and a differential pre-amplifier, a high-efficiency PWM CMOS class-D audio power amplifier is proposed. A rail-to-rail PWM comparator with a window function has been embedded in the class-D audio power amplifier. Design results based on the CSMC 0.5 μm CMOS process show that the maximum efficiency is 90%, the PSRR is -75 dB, the power supply voltage range is 2.5-5.5 V, the THD+N at a 1 kHz input frequency is less than 0.20%, the quiescent current with no load is 2.8 mA, and the shutdown current is 0.5 μA. The active area of the class-D audio power amplifier is about 1.47 × 1.52 mm². With this good performance, the class-D audio power amplifier can be applied to several audio power systems.

  4. Le registrazioni audio dell’archivio Luigi Nono di Venezia

    Directory of Open Access Journals (Sweden)

    Luca Cossettini

    2009-11-01

    Full Text Available The audio recordings of the Luigi Nono Archive in Venice: guidelines for preservation and critical edition of audio documents. Studying audio recordings brings us back to ancient source-verification problems that too often one thinks are overcome by the technical reproduction of sound. The audio signal is "fixed" on a specific carrier (tape, disc, etc.) with a specific audio format (speed, number of tracks, etc.); the choice of support and format during the first "memorizing" process and the following copying processes is a subjective and, in the case of copying, an interpretative operation conducted within a continuously evolving audio technology. What we listen to today is the result of a transmission process that unavoidably transforms the original acoustic event and the documents that memorize it. Audio recording is in no way a timeless and immutable fixing process. It is therefore necessary to study the transmission processes and to reconstruct the audio document tradition. The re-recording of the tapes of the Archivio Luigi Nono, conducted by the Audio Labs of the DAMS Musica of the University of Udine, offers clear examples of the technical and musicological interpretative problems one can find when working with audio recordings.

  5. Exploring the Implementation of Steganography Protocols on Quantum Audio Signals

    Science.gov (United States)

    Chen, Kehan; Yan, Fei; Iliyasu, Abdullah M.; Zhao, Jianping

    2018-02-01

    Two quantum audio steganography (QAS) protocols are proposed, each of which manipulates or modifies the least significant qubit (LSQb) of a host quantum audio signal that is encoded as FRQA (flexible representation of quantum audio) content. The first protocol (i.e. the conventional LSQb QAS protocol, or simply the cLSQ stego protocol) is built on exchanges between the qubits encoding the quantum audio message and the LSQb of the amplitude information in the host quantum audio samples. The second protocol implants information from a quantum audio message deep into the constraint-imposed most significant qubit (MSQb) of the host quantum audio samples; we refer to it as the pseudo-MSQb QAS protocol, or simply the pMSQ stego protocol. The cLSQ stego protocol is designed to guarantee high imperceptibility between the host quantum audio and its stego version, whereas the pMSQ stego protocol ensures that the resulting stego quantum audio signal is better immune to illicit tampering and copyright violations (a.k.a. robustness). Built on the circuit model of quantum computation, the circuit networks to execute the embedding and extraction algorithms of both QAS protocols are determined, and simulation-based experiments are conducted to demonstrate their implementation. The outcomes attest that both protocols offer promising trade-offs in terms of imperceptibility and robustness.

  6. Elicitation of attributes for the evaluation of audio-on audio-interference

    DEFF Research Database (Denmark)

    Francombe, Jon; Mason, R.; Dewhirst, M.

    2014-01-01

    An experiment to determine the perceptual attributes of the experience of listening to a target audio program in the presence of an audio interferer was performed. The first stage was a free elicitation task in which a total of 572 phrases were produced. In the second stage, a consensus vocabulary...... procedure was used to reduce these phrases into a comprehensive set of attributes. Groups of experienced and inexperienced listeners determined nine and eight attributes, respectively. These attribute sets were combined by the listeners to produce a final set of 12 attributes: masking, calming, distraction...

  7. Anthropomorphic Coding of Speech and Audio: A Model Inversion Approach

    Directory of Open Access Journals (Sweden)

    W. Bastiaan Kleijn

    2005-06-01

    Full Text Available Auditory modeling is a well-established methodology that provides insight into human perception and that facilitates the extraction of signal features that are most relevant to the listener. The aim of this paper is to provide a tutorial on perceptual speech and audio coding using an invertible auditory model. In this approach, the audio signal is converted into an auditory representation using an invertible auditory model. The auditory representation is quantized and coded. Upon decoding, it is then transformed back into the acoustic domain. This transformation converts a complex distortion criterion into a simple one, thus facilitating quantization with low complexity. We briefly review past work on auditory models and describe in more detail the components of our invertible model and its inversion procedure, that is, the method to reconstruct the signal from the output of the auditory model. We summarize attempts to use the auditory representation for low-bit-rate coding. Our approach also allows the exploitation of the inherent redundancy of the human auditory system for the purpose of multiple description (joint source-channel) coding.

  8. Audio-Tactile Integration and the Influence of Musical Training

    Science.gov (United States)

    Kuchenbuch, Anja; Paraskevopoulos, Evangelos; Herholz, Sibylle C.; Pantev, Christo

    2014-01-01

    Perception of our environment is a multisensory experience; information from different sensory systems like the auditory, visual and tactile is constantly integrated. Complex tasks that require high temporal and spatial precision of multisensory integration put strong demands on the underlying networks but it is largely unknown how task experience shapes multisensory processing. Long-term musical training is an excellent model for brain plasticity because it shapes the human brain at functional and structural levels, affecting a network of brain areas. In the present study we used magnetoencephalography (MEG) to investigate how audio-tactile perception is integrated in the human brain and if musicians show enhancement of the corresponding activation compared to non-musicians. Using a paradigm that allowed the investigation of combined and separate auditory and tactile processing, we found a multisensory incongruency response, generated in frontal, cingulate and cerebellar regions, an auditory mismatch response generated mainly in the auditory cortex and a tactile mismatch response generated in frontal and cerebellar regions. The influence of musical training was seen in the audio-tactile as well as in the auditory condition, indicating enhanced higher-order processing in musicians, while the sources of the tactile MMN were not influenced by long-term musical training. Consistent with the predictive coding model, more basic, bottom-up sensory processing was relatively stable and less affected by expertise, whereas areas for top-down models of multisensory expectancies were modulated by training. PMID:24465675

  9. Automatic processing of CERN video, audio and photo archives

    International Nuclear Information System (INIS)

    Kwiatek, M

    2008-01-01

    The digitalization of CERN audio-visual archives, a major task currently in progress, will generate over 40 TB of video, audio and photo files. Storing these files is one issue, but a far more important challenge is to provide long-time coherence of the archive and to make these files available on-line with minimum manpower investment. An infrastructure, based on standard CERN services, has been implemented, whereby master files, stored in the CERN Distributed File System (DFS), are discovered and scheduled for encoding into lightweight web formats based on predefined profiles. Changes in master files, conversion profiles or in the metadata database (read from CDS, the CERN Document Server) are automatically detected and the media re-encoded whenever necessary. The encoding processes are run on virtual servers provided on-demand by the CERN Server Self Service Centre, so that new servers can be easily configured to adapt to higher load. Finally, the generated files are made available from the CERN standard web servers with streaming implemented using Windows Media Services

  10. Audio-Vestibular Findings in Increased Intracranial Hypertension Syndrome.

    Science.gov (United States)

    Çoban, Kübra; Aydın, Erdinç; Özlüoğlu, Levent Naci

    2017-04-01

    Idiopathic intracranial hypertension (IIH) can be manifested by audiological and vestibular complaints. The aim of the present study is to determine the audio-vestibular pathologies and their pathophysiology in this syndrome by performing current audio-vestibular tests. The study was performed prospectively on 40 individuals (20 IIH patients, 20 healthy volunteers). Pure tone audiometry, tympanometry, vestibular evoked myogenic potentials, and electronystagmography tests were performed in both groups and the results were compared. The mean age of both groups was found to be 30.2±18.7. There were 11 females and 9 males in each group. The study group patients had significantly worse hearing levels. Pure tone averages were significantly higher in both ears of the study group (p ...) ... vestibular systems are frequently affected in this condition. Our test results suggest inner ear pathologies in these patients. A higher incidence of inferior vestibular nerve and/or saccule dysfunction is detected as a novelty. Increased intracranial pressure may affect the inner ear through mechanisms similar to those of hydrops.

  11. Audio-tactile integration and the influence of musical training.

    Directory of Open Access Journals (Sweden)

    Anja Kuchenbuch

    Full Text Available Perception of our environment is a multisensory experience; information from different sensory systems like the auditory, visual and tactile is constantly integrated. Complex tasks that require high temporal and spatial precision of multisensory integration put strong demands on the underlying networks but it is largely unknown how task experience shapes multisensory processing. Long-term musical training is an excellent model for brain plasticity because it shapes the human brain at functional and structural levels, affecting a network of brain areas. In the present study we used magnetoencephalography (MEG) to investigate how audio-tactile perception is integrated in the human brain and if musicians show enhancement of the corresponding activation compared to non-musicians. Using a paradigm that allowed the investigation of combined and separate auditory and tactile processing, we found a multisensory incongruency response, generated in frontal, cingulate and cerebellar regions, an auditory mismatch response generated mainly in the auditory cortex and a tactile mismatch response generated in frontal and cerebellar regions. The influence of musical training was seen in the audio-tactile as well as in the auditory condition, indicating enhanced higher-order processing in musicians, while the sources of the tactile MMN were not influenced by long-term musical training. Consistent with the predictive coding model, more basic, bottom-up sensory processing was relatively stable and less affected by expertise, whereas areas for top-down models of multisensory expectancies were modulated by training.

  12. Calibration of an audio frequency noise generator

    DEFF Research Database (Denmark)

    Diamond, Joseph M.

    1966-01-01

    ...it is used for measurement purposes. The spectral density of a noise source may be found by measuring its rms output over a known noise bandwidth. Such a bandwidth may be provided by a passive filter using accurately known elements. For example, the parallel resonant circuit with purely parallel damping has a noise bandwidth Bn = π/2 × (3 dB bandwidth). To apply this method to low audio frequencies, the noise bandwidth of the low-Q parallel resonant circuit has been found, including the effects of both series and parallel damping. The method has been used to calibrate a General Radio 1390-B noise generator...
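
    As a worked instance of the quoted relation, in LaTeX notation and assuming an ideal single-tuned parallel RLC circuit with purely parallel damping resistance R and capacitance C (the paper itself treats the more general series-plus-parallel damping case):

      \Delta f_{3\,\mathrm{dB}} = \frac{f_0}{Q} = \frac{1}{2\pi R C},
      \qquad
      B_n = \frac{\pi}{2}\,\Delta f_{3\,\mathrm{dB}} = \frac{1}{4 R C}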

  13. Mixing audio concepts, practices and tools

    CERN Document Server

    Izhaki, Roey

    2013-01-01

    Your mix can make or break a record, and mixing is an essential catalyst for a record deal. Professional engineers with exceptional mixing skills can earn vast amounts of money and find that they are in demand by the biggest acts. To develop such skills, you need to master both the art and science of mixing. The new edition of this bestselling book offers all you need to know and put into practice in order to improve your mixes. Covering the entire process --from fundamental concepts to advanced techniques -- and offering a multitude of audio samples, tips and tricks, this boo

  14. Predistortion of a Bidirectional Cuk Audio Amplifier

    DEFF Research Database (Denmark)

    Birch, Thomas Hagen; Nielsen, Dennis; Knott, Arnold

    2014-01-01

    Some non-linear amplifier topologies are capable of providing a voltage gain larger than one from a DC source, which could make them suitable for various applications. However, the non-linearities introduce a significant amount of total harmonic distortion (THD). Some of this distortion could be reduced...... using predistortion. This paper suggests linearizing a non-linear bidirectional Cuk audio amplifier using an analog predistortion approach. A prototype power stage was built, and the results show that a voltage gain of up to 9 dB and a reduction in THD from 6% down to 3% were obtainable using this approach....

  15. Spatial audio quality perception (part 1)

    DEFF Research Database (Denmark)

    Conetta, R.; Brookes, T.; Rumsey, F.

    2015-01-01

    resulting from 48 such SAPs. Perceived degradation also depends on the particular listeners, the program content, and the listening location. For example, combining an off-center listening position with another SAP can reduce spatial quality significantly when compared to listening to that SAP from a central location....... The choice of the SAP can have a large influence on the degree of degradation. Taken together, these findings and the quality-annotated database can guide the development of a regression model of perceived overall spatial audio quality, incorporating previously developed spatially-relevant feature...

  16. Elicitation of attributes for the evaluation of audio-on-audio interference.

    Science.gov (United States)

    Francombe, Jon; Mason, Russell; Dewhirst, Martin; Bech, Søren

    2014-11-01

    An experiment to determine the perceptual attributes of the experience of listening to a target audio program in the presence of an audio interferer was performed. The first stage was a free elicitation task in which a total of 572 phrases were produced. In the second stage, a consensus vocabulary procedure was used to reduce these phrases into a comprehensive set of attributes. Groups of experienced and inexperienced listeners determined nine and eight attributes, respectively. These attribute sets were combined by the listeners to produce a final set of 12 attributes: masking, calming, distraction, separation, confusion, annoyance, environment, chaotic, balance and blend, imagery, response to stimuli over time, and short-term response to stimuli. In the third stage, a simplified ranking procedure was used to select only the most useful and relevant attributes. Four attributes were selected: distraction, annoyance, balance and blend, and confusion. Ratings using these attributes were collected in the fourth stage, and a principal component analysis performed. This suggested two dimensions underlying the perception of an audio-on-audio interference situation: The first dimension was labeled "distraction" and accounted for 89% of the variance; the second dimension, accounting for 10% of the variance, was labeled "balance and blend."
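
    The reported dimensionality reduction (distraction accounting for 89% of the variance, balance and blend for 10%) corresponds to an ordinary principal component analysis of the attribute-rating matrix; a minimal Python sketch on a random placeholder matrix rather than the actual ratings is:

      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.preprocessing import StandardScaler

      # Placeholder ratings: rows = rated stimuli, columns = the four selected attributes
      rng = np.random.default_rng(0)
      ratings = rng.normal(size=(60, 4))     # stand-in for distraction, annoyance, ...

      X = StandardScaler().fit_transform(ratings)
      pca = PCA(n_components=2).fit(X)
      print("explained variance ratios:", pca.explained_variance_ratio_)
      print("attribute loadings:", pca.components_)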

  17. Audio-Visual, Visuo-Tactile and Audio-Tactile Correspondences in Preschoolers.

    Science.gov (United States)

    Nava, Elena; Grassi, Massimo; Turati, Chiara

    2016-01-01

    Interest in crossmodal correspondences has recently seen a renaissance thanks to numerous studies in human adults. Yet, still very little is known about crossmodal correspondences in children, particularly in sensory pairings other than audition and vision. In the current study, we investigated whether 4-5-year-old children match auditory pitch to the spatial motion of visual objects (audio-visual condition). In addition, we investigated whether this correspondence extends to touch, i.e., whether children also match auditory pitch to the spatial motion of touch (audio-tactile condition) and the spatial motion of visual objects to touch (visuo-tactile condition). In two experiments, two different groups of children were asked to indicate which of two stimuli fitted best with a centrally located third stimulus (Experiment 1), or to report whether two presented stimuli fitted together well (Experiment 2). We found sensitivity to the congruency of all of the sensory pairings only in Experiment 2, suggesting that only under specific circumstances can these correspondences be observed. Our results suggest that pitch-height correspondences for audio-visual and audio-tactile combinations may still be weak in preschool children, and speculate that this could be due to immature linguistic and auditory cues that are still developing at age five.

  18. Audio scene segmentation for video with generic content

    Science.gov (United States)

    Niu, Feng; Goela, Naveen; Divakaran, Ajay; Abdel-Mottaleb, Mohamed

    2008-01-01

    In this paper, we present a content-adaptive audio texture based method to segment video into audio scenes. The audio scene is modeled as a semantically consistent chunk of audio data. Our algorithm is based on "semantic audio texture analysis." At first, we train GMM models for basic audio classes such as speech, music, etc. Then we define the semantic audio texture based on those classes. We study and present two types of scene changes, those corresponding to an overall audio texture change and those corresponding to a special "transition marker" used by the content creator, such as a short stretch of music in a sitcom or silence in dramatic content. Unlike prior work using genre specific heuristics, such as some methods presented for detecting commercials, we adaptively find out if such special transition markers are being used and if so, which of the base classes are being used as markers without any prior knowledge about the content. Our experimental results show that our proposed audio scene segmentation works well across a wide variety of broadcast content genres.
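
    A stripped-down Python sketch of the first steps described above (per-class GMMs, frame labelling, and a crude texture-change check) is given below; the feature matrices, class names and the majority-vote boundary rule are placeholders, not the authors' full algorithm:

      import numpy as np
      from sklearn.mixture import GaussianMixture

      def train_class_gmms(features_by_class, n_components=8):
          """Fit one GMM per basic audio class (speech, music, ...)."""
          return {name: GaussianMixture(n_components=n_components, covariance_type="diag").fit(f)
                  for name, f in features_by_class.items()}

      def label_frames(gmms, frames):
          """Assign each feature frame to the class with the highest GMM likelihood."""
          names = list(gmms)
          scores = np.stack([gmms[n].score_samples(frames) for n in names])
          return [names[i] for i in scores.argmax(axis=0)]

      def texture_change_points(labels, win=50):
          """Flag a boundary where the dominant class of adjacent windows differs."""
          def dominant(chunk):
              vals, counts = np.unique(chunk, return_counts=True)
              return vals[counts.argmax()]
          return [i for i in range(win, len(labels) - win, win)
                  if dominant(labels[i - win:i]) != dominant(labels[i:i + win])]

      # Placeholder data: 20-dim frame features for two classes, then a test stream
      rng = np.random.default_rng(1)
      gmms = train_class_gmms({"speech": rng.normal(0, 1, (500, 20)),
                               "music": rng.normal(3, 1, (500, 20))})
      stream = np.vstack([rng.normal(0, 1, (300, 20)), rng.normal(3, 1, (300, 20))])
      print(texture_change_points(label_frames(gmms, stream)))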

  19. Deep learning, audio adversaries, and music content analysis

    DEFF Research Database (Denmark)

    Kereliuk, Corey Mose; Sturm, Bob L.; Larsen, Jan

    2015-01-01

    We present the concept of adversarial audio in the context of deep neural networks (DNNs) for music content analysis. An adversary is an algorithm that makes minor perturbations to an input that cause major repercussions to the system response. In particular, we design an adversary for a DNN...... that takes as input short-time spectral magnitudes of recorded music and outputs a high-level music descriptor. We demonstrate how this adversary can make the DNN behave in any way with only extremely minor changes to the music recording signal. We show that the adversary cannot be neutralised by a simple...... filtering of the input. Finally, we discuss adversaries in the broader context of the evaluation of music content analysis systems....

  20. Feature Representations for Neuromorphic Audio Spike Streams.

    Science.gov (United States)

    Anumula, Jithendar; Neil, Daniel; Delbruck, Tobi; Liu, Shih-Chii

    2018-01-01

    Event-driven neuromorphic spiking sensors such as the silicon retina and the silicon cochlea encode the external sensory stimuli as asynchronous streams of spikes across different channels or pixels. Combining state-of-the-art deep neural networks with the asynchronous outputs of these sensors has produced encouraging results on some datasets but remains challenging. While the lack of effective spiking networks to process the spike streams is one reason, the other reason is that the pre-processing methods required to convert the spike streams to frame-based features needed for the deep networks still require further investigation. This work investigates the effectiveness of synchronous and asynchronous frame-based features generated using spike count and constant event binning in combination with the use of a recurrent neural network for solving a classification task using the N-TIDIGITS18 dataset. This spike-based dataset consists of recordings from the Dynamic Audio Sensor, a spiking silicon cochlea sensor, in response to the TIDIGITS audio dataset. We also propose a new pre-processing method which applies an exponential kernel on the output cochlea spikes so that the interspike timing information is better preserved. The results from the N-TIDIGITS18 dataset show that the exponential features perform better than the spike count features, with over 91% accuracy on the digit classification task. This accuracy corresponds to an improvement of at least 2.5% over the use of spike count features, establishing a new state of the art for this dataset.
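
    The exponential-kernel pre-processing described above can be sketched as a per-channel exponentially decaying trace sampled at a fixed frame rate; the time constant, frame period and data layout in this Python sketch are illustrative assumptions, not the paper's exact settings:

      import numpy as np

      def exponential_features(spike_times, spike_channels, n_channels,
                               frame_period=0.005, tau=0.02, duration=0.5):
          """Turn an asynchronous spike stream into frame-based features by
          accumulating an exponentially decaying trace per channel."""
          n_frames = int(np.ceil(duration / frame_period))
          frames = np.zeros((n_frames, n_channels))
          trace = np.zeros(n_channels)
          last_t, s = 0.0, 0
          for f in range(n_frames):
              frame_end = (f + 1) * frame_period
              while s < len(spike_times) and spike_times[s] < frame_end:
                  t, ch = spike_times[s], spike_channels[s]
                  trace *= np.exp(-(t - last_t) / tau)   # decay since the last event
                  trace[ch] += 1.0                       # add the new spike
                  last_t = t
                  s += 1
              # sample the trace, decayed up to the frame boundary
              frames[f] = trace * np.exp(-(frame_end - last_t) / tau)
          return frames

      # Tiny synthetic example: 64 cochlea channels, random spikes over 0.5 s
      rng = np.random.default_rng(0)
      times = np.sort(rng.uniform(0, 0.5, 2000))
      chans = rng.integers(0, 64, 2000)
      print(exponential_features(times, chans, n_channels=64).shape)  # (frames, channels)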

  1. Semantic Labeling of Nonspeech Audio Clips

    Directory of Open Access Journals (Sweden)

    Xiaojuan Ma

    2010-01-01

    Full Text Available Human communication about entities and events is primarily linguistic in nature. While visual representations of information are shown to be highly effective as well, relatively little is known about the communicative power of auditory nonlinguistic representations. We created a collection of short nonlinguistic auditory clips encoding familiar human activities, objects, animals, natural phenomena, machinery, and social scenes. We presented these sounds to a broad spectrum of anonymous human workers using Amazon Mechanical Turk and collected verbal sound labels. We analyzed the human labels in terms of their lexical and semantic properties to ascertain that the audio clips do evoke the information suggested by their pre-defined captions. We then measured the agreement with the semantically compatible labels for each sound clip. Finally, we examined which kinds of entities and events, when captured by nonlinguistic acoustic clips, appear to be well-suited to elicit information for communication, and which ones are less discriminable. Our work is set against the broader goal of creating resources that facilitate communication for people with some types of language loss. Furthermore, our data should prove useful for future research in machine analysis/synthesis of audio, such as computational auditory scene analysis, and annotating/querying large collections of sound effects.

  2. Simple Solutions for Space Station Audio Problems

    Science.gov (United States)

    Wood, Eric

    2016-01-01

    Throughout this summer, a number of different projects were supported relating to various NASA programs, including the International Space Station (ISS) and Orion. The primary project that was worked on was designing and testing an acoustic diverter which could be used on the ISS to increase sound pressure levels in Node 1, a module that does not have any Audio Terminal Units (ATUs) inside it. This acoustic diverter is not intended to be a permanent solution to providing audio to Node 1; it is simply intended to improve conditions while more permanent solutions are under development. One of the most exciting aspects of this project is that the acoustic diverter is designed to be 3D printed on the ISS, using the 3D printer that was set up earlier this year. Because of this, no new hardware needs to be sent up to the station, and no extensive hardware testing needs to be performed on the ground before sending it to the station. Instead, the 3D part file can simply be uploaded to the station's 3D printer, where the diverter will be made.

  3. Comparison between audio-only and audiovisual biofeedback for regulating patients' respiration during four-dimensional radiotherapy.

    Science.gov (United States)

    Yu, Jesang; Choi, Ji Hoon; Ma, Sun Young; Jeung, Tae Sig; Lim, Sangwook

    2015-09-01

    To compare audio-only biofeedback to conventional audiovisual biofeedback for regulating patients' respiration during four-dimensional radiotherapy, limiting damage to healthy surrounding tissues caused by organ movement. Six healthy volunteers were assisted by audiovisual or audio-only biofeedback systems to regulate their respiration. Volunteers breathed through a mask developed for this study by following computer-generated guiding curves displayed on a screen, combined with instructional sounds. They then performed breathing following instructional sounds only. The guiding signals and the volunteers' respiratory signals were logged at 20 samples per second. The standard deviations between the guiding and respiratory curves for the audiovisual and audio-only biofeedback systems were 21.55% and 23.19%, respectively; the average correlation coefficients were 0.9778 and 0.9756, respectively. According to a paired t-test, the regularity of the six volunteers' respiration was statistically the same for audiovisual and audio-only biofeedback. The difference between the audiovisual and audio-only biofeedback methods was not significant. Audio-only biofeedback has many advantages, as patients do not require a mask and can quickly adapt to this method in the clinic.
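
    The statistical comparison described above is a paired test on per-volunteer tracking errors; a minimal Python sketch with placeholder numbers (six values per condition, mirroring the six volunteers, but not the study's data) is:

      from scipy.stats import ttest_rel

      # Hypothetical per-volunteer standard deviations (%) between guiding and breathing curves
      audiovisual = [20.1, 22.8, 19.5, 23.0, 21.7, 22.2]
      audio_only = [22.4, 24.1, 21.0, 24.5, 23.3, 23.8]

      t_stat, p_value = ttest_rel(audiovisual, audio_only)
      print(f"t = {t_stat:.2f}, p = {p_value:.3f}")   # p > 0.05 would indicate no significant difference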

  4. Design And Construction Of 300W Audio Power Amplifier For Classroom

    OpenAIRE

    Shune Lei Aung; Kyaw Soe Lwin and Hla Myo Tun

    2015-01-01

    Abstract This paper describes the design and construction of a 300 W audio power amplifier for classroom use. The construction of this amplifier includes a microphone preamplifier, tone preamplifier, equalizer, line amplifier, output power amplifier, and sound level indicator. The output power amplifier is designed as an O.C.L. system and is constructed using Class B among the many types of amplifier classes. There are two types of O.C.L. system, the quasi system and the complementary system. Between them, the comple...

  5. Acceleration observed in an audio air gas discharge

    International Nuclear Information System (INIS)

    Ragheb, M.S.

    2010-01-01

    An audio air gas discharge enclosed in a Pyrex glass tube 34 mm in diameter and 25 cm long led to the observation of an unusual phenomenon: relatively huge light spots of intense brightness, distributed regularly on the contour and in the center of one of the discharge electrodes. Very high heat is produced on both electrodes; one of them is hotter than the other and attains 660 °C in 3-4 minutes. A series of photographs and recorded video films defines and clarifies the sequence of events that characterize the observed phenomenon. The plasma is created by applying audio power through the electrodes of an air gas discharge at 10 kHz, using a power supply of up to 500 watts. The discharge voltage is up to 900 volts; the discharge current flowing through the plasma attains 360 mA. It is found that the discharge system must attain its optimal working conditions in order to produce the phenomenon. The obtained plasma is classified as lying at the maximum-condition border of the γ-discharge type. At these conditions, the corresponding maximum electron temperature and density are 16 eV and 10^15 cm^-3, respectively. The observation system succeeded in revealing and clarifying the sequence of the phenomenon's events. In addition, by means of scanning electron microscopy and energy-dispersive X-ray analysis, the effects on the electrode surfaces are investigated and analyzed. The optical observations, in conjunction with the micrographs and surface microanalysis, demonstrate the occurrence of collisions of powered agglomeration groups with the electrode surface. A detailed interpretation of the phenomenon suggests a molecular acceleration in which the molecules gain their energy from the plasma formed under optimal discharge working conditions. As a consequence, given the size of the ion agglomerates, this procedure could be considered a mesoscopic acceleration technique.

  6. Feature Selection for Audio Surveillance in Urban Environment

    Directory of Open Access Journals (Sweden)

    KIKTOVA Eva

    2014-05-01

    Full Text Available This paper presents the work leading to an acoustic event detection system designed to recognize two types of acoustic events (shot and breaking glass) in an urban environment. For this purpose, extensive front-end processing was performed to obtain an effective parametric representation of the input sound. MFCC features and features computed during their extraction (MELSPEC and FBANK), as well as MPEG-7 audio descriptors and other temporal and spectral characteristics, were extracted. High-dimensional feature sets were created and, in the next phase, reduced by mutual-information-based selection algorithms. A Hidden Markov Model based classifier was applied and evaluated using the Viterbi decoding algorithm. In this way, very effective feature sets were identified, and the less important features were also found.
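
    One way to reproduce the front end sketched above in a few lines is to extract MFCCs with librosa and rank them by mutual information with scikit-learn; the file names, labels and the choice of 13 coefficients in this Python sketch are placeholders, and the MELSPEC/FBANK and MPEG-7 descriptors used in the paper are omitted:

      import librosa
      import numpy as np
      from sklearn.feature_selection import mutual_info_classif

      def mfcc_frames(path, n_mfcc=13):
          """Load an audio file and return per-frame MFCC feature vectors."""
          y, sr = librosa.load(path, sr=None)
          return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T   # (frames, n_mfcc)

      # Hypothetical labelled clips: 1 = shot, 2 = breaking glass, 0 = background
      clips = ["shot_01.wav", "glass_01.wav", "street_01.wav"]       # placeholder paths
      labels = [1, 2, 0]

      frames = [mfcc_frames(p) for p in clips]
      X = np.vstack(frames)
      y = np.concatenate([[lab] * len(f) for f, lab in zip(frames, labels)])

      # Rank individual coefficients by mutual information with the class label
      mi = mutual_info_classif(X, y)
      print("most informative MFCC indices:", np.argsort(mi)[::-1][:8])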

  7. Evaluation of MPEG-7-Based Audio Descriptors for Animal Voice Recognition over Wireless Acoustic Sensor Networks

    Directory of Open Access Journals (Sweden)

    Joaquín Luque

    2016-05-01

    Full Text Available Environmental audio monitoring is a huge area of interest for biologists all over the world. This is why several audio monitoring systems have been proposed in the literature, which can be classified into two different approaches: acquisition and compression of all audio patterns in order to send them as raw data to a main server; or specific recognition systems based on audio patterns. The first approach has the drawback of a high amount of information to be stored on a main server. Moreover, this information requires a considerable amount of effort to be analyzed. The second approach has the drawback of a lack of scalability when new patterns need to be detected. To overcome these limitations, this paper proposes an environmental Wireless Acoustic Sensor Network architecture focused on the use of generic descriptors based on the MPEG-7 standard. These descriptors are shown to be suitable for recognizing different patterns, allowing high scalability. The proposed parameters have been tested in recognizing different behaviors of two anuran species that live in Spanish natural parks, the Epidalea calamita and Alytes obstetricans toads, and demonstrate high classification performance.

  8. Content Discovery from Composite Audio : An unsupervised approach

    NARCIS (Netherlands)

    Lu, L.

    2009-01-01

    In this thesis, we developed and assessed a novel robust and unsupervised framework for semantic inference from composite audio signals. We focused on the problem of detecting audio scenes and grouping them into meaningful clusters. Our approach addressed all major steps in a general process of

  9. Multilevel inverter based class D audio amplifier for capacitive transducers

    DEFF Research Database (Denmark)

    Nielsen, Dennis; Knott, Arnold; Andersen, Michael A. E.

    2014-01-01

    The reduced semiconductor voltage stress makes the multilevel inverters especially interesting, when driving capacitive transducers for audio applications. A ± 300 V flying capacitor class D audio amplifier driving a 100 nF load in the midrange region of 0.1-3.5 kHz with Total Harmonic Distortion...

  10. Tune in the Net with RealAudio.

    Science.gov (United States)

    Buchanan, Larry

    1997-01-01

    Describes how to connect to the RealAudio Web site to download a player that provides sound from Web pages to the computer through streaming technology. Explains hardware and software requirements and provides addresses for other RealAudio Web sites, including weather information and current news. (LRW)

  11. Teaching Audio Playwriting: The Pedagogy of Drama Podcasting

    Science.gov (United States)

    Eshelman, David J.

    2016-01-01

    This article suggests how teaching artists can develop practical coursework in audio playwriting. To prepare students to work in the reemergent audio drama medium, the author created a seminar course called Radio Theatre Writing, taught at Arkansas Tech University in the fall of 2014. The course had three sections. First, it focused on…

  12. Use of Video and Audio Texts in EFL Listening Test

    Science.gov (United States)

    Basal, Ahmet; Gülözer, Kaine; Demir, Ibrahim

    2015-01-01

    The study aims to discover whether audio or video modality in a listening test is more beneficial to test takers. In this study, the posttest-only control group design was utilized and quantitative data were collected in order to measure participant performances concerning two types of modality (audio or video) in a listening test. The…

  13. Effect of Audio vs. Video on Aural Discrimination of Vowels

    Science.gov (United States)

    McCrocklin, Shannon

    2012-01-01

    Despite the growing use of media in the classroom, the effects of using audio versus video in pronunciation teaching have been largely ignored. To analyze the impact of audio or video training on aural discrimination of vowels, 61 participants (all students at a large American university) took a pre-test followed by two training…

  14. A Case Study on Audio Feedback with Geography Undergraduates

    Science.gov (United States)

    Rodway-Dyer, Sue; Knight, Jasper; Dunne, Elizabeth

    2011-01-01

    Several small-scale studies have suggested that audio feedback can help students to reflect on their learning and to develop deep learning approaches that are associated with higher attainment in assessments. For this case study, Geography undergraduates were given audio feedback on a written essay assignment, alongside traditional written…

  15. Automated Speech and Audio Analysis for Semantic Access to Multimedia

    NARCIS (Netherlands)

    Jong, F.M.G. de; Ordelman, R.; Huijbregts, M.

    2006-01-01

    The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content, and as a consequence, improve the effectiveness of conceptual access tools. This paper overviews the various ways in which automatic speech and audio analysis can contribute to

  16. Decision-level fusion for audio-visual laughter detection

    NARCIS (Netherlands)

    Reuderink, B.; Poel, M.; Truong, K.; Poppe, R.; Pantic, M.

    2008-01-01

    Laughter is a highly variable signal, which can be caused by a spectrum of emotions. This makes the automatic detection of laughter a challenging, but interesting task. We perform automatic laughter detection using audio-visual data from the AMI Meeting Corpus. Audio-visual laughter detection is

  17. Automated speech and audio analysis for semantic access to multimedia

    NARCIS (Netherlands)

    de Jong, Franciska M.G.; Ordelman, Roeland J.F.; Huijbregts, M.A.H.; Avrithis, Y.; Kompatsiaris, Y.; Staab, S.; O' Connor, N.E.

    2006-01-01

    The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content, and as a consequence, improve the effectiveness of conceptual access tools. This paper overviews the various ways in which automatic speech and audio analysis can contribute to

  18. On the Use of Memory Models in Audio Features

    DEFF Research Database (Denmark)

    Jensen, Karl Kristoffer

    2011-01-01

    Audio feature estimation is potentially improved by including higher- level models. One such model is the Short Term Memory (STM) model. A new paradigm of audio feature estimation is obtained by adding the influence of notes in the STM. These notes are identified when the perceptual spectral flux......, and an initial experiment with sensory dissonance has been undertaken with good results....

  19. Parametric Audio Based Decoder and Music Synthesizer for Mobile Applications

    NARCIS (Netherlands)

    Oomen, A.W.J.; Szczerba, M.Z.; Therssen, D.

    2011-01-01

    This paper reviews parametric audio coders and discusses novel technologies introduced in a low-complexity, low-power-consumption audio decoder and music synthesizer platform developed by the authors. The decoder uses a parametric coding scheme based on the MPEG-4 Parametric Audio standard. In order to

  20. Automatic processing of CERN video, audio and photo archives

    CERN Document Server

    Kwiatek, M

    2008-01-01

    The digitalization of CERN audio-visual archives, a major task currently in progress, will generate over 40 TB of video, audio and photo files. Storing these files is one issue, but a far more important challenge is to provide long-time coherence of the archive and to make these files available on-line with minimum manpower investment.

  1. PROTOTIPE KOMPRESI LOSSLESS AUDIO CODEC MENGGUNAKAN ENTROPY ENCODING

    OpenAIRE

    Andreas Soegandi

    2010-01-01

    The purpose of this study was to perform lossless compression on uncompressed audio files to minimize file size without reducing quality. The application is developed using the entropy encoding compression method with the Rice coding technique. As a result, the compression ratio is good enough, and the method is easy to develop because the algorithm is quite simple.

  2. Prototipe Kompresi Lossless Audio Codec Menggunakan Entropy Encoding

    Directory of Open Access Journals (Sweden)

    Andreas Soegandi

    2010-12-01

    Full Text Available The purpose of this study was to perform lossless compression on uncompressed audio files to minimize file size without reducing quality. The application is developed using the entropy encoding compression method with the Rice coding technique. As a result, the compression ratio is good enough, and the method is easy to develop because the algorithm is quite simple.
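
    Rice coding, the entropy-encoding technique named above, writes each non-negative integer as a unary quotient followed by a k-bit remainder; a minimal Python encoder/decoder sketch (the zig-zag mapping of signed residuals and the fixed parameter k are illustrative choices, not necessarily those of the application described) is:

      def rice_encode(values, k):
          """Encode non-negative integers with Rice parameter k (unary quotient + k-bit remainder)."""
          bits = []
          for v in values:
              q, r = v >> k, v & ((1 << k) - 1)
              bits.extend([1] * q + [0])                               # unary part, 0-terminated
              bits.extend((r >> i) & 1 for i in reversed(range(k)))    # remainder, MSB first
          return bits

      def rice_decode(bits, k, count):
          out, i = [], 0
          for _ in range(count):
              q = 0
              while bits[i] == 1:
                  q, i = q + 1, i + 1
              i += 1                                                   # skip the terminating 0
              r = 0
              for _ in range(k):
                  r, i = (r << 1) | bits[i], i + 1
              out.append((q << k) | r)
          return out

      # Prediction residuals, zig-zag mapped to non-negative integers, then Rice coded
      residuals = [0, -1, 2, -3, 5, -8, 1, 0]
      mapped = [2 * n if n >= 0 else -2 * n - 1 for n in residuals]
      bits = rice_encode(mapped, k=3)
      assert rice_decode(bits, k=3, count=len(mapped)) == mapped
      print(len(bits), "bits for", len(residuals), "residuals")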

  3. Four-quadrant flyback converter for direct audio power amplification

    DEFF Research Database (Denmark)

    Ljusev, Petar; Andersen, Michael Andreas E.

    2005-01-01

    This paper presents a bidirectional, four-quadrant flyback converter for use in direct audio power amplification. When compared to the standard Class-D switching audio power amplifier with a separate power supply, the proposed four-quadrant flyback converter provides a simple solution with better...

  4. Minimizing Crosstalk in Self Oscillating Switch Mode Audio Power Amplifiers

    DEFF Research Database (Denmark)

    Knott, Arnold; Ploug, Rasmus Overgaard

    2012-01-01

    The varying switching frequencies of self oscillating switch mode audio amplifiers have been known to cause interchannel intermodulation disturbances in multi channel configurations. This crosstalk phenomenon has a negative impact on the audio performance. The goal of this paper is to present...... by the implementation presented. Future work could include further refinement of the implementation of the concepts, electromagnetic interference investigations or PCB design....

  5. Evaluation of Audio Books: A Guide for Teachers.

    Science.gov (United States)

    Brown, Jean E.

    2003-01-01

    Considers how as educators recognize the importance of improving listening skills among students, the role of audio books gains curricular significance. Notes that teachers can use them for whole class work, or for students to work in small groups, or individually. Presents a guide for evaluating audio books. (SG)

  6. Some Characteristics of Audio Description and the Corresponding Moving Image.

    Science.gov (United States)

    Turner, James M.

    1998-01-01

    This research is concerned with reusing texts produced by audio describers as a source for automatically deriving shot-level indexing for film and video products. Results reinforce the notion that audio description is not sufficient on its own as a source for generating an index to the image, but it is valuable because it describes what is going…

  7. Multilevel inverter based class D audio amplifier for capacitive transducers

    DEFF Research Database (Denmark)

    Nielsen, Dennis; Knott, Arnold; Andersen, Michael A. E.

    2014-01-01

    The reduced semiconductor voltage stress makes the multilevel inverters especially interesting, when driving capacitive transducers for audio applications. A ± 300 V flying capacitor class D audio amplifier driving a 100 nF load in the midrange region of 0.1-3.5 kHz with Total Harmonic Distortion... plus Noise (THD+N) below 1% is presented....

  8. Tonal description of music audio signals

    OpenAIRE

    Gómez Gutiérrez, Emilia

    2006-01-01

    This doctoral thesis proposes and evaluates a computational approach to the automatic description of the tonal aspects of music from the analysis of polyphonic audio signals. These methods focus on computing descriptors of note distributions, estimating the key of a piece, visualizing the evolution of the tonal center, and measuring the tonal similarity between two different pieces. This thesis contributes substantially to the field of tonal description...

  9. Audio visual information materials for risk communication

    International Nuclear Information System (INIS)

    Gunji, Ikuko; Tabata, Rimiko; Ohuchi, Naomi

    2005-07-01

    Japan Nuclear Cycle Development Institute (JNC), Tokai Works set up the Risk Communication Study Team in January 2001 to promote mutual understanding between local residents and JNC. The team has studied risk communication from various viewpoints and developed new public relations methods that are useful for local residents' risk perception of nuclear issues. We aim to develop more effective risk communication, which promotes better mutual understanding with local residents, by providing risk information about nuclear fuel facilities such as a Reprocessing Plant and other research and development facilities. We explain the development process of audio-visual information materials which describe our actual activities and devices for risk management in nuclear fuel facilities, and our discussion based on effectiveness measurements. (author)

  10. AUDIO CRYPTANALYSIS- AN APPLICATION OF SYMMETRIC KEY CRYPTOGRAPHY AND AUDIO STEGANOGRAPHY

    Directory of Open Access Journals (Sweden)

    Smita Paira

    2016-09-01

    Full Text Available In the recent trend of network and technology, "Cryptography" and "Steganography" have emerged as essential elements of providing network security. Although Cryptography plays a major role in transforming the secret message into an encrypted version, it has certain drawbacks. Steganography is the art that addresses one of the basic limitations of Cryptography. In this paper, a new algorithm has been proposed based on both Symmetric Key Cryptography and Audio Steganography. The combination of a randomly generated Symmetric Key along with the LSB technique of Audio Steganography sends a secret message unrecognizably through an insecure medium. The generated Stego File is almost lossless, giving 100 percent recovery of the original message. This paper also presents a detailed experimental analysis of the algorithm, a brief comparison with other existing algorithms, and a future scope. The experimental verification and security issues are promising.
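    The record describes combining a symmetric key with LSB audio steganography. As a rough illustration only (not the authors' algorithm), the sketch below hides an XOR-encrypted message in the least significant bits of 16-bit PCM samples; the function names, the toy key and the random cover signal are invented for the example.

        import numpy as np

        def embed_lsb(samples: np.ndarray, message: bytes, key: bytes) -> np.ndarray:
            """Hide an XOR-encrypted message in the LSBs of 16-bit PCM samples (illustrative only)."""
            cipher = bytes(m ^ key[i % len(key)] for i, m in enumerate(message))
            bits = np.unpackbits(np.frombuffer(cipher, dtype=np.uint8))
            if bits.size > samples.size:
                raise ValueError("cover audio too short for this message")
            stego = samples.copy()
            stego[: bits.size] = (stego[: bits.size] & ~1) | bits      # overwrite the LSBs
            return stego

        def extract_lsb(stego: np.ndarray, n_bytes: int, key: bytes) -> bytes:
            bits = (stego[: n_bytes * 8] & 1).astype(np.uint8)
            cipher = np.packbits(bits).tobytes()
            return bytes(c ^ key[i % len(key)] for i, c in enumerate(cipher))

        # Example: hide and recover a short message in a synthetic cover signal.
        audio = (np.random.randn(44100) * 2000).astype(np.int16)
        key = b"\x13\x37\xaa"
        stego = embed_lsb(audio, b"secret", key)
        assert extract_lsb(stego, 6, key) == b"secret"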

  11. High-Order Sparse Linear Predictors for Audio Processing

    DEFF Research Database (Denmark)

    Giacobello, Daniele; van Waterschoot, Toon; Christensen, Mads Græsbøll

    2010-01-01

    Linear prediction has generally failed to make a breakthrough in audio processing, as it has done in speech processing. This is mostly due to its poor modeling performance, since an audio signal is usually an ensemble of different sources. Nevertheless, linear prediction comes with a whole set...... of interesting features that make the idea of using it in audio processing not far-fetched, e.g., the strong ability of modeling the spectral peaks that play a dominant role in perception. In this paper, we provide some preliminary conjectures and experiments on the use of high-order sparse linear predictors...... in audio processing. These predictors, successfully implemented in modeling the short-term and long-term redundancies present in speech signals, will be used to model tonal audio signals, both monophonic and polyphonic. We will show how the sparse predictors are able to model efficiently the different...
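    For readers who want to experiment with the idea of sparse high-order prediction, the following sketch fits an L1-regularized (Lasso) predictor to a tonal test signal. This is only a generic stand-in for the predictors discussed in the paper; the order, regularization weight and helper name sparse_lp are arbitrary choices.

        import numpy as np
        from sklearn.linear_model import Lasso

        def sparse_lp(signal: np.ndarray, order: int = 250, alpha: float = 0.01) -> np.ndarray:
            """Fit a high-order linear predictor with an L1 penalty so most taps end up zero."""
            # Lagged regression matrix: predict x[n] from x[n-1] ... x[n-order].
            X = np.column_stack([signal[order - k - 1 : -k - 1] for k in range(order)])
            y = signal[order:]
            model = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
            model.fit(X, y)
            return model.coef_          # predictor coefficients a_1 ... a_order

        # A tonal test signal: two harmonics plus a little noise.
        fs, n = 8000, 4000
        t = np.arange(n) / fs
        x = np.sin(2 * np.pi * 220 * t) + 0.5 * np.sin(2 * np.pi * 440 * t) + 0.01 * np.random.randn(n)
        a = sparse_lp(x)
        print("non-zero taps:", np.count_nonzero(a), "of", a.size)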

  12. Object-based audio reproduction and the audio scene description format

    OpenAIRE

    Geier, Matthias; Ahrens, Jens; Spors, Sascha

    2010-01-01

    This publication is freely accessible with permission of the rights owner due to an Alliance licence and a national licence (funded by the DFG, German Research Foundation). The introduction of new techniques for audio reproduction such as HRTF-based technology, wave field synthesis and higher-order Ambisonics is accompanied by a paradigm shift ...

  13. Horatio Audio-Describes Shakespeare's "Hamlet": Blind and Low-Vision Theatre-Goers Evaluate an Unconventional Audio Description Strategy

    Science.gov (United States)

    Udo, J. P.; Acevedo, B.; Fels, D. I.

    2010-01-01

    Audio description (AD) has been introduced as one solution for providing people who are blind or have low vision with access to live theatre, film and television content. However, there is little research to inform the process, user preferences and presentation style. We present a study of a single live audio-described performance of Hart House…

  14. ANALYSIS OF MULTIMODAL FUSION TECHNIQUES FOR AUDIO-VISUAL SPEECH RECOGNITION

    Directory of Open Access Journals (Sweden)

    D.V. Ivanko

    2016-05-01

    Full Text Available The paper is an analytical review covering the latest achievements in the field of audio-visual (AV) fusion (integration of multimodal information). We discuss the main challenges and report on approaches to address them. One of the most important tasks of AV integration is to understand how the modalities interact and influence each other. The paper addresses this problem in the context of AV speech processing and speech recognition. In the first part of the review we set out the basic principles of AV speech recognition and give a classification of audio and visual features of speech. Special attention is paid to the systematization of the existing techniques and the AV data fusion methods. In the second part we provide a consolidated list of tasks and applications that use AV fusion, based on our analysis of the research area. We also indicate the methods, techniques, and audio and video features used. We propose a classification of AV integration approaches, and discuss the advantages and disadvantages of different approaches. We draw conclusions and offer our assessment of the future of the field of AV fusion. In further research we plan to implement a system of audio-visual Russian continuous speech recognition using advanced methods of multimodal fusion.

  15. Perancangan dan Analisis Kinerja Pengkodean Audio Multichannel dengan Metode Closed Loop

    Directory of Open Access Journals (Sweden)

    Muhammad Sobirin

    2014-09-01

    Full Text Available The latest internationally standardized codec is MPEG Surround, but its open-loop structure is not able to minimize errors. In this research, the design and performance analysis of MPEG Surround audio coding with a closed-loop method, which can minimize errors in multichannel audio compression, was carried out. The implementation of the closed-loop method increases the Signal to Noise Ratio (SNR). The average SNR over all bit rates for the open-loop and closed-loop methods is 17.67385 dB and 23.82338 dB, respectively. The highest SNR increase reaches 82.72% in comparison with the open-loop method. The average increase in Objective Difference Grade (ODG) over the entire set of audio samples is 0.143917. Perceptual objective and subjective tests indicate that MPEG Surround with the closed-loop method works better than the open-loop method. Overall, the impairment of audio compression is imperceptible at bit rates of 90 kbps or higher.
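    The SNR figures quoted above can, in principle, be reproduced with a straightforward reference-versus-decoded comparison. The snippet below is a minimal, generic SNR routine (not the authors' evaluation code); the toy "codec" that merely adds noise is a placeholder.

        import numpy as np

        def snr_db(reference: np.ndarray, decoded: np.ndarray) -> float:
            """Full-signal SNR in dB between a reference and a coded/decoded signal."""
            reference = reference.astype(np.float64)
            decoded = decoded.astype(np.float64)
            noise = reference - decoded
            return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(noise ** 2))

        # Toy example: a "codec" that only adds small quantization-like noise.
        ref = np.sin(2 * np.pi * 1000 * np.arange(48000) / 48000)
        dec = ref + 0.01 * np.random.randn(ref.size)
        print(f"SNR = {snr_db(ref, dec):.2f} dB")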

  16. MPEG-7 audio-visual indexing test-bed for video retrieval

    Science.gov (United States)

    Gagnon, Langis; Foucher, Samuel; Gouaillier, Valerie; Brun, Christelle; Brousseau, Julie; Boulianne, Gilles; Osterrath, Frederic; Chapdelaine, Claude; Dutrisac, Julie; St-Onge, Francis; Champagne, Benoit; Lu, Xiaojian

    2003-12-01

    This paper reports on the development status of a Multimedia Asset Management (MAM) test-bed for content-based indexing and retrieval of audio-visual documents within the MPEG-7 standard. The project, called "MPEG-7 Audio-Visual Document Indexing System" (MADIS), specifically targets the indexing and retrieval of video shots and key frames from documentary film archives, based on audio-visual content like face recognition, motion activity, speech recognition and semantic clustering. The MPEG-7/XML encoding of the film database is done off-line. The description decomposition is based on a temporal decomposition into visual segments (shots), key frames and audio/speech sub-segments. The visible outcome will be a web site that allows video retrieval using a proprietary XQuery-based search engine and is accessible to members at the Canadian National Film Board (NFB) Cineroute site. For example, end users will be able to retrieve movie shots in the database that were produced in a specific year, that contain the face of a specific actor speaking a specific word, and in which there is no motion activity. Video streaming is performed over the high-bandwidth CA*net network deployed by CANARIE, a public Canadian Internet development organization.

  17. Music Genre Classification Using MIDI and Audio Features

    Science.gov (United States)

    Cataltepe, Zehra; Yaslan, Yusuf; Sonmez, Abdullah

    2007-12-01

    We report our findings on using MIDI files and audio features from MIDI, separately and combined, for MIDI music genre classification. We use McKay and Fujinaga's 3-root and 9-leaf genre data set. In order to compute distances between MIDI pieces, we use normalized compression distance (NCD). NCD uses the compressed length of a string as an approximation to its Kolmogorov complexity and has previously been used for music genre and composer clustering. We convert the MIDI pieces to audio and then use the audio features to train different classifiers. MIDI and audio-from-MIDI classifiers alone achieve much lower accuracies than those reported by McKay and Fujinaga, who used not NCD but a number of domain-based MIDI features for their classification. Combining MIDI and audio-from-MIDI classifiers improves accuracy and gets closer to, but still falls short of, McKay and Fujinaga's accuracies. The best root genre accuracies achieved using MIDI, audio, and their combination are 0.75, 0.86, and 0.93, respectively, compared to 0.98 of McKay and Fujinaga. Successful classifier combination requires diversity of the base classifiers. We achieve diversity by using a certain number of seconds of the MIDI file, different sample rates and sizes for the audio file, and different classification algorithms.
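    Normalized compression distance is easy to reproduce with any off-the-shelf compressor. The sketch below uses bz2 and byte strings standing in for MIDI files; it illustrates the NCD formula used in the paper but is not the authors' implementation.

        import bz2

        def ncd(x: bytes, y: bytes) -> float:
            """Normalized compression distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
            cx = len(bz2.compress(x))
            cy = len(bz2.compress(y))
            cxy = len(bz2.compress(x + y))
            return (cxy - min(cx, cy)) / max(cx, cy)

        # Example with byte strings standing in for MIDI files read with open(path, "rb").read().
        a = b"C4 E4 G4 " * 200
        b = b"C4 E4 G4 " * 190 + b"A4 C5 E5 " * 10
        c = b"F#2 A2 D3 " * 200
        print(ncd(a, b), ncd(a, c))   # the more similar pair should give the smaller distance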

  18. Music Genre Classification Using MIDI and Audio Features

    Directory of Open Access Journals (Sweden)

    Abdullah Sonmez

    2007-01-01

    Full Text Available We report our findings on using MIDI files and audio features from MIDI, separately and combined, for MIDI music genre classification. We use McKay and Fujinaga's 3-root and 9-leaf genre data set. In order to compute distances between MIDI pieces, we use normalized compression distance (NCD). NCD uses the compressed length of a string as an approximation to its Kolmogorov complexity and has previously been used for music genre and composer clustering. We convert the MIDI pieces to audio and then use the audio features to train different classifiers. MIDI and audio-from-MIDI classifiers alone achieve much lower accuracies than those reported by McKay and Fujinaga, who used not NCD but a number of domain-based MIDI features for their classification. Combining MIDI and audio-from-MIDI classifiers improves accuracy and gets closer to, but still falls short of, McKay and Fujinaga's accuracies. The best root genre accuracies achieved using MIDI, audio, and their combination are 0.75, 0.86, and 0.93, respectively, compared to 0.98 of McKay and Fujinaga. Successful classifier combination requires diversity of the base classifiers. We achieve diversity by using a certain number of seconds of the MIDI file, different sample rates and sizes for the audio file, and different classification algorithms.

  19. Optimized Audio Classification and Segmentation Algorithm by Using Ensemble Methods

    Directory of Open Access Journals (Sweden)

    Saadia Zahid

    2015-01-01

    Full Text Available Audio segmentation is a basis for multimedia content analysis, which is one of the most important and widely used applications today. An optimized audio classification and segmentation algorithm is presented in this paper that segments a superimposed audio stream on the basis of its content into four main audio types: pure-speech, music, environment sound, and silence. The proposed algorithm preserves important audio content, reduces the misclassification rate without using a large amount of training data, handles noise, and is suitable for real-time applications. Noise in an audio stream is segmented out as environment sound. A hybrid classification approach is used: bagged support vector machines (SVMs) with artificial neural networks (ANNs). The audio stream is first classified into speech and non-speech segments using bagged support vector machines; the non-speech segment is further classified into music and environment sound using artificial neural networks; and lastly, the speech segment is classified into silence and pure-speech segments by a rule-based classifier. Minimal data is used for training the classifiers; ensemble methods are used to minimize the misclassification rate, and approximately 98% accurate segments are obtained. A fast and efficient algorithm is designed that can be used with real-time multimedia applications.
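    A hedged sketch of the described two-stage hierarchy is given below, using scikit-learn's bagged SVMs and an MLP. The random feature matrices, the silence-energy threshold and the helper classify_frame are placeholders, not the features or parameters used in the paper.

        import numpy as np
        from sklearn.ensemble import BaggingClassifier
        from sklearn.neural_network import MLPClassifier
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        # Hypothetical frame-level features and labels (e.g., MFCC/ZCR/energy per frame).
        X_speech_vs_rest = np.random.randn(400, 13)
        y_speech_vs_rest = np.random.randint(0, 2, 400)   # 1 = speech, 0 = non-speech
        X_music_vs_env = np.random.randn(200, 13)
        y_music_vs_env = np.random.randint(0, 2, 200)     # 1 = music, 0 = environment sound

        # Stage 1: bagged SVMs separate speech from non-speech frames.
        stage1 = make_pipeline(StandardScaler(),
                               BaggingClassifier(SVC(kernel="rbf"), n_estimators=10))
        stage1.fit(X_speech_vs_rest, y_speech_vs_rest)

        # Stage 2: an ANN splits non-speech frames into music vs. environment sound.
        stage2 = make_pipeline(StandardScaler(), MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000))
        stage2.fit(X_music_vs_env, y_music_vs_env)

        def classify_frame(features: np.ndarray, short_time_energy: float, silence_thr: float = 1e-4) -> str:
            """Two learned stages plus a rule-based silence check on the speech branch."""
            if stage1.predict(features.reshape(1, -1))[0] == 1:
                return "silence" if short_time_energy < silence_thr else "pure-speech"
            return "music" if stage2.predict(features.reshape(1, -1))[0] == 1 else "environment"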

  20. Intraoperative multichannel audio-visual information recording and automatic surgical phase and incident detection.

    Science.gov (United States)

    Suzuki, Takashi; Sakurai, Yasuo; Yoshimitsu, Kitaro; Nambu, Kyojiro; Muragaki, Yoshihiro; Iseki, Hiroshi

    2010-01-01

    Identification, analysis, and treatment of potential risks in the surgical workflow are key to decreasing medical errors in the operating room. For the automatic analysis of recorded surgical information, this study reports a multichannel audio-visual recording system and its review and analysis system. Motion in the operating room is quantified using video file size, without motion tracking. Conversation among surgical staff is quantified using the fast Fourier transform and a frequency filter, without speech recognition. The results suggested the progression phase of the surgical procedure.
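    As a rough illustration of quantifying conversation without speech recognition, the snippet below measures per-second energy in a telephone-like speech band via the FFT; the band limits and frame length are assumptions, not the filter settings used in the study.

        import numpy as np

        def speech_band_energy(frame: np.ndarray, fs: int, lo: float = 300.0, hi: float = 3400.0) -> float:
            """Energy of one audio frame inside a speech-like frequency band, via the FFT."""
            spectrum = np.abs(np.fft.rfft(frame * np.hanning(frame.size))) ** 2
            freqs = np.fft.rfftfreq(frame.size, d=1.0 / fs)
            band = (freqs >= lo) & (freqs <= hi)
            return float(np.sum(spectrum[band]))

        # Per-second "conversation activity" profile of a recording held in `audio`.
        fs = 16000
        audio = np.random.randn(10 * fs)   # placeholder for a real recording
        activity = [speech_band_energy(audio[i : i + fs], fs) for i in range(0, audio.size - fs, fs)]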

  1. 106-17 Telemetry Standards Digitized Audio Telemetry Standard Chapter 5

    Science.gov (United States)

    2017-07-01

    Describes a system which accepts a band-limited analog signal and encodes it into binary form for transmission through a digital channel; this nonlinear, sampled-data feedback system transmits the encoded bit stream through the channel and reconstructs the signal at the receiving end. (Telemetry Standards, RCC Standard 106-17, Chapter 5, "Digitized Audio Telemetry Standard", July 2017.)

  2. MPEG-4 low-delay general audio coding

    Science.gov (United States)

    Sporer, Thomas; Grill, Bernhard; Herre, Juergen

    2001-07-01

    Traditionally, speech coding for communication purposes and perceptual audio coding have been separate worlds. On one hand, speech coders provide acceptable speech quality at very low data rates and low delays which are suitable for two-way communication applications, such as Voice over IP (VoIP) or teleconferencing. Due to the underlying coding paradigm, however, such coders do not perform well for non-speech signals (e.g., music and environmental noise). Furthermore, the sound quality and naturalness is severely limited by the fact that most coders are working in narrow-band mode, i.e. with a bandwidth below 4 kHz. On the other hand, perceptual audio codecs provide excellent subjective audio quality for a broad range of signals including speech at bit rates down to 16 kbit/s. The delay of such a coder/decoder chain, however, usually exceeds 200 ms at very low data rates and is therefore not acceptable for interactive two-way communication. This paper describes a coding scheme which is designed to combine the advantages of perceptual audio coding with the low delay necessary for two-way communication. The codec was standardized within MPEG-4 Version 2 Audio under the work item "Low Delay Audio Coding" and is derived from the ISO/MPEG-2/4 Advanced Audio Coding (AAC) algorithm. The algorithm provides modes operating at an algorithmic delay as low as 20 ms and is equipped to handle all full-bandwidth high-quality audio signals, in monophonic, stereophonic and even multi-channel formats. Despite the low algorithmic delay, the codec delivers better audio quality than MPEG-1 Layer-3 (MP3) at the same bit rate. The paper also addresses issues pertaining to the integration of the coder into H.32x and SDP applications.

  3. 47 CFR 25.144 - Licensing provisions for the 2.3 GHz satellite digital audio radio service.

    Science.gov (United States)

    2010-10-01

    ... system, and the technical, legal, and financial qualifications of the applicant. In particular... receiver that will permit end users to access all licensed satellite DARS systems that are operational or...) Reporting requirements. All licensees of satellite digital audio radio service systems shall, on June 30 of...

  4. Survey of compressed domain audio features and their expressiveness

    Science.gov (United States)

    Pfeiffer, Silvia; Vincent, Thomas

    2003-01-01

    We give an overview of existing audio analysis approaches in the compressed domain and incorporate them into a coherent formal structure. After examining the kinds of information accessible in an MPEG-1 compressed audio stream, we describe a coherent approach to determine features from them and report on a number of applications they enable. Most of them aim at creating an index to the audio stream by segmenting the stream into temporally coherent regions, which may be classified into pre-specified types of sounds such as music, speech, speakers, animal sounds, sound effects, or silence. Other applications centre around sound recognition such as gender, beat or speech recognition.

  5. Debugging of Class-D Audio Power Amplifiers

    DEFF Research Database (Denmark)

    Crone, Lasse; Pedersen, Jeppe Arnsdorf; Mønster, Jakob Døllner

    2012-01-01

    Determining and optimizing the performance of a Class-D audio power amplifier can be very difficult without knowledge of the use of audio performance measuring equipment and of how the various noise and distortion sources influence the audio performance. This paper gives an introduction on how to measure...... the performance of the amplifier and how to find the noise and distortion sources and suggests ways to remove them. Throughout the paper measurements of a test amplifier are presented along with the relevant theory....

  6. A review of lossless audio compression standards and algorithms

    Science.gov (United States)

    Muin, Fathiah Abdul; Gunawan, Teddy Surya; Kartiwi, Mira; Elsheikh, Elsheikh M. A.

    2017-09-01

    Over the years, lossless audio compression has gained popularity as researchers and businesses have become more aware of the need for better quality and higher storage capacity. This paper analyses various lossless audio coding algorithms and standards that are used and available in the market, focusing on Linear Predictive Coding (LPC) specifically due to its popularity and robustness in audio compression; nevertheless, other prediction methods are compared for verification. Advanced representations of LPC, such as LSP decomposition techniques, are also discussed in this paper.

  7. Current-Driven Switch-Mode Audio Power Amplifiers

    DEFF Research Database (Denmark)

    Knott, Arnold; Buhl, Niels Christian; Andersen, Michael A. E.

    2012-01-01

    The conversion of electrical energy into sound waves by electromechanical transducers is proportional to the current through the coil of the transducer. However, virtually all audio power amplifiers provide a controlled voltage at the interface to the transducer. This paper presents...... a switch-mode audio power amplifier that not only provides a controlled current but is also supplied by current. This results in an output filter size reduction by a factor of 6. The implemented prototype shows decent audio performance with THD+N below 0.1%....

  8. Multi Carrier Modulation Audio Power Amplifier with Programmable Logic

    DEFF Research Database (Denmark)

    Christiansen, Theis; Andersen, Toke Meyer; Knott, Arnold

    2009-01-01

    While switch-mode audio power amplifiers allow compact implementations and high output power levels due to their high power efficiency, they are well known for creating electromagnetic interference (EMI) with other electronic equipment. To lower the EMI of switch-mode (class D) audio power...... for performance and out-of-band spectral amplitudes. The basic principle of MCM is to use programmable logic to combine two or more pulse-width-modulated (PWM) audio signals at different switching frequencies. In this way the out-of-band spectrum is lowered compared with conventional class D amplifiers...

  9. Switching-mode Audio Power Amplifiers with Direct Energy Conversion

    DEFF Research Database (Denmark)

    Ljusev, Petar; Andersen, Michael Andreas E.

    2005-01-01

    This paper presents a new class of switching-mode audio power amplifiers, which are capable of direct energy conversion from the AC mains to the audio output. They represent an ultimate integration of a switching-mode power supply and a Class D audio power amplifier, where the intermediate DC bus has been replaced with a high-frequency AC link. When compared to the conventional Class D amplifiers with a separate DC power supply, the proposed single conversion stage amplifier provides a simple and compact solution with better efficiency and a higher level of integration, leading to reduced......

  10. Auditory and audio-vocal responses of single neurons in the monkey ventral premotor cortex.

    Science.gov (United States)

    Hage, Steffen R

    2018-03-20

    Monkey vocalization is a complex behavioral pattern, which is flexibly used in audio-vocal communication. A recently proposed dual neural network model suggests that cognitive control might be involved in this behavior, originating from a frontal cortical network in the prefrontal cortex and mediated via projections from the rostral portion of the ventral premotor cortex (PMvr) and motor cortex to the primary vocal motor network in the brainstem. For the rapid adjustment of vocal output to external acoustic events, strong interconnections between vocal motor and auditory sites are needed, which are present at cortical and subcortical levels. However, the role of the PMvr in audio-vocal integration processes remains unclear. In the present study, single neurons in the PMvr were recorded in rhesus monkeys (Macaca mulatta) while volitionally producing vocalizations in a visual detection task or passively listening to monkey vocalizations. Ten percent of randomly selected neurons in the PMvr modulated their discharge rate in response to acoustic stimulation with species-specific calls. More than four-fifths of these auditory neurons showed an additional modulation of their discharge rates either before and/or during the monkeys' motor production of the vocalization. Based on these audio-vocal interactions, the PMvr might be well positioned to mediate higher order auditory processing with cognitive control of the vocal motor output to the primary vocal motor network. Such audio-vocal integration processes in the premotor cortex might constitute a precursor for the evolution of complex learned audio-vocal integration systems, ultimately giving rise to human speech. Copyright © 2018 Elsevier B.V. All rights reserved.

  11. Integration of top-down and bottom-up information for audio organization and retrieval

    DEFF Research Database (Denmark)

    Jensen, Bjørn Sand

    The increasing availability of digital audio and music calls for methods and systems to analyse and organize these digital objects. This thesis investigates three elements related to such systems, focusing on the ability to represent and elicit the user's view on the multimedia object and the system...... output. The aim is to provide organization and processing which align with the understanding and needs of the users. Audio and music are often characterized by a large amount of heterogeneous information. The first aspect investigated is the integration of such multi-variate and multi-modal information....... The setup is based on classical regression and choice models placed in the framework of Gaussian processes, which provide flexible non-parametric Bayesian models. The setup consists of a number of likelihood functions suitable for modelling both absolute ratings (direct scaling) and comparative judgements...

  12. Interactive Audio Visual Learning: An Overview

    Science.gov (United States)

    Reich, Steven D.

    1984-01-01

    Interactive AudioVisual Learning (IAVL) is a dynamic branch of computer-assisted instruction that adds the dimensions of sight and sound to programmed learning. The power of audiovisual media to present complex concepts is coupled with the capabilities of a computer to analyze a learner's response to questions and then to direct the flow of information. The development of lessons in this format usually requires the input of content specialists, instructional designers, audiovisual media experts, and programmers. The IAVL format appears to be well accepted by learners and has been shown to be an efficient means of teaching. No standards for hardware, software, or presentation of material have been set, so efforts in the area of IAVL remain scattered. Several groups are actively working in the field of medically related subjects, but the major emphasis for most production teams is on corporate training. The commercial sector will probably be responsible for standardizing software and hardware, but lesson content for medical professionals will require medical educators. Since IAVL lessons are so different from standard lecture formats, more medical educators will have to be introduced to IAVL in order to create enough interest to get IAVL moved into the medical curriculum. The developmental efforts of those involved in IAVL productions for the education of medical professionals are important to the ultimate acceptance of the IAVL format.

  13. Personal audio with a planar bright zone.

    Science.gov (United States)

    Coleman, Philip; Jackson, Philip J B; Olik, Marek; Pedersen, Jan Abildgaard

    2014-10-01

    Reproduction of multiple sound zones, in which personal audio programs may be consumed without the need for headphones, is an active topic in acoustical signal processing. Many approaches to sound zone reproduction do not consider control of the bright zone phase, which may lead to self-cancellation problems if the loudspeakers surround the zones. Conversely, control of the phase in a least-squares sense comes at a cost of decreased level difference between the zones and a reduced frequency range of cancellation. Single-zone approaches have considered plane wave reproduction by focusing the sound energy into a point in the wavenumber domain. In this article, a planar bright zone is reproduced via planarity control, which constrains the bright zone energy to impinge from a narrow range of angles via projection into a spatial domain. Simulation results using a circular array surrounding two zones show the method to produce superior contrast to the least-squares approach, and superior planarity to the contrast maximization approach. Practical performance measurements obtained in an acoustically treated room verify the conclusions drawn under free-field conditions.
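    For context, the least-squares (pressure-matching) baseline that the article compares against can be sketched in a few lines. The regularization weight, array geometry and random transfer functions below are illustrative assumptions, not the measured plant matrices of the study.

        import numpy as np

        def pressure_matching(G_bright: np.ndarray, G_dark: np.ndarray,
                              p_target: np.ndarray, reg: float = 1e-3) -> np.ndarray:
            """Regularized least-squares loudspeaker weights: match a target pressure in the
            bright zone while driving the dark-zone pressure towards zero."""
            G = np.vstack([G_bright, G_dark])                    # stacked plant matrices
            d = np.concatenate([p_target, np.zeros(G_dark.shape[0], dtype=complex)])
            A = G.conj().T @ G + reg * np.eye(G.shape[1])
            return np.linalg.solve(A, G.conj().T @ d)            # source strengths q

        # Toy single-frequency example with random transfer functions.
        rng = np.random.default_rng(0)
        Gb = rng.standard_normal((20, 16)) + 1j * rng.standard_normal((20, 16))   # 20 bright mics, 16 speakers
        Gd = rng.standard_normal((20, 16)) + 1j * rng.standard_normal((20, 16))   # 20 dark mics
        q = pressure_matching(Gb, Gd, p_target=np.ones(20, dtype=complex))
        contrast_db = 10 * np.log10(np.mean(np.abs(Gb @ q) ** 2) / np.mean(np.abs(Gd @ q) ** 2))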

  14. Advances in audio watermarking based on singular value decomposition

    CERN Document Server

    Dhar, Pranab Kumar

    2015-01-01

    This book introduces audio watermarking methods for copyright protection, which has drawn extensive attention for securing digital data from unauthorized copying. The book is divided into two parts. First, an audio watermarking method in discrete wavelet transform (DWT) and discrete cosine transform (DCT) domains using singular value decomposition (SVD) and quantization is introduced. This method is robust against various attacks and provides good imperceptible watermarked sounds. Then, an audio watermarking method in fast Fourier transform (FFT) domain using SVD and Cartesian-polar transformation (CPT) is presented. This method has high imperceptibility and high data payload and it provides good robustness against various attacks. These techniques allow media owners to protect copyright and to show authenticity and ownership of their material in a variety of applications. Features new methods of audio watermarking for copyright protection and ownership protection; Outl...
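    As a loose, time-domain illustration of SVD-based embedding (the book works in DWT/DCT and FFT domains), the sketch below hides one bit per frame by quantizing the frame's largest singular value; the frame shape and quantization step are arbitrary assumptions.

        import numpy as np

        def embed_bit_svd(frame: np.ndarray, bit: int, step: float = 0.5) -> np.ndarray:
            """Embed one bit by quantizing the largest singular value of the frame,
            reshaped into a matrix (a simple quantization-index-modulation rule)."""
            m = frame.reshape(32, -1)                  # assumes len(frame) is a multiple of 32
            U, s, Vt = np.linalg.svd(m, full_matrices=False)
            target = (np.floor(s[0] / step) + (0.25 if bit == 0 else 0.75)) * step
            if target < s[0]:
                target += step                         # never shrink s[0], so it stays the largest
            s[0] = target
            return (U @ np.diag(s) @ Vt).ravel()

        def extract_bit_svd(frame: np.ndarray, step: float = 0.5) -> int:
            s0 = np.linalg.svd(frame.reshape(32, -1), compute_uv=False)[0]
            return 0 if (s0 / step) % 1.0 < 0.5 else 1

        frame = np.random.randn(32 * 32)               # placeholder for a windowed audio frame
        marked = embed_bit_svd(frame, bit=1)
        print(extract_bit_svd(marked))                 # expected: 1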

  15. Audio-visual temporal perception in children with restored hearing.

    Science.gov (United States)

    Gori, Monica; Chilosi, Anna; Forli, Francesca; Burr, David

    2017-05-01

    It is not clear how audio-visual temporal perception develops in children with restored hearing. In this study we measured temporal discrimination thresholds with an audio-visual temporal bisection task in 9 deaf children with restored audition, and 22 typically hearing children. In typically hearing children, audition was more precise than vision, with no gain in multisensory conditions (as previously reported in Gori et al. (2012b)). However, deaf children with restored audition showed similar auditory and visual thresholds and some evidence of gain in audio-visual temporal multisensory conditions. Interestingly, we found a strong correlation between auditory weighting of multisensory signals and quality of language: patients who gave more weight to audition had better language skills. Similarly, auditory thresholds for the temporal bisection task were also a good predictor of language skills. This result supports the idea that temporal auditory processing is associated with language development. Copyright © 2017. Published by Elsevier Ltd.

  16. Behavioral Science Design for Audio-Visual Software Development

    Science.gov (United States)

    Foster, Dennis L.

    1974-01-01

    A discussion of the basic structure of the behavioral audio-visual production which consists of objectives analysis, approach determination, technical production, fulfillment evaluation, program refinement, implementation, and follow-up. (Author)

  17. Audio CAPTCHA for SIP-Based VoIP

    Science.gov (United States)

    Soupionis, Yannis; Tountas, George; Gritzalis, Dimitris

    Voice over IP (VoIP) introduces new ways of communication, while utilizing existing data networks to provide inexpensive voice communications worldwide as a promising alternative to the traditional PSTN telephony. SPam over Internet Telephony (SPIT) is one potential source of future annoyance in VoIP. A common way to launch a SPIT attack is the use of an automated procedure (bot), which generates calls and produces audio advertisements. In this paper, our goal is to design appropriate CAPTCHA to fight such bots. We focus on and develop audio CAPTCHA, as the audio format is more suitable for VoIP environments and we implement it in a SIP-based VoIP environment. Furthermore, we suggest and evaluate the specific attributes that audio CAPTCHA should incorporate in order to be effective, and test it against an open source bot implementation.

  18. Effectiveness of 3-D audio for warnings in the cockpit

    NARCIS (Netherlands)

    Oving, A.B.; Veltman, J.A.; Bronkhorst, A.W.

    2004-01-01

    Two flight-simulator experiments showed that pilots responded faster to the auditory warnings of the TCAS system in the civil cockpit when these warnings were presented with 3D audio compared to monaural sound.

  19. PENGEMBANGAN MEDIA AUDIO VISUAL PEMBELAJARAN MENULIS BERITA SINGKAT

    OpenAIRE

    Sastri, Sastri; Wiryotinoyo, Mujiyono; Sudaryono, Sudaryono

    2015-01-01

    This article is based on developmental research aimed at constructing audio-visual media for teaching news writing. The media is developed with a contextual approach: materials and training tasks are presented and designed to match the students' environment. Through this approach, students are expected to bring their own experiences into the learning situation. The design used in the development of the audio-visual media for learning to write news follows the Alessi and Trollip...

  20. El Digital Audio Tape Recorder. Contra autores y creadores

    Directory of Open Access Journals (Sweden)

    Jun Ono

    2015-01-01

    Full Text Available The so-called "DAT" (short for "digital audio tape recorder") has long received coverage in the mass media of Japan and other countries as a new and controversial electronic audio product of the Japanese consumer electronics industry. What has become of the object of this controversy?

  1. The Effect of Audio and Visual Aids on Task Performance in Distributed Collaborative Virtual Environments

    Science.gov (United States)

    Ullah, Sehat; Richard, Paul; Otman, Samir; Mallem, Malik

    2009-03-01

    Collaborative virtual environments (CVEs) have recently gained the attention of many researchers due to their numerous potential application domains. Cooperative virtual environments, where users simultaneously manipulate objects, are one of the subfields of CVEs. In this paper we present a framework that enables two users to cooperatively manipulate objects in a virtual environment while sitting at two separate machines connected through a local network. In addition, the article presents the use of sensory feedback (audio and visual) and investigates its effects on cooperation and user performance. Six volunteer subjects had to cooperatively perform a peg-in-hole task. Results revealed that visual and auditory aids increase users' performance; however, the majority of the users preferred visual feedback to audio. We hope this framework will greatly help in the development of CAD systems that allow designers to collaborate while being distant. Other application domains may be cooperative assembly, surgical training and rehabilitation systems.

  2. Audio-based Age and Gender Identification to Enhance the Recommendation of TV Content

    DEFF Research Database (Denmark)

    Shepstone, Sven Ewan; Tan, Zheng-Hua; Jensen, Søren Holdt

    2013-01-01

    Recommending TV content to groups of viewers is best carried out when relevant information such as the demographics of the group is available. However, it can be difficult and time consuming to extract information for every user in the group. This paper shows how an audio analysis of the age...... and gender of a group of users watching the TV can be used for recommending a sequence of N short TV content items for the group. First, a state-of-the-art audio-based classifier determines the age and gender of each user in an M-user group and creates a group profile. A genetic recommender algorithm...... of state-of-the-art age-and-gender detection systems, the proposed system has a significant ability to predict an item with a matching age and gender category. User studies were conducted where subjects were asked to rate a sequence of advertisements, where half of the advertisements were randomly selected...

  3. Automated processing of massive audio/video content using FFmpeg

    Directory of Open Access Journals (Sweden)

    Kia Siang Hock

    2014-01-01

    Full Text Available Audio and video content forms an integral, important and expanding part of the digital collections in libraries and archives world-wide. While these memory institutions are familiar and well-versed in the management of more conventional materials such as books, periodicals, ephemera and images, the handling of audio (e.g., oral history recordings) and video content (e.g., audio-visual recordings, broadcast content) requires additional toolkits. In particular, a robust and comprehensive tool that provides a programmable interface is indispensable when dealing with tens of thousands of hours of audio and video content. FFmpeg is a comprehensive and well-established open source software package that is capable of the full range of audio/video processing tasks (such as encode, decode, transcode, mux, demux, stream and filter). It is also capable of handling a wide range of audio and video formats, a unique challenge in memory institutions. It comes with a command line interface, as well as a set of developer libraries that can be incorporated into applications.
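    A typical way to drive FFmpeg programmatically for such collections is to shell out to the ffmpeg binary. The sketch below batch-transcodes audio tracks to MP3 access copies; the directory names are hypothetical and the libmp3lame encoder must be present in the local ffmpeg build.

        import subprocess
        from pathlib import Path

        def transcode_to_mp3(src: Path, dst_dir: Path, bitrate: str = "192k") -> Path:
            """Call the ffmpeg command-line tool to extract/transcode the audio track of one file."""
            dst_dir.mkdir(parents=True, exist_ok=True)
            dst = dst_dir / (src.stem + ".mp3")
            cmd = [
                "ffmpeg", "-y",          # overwrite existing output without prompting
                "-i", str(src),          # input file (audio or video)
                "-vn",                   # drop any video stream, keep audio only
                "-ar", "44100",          # resample to 44.1 kHz
                "-ac", "2",              # mix to stereo
                "-c:a", "libmp3lame",    # MP3 encoder (must be present in the ffmpeg build)
                "-b:a", bitrate,
                str(dst),
            ]
            subprocess.run(cmd, check=True, capture_output=True)
            return dst

        # Batch-process every audio/video file in a (hypothetical) collection directory.
        for f in sorted(Path("collection").glob("*")):
            if f.suffix.lower() in {".wav", ".mov", ".mp4"}:
                transcode_to_mp3(f, Path("access_copies"))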

  4. Depth perception of audio sources in stereo 3D environments

    Science.gov (United States)

    Corrigan, David; Gorzel, Marcin; Squires, John; Boland, Frank

    2013-03-01

    In this paper we undertook perceptual experiments to determine the allowed differences in depth between audio and visual stimuli in stereoscopic-3D environments while being perceived as congruent. We also investigated whether the nature of the environment and stimuli affects the perception of congruence. This was achieved by creating an audio-visual environment consisting of a photorealistic visual environment captured by a camera under orthostereoscopic conditions and a virtual audio environment generated by measuring the acoustic properties of the real environment. The visual environment consisted of a room with a loudspeaker or person forming the visual stimulus and was presented to the viewer using a passive stereoscopic display. Pink noise samples and female speech were used as audio stimuli which were presented over headphones using binaural renderings. The stimuli were generated at different depths from the viewer and the viewer was asked to determine whether the audio stimulus was nearer, further away or at the same depth as the visual stimulus. From our experiments it is shown that there is a significant range of depth differences for which audio and visual stimuli are perceived as congruent. Furthermore, this range increases as the depth of the visual stimulus increases.

  5. Communicative Competence in Audio Classrooms: A Position Paper for the CADE 1991 Conference.

    Science.gov (United States)

    Burge, Liz

    Classroom practitioners need to move their attention away from the technological and logistical competencies required for audio conferencing (AC) to the required communicative competencies in order to advance their skills in handling the psychodynamics of audio virtual classrooms which include audio alone and audio with graphics. While the…

  6. 37 CFR 201.27 - Initial notice of distribution of digital audio recording devices or media.

    Science.gov (United States)

    2010-07-01

    ... distribution of digital audio recording devices or media. 201.27 Section 201.27 Patents, Trademarks, and... Initial notice of distribution of digital audio recording devices or media. (a) General. This section..., any digital audio recording device or digital audio recording medium in the United States. (b...

  7. Joint evaluation of communication quality and user experience in an audio-visual virtual reality meeting

    DEFF Research Database (Denmark)

    Møller, Anders Kalsgaard; Hoffmann, Pablo F.; Carrozzino, Marcello

    2013-01-01

    The state-of-the-art speech intelligibility tests are created with the purpose of evaluating acoustic communication devices and not for evaluating audio-visual virtual reality systems. This paper presents a novel method to evaluate a communication situation based on both the speech intelligibility...... and the indexical characteristics of the speaker. The results will be available in the final paper. Index Terms: speech intelligibility, virtual reality, body language, telecommunication....

  8. Modelling and analysis of a high-performance Class D audio amplifier using unipolar pulse-width-modulation

    Science.gov (United States)

    Zhou, Zekun; Shi, Yue; Ming, Xin; Zhang, Bo; Li, Zhaoji; Chen, Zao

    2012-02-01

    A high-performance class D audio amplifier using unipolar pulse-width-modulation (PWM) with double-sided natural sampling is presented in this article. In order to comprehend and design the system properly, the class D audio amplifier is modelled and analysed. A wide-range triangle-wave signal with good linearity and magnitude proportional to the supply voltage is embedded in the proposed class D audio amplifier for maximum output power, high power supply rejection ratio (PSRR) and low total harmonic distortion (THD). Design results based on a CSMC 0.5-µm 5-V complementary metal-oxide-semiconductor process demonstrate that the proposed class D audio amplifier can operate with supply voltage in the range 2.4-5.5 V and supports 2.8 W output power from a 5.5 V supply; the maximum efficiency is above 95%, the PSRR is -82 dB, the signal-to-noise ratio (SNR) is 97 dB and the total harmonic distortion plus noise (THD+N) is less than 0.1% between 20 Hz and 20 kHz at an output power of 0.4 W; the quiescent current without load is 1.8 mA, and the shutdown current is 0.01 µA. The active area of the class-D audio power amplifier is 1.5 mm × 1.5 mm.

  9. A Novel Method for Real-Time Audio Recording With Intraoperative Video.

    Science.gov (United States)

    Sugamoto, Yuji; Hamamoto, Yasuyoshi; Kimura, Masayuki; Fukunaga, Toru; Tasaki, Kentaro; Asai, Yo; Takeshita, Nobuyoshi; Maruyama, Tetsuro; Hosokawa, Takashi; Tamachi, Tomohide; Aoyama, Hiromichi; Matsubara, Hisahiro

    2015-01-01

    Although laparoscopic surgery has become widespread, effective and efficient education in laparoscopic surgery is difficult. Instructive laparoscopy videos with appropriate annotations are ideal for initial training in laparoscopic surgery; however, the method we use at our institution for creating laparoscopy videos with audio is not generalized, and there have been no detailed explanations of any such method. Our objectives were to demonstrate the feasibility of low-cost, simple methods for recording surgical videos with audio and to perform a preliminary safety evaluation when obtaining these recordings during operations. We devised a method for the synchronous recording of surgical video with real-time audio in which we connected an amplifier and a wireless microphone to an existing endoscopy system and its video-recording device. We tested this system in 209 cases of laparoscopic surgery in operating rooms between August 2010 and July 2011, prospectively investigated the results of the audiovisual recording method, and examined intraoperative problems. Numazu City Hospital in Numazu City, Japan. Surgeons, instrument nurses, and medical engineers. In all cases, the synchronous input of audio and video was possible. The recording system did not cause any inconvenience to the surgeon, assistants, instrument nurse, sterilized equipment, or electrical medical equipment. Statistically significant differences were not observed between the audiovisual group and control group regarding the operating time, which had been divided into 2 slots (performed by the instructors or by trainees) (p > 0.05). This recording method is feasible and considerably safe while posing minimal difficulty in terms of technology, time, and expense. We recommend this method for both surgical trainees who wish to acquire surgical skills effectively and medical instructors who wish to teach surgical skills effectively. Copyright © 2015 Association of Program Directors in Surgery

  10. Neuromorphic Audio-Visual Sensor Fusion on a Sound-Localising Robot

    Directory of Open Access Journals (Sweden)

    Vincent Yue-Sek Chan

    2012-02-01

    Full Text Available This paper presents the first robotic system featuring audio-visual sensor fusion with neuromorphic sensors. We combine a pair of silicon cochleae and a silicon retina on a robotic platform to allow the robot to learn sound localisation through self-motion and visual feedback, using an adaptive ITD-based sound localisation algorithm. After training, the robot can localise sound sources (white or pink noise) in a reverberant environment with an RMS error of 4 to 5 degrees in azimuth. In the second part of the paper, we investigate the source binding problem. An experiment is conducted to test the effectiveness of matching an audio event with a corresponding visual event based on their onset times. The results show that this technique can be quite effective, despite its simplicity.
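    A minimal version of ITD-based localisation can be sketched with a cross-correlation peak search followed by a far-field angle mapping. The microphone spacing, lag limit and the toy delayed-noise example below are assumptions, not the robot's calibrated, learned mapping.

        import numpy as np

        def estimate_itd(left: np.ndarray, right: np.ndarray, fs: int, max_itd_s: float = 7e-4) -> float:
            """Interaural time difference estimated from the peak of the cross-correlation."""
            corr = np.correlate(left, right, mode="full")
            lags = np.arange(-len(right) + 1, len(left))
            valid = np.abs(lags) <= int(max_itd_s * fs)      # physically plausible lags only
            best = lags[valid][np.argmax(corr[valid])]
            return best / fs

        def itd_to_azimuth(itd: float, mic_spacing: float = 0.2, c: float = 343.0) -> float:
            """Map an ITD to an azimuth angle (degrees) with a far-field free-field model."""
            return float(np.degrees(np.arcsin(np.clip(itd * c / mic_spacing, -1.0, 1.0))))

        # Toy example: delay a noise burst by 3 samples between the two "ears".
        fs = 48000
        noise = np.random.randn(4800)
        left, right = noise, np.roll(noise, 3)               # right channel lags by 3 samples
        itd = estimate_itd(left, right, fs)
        print(itd, itd_to_azimuth(itd))                      # negative values indicate a source to one side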

  11. EXPERIMENTAL STUDIES FOR DEVELOPMENT HIGH-POWER AUDIO SPEAKER DEVICES PERFORMANCE USING PERMANENT NdFeB MAGNETS SPECIAL TECHNOLOGY

    Directory of Open Access Journals (Sweden)

    Constantin D. STĂNESCU

    2013-05-01

    Full Text Available In this paper the authors present research on improving the performance of high-power audio speaker devices using a special permanent NdFeB magnet technology. Magnetic losses inside these audio devices are due to friction in the mechanical system and to the thermal (Joule) effect of eddy currents. In this regard, conical surfaces were produced by the special technology at the top plate and the centre pin. Analysing the results obtained by modelling the magnetic circuit with the finite element method in an electronic software package, an increase of over 10% was measured, from 1.136 T to 1.3 T.

  12. Audio-visual assistance in co-creating transition knowledge

    Science.gov (United States)

    Hezel, Bernd; Broschkowski, Ephraim; Kropp, Jürgen P.

    2013-04-01

    Earth system and climate impact research results point to the tremendous ecologic, economic and societal implications of climate change. Specifically, people will have to adopt lifestyles that are very different from those they currently strive for in order to mitigate severe changes of our known environment. It will most likely not suffice to transfer the scientific findings into international agreements and appropriate legislation. A transition is rather reliant on pioneers that define new role models, on change agents that mainstream the concept of sufficiency and on narratives that make different futures appealing. In order for the research community to be able to provide sustainable transition pathways that are viable, an integration of the physical constraints and the societal dynamics is needed. Hence the necessary transition knowledge is to be co-created by social and natural science and society. To this end, the Climate Media Factory - in itself a massively transdisciplinary venture - strives to provide an audio-visual connection between the different scientific cultures and a bi-directional link to stakeholders and society. Since the methodologies, languages and knowledge levels of those involved are not the same, we develop new entertaining formats on the basis of a "complexity on demand" approach. They present scientific information in an integrated and entertaining way with different levels of detail that provide entry points to users with different requirements. Two examples shall illustrate the advantages and restrictions of the approach.

  13. 3D sound and 3D image interactions: a review of audio-visual depth perception

    Science.gov (United States)

    Berry, Jonathan S.; Roberts, David A. T.; Holliman, Nicolas S.

    2014-02-01

    There has been much research concerning visual depth perception in 3D stereoscopic displays and, to a lesser extent, auditory depth perception in 3D spatial sound systems. With 3D sound systems now available in a number of different forms, there is increasing interest in the integration of 3D sound systems with 3D displays. It therefore seems timely to review key concepts and results concerning depth perception in such display systems. We first present overviews of both visual and auditory depth perception, before focussing on cross-modal effects in audio-visual depth perception, which may be of direct interest to display and content designers.

  14. Audio Visual Integration with Competing Sources in the Framework of Audio Visual Speech Scene Analysis.

    Science.gov (United States)

    Ganesh, Attigodu Chandrashekara; Berthommier, Frédéric; Schwartz, Jean-Luc

    2016-01-01

    We introduce "Audio-Visual Speech Scene Analysis" (AVSSA) as an extension of the two-stage Auditory Scene Analysis model towards audiovisual scenes made of mixtures of speakers. AVSSA assumes that a coherence index between the auditory and the visual input is computed prior to audiovisual fusion, enabling to determine whether the sensory inputs should be bound together. Previous experiments on the modulation of the McGurk effect by audiovisual coherent vs. incoherent contexts presented before the McGurk target have provided experimental evidence supporting AVSSA. Indeed, incoherent contexts appear to decrease the McGurk effect, suggesting that they produce lower audiovisual coherence hence less audiovisual fusion. The present experiments extend the AVSSA paradigm by creating contexts made of competing audiovisual sources and measuring their effect on McGurk targets. The competing audiovisual sources have respectively a high and a low audiovisual coherence (that is, large vs. small audiovisual comodulations in time). The first experiment involves contexts made of two auditory sources and one video source associated to either the first or the second audio source. It appears that the McGurk effect is smaller after the context made of the visual source associated to the auditory source with less audiovisual coherence. In the second experiment with the same stimuli, the participants are asked to attend to either one or the other source. The data show that the modulation of fusion depends on the attentional focus. Altogether, these two experiments shed light on audiovisual binding, the AVSSA process and the role of attention.

  15. Audio-visual integration through the parallel visual pathways.

    Science.gov (United States)

    Kaposvári, Péter; Csete, Gergő; Bognár, Anna; Csibri, Péter; Tóth, Eszter; Szabó, Nikoletta; Vécsei, László; Sáry, Gyula; Tamás Kincses, Zsigmond

    2015-10-22

    Audio-visual integration has been shown to be present in a wide range of different conditions, some of which are processed through the dorsal, and others through the ventral visual pathway. Whereas neuroimaging studies have revealed integration-related activity in the brain, there has been no imaging study of the possible role of segregated visual streams in audio-visual integration. We set out to determine how the different visual pathways participate in this communication. We investigated how audio-visual integration can be supported through the dorsal and ventral visual pathways during the double flash illusion. Low-contrast and chromatic isoluminant stimuli were used to drive preferably the dorsal and ventral pathways, respectively. In order to identify the anatomical substrates of the audio-visual interaction in the two conditions, the psychophysical results were correlated with the white matter integrity as measured by diffusion tensor imaging. The psychophysiological data revealed a robust double flash illusion in both conditions. A correlation between the psychophysical results and local fractional anisotropy was found in the occipito-parietal white matter in the low-contrast condition, while a similar correlation was found in the infero-temporal white matter in the chromatic isoluminant condition. Our results indicate that both of the parallel visual pathways may play a role in the audio-visual interaction. Copyright © 2015. Published by Elsevier B.V.

  16. The Fungible Audio-Visual Mapping and its Experience

    Directory of Open Access Journals (Sweden)

    Adriana Sa

    2014-12-01

    Full Text Available This article draws a perceptual approach to audio-visual mapping. Clearly perceivable cause and effect relationships can be problematic if one desires the audience to experience the music. Indeed perception would bias those sonic qualities that fit previous concepts of causation, subordinating other sonic qualities, which may form the relations between the sounds themselves. The question is, how can an audio-visual mapping produce a sense of causation, and simultaneously confound the actual cause-effect relationships. We call this a fungible audio-visual mapping. Our aim here is to glean its constitution and aspect. We will report a study, which draws upon methods from experimental psychology to inform audio-visual instrument design and composition. The participants are shown several audio-visual mapping prototypes, after which we pose quantitative and qualitative questions regarding their sense of causation, and their sense of understanding the cause-effect relationships. The study shows that a fungible mapping requires both synchronized and seemingly non-related components – sufficient complexity to be confusing. As the specific cause-effect concepts remain inconclusive, the sense of causation embraces the whole. 

  17. Distortion-Free 1-Bit PWM Coding for Digital Audio Signals

    Directory of Open Access Journals (Sweden)

    John Mourjopoulos

    2007-01-01

    Full Text Available Although uniformly sampled pulse width modulation (UPWM) represents a very efficient digital audio coding scheme for digital-to-analog conversion and full-digital amplification, it suffers from strong harmonic distortions, as opposed to the benign non-harmonic artifacts present in analog PWM (naturally sampled PWM, NPWM). Complete elimination of these distortions usually requires excessive oversampling of the source PCM audio signal, which results in impractical realizations of digital PWM systems. In this paper, a description of the digital PWM distortion generation mechanism is given and a novel principle for its minimization is proposed, based on a process having some similarity to the dithering principle employed in multibit signal quantization. This conditioning signal is termed "jither" and it can be applied either in the PCM amplitude or the PWM time domain. It is shown that the proposed method achieves significant decrement of the harmonic distortions, rendering digital PWM performance equivalent to that of the source PCM audio, for mild oversampling (e.g., ×4), resulting in typical PWM clock rates of 90 MHz.
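    The harmonic-distortion mechanism of UPWM is easy to observe numerically. The sketch below generates a uniformly sampled trailing-edge PWM signal from a sine and prints the levels near its first few harmonics; the carrier ratio and signal parameters are arbitrary, and the code does not implement the proposed jither conditioning.

        import numpy as np

        def upwm(x: np.ndarray, carrier_ratio: int = 16) -> np.ndarray:
            """Uniformly sampled PWM: each input sample sets the duty cycle of one carrier period."""
            duty = np.clip((x + 1.0) / 2.0, 0.0, 1.0)                   # map [-1, 1] to [0, 1]
            ramp = (np.arange(carrier_ratio) + 0.5) / carrier_ratio     # one trailing-edge ramp period
            return np.where(ramp[None, :] < duty[:, None], 1.0, -1.0).ravel()

        fs_audio, f0, n = 48000, 1000.0, 4800
        x = 0.8 * np.sin(2 * np.pi * f0 * np.arange(n) / fs_audio)
        pwm = upwm(x)                                                  # PWM clock = 16 * 48 kHz

        # Inspect the baseband spectrum: UPWM places distortion products at harmonics of f0.
        spec = np.abs(np.fft.rfft(pwm * np.hanning(pwm.size))) / pwm.size
        freqs = np.fft.rfftfreq(pwm.size, d=1.0 / (fs_audio * 16))
        for harmonic in (1, 2, 3):
            bin_ = np.argmin(np.abs(freqs - harmonic * f0))
            print(f"{harmonic * f0:.0f} Hz: {20 * np.log10(spec[bin_] + 1e-12):.1f} dB")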

  18. Distortion-Free 1-Bit PWM Coding for Digital Audio Signals

    Directory of Open Access Journals (Sweden)

    Mourjopoulos John

    2007-01-01

    Full Text Available Although uniformly sampled pulse width modulation (UPWM) represents a very efficient digital audio coding scheme for digital-to-analog conversion and full-digital amplification, it suffers from strong harmonic distortions, as opposed to the benign non-harmonic artifacts present in analog PWM (naturally sampled PWM, NPWM). Complete elimination of these distortions usually requires excessive oversampling of the source PCM audio signal, which results in impractical realizations of digital PWM systems. In this paper, a description of the digital PWM distortion generation mechanism is given and a novel principle for its minimization is proposed, based on a process having some similarity to the dithering principle employed in multibit signal quantization. This conditioning signal is termed "jither" and it can be applied either in the PCM amplitude or the PWM time domain. It is shown that the proposed method achieves significant decrement of the harmonic distortions, rendering digital PWM performance equivalent to that of the source PCM audio, for mild oversampling, resulting in typical PWM clock rates of 90 MHz.

  19. Audio Query by Example Using Similarity Measures between Probability Density Functions of Features

    Directory of Open Access Journals (Sweden)

    Marko Helén

    2010-01-01

    Full Text Available This paper proposes a query by example system for generic audio. We estimate the similarity of the example signal and the samples in the queried database by calculating the distance between the probability density functions (pdfs) of their frame-wise acoustic features. Since the features are continuous valued, we propose to model them using Gaussian mixture models (GMMs) or hidden Markov models (HMMs). The models parametrize each sample efficiently and retain sufficient information for similarity measurement. To measure the distance between the models, we apply a novel Euclidean distance, approximations of Kullback-Leibler divergence, and a cross-likelihood ratio test. The performance of the measures was tested in simulations where audio samples are automatically retrieved from a general audio database, based on the estimated similarity to a user-provided example. The simulations show that the distance between probability density functions is an accurate measure of similarity. Measures based on GMMs or HMMs are shown to produce better results than the existing methods based on simpler statistics or histograms of the features. A good performance with low computational cost is obtained with the proposed Euclidean distance.
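    Since the KL divergence between GMMs has no closed form, a common workaround is a Monte Carlo estimate from model samples. The sketch below follows that idea with scikit-learn GMMs over placeholder frame features; it is a generic illustration, not the paper's distance measures or feature extractor.

        import numpy as np
        from sklearn.mixture import GaussianMixture

        def fit_gmm(features: np.ndarray, n_components: int = 8) -> GaussianMixture:
            """Model a sample's frame-wise feature distribution (e.g., MFCCs) with a GMM."""
            return GaussianMixture(n_components=n_components, covariance_type="diag",
                                   random_state=0).fit(features)

        def symmetric_kl_mc(gmm_a: GaussianMixture, gmm_b: GaussianMixture, n: int = 2000) -> float:
            """Monte Carlo approximation of the symmetrized KL divergence between two GMMs."""
            xa, _ = gmm_a.sample(n)
            xb, _ = gmm_b.sample(n)
            kl_ab = np.mean(gmm_a.score_samples(xa) - gmm_b.score_samples(xa))
            kl_ba = np.mean(gmm_b.score_samples(xb) - gmm_a.score_samples(xb))
            return float(kl_ab + kl_ba)

        # Query by example: rank database items by their distance to the query's GMM.
        query_feats = np.random.randn(500, 13)                   # placeholder frame features
        database = {"item1": np.random.randn(400, 13), "item2": np.random.randn(400, 13) + 2.0}
        q_model = fit_gmm(query_feats)
        ranking = sorted(database, key=lambda k: symmetric_kl_mc(q_model, fit_gmm(database[k])))
        print(ranking)   # most similar item first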

  20. Technical Evaluation Report 31: Internet Audio Products (3/ 3

    Directory of Open Access Journals (Sweden)

    Jim Rudolph

    2004-08-01

    Full Text Available Two contrasting additions to the online audio market are reviewed: iVocalize, a browser-based audio-conferencing software, and Skype, a PC-to-PC Internet telephone tool. These products are selected for review on the basis of their success in gaining rapid popular attention and usage during 2003-04. The iVocalize review emphasizes the product’s role in the development of a series of successful online audio communities – notably several serving visually impaired users. The Skype review stresses the ease with which the product may be used for simultaneous PC-to-PC communication among up to five users. Editor’s Note: This paper serves as an introduction to reports about online community building, and reviews of online products for disabled persons, in the next ten reports in this series. JPB, Series Ed.

  1. Four-quadrant flyback converter for direct audio power amplification

    Energy Technology Data Exchange (ETDEWEB)

    Ljusev, P.; Andersen, Michael A.E.

    2005-07-01

    This paper presents a bidirectional, four-quadrant flyback converter for use in direct audio power amplification. When compared to the standard Class-D switching-mode audio power amplifier with a separate power supply, the proposed four-quadrant flyback converter provides a simple and compact solution with high efficiency, a higher level of integration, lower component count, less board space and eventually lower cost. Both peak and average current-mode control for use with 4Q flyback power converters are described and compared. Integrated magnetics is presented which simplifies the construction of the auxiliary power supplies for control biasing and isolated gate drives. The feasibility of the approach is proven on an audio power amplifier prototype for subwoofer applications. (au)

  2. Efficiency Optimization in Class-D Audio Amplifiers

    DEFF Research Database (Denmark)

    Yamauchi, Akira; Knott, Arnold; Jørgensen, Ivan Harald Holger

    2015-01-01

    This paper presents a new power efficiency optimization routine for designing Class-D audio amplifiers. The proposed optimization procedure finds design parameters for the power stage and the output filter, and the optimum switching frequency such that the weighted power losses are minimized under...... the given constraints. The optimization routine is applied to minimize the power losses in a 130 W class-D audio amplifier based on consumer behavior investigations, where the amplifier operates at idle and low power levels most of the time. Experimental results demonstrate that the optimization method can...... lead to around 30 % of efficiency improvement at 1.3 W output power without significant effects on both audio performance and the efficiency at high power levels....
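
    The structure of such a usage-weighted optimisation is sketched below: a toy loss model (conduction, switching and gate-drive terms) is summed over operating points with weights reflecting how often each power level is used, and the switching frequency is found by one-dimensional minimisation. The loss model and every coefficient in it are placeholders, not the models or constraints from the paper; only the weighted-sum-plus-minimisation structure mirrors the described routine.

```python
# Toy weighted power-loss minimisation over switching frequency.
# All component values and the loss model are placeholders, not the paper's.
import numpy as np
from scipy.optimize import minimize_scalar

operating_points_w = np.array([0.1, 1.3, 13.0, 130.0])   # output powers (assumed)
usage_weights      = np.array([0.50, 0.35, 0.10, 0.05])  # time spent at each (assumed)

def total_loss(fsw, p_out):
    r_on, c_oss, v_bus, q_g, v_drv = 0.1, 200e-12, 50.0, 10e-9, 10.0  # placeholder parts
    i_rms = np.sqrt(p_out / 8.0)                 # crude load-current estimate
    p_cond = 2 * r_on * i_rms**2                 # conduction loss
    p_sw   = c_oss * v_bus**2 * fsw              # capacitive switching loss
    p_gate = 2 * q_g * v_drv * fsw               # gate-drive loss
    return p_cond + p_sw + p_gate

def weighted_loss(fsw):
    return np.sum(usage_weights * np.array([total_loss(fsw, p) for p in operating_points_w]))

res = minimize_scalar(weighted_loss, bounds=(100e3, 1e6), method="bounded")
print(f"loss-optimal switching frequency ~ {res.x / 1e3:.0f} kHz")
```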

  3. Audio frequency modulated RF discharge at atmospheric pressure

    Science.gov (United States)

    Braithwaite, Nicholas; Sutton, Yvonne; Sharp, David; Moore, Jon

    2008-10-01

    An atmospheric pressure RF arc discharge, generated using a low voltage chopper and a Tesla coil resonant at about 300 kHz, forms a stable, silent, flame-like luminous region some 3 mm in diameter and 40 mm long, rooted to the electrodes by visible hot spots. It is known and we have confirmed that audio frequency modulation of the drive voltage makes the discharge act as an audio loudspeaker (tweeter) with its monopole radiation pattern constrained only by the electrodes. Time resolved `total' optical emission reveals an intensity variation that is synchronous with the audio frequency. Electrical characterisation of the high frequency discharge has been carried out. In the steady state, the high frequency arc burns without generating significant quantities of ozone, as determined by a commercial ozone detector. This is consistent with the high gas temperature within the arc, as measured by optical emission spectroscopy of molecular nitrogen. Phase-locked emission measurements illustrate the acoustic coupling.

  4. Audio signal recognition for speech, music, and environmental sounds

    Science.gov (United States)

    Ellis, Daniel P. W.

    2003-10-01

    Human listeners are very good at all kinds of sound detection and identification tasks, from understanding heavily accented speech to noticing a ringing phone underneath music playing at full blast. Efforts to duplicate these abilities on computer have been particularly intense in the area of speech recognition, and it is instructive to review which approaches have proved most powerful, and which major problems still remain. The features and models developed for speech have found applications in other audio recognition tasks, including musical signal analysis, and the problems of analyzing the general ``ambient'' audio that might be encountered by an auditorily endowed robot. This talk will briefly review statistical pattern recognition for audio signals, giving examples in several of these domains. Particular emphasis will be given to common aspects and lessons learned.

  5. Image and audio wavelet integration for home security video compression

    Science.gov (United States)

    Cheng, Yu-Shen; Huang, Gen-Dow

    2002-03-01

    We present a novel wavelet compression algorithm for both audio and image whose quality is acceptable under human perceptual testing. It is well known that the Discrete Wavelet Transform (DWT) provides a global multi-resolution decomposition, which is the key feature for audio and image compression. Experimental simulations show that the proposed audio and image model can satisfy current industrial communication requirements in terms of processing time and compression fidelity. The development of the wavelet-based compression algorithm considers the trade-offs of hardware implementation. As a result, this high-performance video codec can enable compact, low-power, high-speed, portable, cost-effective and lightweight video compression for multimedia and home security applications.

  6. Objective quality measurement for audio time-scale modification

    Science.gov (United States)

    Liu, Fang; Lee, Jae-Joon; Kuo, C. C. J.

    2003-11-01

    The recent ITU-T Recommendation P.862, known as the Perceptual Evaluation of Speech Quality (PESQ), is an objective end-to-end speech quality assessment method for telephone networks and speech codecs based on the measurement of received audio quality. To ensure that certain network distortions will not affect the estimated subjective measurement determined by PESQ, the algorithm takes into account packet loss and short-term and long-term time warping resulting from delay variation. However, PESQ does not work well for time-scale audio modification or temporal clipping. We investigated the factors that impact perceived quality when time-scale modification is involved. An objective measurement of time-scale modification is proposed in this research, where the cross-correlation values obtained from time-scale modification synchronization are used to evaluate the quality of a time-scaled audio sequence. The proposed objective measure has been verified by a subjective test.
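
    The sketch below illustrates the kind of cross-correlation quality proxy described: for each frame of the time-scaled signal, the best-matching reference frame within a small search window is located and its peak normalised cross-correlation is recorded, then the scores are averaged. The frame size, search window and averaging are assumptions, not the measure defined in the paper.

```python
# Cross-correlation based quality proxy for time-scale modified audio (sketch).
# Frame/window sizes and the handling of the scale factor are assumptions.
import numpy as np

def tsm_quality(reference, modified, scale, frame=1024, search=512):
    scores = []
    for start in range(0, len(modified) - frame, frame):
        ref_center = int(start / scale)              # expected position in the reference
        lo = max(0, ref_center - search)
        hi = min(len(reference) - frame, ref_center + search)
        if hi <= lo:
            break
        seg = modified[start:start + frame]
        best = 0.0
        for cand in range(lo, hi, 64):               # coarse search over candidate offsets
            ref_seg = reference[cand:cand + frame]
            denom = np.linalg.norm(seg) * np.linalg.norm(ref_seg)
            if denom > 0:
                best = max(best, float(np.dot(seg, ref_seg) / denom))
        scores.append(best)
    return float(np.mean(scores)) if scores else 0.0
```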

  7. One Message, Many Voices: Mobile Audio Counselling in Health Education.

    Science.gov (United States)

    Pimmer, Christoph; Mbvundula, Francis

    2018-01-01

    Health workers' use of counselling information on their mobile phones for health education is a central but little understood phenomenon in numerous mobile health (mHealth) projects in Sub-Saharan Africa. Drawing on empirical data from an interpretive case study in the setting of the Millennium Villages Project in rural Malawi, this research investigates the ways in which community health workers (CHWs) perceive that audio-counselling messages support their health education practice. Three main themes emerged from the analysis: phone-aided audio counselling (1) legitimises the CHWs' use of mobile phones during household visits; (2) helps CHWs to deliver a comprehensive counselling message; (3) supports CHWs in persuading communities to change their health practices. The findings show the complexity and interplay of the multi-faceted, sociocultural, political, and socioemotional meanings associated with audio-counselling use. Practical implications and the demand for further research are discussed.

  8. Audio acquisition and processing system

    OpenAIRE

    Pérez Segurado, Rubén

    2015-01-01

    The objective of this project is the design and implementation of a platform for an audio processing system. The system will receive an analogue audio signal from an audio source, will allow digital processing of that signal, and will generate a processed signal that is sent to external loudspeakers. The processing system will be built with: - a Lattice FPGA device, model MachX02-7000-HE, which will contain all the...

  9. Audio engineering 101 a beginner's guide to music production

    CERN Document Server

    Dittmar, Tim

    2013-01-01

    Audio Engineering 101 is a real-world guide for starting out in the recording industry. If you have the dream, the ideas, the music and the creativity but don't know where to start, then this book is for you! Filled with practical advice on how to navigate the recording world, from an author with first-hand, real-life experience, Audio Engineering 101 will help you succeed in the exciting, but tough and confusing, music industry. Covering all you need to know about the recording process, from the characteristics of sound to a guide to microphones to analog versus digital

  10. Real-time Loudspeaker Distance Estimation with Stereo Audio

    DEFF Research Database (Denmark)

    Nielsen, Jesper Kjær; Gaubitch, Nikolay; Heusdens, Richard

    2015-01-01

    Knowledge on how a number of loudspeakers are positioned relative to a listening position can be used to enhance the listening experience. Usually, these loudspeaker positions are estimated using calibration signals, either audible or psycho-acoustically hidden inside the desired audio signal....... In this paper, we propose to use the desired audio signal instead. Specifically, we treat the case of estimating the distance between two loudspeakers playing back a stereo music or speech signal. In this connection, we develop a real-time maximum likelihood estimator and demonstrate that it has a variance...

  11. Cambridge English First 2 audio CDs : authentic examination papers

    CERN Document Server

    2016-01-01

    Four authentic Cambridge English Language Assessment examination papers for the Cambridge English: First (FCE) exam. These examination papers for the Cambridge English: First (FCE) exam provide the most authentic exam preparation available, allowing candidates to familiarise themselves with the content and format of the exam and to practise useful exam techniques. The Audio CDs contain the recorded material to allow thorough preparation for the Listening paper and are designed to be used with the Student's Book. A Student's Book with or without answers and a Student's Book with answers and downloadable Audio are available separately. These tests are also available as Cambridge English: First Tests 5-8 on Testbank.org.uk

  12. DOA Estimation of Audio Sources in Reverberant Environments

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Nielsen, Jesper Kjær; Heusdens, Richard

    2016-01-01

    Reverberation is well-known to have a detrimental impact on many localization methods for audio sources. We address this problem by imposing a model for the early reflections as well as a model for the audio source itself. Using these models, we propose two iterative localization methods that est...... bias. Our simulation results show that we can estimate the DOA of the desired signal more accurately with this procedure compared to state-of-the-art estimators in both synthetic and real data experiments with reverberation....

  13. Multi Carrier Modulator for Switch-Mode Audio Power Amplifiers

    DEFF Research Database (Denmark)

    Knott, Arnold; Pfaffinger, Gerhard; Andersen, Michael Andreas E.

    2008-01-01

    While switch-mode audio power amplifiers allow compact implementations and high output power levels due to their high power efficiency, they are very well known for creating electromagnetic interference (EMI) with other electronic equipment, in particular radio receivers. Lowering the EMI of switch......-mode audio power amplifiers while keeping the performance measures to excellent levels is therefore of high general interest. A modulator utilizing multiple carrier signals to generate a two level pulse train will be shown in this paper. The performance of the modulator will be compared in simulation...

  14. Audio Watermarking Based on HAS and Neural Networks in DCT Domain

    Directory of Open Access Journals (Sweden)

    Hung-Hsu Tsai

    2003-03-01

    Full Text Available We propose a new intelligent audio watermarking method based on the characteristics of the human auditory system (HAS) and the techniques of neural networks in the DCT domain. The method makes the watermark imperceptible by exploiting the audio masking characteristics of the HAS. Moreover, the method uses a neural network to memorize the relationships between the original audio signals and the watermarked audio signals. Therefore, the method is capable of extracting watermarks without the original audio signals. Finally, experimental results are included to illustrate that the method is robust against common attacks for the copyright protection of digital audio.
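
    A minimal DCT-domain embedding sketch is given below to make the general mechanism concrete: the sign of a mid-frequency DCT coefficient of each audio frame encodes one watermark bit. A fixed embedding strength stands in for the HAS masking model, and extraction here simply re-reads the coefficient sign; the paper's neural-network extractor is not reproduced.

```python
# Minimal DCT-domain audio watermark embed/extract sketch (not the paper's method).
# Frame length, coefficient index and strength are illustrative assumptions.
import numpy as np
from scipy.fft import dct, idct

def embed(audio, bits, frame=1024, coeff=100, strength=0.01):
    out = audio.astype(float).copy()
    for i, bit in enumerate(bits):
        seg = out[i * frame:(i + 1) * frame]
        if len(seg) < frame:
            break
        c = dct(seg, norm="ortho")
        c[coeff] = abs(c[coeff]) if bit else -abs(c[coeff])   # sign encodes the bit
        c[coeff] += strength if bit else -strength            # fixed strength (no HAS model)
        out[i * frame:(i + 1) * frame] = idct(c, norm="ortho")
    return out

def extract(audio, n_bits, frame=1024, coeff=100):
    return [int(dct(audio[i * frame:(i + 1) * frame].astype(float), norm="ortho")[coeff] > 0)
            for i in range(n_bits)]
```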

  15. Migrating Home Computer Audio Waveforms to Digital Objects: A Case Study on Digital Archaeology

    Directory of Open Access Journals (Sweden)

    Mark Guttenbrunner

    2011-03-01

    Full Text Available Rescuing data from inaccessible or damaged storage media for the purpose of preserving the digital data for the long term is one of the dimensions of digital archaeology. With the current pace of technological development, any system can become obsolete in a matter of years and hence the data stored in a specific storage media might not be accessible anymore due to the unavailability of the system to access the media. In order to preserve digital records residing in such storage media, it is necessary to extract the data stored in those media by some means. One early storage medium for home computers in the 1980s was audio tape. The first home computer systems allowed the use of standard cassette players to record and replay data. Audio cassettes are more durable than old home computers when properly stored. Devices playing this medium (i.e. tape recorders) can be found in working condition or can be repaired, as they are usually made out of standard components. By re-engineering the format of the waveform and the file formats, the data on such media can then be extracted from a digitised audio stream and migrated to a non-obsolete format. In this paper we present a case study on extracting the data stored on an audio tape by an early home computer system, namely the Philips Videopac+ G7400. The original data formats were re-engineered and an application was written to support the migration of the data stored on tapes without using the original system. This eliminates the necessity of keeping an obsolete system alive for enabling access to the data on the storage media meant for this system. Two different methods to interpret the data and eliminate possible errors in the tape were implemented and evaluated on original tapes, which were recorded 20 years ago. Results show that with some error correction methods, parts of the tapes are still readable even without the original system. It also implies that it is easier to build solutions while original
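
    To illustrate the general approach of recovering bits from a digitised cassette stream, the sketch below measures the spacing between zero crossings and classifies short and long periods as 0 and 1. This short/long-period scheme is a common home-computer tape convention used here as an assumption; the actual Videopac+ G7400 waveform format re-engineered in the paper is not reproduced.

```python
# Bit extraction from a digitised data-cassette recording via zero-crossing timing.
# The short-period = 0 / long-period = 1 rule and threshold are assumptions.
import numpy as np

def zero_crossings(signal):
    signs = np.signbit(signal)
    return np.flatnonzero(signs[1:] != signs[:-1]) + 1

def decode_bits(signal, sample_rate, threshold_us=400.0):
    """Classify each zero-crossing interval as a 0 (short) or 1 (long) period."""
    idx = zero_crossings(signal)
    periods_us = np.diff(idx) / sample_rate * 1e6
    return (periods_us > threshold_us).astype(int)

# usage (hypothetical): bits = decode_bits(samples, 44100)
```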

  16. Audio frequency pulse code modulation data link using an optical fiber

    Science.gov (United States)

    Blackburn, J. A.

    1981-02-01

    A simple, economical and inherently noise-immune asynchronous digital data link design that uses pulse code modulation and a fiber-optic cable is presented. Suitable for audio and instrumentation applications with typical bandwidths of dc-10 kHz, the system samples input signals at 20 kHz and converts them to a seven-bit binary code for transmission through a 20-foot length of step-index fiber-optic cable. Performance tests of the system, installed in a high-fidelity stereo setup to link a cassette recorder output to an amplifier's AUX input, demonstrated dramatic reductions of the hiss associated with quantization noise.
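
    The quantisation step described (20 kHz sampling, seven-bit codes) can be illustrated as below: samples in [-1, 1] are mapped to 7-bit words and back. The serial framing and the optical transmitter/receiver details of the actual link are outside the scope of this sketch.

```python
# 7-bit PCM quantisation sketch for the described fibre-optic audio link.
import numpy as np

def encode_7bit(samples):
    codes = np.round((np.clip(samples, -1.0, 1.0) + 1.0) * 63.5).astype(np.uint8)
    return np.clip(codes, 0, 127)          # 7-bit code words, 0..127

def decode_7bit(codes):
    return codes.astype(float) / 63.5 - 1.0

t = np.arange(0, 0.01, 1 / 20_000)         # 20 kHz sampling, as in the paper
x = 0.5 * np.sin(2 * np.pi * 440 * t)
restored = decode_7bit(encode_7bit(x))
print("peak quantisation error:", np.max(np.abs(restored - x)))
```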

  17. A conceptual framework for audio-visual museum media

    DEFF Research Database (Denmark)

    Kirkedahl Lysholm Nielsen, Mikkel

    2017-01-01

    and museum studies, existing case studies, and real-life observations, the suggested framework instead stresses particular characteristics of the contextual use of audio-visual media in history museums, such as authenticity, virtuality, interactivity, social context and spatial attributes of the communication...

  18. Towards a universal representation for audio information retrieval and analysis

    DEFF Research Database (Denmark)

    Jensen, Bjørn Sand; Troelsgaard, Rasmus; Larsen, Jan

    2013-01-01

    A fundamental and general representation of audio and music which integrates multi-modal data sources is important for both application and basic research purposes. In this paper we address this challenge by proposing a multi-modal version of the Latent Dirichlet Allocation model which provides a...

  19. A Power Efficient Audio Amplifier Combining Switching and Linear Techniques

    NARCIS (Netherlands)

    van der Zee, Ronan A.R.; van Tuijl, Adrianus Johannes Maria

    1998-01-01

    Integrated Class D audio amplifiers are very power efficient, but require an external filter which prevents further integration. Also due to this filter, large feedback factors are hard to realise, so that the load influences the distortion- and transfer characteristics. The amplifier presented in

  20. The Single- and Multichannel Audio Recordings Database (SMARD)

    DEFF Research Database (Denmark)

    Nielsen, Jesper Kjær; Jensen, Jesper Rindom; Jensen, Søren Holdt

    2014-01-01

    A new single- and multichannel audio recordings database (SMARD) is presented in this paper. The database contains recordings from a box-shaped listening room for various loudspeaker and array types. The recordings were made for 48 different configurations of three different loudspeakers and four...

  1. Streaming Audio and Video: New Challenges and Opportunities for Museums.

    Science.gov (United States)

    Spadaccini, Jim

    Streaming audio and video present new challenges and opportunities for museums. Streaming media is easier to author and deliver to Internet audiences than ever before; digital video editing is commonplace now that the tools--computers, digital video cameras, and hard drives--are so affordable; the cost of serving video files across the Internet…

  2. Utilization of non-linear converters for audio amplification

    DEFF Research Database (Denmark)

    Iversen, Niels Elkjær; Birch, Thomas; Knott, Arnold

    2012-01-01

    The introduction of non-linear converters for audio amplification defeats this limitation. A Cuk converter, designed to deliver an AC peak output voltage twice the supply voltage, is presented in this paper. A 3 V prototype has been developed to prove the concept. The prototype shows that it is possible to achieve...

  3. Audio Quality Assurance : An Application of Cross Correlation

    DEFF Research Database (Denmark)

    Jurik, Bolette Ammitzbøll; Nielsen, Jesper Asbjørn Sindahl

    2012-01-01

    We describe algorithms for automated quality assurance on content of audio files in context of preservation actions and access. The algorithms use cross correlation to compare the sound waves. They are used to do overlap analysis in an access scenario, where preserved radio broadcasts are used...

  4. Audio-Visual Aid in Teaching "Fatty Liver"

    Science.gov (United States)

    Dash, Sambit; Kamath, Ullas; Rao, Guruprasad; Prakash, Jay; Mishra, Snigdha

    2016-01-01

    The use of audio-visual tools to aid medical education is ever on the rise. Our study intends to find the efficacy of a video prepared on "fatty liver," a topic that is often a challenge for pre-clinical teachers, in enhancing cognitive processing and ultimately learning. We prepared a video presentation of 11:36 min, incorporating various…

  5. Video genre categorization and representation using audio-visual information

    Science.gov (United States)

    Ionescu, Bogdan; Seyerlehner, Klaus; Rasche, Christoph; Vertan, Constantin; Lambert, Patrick

    2012-04-01

    We propose an audio-visual approach to video genre classification using content descriptors that exploit audio, color, temporal, and contour information. Audio information is extracted at block-level, which has the advantage of capturing local temporal information. At the temporal structure level, we consider action content in relation to human perception. Color perception is quantified using statistics of color distribution, elementary hues, color properties, and relationships between colors. Further, we compute statistics of contour geometry and relationships. The main contribution of our work lies in harnessing the descriptive power of the combination of these descriptors in genre classification. Validation was carried out on over 91 h of video footage encompassing 7 common video genres, yielding average precision and recall ratios of 87% to 100% and 77% to 100%, respectively, and an overall average correct classification of up to 97%. Also, experimental comparison as part of the MediaEval 2011 benchmarking campaign demonstrated the efficiency of the proposed audio-visual descriptors over other existing approaches. Finally, we discuss a 3-D video browsing platform that displays movies using feature-based coordinates and thus regroups them according to genre.

  6. Audio-haptic interaction in simulated walking experiences

    DEFF Research Database (Denmark)

    Serafin, Stefania

    2011-01-01

    In this paper an overview of the work conducted on audio-haptic physically based simulation and evaluation of walking is provided. This work has been performed in the context of the Natural Interactive Walking (NIW) project, whose goal is to investigate possibilities for the integrated and interc...

  7. Audio-visual materials usage preference among agricultural ...

    African Journals Online (AJOL)

    It was found that respondents preferred radio, television, poster, advert, photographs, specimen, bulletin, magazine, cinema, videotape, chalkboard, and bulletin board as audio-visual materials for extension work. These are the materials that can easily be manipulated and utilized for extension work. Nigerian Journal of ...

  8. Adding Audio Description: Does It Make a Difference?

    Science.gov (United States)

    Schmeidler, Emilie; Kirchner, Corinne

    2001-01-01

    A study involving 111 adults with blindness examined the impact of watching television science programs with and without audio description. Results indicate respondents gained and retained more information from watching programs with description. They reported that the description makes the program more enjoyable, interesting, and informative.…

  9. Auteur Description: From the Director's Creative Vision to Audio Description

    Science.gov (United States)

    Szarkowska, Agnieszka

    2013-01-01

    In this report, the author follows the suggestion that a film director's creative vision should be incorporated into Audio description (AD), a major technique for making films, theater performances, operas, and other events accessible to people who are blind or have low vision. The author presents a new type of AD for auteur and artistic films:…

  10. Phase Synchronization in Human EEG During Audio-Visual Stimulation

    Czech Academy of Sciences Publication Activity Database

    Teplan, M.; Šušmáková, K.; Paluš, Milan; Vejmelka, Martin

    2009-01-01

    Roč. 28 (2009), pp. 80-84. ISSN 1536-8378. Grant (other): bilateral project between the Slovak AS and the AS CR (CZ-SK), "Modern methods for evaluation of electrophysiological signals". Source of funding: other public sources. Keywords: synchronization; EEG; wavelet; audio-visual stimulation. Subject RIV: FH - Neurology. Impact factor: 0.729, year: 2009

  11. The Role of Audio Media in the Lives of Children.

    Science.gov (United States)

    Christenson, Peter G.; Lindlof, Thomas R.

    Mass communication researchers have largely ignored the role of audio media and popular music in the lives of children, yet the available evidence shows that children do listen. Extant studies yield a consistent developmental portrait of children's listening frequency, but there is a notable lack of programmatic research over the past decade, one…

  12. Audio effects on haptics perception during drilling simulation

    Directory of Open Access Journals (Sweden)

    Yair Valbuena

    2017-06-01

    Full Text Available Virtual reality has provided immersion and interaction through computer-generated environments that attempt to reproduce real-life experiences through sensory stimuli. Realism can be achieved through multimodal interactions, which can enhance the user's presence within the computer-generated world. The most notable advances in virtual reality can be seen in computer graphics, where photorealism is the norm, striving to overcome the uncanny valley. Other advances have followed related to sound, haptics, and, to a lesser extent, smell and taste feedback. Currently, virtual reality systems (multimodal immersion and interaction through visual, haptic and sound feedback) are being used extensively in entertainment (e.g., cinema, video games, art) and in non-entertainment scenarios (e.g., social inclusion, education, training, therapy, and tourism). Moreover, the cost reduction of virtual reality technologies has resulted in the consumer-level availability of various haptic, headset, and motion-tracking devices. Current consumer-level devices offer low-fidelity experiences due to the properties of the sensors, displays, and other electro-mechanical components, which may not be suitable for high-precision or realistic experiences requiring dexterity. However, research has been conducted on how to overcome or compensate for the lack of high fidelity and still provide an engaging user experience using storytelling, multimodal interactions and gaming elements. Our work focuses on analyzing the possible effects of auditory perception on haptic feedback within a drilling scenario. Drilling involves multimodal interactions and it is a task with multiple applications in medicine, crafting, and construction. We compare two drilling scenarios where two groups of participants had to drill through wood while listening to contextual and non-contextual audio. We gathered their perceptions using a survey after task completion. From the results, we believe that sound does

  13. MedlinePlus FAQ: Is audio description available for videos on MedlinePlus?

    Science.gov (United States)

    ... https://medlineplus.gov/faq/audiodescription.html Question: Is audio description available for videos on MedlinePlus? Answer: Audio description of videos helps make the content of ...

  14. Overview of the 2015 Workshop on Speech, Language and Audio in Multimedia

    NARCIS (Netherlands)

    Gravier, Guillaume; Jones, Gareth J.F.; Larson, Martha; Ordelman, Roeland J.F.

    2015-01-01

    The Workshop on Speech, Language and Audio in Multimedia (SLAM) positions itself at at the crossroad of multiple scientific fields - music and audio processing, speech processing, natural language processing and multimedia - to discuss and stimulate research results, projects, datasets and

  15. Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion.

    Science.gov (United States)

    Gebru, Israel D; Ba, Sileye; Li, Xiaofei; Horaud, Radu

    2018-05-01

    Speaker diarization consists of assigning speech signals to people engaged in a dialogue. An audio-visual spatiotemporal diarization model is proposed. The model is well suited for challenging scenarios that consist of several participants engaged in multi-party interaction while they move around and turn their heads towards the other participants rather than facing the cameras and the microphones. Multiple-person visual tracking is combined with multiple speech-source localization in order to tackle the speech-to-person association problem. The latter is solved within a novel audio-visual fusion method on the following grounds: binaural spectral features are first extracted from a microphone pair, then a supervised audio-visual alignment technique maps these features onto an image, and finally a semi-supervised clustering method assigns binaural spectral features to visible persons. The main advantage of this method over previous work is that it processes in a principled way speech signals uttered simultaneously by multiple persons. The diarization itself is cast into a latent-variable temporal graphical model that infers speaker identities and speech turns, based on the output of an audio-visual association process, executed at each time slice, and on the dynamics of the diarization variable itself. The proposed formulation yields an efficient exact inference procedure. A novel dataset, that contains audio-visual training data as well as a number of scenarios involving several participants engaged in formal and informal dialogue, is introduced. The proposed method is thoroughly tested and benchmarked with respect to several state-of-the art diarization algorithms.

  16. Syllable Congruency of Audio-Visual Speech Stimuli Facilitates the Spatial Ventriloquism Only with Bilateral Visual Presentations

    Directory of Open Access Journals (Sweden)

    Shoko Kanaya

    2011-10-01

    Full Text Available Spatial ventriloquism refers to a shift in the perceived location of a sound toward a synchronized visual stimulus. It has been assumed to reflect early processes uninfluenced by cognitive factors such as syllable congruency between audio-visual speech stimuli. Conventional experiments have examined compelling situations which typically entail pairs of single audio and visual stimuli to be bound. However, in natural environments our multisensory system has to select the relevant sensory signals to be bound from among adjacent stimuli. This selection process may depend upon higher (cognitive) mechanisms. We investigated whether a cognitive factor affects the size of the ventriloquism when an additional visual stimulus is presented with a conventional audio-visual pair. Participants were presented with a set of audio-visual speech stimuli, comprising one or two bilateral movies of a person uttering single syllables together with recordings of this person speaking the same syllables. One of the movies and the speech sound were combined in either congruent or incongruent ways. Participants had to identify sound locations. Results show that syllable congruency affected the size of the ventriloquism only when two movies were presented simultaneously. The selection of a relevant stimulus pair among two or more candidates can be regulated by some higher processes.

  17. On the definition of adapted audio/video profiles for high-quality video calling services over LTE/4G

    Science.gov (United States)

    Ndiaye, Maty; Quinquis, Catherine; Larabi, Mohamed Chaker; Le Lay, Gwenael; Saadane, Hakim; Perrine, Clency

    2014-01-01

    During the last decade, the important advances and widespread availability of mobile technology (operating systems, GPUs, terminal resolution and so on) have encouraged a fast development of voice and video services like video-calling. While multimedia services have largely grown on mobile devices, the resulting increase in data consumption is leading to the saturation of mobile networks. In order to provide data at high bit-rates and maintain performance as close as possible to traditional networks, the 3GPP (The 3rd Generation Partnership Project) worked on a high-performance standard for mobile called Long Term Evolution (LTE). In this paper, we aim at expressing recommendations related to audio and video media profiles (selection of audio and video codecs, bit-rates, frame-rates, audio and video formats) for a typical video-calling service held over LTE/4G mobile networks. These profiles are defined according to the targeted devices (smartphones, tablets), so as to ensure the best possible quality of experience (QoE). The obtained results indicate that for the CIF format (352 x 288 pixels), which is usually used for smartphones, the VP8 codec provides better image quality than the H.264 codec at low bitrates (from 128 to 384 kbps). However, for sequences with high motion, H.264 in slow mode is preferred. Regarding audio, better results are globally achieved using wideband codecs offering good quality, except for the Opus codec (at 12.2 kbps).

  18. ARC Code TI: SLAB Spatial Audio Renderer

    Data.gov (United States)

    National Aeronautics and Space Administration — SLAB is a software-based, real-time virtual acoustic environment rendering system being developed as a tool for the study of spatial hearing. SLAB is designed to...

  19. Audio-visual onset differences are used to determine syllable identity for ambiguous audio-visual stimulus pairs

    NARCIS (Netherlands)

    Ten Oever, Sanne; Sack, Alexander T; Wheat, Katherine L; Bien, Nina; van Atteveldt, Nienke

    2013-01-01

    Content and temporal cues have been shown to interact during audio-visual (AV) speech identification. Typically, the most reliable unimodal cue is used more strongly to identify specific speech features; however, visual cues are only used if the AV stimuli are presented within a certain temporal

  20. 47 CFR 73.9005 - Compliance requirements for covered demodulator products: Audio.

    Science.gov (United States)

    2010-10-01

    47 CFR § 73.9005 - Compliance requirements for covered demodulator products: Audio (Telecommunication, Federal Communications Commission). Except as otherwise provided in §§ 73.9003(a) or 73.9004(a), covered demodulator products shall not output the audio portions of...

  1. 76 FR 591 - Determination of Rates and Terms for Preexisting Subscription and Satellite Digital Audio Radio...

    Science.gov (United States)

    2011-01-05

    ... Determination of Rates and Terms for Preexisting Subscription and Satellite Digital Audio Radio Services. AGENCY: ... satellite digital audio radio services for the digital performance of sound recordings and the making of ... both preexisting subscription services ("PSS") and satellite digital audio radio services ...

  2. Effects of Hearing Protection Device Attenuation on Unmanned Aerial Vehicle (UAV) Audio Signatures

    Science.gov (United States)

    2016-03-01

    Effects of Hearing Protection Device Attenuation on Unmanned Aerial Vehicle (UAV) Audio Signatures, by Melissa Bezandry, Adrienne Raglin, and John Noble (Army Research Laboratory). Approved for public release; distribution ...

  3. Responding Effectively to Composition Students: Comparing Student Perceptions of Written and Audio Feedback

    Science.gov (United States)

    Bilbro, J.; Iluzada, C.; Clark, D. E.

    2013-01-01

    The authors compared student perceptions of audio and written feedback in order to assess what types of students may benefit from receiving audio feedback on their essays rather than written feedback. Many instructors previously have reported the advantages they see in audio feedback, but little quantitative research has been done on how the…

  4. 16 CFR 307.8 - Requirements for disclosure in audiovisual and audio advertising.

    Science.gov (United States)

    2010-01-01

    16 CFR § 307.8 - Requirements for disclosure in audiovisual and audio advertising (Commercial Practices, Federal Trade Commission regulations under the ... Act of 1986, Advertising Disclosures). ... and in graphics so that it is easily legible. If the advertisement has an audio component, the warning ...

  5. Interactive 3D audio: Enhancing awareness of details in immersive soundscapes?

    DEFF Research Database (Denmark)

    Schmidt, Mikkel Nørgaard; Schwartz, Stephen; Larsen, Jan

    2012-01-01

    Spatial audio and the possibility of interacting with the audio environment is thought to increase listeners' attention to details in a soundscape. This work examines if interactive 3D audio enhances listeners' ability to recall details in a soundscape. Nine different soundscapes were constructed...

  6. Parametric Packet-Layer Model for Evaluating Audio Quality in Multimedia Streaming Services

    Science.gov (United States)

    Egi, Noritsugu; Hayashi, Takanori; Takahashi, Akira

    We propose a parametric packet-layer model for monitoring audio quality in multimedia streaming services such as Internet protocol television (IPTV). This model estimates audio quality of experience (QoE) on the basis of quality degradation due to coding and packet loss of an audio sequence. The input parameters of this model are audio bit rate, sampling rate, frame length, packet-loss frequency, and average burst length. Audio bit rate, packet-loss frequency, and average burst length are calculated from header information in received IP packets. For sampling rate, frame length, and audio codec type, the values or the names used in monitored services are input into this model directly. We performed a subjective listening test to examine the relationships between these input parameters and perceived audio quality. The codec used in this test was the Advanced Audio Codec-Low Complexity (AAC-LC), which is one of the international standards for audio coding. On the basis of the test results, we developed an audio quality evaluation model. The verification results indicate that audio quality estimated by the proposed model has a high correlation with perceived audio quality.
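
    The general shape of such a parametric packet-layer estimator is sketched below: a coding-quality term driven by bit rate, degraded by an impairment term driven by packet-loss frequency and mean burst length. Every coefficient and functional form here is a placeholder chosen for illustration; the fitted model from the paper is not reproduced.

```python
# Illustrative parametric packet-layer audio quality estimator (placeholder model,
# not the coefficients or functional forms fitted in the paper).
import math

def estimate_audio_mos(bitrate_kbps, loss_freq_per_s, avg_burst_len,
                       a=1.0, b=3.8, c=24.0, d=0.6, e=0.3):
    q_coding = a + b * (1.0 - math.exp(-bitrate_kbps / c))        # saturating in bit rate
    impairment = 1.0 / (1.0 + d * loss_freq_per_s * (1.0 + e * (avg_burst_len - 1.0)))
    return 1.0 + (q_coding - 1.0) * impairment                    # keep MOS >= 1

print(estimate_audio_mos(bitrate_kbps=64, loss_freq_per_s=0.5, avg_burst_len=2))
```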

  7. GaN Power Stage for Switch-mode Audio Amplification

    DEFF Research Database (Denmark)

    Ploug, Rasmus Overgaard; Knott, Arnold; Poulsen, Søren Bang

    2015-01-01

    N FETs. This project seeks to investigate the possibilities of using eGaN FETs as the power switching device in a full bridge power stage intended for switch mode audio amplification. A 50 W 1 MHz power stage was built and provided promising audio performance. Future work includes optimization of dead...... time and investigation of switching frequency versus audio performance....

  8. Computationally Efficient Amplitude Modulated Sinusoidal Audio Coding using Frequency-Domain Linear Prediction

    DEFF Research Database (Denmark)

    Christensen, M. G.; Jensen, Søren Holdt

    2006-01-01

    A method for amplitude modulated sinusoidal audio coding is presented that has low complexity and low delay. This is based on a subband processing system, where, in each subband, the signal is modeled as an amplitude modulated sum of sinusoids. The envelopes are estimated using frequency......-domain linear prediction and the prediction coefficients are quantized. As a proof of concept, we evaluate different configurations in a subjective listening test, and this shows that the proposed method offers significant improvements in sinusoidal coding. Furthermore, the properties of the frequency...
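
    A compact illustration of frequency-domain linear prediction (FDLP) is given below: linear prediction applied to the DCT of a (sub-band) time segment yields an all-pole approximation of that segment's temporal envelope. The sub-band filter bank, quantisation and coding stages of the described system are omitted, and the LPC order is an assumption.

```python
# Frequency-domain linear prediction (FDLP) temporal-envelope sketch.
# Sub-band decomposition and quantisation are omitted; the LPC order is assumed.
import numpy as np
from scipy.fft import dct
from scipy.linalg import solve_toeplitz

def lpc(x, order):
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]
    a = solve_toeplitz((r[:-1], r[:-1]), -r[1:])      # autocorrelation-method LPC
    return np.concatenate(([1.0], a)), r[0] + np.dot(a, r[1:])

def fdlp_envelope(segment, order=20):
    coeffs = dct(segment, norm="ortho")                # time -> transform domain
    a, err = lpc(coeffs, order)
    n = np.arange(len(segment))
    w = np.exp(-2j * np.pi * np.outer(n, np.arange(order + 1)) / (2 * len(segment)))
    return np.sqrt(max(err, 1e-12)) / np.abs(w @ a)    # all-pole temporal envelope

# usage (hypothetical): env = fdlp_envelope(x_subband_segment)
```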

  9. An interactive audio-visual installation using ubiquitous hardware and web-based software deployment

    Directory of Open Access Journals (Sweden)

    Tiago Fernandes Tavares

    2015-05-01

    Full Text Available This paper describes an interactive audio-visual musical installation, namely MOTUS, that aims at being deployed using low-cost hardware and software. This was achieved by writing the software as a web application and using only hardware that is built into most modern personal computers. This scenario implies specific technical restrictions, which lead to solutions combining both technical and artistic aspects of the installation. The resulting system is versatile and can be freely used from any computer with Internet access. Spontaneous feedback from the audience has shown that the provided experience is interesting and engaging, regardless of the use of minimal hardware.

  10. Auditory cross-modal reorganization in cochlear implant users indicates audio-visual integration

    Directory of Open Access Journals (Sweden)

    Maren Stropahl

    2017-01-01

    Full Text Available There is clear evidence for cross-modal cortical reorganization in the auditory system of post-lingually deafened cochlear implant (CI) users. A recent report suggests that moderate sensori-neural hearing loss is already sufficient to initiate corresponding cortical changes. To what extent these changes are deprivation-induced or related to sensory recovery is still debated. Moreover, the influence of cross-modal reorganization on CI benefit is also still unclear. While reorganization during deafness may impede speech recovery, reorganization also has beneficial influences on face recognition and lip-reading. As CI users were observed to show differences in multisensory integration, the question arises whether cross-modal reorganization is related to audio-visual integration skills. The current electroencephalography study investigated cortical reorganization in experienced post-lingually deafened CI users (n = 18), untreated mildly to moderately hearing impaired individuals (n = 18) and normal hearing controls (n = 17). Cross-modal activation of the auditory cortex, by means of EEG source localization in response to human faces, and audio-visual integration, quantified with the McGurk illusion, were measured. CI users revealed stronger cross-modal activations compared to age-matched normal hearing individuals. Furthermore, CI users showed a relationship between cross-modal activation and audio-visual integration strength. This may further support a beneficial relationship between cross-modal activation and daily-life communication skills that may not be fully captured by laboratory-based speech perception tests. Interestingly, hearing impaired individuals showed behavioral and neurophysiological results that were numerically between the other two groups, and they showed a moderate relationship between cross-modal activation and the degree of hearing loss. This further supports the notion that auditory deprivation evokes a reorganization of the

  11. Audio-Biofeedback training for posture and balance in Patients with Parkinson's disease

    Directory of Open Access Journals (Sweden)

    Zijlstra Wiebren

    2011-06-01

    Full Text Available Abstract Background Patients with Parkinson's disease (PD) suffer from dysrhythmic and disturbed gait, impaired balance, and decreased postural responses. These alterations lead to falls, especially as the disease progresses. Based on the observation that postural control improved in patients with vestibular dysfunction after audio-biofeedback training, we tested the feasibility and effects of this training modality in patients with PD. Methods Seven patients with PD were included in a pilot study comprising a six-week intervention program. The training was individualized to each patient's needs and was delivered using an audio-biofeedback (ABF) system with headphones. The training was focused on improving posture, sit-to-stand abilities, and dynamic balance in various positions. Non-parametric statistics were used to evaluate training effects. Results The ABF system was well accepted by all participants, with no adverse events reported. Patients declared high satisfaction with the training. A significant improvement of balance, as assessed by the Berg Balance Scale, was observed (improvement of 3%, p = 0.032), and a trend in the Timed Up and Go test (improvement of 11%; p = 0.07) was also seen. In addition, the training appeared to have a positive influence on psychosocial aspects of the disease as assessed by the Parkinson's disease quality of life questionnaire (PDQ-39) and on the level of depression as assessed by the Geriatric Depression Scale. Conclusions This is, to our knowledge, the first report demonstrating that audio-biofeedback training for patients with PD is feasible and is associated with improvements of balance and several psychosocial aspects.

  12. “Wrapping” X3DOM around Web Audio API

    Directory of Open Access Journals (Sweden)

    Andreas Stamoulias

    2015-12-01

    Full Text Available Spatial sound plays an important role in Web3D environments due to the highly realistic scenes it can provide. Recent efforts have concentrated on extending X3D/X3DOM with spatial sound attributes. This paper presents a novel method for the introduction of spatial sound components into the X3DOM framework, based on the X3D specification and the Web Audio API. The proposed method incorporates enhanced sound nodes for X3DOM which are derived from the implementation of the X3D standard components, enriched with additional features of the Web Audio API. Moreover, several example scenarios were developed for the evaluation of our approach. The implemented examples establish the feasibility of the newly registered X3DOM nodes for spatial sound characteristics in Web3D virtual worlds.

  13. Audio teleconferencing: creative use of a forgotten innovation.

    Science.gov (United States)

    Mather, Carey; Marlow, Annette

    2012-06-01

    As part of a regional School of Nursing and Midwifery's commitment to addressing recruitment and retention issues, approximately 90% of second year undergraduate student nurses undertake clinical placements at: multipurpose centres; regional or district hospitals; aged care; or community centres based in rural and remote regions within the State. The remaining 10% undertake professional experience placement in urban areas only. This placement of a large cohort of students, in low numbers in a variety of clinical settings, initiated the need to provide consistent support to both students and staff at these facilities. Subsequently the development of an audio teleconferencing model of clinical facilitation to guide student teaching and learning and to provide support to registered nurse preceptors in clinical practice was developed. This paper draws on Weimer's 'Personal Accounts of Change' approach to describe, discuss and evaluate the modifications that have occurred since the inception of this audio teleconferencing model (Weimer, 2006).

  14. Digital audio recordings improve the outcomes of patient consultations

    DEFF Research Database (Denmark)

    Wolderslund, Maiken; Kofoed, Poul-Erik; Holst, René

    2017-01-01

    OBJECTIVES: To investigate the effects on patients' outcome of the consultations when provided with: a Digital Audio Recording (DAR) of the consultation and a Question Prompt List (QPL). METHODS: This is a three-armed randomised controlled cluster trial. One group of patients received standard care......, while the other two groups received either the QPL in combination with a recording of their consultation or only the recording. Patients from four outpatient clinics participated: Paediatric, Orthopaedic, Internal Medicine, and Urology. The effects were evaluated by patient-administered questionnaires...... of their consultation positively influences the patients' perception of having adequate information after the consultation. PRACTICE IMPLICATIONS: The implementation of a QPL and audio recording of consultations should be considered in routine practice....

  15. Exploiting Acoustic Similarity of Propagating Paths for Audio Signal Separation

    Directory of Open Access Journals (Sweden)

    Yin Bin

    2003-01-01

    Full Text Available Blind signal separation can easily find its position in audio applications where mutually independent sources need to be separated from their microphone mixtures while both room acoustics and sources are unknown. However, the conventional separation algorithms can hardly be implemented in real time due to the high computational complexity. The computational load is mainly caused by either direct or indirect estimation of thousands of acoustic parameters. Aiming at complexity reduction, in this paper, the acoustic paths are investigated through an acoustic similarity index (ASI). Then a new mixing model is proposed. With closely spaced microphones (5–10 cm apart), the model relieves the computational load of the separation algorithm by reducing the number and length of the filters to be adjusted. To cope with real situations, a blind audio signal separation algorithm (BLASS) is developed on the proposed model. BLASS only uses second-order statistics (SOS) and performs efficiently in the frequency domain.

  16. A Robust Zero-Watermarking Algorithm for Audio

    Directory of Open Access Journals (Sweden)

    Jie Zhu

    2008-03-01

    Full Text Available In traditional watermarking algorithms, the insertion of the watermark into the host signal inevitably introduces some perceptible quality degradation. Another problem is the inherent conflict between imperceptibility and robustness. Zero-watermarking techniques can solve these problems successfully. Instead of embedding a watermark, the zero-watermarking technique extracts some essential characteristics from the host signal and uses them for watermark detection. However, most of the available zero-watermarking schemes are designed for still images and their robustness is not satisfactory. In this paper, an efficient and robust zero-watermarking technique for audio signals is presented. The multiresolution characteristic of the discrete wavelet transform (DWT), the energy compression characteristic of the discrete cosine transform (DCT), and the Gaussian noise suppression property of higher-order cumulants are combined to extract essential features from the host audio signal, which are then used for watermark recovery. Simulation results demonstrate the effectiveness of our scheme in terms of inaudibility, detection reliability, and robustness.
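
    The sketch below illustrates the zero-watermarking idea in its DWT+DCT flavour: approximation coefficients are transformed with a DCT and their energies are binarised into a robust pattern, which is combined with the watermark to form a detection key instead of modifying the audio. The higher-order-cumulant stage and the paper's exact binarisation rule are not reproduced, and PyWavelets is assumed to be available.

```python
# Zero-watermarking feature sketch: DWT approximation -> DCT -> binary pattern,
# XOR-combined with the watermark to form a key (the audio is never modified).
# The cumulant stage and exact binarisation of the paper are not reproduced.
# Requires PyWavelets (pywt); the audio must be long enough for 2*n_bits coefficients.
import numpy as np
import pywt
from scipy.fft import dct

def zero_watermark_pattern(audio, n_bits=64, wavelet="db4", level=3):
    approx = pywt.wavedec(audio, wavelet, level=level)[0]   # coarse approximation band
    spec = np.abs(dct(approx, norm="ortho"))
    blocks = spec[:2 * n_bits].reshape(n_bits, 2).mean(axis=1)
    return (blocks > np.median(blocks)).astype(np.uint8)    # robust binary feature

def make_key(audio, watermark_bits):
    bits = np.asarray(watermark_bits, dtype=np.uint8)
    return zero_watermark_pattern(audio, len(bits)) ^ bits

def detect(audio, key):
    return zero_watermark_pattern(audio, len(key)) ^ key     # recovered watermark bits
```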

  17. Underdetermined Blind Audio Source Separation Using Modal Decomposition

    Directory of Open Access Journals (Sweden)

    Abdeldjalil Aïssa-El-Bey

    2007-03-01

    Full Text Available This paper introduces new algorithms for the blind separation of audio sources using modal decomposition. Indeed, audio signals and, in particular, musical signals can be well approximated by a sum of damped sinusoidal (modal) components. Based on this representation, we propose a two-step approach consisting of a signal analysis (extraction of the modal components) followed by a signal synthesis (grouping of the components belonging to the same source) using vector clustering. For the signal analysis, two existing algorithms are considered and compared: namely the EMD (empirical mode decomposition) algorithm and a parametric estimation algorithm using the ESPRIT technique. A major advantage of the proposed method resides in its validity for both instantaneous and convolutive mixtures and its ability to separate more sources than sensors. Simulation results are given to compare and assess the performance of the proposed algorithms.

  18. Underdetermined Blind Audio Source Separation Using Modal Decomposition

    Directory of Open Access Journals (Sweden)

    Aïssa-El-Bey Abdeldjalil

    2007-01-01

    Full Text Available This paper introduces new algorithms for the blind separation of audio sources using modal decomposition. Indeed, audio signals and, in particular, musical signals can be well approximated by a sum of damped sinusoidal (modal) components. Based on this representation, we propose a two-step approach consisting of a signal analysis (extraction of the modal components) followed by a signal synthesis (grouping of the components belonging to the same source) using vector clustering. For the signal analysis, two existing algorithms are considered and compared: namely the EMD (empirical mode decomposition) algorithm and a parametric estimation algorithm using the ESPRIT technique. A major advantage of the proposed method resides in its validity for both instantaneous and convolutive mixtures and its ability to separate more sources than sensors. Simulation results are given to compare and assess the performance of the proposed algorithms.

  19. Audio Visual Media Components in Educational Game for Elementary Students

    Directory of Open Access Journals (Sweden)

    Meilani Hartono

    2016-12-01

    Full Text Available The purpose of this research was to review and implement interactive audio-visual media used in an educational game to improve elementary students' interest in learning mathematics. The game was developed for the desktop platform. The art of the game was set as 2D cartoon art with animation and audio in order to make it more interesting to students. Four mini games were developed based on research in mathematics education. The development method used was the Multimedia Development Life Cycle (MDLC), which consists of requirement, design, development, testing, and implementation phases. The data collection methods used were questionnaires, literature study, and interviews. The conclusion is that elementary students are interested in an educational game that is fun and active (moving objects), with fast-tempo music and cheerful colors like blue. It is hoped that this educational game can serve as an alternative teaching tool combined with conventional teaching methods.

  20. Amplitude Modulated Sinusoidal Signal Decomposition for Audio Coding

    DEFF Research Database (Denmark)

    Christensen, M. G.; Jacobson, A.; Andersen, S. V.

    2006-01-01

    In this paper, we present a decomposition for sinusoidal coding of audio, based on an amplitude modulation of sinusoids via a linear combination of arbitrary basis vectors. The proposed method, which incorporates a perceptual distortion measure, is based on a relaxation of a nonlinear least......-squares minimization. Rate-distortion curves and listening tests show that, compared to a constant-amplitude sinusoidal coder, the proposed decomposition offers perceptually significant improvements in critical transient signals....

  1. Entropy coding of Quantized Spectral Components in FDLP audio codec

    OpenAIRE

    Motlicek, Petr; Ganapathy, Sriram; Hermansky, Hynek

    2008-01-01

    An audio codec based on Frequency Domain Linear Prediction (FDLP) exploits auto-regressive modeling to approximate the instantaneous energy in critical frequency sub-bands of relatively long input segments. The current version of the FDLP codec, operating at 66 kbps, has been shown to provide subjective listening quality comparable to state-of-the-art codecs at similar bit-rates, even without employing strategic blocks such as entropy coding or simultaneous masking. This paper describes an experime...

  2. Studies on a Spatialized Audio Interface for Sonar

    Science.gov (United States)

    2011-10-03

    addition of spatialized audio to visual displays for sonar is much akin to the development of talking movies in the early days of cinema and can be ... exclusion of all others. This is a very different use of "space" when compared with the much broader and substantially older literature in spatial cognition ... real-world scenarios after first showing that the algorithm, as published in the open literature, introduces substantial unwanted artifacts into

  3. Investigation of an AGC for Audio Applications

    DEFF Research Database (Denmark)

    Haerizadeh, Seyediman; Jørgensen, Ivan Harald Holger; Marker-Villumsen, Niels

    2015-01-01

    An investigation of an amplifier with a discrete-time Automatic Gain Control (AGC), intended for implementation in a hearing aid, is performed. The aim of this investigation is to find the AGC's minimum gain step size for which the glitches become inaudible. Such AGCs produce undesirable...... glitches at the output, turning into audible sound effects. In order to find this minimum gain step size, both objective and subjective evaluation methods have been used. The investigations show that the objective measures indicate a lower limit for the step size where the sound artefacts are no longer...... audible. This is in contrast with the subjective method, where several test persons can hear the sound artefacts for all step sizes. Thus, the investigated AGC is not suitable for IC implementation, and an alternative AGC system is therefore proposed....
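
    The sketch below shows the kind of stepped-gain behaviour under discussion: a discrete-time AGC moves its gain toward a target level one quantised step per block, and large steps produce the abrupt level changes ("glitches") whose audibility the study evaluates. The block size, target level and attack behaviour are assumptions, not the investigated design.

```python
# Discrete-time AGC with quantised gain steps (illustrative, not the studied design).
import numpy as np

def stepped_agc(x, fs, step_db=1.0, target_dbfs=-20.0, block_ms=1.0):
    block = max(1, int(fs * block_ms / 1000))
    gain_db = 0.0
    out = np.empty_like(x, dtype=float)
    for start in range(0, len(x), block):
        seg = x[start:start + block]
        rms = np.sqrt(np.mean(seg**2)) + 1e-12
        error_db = target_dbfs - (20 * np.log10(rms) + gain_db)
        gain_db += np.clip(error_db, -step_db, step_db)     # move one quantised step
        out[start:start + len(seg)] = seg * 10 ** (gain_db / 20)
    return out

# usage (hypothetical): y = stepped_agc(x, 44100, step_db=3.0)  # larger steps -> more audible glitches
```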

  4. Real-Time Transmission and Storage of Video, Audio, and Health Data in Emergency and Home Care Situations

    Directory of Open Access Journals (Sweden)

    Riccardo Stagnaro

    2007-01-01

    Full Text Available The increase in the availability of bandwidth for wireless links, network integration, and the computational power on fixed and mobile platforms at affordable costs allows nowadays for the handling of audio and video data, their quality making them suitable for medical application. These information streams can support both continuous monitoring and emergency situations. According to this scenario, the authors have developed and implemented the mobile communication system which is described in this paper. The system is based on ITU-T H.323 multimedia terminal recommendation, suitable for real-time data/video/audio and telemedical applications. The audio and video codecs, respectively, H.264 and G723.1, were implemented and optimized in order to obtain high performance on the system target processors. Offline media streaming storage and retrieval functionalities were supported by integrating a relational database in the hospital central system. The system is based on low-cost consumer technologies such as general packet radio service (GPRS) and wireless local area network (WLAN or WiFi) for lowband data/video transmission. Implementation and testing were carried out for medical emergency and telemedicine application. In this paper, the emergency case study is described.

  5. Real-Time Transmission and Storage of Video, Audio, and Health Data in Emergency and Home Care Situations

    Science.gov (United States)

    Barbieri, Ivano; Lambruschini, Paolo; Raggio, Marco; Stagnaro, Riccardo

    2007-12-01

    The increase in the availability of bandwidth for wireless links, network integration, and the computational power on fixed and mobile platforms at affordable costs nowadays allows for the handling of audio and video data, their quality making them suitable for medical applications. These information streams can support both continuous monitoring and emergency situations. According to this scenario, the authors have developed and implemented the mobile communication system which is described in this paper. The system is based on the ITU-T H.323 multimedia terminal recommendation, suitable for real-time data/video/audio and telemedical applications. The video and audio codecs, respectively H.264 and G.723.1, were implemented and optimized in order to obtain high performance on the system target processors. Offline media streaming storage and retrieval functionalities were supported by integrating a relational database in the hospital central system. The system is based on low-cost consumer technologies such as general packet radio service (GPRS) and wireless local area network (WLAN or WiFi) for lowband data/video transmission. Implementation and testing were carried out for medical emergency and telemedicine applications. In this paper, the emergency case study is described.

  6. Sounding better: fast audio cues increase walk speed in treadmill-mediated virtual rehabilitation environments.

    Science.gov (United States)

    Powell, Wendy; Stevens, Brett; Hand, Steve; Simmonds, Maureen

    2010-01-01

    Music or sound effects are often used to enhance Virtual Environments, but it is not known how this audio may influence gait speed. This study investigated the influence of audio cue tempo on treadmill walking with and without visual flow. The walking speeds of 11 individuals were recorded during exposure to a range of audio cue rates. There was a significant effect of audio tempo without visual flow, with a 16% increase in walk speed with faster audio cue tempos. Audio with visual flow resulted in a smaller but still significant increase in walking speed (8%). The results suggest that the inclusion of faster rate audio cues may be of benefit in improving walk speed in virtual rehabilitation.

  7. The method of narrow-band audio classification based on universal noise background model

    Science.gov (United States)

    Rui, Rui; Bao, Chang-chun

    2013-03-01

    Audio classification is the basis of content-based audio analysis and retrieval. Conventional classification methods mainly depend on feature extraction over a whole audio clip, which certainly increases the time required for classification. An approach for classifying a narrow-band audio stream based on frame-level feature extraction is presented in this paper. The audio signals are divided into speech, instrumental music, song with accompaniment and noise using Gaussian mixture models (GMM). In order to cope with changing real-world environments, a universal noise background model (UNBM) covering white noise, street noise, factory noise and car interior noise is built. In addition, three feature schemes are considered to optimize feature selection. The experimental results show that the proposed algorithm achieves high accuracy for audio classification, especially under each of the noise backgrounds used, and keeps the classification time below one second.
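
    A minimal sketch of the frame-level GMM classification idea is given below; the feature extraction, the universal noise background model and the trained class models from the paper are not reproduced, and the class names, feature dimension and component counts are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

classes = ["speech", "instrumental", "song", "noise"]
rng = np.random.default_rng(0)
# Stand-in training data: one matrix of MFCC-like feature frames per class.
train = {c: rng.normal(loc=i, size=(500, 13)) for i, c in enumerate(classes)}

# One GMM per class, fitted on that class's feature frames.
models = {c: GaussianMixture(n_components=4, random_state=0).fit(X) for c, X in train.items()}

# Classify a single unknown frame by the highest average log-likelihood.
frame = rng.normal(loc=2, size=(1, 13))
scores = {c: m.score(frame) for c, m in models.items()}
print(max(scores, key=scores.get))
```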

  8. Comparison of Linear Prediction Models for Audio Signals

    Directory of Open Access Journals (Sweden)

    2009-03-01

    While linear prediction (LP) has become immensely popular in speech modeling, it does not seem to provide a good approach for modeling audio signals. This is somewhat surprising, since a tonal signal consisting of a number of sinusoids can be perfectly predicted based on an (all-pole) LP model with a model order that is twice the number of sinusoids. We provide an explanation why this result cannot simply be extrapolated to LP of audio signals. If noise is taken into account in the tonal signal model, a low-order all-pole model appears to be only appropriate when the tonal components are uniformly distributed in the Nyquist interval. Based on this observation, different alternatives to the conventional LP model can be suggested. Either the model should be changed to a pole-zero, a high-order all-pole, or a pitch prediction model, or the conventional LP model should be preceded by an appropriate frequency transform, such as a frequency warping or downsampling. By comparing these alternative LP models to the conventional LP model in terms of frequency estimation accuracy, residual spectral flatness, and perceptual frequency resolution, we obtain several new and promising approaches to LP-based audio modeling.
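
    The claim that a noiseless sum of K sinusoids is perfectly predicted by an all-pole model of order 2K is easy to verify numerically. The short check below uses a covariance-style least-squares LP fit with illustrative frequencies; it is only meant to make that point concrete.

```python
import numpy as np

fs, n = 8000, np.arange(512)
x = np.sin(2 * np.pi * 440 * n / fs) + 0.5 * np.sin(2 * np.pi * 1230 * n / fs)  # two sinusoids

p = 4  # LP order = 2 x number of sinusoids
# Covariance-method LP: predict x[t] from its p previous samples.
X = np.column_stack([x[p - 1 - k:len(x) - 1 - k] for k in range(p)])
a, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
residual = x[p:] - X @ a
print(np.max(np.abs(residual)))  # close to machine precision: essentially perfect prediction
```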

  9. Ears on the hand: reaching 3D audio targets

    Directory of Open Access Journals (Sweden)

    Hanneton Sylvain

    2011-12-01

    We studied the ability of right-handed participants to reach 3D audio targets with their right hand. Our immersive audio environment was based on the OpenAL library and Fastrak magnetic sensors for motion capture. Participants listened to the target through a “virtual” listener linked to a sensor fixed either on the head or on the hand. We compared three experimental conditions in which the virtual listener is on the head, on the left hand, or on the right hand (the one that reaches the target). We show that (1) participants are able to learn the task, but (2) with a low success rate and long movement durations, (3) the individual levels of performance are very variable, and (4) the best performances are achieved when the listener is on the right hand. Consequently, we concluded that our participants were able to learn to locate 3D audio sources even with their ears transposed onto their hand, but we found behavioral differences between the three experimental conditions.

  10. Interfacing a processor core in FPGA to an audio system

    OpenAIRE

    Mateos, José Ignacio

    2006-01-01

    The thesis project consists of developing an interface for a Nios II processor integrated on an Altera board (UP3-2C35F672C6 Cyclone II). The main goal is to show how the Nios II processor can interact with the other components of the board. The Quartus II software has been used to create the VHDL code for the interfaces, compile it and download it onto the board. The Nios II IDE tool is used to build the C/C++ files and download them into the processor. An application has been prepared for t...

  11. Audio-Visual Temporal Recalibration Can be Constrained by Content Cues Regardless of Spatial Overlap.

    Science.gov (United States)

    Roseboom, Warrick; Kawabe, Takahiro; Nishida, Shin'ya

    2013-01-01

    It has now been well established that the point of subjective synchrony for audio and visual events can be shifted following exposure to asynchronous audio-visual presentations, an effect often referred to as temporal recalibration. Recently it was further demonstrated that it is possible to concurrently maintain two such recalibrated estimates of audio-visual temporal synchrony. However, it remains unclear precisely what defines a given audio-visual pair such that it is possible to maintain a temporal relationship distinct from other pairs. It has been suggested that spatial separation of the different audio-visual pairs is necessary to achieve multiple distinct audio-visual synchrony estimates. Here we investigated if this is necessarily true. Specifically, we examined whether it is possible to obtain two distinct temporal recalibrations for stimuli that differed only in featural content. Using both complex (audio visual speech; see Experiment 1) and simple stimuli (high and low pitch audio matched with either vertically or horizontally oriented Gabors; see Experiment 2) we found concurrent, and opposite, recalibrations despite there being no spatial difference in presentation location at any point throughout the experiment. This result supports the notion that the content of an audio-visual pair alone can be used to constrain distinct audio-visual synchrony estimates regardless of spatial overlap.

  12. Audio-visual temporal recalibration can be constrained by content cues regardless of spatial overlap

    Directory of Open Access Journals (Sweden)

    Warrick eRoseboom

    2013-04-01

    It has now been well established that the point of subjective synchrony for audio and visual events can be shifted following exposure to asynchronous audio-visual presentations, an effect often referred to as temporal recalibration. Recently it was further demonstrated that it is possible to concurrently maintain two such recalibrated, and opposing, estimates of audio-visual temporal synchrony. However, it remains unclear precisely what defines a given audio-visual pair such that it is possible to maintain a temporal relationship distinct from other pairs. It has been suggested that spatial separation of the different audio-visual pairs is necessary to achieve multiple distinct audio-visual synchrony estimates. Here we investigated if this was necessarily true. Specifically, we examined whether it is possible to obtain two distinct temporal recalibrations for stimuli that differed only in featural content. Using both complex (audio-visual speech; Experiment 1) and simple stimuli (high and low pitch audio matched with either vertically or horizontally oriented Gabors; Experiment 2), we found concurrent, and opposite, recalibrations despite there being no spatial difference in presentation location at any point throughout the experiment. This result supports the notion that the content of an audio-visual pair can be used to constrain distinct audio-visual synchrony estimates regardless of spatial overlap.

  13. Audio-Tactile Integration in Congenitally and Late Deaf Cochlear Implant Users

    Science.gov (United States)

    Nava, Elena; Bottari, Davide; Villwock, Agnes; Fengler, Ineke; Büchner, Andreas; Lenarz, Thomas; Röder, Brigitte

    2014-01-01

    Several studies conducted in mammals and humans have shown that multisensory processing may be impaired following congenital sensory loss and in particular if no experience is achieved within specific early developmental time windows known as sensitive periods. In this study we investigated whether basic multisensory abilities are impaired in hearing-restored individuals with deafness acquired at different stages of development. To this aim, we tested congenitally and late deaf cochlear implant (CI) recipients, age-matched with two groups of hearing controls, on an audio-tactile redundancy paradigm, in which reaction times to unimodal and crossmodal redundant signals were measured. Our results showed that both congenitally and late deaf CI recipients were able to integrate audio-tactile stimuli, suggesting that congenital and acquired deafness does not prevent the development and recovery of basic multisensory processing. However, we found that congenitally deaf CI recipients had a lower multisensory gain compared to their matched controls, which may be explained by their faster responses to tactile stimuli. We discuss this finding in the context of reorganisation of the sensory systems following sensory loss and the possibility that these changes cannot be “rewired” through auditory reafferentation. PMID:24918766

  14. Chaotic micromixing in open wells using audio-frequency acoustic microstreaming.

    Science.gov (United States)

    Petkovic-Duran, Karolina; Manasseh, Richard; Zhu, Yonggang; Ooi, Andrew

    2009-10-01

    Mixing fluids for biochemical assays is problematic when volumes are very small (on the order of the 10 microL typical of single drops), which has inspired the development of many micromixing devices. In this paper, we show that micromixing is possible in the simple open wells of standard laboratory consumables using appropriate acoustic frequencies that can be applied using cheap, conventional audio components. Earlier work has shown that the phenomenon of acoustic microstreaming can mix fluids, provided that bubbles are introduced into a specially designed microchamber or that high-frequency surface acoustic wave devices are constructed. We demonstrate a key simplification: acoustic micromixing at audio frequencies by ensuring the system has a liquid-air interface with a small radius of curvature. The meniscus of a drop in a small well provided an appropriately small radius, and so an introduced bubble was not necessary. Microstreaming showed improvement over diffusion-based mixing by 1-2 orders of magnitude. Furthermore, significant improvements are attainable through the utilization of chaotic mixing principles, whereby alternating fluid flow patterns are created by applying, in sequence, two different acoustic frequencies to a drop of liquid in an open well.

  15. SEMICONDUCTOR INTEGRATED CIRCUITS: A high-performance, low-power ΣΔ ADC for digital audio applications

    Science.gov (United States)

    Hao, Luo; Yan, Han; Cheung, Ray C. C.; Xiaoxia, Han; Shaoyu, Ma; Peng, Ying; Dazhong, Zhu

    2010-05-01

    A high-performance low-power ΣΔ analog-to-digital converter (ADC) for digital audio applications is described. It consists of a 2-1 cascaded ΣΔ modulator and a decimation filter. Various design optimizations are implemented in the system design, circuit implementation and layout design, including a high-overload-level coefficient-optimized modulator architecture, a power-efficient class A/AB operational transconductance amplifier, as well as a multi-stage decimation filter conserving area and power consumption. The ADC is implemented in the SMIC 0.18-μm CMOS mixed-signal process. The experimental chip achieves a peak signal-to-noise-plus-distortion ratio of 90 dB and a dynamic range of 94 dB over the 22.05-kHz audio band and occupies 2.1 mm2, dissipating only 2.1 mA of quiescent current in the analog circuits.

  16. Music and audio - oh how they can stress your network

    Science.gov (United States)

    Fletcher, R.

    Nearly ten years ago a paper written by the Audio Engineering Society (AES) [1] made a number of interesting statements: the current Internet is inadequate for transmitting music and professional audio; performance and collaboration across a distance stress the quality of service beyond acceptable bounds; and audio and music provide test cases in which the bounds of the network are quickly reached and through which the defects in a network are readily perceived. Given these key points, where are we now? Have we started to solve any of the problems from the musician's point of view? What is it that musicians would like to do that can cause the network so many problems? To understand this we need to appreciate that a trained musician's ears are extremely sensitive to very subtle shifts in temporal material and localisation information; a shift of a few milliseconds can cause difficulties. So, can modern networks provide the temporal accuracy demanded at this level? The sample and bit rates needed to represent music in the digital domain are still contentious, but a general consensus in the professional world is 96 kHz and IEEE 64-bit floating point. If this were to be run between two points on the network across 24 channels in near real time, to allow for collaborative composition, production and performance, with QoS settings demanding near-zero latency and jitter, it can be seen that the network indeed has to perform very well.
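
    A back-of-envelope calculation makes the load concrete. The figures below follow the scenario sketched above (24 channels, 96 kHz, IEEE 64-bit floating point) and ignore any container or protocol overhead:

```python
# Raw bit rate for 24 channels of 96 kHz, 64-bit floating-point audio.
channels, sample_rate, bits_per_sample = 24, 96_000, 64
bitrate_bps = channels * sample_rate * bits_per_sample
print(f"{bitrate_bps / 1e6:.1f} Mbit/s")  # ~147.5 Mbit/s before any overhead
```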

  17. Synthesis of audio spectra using a diffraction model.

    Science.gov (United States)

    Vijayakumar, V; Eswaran, C

    2006-12-01

    It is shown that the intensity variations of an audio signal in the frequency domain can be obtained by using a mathematical function containing a series of weighted complex Bessel functions. With proper choice of values for two parameters, this function can transform an input spectrum of discrete frequencies of unit intensity into the known spectra of different musical instruments. Specific examples of musical instruments are considered for evaluating the performance of this method. It is found that this function yields musical spectra with a good degree of accuracy.
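
    The abstract does not give the exact function, so the snippet below is only a loose illustration of a weighted series of Bessel functions of the first kind evaluated over a frequency axis; the weighting scheme and the two parameters alpha and beta are purely illustrative assumptions, not the authors' formula.

```python
import numpy as np
from scipy.special import jv  # Bessel function of the first kind

def bessel_weighted_spectrum(freq_bins, alpha=1.5, beta=8.0, n_terms=10):
    """Toy weighted Bessel series over a normalized frequency axis."""
    return sum((alpha ** n / (n + 1)) * np.abs(jv(n, beta * freq_bins))
               for n in range(n_terms))

bins = np.linspace(0.0, 1.0, 64)  # normalized frequency axis
print(bessel_weighted_spectrum(bins)[:5])
```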

  18. Audio Source Localization using a Network of Embedded Devices

    Directory of Open Access Journals (Sweden)

    FRANGU, L.

    2010-05-01

    In this paper, a problem of audio source localization is solved using a network of embedded devices. The intensive computing procedures (such as the cross-correlation functions) are performed by the embedded devices, which have enough speed and memory for this task. A central computer computes the position in a fast procedure, using the data transmitted by the network nodes, and plays the role of the operator interface. The paper also contains a description of the embedded devices, which were designed and manufactured by the authors. They prove to be well suited for this kind of application, as they perform fast computation and require low power and little installation space.
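
    A typical building block for the node-level processing described above is a cross-correlation based time-difference-of-arrival (TDOA) estimate between two microphones. The sketch below is a generic illustration, not the authors' embedded implementation:

```python
import numpy as np

def estimate_tdoa(sig_a, sig_b, fs):
    """Estimate the delay of sig_a relative to sig_b (seconds) from the peak
    of their full cross-correlation; positive means sig_a arrives later."""
    corr = np.correlate(sig_a, sig_b, mode='full')
    lag = int(np.argmax(corr)) - (len(sig_b) - 1)
    return lag / fs

# Toy usage: the same noise burst arriving 25 samples later at microphone A.
fs = 48_000
rng = np.random.default_rng(1)
src = rng.normal(size=4096)
mic_b = src
mic_a = np.concatenate([np.zeros(25), src[:-25]])
print(estimate_tdoa(mic_a, mic_b, fs) * 1e6, "microseconds")  # about 521 us
```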

  19. Utilization of non-linear converters for audio amplification

    DEFF Research Database (Denmark)

    Iversen, Niels Elkjær; Birch, Thomas; Knott, Arnold

    2012-01-01

    Class D amplifiers fit automotive demands quite well. The traditional buck-based amplifier has reduced both the cost and size of amplifiers. However, the buck topology is not without its limitations: the maximum peak AC output voltage produced by the power stage is only equal to the supply voltage. The introduction of non-linear converters for audio amplification defeats this limitation. A Cuk converter, designed to deliver an AC peak output voltage of twice the supply voltage, is presented in this paper. A 3 V prototype has been developed to prove the concept. The prototype shows that it is possible to achieve...
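
    For reference, a textbook relation (not taken from the paper's measurements): an ideal Cuk converter in continuous conduction has a voltage-gain magnitude of D/(1-D), so a peak output of twice the supply corresponds to the duty cycle swinging up to about 2/3.

```python
# Ideal (lossless, continuous-conduction) Cuk converter gain: |Vout/Vin| = D / (1 - D).
def cuk_gain(duty):
    return duty / (1.0 - duty)

for d in (0.5, 2 / 3, 0.75):
    print(f"D = {d:.2f} -> |Vout/Vin| = {cuk_gain(d):.2f}")  # D = 2/3 gives a gain of 2
```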

  20. Digital video and audio broadcasting technology a practical engineering guide

    CERN Document Server

    Fischer, Walter

    2010-01-01

    'Digital Video and Audio Broadcasting Technology - A Practical Engineering Guide' deals with all the most important digital television, sound radio and multimedia standards such as MPEG, DVB, DVD, DAB, ATSC, T-DMB, DMB-T, DRM and ISDB-T. The book provides an in-depth look at these subjects in terms of practical experience. In addition it contains chapters on the basics of technologies such as analog television, digital modulation, COFDM or mathematical transformations between time and frequency domains. The attention in the respective field under discussion is focussed on aspects of measuring t...

  1. Utilizing Domain Knowledge in End-to-End Audio Processing

    DEFF Research Database (Denmark)

    Tax, Tycho; Antich, Jose Luis Diez; Purwins, Hendrik

    2017-01-01

    End-to-end neural network based approaches to audio modelling are generally outperformed by models trained on high-level data representations. In this paper we present preliminary work that shows the feasibility of training the first layers of a deep convolutional neural network (CNN) model to learn the commonly-used log-scaled mel-spectrogram transformation. Secondly, we demonstrate that upon initializing the first layers of an end-to-end CNN classifier with the learned transformation, convergence and performance on the ESC-50 environmental sound classification dataset are similar to a CNN...
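
    For context, the fixed target transformation referred to above can be computed directly with a standard library. The sketch below uses illustrative parameter values and a random signal as a stand-in for real audio:

```python
import numpy as np
import librosa

sr = 22_050
y = np.random.randn(sr).astype(np.float32)   # stand-in for one second of audio

# Mel-scaled power spectrogram followed by log (dB) scaling.
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024, hop_length=512, n_mels=64)
log_mel = librosa.power_to_db(mel)

print(log_mel.shape)  # (n_mels, n_frames): the representation the first CNN layers learn to produce
```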

  2. MP3 audio-editing software for the department of radiology

    International Nuclear Information System (INIS)

    Hong Qingfen; Sun Canhui; Li Ziping; Meng Quanfei; Jiang Li

    2006-01-01

    Objective: To evaluate MP3 audio-editing software for the daily work of the department of radiology. Methods: The audio content of the daily consultation seminar, held in the department of radiology every morning, was recorded and converted into MP3 audio format by a computer-integrated recording device. The audio data were edited, archived, and eventually saved on the computer's storage media, from which they could be experimentally replayed and applied in research or teaching. Results: MP3 audio-editing was a simple process and convenient for saving and searching the data. The recordings could be easily replayed. Conclusion: MP3 audio-editing perfectly records and saves the contents of the consultation seminar, and has replaced conventional handwritten notes. It is a valuable tool in both research and teaching in the department. (authors)

  3. WebGL and web audio software lightweight components for multimedia education

    Science.gov (United States)

    Chang, Xin; Yuksel, Kivanc; Skarbek, Władysław

    2017-08-01

    The paper presents the results of our recent work on the development of the contemporary computing platform DC2 for multimedia education using WebGL and Web Audio (the W3C standards). Using the literate programming paradigm, the WEBSA educational tools were developed. They offer the user (student) access to an expandable collection of WebGL shaders and Web Audio scripts. The unique feature of DC2 is the option of literate programming, offered to both the author and the reader in order to improve the interactivity of the lightweight WebGL and Web Audio components. For instance, users can define source audio nodes (including synthetic sources), destination audio nodes, and nodes for audio processing such as sound wave shaping, spectral band filtering, convolution-based modification, etc. In the case of WebGL, besides classic graphics effects based on mesh and fractal definitions, novel image processing and analysis by shaders is offered, such as nonlinear filtering, histograms of gradients, and Bayesian classifiers.

  4. A dictionary learning and source recovery based approach to classify diverse audio sources

    OpenAIRE

    Girish, K V Vijay; Ananthapadmanabha, T V; Ramakrishnan, A G

    2015-01-01

    A dictionary learning based audio source classification algorithm is proposed to classify a sample audio signal as one amongst a finite set of different audio sources. Cosine similarity measure is used to select the atoms during dictionary learning. Based on three objective measures proposed, namely, signal to distortion ratio (SDR), the number of non-zero weights and the sum of weights, a frame-wise source classification accuracy of 98.2% is obtained for twelve different sources. Cent percen...

  5. Adaptive Modulation Approach for Robust MPEG-4 AAC Encoded Audio Transmission

    Science.gov (United States)

    2011-11-01

    ...Codec (AAC) - Main profile with the single channel element (SCE) syntax, and transmit it using the audio data transport stream (ADTS) format. ... audio data over fading wireless channels using Unequal Error Protection based on adaptive modulation and forward error correcting (FEC) codes. ...

  6. Penerapan Audio Amplifier Stereo Untuk Beban Bersama Dan Bergantian Dengan Menggunakan Saklar Ganda Sebagai Pengatur Beban

    OpenAIRE

    Hidayat, Rahmat

    2013-01-01

    An audio amplifier driver functions as a drive stage: it amplifies the power of the input signal and passes it on to the final amplification stage (power amplifier). Audio equipment is very important and its use is very widespread, in particular to enable a person to address a large audience. An audio amplifier, or sound amplifier, is an electronic amplifier used to amplify low-frequency sound signals to a level suitable for...

  7. Voice radio communication, pedestrian localization, and the tactical use of 3D audio

    OpenAIRE

    Nilsson, John-Olof; Schüldt, Christian; Händel, Peter

    2013-01-01

    The relation between voice radio communication and pedestrian localization is studied. 3D audio is identified as a linking technology which brings strong mutual benefits. Voice communication rendered with 3D audio provides a potential low secondary task interference user interface to the localization information. Vice versa, location information in the 3D audio provides spatial cues in the voice communication, improving speech intelligibility. An experimental setup with voice radio communicat...

  8. Audio-Visual Fusion for Sound Source Localization and Improved Attention

    International Nuclear Information System (INIS)

    Lee, Byoung Gi; Choi, Jong Suk; Yoon, Sang Suk; Choi, Mun Taek; Kim, Mun Sang; Kim, Dai Jin

    2011-01-01

    Service robots are equipped with various sensors such as vision cameras, sonar sensors, laser scanners, and microphones. Although these sensors have their own functions, some of them can be made to work together and perform more complicated functions. Audio-visual fusion is a typical and powerful combination of audio and video sensors, because audio information is complementary to visual information and vice versa. Human beings also mainly depend on visual and auditory information in their daily life. In this paper, we conduct two studies using audio-vision fusion: one on enhancing the performance of sound localization, and the other on improving robot attention through sound localization and face detection.

  9. Efficiency of Switch-Mode Power Audio Amplifiers - Test Signals and Measurement Techniques

    DEFF Research Database (Denmark)

    Iversen, Niels Elkjær; Knott, Arnold; Andersen, Michael A. E.

    2016-01-01

    Switch-mode technology is widely used for audio amplification, mainly due to the great efficiency this technology offers. Normally the efficiency of a switch-mode audio amplifier is measured using a sine wave input. However, this paper shows that sine waves represent real audio very poorly. An alternative signal is proposed for test purposes. The efficiency of a switch-mode power audio amplifier is modelled and measured with both a sine wave and the proposed test signal as inputs. The results show that the choice of switching devices with low on-resistances is unfairly favored when measuring...
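
    One way to see why a sine wave is a poor stand-in for real programme material is its low crest factor (peak-to-RMS ratio). The comparison below is a generic illustration; Laplacian noise is only a crude proxy for music, and the paper's proposed test signal is not reproduced here.

```python
import numpy as np

def crest_factor_db(x):
    """Crest factor (peak / RMS) in dB."""
    return 20 * np.log10(np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2)))

sine = np.sin(2 * np.pi * np.arange(48_000) / 48)   # 1 kHz tone sampled at 48 kHz
music_like = np.random.laplace(size=48_000)         # crude stand-in for programme material
print(crest_factor_db(sine))        # ~3 dB
print(crest_factor_db(music_like))  # typically well above 10 dB
```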

  10. Subjective evaluation of four low-complexity audio coding schemes.

    Science.gov (United States)

    Joseph, S M; Maher, R C

    1995-06-01

    In this study the subjective performance of four low-complexity audio data compression methods is compared, operating at nominal bit rates of 2, 3, 4, and 5 bits per sample, applied to four 20-kHz bandwidth, 16-bit-per-sample digitized musical signals. The simple compression schemes compared were elementary differential pulse-code modulation (DPCM), noise feedback coding DPCM (NFC-DPCM), adaptive quantizer DPCM (DPCM-AQB), and a recently proposed method known as recursively indexed quantizer DPCM (RIQ-DPCM). Pairs consisting of a reconstructed signal and a reference signal were presented in a two-interval preference experiment. The reference signals were processed for specified levels of modulated noise reference unit (MNRU) in order to estimate the equality threshold rating (ETR) of the reconstructed audio stimuli. The subjective MNRU values were found to increase by 2-5 dB for each increment in bits per sample. The DPCM-AQB scores were found to be 8-10 dB higher than for DPCM and NFC-DPCM. RIQ-DPCM was rated highest, exceeding the DPCM-AQB results by 2-5 dB in all tests. Objective measurements of segmental signal-to-noise ratio (SNRSEG) for the reconstructed signals predicted a performance level 2-5 dB lower than was actually found in the subjective results, particularly for SNRSEG values below 25 dB.
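
    A minimal first-order DPCM loop (fixed predictor, uniform quantizer) illustrates the family of coders compared above. It is a generic sketch, not one of the four schemes from the study, and the predictor coefficient, bit depth and input scaling are assumptions.

```python
import numpy as np

def dpcm(x, bits=3, a=0.95):
    """Encode/decode in one pass: predict each sample as a * previous
    reconstruction, quantize the prediction error uniformly, reconstruct."""
    step = 2.0 / (2 ** bits)                 # assumes input roughly in [-1, 1]
    recon = np.zeros_like(x)
    prev = 0.0
    for i, s in enumerate(x):
        err = s - a * prev
        q = np.clip(np.round(err / step), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1) * step
        prev = a * prev + q
        recon[i] = prev
    return recon

x = np.sin(2 * np.pi * 440 * np.arange(8000) / 8000)
y = dpcm(x)
print(10 * np.log10(np.sum(x ** 2) / np.sum((x - y) ** 2)), "dB SNR")
```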

  11. Speech and audio processing for coding, enhancement and recognition

    CERN Document Server

    Togneri, Roberto; Narasimha, Madihally

    2015-01-01

    This book describes the basic principles underlying the generation, coding, transmission and enhancement of speech and audio signals, including advanced statistical and machine learning techniques for speech and speaker recognition, with an overview of the key innovations in these areas. Key research undertaken in speech coding, speech enhancement, speech recognition, emotion recognition and speaker diarization is also presented, along with recent advances and new paradigms in these areas. · Offers readers a single-source reference on the significant applications of speech and audio processing to speech coding, speech enhancement and speech/speaker recognition. · Enables readers involved in algorithm development and implementation issues for speech coding to understand the historical development and future challenges in speech coding research. · Discusses speech coding methods yielding bit-streams that are multi-rate and scalable for Voice-over-IP (VoIP) networks. ...

  12. Effectiveness and Comparison of Various Audio Distraction Aids in Management of Anxious Dental Paediatric Patients.

    Science.gov (United States)

    Navit, Saumya; Johri, Nikita; Khan, Suleman Abbas; Singh, Rahul Kumar; Chadha, Dheera; Navit, Pragati; Sharma, Anshul; Bahuguna, Rachana

    2015-12-01

    Dental anxiety is a widespread phenomenon and a concern for paediatric dentistry. The inability of children to deal with threatening dental stimuli often manifests as behaviour management problems. Nowadays, the use of non-aversive behaviour management techniques, which are more acceptable to parents, patients and practitioners, is increasingly advocated. Therefore, the present study was conducted to find out which audio aid was the most effective in managing anxious children. The aim of the present study was to compare the efficacy of audio-distraction aids in reducing the anxiety of paediatric patients undergoing various stressful and invasive dental procedures. The objectives were to ascertain whether audio distraction is an effective means of anxiety management and which type of audio aid is the most effective. A total of 150 children, aged between 6 and 12 years, randomly selected from the patients who came for their first dental check-up, were placed in five groups of 30 each. These groups were the control group, the instrumental music group, the musical nursery rhymes group, the movie songs group and the audio stories group. The control group was treated under a normal set-up, while the audio groups listened to various audio presentations during treatment. Each child had four visits. In each visit, after the procedure was completed, the anxiety levels of the children were measured by Venham's Picture Test (VPT), Venham's Clinical Rating Scale (VCRS) and pulse rate measurement with the help of a pulse oximeter. A significant difference was seen between all the groups for the mean pulse rate, with an increase in subsequent visits. However, no significant difference was seen in the VPT and VCRS scores between the groups. Audio aids in general reduced anxiety in comparison to the control group, and the most significant reduction in anxiety level was observed in the audio stories group. The conclusion derived from the present study was that audio distraction

  13. Comparison of three orientation and mobility aids for individuals with blindness: Verbal description, audio-tactile map and audio-haptic map.

    Science.gov (United States)

    Papadopoulos, Konstantinos; Koustriava, Eleni; Koukourikos, Panagiotis; Kartasidou, Lefkothea; Barouti, Marialena; Varveris, Asimis; Misiou, Marina; Zacharogeorga, Timoclia; Anastasiadis, Theocharis

    2017-01-01

    Disorientation and the inability to find one's way are frequent phenomena for individuals with visual impairments when travelling through novel environments. Orientation and mobility aids can provide important tools for preparing more secure, cognitively mapped travel. The aim of the present study was to examine whether the spatial knowledge structured after an individual with blindness had studied the map of an urban area, delivered through a verbal description, an audio-tactile map or an audio-haptic map, could be used for locating specific points of interest in that area. The effectiveness of the three aids relative to each other was also examined. The results of the present study highlight the effectiveness of the audio-tactile and the audio-haptic maps as orientation and mobility aids, especially when compared to verbal descriptions.

  14. Applying Spatial Audio to Human Interfaces: 25 Years of NASA Experience

    Science.gov (United States)

    Begault, Durand R.; Wenzel, Elizabeth M.; Godfrey, Martine; Miller, Joel D.; Anderson, Mark R.

    2010-01-01

    From the perspective of human factors engineering, the inclusion of spatial audio within a human-machine interface is advantageous from several perspectives. Demonstrated benefits include the ability to monitor multiple streams of speech and non-speech warning tones using a cocktail party advantage, and for aurally-guided visual search. Other potential benefits include the spatial coordination and interaction of multimodal events, and evaluation of new communication technologies and alerting systems using virtual simulation. Many of these technologies were developed at NASA Ames Research Center, beginning in 1985. This paper reviews examples and describes the advantages of spatial sound in NASA-related technologies, including space operations, aeronautics, and search and rescue. The work has involved hardware and software development as well as basic and applied research.

  15. A digital input class-D audio amplifier with sixth-order PWM

    Science.gov (United States)

    Shumeng, Luo; Dongmei, Li

    2013-11-01

    A digital input class-D audio amplifier with a sixth-order pulse-width modulation (PWM) modulator is presented. This modulator moves the PWM generator into the closed sigma-delta modulator loop. The noise and distortions generated at the PWM generator module are suppressed by the high gain of the forward loop of the sigma-delta modulator. Therefore, at the output of the modulator, a very clean PWM signal is acquired for driving the power stage of the class-D amplifier. A sixth-order modulator is designed to balance the performance and the system clock speed. Fabricated in standard 0.18 μm CMOS technology, this class-D amplifier achieves 110 dB dynamic range, 100 dB signal-to-noise ratio, and 0.0056% total harmonic distortion plus noise.
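
    The noise-shaping principle behind such modulators can be shown with a much simpler first-order sigma-delta loop; this is nothing like the paper's sixth-order, PWM-in-loop design, and the sample rate and input below are arbitrary choices. Quantization error is fed back so that it is pushed out of the audio band.

```python
import numpy as np

def first_order_sigma_delta(x):
    """First-order sigma-delta: integrate the difference between the input
    and the previous 1-bit output, then quantize to +/-1."""
    out = np.empty_like(x)
    integ, prev = 0.0, 0.0
    for i, s in enumerate(x):
        integ += s - prev
        prev = 1.0 if integ >= 0.0 else -1.0
        out[i] = prev
    return out

fs = 3_072_000                                   # heavily oversampled relative to the audio band
t = np.arange(0, 0.005, 1 / fs)
x = 0.5 * np.sin(2 * np.pi * 1000 * t)           # 1 kHz tone at half scale
bitstream = first_order_sigma_delta(x)           # low-pass filtering this 1-bit stream recovers the tone
```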

  16. Audio Control Handbook For Radio and Television Broadcasting. Third Revised Edition.

    Science.gov (United States)

    Oringel, Robert S.

    Audio control is the operation of all the types of sound equipment found in the studios and control rooms of a radio or television station. Written in a nontechnical style for beginners, the book explains thoroughly the operation of all types of audio equipment. Diagrams and photographs of commercial consoles, microphones, turntables, and tape…

  17. Reasons to Rethink the Use of Audio and Video Lectures in Online Courses

    Science.gov (United States)

    Stetz, Thomas A.; Bauman, Antonina A.

    2013-01-01

    Recent technological developments allow any instructor to create audio and video lectures for the use in online classes. However, it is questionable if it is worth the time and effort that faculty put into preparing those lectures. This paper presents thirteen factors that should be considered before preparing and using audio and video lectures in…

  18. A Perceptual Model for Sinusoidal Audio Coding Based on Spectral Integration

    NARCIS (Netherlands)

    Van de Par, S.; Kohlrausch, A.; Heusdens, R.; Jensen, J.; Holdt Jensen, S.

    2005-01-01

    Psychoacoustical models have been used extensively within audio coding applications over the past decades. Recently, parametric coding techniques have been applied to general audio and this has created the need for a psychoacoustical model that is specifically suited for sinusoidal modelling of

  19. A perceptual model for sinusoidal audio coding based on spectral integration

    NARCIS (Netherlands)

    Van de Par, S.; Kohlrauch, A.; Heusdens, R.; Jensen, J.; Jensen, S.H.

    2005-01-01

    Psychoacoustical models have been used extensively within audio coding applications over the past decades. Recently, parametric coding techniques have been applied to general audio and this has created the need for a psychoacoustical model that is specifically suited for sinusoidal modelling of

  20. LiveDescribe: Can Amateur Describers Create High-Quality Audio Description?

    Science.gov (United States)

    Branje, Carmen J.; Fels, Deborah I.

    2012-01-01

    Introduction: The study presented here evaluated the usability of the audio description software LiveDescribe and explored the acceptance rates of audio description created by amateur describers who used LiveDescribe to facilitate the creation of their descriptions. Methods: Twelve amateur describers with little or no previous experience with…