WorldWideScience

Sample records for audio feature space

  1. Emotion-based Music Retrieval on a Well-reduced Audio Feature Space

    DEFF Research Database (Denmark)

    Ruxanda, Maria Magdalena; Chua, Bee Yong; Nanopoulos, Alexandros;

    2009-01-01

    Music expresses emotion. A number of features extracted from audio influence the perceived emotional expression of music. These audio features generate a high-dimensional space, on which music similarity retrieval can be performed effectively, with respect to human perception of the music-emotion...... on a number of dimensionality reduction algorithms, including both classic and novel approaches. The paper indicates which dimensionality reduction techniques, applied to the considered audio feature space, can preserve on average the accuracy of the emotion-based music retrieval........ However, real-time systems that retrieve music from large music databases can achieve an order-of-magnitude performance increase by applying multidimensional indexing over a dimensionally reduced audio feature space. To achieve this performance gain, in this paper, extensive studies are conducted...

  2. Fall Detection Using Smartphone Audio Features.

    Science.gov (United States)

    Cheffena, Michael

    2016-07-01

    An automated fall detection system based on smartphone audio features is developed. The spectrogram, mel frequency cepstral coefficients (MFCCs), linear predictive coding (LPC), and matching pursuit (MP) features of different fall and no-fall sound events are extracted from experimental data. Based on the extracted audio features, four different machine learning classifiers, namely the k-nearest neighbor classifier (k-NN), support vector machine (SVM), least squares method (LSM), and artificial neural network (ANN), are investigated for distinguishing between fall and no-fall events. For each audio feature, the performance of each classifier in terms of sensitivity, specificity, accuracy, and computational complexity is evaluated. The best performance is achieved using spectrogram features with the ANN classifier, with sensitivity, specificity, and accuracy all above 98%. The classifier also has acceptable computational requirements for training and testing. The system is applicable in home environments where the phone is placed in the vicinity of the user.
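    The evaluation criteria named above can be illustrated with a minimal Python sketch. It computes sensitivity, specificity, and accuracy from true and predicted labels. The function name `binary_metrics`, the label convention (1 = fall, 0 = no-fall), and the sample labels are assumptions made for illustration and do not come from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary labels.

    Assumed convention (not from the paper): 1 = fall, 0 = no-fall.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # falls detected
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # no-falls rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed falls
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one fall and one no-fall example")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


# Hypothetical labels, for illustration only.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
print(binary_metrics(truth, predicted))
```

    Reporting all three values matters because accuracy alone can look high when one class dominates the data.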

  3. Audio Watermarking Algorithm Based on Centroid and Statistical Features

    Science.gov (United States)

    Zhang, Xiaoming; Yin, Xiong

    Experimental testing shows that the relative relation in the number of samples among the neighboring bins and the audio frequency centroid are two features robust to Time Scale Modification (TSM) attacks. Accordingly, an audio watermark algorithm based on frequency centroid and histogram is proposed by modifying the frequency coefficients. The audio histogram with equal-sized bins is extracted from a selected frequency coefficient range referred to the audio centroid. The watermarked audio signal is perceptibly similar to the original one. The experimental results show that the algorithm is very robust to resample TSM and a variety of common attacks. Subjective quality evaluation of the algorithm shows that the embedded watermark introduces low, inaudible distortion of the host audio signal.

  4. Simple Solutions for Space Station Audio Problems

    Science.gov (United States)

    Wood, Eric

    2016-01-01

    Throughout this summer, a number of different projects were supported relating to various NASA programs, including the International Space Station (ISS) and Orion. The primary project that was worked on was designing and testing an acoustic diverter which could be used on the ISS to increase sound pressure levels in Node 1, a module that does not have any Audio Terminal Units (ATUs) inside it. This acoustic diverter is not intended to be a permanent solution to providing audio to Node 1; it is simply intended to improve conditions while more permanent solutions are under development. One of the most exciting aspects of this project is that the acoustic diverter is designed to be 3D printed on the ISS, using the 3D printer that was set up earlier this year. Because of this, no new hardware needs to be sent up to the station, and no extensive hardware testing needs to be performed on the ground before sending it to the station. Instead, the 3D part file can simply be uploaded to the station's 3D printer, where the diverter will be made.

  5. EMOTION ANALYSIS OF SONGS BASED ON LYRICAL AND AUDIO FEATURES

    Directory of Open Access Journals (Sweden)

    Adit Jamdar

    2015-05-01

    In this paper, a method is proposed to detect the emotion of a song based on its lyrical and audio features. Lyrical features are generated by segmentation of lyrics during the process of data extraction. ANEW and WordNet knowledge is then incorporated to compute Valence and Arousal values. In addition to this, linguistic association rules are applied to ensure that the issue of ambiguity is properly addressed. Audio features are used to supplement the lyrical ones and include attributes like energy, tempo, and danceability. These features are extracted from The Echo Nest, a widely used music intelligence platform. Construction of training and test sets is done on the basis of social tags extracted from the last.fm website. The classification is done by applying feature weighting and stepwise threshold reduction on the k-Nearest Neighbors algorithm to provide fuzziness in the classification.

  6. Analytical Features: A Knowledge-Based Approach to Audio Feature Generation

    Directory of Open Access Journals (Sweden)

    Pachet François

    2009-01-01

    We present a feature generation system designed to create audio features for supervised classification tasks. The main contribution to feature generation studies is the notion of analytical features (AFs), a construct designed to support the representation of knowledge about audio signal processing. We describe the most important aspects of AFs, in particular their dimensional type system, on which pattern-based random generators, heuristics, and rewriting rules are based. We show how AFs generalize or improve previous approaches used in feature generation. We report on several projects using AFs for difficult audio classification tasks, demonstrating their advantage over standard audio features. More generally, we propose analytical features as a paradigm to bring raw signals into the world of symbolic computation.

  7. Comparing Audio Features and Playlist Statistics for Music Classification

    OpenAIRE

    Vatolkin, Igor; Bonnin, Geoffray; Jannach, Dietmar

    2014-01-01

    In recent years, a number of approaches have been developed for the automatic recognition of music genres, but also more specific categories (styles, moods, personal preferences, etc.). Among the different sources for building classification models, features extracted from the audio signal play an important role in the literature. Although such features can be extracted from any digitized music piece independently of the availability of other information sources, their extraction can require ...

  8. Feature Selection for Audio Surveillance in Urban Environment

    Directory of Open Access Journals (Sweden)

    KIKTOVA Eva

    2014-05-01

    This paper presents the work leading to an acoustic event detection system designed to recognize two types of acoustic events (shot and breaking glass) in an urban environment. For this purpose, extensive front-end processing was performed for the effective parametric representation of an input sound. MFCC features and features computed during their extraction (MELSPEC and FBANK), then MPEG-7 audio descriptors and other temporal and spectral characteristics were extracted. High-dimensional feature sets were created and in the next phase reduced by mutual information based selection algorithms. A Hidden Markov Model based classifier was applied and evaluated by the Viterbi decoding algorithm. Thus very effective feature sets were identified and the less important features were also found.

  9. Audio-visual synchrony and feature-selective attention co-amplify early visual processing.

    Science.gov (United States)

    Keitel, Christian; Müller, Matthias M

    2016-05-01

    Our brain relies on neural mechanisms of selective attention and converging sensory processing to efficiently cope with rich and unceasing multisensory inputs. One prominent assumption holds that audio-visual synchrony can act as a strong attractor for spatial attention. Here, we tested for a similar effect of audio-visual synchrony on feature-selective attention. We presented two superimposed Gabor patches that differed in colour and orientation. On each trial, participants were cued to selectively attend to one of the two patches. Over time, spatial frequencies of both patches varied sinusoidally at distinct rates (3.14 and 3.63 Hz), giving rise to pulse-like percepts. A simultaneously presented pure tone carried a frequency modulation at the pulse rate of one of the two visual stimuli to introduce audio-visual synchrony. Pulsed stimulation elicited distinct time-locked oscillatory electrophysiological brain responses. These steady-state responses were quantified in the spectral domain to examine individual stimulus processing under conditions of synchronous versus asynchronous tone presentation and when respective stimuli were attended versus unattended. We found that both, attending to the colour of a stimulus and its synchrony with the tone, enhanced its processing. Moreover, both gain effects combined linearly for attended in-sync stimuli. Our results suggest that audio-visual synchrony can attract attention to specific stimulus features when stimuli overlap in space. PMID:26226930

  10. Audio Environment Recognition using Zero Crossing Features and MPEG-7 Descriptors

    Directory of Open Access Journals (Sweden)

    Saleh Al-Zhrani

    2010-01-01

    Problem statement: This study investigated zero crossing features and selected MPEG-7 audio descriptors for environment sound recognition applications such as audio forensics. Approach: The study implemented several experiments focusing on the problems of environment recognition from audio, particularly for forensic applications. Results: The effect of the temporal zero crossing feature, as well as of selected MPEG-7 audio low level descriptors, on environment sound recognition was investigated. The performance was evaluated against a varying number of training sounds and samples per training file. Conclusion/Recommendations: Experimental results showed that higher recognition accuracy is achieved by increasing the number of training files and by decreasing the number of samples per training file. This study presented an audio environment recognition approach using zero crossing features and MPEG-7 descriptors.
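    As a rough illustration of a temporal zero crossing feature, the sketch below computes a zero-crossing rate per frame in plain Python. The abstract does not give the authors' exact definition, so the sign convention, the non-overlapping framing, and the names `zero_crossing_rate` and `frame_signal` are assumptions.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ.

    Assumption: a sample equal to zero is treated as non-negative.
    """
    if len(frame) < 2:
        raise ValueError("frame needs at least two samples")
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_signal(samples, frame_length):
    """Split a sample list into non-overlapping frames, dropping the remainder."""
    return [
        samples[i:i + frame_length]
        for i in range(0, len(samples) - frame_length + 1, frame_length)
    ]


# Hypothetical signal: an alternating (noisy-like) part, then a steady part.
signal = [1, -1, 1, -1, 2, 3, 4, 5]
print([zero_crossing_rate(f) for f in frame_signal(signal, 4)])
```

    Frames dominated by noise-like or high-frequency content tend to give higher rates than frames with slowly varying content, which is what makes the measure usable as a cheap feature.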

  11. A quick search method for audio signals based on a piecewise linear representation of feature trajectories

    CERN Document Server

    Kimura, Akisato; Kurozumi, Takayuki; Murase, Hiroshi

    2007-01-01

    This paper presents a new method for a quick similarity-based search through long unlabeled audio streams to detect and locate audio clips provided by users. The method involves feature-dimension reduction based on a piecewise linear representation of a sequential feature trajectory extracted from a long audio stream. Two techniques enable us to obtain a piecewise linear representation: the dynamic segmentation of feature trajectories and the segment-based Karhunen-Loève (KL) transform. The proposed search method guarantees the same search results as the search method without the proposed feature-dimension reduction method in principle. Experiment results indicate significant improvements in search speed. For example, the proposed method reduced the total search time to approximately 1/12 that of previous methods and detected queries in approximately 0.3 seconds from a 200-hour audio database.

  12. Recognition of Isolated Words using Zernike and MFCC features for Audio Visual Speech Recognition

    OpenAIRE

    Borde, Prashant; Varpe, Amarsinh; Manza, Ramesh; Yannawar, Pravin

    2014-01-01

    Automatic Speech Recognition (ASR) by machine is an attractive research topic in the signal processing domain and has attracted many researchers to contribute in this area. In recent years, there have been many advances in automatic speech reading systems with the inclusion of audio and visual speech features to recognize words under noisy conditions. The objective of an audio-visual speech recognition system is to improve recognition accuracy. In this paper we computed visual features using Zernike m...

  13. Automatic Segmentation of News Items Based on Video and Audio Features

    Institute of Scientific and Technical Information of China (English)

    王伟强; 高文

    2002-01-01

    The automatic segmentation of news items is a key for implementing the automatic cataloging system of news video. This paper presents an approach which manages audio and video feature information to automatically segment news items. The integration of audio and visual analyses can overcome the weakness of the approach using only image analysis techniques. It makes the approach more adaptable to various situations of news items. The proposed approach detects silence segments in accompanying audio, and integrates them with shot segmentation results, as well as anchor shot detection results, to determine the boundaries among news items. Experimental results show that the integration of audio and video features is an effective approach to solving the problem of automatic segmentation of news items.

  14. Robust identification/fingerprinting of audio signals using spectral flatness features

    Science.gov (United States)

    Herre, Juergen; Allamanche, Eric; Hellmuth, Oliver; Kastner, Thorsten

    2002-05-01

    Recently, the problem of content-based identification of material has received increased attention as an important technique for managing the ever-increasing amount of multimedia assets available to users today. This talk discusses the problem of robust identification of audio signals by comparing them to a known reference ("fingerprint") in the feature domain. Desirable properties of the underlying features include robustness with respect to common signal distortions and compactness of representation. A family of suitable features with favorable properties is described and evaluated for their recognition performance. Some applications of signal identification are discussed, including MPEG-7 Audio.
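    The record's title names spectral flatness features. A common textbook definition of spectral flatness is the ratio of the geometric mean to the arithmetic mean of the power spectrum; the sketch below follows that definition and is not necessarily the exact feature family evaluated in the talk. The naive DFT, the `eps` guard, and the function names are assumptions chosen to keep the example dependency-free.

```python
import cmath
import math


def power_spectrum(frame):
    """Power of the non-negative-frequency DFT bins (naive O(n^2) DFT)."""
    n = len(frame)
    powers = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * i / n)
            for i, x in enumerate(frame)
        )
        powers.append(abs(coeff) ** 2)
    return powers


def spectral_flatness(powers, eps=1e-12):
    """Geometric mean divided by arithmetic mean of the power values.

    eps (an assumption) avoids log(0) for empty bins.
    """
    log_mean = sum(math.log(p + eps) for p in powers) / len(powers)
    arithmetic_mean = sum(powers) / len(powers)
    return math.exp(log_mean) / (arithmetic_mean + eps)


# A pure tone concentrates power in one bin, so its flatness is near zero.
tone = [math.sin(2 * math.pi * 4 * i / 32) for i in range(32)]
print(spectral_flatness(power_spectrum(tone)))
```

    Values near 1 indicate a noise-like, flat spectrum, and values near 0 indicate a tonal one. In practice the measure is usually computed per frequency band rather than over the whole spectrum.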

  15. Music preferences based on audio features, and its relation to personality

    OpenAIRE

    Dunn, Greg

    2009-01-01

    Recent studies have summarized reported music preferences by genre into four broadly defined categories, which relate to various personality characteristics. Other research has indicated that genre classification is ambiguous and inconsistent. This ambiguity suggests that research relating personality to music preferences based on genre could benefit from a more objective definition of music. This problem is addressed by investigating how music preferences linked to objective audio features r...

  16. 47 CFR 25.214 - Technical requirements for space stations in the satellite digital audio radio service and...

    Science.gov (United States)

    2010-10-01

    ... 47 Telecommunication 2 2010-10-01 2010-10-01 false Technical requirements for space stations in the satellite digital audio radio service and associated terrestrial repeaters. 25.214 Section 25.214... Technical Standards § 25.214 Technical requirements for space stations in the satellite digital audio...

  17. Reconsidering the Role of Recorded Audio as a Rich, Flexible and Engaging Learning Space

    Science.gov (United States)

    Middleton, Andrew

    2016-01-01

    Audio needs to be recognised as an integral medium capable of extending education's formal and informal, virtual and physical learning spaces. This paper reconsiders the value of educational podcasting through a review of literature and a module case study. It argues that a pedagogical understanding is needed and challenges technology-centred or…

  18. Integrating Audio-Visual Features and Text Information for Story Segmentation of News Video

    Institute of Scientific and Technical Information of China (English)

    Liu Hua-yong; Zhou Dong-ru

    2003-01-01

    Video data are composed of multimodal information streams including visual, auditory and textual streams, so an approach of story segmentation for news video using multimodal analysis is described in this paper. The proposed approach detects the topic-caption frames, and integrates them with silence clips detection results, as well as shot segmentation results to locate the news story boundaries. The integration of audio-visual features and text information overcomes the weakness of the approach using only image analysis techniques. On test data with 135 400 frames, when the boundaries between news stories are detected, the accuracy rate 85.8% and the recall rate 97.5% are obtained. The experimental results show the approach is valid and robust.

  19. Integrating Audio-Visual Features and Text Information for Story Segmentation of News Video

    Institute of Scientific and Technical Information of China (English)

    Liu Hua-yong; Zhou Dong-ru

    2003-01-01

    Video data are composed of multimodal information streams including visual, auditory and textual streams, so an approach of story segmentation for news video using multimodal analysis is described in this paper. The proposed approach detects the topic-caption frames, and integrates them with silence clips detection results, as well as shot segmentation results to locate the news story boundaries. The integration of audio-visual features and text information overcomes the weakness of the approach using only image analysis techniques. On test data with 135 400 frames, when the boundaries between news stories are detected, the accuracy rate 85.8% and the recall rate 97.5% are obtained. The experimental results show the approach is valid and robust.

  20. Reconsidering the role of recorded audio as a rich, flexible and engaging learning space

    Directory of Open Access Journals (Sweden)

    Andrew Middleton

    2016-01-01

    Audio needs to be recognised as an integral medium capable of extending education's formal and informal, virtual and physical learning spaces. This paper reconsiders the value of educational podcasting through a review of literature and a module case study. It argues that a pedagogical understanding is needed and challenges technology-centred or teacher-centred understandings of podcasting. It considers the diverse methods being used that enhance and redefine podcasting as a medium for student-centred active learning. The case study shows how audio created a rich learning space by meaningfully connecting tutors, students and those beyond the existing formal study space. The approaches used can be categorised as new types of learning activity, extended connected activity, relocated activity, and recorded ‘captured’ activity which promote learner replay and re-engagement. The paper concludes that the educational use of the recorded voice needs to be reconsidered and reconceptualised so that audio is valued as a manageable, immediate, flexible, potent and engaging medium.

  1. Slow feature analysis with spiking neurons and its application to audio stimuli.

    Science.gov (United States)

    Bellec, Guillaume; Galtier, Mathieu; Brette, Romain; Yger, Pierre

    2016-06-01

    Extracting invariant features in an unsupervised manner is crucial to perform complex computation such as object recognition, analyzing music or understanding speech. While various algorithms have been proposed to perform such a task, Slow Feature Analysis (SFA) uses time as a means of detecting those invariants, extracting the slowly time-varying components in the input signals. In this work, we address the question of how such an algorithm can be implemented by neurons, and apply it in the context of audio stimuli. We propose a projected gradient implementation of SFA that can be adapted to a Hebbian like learning rule dealing with biologically plausible neuron models. Furthermore, we show that a Spike-Timing Dependent Plasticity learning rule, shaped as a smoothed second derivative, implements SFA for spiking neurons. The theory is supported by numerical simulations, and to illustrate a simple use of SFA, we have applied it to auditory signals. We show that a single SFA neuron can learn to extract the tempo in sound recordings. PMID:27075919
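    SFA looks for outputs that vary slowly over time. The quantity it minimizes, the mean squared temporal difference of a zero-mean, unit-variance output, can be sketched in plain Python. This covers only the slowness measure, not the projected-gradient or spiking-neuron implementations the abstract describes; the function name `slowness` and the test signals are illustrative assumptions.

```python
import math


def slowness(signal):
    """Mean squared first difference after normalizing to zero mean, unit variance.

    Lower values mean the signal varies more slowly.
    """
    n = len(signal)
    if n < 2:
        raise ValueError("signal needs at least two samples")
    mean = sum(signal) / n
    variance = sum((x - mean) ** 2 for x in signal) / n
    if variance == 0:
        raise ValueError("constant signal has no defined slowness")
    std = math.sqrt(variance)
    normalized = [(x - mean) / std for x in signal]
    return sum(
        (b - a) ** 2 for a, b in zip(normalized, normalized[1:])
    ) / (n - 1)


# Hypothetical signals: one slow and one fast sinusoid of the same length.
slow = [math.sin(2 * math.pi * i / 100) for i in range(200)]
fast = [math.sin(2 * math.pi * 10 * i / 100) for i in range(200)]
print(slowness(slow) < slowness(fast))
```

    The unit-variance normalization is what rules out the trivial solution of a constant output, which would otherwise always be the slowest possible signal.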

  2. Online Learning for Classification of Low-rank Representation Features and Its Applications in Audio Segment Classification

    CERN Document Server

    Shi, Ziqiang; Zheng, Tieran; Deng, Shiwen

    2011-01-01

    In this paper, a novel framework based on trace norm minimization for audio segment classification is proposed. In this framework, both the feature extraction and classification are obtained by solving corresponding convex optimization problems with trace norm regularization. For feature extraction, robust principal component analysis (robust PCA) via minimization of a combination of the nuclear norm and the $\ell_1$-norm is used to extract low-rank features which are robust to white noise and gross corruption for audio segments. These low-rank features are fed to a linear classifier where the weight and bias are learned by solving similar trace norm constrained problems. For this classifier, most methods find the weight and bias in batch-mode learning, which makes them inefficient for large-scale problems. In this paper, we propose an online framework using the accelerated proximal gradient method. This framework has a main advantage in memory cost. In addition, as a result of the regularization formulation of matrix classificatio...

  3. The Audio Zero Watermark Algorithm Based on Audio Features and Statistical Feature of Approximation Signal

    Institute of Scientific and Technical Information of China (English)

    杨得国; 姜金娣; 曹文泉; 曾玥; 万红娟

    2011-01-01

    A zero-watermark algorithm based on audio features and statistical features of the approximation signal is presented. Experimental results show that the proposed algorithm can find audio frames suitable for watermark embedding according to the characteristics of the audio signal, and can achieve watermark embedding, extraction and blind detection. It reduces computation and improves the robustness of the watermarking system without degrading auditory quality.

  4. Unique features of space reactors

    Science.gov (United States)

    Buden, David

    Space reactors are designed to meet a unique set of requirements; they must be sufficiently compact to be launched in a rocket to their operational location, operate for many years without maintenance and servicing, operate in extreme environments, and reject heat by radiation to space. To meet these restrictions, operating temperatures are much greater than in terrestrial power plants, and the reactors tend to have a fast neutron spectrum. Currently, a new generation of space reactor power plants is being developed. The major effort is in the SP-100 program, where the power plant is being designed for seven years of full power, and no maintenance operation at a reactor outlet operating temperature of 1350 K.

  5. Audio-Visual Classification of Sports Types

    DEFF Research Database (Denmark)

    Gade, Rikke; Abou-Zleikha, Mohamed; Christensen, Mads Græsbøll;

    2015-01-01

    In this work we propose a method for classification of sports types from combined audio and visual features extracted from thermal video. From audio, Mel Frequency Cepstral Coefficients (MFCC) are extracted, and PCA is applied to reduce the feature space to 10 dimensions. From the visual modality...... short trajectories are constructed to represent the motion of players. From these, four motion features are extracted and combined directly with audio features for classification. A k-nearest neighbour classifier is applied for classification of 180 1-minute video sequences from three sports types...
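    The final classification step can be illustrated with a minimal k-nearest-neighbour sketch in plain Python. It assumes each sequence has already been reduced to a fixed-length feature vector; the two-dimensional vectors, the labels, the Euclidean distance, and the function name `knn_predict` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def knn_predict(train_vectors, train_labels, query, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    if k < 1 or k > len(train_vectors):
        raise ValueError("k must be between 1 and the number of training vectors")
    # Pair every training vector's Euclidean distance to the query with its label.
    ranked = sorted(
        (math.dist(vector, query), label)
        for vector, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors for two made-up classes.
vectors = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(vectors, labels, (0.2, 0.3)))
```

    Because k-NN compares raw distances, features on different scales (for example audio and motion features) usually need normalizing before they are concatenated.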

  6. From music similarity to music recommendation : computational approaches based on audio features and metadata

    OpenAIRE

    Bogdanov, Dmitry

    2013-01-01

    In this work we focus on user modeling for music recommendation and develop algorithms for computational understanding and visualization of music preferences. Firstly, we propose a user model starting from an explicit set of music tracks provided by the user as evidence of his/her preferences. Secondly, we study approaches to music similarity, working solely on audio content and propose a number of novel measures working with timbral, temporal, tonal, and semantic information about music. Thi...

  7. Categorizing Video Game Audio

    DEFF Research Database (Denmark)

    Westerberg, Andreas Rytter; Schoenau-Fog, Henrik

    2015-01-01

    This paper dives into the subject of video game audio and how it can be categorized in order to deliver a message to a player in the most precise way. A new categorization, with a new take on the diegetic spaces, can be used as a tool of inspiration for sound- and game-designers to rethink how...... they can use audio in video games. The conclusion of this study is that the current models' view of the diegetic spaces, used to categorize video game audio, is not fit to categorize all sounds. This can however possibly be changed through a rethinking of how the player interprets audio....

  8. Audio 2008: Audio Fixation

    Science.gov (United States)

    Kaye, Alan L.

    2008-01-01

    Take a look around the bus or subway and see just how many people are bumping along to an iPod or an MP3 player. What they are listening to is their secret, but the many signature earbuds in sight should give one a real sense of just how pervasive digital audio has become. This article describes how that popularity is mirrored in library audio…

  9. New robust audio watermarking based on feature points

    Institute of Scientific and Technical Information of China (English)

    李金梅; 宋欣; 马卓赛; 肖海; 苑立军

    2011-01-01

    A robust digital audio watermarking technique based on Mean Quantization Index Modulation (MQIM) and feature points is proposed. It chooses the local energy peaks of the digital audio as features to extract stable feature points, uses the feature points as markers to segment the audio signal that follows them, and then embeds the watermark bits into the wavelet domain using MQIM. The algorithm can extract the watermark without the original digital audio signal. Simulation results show that the proposed watermarking scheme is robust not only against common signal processing such as MP3 compression, noise addition, re-sampling, and re-quantization, but also against desynchronization attacks such as random cropping, amplitude variation, pitch shifting, time-scale modification, and jittering.
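    The idea behind quantization index modulation on a segment mean can be sketched as follows: the mean of a coefficient segment is moved onto one of two interleaved quantization lattices, one per bit value, and the bit is recovered blindly from whichever lattice is closer. This is a generic illustration under stated assumptions, not the paper's algorithm: it works on a plain list of numbers rather than wavelet coefficients, and the step size and the names `embed_bit` and `extract_bit` are hypothetical.

```python
def embed_bit(segment, bit, step):
    """Shift every value so the segment mean lands on the lattice for `bit`.

    Bit 0 uses multiples of step; bit 1 uses multiples of step offset by step / 2.
    """
    mean = sum(segment) / len(segment)
    offset = 0.0 if bit == 0 else step / 2
    target = round((mean - offset) / step) * step + offset
    shift = target - mean
    return [x + shift for x in segment]


def extract_bit(segment, step):
    """Recover the bit from the segment mean alone (no original signal needed)."""
    mean = sum(segment) / len(segment)
    nearest0 = round(mean / step) * step
    nearest1 = round((mean - step / 2) / step) * step + step / 2
    return 0 if abs(mean - nearest0) <= abs(mean - nearest1) else 1


# Hypothetical coefficient segment and step size.
coefficients = [0.13, -0.42, 0.37, 0.08]
marked = embed_bit(coefficients, 1, step=0.1)
print(extract_bit(marked, step=0.1))
```

    A larger step tolerates more distortion of the mean before the bit flips (up to a quarter of the step), at the cost of a larger change to the host signal.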

  10. A framework for event detection in field-sports video broadcasts based on SVM generated audio-visual feature model. Case-study: soccer video

    OpenAIRE

    Sadlier, David A.; O'Connor, Noel E.; Murphy, Noel; Marlow, Seán

    2004-01-01

    In this paper we propose a novel audio-visual feature-based framework, for event detection in field sports broadcast video. The system is evaluated via a case-study involving MPEG encoded soccer video. Specifically, the evidence gathered by various feature detectors is combined by means of a learning algorithm (a support vector machine), which infers the occurrence of an event, based on a model generated during a training phase, utilizing a corpus of 25 hours of content. The system is evaluat...

  11. Feature Integration across Space, Time, and Orientation

    Science.gov (United States)

    Otto, Thomas U.; Ogmen, Haluk; Herzog, Michael H.

    2009-01-01

    The perception of a visual target can be strongly influenced by flanking stimuli. In static displays, performance on the target improves when the distance to the flanking elements increases--presumably because feature pooling and integration vanishes with distance. Here, we studied feature integration with dynamic stimuli. We show that features of…

  12. Introduction to audio analysis a MATLAB approach

    CERN Document Server

    Giannakopoulos, Theodoros

    2014-01-01

    Introduction to Audio Analysis serves as a standalone introduction to audio analysis, providing theoretical background to many state-of-the-art techniques. It covers the essential theory necessary to develop audio engineering applications, but also uses programming techniques, notably MATLAB®, to take a more applied approach to the topic. Basic theory and reproducible experiments are combined to demonstrate theoretical concepts from a practical point of view and provide a solid foundation in the field of audio analysis. Audio feature extraction, audio classification, audio segmentation, au

  13. Audio Twister

    DEFF Research Database (Denmark)

    Cermak, Daniel; Moreno Garcia, Rodrigo; Monastiridis, Stefanos

    2015-01-01

    Daniel Cermak-Sassenrath, Rodrigo Moreno Garcia, Stefanos Monastiridis. Audio Twister. Installation. P-Hack Copenhagen 2015, Copenhagen, DK, Apr 24, 2015.

  14. Searching Fragment Spaces with feature trees.

    Science.gov (United States)

    Lessel, Uta; Wellenzohn, Bernd; Lilienthal, Markus; Claussen, Holger

    2009-02-01

    Virtual combinatorial chemistry easily produces billions of compounds, for which conventional virtual screening cannot be performed even with the fastest methods available. An efficient solution for such a scenario is the generation of Fragment Spaces, which encode huge numbers of virtual compounds by their fragments/reagents and rules of how to combine them. Similarity-based searches can be performed in such spaces without ever fully enumerating all virtual products. Here we describe the generation of a huge Fragment Space encoding about 5 × 10^11 compounds based on established in-house synthesis protocols for combinatorial libraries, i.e., we encode practically evaluated combinatorial chemistry protocols in a machine readable form, rendering them accessible to in silico search methods. We show how such searches in this Fragment Space can be integrated as a first step in an overall workflow. It reduces the extremely large number of virtual products by several orders of magnitude so that the resulting list of molecules becomes more manageable for further, more elaborate and time-consuming analysis steps. Results of a case study are presented and discussed, which lead to some general conclusions for an efficient expansion of the chemical space to be screened in pharmaceutical companies.

  15. On Feature Binding in Space and Time

    OpenAIRE

    Chennu, Srivas

    2008-01-01

    When presented with a yellow Volkswagen and a red Ferrari, how does the brain figure out which color goes with which car? The binding problem refers to how the visual system pre-consciously combines visual features of objects in the physical world to create coherent mental equivalents in our consciousness. I discuss why feature binding is a problem for our brains despite its seemingly effortless resolution in every-day life. Drawing from experimental cognitive psychology, I demonstrate how i...

  16. Detection of Facial Features in Scale-Space

    OpenAIRE

    P. Hosten; M. Asbach

    2007-01-01

    This paper presents a new approach to the detection of facial features. A scale adapted Harris Corner detector is used to find interest points in scale-space. These points are described by the SIFT descriptor. Thus invariance with respect to image scale, rotation and illumination is obtained. Applying a Karhunen-Loeve transform reduces the dimensionality of the feature space. In the training process these features are clustered by the k-means algorithm, followed by a cluster analysis to find ...

  17. Detection of Facial Features in Scale-Space

    Directory of Open Access Journals (Sweden)

    P. Hosten

    2007-01-01

    Full Text Available This paper presents a new approach to the detection of facial features. A scale adapted Harris Corner detector is used to find interest points in scale-space. These points are described by the SIFT descriptor. Thus invariance with respect to image scale, rotation and illumination is obtained. Applying a Karhunen-Loeve transform reduces the dimensionality of the feature space. In the training process these features are clustered by the k-means algorithm, followed by a cluster analysis to find the most distinctive clusters, which represent facial features in feature space. Finally, a classifier based on the nearest neighbor approach is used to decide whether the features obtained from the interest points are facial features or not. 

  18. Concept Framework for Audio Information Retrieval: ARF

    Institute of Scientific and Technical Information of China (English)

    LI GuoHui(李国辉); WU DeFeng(武德峰); ZHANG Jun(张军)

    2003-01-01

    The majority of research on content-based retrieval has focused on visual media. However, audio is also an important medium and information carrier from the viewpoint of human auditory perception, so retrieval over audio collections is needed as well. Conventional methods handle audio as an opaque stream medium, which is not suitable for retrieving information by its content. In fact, audio carries rich aural information in the form of speech, music, and sound effects, so it can be retrieved based on its aural content, such as acoustic features, musical melodies and associated semantics. In this paper, a concept framework (ARF) for content-based audio retrieval is proposed from a systematic perspective, which describes an audio content model, an audio retrieval architecture and audio query schemes. Audio contents are represented by a hierarchical model and a set of formal descriptions from the physical to the acoustic to the semantic level, which depict acoustic features, logical structure and semantics of audio and audio objects. The architecture, consisting of an audio meta-database and populating and accessing modules, presents a system structure view of audio information retrieval. The query schemes give generalized approaches and modes concerning how users deliver audio information needs to audio collections. Finally, an implemented audio retrieval example is used to explain and specify the application of the components in the proposed ARF.

  19. Audio Watermarking Based On The PSK Modulation

    Directory of Open Access Journals (Sweden)

    Wahid Barkouti

    2011-09-01

    Full Text Available Audio watermarking is a technique that can be used to embed information into the digital representation of audio signals. The main challenge is to hide data representing some information without compromising the quality of the watermarked track and at the same time ensure that the embedded watermark is robust against removal attacks. In particular, providing perfect audio quality combined with high robustness against a wide variety of attacks is not adequately addressed and evaluated in current watermarking systems. In this paper, we present a new phase modulation audio watermarking technique, which among other features provides evidence for high audio quality. PSK modulation has been proposed as an effective approach to watermarking.

  20. pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis.

    Science.gov (United States)

    Giannakopoulos, Theodoros

    2015-01-01

    Audio information plays a rather important role in the increasing digital content that is available today, resulting in a need for methodologies that automatically analyze such content: audio event recognition for home automations and surveillance systems, speech recognition, music information retrieval, multimodal analysis (e.g. audio-visual analysis of online videos for content-based recommendation), etc. This paper presents pyAudioAnalysis, an open-source Python library that provides a wide range of audio analysis procedures including: feature extraction, classification of audio signals, supervised and unsupervised segmentation and content visualization. pyAudioAnalysis is licensed under the Apache License and is available at GitHub (https://github.com/tyiannak/pyAudioAnalysis/). Here we present the theoretical background behind the wide range of the implemented methodologies, along with evaluation metrics for some of the methods. pyAudioAnalysis has been already used in several audio analysis research applications: smart-home functionalities through audio event detection, speech emotion recognition, depression classification based on audio-visual features, music segmentation, multimodal content-based movie recommendation and health applications (e.g. monitoring eating habits). The feedback provided from all these particular audio applications has led to practical enhancement of the library. PMID:26656189

  1. pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis.

    Science.gov (United States)

    Giannakopoulos, Theodoros

    2015-01-01

    Audio information plays a rather important role in the increasing digital content that is available today, resulting in a need for methodologies that automatically analyze such content: audio event recognition for home automations and surveillance systems, speech recognition, music information retrieval, multimodal analysis (e.g. audio-visual analysis of online videos for content-based recommendation), etc. This paper presents pyAudioAnalysis, an open-source Python library that provides a wide range of audio analysis procedures including: feature extraction, classification of audio signals, supervised and unsupervised segmentation and content visualization. pyAudioAnalysis is licensed under the Apache License and is available at GitHub (https://github.com/tyiannak/pyAudioAnalysis/). Here we present the theoretical background behind the wide range of the implemented methodologies, along with evaluation metrics for some of the methods. pyAudioAnalysis has been already used in several audio analysis research applications: smart-home functionalities through audio event detection, speech emotion recognition, depression classification based on audio-visual features, music segmentation, multimodal content-based movie recommendation and health applications (e.g. monitoring eating habits). The feedback provided from all these particular audio applications has led to practical enhancement of the library.

  2. pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis.

    Directory of Open Access Journals (Sweden)

    Theodoros Giannakopoulos

    Full Text Available Audio information plays a rather important role in the increasing digital content that is available today, resulting in a need for methodologies that automatically analyze such content: audio event recognition for home automations and surveillance systems, speech recognition, music information retrieval, multimodal analysis (e.g. audio-visual analysis of online videos for content-based recommendation), etc. This paper presents pyAudioAnalysis, an open-source Python library that provides a wide range of audio analysis procedures including: feature extraction, classification of audio signals, supervised and unsupervised segmentation and content visualization. pyAudioAnalysis is licensed under the Apache License and is available at GitHub (https://github.com/tyiannak/pyAudioAnalysis/). Here we present the theoretical background behind the wide range of the implemented methodologies, along with evaluation metrics for some of the methods. pyAudioAnalysis has been already used in several audio analysis research applications: smart-home functionalities through audio event detection, speech emotion recognition, depression classification based on audio-visual features, music segmentation, multimodal content-based movie recommendation and health applications (e.g. monitoring eating habits). The feedback provided from all these particular audio applications has led to practical enhancement of the library.

  3. Multipurpose audio watermarking algorithm

    Institute of Scientific and Technical Information of China (English)

    Ning CHEN; Jie ZHU

    2008-01-01

    To make audio watermarking accomplish both copyright protection and content authentication with localization, a novel multipurpose audio watermarking scheme is proposed in this paper. The zero-watermarking idea is introduced into the design of robust watermarking algorithm to ensure the transparency and to avoid the interference between the robust watermark and the semi-fragile watermark. The property of natural audio that the VQ indices of DWT-DCT coefficients among neighboring frames tend to be very similar is utilized to extract essential feature from the host audio, which is then used for watermark extraction. And, the chaotic mapping based semi-fragile watermark is embedded in the detail wavelet coefficients based on the instantaneous mixing model of the independent component analysis (ICA) system. Both the robust and semi-fragile watermarks can be extracted blindly and the semi-fragile watermarking algorithm can localize the tampering accurately. Simulation results demonstrate the effectiveness of our algorithm in terms of transparency, security, robustness and tampering localization ability.

  4. AC-3 audio coder

    Science.gov (United States)

    Todd, Craig

    1995-12-01

    AC-3 is a system for coding up to 5.1 channels of audio into a low bit-rate data stream. High quality may be obtained with compression ratios approaching 12:1 for multichannel audio programs. The high compression ratio is achieved by methods which do not increase decoder memory, and thus cost. The methods employed include the transmission of a high frequency resolution spectral envelope and a novel forward/backward adaptive bit allocation algorithm. In order to satisfy practical requirements of an emissions coder, the AC-3 syntax includes a number of features useful to broadcasters and consumers. These features include loudness uniformity between programs, dynamic range control, and broadcaster control of downmix coefficients. The AC-3 coder has been formally selected for inclusion in the U.S. HDTV broadcast standard, and has been informally selected for several additional applications.

  5. Feature-space transformation improves supervised segmentation across scanners

    DEFF Research Database (Denmark)

    van Opbroek, Annegreet; Achterberg, Hakim C.; de Bruijne, Marleen

    2015-01-01

    Image-segmentation techniques based on supervised classification generally perform well on the condition that training and test samples have the same feature distribution. However, if training and test images are acquired with different scanners or scanning parameters, their feature distributions....... This transformation is learned from unlabeled images of subjects scanned on both the training scanner and the test scanner. We evaluated our method on hippocampus segmentation on 27 images of the Harmonized Hippocampal Protocol (HarP), a heterogeneous dataset consisting of 1.5T and 3T MR images. The results showed...... can be very different, which can hurt the performance of such techniques. We propose a feature-space-transformation method to overcome these differences in feature distributions. Our method learns a mapping of the feature values of training voxels to values observed in images from the test scanner...

  6. Registration of Standardized Histological Images in Feature Space

    CERN Document Server

    Bagci, Ulas; 10.1117/12.770219

    2009-01-01

    In this paper, we propose three novel and important methods for the registration of histological images for 3D reconstruction. First, possible intensity variations and nonstandardness in images are corrected by an intensity standardization process which maps the image scale into a standard scale where the similar intensities correspond to similar tissues meaning. Second, 2D histological images are mapped into a feature space where continuous variables are used as high confidence image features for accurate registration. Third, we propose an automatic best reference slice selection algorithm that improves reconstruction quality based on both image entropy and mean square error of the registration process. We demonstrate that the choice of reference slice has a significant impact on registration error, standardization, feature space and entropy information. After 2D histological slices are registered through an affine transformation with respect to an automatically chosen reference, the 3D volume is reconstruct...

  7. Scale space smoothing, image feature extraction and bessel filters

    OpenAIRE

    Mahmoodi S.; Gunn S.

    2011-01-01

    The Green function of Mumford-Shah functional in the absence of discontinuities is known to be a modified Bessel function of the second kind and zero degree. Such a Bessel function is regularized here and used as a filter for feature extraction. It is demonstrated in this paper that a Bessel filter does not follow the scale space smoothing property of bounded linear filters such as Gaussian filters. The features extracted by the Bessel filter are therefore scale invariant. Edges, blobs, and j...

  8. Image registration in high-dimensional feature space

    Science.gov (United States)

    Neemuchwala, Huzefa F.; Hero, Alfred O.

    2005-03-01

    Image registration is a difficult task, especially when spurious image intensity differences and spatial variations between the two images are present. To robustify image registration algorithms to such spurious variations it can be useful to employ an image registration matching criterion on higher dimensional feature spaces. This paper will present an overview of our recent work on image registration using high dimensional image features and entropic graph matching criteria. New entropic graph estimates of information divergence measures will be presented. We will demonstrate the advantage of our approach for ultrasound breast image registration.

  9. Exploring Large Feature Spaces with Hierarchical Multiple Kernel Learning

    CERN Document Server

    Bach, Francis

    2008-01-01

    For supervised and unsupervised learning, positive definite kernels allow to use large and potentially infinite dimensional feature spaces with a computational cost that only depends on the number of observations. This is usually done through the penalization of predictor functions by Euclidean or Hilbertian norms. In this paper, we explore penalizing by sparsity-inducing norms such as the l1-norm or the block l1-norm. We assume that the kernel decomposes into a large sum of individual basis kernels which can be embedded in a directed acyclic graph; we show that it is then possible to perform kernel selection through a hierarchical multiple kernel learning framework, in polynomial time in the number of selected kernels. This framework is naturally applied to non linear variable selection; our extensive simulations on synthetic datasets and datasets from the UCI repository show that efficiently exploring the large feature space through sparsity-inducing norms leads to state-of-the-art predictive performance.

  10. Feature extraction on local jet space for texture classification

    Science.gov (United States)

    Oliveira, Marcos William da Silva; da Silva, Núbia Rosa; Manzanera, Antoine; Bruno, Odemir Martinez

    2015-12-01

    The aim of this study is to analyze texture pattern recognition over the local jet space in order to improve texture characterization. Local jets decompose the image based on partial derivatives, allowing texture features to be extracted at different levels of geometrical structure. Each local jet component highlights a different local pattern, such as flat regions, directional variations, and concavity or convexity. Subsequently, a texture descriptor is used to extract features from the 0th-, 1st- and 2nd-derivative components. Four well-known databases (Brodatz, Vistex, Usptex and Outex) and four texture descriptors (Fourier descriptors, Gabor filters, Local Binary Pattern and Local Binary Pattern Variance) were used to validate the idea, showing in most cases an increase in the success rates.

  11. Audio Classification from Time-Frequency Texture

    CERN Document Server

    Yu, Guoshen

    2008-01-01

    Time-frequency representations of audio signals often resemble texture images. This paper derives a simple audio classification algorithm based on treating sound spectrograms as texture images. The algorithm is inspired by an earlier visual classification scheme particularly efficient at classifying textures. While solely based on time-frequency texture features, the algorithm achieves surprisingly good performance in musical instrument classification experiments.

  12. Structure Learning in Audio

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch

    and speech, using novel features based on pitch dynamics. Within instrument classification two different harmonic models have been compared. Finally voiced/unvoiced segmentation of popular music is done based on MFCC’s and AR coefficients. The structures in the mixings of multiple sources have been...... investigated. A fast and computationally simple approach that compares recordings and classifies if they are from the same audio environment have been developed, and shows very high accuracy and the ability to synchronize recordings in the case of recording devices which are not connected. A more general model...

  13. Learning bimodal structure in audio-visual data

    OpenAIRE

    Monaci, Gianluca; Vandergheynst, Pierre; Sommer, Friederich T.

    2009-01-01

    A novel model is presented to learn bimodally informative structures from audio-visual signals. The signal is represented as a sparse sum of audio-visual kernels. Each kernel is a bimodal function consisting of synchronous snippets of an audio waveform and a spatio-temporal visual basis function. To represent an audio-visual signal, the kernels can be positioned independently and arbitrarily in space and time. The proposed algorithm uses unsupervised learning to form dicti...

  14. Watermarking Algorithm Based on Audio Features and Smaller Values of Low Frequency Coefficients

    Institute of Scientific and Technical Information of China (English)

    杨得国; 李智; 姜金娣

    2012-01-01

    This paper presents a digital watermarking algorithm based on audio features and the relatively smaller values of the low-frequency DWT coefficients. The zero-crossing rate and short-time energy of each audio frame are analyzed, and an appropriate threshold is chosen to discard, as a first pass, the frames dominated by high-frequency signal components, leaving the frames to be processed. The selected frames are concatenated and transformed with the discrete wavelet transform. The low-frequency coefficients are then selected and divided into segments, the binary sum of three neighboring watermark bits determines the embedding position within each segment, and the coefficient at that position is set to the smaller of its neighboring coefficients. Experimental results show that, by analyzing audio features, the algorithm reduces the time complexity of extracting the low-frequency components, supports blind watermark detection, and improves the robustness of the watermark.
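
    The frame screening step described in this abstract can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the frame length, the two thresholds, and all function names are assumptions chosen for the example, and the DWT embedding stage is not shown.

```python
import math


def frame_signal(samples, frame_len):
    """Split samples into consecutive non-overlapping frames (any short tail is dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(s * s for s in frame) / len(frame)


def select_frames(samples, frame_len, max_zcr, min_energy):
    """Keep frames with a low zero-crossing rate and enough energy.

    A high zero-crossing rate is used here as a cheap indicator of
    high-frequency content; both thresholds are hypothetical.
    """
    return [f for f in frame_signal(samples, frame_len)
            if zero_crossing_rate(f) <= max_zcr and short_time_energy(f) >= min_energy]


# One slowly varying frame followed by one rapidly alternating frame.
low = [math.sin(2 * math.pi * (n + 0.5) / 32) for n in range(64)]
high = [1.0 if n % 2 == 0 else -1.0 for n in range(64)]
kept = select_frames(low + high, frame_len=64, max_zcr=0.3, min_energy=0.1)
```

    In this toy signal only the slowly varying frame passes the screen, because the alternating frame changes sign at every sample.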

  15. Audio Papers - A Manifesto

    DEFF Research Database (Denmark)

    Krogh Groth, Sanne; Samson, Kristine

    2016-01-01

    Audio papers resemble the regular essay or the academic text in that they deal with a certain topic of interest, but presented in the form of an audio production. The audio paper is an extension of the written paper through its specific use of media, a sonic awareness of aesthetics and materialit...

  16. Audio Mining with emphasis on Music Genre Classification

    DEFF Research Database (Denmark)

    Meng, Anders

    2004-01-01

    etc. is receiving quite a lot of attention. The first breakthrough in audio mining was created by MuscleFish in 1996, a simple audio retrieval system. With the increasing amount of audio material being accessible through the web, e.g. Apple's iTunes (700,000+ songs), Sony, Amazon, new methods...... in searching / retrieving audio effectively is needed. Currently, search engines such as e.g. Google, AltaVista etc. do not search into audio files, but use either the textual information attached to the audio file or the textual information around the audio. Also in the hearing aid industries around...... to choose from. Basically every audio mining system is more or less consisting of the same stages as for the music genre setting. My research so far has mainly focussed on finding relevant features for music genre classification living at different timescales using early and late information fusion. It has...

  17. Feature Space Mapping as a universal adaptive system

    Science.gov (United States)

    Duch, Włodzisław; Diercksen, Geerd H. F.

    1995-06-01

    The most popular realizations of adaptive systems are based on the neural network type of algorithms, in particular feedforward multilayered perceptrons trained by backpropagation of error procedures. In this paper an alternative approach based on multidimensional separable localized functions centered at the data clusters is proposed. In comparison with the neural networks that use delocalized transfer functions this approach allows for full control of the basins of attractors of all stationary points. Slow learning procedures are replaced by the explicit construction of the landscape function followed by the optimization of adjustable parameters using gradient techniques or genetic algorithms. Retrieving information does not require searches in multidimensional subspaces but it is factorized into a series of one-dimensional searches. Feature Space Mapping is applicable to learning not only from facts but also from general laws and may be treated as a fuzzy expert system (neurofuzzy system). The number of nodes (fuzzy rules) is growing as the network creates new nodes for novel data but the search time is sublinear in the number of rules or data clusters stored. Such a system may work as a universal classificator, approximator and reasoning system. Examples of applications for the identification of spectra (classification), intelligent databases (association) and for the analysis of simple electrical circuits (expert system type) are given.

  18. Back to basics audio

    CERN Document Server

    Nathan, Julian

    1998-01-01

    Back to Basics Audio is a thorough, yet approachable handbook on audio electronics theory and equipment. The first part of the book discusses electrical and audio principles. Those principles form a basis for understanding the operation of equipment and systems, covered in the second section. Finally, the author addresses planning and installation of a home audio system. Julian Nathan joined the audio service and manufacturing industry in 1954 and moved into motion picture engineering and production in 1960. He installed and operated recording theaters in Sydney, Austra

  19. A Model of Distraction in an Audio-on-Audio Interference Situation with Music Program Material

    DEFF Research Database (Denmark)

    Francombe, J.; Mason, R.; Dewhirst, M.;

    2015-01-01

    listener can be viewed as having a personal sound zone system. In order to evaluate and optimize such situations in a perceptually relevant manner, the authors created a predictive model using the features that contribute to the distraction from unwanted sounds. Feature extraction was motivated...... by a qualitative analysis of subject responses. Distraction ratings were collected for one hundred randomly created audio-on-audio interference situations with music target and interferer programs. The selected features were related to the overall loudness, loudness ratio, perceptual evaluation of audio source...

  20. Enhancing Manual Scan Registration Using Audio Cues

    Science.gov (United States)

    Ntsoko, T.; Sithole, G.

    2014-04-01

    Indoor mapping and modelling requires that acquired data be processed by editing, fusing, formatting the data, amongst other operations. Currently the manual interaction the user has with the point cloud (data) while processing it is visual. Visual interaction does have limitations, however. One way of dealing with these limitations is to augment audio in point cloud processing. Audio augmentation entails associating points of interest in the point cloud with audio objects. In coarse scan registration, reverberation, intensity and frequency audio cues were exploited to help the user estimate depth and occupancy of space of points of interest. Depth estimations were made reliably well when intensity and frequency were both used as depth cues. Coarse changes of depth could be estimated in this manner. The depth between surfaces can therefore be estimated with the aid of the audio objects. Sound reflections of an audio object provided reliable information of the object surroundings in some instances. For a point/area of interest in the point cloud, these reflections can be used to determine the unseen events around that point/area of interest. Other processing techniques could benefit from this while other information is estimated using other audio cues like binaural cues and Head Related Transfer Functions. These other cues could be used in position estimations of audio objects to aid in problems such as indoor navigation problems.

  1. AUTOMATIC SEGMENTATION OF BROADCAST AUDIO SIGNALS USING AUTO ASSOCIATIVE NEURAL NETWORKS

    Directory of Open Access Journals (Sweden)

    P. Dhanalakshmi

    2010-12-01

    Full Text Available In this paper, we describe automatic segmentation methods for audio broadcast data. Today, digital audio applications are part of our everyday lives. Since there are more and more digital audio databases in place these days, the importance of effective management of audio databases has become prominent. Broadcast audio data is recorded from television and comprises various categories of audio signals. Efficient algorithms for segmenting the audio broadcast data into predefined categories are proposed. Audio features, namely linear prediction coefficients (LPC), linear prediction cepstral coefficients, and Mel frequency cepstral coefficients (MFCC), are extracted to characterize the audio data. Auto Associative Neural Networks are used to segment the audio data into predefined categories using the extracted features. Experimental results indicate that the proposed algorithms can produce satisfactory results.

  2. Audio Interfaces for Improved Accessibility

    OpenAIRE

    Duarte, Carlos; Carrico, Luís

    2008-01-01

    This chapter focused on how endowing interfaces with audio interaction capabilities can improve their accessibility. To exemplify this outcome the development of several versions of a Digital Talking Book player was presented. This allowed us to show it is possible to maintain the same set of features while stripping the interface of visual components, and still keep it usable for the visually impaired population. The interface development concerns focused on both ends of the interaction spec...

  3. Principles of Audio Watermarking

    Directory of Open Access Journals (Sweden)

    Martin Hrncar

    2008-01-01

    Full Text Available The article contains a brief overview of modern methods for embedding additional data in audio signals. It could have many reasons - for the purposes of access control or identification related to particular type of audio. This secret information is not “visible” for a user. This concept utilizes the imperfection of human auditory system. Simple data hiding into audio file has been proved in MATLAB.

  4. Digital Audio Legal Recorder

    Data.gov (United States)

    Department of Transportation — The Digital Audio Legal Recorder (DALR) provides the legal recording capability between air traffic controllers, pilots and ground-based air traffic control TRACONs...

  5. Extraction of Geometric Features of Wear Particles in Color Ferrograph Images Based on RGB Color Space

    Institute of Scientific and Technical Information of China (English)

    CHEN Gui-ming; WANG Han-gong; ZHANG Bao-jun; PAN Wei

    2003-01-01

    This paper analyzes the potential color formats of ferrograph images, and presents algorithms for converting these formats to the RGB (Red, Green, Blue) color space. Through statistical analysis of the geometric features of wear particles in color ferrograph images in the RGB color space, we give the differences in the geometric features of ferrograph wear particles among the RGB color spaces and the gray scale space, and calculate their respective distributions.

  6. The Schema Features and Aesthetic Functions of the Foreign Language Teaching with Electric Audio-visual Aids

    Institute of Scientific and Technical Information of China (English)

    齐欣

    2015-01-01

    While foreign language teaching with electric audio-visual aids challenges the traditional language teaching model, it also faces many challenges of its own, and more studies of its theoretical basis and functions are needed. On the basis of Schema Theory and aesthetic education, this paper makes an innovative examination of the schema features of foreign language teaching with electric audio-visual aids and of its implicit, emotional, and personalized aesthetic functions, further enriching its theoretical basis and emphasizing the necessity of achieving its aesthetic functions.

  7. The Study of Audio Watermarking

    Institute of Scientific and Technical Information of China (English)

    王景; 唐晟

    2011-01-01

    This paper mainly introduced the basic knowledge of the digital watermarking and digital audio watermarking, including the definition of digital watermarking and digital audio watermarking, the embedding algorithm of digital audio watermarking and the com

  8. Robust audio hashing for audio authentication watermarking

    Science.gov (United States)

    Zmudzinski, Sascha; Steinebach, Martin

    2008-02-01

    Current systems and protocols based on cryptographic methods for integrity and authenticity verification of media data do not distinguish between legitimate signal transformation and malicious tampering that manipulates the content. Furthermore, they usually provide no localization or assessment of the relevance of such manipulations with respect to human perception or semantics. We present an algorithm for a robust message authentication code in the context of content fragile authentication watermarking to verify the integrity of audio recordings by means of robust audio fingerprinting. Experimental results show that the proposed algorithm provides both a high level of distinction between perceptually different audio data and a high robustness against signal transformations that do not change the perceived information. Furthermore, it is well suited for the integration in a content-based authentication watermarking system.

  9. Audio fingerprint extraction for content identification

    Science.gov (United States)

    Shiu, Yu; Yeh, Chia-Hung; Kuo, C. C. J.

    2003-11-01

    In this work, we present an audio content identification system that identifies some unknown audio material by comparing its fingerprint with those extracted off-line and saved in the music database. We will describe in detail the procedure to extract audio fingerprints and demonstrate that they are robust to noise and content-preserving manipulations. The main feature in the proposed system is the zero-crossing rate extracted with the octave-band filter bank. The zero-crossing rate can be used to describe the dominant frequency in each subband with a very low computational cost. The size of audio fingerprint is small and can be efficiently stored along with the compressed files in the database. It is also robust to many modifications such as tempo change and time-alignment distortion. Besides, the octave-band filter bank is used to enhance the robustness to distortion, especially those localized on some frequency regions.
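
    The claim that the zero-crossing rate describes the dominant frequency at very low cost can be illustrated with a minimal sketch. It assumes a single already band-limited signal, so the octave-band filter bank from the abstract is omitted, and the function names and the 440 Hz test tone are hypothetical choices made only for illustration. A pure tone crosses zero twice per cycle, so half the crossing rate approximates its frequency.

```python
import math


def zero_crossing_rate(samples, sample_rate):
    """Return the number of sign changes per second in `samples`."""
    crossings = 0
    for prev, cur in zip(samples, samples[1:]):
        # Count a crossing whenever consecutive samples differ in sign.
        if (prev < 0 <= cur) or (prev >= 0 > cur):
            crossings += 1
    duration = len(samples) / sample_rate
    return crossings / duration


def dominant_frequency(samples, sample_rate):
    """Estimate the dominant frequency: a tone crosses zero twice per cycle."""
    return zero_crossing_rate(samples, sample_rate) / 2.0


# Hypothetical test signal: one second of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate + 0.1)
        for n in range(sample_rate)]
estimate = dominant_frequency(tone, sample_rate)
```

    The estimate needs only one comparison per sample, which is consistent with the low computational cost the abstract reports; in a subband system the same function would be applied to each filtered band separately.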

  10. Roundtable Audio Discussion

    Directory of Open Access Journals (Sweden)

    Chris Bigum

    2007-01-01

    RoundTable on Technology, Teaching and Tools. This is a roundtable audio interview conducted by James Farmer, founder of Edublogs, with Anne Bartlett-Bragg (University of Technology Sydney) and Chris Bigum (Deakin University). Skype was used to make and record the audio conference, and the resulting sound file was edited by Andrew McLauchlan.

  11. Limitations in 4-Year-Old Children's Sensitivity to the Spacing among Facial Features

    Science.gov (United States)

    Mondloch, Catherine J.; Thomson, Kendra

    2008-01-01

    Four-year-olds' sensitivity to differences among faces in the spacing of features was tested under 4 task conditions: judging distinctiveness when the external contour was visible and when it was occluded, simultaneous match-to-sample, and recognizing the face of a friend. In each task, the foil differed only in the spacing of features, and…

  12. Watermarking-Based Digital Audio Data Authentication

    Directory of Open Access Journals (Sweden)

    Jana Dittmann

    2003-09-01

    Digital watermarking has become an accepted technology for enabling multimedia protection schemes. While most efforts concentrate on user authentication, interest in data authentication to ensure data integrity has recently been increasing. Existing concepts mainly address image data. Depending on the necessary security level and the sensitivity to detect changes in the media, we differentiate between fragile, semifragile, and content-fragile watermarking approaches for media authentication. Furthermore, invertible watermarking schemes exist in which each bit change can be recognized by the watermark, which can be extracted so that the original data can be reproduced for high-security applications. Later approaches can be extended with cryptographic approaches like digital signatures. As we see from the literature, only a few audio approaches exist, and the audio domain requires additional strategies for time flow protection and resynchronization. To allow different security levels, we have to identify relevant audio features that can be used to determine content manipulations. Furthermore, in the field of invertible schemes, there are many publications for image and video data but no approaches for digital audio to ensure data authentication for high-security applications. In this paper, we introduce and evaluate two watermarking algorithms for digital audio data, addressing content integrity protection. In our first approach, we discuss possible features for a content-fragile watermarking scheme to allow several postproduction modifications. The second approach is designed for high-security applications to detect each bit change and reconstruct the original audio by introducing an invertible audio watermarking concept. Based on the invertible audio scheme, we combine digital signature schemes and digital watermarking to provide publicly verifiable data authentication and a reproduction of the original, protected with a secret key.

  13. MedlinePlus FAQ: Is audio description available for videos on MedlinePlus?

    Science.gov (United States)

    ... audiodescription.html Question: Is audio description available for videos on MedlinePlus? ... Answer: Audio description of videos helps make the content of videos accessible to ...

  14. Portable Audio Design

    DEFF Research Database (Denmark)

    Groth, Sanne Krogh

    2014-01-01

    The chapter presents a methodological approach to the early process of producing portable audio design. The chapter highlights audio walks and audio guides, but can also serve as inspiration when working with graphical and video production for portable devices. The final products can be presented...... within online and physical institutional contexts. The approach focuses especially on the relationship to specific sites, and how an awareness of the relationship between the site and the production can be part of the design process. Such awareness entails several approaches: the necessity of paying...

  15. Audio Video Compression Stream Synthesis and Implementation

    Institute of Scientific and Technical Information of China (English)

    徐燕凌; 方向忠; 周源华

    2004-01-01

    Multiplexing of digital streams is one of the key technologies in audio-video communication, and it determines audio-video quality. A design scheme for an MPEG2-compliant digital television system including audio-video encoding and multiplexing was implemented. The principles and elements of system layer stream synthesis were analyzed. The key technologies of video and audio PES packetization were discussed, such as stream structure, scheduling matching, audio-video synchronization, data flow and buffering. DSP and FPGA are combined to construct header information and packet structure. The substitution of traditional RAM or PLD results in high operational efficiency and saves memory space. A scheduling algorithm was introduced for PES coding, using the monitor information of PES buffers. DTS is generated by the multiplexer to guarantee synchronization. The system is not only simple but also stable, and maintains the synchronization constraints of the standard. It supports both analog and digital audio-video source input, and provides real-time MPEG2-compliant TS/PS output. It has perfect performance and meets the national broadcasting requirements.

  16. High energy electromagnetic cascades in extragalactic space: physics and features

    CERN Document Server

    Berezinsky, V

    2016-01-01

    Using the analytic modeling of the electromagnetic cascades compared with more precise numerical simulations we describe the physical properties of electromagnetic cascades developing in the universe on CMB and EBL background radiations. A cascade is initiated by a very high energy photon or electron and the remnant photons at large distance have a two-component energy spectrum, $\propto E^{-2}$ ($\propto E^{-1.9}$ in numerical simulations) produced at the cascade multiplication stage, and $\propto E^{-3/2}$ from Inverse Compton electron cooling at low energies. The most noticeable property of the cascade spectrum in analytic modeling is 'strong universality', which includes the standard energy spectrum and the energy density of the cascade $\omega_{\rm cas}$ as its only numerical parameter. Using numerical simulations of the cascade spectrum and comparing it with the recent Fermi LAT spectrum we obtained an upper limit on $\omega_{\rm cas}$ stronger than in previous works. The new feature of the analysis is "$E_{\max}$...

  17. A fast one-pass-training feature selection technique for GMM-based acoustic event detection with audio-visual data

    OpenAIRE

    Butko, Taras; Nadeu Camprubí, Climent

    2010-01-01

    Acoustic event detection becomes a difficult task, even for a small number of events, in scenarios where events are produced rather spontaneously and often overlap in time. In this work, we aim to improve the detection rate by means of feature selection. Using a one-against-all detection approach, a new fast one-pass-training algorithm, and an associated highly-precise metric are developed. Choosing a different subset of multimodal features for each acoustic event class, the results obtain...

  18. Introduction to AVS Audio

    Institute of Scientific and Technical Information of China (English)

    Hao-Jun Ai; Shui-Xian Chen; Rui-Min Hu

    2006-01-01

    This paper describes a general audio coding algorithm which has been recently standardized by AVS, China. The algorithm is based on a perceptual coding technique. The codec delivers near CD-quality audio at 128 kb/s. This paper describes the coder structure in detail and discusses the reasons for specific design methods. A summary of the subjective test results is presented for the prototype codec. A Comparison Mean Opinion Score (CMOS) test indicates that the quality of the AVS audio coder is comparable with that of the MPEG Layer-3 audio coder. A real-time decoder, based on a 16-bit fixed-point DSP, was used for the characterization test. The performance of the DSP solution was demonstrated, including computational complexity and storage characteristics.

  19. Forensic audio watermark detection

    Science.gov (United States)

    Steinebach, Martin; Zmudzinski, Sascha; Petrautzki, Dirk

    2012-03-01

    Digital audio watermark detection is often computationally complex and requires at least as much audio information as is required to embed a complete watermark. In some applications, especially real-time monitoring, this is an important drawback. The reason for this is the use of sync sequences at the beginning of the watermark, allowing a decision about its presence only if at least the sync has been found and retrieved. We propose an alternative method for detecting the presence of a watermark. Based on knowledge of the secret key used for embedding, we create a mark for all potential marking stages and then use a sliding window to test a given audio file for the presence of statistical characteristics caused by embedding. In this way we can detect a watermark in less than 1 second of audio.

  20. Voice activity detection using audio-visual information

    DEFF Research Database (Denmark)

    Petsatodis, Theodore; Pnevmatikakis, Aristodemos; Boukis, Christos

    2009-01-01

    An audio-visual voice activity detector that uses sensors positioned distantly from the speaker is presented. Its constituting unimodal detectors are based on the modeling of the temporal variation of audio and visual features using Hidden Markov Models; their outcomes are fused using a post...

  1. Improving audio chord transcription by exploiting harmonic and metric knowledge

    NARCIS (Netherlands)

    de Haas, W.B.; Rodrigues Magalhães, J.P.; Wiering, F.

    2012-01-01

    We present a new system for chord transcription from polyphonic musical audio that uses domain-specific knowledge about tonal harmony and metrical position to improve chord transcription performance. Low-level pulse and spectral features are extracted from an audio source using the Vamp plugin archi

  2. Structure Learning in Audio

    OpenAIRE

    Nielsen, Andreas Brinch; Hansen, Lars Kai

    2009-01-01

    By having information about the setting a user is in, a computer is able to make decisions proactively to facilitate tasks for the user. Two approaches are taken in this thesis to achieve more information about an audio environment. One approach is that of classifying audio, and a new approach using pitch dynamics is suggested. The other approach is finding structures between the mixings of multiple sources based on an assumption of statistical independence of the sources. Three different aud...

  3. A Physiologically Inspired Method for Audio Classification

    Directory of Open Access Journals (Sweden)

    David V. Anderson

    2005-06-01

    We explore the use of physiologically inspired auditory features with both physiologically motivated and statistical audio classification methods. We use features derived from a biophysically defensible model of the early auditory system for audio classification using a neural network classifier. We also use a Gaussian-mixture-model (GMM)-based classifier for the purpose of comparison and show that the neural-network-based approach works better. Further, we use features from a more advanced model of the auditory system and show that the features extracted from this model of the primary auditory cortex perform better than the features from the early auditory stage. The features give good classification performance with only one-second data segments used for training and testing.

  4. Audio-visual classification video browser

    OpenAIRE

    Scott, David; Zhang, ZhenXing; Albatal, Rami; McGuinness, Kevin; Acar, Esra; Hopfgartner, Frank; Gurrin, Cathal; O'Connor, Noel; Smeaton, Alan

    2014-01-01

    This paper presents our third participation in the Video Browser Showdown. Building on the experience that we gained while participating in this event, we compete in the 2014 showdown with a more advanced browsing system based on incorporating several audio-visual retrieval techniques. This paper provides a short overview of the features and functionality of our new system.

  5. Using Touch Screen Audio-CASI to Obtain Data on Sensitive Topics

    OpenAIRE

    Cooley, Philip C.; Rogers, Susan M; Turner, Charles F.; Al-Tayyib, Alia A.; Willis, Gordon; Ganapathi, Laxminarayana

    2001-01-01

    This paper describes a new interview data collection system that uses a laptop personal computer equipped with a touch-sensitive video monitor. The touch-screen-based audio computer-assisted self-interviewing system, or touch screen audio-CASI, enhances the ease of use of conventional audio CASI systems while simultaneously providing the privacy of self-administered questionnaires. We describe touch screen audio-CASI design features and operational characteristics. In addition, we present dat...

  6. Virtual Audio - Three-Dimensional Audio in Virtual Environments

    OpenAIRE

    Adler, Daniel

    1996-01-01

    Three-dimensional interactive audio has a variety of potential uses in human-machine interfaces. After lagging seriously behind the visual components, the importance of sound is now becoming increasingly accepted. This paper mainly discusses background and techniques to implement three-dimensional audio in computer interfaces. A case study of a system for three-dimensional audio, implemented by the author, is described in great detail. The audio system was moreover integra...

  7. Digital Multicasting of Multiple Audio Streams

    Science.gov (United States)

    Macha, Mitchell; Bullock, John

    2007-01-01

    at the MCC. In the other access-control provision, the program verifies that the user is authorized to have access to the audio streams. Once both access-control checks are completed, the audio software presents a graphical display that includes audio-stream-selection buttons and volume-control sliders. The user can select all or any subset of the available audio streams and can adjust the volume of each stream independently of that of the other streams. The audio-player program spawns a "read" process for the selected stream(s). The spawned process sends, to the router(s), a "multicast-join" request for the selected streams. The router(s) responds to the request by sending the encrypted multicast packets to the spawned process. The spawned process receives the encrypted multicast packets and sends a decryption packet to audio-driver software. As the volume or muting features are changed by the user, interrupts are sent to the spawned process to change the corresponding attributes sent to the audio-driver software. The total latency of this system, that is, the total time from the origination of the audio signals to generation of sound at a listener's computer, lies between four and six seconds.

  8. Overall feature of EAST operation space by using simple Core-SOL-Divertor model

    International Nuclear Information System (INIS)

    We have developed a simple Core-SOL-Divertor (C-S-D) model to investigate qualitatively the overall features of the operational space for the integrated core and edge plasma. To construct the simple C-S-D model, a simple core plasma model of ITER physics guidelines and a two-point SOL-divertor model are used. The simple C-S-D model is applied to the study of the EAST operational space with lower hybrid current drive experiments under various kinds of trade-off for the basic plasma parameters. Effective methods for extending the operation space are also presented. As shown by this study for the EAST operation space, it is evident that the C-S-D model is a useful tool to understand qualitatively the overall features of the plasma operation space. (author)

  9. Perceptual Audio Hashing Functions

    Directory of Open Access Journals (Sweden)

    Emin Anarım

    2005-07-01

    Perceptual hash functions provide a tool for fast and reliable identification of content. We present new audio hash functions based on summarization of the time-frequency spectral characteristics of an audio document. The proposed hash functions are based on the periodicity series of the fundamental frequency and on singular-value description of the cepstral frequencies. They are found, on one hand, to perform very satisfactorily in identification and verification tests, and on the other hand, to be very resilient to a large variety of attacks. Moreover, we address the issue of security of hashes and propose a keying technique, and thereby a key-dependent hash function.

  10. DAFX Digital Audio Effects

    CERN Document Server

    2011-01-01

    The rapid development in various fields of Digital Audio Effects, or DAFX, has led to new algorithms and this second edition of the popular book, DAFX: Digital Audio Effects has been updated throughout to reflect progress in the field. It maintains a unique approach to DAFX with a lecture-style introduction into the basics of effect processing. Each effect description begins with the presentation of the physical and acoustical phenomena, an explanation of the signal processing techniques to achieve the effect, followed by a discussion of musical applications and the control of effect parameter

  11. Large anterior temporal Virchow-Robin spaces: unique MR imaging features

    Energy Technology Data Exchange (ETDEWEB)

    Lim, Anthony T. [Monash University, Neuroradiology Service, Monash Imaging, Monash Health, Melbourne, Victoria (Australia); Chandra, Ronil V. [Monash University, Neuroradiology Service, Monash Imaging, Monash Health, Melbourne, Victoria (Australia); Monash University, Department of Surgery, Faculty of Medicine, Nursing and Health Sciences, Melbourne (Australia); Trost, Nicholas M. [St Vincent's Hospital, Neuroradiology Service, Melbourne (Australia); McKelvie, Penelope A. [St Vincent's Hospital, Anatomical Pathology, Melbourne (Australia); Stuckey, Stephen L. [Monash University, Neuroradiology Service, Monash Imaging, Monash Health, Melbourne, Victoria (Australia); Monash University, Southern Clinical School, Faculty of Medicine, Nursing and Health Sciences, Melbourne (Australia)

    2015-05-01

    Large Virchow-Robin (VR) spaces may mimic cystic tumor. The anterior temporal subcortical white matter is a recently described preferential location, with only 18 reported cases. Our aim was to identify unique MR features that could increase prospective diagnostic confidence. Thirty-nine cases were identified between November 2003 and February 2014. Demographic, clinical data and the initial radiological report were retrospectively reviewed. Two neuroradiologists reviewed all MR imaging; a neuropathologist reviewed histological data. Median age was 58 years (range 24-86 years); the majority (69 %) was female. There were no clinical symptoms that could be directly referable to the lesion. Two thirds were considered to be VR spaces on the initial radiological report. Mean maximal size was 9 mm (range 5-17 mm); majority (79 %) had perilesional T2 or fluid-attenuated inversion recovery (FLAIR) hyperintensity. The following were identified as potential unique MR features: focal cortical distortion by an adjacent branch of the middle cerebral artery (92 %), smaller adjacent VR spaces (26 %), and a contiguous cerebrospinal fluid (CSF) intensity tract (21 %). Surgery was performed in three asymptomatic patients; histopathology confirmed VR spaces. Unique MR features were retrospectively identified in all three patients. Large anterior temporal lobe VR spaces commonly demonstrate perilesional T2 or FLAIR signal and can be misdiagnosed as cystic tumor. Potential unique MR features that could increase prospective diagnostic confidence include focal cortical distortion by an adjacent branch of the middle cerebral artery, smaller adjacent VR spaces, and a contiguous CSF intensity tract. (orig.)

  12. Large anterior temporal Virchow-Robin spaces: unique MR imaging features

    International Nuclear Information System (INIS)

    Large Virchow-Robin (VR) spaces may mimic cystic tumor. The anterior temporal subcortical white matter is a recently described preferential location, with only 18 reported cases. Our aim was to identify unique MR features that could increase prospective diagnostic confidence. Thirty-nine cases were identified between November 2003 and February 2014. Demographic, clinical data and the initial radiological report were retrospectively reviewed. Two neuroradiologists reviewed all MR imaging; a neuropathologist reviewed histological data. Median age was 58 years (range 24-86 years); the majority (69 %) was female. There were no clinical symptoms that could be directly referable to the lesion. Two thirds were considered to be VR spaces on the initial radiological report. Mean maximal size was 9 mm (range 5-17 mm); majority (79 %) had perilesional T2 or fluid-attenuated inversion recovery (FLAIR) hyperintensity. The following were identified as potential unique MR features: focal cortical distortion by an adjacent branch of the middle cerebral artery (92 %), smaller adjacent VR spaces (26 %), and a contiguous cerebrospinal fluid (CSF) intensity tract (21 %). Surgery was performed in three asymptomatic patients; histopathology confirmed VR spaces. Unique MR features were retrospectively identified in all three patients. Large anterior temporal lobe VR spaces commonly demonstrate perilesional T2 or FLAIR signal and can be misdiagnosed as cystic tumor. Potential unique MR features that could increase prospective diagnostic confidence include focal cortical distortion by an adjacent branch of the middle cerebral artery, smaller adjacent VR spaces, and a contiguous CSF intensity tract. (orig.)

  13. Efectos digitales de audio con Web Audio API

    OpenAIRE

    GARCÍA CHAPARRO, SAMUEL

    2015-01-01

    This work studies the capability of the Web Audio API for processing audio effects in real time. Of all the possible audio effects, the wah-wah, the flanger, and the chorus were chosen, effects widely used with electric guitar. JavaScript functions that model the behavior of the chosen audio effects are created and run on an HTML5 web platform. García Chaparro, S. (2015). Efectos digitales de audio con W...

  14. The audio expert everything you need to know about audio

    CERN Document Server

    Winer, Ethan

    2012-01-01

    The Audio Expert is a comprehensive reference that covers all aspects of audio, with many practical, as well as theoretical, explanations. Providing in-depth descriptions of how audio really works, using common sense plain-English explanations and mechanical analogies with minimal math, the book is written for people who want to understand audio at the deepest, most technical level, without needing an engineering degree. It's presented in an easy-to-read, conversational tone, and includes more than 400 figures and photos augmenting the text. The Audio Expert takes th

  15. Audio Feedback -- Better Feedback?

    Science.gov (United States)

    Voelkel, Susanne; Mello, Luciane V.

    2014-01-01

    National Student Survey (NSS) results show that many students are dissatisfied with the amount and quality of feedback they get for their work. This study reports on two case studies in which we tried to address these issues by introducing audio feedback to one undergraduate (UG) and one postgraduate (PG) class, respectively. In case study one…

  16. Embedded Audio Without Beeps

    DEFF Research Database (Denmark)

    Overholt, Daniel; Møbius, Nikolaj Friis

    2014-01-01

    software environments for audio processing) via innovative interfaces that send real-time inputs to such software running on a laptop, mobile device, or small Linux board (e.g., Raspberry Pi or Beagleboard). Basic hardware will be provided, but participants are also encouraged to bring related equipment...

  17. Feature-space clustering for fMRI meta-analysis

    DEFF Research Database (Denmark)

    Goutte, C.; Hansen, L.K.; Liptrot, Matthew George;

    2001-01-01

    MRI sequences containing several hundreds of images, it is sometimes necessary to invoke feature extraction to reduce the dimensionality of the data space. A second interesting application is in the meta-analysis of fMRI experiments, where features are obtained from a possibly large number of single-voxel......, shows interesting differences between individual voxel analysis performed with traditional methods. © 2001 Wiley-Liss, Inc....

  18. FPGA implementation of DWT for Audio Watermarking Application

    Directory of Open Access Journals (Sweden)

    Naveen.S.Hampannavar

    2013-06-01

    Digital watermarking is a technique for embedding extra information into multimedia content, which can be extracted to prove copyright. The human auditory system is more sensitive than the human visual system. As a result, very few audio watermarking algorithms have been both robust and imperceptible. In this paper we implement audio watermarking using the discrete wavelet transform (DWT). An audio signal in the form of a .wav file is decomposed into multi-level DWT coefficients. A watermark signal is embedded in the final-level coefficients. The audio is reconstructed from the embedded coefficients using the inverse DWT. The simulation results will be verified by comparing the watermarked audio with the original audio for perceptibility. The watermarked audio will be tested for its robustness in retaining the watermark. Computation of the DWT involves a large number of arithmetic operations. Hence, a hardware chip for the same would help in achieving real-time performance, low power consumption and lower area utilization. This hardware implementation through FPGA eases the integration of the watermarking feature with existing audio electronic systems.
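
    The decompose, embed, and reconstruct flow described above can be sketched in software as follows. This is only an illustrative sketch under stated assumptions, not the paper's FPGA design: the Haar wavelet, the additive embedding rule, the strength `alpha`, the non-blind extraction that compares against the original audio, and all function names are hypothetical choices, since the abstract does not specify them.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_step(signal):
    """One level of the orthonormal Haar DWT (assumed wavelet)."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert one Haar level, interleaving the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def decompose(signal, levels):
    """Multi-level decomposition: returns final approximation and all details."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def reconstruct(approx, details):
    """Inverse DWT, undoing the levels from last to first."""
    for detail in reversed(details):
        approx = haar_inverse_step(approx, detail)
    return approx


def embed(signal, bits, levels=2, alpha=0.05):
    """Add +alpha (bit 1) or -alpha (bit 0) to final-level detail coefficients."""
    approx, details = decompose(signal, levels)
    final = details[-1]
    for i, bit in enumerate(bits):
        final[i] += alpha if bit else -alpha
    return reconstruct(approx, details)


def extract(marked, original, n_bits, levels=2):
    """Non-blind extraction: sign of the coefficient difference gives each bit."""
    _, d_marked = decompose(marked, levels)
    _, d_orig = decompose(original, levels)
    pairs = zip(d_marked[-1][:n_bits], d_orig[-1][:n_bits])
    return [1 if m - o > 0 else 0 for m, o in pairs]


# Hypothetical host signal and watermark bits.
host = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
bits = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed(host, bits)
recovered = extract(marked, host, len(bits))
```

    Because the Haar transform used here is orthonormal, a change of `alpha` in one second-level coefficient spreads over four samples at `alpha / 2` each, which gives a rough handle on the trade-off between perceptibility and robustness that the abstract mentions.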

  19. An alternative to scale-space representation for extracting local features in image recognition

    DEFF Research Database (Denmark)

    Andersen, Hans Jørgen; Nguyen, Phuong Giang

    2012-01-01

    In image recognition, the common approach for extracting local features using a scale-space representation has usually three main steps; first interest points are extracted at different scales, next from a patch around each interest point the rotation is calculated with corresponding orientation...

  20. Processing features of audio and video files

    Directory of Open Access Journals (Sweden)

    E. N. Vydalko

    2012-11-01

    The use of analog videotape recorders is now a thing of the past. Digital video recording has therefore become relevant and attractive for users who put image quality above all else. It is important to record video in digital format without converting the digital signal into an analog signal, since that conversion leads to a significant loss of recording quality. This article describes a program that processes the video stream of digital cable TV. It can also convert the video stream of digital cable TV into a format that can easily be used by any computer or DVD player in digital form.

  1. High-Order Sparse Linear Predictors for Audio Processing

    DEFF Research Database (Denmark)

    Giacobello, Daniele; van Waterschoot, Toon; Christensen, Mads Græsbøll;

    2010-01-01

    of interesting features that make the idea of using it in audio processing not far fetched, e.g., the strong ability of modeling the spectral peaks that play a dominant role in perception. In this paper, we provide some preliminary conjectures and experiments on the use of high-order sparse linear predictors......Linear prediction has generally failed to make a breakthrough in audio processing, as it has done in speech processing. This is mostly due to its poor modeling performance, since an audio signal is usually an ensemble of different sources. Nevertheless, linear prediction comes with a whole set...

  2. Feature selection by separability assessment of input spaces for transient stability classification based on neural networks

    Energy Technology Data Exchange (ETDEWEB)

    Tso, S.K. [City University of Hong Kong (China). Dept. of Manufacturing Engineering; Gu, X.P. [North China Electric Power University, Baoding (China). Dept. of Electrical Engineering

    2004-03-01

    Power system transient-stability assessment based on neural networks can usually be treated as a two-pattern classification problem separating the stable class from the unstable class. In such a classification problem, the feature extraction and selection is the first important task to be carried out. A new approach of feature selection is presented using a new separability measure in this paper. Through finding the 'inconsistent cases' in a sample set, a separability index of input spaces is defined. Using the defined separability index as criterion, the breadth-first searching technique is employed to find the minimal or optimal subsets of the initial feature set. The numerical results based on extensive data obtained for the 10-unit 39-bus New England power system demonstrate the effectiveness of the proposed approach in extracting the 'best combination' of features for improving the quality of transient-stability classification. (author)

  3. Digital Audio Watermarking: An Overview

    OpenAIRE

    Bhuvnesh Kumar Singh; Alok Kumar Singh

    2013-01-01

    Digital watermarking is a very recent research area. Digital audio watermarking is a method to embed or hide the watermark (information signal) into a digital signal, i.e. image, audio, text or video data. The watermark is difficult to remove from the audio signal. If the signal is copied, the information or watermark is also carried in the copy. A signal may carry several different watermarks at the same time. It is used to protect multimedia data from unauthorized copying, piracy, ownersh...

  4. Self-Healing Audio System

    OpenAIRE

    Sharma, Shubham; Sridhar, Aditya; Krishnia, Jai Prakash

    2015-01-01

    Installed sound applications typically involve a large number of audio processors, amplifiers and speaker systems spread across the venue. They could be spatially distributed at the venue across different rack rooms and floors. These systems are commissioned and configured by sound engineers using software application(s). This is essentially a one-time activity, following which, the audio systems run independently. Detection of faults and reconfiguration of any audio device(s) that fail(s) is...

  5. Quality and Distortion Evaluation of Audio Signal by Spectrum

    Directory of Open Access Journals (Sweden)

    Er. Niranjan Singh

    2012-02-01

    Full Text Available Information hiding in digital audio can be used for such diverse applications as proof of ownership, authentication, integrity, secret communication, broadcast monitoring and event annotation. To achieve secure and undetectable communication, stegano-objects, i.e. documents containing a secret message, should be indistinguishable from cover-objects, i.e. documents not containing any secret message. In this respect, steganalysis is the set of techniques that aim to distinguish between cover-objects and stegano-objects [1]. A cover audio object can be converted into a stegano-audio object via steganographic methods. In this paper we present a statistical method to detect the presence of hidden messages in audio signals. The basic idea is that the distributions of various statistical distance measures, calculated on cover audio signals and on stegano-audio signals vis-à-vis their de-noised versions, are statistically different. A distortion metric based on the signal spectrum was designed specifically to detect modifications and additions to audio media. We used the signal spectrum to measure the distortion. The distortion measurement was obtained at various wavelet decomposition levels, from which we derived high-order statistics as features for a classifier to determine the presence of hidden information in an audio signal. This paper looking at evidence in a criminal case probably has no reason to alter any evidence files. However, it is part of an ongoing terrorist surveillance might well want to disrupt the hidden information, even if it cannot be recovered.

  6. A Comparative Study on Two Techniques of Reducing the Dimension of Text Feature Space

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    With the development of large-scale text processing, the dimension of text feature space has become larger and larger, which has added a lot of difficulties to natural language processing. How to reduce the dimension has become a practical problem in the field. Here we present two clustering methods, i.e. concept association and concept abstract, to achieve the goal. The first refers to keyword clustering based on the co-occurrence of keywords in the same text, and the second refers to keyword clustering based on co-occurrence in the same category. Then we compare the difference between them. Our experimental results show that they are effective in reducing the dimension of text feature space.

  7. Beyond podcasting: creative approaches to designing educational audio

    Directory of Open Access Journals (Sweden)

    Andrew Middleton

    2009-12-01

    Full Text Available This paper discusses a university-wide pilot designed to encourage academics to creatively explore learner-centred applications for digital audio. Participation in the pilot was diverse in terms of technical competence, confidence and contextual requirements and there was little prior experience of working with digital audio. Many innovative approaches were taken to using audio in a blended context including student-generated vox pops, audio feedback models, audio conversations and task-setting. A podcast was central to the pilot itself, providing a common space for the 25 participants, who were also supported by materials in several other formats. An analysis of podcast interviews involving pilot participants provided the data informing this case study. This paper concludes that audio has the potential to promote academic creativity in engaging students through media intervention. However, institutional scalability is dependent upon the availability of suitable timely support mechanisms that can address the lack of technical confidence evident in many staff. If that is in place, audio can be widely adopted by anyone seeking to add a new layer of presence and connectivity through the use of voice.

  8. Small signal audio design

    CERN Document Server

    Self, Douglas

    2014-01-01

    Learn to use inexpensive and readily available parts to obtain state-of-the-art performance in all the vital parameters of noise, distortion, crosstalk and so on. With ample coverage of preamplifiers and mixers and a new chapter on headphone amplifiers, this practical handbook provides an extensive repertoire of circuits that can be put together to make almost any type of audio system. A resource packed full of valuable information, with virtually every page revealing nuggets of specialized knowledge not found elsewhere. Essential points of theory that bear on practical performance are lucidly

  9. Advances in audio watermarking based on singular value decomposition

    CERN Document Server

    Dhar, Pranab Kumar

    2015-01-01

    This book introduces audio watermarking methods for copyright protection, which has drawn extensive attention for securing digital data from unauthorized copying. The book is divided into two parts. First, an audio watermarking method in discrete wavelet transform (DWT) and discrete cosine transform (DCT) domains using singular value decomposition (SVD) and quantization is introduced. This method is robust against various attacks and provides good imperceptible watermarked sounds. Then, an audio watermarking method in fast Fourier transform (FFT) domain using SVD and Cartesian-polar transformation (CPT) is presented. This method has high imperceptibility and high data payload and it provides good robustness against various attacks. These techniques allow media owners to protect copyright and to show authenticity and ownership of their material in a variety of applications. · Features new methods of audio watermarking for copyright protection and ownership protection · Outl...

  10. Digital audio and video broadcasting by satellite

    Science.gov (United States)

    Yoshino, Takehiko

    In parallel with the progress of the practical use of satellite broadcasting and Hi-Vision or high-definition television technologies, research activities are also in progress to replace the conventional analog broadcasting services with a digital version. What we call 'digitalization' is not a mere technical matter but an important subject which will help promote multichannel or multimedia applications and, accordingly, can change the old concept of mass media, such as television or radio. NHK Science and Technical Research Laboratories has promoted studies of digital bandwidth compression, transmission, and application techniques. The following topics are covered: the trend of digital broadcasting; features of Integrated Services Digital Broadcasting (ISDB); compression encoding and transmission; transmission bit rate in 12 GHz band; number of digital TV transmission channels; multichannel pulse code modulation (PCM) audio broadcasting system via communication satellite; digital Hi-Vision broadcasting; and development of digital audio broadcasting (DAB) for mobile reception in Japan.

  11. Efficient Audio Power Amplification - Challenges

    DEFF Research Database (Denmark)

    Andersen, Michael Andreas E.

    2005-01-01

    For more than a decade efficient audio power amplification has evolved, and today switch-mode audio power amplification in various forms is the state of the art. The technical steps that led to this evolution are described and in addition many of the challenges still to be faced and where extens...

  12. Efficient audio power amplification - challenges

    Energy Technology Data Exchange (ETDEWEB)

    Andersen, Michael A.E.

    2005-07-01

    For more than a decade efficient audio power amplification has evolved, and today switch-mode audio power amplification in various forms is the state of the art. The technical steps that led to this evolution are described, and many of the challenges still to be faced, where extensive research and development are needed, are covered. (au)

  13. A New Ensemble Method with Feature Space Partitioning for High-Dimensional Data Classification

    Directory of Open Access Journals (Sweden)

    Yongjun Piao

    2015-01-01

    Full Text Available Ensemble data mining methods, also known as classifier combination, are often used to improve the performance of classification. Various classifier combination methods such as bagging, boosting, and random forest have been devised and have received considerable attention in the past. However, data dimensionality increases rapidly day by day. Such a trend poses various challenges as these methods are not suitable to directly apply to high-dimensional datasets. In this paper, we propose an ensemble method for classification of high-dimensional data, with each classifier constructed from a different set of features determined by partitioning of redundant features. In our method, the redundancy of features is considered to divide the original feature space. Then, each generated feature subset is trained by a support vector machine, and the results of each classifier are combined by majority voting. The efficiency and effectiveness of our method are demonstrated through comparisons with other ensemble techniques, and the results show that our method outperforms other methods.

  14. Nanoscale Analysis of Space-Weathering Features in Soils from Itokawa

    Science.gov (United States)

    Thompson, M. S.; Christoffersen, R.; Zega, T. J.; Keller, L. P.

    2014-01-01

    Space weathering alters the spectral properties of airless body surface materials by reddening and darkening their spectra and attenuating characteristic absorption bands, making it challenging to characterize them remotely [1,2]. It also causes a discrepancy between laboratory analysis of meteorites and remotely sensed spectra from asteroids, making it difficult to associate meteorites with their parent bodies. The mechanisms driving space weathering include micrometeorite impacts and the interaction of surface materials with solar energetic ions, particularly the solar wind. These processes continuously alter the microchemical and structural characteristics of exposed grains on airless bodies. The change of these properties is caused predominantly by the vapor deposition of reduced Fe and FeS nanoparticles (npFe^0 and npFeS respectively) onto the rims of surface grains [3]. Sample-based analysis of space weathering has traditionally been limited to lunar soils and select asteroidal and lunar regolith breccias [3-5]. With the return of samples from the Hayabusa mission to asteroid Itokawa [6], for the first time we are able to compare space-weathering features on returned surface soils from a known asteroidal body. Analysis of these samples will contribute to a more comprehensive model for how space weathering varies across the inner solar system. Here we report detailed microchemical and microstructural analysis of surface grains from Itokawa.

  15. Prediction of protein structure classes using hybrid space of multi-profile Bayes and bi-gram probability feature spaces.

    Science.gov (United States)

    Hayat, Maqsood; Tahir, Muhammad; Khan, Sher Afzal

    2014-04-01

    Proteins are the executants of biological functions in living organisms. Comprehension of protein structure is a challenging problem in the era of proteomics, computational biology, and bioinformatics because of its pivotal role in protein folding patterns. Owing to the large exploration of protein sequences in protein databanks and intricacy of protein structures, experimental and theoretical methods are insufficient for prediction of protein structure classes. Therefore, it is highly desirable to develop an accurate, reliable, and high throughput computational model to predict protein structure classes correctly from polygenetic sequences. In this regard, we propose a promising model employing hybrid descriptor space in conjunction with optimized evidence-theoretic K-nearest neighbor algorithm. Hybrid space is the composition of two descriptor spaces including Multi-profile Bayes and bi-gram probability. In order to enhance the generalization power of the classifier, we have selected high discriminative descriptors from the hybrid space using particle swarm optimization, a well-known evolutionary feature selection technique. Performance evaluation of the proposed model is performed using the jackknife test on three low similarity benchmark datasets including 25PDB, 1189, and 640. The success rates of the proposed model are 87.0%, 86.6%, and 88.4%, respectively on the three benchmark datasets. The comparative analysis exhibits that our proposed model has yielded promising results compared to the existing methods in the literature. In addition, our proposed prediction system might be helpful in future research particularly in cases where the major focus of research is on low similarity datasets. PMID:24384128

  16. Audio Watermarking with Error Correction

    CERN Document Server

    Chadha, Aman; Goel, Rishabh; Dave, Hiren; Roja, M Mani

    2011-01-01

    In recent times, communication through the internet has tremendously facilitated the distribution of multimedia data. Although this is indubitably a boon, one of its repercussions is that it has also given impetus to the notorious issue of online music piracy. Unethical attempts can also be made to deliberately alter such copyrighted data and thus, misuse it. Copyright violation by means of unauthorized distribution, as well as unauthorized tampering of copyrighted audio data is an important technological and research issue. Audio watermarking has been proposed as a solution to tackle this issue. The main purpose of audio watermarking is to protect against possible threats to the audio data and in case of copyright violation or unauthorized tampering, authenticity of such data can be disputed by virtue of audio watermarking.

  17. Audio Watermarking with Error Correction

    Directory of Open Access Journals (Sweden)

    Aman Chadha

    2011-09-01

    Full Text Available In recent times, communication through the internet has tremendously facilitated the distribution of multimedia data. Although this is indubitably a boon, one of its repercussions is that it has also given impetus to the notorious issue of online music piracy. Unethical attempts can also be made to deliberately alter such copyrighted data and thus, misuse it. Copyright violation by means of unauthorized distribution, as well as unauthorized tampering of copyrighted audio data is an important technological and research issue. Audio watermarking has been proposed as a solution to tackle this issue. The main purpose of audio watermarking is to protect against possible threats to the audio data and in case of copyright violation or unauthorized tampering, authenticity of such data can be disputed by virtue of audio watermarking.

  18. Assessing efficiency of spatial sampling using combined coverage analysis in geographical and feature spaces

    Science.gov (United States)

    Hengl, Tomislav

    2015-04-01

    Efficiency of spatial sampling largely determines success of model building. This is especially important for geostatistical mapping where an initial sampling plan should provide a good representation or coverage of both geographical (defined by the study area mask map) and feature space (defined by the multi-dimensional covariates). Otherwise the model will need to extrapolate and, hence, the overall uncertainty of the predictions will be high. In many cases, geostatisticians use point data sets which are produced using unknown or inconsistent sampling algorithms. Many point data sets in environmental sciences suffer from spatial clustering and systematic omission of feature space. But how to quantify these 'representation' problems and how to incorporate this knowledge into model building? The author has developed a generic function called 'spsample.prob' (Global Soil Information Facilities package for R) which simultaneously determines (effective) inclusion probabilities as an average between the kernel density estimation (geographical spreading of points; analysed using the spatstat package in R) and MaxEnt analysis (feature space spreading of points; analysed using the MaxEnt software used primarily for species distribution modelling). The output 'iprob' map indicates whether the sampling plan has systematically missed some important locations and/or features, and can also be used as an input for geostatistical modelling e.g. as a weight map for geostatistical model fitting. The spsample.prob function can also be used in combination with the accessibility analysis (costs of field survey are usually a function of distance from the road network, slope and land cover) to allow for simultaneous maximization of average inclusion probabilities and minimization of total survey costs. The author postulates that, by estimating effective inclusion probabilities using combined geographical and feature space analysis, and by comparing survey costs to representation

  19. Machine learning approaches for discrimination of Extracellular Matrix proteins using hybrid feature space.

    Science.gov (United States)

    Ali, Farman; Hayat, Maqsood

    2016-08-21

    Extracellular Matrix (ECM) proteins are a vital type of protein secreted by resident cells. ECM proteins perform several significant functions including adhesion, differentiation, cell migration and proliferation. In addition, ECM proteins regulate the angiogenesis process, embryonic development, tumor growth and gene expression. Due to the tremendous biological significance of ECM proteins and the rapid increase of protein sequences in databases, it is indispensable to introduce a new high throughput computational model that can accurately identify ECM proteins. Various traditional models have been developed, but they are laborious and tedious. In this work, an effective and high throughput computational classification model is proposed for discrimination of ECM proteins. In this model, protein sequences are formulated using amino acid composition, pseudo amino acid composition (PseAAC) and di-peptide composition (DPC) techniques. Further, various combinations of feature extraction techniques are fused to form hybrid feature spaces. Several classifiers were employed. Among these classifiers, K-Nearest Neighbor obtained outstanding performance in combination with the hybrid feature space of PseAAC and DPC. The obtained accuracy of our proposed model is 96.76%, which is the highest success rate reported in the literature so far. PMID:27179459
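
    To make the notion of a hybrid feature space concrete, here is a minimal sketch that builds amino acid composition (20 values) and di-peptide composition (400 values) for a sequence and concatenates them. It is an illustration only: the paper's best-performing hybrid used PseAAC with DPC, and PseAAC is not implemented here because it needs physicochemical parameters the abstract does not list. The function names are hypothetical, and sequences are assumed to be non-empty strings over the 20 standard residues.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Fixed ordering of the 400 possible adjacent residue pairs.
DIPEPTIDE_INDEX = {a + b: k
                   for k, (a, b) in enumerate(product(AMINO_ACIDS, repeat=2))}


def amino_acid_composition(seq):
    """Relative frequency of each of the 20 standard residues."""
    seq = seq.upper()
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Relative frequency of each adjacent residue pair (400 values)."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    vec = [0.0] * len(DIPEPTIDE_INDEX)
    for pair in pairs:
        if pair in DIPEPTIDE_INDEX:  # skip pairs with non-standard residues
            vec[DIPEPTIDE_INDEX[pair]] += 1.0 / len(pairs)
    return vec


def hybrid_features(seq):
    """Concatenate the two descriptors into one 420-dimensional vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)
```

    A vector built this way could then be passed to any distance-based classifier such as k-nearest neighbor; in practice the two blocks may need rescaling so that neither dominates the distance.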

  20. Digital Audio Collections

    Directory of Open Access Journals (Sweden)

    Jason Tenter

    2010-11-01

    Full Text Available

    This paper is about the possibility of libraries creating digital music or audio collections based on the current state of the digital music industry, and in comparison with the difficulties librarians have found in adding e-books to collections. In comparing the e-book and digital music markets, factors such as digital rights management (DRM) and the differences in both markets’ relationships with customers are examined. This juxtaposition suggests that where e-books have been difficult to include in library collections because publishers want to maintain control over their content, music publishers have had to resign some of the control over their products because of file-sharing, and so may work with libraries to develop these collections in a more constructive way than e-book vendors. At the end of the paper, some models are suggested for developing these collections.

  1. Features of human skin in HSV color space and new recognition parameter

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    Features of human skin in HSV color space are widely applied in the area of image retrieval based on content. H is selected as the basic recognition parameter because its value has a narrow range for the skin color and can keep stable while the illumination intensity or the curvature of skin surface is changing. Rules of parameters with the change of illumination in HSV color space are studied. It is firstly found that the mean of saturation and value (S+V)/2 can keep stable when the illumination intensity is changed or the skin surface is inflected, and (S+V)/2 changes with skin color, but the tendency of change is contrary to that of H. Therefore, (S+V)/H can be used as a new recognition parameter which can enhance HSV ability to recognize human skin.
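
    The three quantities discussed in this abstract can be computed per pixel as in the sketch below. It assumes 8-bit RGB input and uses the standard-library `colorsys` scaling, in which H, S and V all lie in [0, 1]; the paper may use a different scale for H (for example degrees), which would change the numeric value of (S+V)/H. The function name is hypothetical, and no skin thresholds are given because the abstract does not state any.

```python
import colorsys


def skin_parameters(r, g, b):
    """Return (H, (S+V)/2, (S+V)/H) for one 8-bit RGB pixel.

    H, S and V are in [0, 1] as produced by colorsys.rgb_to_hsv.
    The ratio is reported as infinity when H is exactly zero.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    mean_sv = (s + v) / 2.0
    ratio = (s + v) / h if h > 0 else float("inf")
    return h, mean_sv, ratio
```

    For example, `skin_parameters(200, 100, 50)` gives a hue of 1/18 on this scale, since the pixel sits a third of the way from red to yellow.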

  2. Between technical features and analytic capabilities: Charting a relational affordance space for digital social analytics

    Directory of Open Access Journals (Sweden)

    Anders Koed Madsen

    2015-01-01

    Full Text Available Digital social analytics is a subset of Big Data methods that is used to understand the social environment in which people and organizations have to act. This paper presents an analysis of eight projects that are experimenting with the use of these methods for various purposes. It shows that two specific technological features influence the work with such methods in all the cases. The first concerns the need to distribute choices about the structure of data to third-party actors and the second concerns the need to balance machine intelligence and human intuition when automating the analysis. These features set specific conditions for knowledge production, and the paper identifies two opposite approaches for engaging with each of these conditions. These features and approaches are finally combined into a two-dimensional affordance space that illustrates how there is flexibility in the way project leaders interact with the features of the data environment. It thereby also shows how digital social analytics come to have different affordances for different projects.

  3. Supervised pixel classification using a feature space derived from an artificial visual system

    Science.gov (United States)

    Baxter, Lisa C.; Coggins, James M.

    1991-01-01

    Image segmentation involves labelling pixels according to their membership in image regions. This requires the understanding of what a region is. Using supervised pixel classification, the paper investigates how groups of pixels labelled manually according to perceived image semantics map onto the feature space created by an Artificial Visual System. The multiscale structure of regions is investigated and it is shown that pixels form clusters based on their geometric roles in the image intensity function, not by image semantics. A tentative abstract definition of a 'region' is proposed based on this behavior.

  4. Synthecology: sound use of audio in teleimmersion

    Science.gov (United States)

    Baum, Geoffrey; Gotsis, Marientina; Chang, Benjamin; Drinkwater, Robb; St. Clair, Dan

    2006-02-01

    This paper examines historical audio applications used to provide real-time immersive sound for CAVE™ environments and discusses their relative strengths and weaknesses. We examine and explain issues of providing spatialized sound immersion in real-time virtual environments (VEs), some problems with currently used sound servers, and a set of requirements for an 'ideal' sound server. We present the initial configuration of a new cross-platform sound server solution using open source software and the Open Sound Control (OSC) specification for the creation of real-time spatialized audio with CAVE applications, specifically Ygdrasil (Yg) environments. The application, aNother Sound Server (NSS), establishes an application interface (API) using OSC, a logical server layer implemented in Python, and an audio engine using SuperCollider (SC). We discuss spatialization implementation and other features. Finally, we document the Synthecology project which premiered at WIRED NEXTFEST 2005 and was the first VE to use NSS. We also discuss various techniques that enhance presence in networked VEs, as well as possible and planned extensions of NSS.

  5. Digital Audio Watermarking: An Overview

    Directory of Open Access Journals (Sweden)

    Bhuvnesh Kumar Singh

    2013-10-01

    Full Text Available Digital watermarking is a very recent research area. Digital audio watermarking is a method to embed or hide the watermark (information signal) into a digital signal, i.e. image, audio, text or video data. The watermark is difficult to remove from the audio signal. If the signal is copied, the information or watermark is also carried in the copy. A signal may carry several different watermarks at the same time. It is used to protect multimedia data from unauthorized copying, piracy, ownership, inventions, authentication etc. In this paper we present the watermarking methods and applications.

  6. Robust Audio Watermarking Based on Log-Polar Frequency Index

    Science.gov (United States)

    Yang, Rui; Kang, Xiangui; Huang, Jiwu

    In this paper, we analyze the audio signal distortions introduced by pitch-scaling, random cropping and DA/AD conversion, and find a robust feature, average Fourier magnitude over the log-polar frequency index (AFM), which can resist these attacks. Theoretical analysis and extensive experiments demonstrate that AFM is an appropriate embedding region for robust audio watermarking. This is the first work on applying log-polar mapping to audio watermarking. The usage of log-polar mapping in our work is basically different from the existing works in image watermarking. The log-polar mapping is only applied to the frequency index, not to the transform coefficients, which avoids the reconstruction distortion of the inverse log-polar transform and reduces the computation cost. Compared with the existing methods, the proposed AFM-based watermarking scheme has outstanding performance in resisting pitch-scaling and random cropping, as well as good robustness to DA/AD conversion and TSM (Time-Scale Modification). The watermarked audio achieves high auditory quality. Experimental results show that the scheme is very robust to common audio signal processing and distortions introduced in Stirmark for Audio.

  7. Virtual environment interaction through 3D audio by blind children.

    Science.gov (United States)

    Sánchez, J; Lumbreras, M

    1999-01-01

    Interactive software is actively used for learning, cognition, and entertainment purposes. Educational entertainment software is not very popular among blind children because most computer games and electronic toys have interfaces that are only accessible through visual cues. This work applies the concept of interactive hyperstories to blind children. Hyperstories are implemented in a 3D acoustic virtual world. In past studies we have conceptualized a model to design hyperstories. This study illustrates the feasibility of the model. It also provides an introduction to researchers to the field of entertainment software for blind children. As a result, we have designed and field tested AudioDoom, a virtual environment interacted through 3D Audio by blind children. AudioDoom is also a software that enables testing nontrivial interfaces and cognitive tasks with blind children. We explored the construction of cognitive spatial structures in the minds of blind children through audio-based entertainment and spatial sound navigable experiences. Children playing AudioDoom were exposed to first person experiences by exploring highly interactive virtual worlds through the use of 3D aural representations of the space. This experience was structured in several cognitive tasks where they had to build concrete models of their spatial representations constructed through the interaction with AudioDoom by using Lego™ blocks. We analyze our preliminary results after testing AudioDoom with Chilean children from a school for blind children. We discuss issues such as interactivity in software without visual cues, the representation of spatial sound navigable experiences, and entertainment software such as computer games for blind children. We also evaluate the feasibility to construct virtual environments through the design of dynamic learning materials with audio cues.

  8. Modeling Audio Fingerprints: Structure, Distortion, Capacity

    NARCIS (Netherlands)

    Doets, P.J.O.

    2010-01-01

    An audio fingerprint is a compact low-level representation of a multimedia signal. An audio fingerprint can be used to identify audio files or fragments in a reliable way. The use of audio fingerprints for identification consists of two phases. In the enrollment phase known content is fingerprinted,

  9. Tag Based Audio Search Engine

    Directory of Open Access Journals (Sweden)

    Parameswaran Vellachu

    2012-03-01

    Full Text Available The volume of the music database is increasing day by day. Getting the required song as per the choice of the listener is a big challenge. Hence, it is really hard to manage this huge quantity, in terms of searching and filtering, through the music database. It is surprising to see that the audio and music industry still relies on very simplistic metadata to describe music files. However, while searching audio resources, an efficient "Tag Based Audio Search Engine" is necessary. The current research focuses on two aspects of musical databases: 1. Tag Based Semantic Annotation Generation using the tag based approach. 2. An audio search engine with which the user can retrieve songs based on the user's choice. The proposed method can be used to annotate and retrieve songs based on musical instruments used, mood of the song, theme of the song, singer, music director, artist, film director, instrument, genre or style and so on.

  10. A centralized audio presentation manager

    Energy Technology Data Exchange (ETDEWEB)

    Papp, A.L. III; Blattner, M.M.

    1994-05-16

    The centralized audio presentation manager addresses the problems which occur when multiple programs running simultaneously attempt to use the audio output of a computer system. Time dependence of sound means that certain auditory messages must be scheduled simultaneously, which can lead to perceptual problems due to psychoacoustic phenomena. Furthermore, the combination of speech and nonspeech audio is examined; each presents its own problems of perceptibility in an acoustic environment composed of multiple auditory streams. The centralized audio presentation manager receives abstract parameterized message requests from the currently running programs, and attempts to create and present a sonic representation in the most perceptible manner through the use of a theoretically and empirically designed rule set.

  11. ENERGY STAR Certified Audio Video

    Data.gov (United States)

    U.S. Environmental Protection Agency — Certified models meet all ENERGY STAR requirements as listed in the Version 3.0 ENERGY STAR Program Requirements for Audio Video Equipment that are effective as of...

  12. Development of Sensitivity to Spacing Versus Feature Changes in Pictures of Houses: Evidence for Slow Development of a General Spacing Detection Mechanism?

    Science.gov (United States)

    Robbins, Rachel A.; Shergill, Yaadwinder; Maurer, Daphne; Lewis, Terri L.

    2011-01-01

    Adults are expert at recognizing faces, in part because of exquisite sensitivity to the spacing of facial features. Children are poorer than adults at recognizing facial identity and less sensitive to spacing differences. Here we examined the specificity of the immaturity by comparing the ability of 8-year-olds, 14-year-olds, and adults to…

  13. Audio-Visual Speech Recognition Using Lip Information Extracted from Side-Face Images

    Directory of Open Access Journals (Sweden)

    Koji Iwano

    2007-03-01

    Full Text Available This paper proposes an audio-visual speech recognition method using lip information extracted from side-face images as an attempt to increase noise robustness in mobile environments. Our proposed method assumes that lip images can be captured using a small camera installed in a handset. Two different kinds of lip features, lip-contour geometric features and lip-motion velocity features, are used individually or jointly, in combination with audio features. Phoneme HMMs modeling the audio and visual features are built based on the multistream HMM technique. Experiments conducted using Japanese connected digit speech contaminated with white noise in various SNR conditions show effectiveness of the proposed method. Recognition accuracy is improved by using the visual information in all SNR conditions. These visual features were confirmed to be effective even when the audio HMM was adapted to noise by the MLLR method.
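
    In multistream HMMs of this kind, the per-state observation score is commonly formed as a weighted sum of the audio and visual log-likelihoods. The sketch below shows that combination only; the abstract does not state the weights used, so `audio_weight` and the constraint that the two weights sum to one are assumptions, and the function name is hypothetical.

```python
def multistream_log_likelihood(audio_ll, visual_ll, audio_weight):
    """Combine per-stream log-likelihoods for one HMM state and frame.

    Equivalent to b(o) = b_audio(o_a) ** w * b_visual(o_v) ** (1 - w)
    in the probability domain, with w = audio_weight in [0, 1].
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    return audio_weight * audio_ll + (1.0 - audio_weight) * visual_ll
```

    Lowering `audio_weight` as the SNR drops shifts the decision toward the visual stream, which is the usual motivation for this form of fusion in noisy conditions.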

  14. Audio-Visual Speech Recognition Using Lip Information Extracted from Side-Face Images

    Directory of Open Access Journals (Sweden)

    Iwano Koji

    2007-01-01

    Full Text Available This paper proposes an audio-visual speech recognition method using lip information extracted from side-face images as an attempt to increase noise robustness in mobile environments. Our proposed method assumes that lip images can be captured using a small camera installed in a handset. Two different kinds of lip features, lip-contour geometric features and lip-motion velocity features, are used individually or jointly, in combination with audio features. Phoneme HMMs modeling the audio and visual features are built based on the multistream HMM technique. Experiments conducted using Japanese connected digit speech contaminated with white noise in various SNR conditions show effectiveness of the proposed method. Recognition accuracy is improved by using the visual information in all SNR conditions. These visual features were confirmed to be effective even when the audio HMM was adapted to noise by the MLLR method.

  15. Tourism research and audio methods

    DEFF Research Database (Denmark)

    Jensen, Martin Trandberg

    2016-01-01

    • Audio methods enrich sensuous tourism ethnographies. • The note suggests five research avenues for future auditory scholarship. • Sensuous tourism research has neglected the role of sounds in embodied tourism experiences.

  16. Python for audio signal processing

    OpenAIRE

    Glover, John C.; Lazzarini, Victor; Timoney, Joseph

    2011-01-01

    This paper discusses the use of Python for developing audio signal processing applications. Overviews of the Python language, NumPy, SciPy and Matplotlib are given, which together form a powerful platform for scientific computing. We then show how SciPy was used to create two audio programming libraries, and describe ways that Python can be integrated with the SndObj library and Pure Data, two existing environments for music composition and signal processing.

  17. A Reproducible Research Framework for Audio Inpainting

    OpenAIRE

    Adler, Amir; Emiya, Valentin; Jafari, Maria; Elad, Michael; Gribonval, Rémi; Plumbley, Mark D.

    2011-01-01

    We introduce a unified framework for the restoration of distorted audio data, leveraging the Image Inpainting concept and covering existing audio applications. In this framework, termed Audio Inpainting, the distorted data is considered missing and its location is assumed to be known. We further introduce baseline approaches based on sparse representations. For this new audio inpainting concept, we provide reproducible-research tools including: the handling of audio ...

  18. Development of an audio input toolkit for multiple sources

    OpenAIRE

    Kosch, Thomas

    2013-01-01

    Audio services, like voice over IP or several voice recognition systems, are developing very fast and since they are easy to use nearly everybody is linked to such systems. In this thesis about the processing of multiple audio inputs, an audio toolkit for processing multiple audio inputs has to be developed. Used audio input devices are bluetooth headsets, which can send audio via UDP to the audio toolkit. This audio toolkit is able to process these multiple audio inputs and determines a domi...

  19. Joint spatial-spectral feature space clustering for speech activity detection from ECoG signals.

    Science.gov (United States)

    Kanas, Vasileios G; Mporas, Iosif; Benz, Heather L; Sgarbas, Kyriakos N; Bezerianos, Anastasios; Crone, Nathan E

    2014-04-01

    Brain-machine interfaces for speech restoration have been extensively studied for more than two decades. The success of such a system will depend in part on selecting the best brain recording sites and signal features corresponding to speech production. The purpose of this study was to detect speech activity automatically from electrocorticographic signals based on joint spatial-frequency clustering of the ECoG feature space. For this study, the ECoG signals were recorded while a subject performed two different syllable repetition tasks. We found that the optimal frequency resolution to detect speech activity from ECoG signals was 8 Hz, achieving 98.8% accuracy by employing support vector machines as a classifier. We also defined the cortical areas that held the most information about the discrimination of speech and nonspeech time intervals. Additionally, the results shed light on the distinct cortical areas associated with the two syllable repetition tasks and may contribute to the development of portable ECoG-based communication.

  20. Location audio simplified capturing your audio and your audience

    CERN Document Server

    Miles, Dean

    2014-01-01

    From the basics of using camera, handheld, lavalier, and shotgun microphones to camera calibration and mixer set-ups, Location Audio Simplified unlocks the secrets to clean and clear broadcast quality audio no matter what challenges you face. Author Dean Miles applies his twenty-plus years of experience as a professional location operator to teach the skills, techniques, tips, and secrets needed to produce high-quality production sound on location. Humorous and thoroughly practical, the book covers a wide array of topics, such as:* location selection* field mixing* boo

  1. A Joint Audio-Visual Approach to Audio Localization

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Christensen, Mads Græsbøll

    2015-01-01

    Localization of audio sources is an important research problem, e.g., to facilitate noise reduction. In the recent years, the problem has been tackled using distributed microphone arrays (DMA). A common approach is to apply direction-of-arrival (DOA) estimation on each array (denoted as nodes), a...... time-of-flight cameras. Moreover, we propose an optimal method for weighting such DOA and range information for audio localization. Our experiments on both synthetic and real data show that there is a clear, potential advantage of using the joint audiovisual localization framework....

  2. Evaluation of Audio Compression Artifacts

    Directory of Open Access Journals (Sweden)

    M. Herrera Martinez

    2007-01-01

    Full Text Available This paper deals with subjective evaluation of audio-coding systems. From this evaluation, it is found that, depending on the type of signal and the algorithm of the audio-coding system, different types of audible errors arise. These errors are called coding artifacts. Although three kinds of artifacts are perceivable in the auditory domain, the author proposes that in the coding domain there is only one common cause for the appearance of the artifact, inefficient tracking of transient-stochastic signals. For this purpose, state-of-the-art audio coding systems use a wide range of signal processing techniques, including application of the wavelet transform, which is described here.

  3. Survey Musik und Medien 2012: Audio Media Usage in Germany - Audio Emitters - Audio Emitters used in 2012 - Versatile Traditionalists

    OpenAIRE

    Lepa, Steffen

    2013-01-01

    Where did the sound emanate from in 2012? Audio Emitters comprise those technical objects that are connected to Audio Devices in order to make Audio Sources audible. This category includes headphones and loudspeakers with both diverse sound formats (mono, stereo, surround) as well as various device types (Headphones: small, standard, HiFi; Loudspeakers: integrated models, single components and docking stations). Versatile Traditionalists constitute the most prevalent audio repertoire with...

  4. Audio power amplifier design handbook

    CERN Document Server

    Self, Douglas

    2013-01-01

    This book is essential for audio power amplifier designers and engineers for one simple reason...it enables you as a professional to develop reliable, high-performance circuits. The Author Douglas Self covers the major issues of distortion and linearity, power supplies, overload, DC-protection and reactive loading. He also tackles unusual forms of compensation and distortion produced by capacitors and fuses. This completely updated fifth edition includes four NEW chapters including one on The XD Principle, invented by the author, and used by Cambridge Audio. Cro

  5. QRDA: Quantum Representation of Digital Audio

    Science.gov (United States)

    Wang, Jian

    2016-03-01

    Multimedia refers to content that uses a combination of different content forms. It includes two main media: image and audio. However, in contrast with the rapid development of quantum image processing, quantum audio has almost never been studied. To change this situation, a quantum representation of digital audio (QRDA) is proposed in this paper to represent quantum audio. QRDA uses two entangled qubit sequences to store the audio amplitude and time information. The two qubit sequences are both in basis states: |0> and |1>. The QRDA audio preparation from the initial state |0> is given to store audio in quantum computers. Then some exemplary quantum audio processing operations are performed to indicate QRDA's usability.

  6. Digital Audio Application to Short Wave Broadcasting

    Science.gov (United States)

    Chen, Edward Y.

    1997-01-01

    Digital audio is becoming prevalent not only in consumer electronics, but also in different broadcasting media. Terrestrial analog audio broadcasting in the AM and FM bands will eventually be replaced by digital systems.

  7. Audio watermark a comprehensive foundation using Matlab

    CERN Document Server

    Lin, Yiqing

    2015-01-01

    This book illustrates commonly used and novel approaches to audio watermarking for copyright protection. The author offers a theoretical and practical step-by-step guide to data hiding in audio signals such as music, speech, and broadcast. New techniques developed by the author are fully explained, MATLAB programs for audio watermarking and audio quality assessment are provided, and methods for objectively predicting the perceptual quality of watermarked audio signals are discussed. The book: • explains the theoretical basics of the commonly used audio watermarking techniques; • discusses the methods used to objectively and subjectively assess the quality of audio signals; • provides comprehensive, well-tested MATLAB programs that can be used to watermark any audio media.

  8. Audio Watermarking Using LSB With Adjustment Method

    Directory of Open Access Journals (Sweden)

    Ansith.S, Priyanka Udayabhanu

    2013-05-01

    Full Text Available In this paper we are discussing watermarking on audio signals. In this method the recorded audio data is first sampled using a sampling frequency of 22050 Hz. Then the watermark message is watermarked into the sampled data of the audio signal. In this method the adjustment is done to increase the accuracy of the watermarked signal. Finally we extract the message from the audio data.
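    The embed-then-extract flow described in this abstract can be sketched in Python as follows. This is a minimal illustration of plain least-significant-bit (LSB) embedding only; the paper's adjustment step is not specified in the abstract and is not reproduced here. The function names, the 440 Hz test tone, and the message are hypothetical choices made for the example.

```python
import numpy as np

SAMPLE_RATE_HZ = 22050  # sampling frequency quoted in the abstract


def embed_lsb(samples: np.ndarray, message: bytes) -> np.ndarray:
    """Hide the message bits in the LSB of the first samples (16-bit PCM)."""
    bits = np.unpackbits(np.frombuffer(message, dtype=np.uint8)).astype(np.int16)
    if bits.size > samples.size:
        raise ValueError("message is too long for this carrier")
    marked = samples.copy()
    # Clear the LSB of each carrier sample, then set it to the message bit.
    marked[: bits.size] = (marked[: bits.size] & ~1) | bits
    return marked


def extract_lsb(samples: np.ndarray, n_bytes: int) -> bytes:
    """Read n_bytes back out of the LSBs of the first samples."""
    bits = (samples[: n_bytes * 8] & 1).astype(np.uint8)
    return np.packbits(bits).tobytes()


# Hypothetical carrier: 0.1 s of a 440 Hz tone as 16-bit PCM.
t = np.arange(int(0.1 * SAMPLE_RATE_HZ)) / SAMPLE_RATE_HZ
carrier = (0.5 * 32767 * np.sin(2 * np.pi * 440.0 * t)).astype(np.int16)

message = b"demo"  # hypothetical watermark payload
marked = embed_lsb(carrier, message)
recovered = extract_lsb(marked, len(message))
```

    In this sketch each carrier sample changes by at most one quantization step, which is why plain LSB embedding is hard to hear but also easy to destroy by re-encoding.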

  9. AudioRegent: Exploiting SimpleADL and SoX for Digital Audio Delivery

    OpenAIRE

    Nitin Arora

    2010-01-01

    AudioRegent is a command-line Python script currently being used by the University of Alabama Libraries’ Digital Services to create web-deliverable MP3s from regions within archival audio files. In conjunction with a small-footprint XML file called SimpleADL and SoX, an open-source command-line audio editor, AudioRegent batch processes archival audio files, allowing for one or many user-defined regions, particular to each audio file, to be extracted with additional audio processing in a trans...

  10. 36 CFR 2.12 - Audio disturbances.

    Science.gov (United States)

    2010-07-01

    ... 36 Parks, Forests, and Public Property 1 2010-07-01 2010-07-01 false Audio disturbances. 2.12... RESOURCE PROTECTION, PUBLIC USE AND RECREATION § 2.12 Audio disturbances. (a) The following are prohibited..., motorized toy, or an audio device, such as a radio, television set, tape deck or musical instrument, in...

  11. 50 CFR 27.72 - Audio equipment.

    Science.gov (United States)

    2010-10-01

    ... 50 Wildlife and Fisheries 6 2010-10-01 2010-10-01 false Audio equipment. 27.72 Section 27.72 Wildlife and Fisheries UNITED STATES FISH AND WILDLIFE SERVICE, DEPARTMENT OF THE INTERIOR (CONTINUED) THE... Audio equipment. The operation or use of audio devices including radios, recording and playback...

  12. Audio Frequency Analysis in Mobile Phones

    Science.gov (United States)

    Aguilar, Horacio Munguía

    2016-01-01

    A new experiment using mobile phones is proposed in which their audio frequency response is analyzed using the audio port for inputting an external signal and getting a measurable output. This experiment shows how the limited audio bandwidth used in mobile telephony is the main cause of the poor speech quality in this service. A brief discussion is…

  13. Presence and the utility of audio spatialization

    DEFF Research Database (Denmark)

    Bormann, Karsten

    2005-01-01

    The primary concern of this paper is whether the utility of audio spatialization, as opposed to the fidelity of audio spatialization, impacts presence. An experiment is reported that investigates the presence-performance relationship by decoupling spatial audio fidelity (realism) from task...

  14. Bit rates in audio source coding

    NARCIS (Netherlands)

    Veldhuis, Raymond N.J.

    1992-01-01

    The goal is to introduce and solve the audio coding optimization problem. Psychoacoustic results such as masking and excitation pattern models are combined with results from rate distortion theory to formulate the audio coding optimization problem. The solution of the audio optimization problem is a

  15. Audio-Visual Aids: Historians in Blunderland.

    Science.gov (United States)

    Decarie, Graeme

    1988-01-01

    A history professor relates his experiences producing and using audio-visual material and warns teachers not to rely on audio-visual aids for classroom presentations. Includes examples of popular audio-visual aids on Canada that communicate unintended, inaccurate, or unclear ideas. Urges teachers to exercise caution in the selection and use of…

  16. [Audio-visual aids and tropical medicine].

    Science.gov (United States)

    Morand, J J

    1989-01-01

    The author presents a list of the audio-visual productions about Tropical Medicine, as well as of their main characteristics. He thinks that the audio-visual educational productions are often dissociated from their promotion; therefore, he invites the future creator to forward his work to the Audio-Visual Health Committee.

  17. Spatial audio quality perception (part 1)

    DEFF Research Database (Denmark)

    Conetta, R.; Brookes, T.; Rumsey, F.;

    2015-01-01

    Spatial audio processes (SAPs) commonly encountered in consumer audio reproduction systems are known to produce a range of impairments to spatial quality. By way of two listening tests, this paper investigated the degree of degradation of the spatial quality of six 5-channel audio recordings resu...

  18. Haptic and Audio Interaction Design

    DEFF Research Database (Denmark)

    This book constitutes the refereed proceedings of the 5th International Workshop on Haptic and Audio Interaction Design, HAID 2010 held in Copenhagen, Denmark, in September 2010. The 21 revised full papers presented were carefully reviewed and selected for inclusion in the book. The papers are or...

  19. Engaging Students with Audio Feedback

    Science.gov (United States)

    Cann, Alan

    2014-01-01

    Students express widespread dissatisfaction with academic feedback. Teaching staff perceive a frequent lack of student engagement with written feedback, much of which goes uncollected or unread. Published evidence shows that audio feedback is highly acceptable to students but is underused. This paper explores methods to produce and deliver audio…

  20. Digital Augmented Reality Audio Headset

    Directory of Open Access Journals (Sweden)

    Jussi Rämö

    2012-01-01

    Full Text Available Augmented reality audio (ARA) combines virtual sound sources with the real sonic environment of the user. An ARA system can be realized with a headset containing binaural microphones. Ideally, the ARA headset should be acoustically transparent, that is, it should not cause audible modification to the surrounding sound. A practical implementation of an ARA mixer requires a low-latency headphone reproduction system with additional equalization to compensate for the attenuation and the modified ear canal resonances caused by the headphones. This paper proposes digital IIR filters to realize the required equalization and evaluates a real-time prototype ARA system. Measurements show that the throughput latency of the digital prototype ARA system can be less than 1.4 ms, which is sufficiently small in practice. When the direct and processed sounds are combined in the ear, a comb filtering effect is brought about and appears as notches in the frequency response. The comb filter effect in speech and music signals was studied in a listening test and it was found to be inaudible when the attenuation is 20 dB. Insert ARA headphones have a sufficient attenuation at frequencies above about 1 kHz. The proposed digital ARA system enables several immersive audio applications, such as a virtual audio tourist guide and audio teleconferencing.
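    The comb filtering mentioned in this abstract can be illustrated with a simplified two-path model, sketched below in Python. The model is an assumption made for illustration, not the paper's measurement setup: the processed sound is taken as a unit-gain copy delayed by the latency, the leaked direct sound as an undelayed copy scaled by the headphone attenuation, and both are treated as frequency-independent. The function name and parameter names are hypothetical.

```python
import numpy as np


def comb_response_db(freq_hz, delay_s, leak_gain):
    """Magnitude (dB) of leaked direct sound plus delayed processed sound."""
    h = leak_gain + np.exp(-2j * np.pi * np.asarray(freq_hz) * delay_s)
    return 20.0 * np.log10(np.abs(h))


delay_s = 1.4e-3                  # latency bound quoted in the abstract
leak_gain = 10.0 ** (-20.0 / 20)  # 20 dB attenuation of the leaked sound

# In this model the notches fall where the two paths are in antiphase:
# f = (2k + 1) / (2 * delay), k = 0, 1, 2, ...
first_notch_hz = 1.0 / (2.0 * delay_s)

# Peak-to-notch ripple: the two paths add at 0 Hz and subtract at the notch.
ripple_db = float(
    comb_response_db(0.0, delay_s, leak_gain)
    - comb_response_db(first_notch_hz, delay_s, leak_gain)
)
```

    Under these assumptions the first notch sits near 357 Hz and the ripple is about 1.7 dB, which gives a rough sense of why a 20 dB attenuation could make the effect hard to hear.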

  1. Vascular lesions of the lumbar epidural space: magnetic resonance imaging features of epidural cavernous hemangioma and epidural hematoma

    Directory of Open Access Journals (Sweden)

    Basile Júnior Roberto

    1999-01-01

    Full Text Available The authors report the magnetic resonance imaging diagnostic features in two cases with respectively lumbar epidural hematoma and cavernous hemangioma of the lumbar epidural space. Enhanced MRI T1-weighted scans show a hyperintense signal rim surrounding the vascular lesion. Non-enhanced T2-weighted scans showed hyperintense signal.

  2. Multiple-output support vector machine regression with feature selection for arousal/valence space emotion assessment.

    Science.gov (United States)

    Torres-Valencia, Cristian A; Álvarez, Mauricio A; Orozco-Gutiérrez, Alvaro A

    2014-01-01

    Human emotion recognition (HER) allows the assessment of an affective state of a subject. Until recently, such emotional states were described in terms of discrete emotions, like happiness or contempt. In order to cover a wide range of emotions, researchers in the field have introduced different dimensional spaces for emotion description that allow the characterization of affective states in terms of several variables or dimensions that measure distinct aspects of the emotion. One of the most common of such dimensional spaces is the bidimensional Arousal/Valence space. To the best of our knowledge, all HER systems so far have modelled the dimensions in these spaces independently. In this paper, we study the effect of modelling the output dimensions simultaneously and show experimentally the advantages of modelling them in this way. We consider a multimodal approach by including features from the Electroencephalogram and a few physiological signals. For modelling the multiple outputs, we employ a multiple output regressor based on support vector machines. We also include a stage of feature selection that is developed within an embedded approach known as Recursive Feature Elimination (RFE), proposed initially for SVM. The results show that several features can be eliminated using the multiple output support vector regressor with RFE without affecting the performance of the regressor. From the analysis of the features selected in smaller subsets via RFE, it can be observed that the signals that are most informative for discrimination in the arousal and valence space are the EEG, Electrooculogram/Electromyogram (EOG/EMG) and the Galvanic Skin Response (GSR).

  3. Techniques in audio and acoustic measurement

    Science.gov (United States)

    Kite, Thomas D.

    2003-10-01

    Measurement of acoustic devices and spaces is commonly performed with time-delay spectrometry (TDS) or maximum length sequence (MLS) analysis. Both techniques allow an impulse response to be measured with a signal-to-noise ratio (SNR) that can be traded off against the measurement time. However, TDS suffers from long measurement times because of its linear sweep, while MLS suffers from the corruption of the impulse response by distortion. Recently a logarithmic sweep-based method has been devised which offers high SNR, short measurement times, and the ability to separate the linear impulse response from the impulse responses of distortion products. The applicability of these methods to audio and acoustic measurement will be compared.

  4. REPRESENTING URBAN SPACE ACCORDING TO THE FEATURES OF THE IDEAL CITY

    Directory of Open Access Journals (Sweden)

    MARIA ELIZA DULAMĂ

    2012-01-01

    Full Text Available This study focused on high school students’ representing the real and ideal urban spaces on plans and it also focused on their representing of these spaces in texts. Students worked in groups and we presented their results: the city plans created for three ideal cities. We analysed the represented geographical elements, the functions of those cities, and the difficulties that students had in perceiving and representing geographical space.

  5. NFL Films audio, video, and film production facilities

    Science.gov (United States)

    Berger, Russ; Schrag, Richard C.; Ridings, Jason J.

    2003-04-01

    The new NFL Films 200,000 sq. ft. headquarters is home for the critically acclaimed film production that preserves the NFL's visual legacy week-to-week during the football season, and is also the technical plant that processes and archives football footage from the earliest recorded media to the current network broadcasts. No other company in the country shoots more film than NFL Films, and the inclusion of cutting-edge video and audio formats demands that their technical spaces continually integrate the latest in the ever-changing world of technology. This facility houses a staggering array of acoustically sensitive spaces where music and sound are equal partners with the visual medium. Over 90,000 sq. ft. of sound critical technical space is comprised of an array of sound stages, music scoring stages, audio control rooms, music writing rooms, recording studios, mixing theaters, video production control rooms, editing suites, and a screening theater. Every production control space in the building is designed to monitor and produce multi channel surround sound audio. An overview of the architectural and acoustical design challenges encountered for each sophisticated listening, recording, viewing, editing, and sound critical environment will be discussed.

  6. WAVE : an audio virtual environment

    OpenAIRE

    Valbom, Leonel; Marcos, Adérito

    2004-01-01

    This paper outlines the basis and gives a description of the project WAVE that is starting in the Department of Information Systems of the University of Minho in co-operation with a research group in the Computer Graphics Centre - ZGDV, Guimaraes. The project aims to set up an immersive environment of virtual reality, where the music, sound and audio (3D or not) plays an important role in a virtual musical/sound instrument for performances, education, entertainment or experimentat...

  7. Audio Watermarking with Error Correction

    OpenAIRE

    Aman Chadha; Sandeep Gangundi; Rishabh Goel; Hiren Dave; M.Mani Roja

    2011-01-01

    In recent times, communication through the internet has tremendously facilitated the distribution of multimedia data. Although this is indubitably a boon, one of its repercussions is that it has also given impetus to the notorious issue of online music piracy. Unethical attempts can also be made to deliberately alter such copyrighted data and thus, misuse it. Copyright violation by means of unauthorized distribution, as well as unauthorized tampering of copyrighted audio data is an important ...

  8. C Implementation & comparison of companding & silence audio compression techniques

    OpenAIRE

    Dangarwala, Kruti; Shah, Jigar

    2010-01-01

    Just about all the newest living room audio-video electronics and PC multimedia products being designed today will incorporate some form of compressed digitized-audio processing capability. Audio compression reduces the bit rate required to represent an analog audio signal while maintaining the perceived audio quality. Discarding inaudible data reduces the storage, transmission and compute requirements of handling high-quality audio files. This paper covers wave audio file format & algorithm ...
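    As a rough illustration of the companding idea named in this record's title, the sketch below applies the textbook continuous mu-law curve in Python before and after an 8-bit quantizer. It is an assumption-laden example rather than the paper's C implementation, whose details are truncated in the abstract; the function names, the choice of mu = 255, and the test signal are hypothetical.

```python
import numpy as np

MU = 255.0  # common mu-law parameter; an assumption for this sketch


def mu_compress(x):
    """Map samples in [-1, 1] through the logarithmic mu-law curve."""
    return np.sign(x) * np.log1p(MU * np.abs(x)) / np.log1p(MU)


def mu_expand(y):
    """Inverse of mu_compress."""
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(MU)) / MU


def quantize(v, bits=8):
    """Uniform mid-tread quantizer on [-1, 1]."""
    levels = 2 ** (bits - 1) - 1
    return np.round(v * levels) / levels


# Hypothetical quiet test signal, where companding helps most.
x = np.linspace(-0.05, 0.05, 101)

uniform_err = np.mean(np.abs(quantize(x) - x))
companded_err = np.mean(np.abs(mu_expand(quantize(mu_compress(x))) - x))
```

    With the same 8 bits per sample, the companded path spends more quantization levels on low amplitudes, so its mean error on this quiet signal is smaller than that of the uniform quantizer.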

  9. Survey Musik und Medien 2012: Audio Media Usage in Germany - Audio Emitters - Audio Emitters used in 2012

    OpenAIRE

    Lepa, Steffen

    2013-01-01

    Where did the sound emanate from in 2012? Audio Emitters comprise those technical objects that are connected to Audio Devices in order to make Audio Sources audible. This category includes headphones and loudspeakers with both diverse sound formats (mono, stereo, surround) as well as various device types (Headphones: small, standard, HiFi; Loudspeakers: integrated models, single components and docking stations). How do the Germans listen to music nowadays? Survey Musik und Medien 2012 deli...

  10. Survey Musik und Medien 2012: Audio Media Usage in Germany - Audio Devices - Audio Devices used in 2012

    OpenAIRE

    Lepa, Steffen

    2013-01-01

    By what means was music played back in 2012? Audio Devices comprise technical devices that permit access to and enable playback of Audio Sources. This includes CD players, record players, cassette recorders, MP3 player and smartphones but also computers and various multimedia entertainment devices that allow music use. How do the Germans listen to music nowadays? Survey Musik und Medien 2012 delivers representative data on actual audio media usage of German population. These data allow the...

  11. Survey Musik und Medien 2012: Audio Media Usage in Germany - Audio Emitters - Audio Emitters used in 2012 - Radio Traditionalists

    OpenAIRE

    Lepa, Steffen

    2013-01-01

    Where did the sound emanate from in 2012? Audio Emitters comprise those technical objects that are connected to Audio Devices in order to make Audio Sources audible. This category includes headphones and loudspeakers with both diverse sound formats (mono, stereo, surround) as well as various device types (Headphones: small, standard, HiFi; Loudspeakers: integrated models, single components and docking stations). Radio Traditionalists are represented in various age groups, and may be born ...

  12. Survey Musik und Medien 2012: Audio Media Usage in Germany - Audio Devices - Audio Devices used in 2012 - Selective Traditionalists

    OpenAIRE

    Lepa, Steffen

    2013-01-01

    By what means was music played back in 2012? Audio Devices comprise technical devices that permit access to and enable playback of Audio Sources. This includes CD players, record players, cassette recorders, MP3 players and smartphones but also computers and various multimedia entertainment devices that allow music use. 11.6% of our participants may be described as Selective Traditionalists who are typically born between 1955 and 1975. The radio is the dominant audio source used at least ...

  13. Audio-magnetotelluric methods in reconnaissance geothermal exploration

    Science.gov (United States)

    Hoover, D.B.; Long, C.L.

    1976-01-01

    An audio-magnetotelluric (AMT) system has been developed by the U.S. Geological Survey for low-cost reconnaissance exploration of geothermal regions. This is an electromagnetic sounding technique in which the scalar or Cagniard resistivity is computed at 12 frequencies logarithmically spaced from 7.5 to 18 600 Hz. Our system uses natural source fields except at the two upper frequencies of 10 200

  14. Navigation for the Blind through Audio-Based Virtual Environments

    OpenAIRE

    Sánchez, Jaime; Sáenz, Mauricio; Pascual-Leone, Alvaro; Merabet, Lotfi

    2010-01-01

    We present the design, development and an initial study of changes and adaptations related to navigation that take place in the brain, carried out by incorporating an Audio-Based Environments Simulator (AbES) within a neuroimaging environment. This virtual environment enables a blind user to navigate through a virtual representation of a real space in order to train his/her orientation and mobility skills. Our initial results suggest that this kind of virtual environment could be highly efficient as a testi...

  15. Audio-visual Feature Fusion Person Identification Based on SVM and Score Normalization

    Institute of Scientific and Technical Information of China (English)

    丁辉; 安今朝

    2012-01-01

    To address the low recognition rates of face recognition and speaker recognition under adverse noise conditions, this paper presents a multi-biometric recognition model that fuses face and speech features, building on feature-level fusion and combining normalization with SVM theory. First, face features are extracted using the discrete cosine transform and locality preserving projections, and speech features are extracted using an SVM-based method. The two are then fused at the feature level to obtain a fusion feature. The distance between the test identity and the templates is calculated, and the matching distance is normalized to reduce computation and improve recognition accuracy. Finally, the normalized matching distance is fed into an SVM to obtain the recognition result. Experiments show that the fusion system performs well in both response time and accuracy, especially against a noisy background: as the signal-to-noise ratio decreases, the fusion recognition rate is clearly higher than that of either single system.

  16. Review of AVS Audio Coding Standard

    Institute of Scientific and Technical Information of China (English)

    ZHANG Tao; ZHANG Caixia; ZHAO Xin

    2016-01-01

    Audio Video Coding Standard (AVS) is a second-generation source coding standard and the first standard for audio and video coding in China with independent intellectual property rights. Its performance has reached the international standard. Its coding efficiency is 2 to 3 times greater than that of MPEG-2. This technical solution is simpler, and it can greatly save channel resources. After more than ten years' development, AVS has achieved great success. The latest version of the AVS audio coding standard is under development and mainly aims at the increasing demand for low-bitrate, high-quality audio services. The paper reviews the history and recent development of the AVS audio coding standard in terms of basic features, key techniques and performance. Finally, the future development of the AVS audio coding standard is discussed.

  17. Implementing Audio-CASI on Windows’ Platforms

    OpenAIRE

    Cooley, Philip C.; Turner, Charles F.

    1998-01-01

    Audio computer-assisted self interviewing (Audio-CASI) technologies have recently been shown to provide important and sometimes dramatic improvements in the quality of survey measurements. This is particularly true for measurements requiring respondents to divulge highly sensitive information such as their sexual, drug use, or other sensitive behaviors. However, DOS-based Audio-CASI systems that were designed and adopted in the early 1990s have important limitations. Most salient is the poor ...

  18. Weakly Supervised Scalable Audio Content Analysis

    OpenAIRE

    Kumar, Anurag; Raj, Bhiksha

    2016-01-01

    Audio Event Detection is an important task for content analysis of multimedia data. Most current work on the detection of audio events is driven by supervised learning approaches. We propose a weakly supervised learning framework which can make use of the tremendous amount of web multimedia data with significantly reduced annotation effort and expense. Specifically, we use several multiple instance learning algorithms to show that audio event detection through weak labels is feasible...

  19. Audio Watermarking Based On The PSK Modulation

    OpenAIRE

    Wahid Barkouti; Sihem Nasri; Adnane Cherif

    2011-01-01

    Audio watermarking is a technique that can be used to embed information into the digital representation of audio signals. The main challenge is to hide data representing some information without compromising the quality of the watermarked track and at the same time ensure that the embedded watermark is robust against removal attacks. Especially providing perfect audio quality combined with high robustness against a wide variety of attacks is not adequately addressed and evaluated in current w...

  20. MODIS: an audio motif discovery software

    OpenAIRE

    Catanese, Laurence; Souviraà-Labastie, Nathan; Qu, Bingqing; Campion, Sébastien; Gravier, Guillaume; Vincent, Emmanuel; Bimbot, Frédéric

    2013-01-01

    MODIS is a free speech and audio motif discovery software developed at IRISA Rennes. Motif discovery is the task of discovering and collecting occurrences of repeating patterns in the absence of prior knowledge or training material. MODIS is based on a generic approach to mining repeating audio sequences, with tolerance to motif variability. The algorithm implementation allows large audio streams to be processed at a reasonable speed where motif discovery often requires hu...

  1. Using virtual 3D audio in multispeech channel and multimedia environments

    Science.gov (United States)

    Orosz, Michael D.; Karplus, Walter J.; Balakrishnan, Jerry D.

    2000-08-01

    The advantages and disadvantages of using virtual 3-D audio in mission-critical, multimedia display interfaces were evaluated. The 3D audio platform seems to be an especially promising candidate for aircraft cockpits, flight control rooms, and other command and control environments in which operators must make mission-critical decisions while handling demanding and routine tasks. Virtual audio signal processing creates the illusion for a listener wearing conventional earphones that each of a multiplicity of simultaneous speech or audio channels is originating from a different, program-specified location in virtual space. To explore the possible uses of this new, readily available technology, a test bed simulating some of the conditions experienced by the chief flight test coordinator at NASA's Dryden Flight Research Center was designed and implemented. Thirty test subjects simultaneously performed routine tasks requiring constant hand-eye coordination, while monitoring four speech channels, each generating continuous speech signals, for the occurrence of pre-specified keywords. Performance measures included accuracy in identifying the keywords, accuracy in identifying the speaker of the keyword, and response time. We found substantial improvements on all of these measures when comparing virtual audio with conventional, monaural transmissions. We also explored the effect on operator performance of different spatial configurations of the audio sources in 3-D space, simulated movement (dither) in the source locations, and of providing graphical redundancy. Some of these manipulations were less effective and may even decrease performance efficiency, even though they improve some aspects of the virtual space simulation.

  2. Distortion Estimation in Compressed Music Using Only Audio Fingerprints

    NARCIS (Netherlands)

    Doets, P.J.O.; Lagendijk, R.L.

    2008-01-01

    An audio fingerprint is a compact yet very robust representation of the perceptually relevant parts of an audio signal. It can be used for content-based audio identification, even when the audio is severely distorted. Audio compression changes the fingerprint slightly. We show that these small finge

  3. Evaluation of the Audio Bracelet for Blind Interaction for improving mobility and spatial cognition in early blind children - A pilot study.

    Science.gov (United States)

    Finocchietti, Sara; Cappagli, Giulia; Ben Porquis, Lope; Baud-Bovy, Gabriel; Cocchi, Elena; Gori, Monica

    2015-08-01

    This study was designed to assess the effectiveness of the Audio Bracelet for Blind Interaction (ABBI) system for improving mobility and spatial cognition in visually impaired children. The bracelet is worn on the wrist, and its key feature is to provide audio feedback about body movements to help visually impaired children build a sense of space. Nine early blind children took part in this study. The study lasted 12 weeks. Once per week each child participated in a 45-minute ABBI rehabilitation session with trained professionals. Each child also had to use the bracelet one hour per day at home, alone or with one relative. Mobility and spatial cognition abilities were measured before and after the 12-week rehabilitation program with three different tests. Results showed that the use of the Audio Bracelet for Blind Interaction allowed the early blind children to significantly improve their mobility and spatial abilities. Although an extended study including a larger number of participants is needed to confirm these data, the present results are encouraging. They suggest that ABBI could be used to rehabilitate the sense of space in visually impaired children. PMID:26738148

  4. Optimization of audio - ultrasonic plasma system parameters

    Science.gov (United States)

    Haleem, N. A.; Abdelrahman, M. M.; Ragheb, M. S.

    2016-10-01

    The present plasma is a special type of glow plasma generated by an audio-ultrasonic discharge voltage. A definite discharge frequency, applied to a gas within a narrow pressure band, creates and stabilizes this plasma type. The plasma cell produces a self-extracted ion beam; it is characterized by its high output intensity and its small size. The influence of the plasma column length on the output beam, as both the audio discharge frequency and the power applied to the plasma electrodes are varied, is investigated. The aim of the present work is therefore to identify the parameters that influence the self-extracted collected ion beam and to optimize the conditions that enhance it. The experimental parameters studied are the gas (nitrogen), the applied frequency (10 to 100 kHz), the plasma length (8 to 14 cm) at a gas pressure of ≈ 0.25 Torr, and the discharge power (50 to 500 W). A 5-micrometer polyethylene sheet covers the collector electrode in order to determine how many ions from the beam can pass through the polymer and reach the collector. To diagnose the effects of the beam on the collector, the polymer is analyzed by means of the FTIR and XRF techniques. Optimization of the plasma cell parameters succeeded in enhancing the output ion beam and in identifying the parameters that influence it, and showed that the particles reaching the collector are multi-energetic.

  5. pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis

    OpenAIRE

    Theodoros Giannakopoulos

    2015-01-01

    Audio information plays a rather important role in the increasing digital content that is available today, resulting in a need for methodologies that automatically analyze such content: audio event recognition for home automations and surveillance systems, speech recognition, music information retrieval, multimodal analysis (e.g. audio-visual analysis of online videos for content-based recommendation), etc. This paper presents pyAudioAnalysis, an open-source Python library that provides a wid...

  6. Hi fi digital audio tape to SUN workstation transfer system for digital audio data

    OpenAIRE

    Gartenlaub, Arie Gal

    1994-01-01

    Approved for public release; distribution is unlimited. This thesis describes a subsystem developed to provide for the transfer of digital audio signals from a SUN SPARCstation 10 workstation to a digital audio tape (DAT) and vice versa. The new system expands the audio recording/reproduction options available in the laboratory by integrating an analog tape deck and a digital tape deck with the SUN workstation. The desired connection enables working with a larger audio bandwidth to achieve ...

  7. Noise-Canceling Helmet Audio System

    Science.gov (United States)

    Seibert, Marc A.; Culotta, Anthony J.

    2007-01-01

    A prototype helmet audio system has been developed to improve voice communication for the wearer in a noisy environment. The system was originally intended to be used in a space suit, wherein noise generated by airflow of the spacesuit life-support system can make it difficult for remote listeners to understand the astronaut's speech and can interfere with the astronaut's attempt to issue vocal commands to a voice-controlled robot. The system could be adapted to terrestrial use in helmets of protective suits that are typically worn in noisy settings: examples include biohazard, fire, rescue, and diving suits. The system (see figure) includes an array of microphones and small loudspeakers mounted at fixed positions in a helmet, amplifiers and signal-routing circuitry, and a commercial digital signal processor (DSP). Notwithstanding the fixed positions of the microphones and loudspeakers, the system can accommodate itself to any normal motion of the wearer's head within the helmet. The system operates in conjunction with a radio transceiver. An audio signal arriving via the transceiver intended to be heard by the wearer is adjusted in volume and otherwise conditioned and sent to the loudspeakers. The wearer's speech is collected by the microphones, the outputs of which are logically combined (phased) so as to form a microphone-array directional sensitivity pattern that discriminates in favor of sounds coming from the vicinity of the wearer's mouth and against sounds coming from elsewhere. In the DSP, digitized samples of the microphone outputs are processed to filter out airflow noise and to eliminate feedback from the loudspeakers to the microphones. The resulting conditioned version of the wearer's speech signal is sent to the transceiver.
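
    The phased combination of microphone outputs described above can be illustrated with a delay-and-sum beamformer. This is only a minimal sketch under stated assumptions: the record does not say which combining method the helmet system uses, and the function name, the test tone, and the per-microphone arrival delays below are hypothetical.

```python
import math
from typing import List, Sequence


def delay_and_sum(channels: Sequence[Sequence[float]], delays: Sequence[int]) -> List[float]:
    """Advance each channel by its steering delay (in samples), then average."""
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]


# Hypothetical test signal: a tone with a period of 8 samples.
PERIOD = 8
source = [math.sin(2 * math.pi * n / PERIOD) for n in range(64)]

# Assumed arrival delays (in samples) of the talker's voice at four microphones.
arrival_delays = [0, 2, 4, 6]
channels = [
    [source[n - d] if n >= d else 0.0 for n in range(len(source))]
    for d in arrival_delays
]

steered = delay_and_sum(channels, arrival_delays)  # channels aligned on the talker
unsteered = delay_and_sum(channels, [0, 0, 0, 0])  # channels summed without alignment
```

    With the steering delays matched to the arrival delays, the channels add coherently and the tone is recovered; for this particular tone and delay pattern, the unaligned sum cancels once all four channels are active.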

  8. The Effect Of 3D Audio And Other Audio Techniques On Virtual Reality Experience

    NARCIS (Netherlands)

    Brinkman, W.P.; Hoekstra, A.R.D.; Van Egmond, R.

    2015-01-01

    Three studies were conducted to examine the effect of audio on people's experience in a virtual world. The first study showed that people could distinguish between mono, stereo, Dolby surround and 3D audio of a wasp. The second study found significant effects for audio techniques on people's self-re

  9. On the comparison of audio fingerprints for extracting quality parameters of compressed audio

    NARCIS (Netherlands)

    Doets, P.J.O.; Menot Gisbert, M.; Lagendijk, R.L.

    2006-01-01

    Audio fingerprints can be seen as hashes of the perceptual content of an audio excerpt. Applications include linking metadata to unlabeled audio, watermark support, and broadcast monitoring. Existing systems identify a song by comparing its fingerprint to pre-computed fingerprints in a database. Sma

  10. Audio-visual gender recognition

    Science.gov (United States)

    Liu, Ming; Xu, Xun; Huang, Thomas S.

    2007-11-01

    Combining different modalities for pattern recognition tasks is a very promising field. Humans routinely fuse information from different modalities to recognize objects, perform inference, etc. Audio-visual gender recognition is one of the most common tasks in human social communication. Humans can identify gender by facial appearance, by speech and also by body gait. Indeed, human gender recognition is a multi-modal data acquisition and processing procedure. However, computational multimodal gender recognition has not been extensively investigated in the literature. In this paper, speech and facial images are fused to perform multi-modal gender recognition, in order to explore the improvement gained by combining different modalities.

  11. Study on the construction of multi-dimensional Remote Sensing feature space for hydrological drought

    International Nuclear Information System (INIS)

    Hydrological drought refers to an abnormal water shortage caused by precipitation and surface water shortages or a groundwater imbalance. Hydrological drought is reflected in a drop in surface water, a decrease in vegetation productivity, an increase in the temperature difference between day and night, and so on. Remote sensing permits the observation of surface water, vegetation, temperature and other information from a macro perspective. This paper analyzes the correlation and differentiation between remote sensing and surface-measured indicators, after selecting and extracting a series of representative remote sensing characteristic parameters according to the spectral characterization of surface features in remote sensing imagery, such as vegetation index, surface temperature and surface water from HJ-1A/B CCD/IRS data. Finally, multi-dimensional remote sensing features for hydrological drought are built on an intelligent collaborative model. Further, for the Dong-ting lake area, two drought events are analyzed to verify the multi-dimensional features, using remote sensing data from different phases and field observation data. The experimental results showed that multi-dimensional features are a good method for hydrological drought.

  12. Non-retinotopic feature processing in the absence of retinotopic spatial layout and the construction of perceptual space from motion.

    Science.gov (United States)

    Ağaoğlu, Mehmet N; Herzog, Michael H; Oğmen, Haluk

    2012-10-15

    The spatial representation of a visual scene in the early visual system is well known. The optics of the eye map the three-dimensional environment onto two-dimensional images on the retina. These retinotopic representations are preserved in the early visual system. Retinotopic representations and processing are among the most prevalent concepts in visual neuroscience. However, it has long been known that a retinotopic representation of the stimulus is neither sufficient nor necessary for perception. Saccadic Stimulus Presentation Paradigm and the Ternus-Pikler displays have been used to investigate non-retinotopic processes with and without eye movements, respectively. However, neither of these paradigms eliminates the retinotopic representation of the spatial layout of the stimulus. Here, we investigated how stimulus features are processed in the absence of a retinotopic layout and in the presence of retinotopic conflict. We used anorthoscopic viewing (slit viewing) and pitted a retinotopic feature-processing hypothesis against a non-retinotopic feature-processing hypothesis. Our results support the predictions of the non-retinotopic feature-processing hypothesis and demonstrate the ability of the visual system to operate non-retinotopically at a fine feature processing level in the absence of a retinotopic spatial layout. Our results suggest that perceptual space is actively constructed from the perceptual dimension of motion. The implications of these findings for normal ecological viewing conditions are discussed.

  13. Feasibility and Structural Feature on Monotone Second-Order Cone Linear Complementarity Problems in Hilbert Space

    Institute of Scientific and Technical Information of China (English)

    Miao Xinhe 苗新河; Guo Shengjuan郭胜娟

    2015-01-01

    Given a real finite-dimensional or infinite-dimensional Hilbert space H with a Jordan product, the second-order cone linear complementarity problem (SOCLCP) is considered. Some conditions are investigated under which the SOCLCP is feasible and solvable for any element q ∈ H. The solution set of a monotone SOCLCP is also characterized. It is shown that the second-order cone and Jordan product are interconnected.

  14. “Wrapping” X3DOM around Web Audio API

    Directory of Open Access Journals (Sweden)

    Andreas Stamoulias

    2015-12-01

    Spatial sound plays a conceptual role in Web3D environments, owing to the highly realistic scenes it can provide. Recent efforts have concentrated on extending X3D/X3DOM with spatial sound attributes. This paper presents a novel method for introducing spatial sound components into the X3DOM framework, based on the X3D specification and the Web Audio API. The proposed method introduces enhanced sound nodes for X3DOM, derived from the implementation of the X3D standard components and enriched with additional features of the Web Audio API. Moreover, several example scenarios were developed to evaluate our approach. The implemented examples established the feasibility of the newly registered nodes in X3DOM for spatial sound characteristics in Web3D virtual worlds.

  15. A Robust Zero-Watermarking Algorithm for Audio

    Directory of Open Access Journals (Sweden)

    Jie Zhu

    2008-03-01

    In traditional watermarking algorithms, the insertion of a watermark into the host signal inevitably introduces some perceptible quality degradation. Another problem is the inherent conflict between imperceptibility and robustness. The zero-watermarking technique can solve these problems successfully. Instead of embedding a watermark, the zero-watermarking technique extracts some essential characteristics from the host signal and uses them for watermark detection. However, most of the available zero-watermarking schemes are designed for still images and their robustness is not satisfactory. In this paper, an efficient and robust zero-watermarking technique for audio signals is presented. The multiresolution characteristic of the discrete wavelet transform (DWT), the energy compression characteristic of the discrete cosine transform (DCT), and the Gaussian noise suppression property of higher-order cumulants are combined to extract essential features from the host audio signal, and they are then used for watermark recovery. Simulation results demonstrate the effectiveness of our scheme in terms of inaudibility, detection reliability, and robustness.

  16. Audio Quality for a Simple Forward Error Correcting Code

    OpenAIRE

    Calas, Yvan; Jean-Marie, Alain

    2004-01-01

    The aim of this paper is to study the audio quality offered by a simple Forward Error Correction (FEC) code used in audio applications like Freephone or Rat. This coding technique consists of adding to every audio packet redundant information concerning a preceding audio packet that belongs to the same audio flow. We show that the audio quality depends not only on the number of FEC flows and the utility function associated to the quantity of information received, ...
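
    The piggybacking scheme summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the packet layout, the function names, the offset parameter and the loss pattern are hypothetical, and the redundant information here is a full copy of the earlier payload, whereas a real audio tool would more likely attach a lower-bit-rate encoding.

```python
from typing import List, Optional, Tuple

# (sequence number, own payload, redundant copy of an earlier payload or None)
Packet = Tuple[int, bytes, Optional[bytes]]


def add_redundancy(payloads: List[bytes], offset: int = 1) -> List[Packet]:
    """Piggyback on packet n a copy of the payload of packet n - offset."""
    packets = []
    for seq, payload in enumerate(payloads):
        redundant = payloads[seq - offset] if seq >= offset else None
        packets.append((seq, payload, redundant))
    return packets


def reconstruct(received: List[Packet], total: int, offset: int = 1) -> List[Optional[bytes]]:
    """Rebuild the stream; a lost packet n is recovered if packet n + offset arrived."""
    stream: List[Optional[bytes]] = [None] * total
    for seq, payload, _ in received:
        stream[seq] = payload
    for seq, _, redundant in received:
        source = seq - offset
        if source >= 0 and redundant is not None and stream[source] is None:
            stream[source] = redundant
    return stream


payloads = [b"a0", b"a1", b"a2", b"a3", b"a4"]
sent = add_redundancy(payloads)
# Simulate the loss of packets 1 and 4 (hypothetical loss pattern).
received = [p for p in sent if p[0] not in (1, 4)]
recovered = reconstruct(received, total=len(payloads))
```

    In this run packet 1 is rebuilt from the copy carried by packet 2, while packet 4 stays missing because no later packet carries its copy.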

  17. Progressive Syntax-Rich Coding of Multichannel Audio Sources

    OpenAIRE

    Yang Dai; Ai Hongmei; Kyriakakis Chris; Kuo C-C Jay

    2003-01-01

    Being able to transmit the audio bitstream progressively is a highly desirable property for network transmission. MPEG-4 version 2 audio supports fine grain bit rate scalability in the generic audio coder (GAC). It has a bit-sliced arithmetic coding (BSAC) tool, which provides scalability in the step of 1 Kbps per audio channel. There are also several other scalable audio coding methods, which have been proposed in recent years. However, these scalable audio tools are only available for mono ...

  18. Progressive Syntax-Rich Coding of Multichannel Audio Sources

    OpenAIRE

    Dai Yang; Hongmei Ai; Chris Kyriakakis; C.-C. Jay Kuo

    2003-01-01

    Being able to transmit the audio bitstream progressively is a highly desirable property for network transmission. MPEG-4 version 2 audio supports fine grain bit rate scalability in the generic audio coder (GAC). It has a bit-sliced arithmetic coding (BSAC) tool, which provides scalability in the step of 1 Kbps per audio channel. There are also several other scalable audio coding methods, which have been proposed in recent years. However, these scalable audio tools are only available for mono ...

  19. Content-based audio authentication using a hierarchical patchwork watermark embedding

    Science.gov (United States)

    Gulbis, Michael; Müller, Erika

    2010-05-01

    Content-based audio authentication watermarking techniques extract perceptually relevant audio features, which are robustly embedded into the audio file to be protected. Manipulations of the audio file are detected on the basis of changes between the originally embedded feature information and the features newly extracted during verification. The main challenges of content-based watermarking are, on the one hand, the identification of a suitable audio feature to distinguish between content-preserving and malicious manipulations and, on the other hand, the development of a watermark that is robust against content-preserving modifications and able to carry the whole authentication information. The payload requirements are significantly higher compared to transaction watermarking or copyright protection. Finally, the watermark embedding should not influence the feature extraction, to avoid false alarms. Current systems still lack a sufficient alignment of watermarking algorithm and feature extraction. In previous work we developed a content-based audio authentication watermarking approach. The feature is based on changes in the DCT domain over time. A patchwork-algorithm-based watermark was used to embed multiple one-bit watermarks. The embedding process uses the feature domain without inflicting distortions on the feature. The watermark payload is limited by the feature extraction, more precisely by the critical bands. The payload is inversely proportional to the segment duration of the audio file segmentation. Transparency behavior was analyzed as a function of segment size and thus of the watermark payload. At a segment duration of about 20 ms the transparency shows an optimum (measured in units of Objective Difference Grade). Transparency and/or robustness decrease rapidly for working points beyond this area. Therefore, these working points are unsuitable for gaining the further payload needed to embed the whole authentication information. In this paper we present a hierarchical extension

  20. Custom Architecture for Immersive-Audio Applications

    NARCIS (Netherlands)

    Theodoropoulos, D.N.

    2011-01-01

    In this dissertation, we propose a new approach for rapid development of multi-core immersive-audio systems. We study two popular immersive-audio techniques, namely the Beamforming and the Wave Field Synthesis (WFS). Beamforming utilizes microphone arrays to extract acoustic sources recorded in a no

  1. A listening test system for automotive audio

    DEFF Research Database (Denmark)

    Christensen, Flemming; Geoff, Martin; Minnaar, Pauli;

    2005-01-01

    This paper describes a system for simulating automotive audio through headphones for the purposes of conducting listening experiments in the laboratory. The system is based on binaural technology and consists of a component for reproducing the sound of the audio system itself and a component...

  2. Audio-Visual Aids in Universities

    Science.gov (United States)

    Douglas, Jackie

    1970-01-01

    A report on the proceedings and ideas expressed at a one day seminar on "Audio-Visual Equipment--Its Uses and Applications for Teaching and Research in Universities." The seminar was organized by England's National Committee for Audio-Visual Aids in Education in conjunction with the British Universities Film Council. (LS)

  3. Digital Advances in Contemporary Audio Production.

    Science.gov (United States)

    Shields, Steven O.

    Noting that a revolution in sonic high fidelity occurred during the 1980s as digital-based audio production methods began to replace traditional analog modes, this paper offers both an overview of digital audio theory and descriptions of some of the related digital production technologies that have begun to emerge from the mating of the computer…

  4. Exploiting Acoustic Similarity of Propagating Paths for Audio Signal Separation

    Directory of Open Access Journals (Sweden)

    Yin Bin

    2003-01-01

    Blind signal separation can easily find its position in audio applications where mutually independent sources need to be separated from their microphone mixtures while both room acoustics and sources are unknown. However, the conventional separation algorithms can hardly be implemented in real time due to their high computational complexity. The computational load is mainly caused by either direct or indirect estimation of thousands of acoustic parameters. Aiming at complexity reduction, in this paper the acoustic paths are investigated through an acoustic similarity index (ASI). Then a new mixing model is proposed. With closely spaced microphones (5–10 cm apart), the model relieves the computational load of the separation algorithm by reducing the number and length of the filters to be adjusted. To cope with real situations, a blind audio signal separation algorithm (BLASS) is developed on the proposed model. BLASS only uses the second-order statistics (SOS) and performs efficiently in the frequency domain.

  5. Stego-audio Using Genetic Algorithm Approach

    Directory of Open Access Journals (Sweden)

    V. Santhi

    2014-06-01

    With the rapid development of digital multimedia applications, secure data transmission has become the main issue in data communication systems. Multimedia data hiding techniques have therefore been developed to ensure secure data transfer. Steganography is the art of hiding a secret message within an image/audio/video file in such a way that the secret message cannot be perceived by a hacker or intruder. In this study, we use the RSA encryption algorithm to encrypt the message and a Genetic Algorithm (GA) to encode the message in the audio file. This study presents a method to access the negative audio bytes and to include them in the message encoding and position embedding process. This increases the capacity for encoding the message in the audio file. The use of GA operators in the Genetic Algorithm reduces the noise distortions.

  6. An audio encryption using transposition method

    Directory of Open Access Journals (Sweden)

    Ahmad Jawahir

    2015-07-01

    Encryption is a technique for securing sound data from attackers. In this study, a transposition technique suited to the WAV file format is used. The performance of the transposition technique is measured using the mean square error (MSE). In the test, the MSE was computed between the original and encrypted audio files, and between the original and decrypted audio files, using the correct password ‘SEMBILAN’ and the incorrect password ‘DELAPAN’. The experimental results showed MSE = 0 for the original versus encrypted audio files and for the original versus the audio files decrypted with the correct password, and MSE = 0.00000428 (i.e., ≠ 0) for the files decrypted with the incorrect password. In other words, the transposition technique is able to ensure the security of audio data files.
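
    The password-keyed transposition and the MSE check described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' scheme: the record does not describe how the transposition is derived from the password, so the hash-seeded shuffle, the function names and the sample values below are all hypothetical. Only the two passwords are taken from the abstract.

```python
import hashlib
import random
from typing import List, Sequence


def _permutation(password: str, n: int) -> List[int]:
    """Derive a repeatable permutation of range(n) from the password (assumed scheme)."""
    seed = int.from_bytes(hashlib.sha256(password.encode("utf-8")).digest(), "big")
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return order


def encrypt(samples: Sequence[int], password: str) -> List[int]:
    """Transpose (reorder) the samples; sample values themselves are unchanged."""
    order = _permutation(password, len(samples))
    return [samples[i] for i in order]


def decrypt(scrambled: Sequence[int], password: str) -> List[int]:
    """Undo the transposition implied by the given password."""
    order = _permutation(password, len(scrambled))
    restored = [0] * len(scrambled)
    for position, source in enumerate(order):
        restored[source] = scrambled[position]
    return restored


def mse(a: Sequence[int], b: Sequence[int]) -> float:
    """Mean square error between two equal-length sample sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical stand-in for WAV sample data: 64 distinct integer samples.
samples = [(i * 37) % 101 - 50 for i in range(64)]
encrypted = encrypt(samples, "SEMBILAN")

mse_correct = mse(samples, decrypt(encrypted, "SEMBILAN"))
mse_wrong = mse(samples, decrypt(encrypted, "DELAPAN"))
```

    Decrypting with the correct password restores the sample order exactly, so the MSE is zero; with the other password the samples stay out of order and the MSE is nonzero.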

  7. Authenticity examination of compressed audio recordings using detection of multiple compression and encoders' identification.

    Science.gov (United States)

    Korycki, Rafal

    2014-05-01

    Since the appearance of digital audio recordings, audio authentication has become increasingly difficult. The currently available technologies and free editing software allow a forger to cut or paste any single word without audible artifacts. Nowadays, the only method referring to digital audio files commonly approved by forensic experts is the ENF criterion. It consists in fluctuation analysis of the mains frequency induced in electronic circuits of recording devices. Therefore, its effectiveness is strictly dependent on the presence of mains signal in the recording, which is a rare occurrence. Recently, much attention has been paid to authenticity analysis of compressed multimedia files and several solutions were proposed for detection of double compression in both digital video and digital audio. This paper addresses the problem of tampering detection in compressed audio files and discusses new methods that can be used for authenticity analysis of digital recordings. Presented approaches consist in evaluation of statistical features extracted from the MDCT coefficients as well as other parameters that may be obtained from compressed audio files. Calculated feature vectors are used for training selected machine learning algorithms. The detection of multiple compression covers tampering activities as well as identification of traces of montage in digital audio recordings. To enhance the methods' robustness an encoder identification algorithm was developed and applied based on analysis of inherent parameters of compression. The effectiveness of tampering detection algorithms is tested on a predefined large music database consisting of nearly one million compressed audio files. The influence of compression algorithms' parameters on the classification performance is discussed, based on the results of the current study.

  8. Features of Virchow-Robin spaces in newly diagnosed multiple sclerosis patients

    Energy Technology Data Exchange (ETDEWEB)

    Etemadifar, Masoud [Department of Clinical and Biological Sciences, Division of Neurology, San Luigi Gonzaga School of Medicine, Orbassano (Torino), Turin (Italy); Department of Neurology, Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Isfahan Research Committee of Multiple Sclerosis (IRCOMS), Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Hekmatnia, Ali; Tayari, Nazila [Department of Radiology, Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Kazemi, Mojtaba [Department of Neurology, Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Ghazavi, Amirhossein [Department of Radiology, Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Akbari, Mojtaba [Department of Epidemiology and Statistics, Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Maghzi, Amir-Hadi, E-mail: maghzi@edc.mui.ac.ir [Isfahan Research Committee of Multiple Sclerosis (IRCOMS), Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Neuroimmunology Unit, Centre for Neuroscience and Trauma, Blizard Institute of Cell and Molecular Science, Barts and the London School of Medicine and Dentistry, London (United Kingdom); Isfahan Neurosciences Research Center, Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of)

    2011-11-15

    Background: Virchow-Robin spaces (VRSs) are perivascular pia-lined extensions of the subarachnoid space around the arteries and veins as they enter the brain parenchyma. These spaces are responsible for inflammatory processes within the brain. Objectives: This study was designed to shed more light on the location, size and shape of VRSs on 3 mm slice thickness, 1.5 Tesla MRI scans of newly diagnosed MS patients in Isfahan, Iran and compare the results with healthy age- and sex-matched controls. Methods: We evaluated MRI scans of 73 MS patients obtained within 3 months of MS onset and compared them with MRI scans from 73 age- and sex-matched healthy volunteers. Three mm section proton density, T2W and FLAIR MR images were obtained for all subjects. The location, size and shape of VRSs were compared between the two groups. Results: The total number of VRSs was significantly higher in the MS group (p < 0.001). VRSs were significantly more often located in the high convexity areas in the MS group (p < 0.001), while there were no significant differences in other regions. Round VRSs were detected significantly more often on MRI scans of MS patients, and curvilinear shapes were observed significantly more frequently in healthy volunteers; however, there were no significant differences for oval VRSs between the two groups. VRSs larger than 2 mm were observed significantly more often in the MS group than in controls. We also observed some differences in the characteristics of VRSs between the genders in the MS group. Conclusion: The results of this study shed more light on the usefulness of VRSs as an MRI marker for the disease. In addition, according to our results, VRSs might also have implications for determining the prognosis of the disease. However, larger studies with more advanced MRI techniques are required to confirm our results.

  9. Features of Virchow-Robin spaces in newly diagnosed multiple sclerosis patients

    International Nuclear Information System (INIS)

    Background: Virchow-Robin spaces (VRSs) are perivascular pia-lined extensions of the subarachnoid space around the arteries and veins as they enter the brain parenchyma. These spaces are responsible for inflammatory processes within the brain. Objectives: This study was designed to shed more light on the location, size and shape of VRSs on 3 mm slice thickness, 1.5 Tesla MRI scans of newly diagnosed MS patients in Isfahan, Iran, and to compare the results with healthy age- and sex-matched controls. Methods: We evaluated MRI scans of 73 MS patients obtained within 3 months of MS onset and compared them with MRI scans from 73 age- and sex-matched healthy volunteers. Three-mm section proton density, T2W and FLAIR MR images were obtained for all subjects. The location, size and shape of VRSs were compared between the two groups. Results: The total number of VRSs was significantly higher in the MS group (p < 0.001). VRSs were significantly more often located in the high convexity areas in the MS group (p < 0.001), while there were no significant differences in other regions. Round VRSs were detected significantly more often on MRI scans of MS patients, and curvilinear shapes were observed significantly more frequently in healthy volunteers; however, there were no significant differences for oval VRSs between the two groups. VRSs larger than 2 mm were observed significantly more often in the MS group than in controls. We also observed some differences in the characteristics of VRSs between the genders in the MS group. Conclusion: The results of this study shed more light on the usefulness of VRSs as an MRI marker for the disease. In addition, according to our results, VRSs might also have implications for determining the prognosis of the disease. However, larger studies with more advanced MRI techniques are required to confirm our results.

  10. The HDTV digital audio matrix

    Science.gov (United States)

    Mason, A. J.

    Multichannel sound systems are being studied as part of the Eureka 95 and Radio-communication Bureau TG10-1 investigations into high definition television. One emerging sound system has five channels; three at the front and two at the back. This raises some compatibility issues. The listener might have only, say, two loudspeakers or the material to be broadcast may have fewer than five channels. The problem is how best to produce a set of signals to be broadcast, which is suitable for all listeners, from those that are available. To investigate this area, a device has been designed and built which has six input channels and six output channels. Each output signal is a linear combination of the input signals. The inputs and outputs are in AES/EBU digital audio format using BBC-designed AESIC chips. The matrix operation, to produce the six outputs from the six inputs, is performed by a Motorola DSP56001. The user interface and 'housekeeping' is managed by a T222 transputer. The operator of the matrix uses a VDU to enter sets of coefficients and a rotary switch to select which set to use. A set of analog controls is also available and is used to control operations other than the simple compatibility matrixing. The matrix has been very useful for simple tasks: mixing a stereo signal into mono, creating a stereo signal from a mono signal, applying a fixed gain or attenuation to a signal, exchanging the A and B channels of an AES/EBU bitstream, and so on. These are readily achieved using simple sets of coefficients. Additions to the user interface software have led to several more sophisticated applications which still consist of a matrix operation. Different multichannel panning laws have been evaluated. The analog controls adjust the panning; the audio signals are processed digitally using a matrix operation. A digital SoundField microphone decoder has also been implemented. digital audio matrix is such that it can be applied to a wide variety of signal processing
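
The matrix operation described in this abstract, where each output signal is a linear combination of the input signals, can be illustrated with a minimal sketch. The function name and the two coefficient sets below are hypothetical and chosen only to mirror the "simple tasks" the abstract lists (stereo to mono, exchanging the A and B channels); the actual device uses six inputs and six outputs and runs on a DSP, not in Python.

```python
def apply_matrix(coeffs, frame):
    """Return one output frame; each output is a linear combination of the inputs.

    coeffs: list of rows, one row of input weights per output channel.
    frame:  list with one sample per input channel.
    """
    return [sum(c * x for c, x in zip(row, frame)) for row in coeffs]


# Hypothetical coefficient set: mix two inputs (left, right) into one mono output.
STEREO_TO_MONO = [[0.5, 0.5]]

# Hypothetical coefficient set: exchange the A and B channels.
SWAP_AB = [[0.0, 1.0],
           [1.0, 0.0]]

frame = [0.75, -0.25]  # one sample per input channel (illustrative values)
mono = apply_matrix(STEREO_TO_MONO, frame)
swapped = apply_matrix(SWAP_AB, frame)
```

Selecting a different task then amounts to selecting a different coefficient set, which matches the role the abstract gives to the rotary switch.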

  11. Competing descriptions of diffusion profiles with two features: Surface space-charge layer versus fast grain-boundary diffusion

    Science.gov (United States)

    Schraknepper, H.; De Souza, R. A.

    2016-02-01

    Two different physical processes, (i) fast grain-boundary diffusion (FGBD) of oxygen and (ii) hindered oxygen diffusion in a surface space-charge layer, yield oxygen isotope diffusion profiles in a similar form. Two features are observed, with the short, sharp profile close to the surface being followed by a longer, shallower profile. In this study, we develop a procedure for deciding which of the two descriptions applies to experimentally measured profiles. Specifically, we solve Fick's second law, using finite-element simulations, to obtain oxygen isotope diffusion profiles for the two cases. Each set of profiles is then analysed in terms of the competing description. In this manner, we derive falsifiable conditions that allow physical processes to be assigned unambiguously to the two features of such isotope profiles. Applying these conditions to experimental profiles for SrTiO3 single crystals published in the literature, we find that FGBD is an invalid model for describing the diffusion processes.
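
For orientation, the standard one-dimensional form of Fick's second law is sketched below. The abstract does not state which form or boundary conditions the authors solve, so the position-dependent diffusion coefficient D(x) and the constant-D special case are given here only as textbook assumptions.

```latex
% Fick's second law in one dimension, with a position-dependent
% diffusion coefficient D(x) (assumed general form):
\frac{\partial c}{\partial t}
  = \frac{\partial}{\partial x}\!\left( D(x)\,\frac{\partial c}{\partial x} \right)

% For constant D this reduces to
\frac{\partial c}{\partial t} = D\,\frac{\partial^{2} c}{\partial x^{2}}

% Textbook solution for a semi-infinite medium with fixed surface
% concentration c_s and background concentration c_bg:
\frac{c(x,t) - c_{\mathrm{bg}}}{c_{\mathrm{s}} - c_{\mathrm{bg}}}
  = \operatorname{erfc}\!\left( \frac{x}{2\sqrt{D t}} \right)
```

A profile with two features, as described above, departs from this single complementary error function, which is why a second process has to be invoked to describe it.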

  12. Specific Features of Discursive Space for Birthday Greetings in Facebook Social Network: Ecolinguistic Perspective

    Directory of Open Access Journals (Sweden)

    Olga Alekseevna Karamalak

    2015-09-01

    The paper deals with the establishment of a new ecological paradigm in linguistics, where language is viewed as an action aimed at achieving results in the future that proceeds in the interaction of an organism and the environment. Language is regarded as a complex activity (holistic approach). Cartesian dualism is displaced by the empirical study of language in correlation with living organisms (biological systems). Verbal performance is considered to be a part of distributed human activity, a coordination between the dynamic system or organism and the environment. In general, Facebook posts are considered as an artificial symbolic niche, i.e. a network of material structures for social interaction that comprises text messages, video, visual, or sound images put by users on their own or others' pages, aimed at coordinating actions and triggering changes. The author outlines some specific features of Internet discourse while analyzing birthday greetings in the Facebook social network posted by Russian, American, French, and German users. Some of the features include the reduced character of the message, the use of abbreviations, simple structures, the merging and interaction of written and spoken languages, and the use of graphical signs to convey emotions. Spoken-like text types mediated by the Internet have recently been of great interest to linguists. The abovementioned social network is used as an environment for an extended, distributed, and diverse ecology with definite culturally oriented values.

  13. Miniaturized, 9-12 micron heterodyne spectrometer with space qualifiable design features

    Science.gov (United States)

    Glenar, D. A.; Mumma, M. J.; Kostiuk, T.; Huffman, H.; Degnan, J.

    1990-01-01

    A demonstration-prototype CO2-laser heterodyne spectrometer operating at 9-12 microns and suitable for long-term space missions is described and illustrated with extensive diagrams, drawings, photographs, and graphs of test performance data. The spectrometer has total volume 0.63 cu m, mass 30 kg, and power requirement 60-70 W, compatible with miniature-class Space Shuttle experiment payload specifications. It comprises three modules: (1) an optical front end with reflecting optics, a 2-GHz BW HgCdTe photomixer, and a 0-2-GHz 40-dB RF preamplifier; (2) a local oscillator with an RF-excited waveguide CO2 laser, a 75-percent-efficiency RF amplifier, a stepper-driven grating mode selector, and an etalon stabilized for over 30,000 h of use; and (3) an RF-filter-bank spectral-line receiver with a 25-MHz RF channel, 1.6-GHz IF spectral coverage, onboard instrument control, a serial link to the host computer, and highly integrated design.

  14. A Visual Analytics Approach Using the Exploration of Multidimensional Feature Spaces for Content-Based Medical Image Retrieval.

    Science.gov (United States)

    Kumar, Ashnil; Nette, Falk; Klein, Karsten; Fulham, Michael; Kim, Jinman

    2015-09-01

    Content-based image retrieval (CBIR) is a search technique based on the similarity of visual features and has demonstrated potential benefits for medical diagnosis, education, and research. However, clinical adoption of CBIR is partially hindered by the difference between the computed image similarity and the user's search intent, the semantic gap, with the end result that relevant images with outlier features may not be retrieved. Furthermore, most CBIR algorithms do not provide intuitive explanations as to why the retrieved images were considered similar to the query (e.g., which subset of features were similar), hence, it is difficult for users to verify if relevant images, with a small subset of outlier features, were missed. Users, therefore, resort to examining irrelevant images and there are limited opportunities to discover these "missed" images. In this paper, we propose a new approach to medical CBIR by enabling a guided visual exploration of the search space through a tool, called visual analytics for medical image retrieval (VAMIR). The visual analytics approach facilitates interactive exploration of the entire dataset using the query image as a point-of-reference. We conducted a user study and several case studies to demonstrate the capabilities of VAMIR in the retrieval of computed tomography images and multimodality positron emission tomography and computed tomography images. PMID:25296409

  15. Ubiquitous WLAN/Camera Positioning using Inverse Intensity Chromaticity Space-based Feature Detection and Matching: A Preliminary Result

    CERN Document Server

    Bejuri, Wan Mohd Yaakob Wan; Sapri, Maimunah; Rosly, Mohd Adly

    2012-01-01

    This paper presents our new intensity chromaticity space-based feature detection and matching algorithm. This approach hybridizes a wireless local area network and a camera internal sensor, receiving signal strength from an access point and at the same time retrieving interest point information from hallways. This information is combined by a model fitting approach in order to find the absolute position of the target user. No conventional searching algorithm is required, so the computational complexity is expected to be reduced. Finally, we present pre-experimental results to illustrate the performance of the localization system for an indoor environment set-up.

  16. EFFECTS OF ELECTRODE SPACING AND INVERSION TECHNIQUES ON THE EFFICACY OF 2D RESISTIVITY IMAGING TO DELINEATE SUBSURFACE FEATURES

    Directory of Open Access Journals (Sweden)

    Adiat Kola Abdul-Nafiu

    2013-01-01

    In this study, the effect of the choice of appropriate electrode spacing and inversion algorithms on the efficacy of 2D imaging to map subsurface features was investigated. The target being investigated was a concrete drainage pipe buried at approximately 0.3 m into the subsurface. A profile perpendicular to the strike of the pipe was established. 2D resistivity data were collected separately with electrode spacings of 1.5 m and 0.5 m, using the Dipole-Dipole, the Wenner and the Wenner-Schlumberger array configurations. The results obtained showed that when the electrode spacing of 1.5 m was used for the investigations, none of the three array types was able to map the target with either of the two inversion techniques. The results further show that the attainment of an RMS error of less than about 10%, which usually gives the indication of a good subsurface model, is not a guarantee that subsurface features are successfully mapped. On the other hand, when the electrode spacing of 0.5 m was used for the data collection, the results obtained with the standard constraint inversion technique showed that all three array configurations mapped the target; however, only the dipole-dipole array was able to resolve the boundary between the concrete pipe and the entrapped air. With the robust constraint inversion technique, the target was also successfully mapped by all three array types. In addition to this, the boundary between the entrapped air and the concrete pipe was resolved by all three array types. This suggests that if there is a significant contrast in the subsurface layers' resistivities, the robust constraint inversion technique gives better boundary resolution irrespective of the array type used for the survey. The inversion of the 3D data gave 3D resistivity sections which were presented as horizontal depth slices. The result obtained from the inversion of the 3D data has assisted us in getting information about the

  17. High-Fidelity Piezoelectric Audio Device

    Science.gov (United States)

    Woodward, Stanley E.; Fox, Robert L.; Bryant, Robert G.

    2003-01-01

    ModalMax is a very innovative means of harnessing the vibration of a piezoelectric actuator to produce an energy-efficient, low-profile device with high-bandwidth, high-fidelity audio response. The piezoelectric audio device outperforms many commercially available speakers made using speaker cones. The piezoelectric device weighs substantially less (4 g) than the speaker cones which use magnets (10 g). ModalMax devices have extreme fabrication simplicity. The entire audio device is fabricated by lamination. The simplicity of the design lends itself to lower cost. The piezoelectric audio device can be used without its acoustic chambers, resulting in a very low thickness of 0.023 in. (0.58 mm). The piezoelectric audio device can be completely encapsulated, which makes it very attractive for use in wet environments. Encapsulation does not significantly alter the audio response. Its small size (see Figure 1) is applicable to many consumer electronic products, such as pagers, portable radios, headphones, laptop computers, computer monitors, toys, and electronic games. The audio device can also be used in automobile or aircraft sound systems.

  18. High-energy electromagnetic cascades in extragalactic space: Physics and features

    Science.gov (United States)

    Berezinsky, V.; Kalashev, O.

    2016-07-01

    Using the analytic modeling of the electromagnetic cascades compared with more precise numerical simulations, we describe the physical properties of electromagnetic cascades developing in the universe on cosmic microwave background and extragalactic background light radiations. A cascade is initiated by very-high-energy photon or electron, and the remnant photons at large distance have two-component energy spectrum, ∝ E^(-2) (∝ E^(-1.9) in numerical simulations) produced at the cascade multiplication stage and ∝ E^(-3/2) from Inverse Compton electron cooling at low energies. The most noticeable property of the cascade spectrum in analytic modeling is "strong universality," which includes the standard energy spectrum and the energy density of the cascade ω_cas as its only numerical parameter. Using numerical simulations of the cascade spectrum and comparing it with recent Fermi LAT spectrum, we obtained the upper limit on ω_cas stronger than in previous works. The new feature of the analysis is the "E_max rule." We investigate the dependence of ω_cas on the distribution of sources, distinguishing two cases of universality: the strong and weak ones.

  19. The Visualization and Analysis of POI Features under Network Space Supported by Kernel Density Estimation

    Directory of Open Access Journals (Sweden)

    YU Wenhao

    2015-01-01

    The distribution pattern and the distribution density of urban facility POIs are of great significance in the fields of infrastructure planning and urban spatial analysis. Kernel density estimation, which has usually been utilized for expressing these spatial characteristics, is superior to other density estimation methods (such as quadrat analysis and Voronoi-based methods), because kernel density estimation considers the regional impact based on the first law of geography. However, traditional kernel density estimation is mainly based on Euclidean space, ignoring the fact that the service function and interrelation of urban facilities are carried out over network path distance rather than conventional Euclidean distance. Hence, this research proposes a computational model of network kernel density estimation, and an extended type of the model for the case of added constraints. This work also discusses the impacts of the distance attenuation threshold and height extreme on the representation of kernel density. A large-scale experiment on actual data, analyzing the distribution patterns of different POIs (random type, sparse type, regional-intensive type, linear-intensive type), discusses the spatial distribution characteristics, influence factors, and service functions of POI infrastructure in the city.
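
The idea of replacing Euclidean distance with network path distance in kernel density estimation can be sketched as follows. This is not the model proposed in the paper: the toy road graph, the quartic kernel shape, the function names, and the use of Dijkstra's algorithm for path distances are all assumptions made for illustration, and the density values are left unnormalized.

```python
import heapq


def network_distances(graph, source):
    """Dijkstra shortest-path distances from source over a weighted graph."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, length in graph[node]:
            nd = d + length
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(queue, (nd, neighbor))
    return dist


def quartic_kernel(d, bandwidth):
    """Quartic kernel shape; zero at and beyond the bandwidth (attenuation threshold)."""
    if d >= bandwidth:
        return 0.0
    u = d / bandwidth
    return (1.0 - u * u) ** 2


def network_kde(graph, poi_nodes, bandwidth):
    """Unnormalized density at every node, summed over POIs using path distance."""
    density = {node: 0.0 for node in graph}
    for poi in poi_nodes:
        for node, d in network_distances(graph, poi).items():
            density[node] += quartic_kernel(d, bandwidth)
    return density


# Hypothetical road network A - B - C - D with 100 m segments.
graph = {
    "A": [("B", 100.0)],
    "B": [("A", 100.0), ("C", 100.0)],
    "C": [("B", 100.0), ("D", 100.0)],
    "D": [("C", 100.0)],
}
density = network_kde(graph, poi_nodes=["A"], bandwidth=200.0)
```

The `bandwidth` argument plays the role of a distance attenuation threshold: a POI contributes nothing to nodes whose path distance reaches or exceeds it, regardless of how close they are in a straight line.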

  20. Temporal structure and complexity affect audio-visual correspondence detection

    Directory of Open Access Journals (Sweden)

    Rachel N Denison

    2013-01-01

    Synchrony between events in different senses has long been considered the critical temporal cue for multisensory integration. Here, using rapid streams of auditory and visual events, we demonstrate how humans can use temporal structure (rather than mere temporal coincidence) to detect multisensory relatedness. We find psychophysically that participants can detect matching auditory and visual streams via shared temporal structure for crossmodal lags of up to 200 ms. Performance on this task reproduced features of past findings based on explicit timing judgments but did not show any special advantage for perfectly synchronous streams. Importantly, the complexity of temporal patterns influences sensitivity to correspondence. Stochastic, irregular streams – with richer temporal pattern information – led to higher audio-visual matching sensitivity than predictable, rhythmic streams. Our results reveal that temporal structure and its complexity are key determinants for human detection of audio-visual correspondence. The distinctive emphasis of our new paradigms on temporal patterning could be useful for studying special populations with suspected abnormalities in audio-visual temporal perception and multisensory integration.

  1. Audio recording and reproduction in CARROUSO: Getting closer to perfection?

    Science.gov (United States)

    Teutsch, Heinz; Spors, Sascha; Buchner, Herbert; Rabenstein, Rudolf; Kellermann, Walter

    2002-05-01

    State-of-the-art systems for spatial audio reproduction utilize two to six discrete playback channels. A problem inherent to these systems is the relatively small area where the listener is able to experience a true 3-D sound sensation. This so-called "sweet spot" can be significantly enlarged by using loudspeaker arrays in combination with wave field synthesis (WFS) technology, initially developed at Delft University. By following this approach, actual sonic spaces can be reproduced in their entirety and not only discrete multichannel representations thereof. While loudspeaker arrays can be used to reproduce sound fields, microphone arrays can be used for sound field capture and analysis. Having high-quality audio reproduction in mind, microphone array designs are presented that need to fulfill stricter requirements than what has been traditionally considered for microphone array applications. Information on acoustic source position is essential for WFS-based rendering techniques. As will be shown, joint audio-video object tracking proves to be efficient for this task. Moreover, full-duplex applications based on WFS technology, like high-quality teleconferencing or remote music teaching, call for sophisticated multichannel acoustic echo cancellation algorithms. The European project "CARROUSO" aims at developing, integrating, and building a real-time system that embraces all previously described technologies in an MPEG-4 context.

  2. Audio-visual voice activity detection

    Institute of Scientific and Technical Information of China (English)

    LIU Peng; WANG Zuo-ying

    2006-01-01

    In speech signal processing systems, the frame-energy based voice activity detection (VAD) method may be disturbed by background noise and by the non-stationary character of the frame energy within voice segments. The purpose of this paper is to improve the performance and robustness of VAD by introducing visual information. A data-driven linear transformation is adopted for visual feature extraction, and a general statistical VAD model is designed. Using the general model and a two-stage fusion strategy presented in this paper, a concrete multimodal VAD system is built. Experiments show that a 55.0% relative reduction in frame error rate and a 98.5% relative reduction in sentence-breaking error rate are obtained when using multimodal VAD, compared to frame-energy based audio VAD. The results show that with the multimodal method, sentence-breaking errors are almost avoided and frame-detection performance is clearly improved, which proves the effectiveness of the visual modality in VAD.
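
As a point of reference for the audio-only baseline named in this abstract, a frame-energy VAD can be sketched in a few lines. The frame length, threshold, sample values, and function names are hypothetical, and the error rates passed to `relative_reduction` are made-up numbers that only show how a figure such as "55.0% relative reduction" is computed; none of them come from the paper.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def energy_vad(samples, frame_len, threshold):
    """Label a frame as speech when its energy exceeds a fixed threshold."""
    return [e > threshold for e in frame_energies(samples, frame_len)]


def relative_reduction(baseline_error, new_error):
    """Relative error reduction of a new system with respect to a baseline."""
    return (baseline_error - new_error) / baseline_error


# Illustrative signal: quiet frame, loud frame, quiet frame.
signal = [0.0, 0.01, -0.01, 0.0,
          0.5, -0.6, 0.4, -0.5,
          0.0, 0.02, 0.0, -0.01]
decisions = energy_vad(signal, frame_len=4, threshold=0.01)

# Hypothetical frame error rates for a baseline and an improved system.
reduction = relative_reduction(baseline_error=0.20, new_error=0.09)
```

A fixed energy threshold like this one is exactly what background noise tends to defeat, which is the weakness the abstract addresses by adding visual information.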

  3. Complete guide to high-end audio

    CERN Document Server

    Harley, Robert

    2010-01-01

    In this newly updated directory, the latest in cutting-edge audio equipment is provided, including how to choose the best audio equipment on a budget, how to get the best sound for the money, and how to set up a system for maximum performance. Revised and expanded to include all the latest audio technologies, this book is packed with expert advice on how to make speakers sound up to 50 percent better at no cost, how to avoid the most common system set-up mistakes, and how to choose the one speaker in 50 worth owning. Among the new topics covered are computer-based music servers, wireless streaming of au

  4. Mobile video-to-audio transducer and motion detection for sensory substitution

    Directory of Open Access Journals (Sweden)

    Maxime Ambard

    2015-10-01

    Visuo-auditory sensory substitution systems are augmented reality devices that translate a video stream into an audio stream in order to help the blind in daily tasks requiring visuo-spatial information. In this work, we present both a new mobile device and a transcoding method specifically designed to sonify moving objects. Frame differencing is used to extract spatial features from the video stream and two-dimensional spatial information is converted into audio cues using pitch, interaural time difference and interaural level difference. Using numerical methods, we attempt to reconstruct visuo-spatial information based on audio signals generated from various video stimuli. We show that despite a contrasted visual background and a highly lossy encoding method, the information in the audio signal is sufficient to allow object localization, object trajectory evaluation, object approach detection, and spatial separation of multiple objects. We also show that this type of audio signal can be interpreted by human users by asking ten subjects to discriminate trajectories based on generated audio signals.
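
The pipeline described in this abstract, frame differencing followed by a mapping from image position to audio cues, can be sketched as below. The abstract does not say which image axis drives which cue, so the mapping here (vertical position to pitch, horizontal position to interaural level difference) is an assumption, as are the function names, frequency range, and level range. Interaural time difference is left out to keep the sketch short.

```python
def frame_difference(prev, curr, threshold):
    """Mark pixels whose absolute intensity change exceeds the threshold."""
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def sonify(mask, f_low=200.0, f_high=2000.0, max_ild_db=10.0):
    """Map each moving pixel to a (pitch in Hz, level difference in dB) pair.

    Assumed mapping: higher in the image means higher pitch; further right
    means a larger positive (right-louder) level difference.
    Requires at least two rows and two columns.
    """
    rows, cols = len(mask), len(mask[0])
    cues = []
    for r, row in enumerate(mask):
        for c, moving in enumerate(row):
            if not moving:
                continue
            height = 1.0 - r / (rows - 1)        # 1.0 at the top row
            pan = 2.0 * c / (cols - 1) - 1.0     # -1.0 left, +1.0 right
            pitch = f_low + height * (f_high - f_low)
            cues.append((pitch, pan * max_ild_db))
    return cues


# Two 3x3 grayscale frames: one pixel changes in the top-right corner.
prev_frame = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
curr_frame = [[0, 0, 255], [0, 0, 0], [0, 0, 0]]
mask = frame_difference(prev_frame, curr_frame, threshold=30)
cues = sonify(mask)
```

Because only changed pixels produce cues, a static background stays silent, which is consistent with the abstract's emphasis on sonifying moving objects.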

  5. Main circulator design features for HTR 100, HTR 500 and space heating plants

    International Nuclear Information System (INIS)

    All design alternatives for modern high-temperature reactors have a common circulator concept: It is based on a vertical shaft design with a flying impeller. The circulators are equipped with active magnetic bearings and are driven by induction motors connected to variable-speed static converters. Due to their multiple functions during normal reactor operation and under accident conditions, extremely high requirements are made to safety-relevant circulators, since with the reactor pressurized as well as under depressurized conditions specified delivery heads and flow rates have to be ensured. The use of active magnetic bearings permits to obtain maintenance-free operation and functional safety to an extent which had not been achieved before. Magnetic bearings are therefore provided for the total range including primary gas circulators of a drive power of several MW as well as circulators for helium loops of reactor auxiliary systems. The essential feature for using active magnetic bearings is the retainer bearing technology, preventing contact between rotor and static circulator parts upon unintended deenergisation of the magnets. Results of current experiments are reported. Another aspect to be considered for reliable long-term operation for several decades is the effect of rotor dynamics. The various natural frequencies resulting from torsion and bending modes in view of a drive by a frequency-controlled induction motor have to be considered as well as the specific characteristics of the active magnetic bearings. Special attention has to be directed to the internal cooling loop so as to ensure that reactor temperature excursions in the event of deviation from normal operation can be overcome without damage. For circulator components exposed to temperature fields the design characteristics are determined by combining experimental and analytical methods. The coordination of all component parts is currently being optimized on a prototype circulator whose detailed

  6. Estimating Species Distributions Across Space Through Time and with Features of the Environment

    Energy Technology Data Exchange (ETDEWEB)

    Kelling, S. [Cornell Lab of Ornithology; Fink, D. [Cornell Lab of Ornithology; Hochachka, W. [Cornell Lab of Ornithology; Rosenberg, K. [Cornell Lab of Ornithology; Cook, R. [Oak Ridge National Laboratory (ORNL); Damoulas, C. [Department of Computer Science, Cornell University; Silva, C. [Department of Computer Science, Polytechnic Institute of New York; Michener, W. [DataONE, University of New Mexico

    2013-01-01

    Complete guidance for mastering the tools and techniques of the digital revolution. With the digital revolution opening up tremendous opportunities in many fields, there is a growing need for skilled professionals who can develop data-intensive systems and extract information and knowledge from them. This book frames for the first time a new systematic approach for tackling the challenges of data-intensive computing, providing decision makers and technical experts alike with practical tools for dealing with our exploding data collections. Emphasizing data-intensive thinking and interdisciplinary collaboration, The Data Bonanza: Improving Knowledge Discovery in Science, Engineering, and Business examines the essential components of knowledge discovery, surveys many of the current research efforts worldwide, and points to new areas for innovation. Complete with a wealth of examples and DISPEL-based methods demonstrating how to gain more from data in real-world systems, the book: outlines the concepts and rationale for implementing data-intensive computing in organizations; covers from the ground up problem-solving strategies for data analysis in a data-rich world; introduces techniques for data-intensive engineering using the Data-Intensive Systems Process Engineering Language (DISPEL); features in-depth case studies in customer relations, environmental hazards, seismology, and more; showcases successful applications in areas ranging from astronomy and the humanities to transport engineering; and includes sample program snippets throughout the text as well as additional materials on a companion website. The Data Bonanza is a must-have guide for information strategists, data analysts, and engineers in business, research, and government, and for anyone wishing to be on the cutting edge of data mining, machine learning, databases, distributed systems, or large-scale computing.

  7. Virtual environment display for a 3D audio room simulation

    Science.gov (United States)

    Chapin, William L.; Foster, Scott

    1992-06-01

    Recent developments in virtual 3D audio and synthetic aural environments have produced a complex acoustical room simulation. The acoustical simulation models a room with walls, ceiling, and floor of selected sound reflecting/absorbing characteristics and unlimited independent localizable sound sources. This non-visual acoustic simulation, implemented with 4 audio Convolvotrons(TM) by Crystal River Engineering and coupled to the listener with a Polhemus Isotrak(TM), tracking the listener's head position and orientation, and stereo headphones returning binaural sound, is quite compelling to most listeners with eyes closed. This immersive effect should be reinforced when properly integrated into a full, multi-sensory virtual environment presentation. This paper discusses the design of an interactive, visual virtual environment, complementing the acoustic model and specified to: 1) allow the listener to freely move about the space, a room of manipulable size, shape, and audio character, while interactively relocating the sound sources; 2) reinforce the listener's feeling of telepresence into the acoustical environment with visual and proprioceptive sensations; 3) enhance the audio with the graphic and interactive components, rather than overwhelm or reduce it; and 4) serve as a research testbed and technology transfer demonstration. The hardware/software design of two demonstration systems, one installed and one portable, is discussed through the development of four iterative configurations. The installed system implements a head-coupled, wide-angle, stereo-optic tracker/viewer and multi-computer simulation control. The portable demonstration system implements a head-mounted wide-angle, stereo-optic display, separate head and pointer electro-magnetic position trackers, a heterogeneous parallel graphics processing system, and object oriented C++ program code.

  8. Survey Musik und Medien 2012: Audio Media Usage in Germany - Audio Devices - Audio Devices used in 2012 - Radio Traditionalists

    OpenAIRE

    Lepa, Steffen

    2013-01-01

    By what means was music played back in 2012? Audio Devices comprise technical devices that permit access to and enable playback of Audio Sources. This includes CD players, record players, cassette recorders, MP3 players and smartphones but also computers and various multimedia entertainment devices that allow music use. Radio Traditionalists are represented in various age groups, and may be born in 1920 as well as in 1959. They constitute 22.2% of the German population between age 14 and ...

  9. Survey Musik und Medien 2012: Audio Media Usage in Germany - Audio Devices - Audio Devices used in 2012 - Digital Mobilists

    OpenAIRE

    Lepa, Steffen

    2013-01-01

    By what means was music played back in 2012? Audio Devices comprise technical devices that permit access to and enable playback of Audio Sources. This includes CD players, record players, cassette recorders, MP3 players and smartphones but also computers and various multimedia entertainment devices that allow music use. Typically born between 1979 and 1998, the Digital Mobilists constitute the youngest user type and comprise 16.1% of the German population. They access free video streaming...

  10. Implementation of Audio signal by using wavelet transform

    Directory of Open Access Journals (Sweden)

    Chakresh Kumar

    2010-10-01

    Audio coding is the technology to represent audio in digital form with as few bits as possible while maintaining the intelligibility and quality required for a particular application. Interest in audio coding is motivated by the evolution to digital communications and the requirement to minimize bit rate, and hence conserve bandwidth. There is always a tradeoff between compression ratio and maintaining the delivered audio quality and intelligibility. Audio coding is widely used in applications such as digital broadcasting, Internet audio, or music databases to reduce the bit rate of high-quality audio signals without compromising the perceptual quality. In this dissertation work, the design and implementation of an MPEG lossless audio codec using the wavelet transform is proposed. The major issues concerning the development of the audio codec are choosing optimal wavelets for audio signals, the decomposition level in the digital wavelet transform, and the thresholding criteria for coefficient truncation, which is the basis for providing a compression ratio for audio with a suitable peak signal-to-noise ratio (PSNR). The wavelet packet compression technique has also been used to compare the performance of the audio codec using the wavelet transform. A psychoacoustic model is used to improve the quality of the audio signal. The proposed audio codec has been implemented on the DSK6713 Starter Kit using MATLAB-7.3 and Link to Code Composer Studio, and various audio signals of different time durations have been tested. The results obtained show that the proposed codec improves the quality of the reconstructed audio signal.
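
The three ingredients this abstract names (a wavelet decomposition, thresholding of coefficients, and PSNR as the quality measure) can be illustrated with a minimal sketch. It is not the proposed codec: the choice of a one-level Haar wavelet, the hard threshold value, the peak value of 1.0, and all function names are assumptions for illustration, and the abstract does not say which wavelet or threshold rule the author selected.

```python
import math


def haar_forward(x):
    """One level of the orthonormal Haar transform (len(x) must be even)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert one level of the orthonormal Haar transform."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def hard_threshold(coeffs, t):
    """Zero every coefficient whose magnitude is below t."""
    return [c if abs(c) >= t else 0.0 for c in coeffs]


def psnr(original, reconstructed, peak=1.0):
    """Peak signal-to-noise ratio in dB for signals with the given peak value."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


# Illustrative samples in the range [-1, 1].
signal = [0.50, 0.52, 0.10, 0.90, -0.30, -0.31, 0.00, 0.02]
approx, detail = haar_forward(signal)
kept = hard_threshold(detail, 0.1)           # small detail coefficients are dropped
reconstructed = haar_inverse(approx, kept)
quality_db = psnr(signal, reconstructed)
```

Raising the threshold zeroes more coefficients, which helps compression but lowers the PSNR; that is the tradeoff between compression ratio and delivered quality described above.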

  11. Security of audio secret sharing scheme encrypting audio secrets with bounded shares

    OpenAIRE

    鷲尾, 槙也; 渡邊, 曜大

    2014-01-01

    Secret sharing is a method of encrypting a secret into multiple pieces called shares so that only qualified sets of shares can be employed to reconstruct the secret. Audio secret sharing (ASS) is an example of secret sharing whose decryption can be performed by human ears. This paper examines the security of an audio secret sharing scheme encrypting audio secrets with bounded shares, and optimizes the security with respect to the probability distribution used in its encryption.
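    As a generic illustration of secret sharing (not the audio scheme studied in this paper), the following sketch splits a byte string into n shares with XOR so that all n shares are needed to reconstruct it. The function names and the choice of an (n, n) XOR scheme are assumptions made for the example.

```python
import secrets
from functools import reduce

def split_secret(secret: bytes, n: int) -> list[bytes]:
    """Split `secret` into n shares; all n are required to recover it."""
    if n < 2:
        raise ValueError("need at least two shares")
    # n - 1 shares are uniformly random; the last one makes the XOR equal the secret.
    shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    last = bytes(reduce(lambda a, b: a ^ b, column) for column in zip(secret, *shares))
    return shares + [last]

def combine_shares(shares: list[bytes]) -> bytes:
    """XOR all shares together byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*shares))

shares = split_secret(b"audio secret", 3)
print(combine_shares(shares))
```

    Any subset of fewer than n shares is uniformly random and reveals nothing about the secret; in this toy scheme the only qualified set is the full set.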

  12. A Morphological Analysis of Audio Objects and their Control Methods for 3D Audio

    OpenAIRE

    Mathew, Justin; Huot, Stéphane; Blum, Alan

    2014-01-01

    International audience Recent technological improvements in audio reproduction systems increased the possibilities to spatialize sources in a listening environment. The spatialization of reproduced audio is highly dependent on the recording technique, the rendering method, and the loudspeaker configuration. While object-based audio production reduces this dependency on loudspeaker configurations, related authoring tools are still difficult to interact with. In this paper, we investigate th...

  13. Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder

    OpenAIRE

    Chung, Yu-An; Wu, Chao-Chung; Shen, Chia-Hao; Lee, Hung-Yi; Lee, Lin-shan

    2016-01-01

    The vector representations of fixed dimensionality for words (in text) offered by Word2Vec have been shown to be very useful in many application scenarios, in particular due to the semantic information they carry. This paper proposes a parallel version, the Audio Word2Vec. It offers the vector representations of fixed dimensionality for variable-length audio segments. These vector representations are shown to describe the sequential phonetic structures of the audio segments to a good degree, ...

  14. Elicitation of attributes for the evaluation of audio-on-audio interference

    DEFF Research Database (Denmark)

    Francombe, Jon; Mason, R.; Dewhirst, M.;

    2014-01-01

    An experiment to determine the perceptual attributes of the experience of listening to a target audio program in the presence of an audio interferer was performed. The first stage was a free elicitation task in which a total of 572 phrases were produced. In the second stage, a consensus vocabulary......, annoyance, balance and blend, and confusion. Ratings using these attributes were collected in the fourth stage, and a principal component analysis performed. This suggested two dimensions underlying the perception of an audio-on-audio interference situation: The first dimension was labeled “distraction” and...

  15. CERN automatic audio-conference service

    CERN Document Server

    Sierra Moral, R

    2010-01-01

    Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first Euro...

  16. CERN automatic audio-conference service

    CERN Multimedia

    Sierra Moral, R

    2009-01-01

    Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first Euro...

  17. Augmenting Environmental Interaction in Audio Feedback Systems

    Directory of Open Access Journals (Sweden)

    Seunghun Kim

    2016-04-01

    Full Text Available Audio feedback is defined as a positive feedback of acoustic signals where an audio input and output form a loop, and may be utilized artistically. This article presents new context-based controls over audio feedback, leading to the generation of desired sonic behaviors by enriching the influence of existing acoustic information such as room response and ambient noise. This ecological approach to audio feedback emphasizes mutual sonic interaction between signal processing and the acoustic environment. Mappings from analyses of the received signal to signal-processing parameters are designed to emphasize this specificity as an aesthetic goal. Our feedback system presents four types of mappings: approximate analyses of room reverberation to tempo-scale characteristics, ambient noise to amplitude and two different approximations of resonances to timbre. These mappings are validated computationally and evaluated experimentally in different acoustic conditions.

  18. Virtual Microphones for Multichannel Audio Resynthesis

    Science.gov (United States)

    Mouchtaris, Athanasios; Narayanan, Shrikanth S.; Kyriakakis, Chris

    2003-12-01

    Multichannel audio offers significant advantages for music reproduction, including the ability to provide better localization and envelopment, as well as reduced imaging distortion. On the other hand, multichannel audio is a demanding media type in terms of transmission requirements. Often, bandwidth limitations prohibit transmission of multiple audio channels. In such cases, an alternative is to transmit only one or two reference channels and recreate the rest of the channels at the receiving end. Here, we propose a system capable of synthesizing the required signals from a smaller set of signals recorded in a particular venue. These synthesized "virtual" microphone signals can be used to produce multichannel recordings that accurately capture the acoustics of that venue. Applications of the proposed system include transmission of multichannel audio over the current Internet infrastructure and, as an extension of the methods proposed here, remastering existing monophonic and stereophonic recordings for multichannel rendering.

  19. A Study of Audio Tape: Part II

    Science.gov (United States)

    Reen, Noel K.

    1975-01-01

    To evaluate reel audio tape, tests were performed to identify: signal-to-noise ratio, total harmonic distortion, dynamic response, frequency response, biased and virgin tape noise, dropout susceptibility and oxide coating uniformity. (SCC)

  20. Virtual Microphones for Multichannel Audio Resynthesis

    Directory of Open Access Journals (Sweden)

    Athanasios Mouchtaris

    2003-09-01

    Full Text Available Multichannel audio offers significant advantages for music reproduction, including the ability to provide better localization and envelopment, as well as reduced imaging distortion. On the other hand, multichannel audio is a demanding media type in terms of transmission requirements. Often, bandwidth limitations prohibit transmission of multiple audio channels. In such cases, an alternative is to transmit only one or two reference channels and recreate the rest of the channels at the receiving end. Here, we propose a system capable of synthesizing the required signals from a smaller set of signals recorded in a particular venue. These synthesized “virtual” microphone signals can be used to produce multichannel recordings that accurately capture the acoustics of that venue. Applications of the proposed system include transmission of multichannel audio over the current Internet infrastructure and, as an extension of the methods proposed here, remastering existing monophonic and stereophonic recordings for multichannel rendering.

  1. Audio-visual affective expression recognition

    Science.gov (United States)

    Huang, Thomas S.; Zeng, Zhihong

    2007-11-01

    Automatic affective expression recognition has attracted increasing attention from researchers in different disciplines, and it will significantly contribute to a new paradigm for human computer interaction (affect-sensitive interfaces, socially intelligent environments) and advance research in affect-related fields including psychology, psychiatry, and education. Multimodal information integration is a process that enables humans to assess affective states robustly and flexibly. In order to understand the richness and subtlety of human emotional behavior, the computer should be able to integrate information from multiple sensors. We introduce in this paper our efforts toward machine understanding of audio-visual affective behavior, based on both deliberate and spontaneous displays. Some promising methods are presented to integrate information from both audio and visual modalities. Our experiments show the advantage of audio-visual fusion in affective expression recognition over audio-only or visual-only approaches.

  2. Audio Source Separation Using a Deep Autoencoder

    OpenAIRE

    Jang, Giljin; Kim, Han-Gyu; Oh, Yung-Hwan

    2014-01-01

    This paper proposes a novel framework for unsupervised audio source separation using a deep autoencoder. The characteristics of unknown source signals mixed in the mixed input are automatically learned by properly configured autoencoders implemented by a network with many layers, and separated by clustering the coefficient vectors in the code layer. By investigating the weight vectors to the final target, representation layer, the primitive components of the audio signals in the frequency domain are o...

  3. Implementation of Audio signal by using wavelet transform

    OpenAIRE

    Chakresh kumar; Chandra Shekhar; Mrs. Ashu Soni; Bindu Thakral

    2010-01-01

    Audio coding is the technology to represent audio in digital form with as few bits as possible while maintaining the intelligibility and quality required for particular application. Interest in audio coding is motivated by the evolution to digital communications and the requirement to minimize bit rate, and hence conserve bandwidth. There is always a tradeoff between compression ratio and maintaining the delivered audio quality and intelligibility. Audio coding is widely used in application s...

  4. Optimized Audio Classification and Segmentation Algorithm by Using Ensemble Methods

    OpenAIRE

    Saadia Zahid; Fawad Hussain; Muhammad Rashid; Muhammad Haroon Yousaf; Hafiz Adnan Habib

    2015-01-01

    Audio segmentation is a basis for multimedia content analysis which is the most important and widely used application nowadays. An optimized audio classification and segmentation algorithm is presented in this paper that segments a superimposed audio stream on the basis of its content into four main audio types: pure-speech, music, environment sound, and silence. An algorithm is proposed that preserves important audio content and reduces the misclassification rate without using large amount o...

  5. Audio watermarking for live performance

    Science.gov (United States)

    Tachibana, Ryuki

    2003-06-01

    Audio watermarking has been used mainly for digitally stored content. Using real-time watermark embedding, its coverage can be extended to live broadcasts and live performances. In general, a conventional embedding algorithm receives a host signal (HS) and outputs the summation of the HS and a watermark signal (WS). However, when applied to real-time embedding, there are two problems: (1) delay of the HS, and (2) possible interruption of the broadcast. To solve these problems, we propose a watermark generation algorithm that outputs only a WS, and a system composition method in which a mixer outside the computer mixes the WS generated by the algorithm and the HS. In addition, we propose a new composition method, "sonic watermarking." In this composition method, the sound of the HS and the sound of the WS are played separately by two speakers, and the sounds are mixed in the air. Using this composition method, it would be possible to generate a watermarking sound in a concert hall so that the watermark could be detected from content recorded by audience members who have recording devices at their seats. We report on the results of experiments and discuss the merits and flaws of various real-time watermarking composition methods.

  6. The Use of Audio Media in Stenography Learning (PENGGUNAAN MEDIA AUDIO DALAM PEMBELAJARAN STENOGRAFI)

    Directory of Open Access Journals (Sweden)

    S Martono

    2011-06-01

    Full Text Available The objective of this study is to determine the effectiveness of using audio media in learning stenography typing. The research population was 30 students divided into two groups, an experimental group and a control group, each consisting of 15 students. Based on their initial scores in the stenography subject, the two groups had the same ability, but they were given different treatments: the experimental group was taught with audio media, whereas the control group was not. Data were collected using documentation and experimental techniques. The instrument was a stenography speed-typing test. The final results showed that using audio media was more effective and improved learning outcomes more than in the control group. These results are expected to encourage stenography teachers to apply audio media in teaching, and to show students that stenography is not a memorization subject but a skill that must be trained by attending lessons. Thus, people can use stenography typing to record any talk. Keywords: Learning, Audio Media, Stenography

  7. On the Use of Memory Models in Audio Features

    DEFF Research Database (Denmark)

    Jensen, Karl Kristoffer

    2011-01-01

    has a peak, and the spectral content that is increased by the new note is added to the STM. The STM is exponentially fading with time span and number of elements, and each note only belongs to the STM for a limited time. Initial experiment regarding the behavior of the STM shows promising results...

  8. Extrastriate Visual Areas Integrate Form Features over Space and Time to Construct Representations of Stationary and Rigidly Rotating Objects.

    Science.gov (United States)

    McCarthy, J Daniel; Kohler, Peter J; Tse, Peter U; Caplovitz, Gideon Paul

    2015-11-01

    When an object moves behind a bush, for example, its visible fragments are revealed at different times and locations across the visual field. Nonetheless, a whole moving object is perceived. Unlike traditional modal and amodal completion mechanisms known to support spatial form integration when all parts of a stimulus are simultaneously visible, relatively little is known about the neural substrates of the spatiotemporal form integration (STFI) processes involved in generating coherent object representations from a succession of visible fragments. We used fMRI to identify brain regions involved in two mechanisms supporting the representation of stationary and rigidly rotating objects whose form features are shown in succession: STFI and position updating. STFI allows past and present form cues to be integrated over space and time into a coherent object even when the object is not visible in any given frame. STFI can occur whether or not the object is moving. Position updating allows us to perceive a moving object, whether rigidly rotating or translating, even when its form features are revealed at different times and locations in space. Our results suggest that STFI is mediated by visual regions beyond V1 and V2. Moreover, although widespread cortical activation has been observed for other motion percepts derived solely from form-based analyses [Tse, P. U. Neural correlates of transformational apparent motion. Neuroimage, 31, 766-773, 2006; Krekelberg, B., Vatakis, A., & Kourtzi, Z. Implied motion from form in the human visual cortex. Journal of Neurophysiology, 94, 4373-4386, 2005], increased responses for the position updating that leads to rigidly rotating object representations were only observed in visual areas KO and possibly hMT+, indicating that this is a distinct and highly specialized type of processing.

  9. Lip-reading aids word recognition most in moderate noise: a Bayesian explanation using high-dimensional feature space.

    Directory of Open Access Journals (Sweden)

    Wei Ji Ma

    Full Text Available Watching a speaker's facial movements can dramatically enhance our ability to comprehend words, especially in noisy environments. From a general doctrine of combining information from different sensory modalities (the principle of inverse effectiveness), one would expect that the visual signals would be most effective at the highest levels of auditory noise. In contrast, we find, in accord with a recent paper, that visual information improves performance more at intermediate levels of auditory noise than at the highest levels, and we show that a novel visual stimulus containing only temporal information does the same. We present a Bayesian model of optimal cue integration that can explain these conflicts. In this model, words are regarded as points in a multidimensional space and word recognition is a probabilistic inference process. When the dimensionality of the feature space is low, the Bayesian model predicts inverse effectiveness; when the dimensionality is high, the enhancement is maximal at intermediate auditory noise levels. When the auditory and visual stimuli differ slightly in high noise, the model makes a counterintuitive prediction: as sound quality increases, the proportion of reported words corresponding to the visual stimulus should first increase and then decrease. We confirm this prediction in a behavioral experiment. We conclude that auditory-visual speech perception obeys the same notion of optimality previously observed only for simple multisensory stimuli.
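    For orientation, the textbook one-dimensional case of optimal cue integration can be written down explicitly. This is not the paper's high-dimensional word model; it assumes two independent Gaussian measurements of the same quantity and a flat prior, and the symbols $x_A$, $x_V$, $\sigma_A$, $\sigma_V$ (auditory and visual measurements and their noise standard deviations) are introduced here for illustration only.

```latex
\hat{s} \;=\; \frac{x_A/\sigma_A^{2} + x_V/\sigma_V^{2}}{1/\sigma_A^{2} + 1/\sigma_V^{2}},
\qquad
\frac{1}{\sigma_{AV}^{2}} \;=\; \frac{1}{\sigma_A^{2}} + \frac{1}{\sigma_V^{2}}
```

    Each cue is weighted by its reliability (inverse variance), so the combined estimate is never less reliable than the better single cue. In this simple case the relative benefit of the visual cue grows as auditory noise $\sigma_A$ increases, which is the inverse-effectiveness pattern that the abstract reports for low-dimensional feature spaces.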

  10. Could Audio-Described Films Benefit from Audio Introductions? An Audience Response Study

    Science.gov (United States)

    Romero-Fresco, Pablo; Fryer, Louise

    2013-01-01

    Introduction: Time constraints limit the quantity and type of information conveyed in audio description (AD) for films, in particular the cinematic aspects. Inspired by introductory notes for theatre AD, this study developed audio introductions (AIs) for "Slumdog Millionaire" and "Man on Wire." Each AI comprised 10 minutes of…

  11. Spatial domain entertainment audio decompression/compression

    Science.gov (United States)

    Chan, Y. K.; Tam, Ka Him K.

    2014-02-01

    The ARM7 NEON processor with 128bit SIMD hardware accelerator requires a peak performance of 13.99 Mega Cycles per Second for MP3 stereo entertainment quality decoding. For a similar compression bit rate, OGG and AAC are preferred over MP3. The Patent Cooperation Treaty Application dated 28/August/2012 describes an audio decompression scheme producing a sequence of interleaving "min to Max" and "Max to min" rising and falling segments. The number of interior audio samples bound by "min to Max" or "Max to min" can be {0|1|…|N} audio samples. The magnitudes of samples, including the bounding min and Max, are distributed as normalized constants within the 0 and 1 of the bounding magnitudes. The decompressed audio is then a "sequence of static segments" on a frame by frame basis. Some of these frames need to be post-processed to elevate high frequency. The post-processing is compression efficiency neutral and the additional decoding complexity is only a small fraction of the overall decoding complexity without the need of extra hardware. Compression efficiency can be speculated as very high as source audio had been decimated and converted to a set of data with only "segment length and corresponding segment magnitude" attributes. The PCT describes how these two attributes are efficiently coded by the PCT innovative coding scheme. The PCT decoding efficiency is obviously very high and decoding latency is basically zero. Both hardware requirement and run time are at least an order of magnitude better than MP3 variants. The side benefit is ultra low power consumption on mobile devices. The acid test on how such a simplistic waveform representation can indeed reproduce authentic decompressed quality is benchmarked versus OGG (aoTuv Beta 6.03) by three pairs of stereo audio frames and one broadcast-like voice audio frame, with each frame consisting of 2,028 samples at 44,100 Hz sampling frequency.

  12. A High-Voltage Class D Audio Amplifier for Dielectric Elastomer Transducers

    DEFF Research Database (Denmark)

    Nielsen, Dennis; Knott, Arnold; Andersen, Michael A. E.

    2014-01-01

    Dielectric Elastomer (DE) transducers have emerged as a very interesting alternative to the traditional electrodynamic transducer. Lightweight, small size and high maneuverability are some of the key features of the DE transducer. An amplifier for the DE transducer suitable for audio applications...... is proposed and analyzed. The amplifier addresses the issue of a high impedance load, ensuring a linear response over the midrange region of the audio bandwidth (100 Hz – 3.5 kHz). THD+N below 0.1% are reported for the ± 300 V prototype amplifier producing a maximum of 125 Var at a peak efficiency of 95 %....

  13. Audio Steganography Using GA with Multilevel Security

    Directory of Open Access Journals (Sweden)

    K. Bhowal

    2013-05-01

    Full Text Available In this paper we present a novel method for digital audio steganography in which messages are embedded into an image and the image is embedded into the host audio. “Audio Steganography using GA with multilevel security” is a proposed system based on steganography and encryption; the system ensures secure data transfer between the source and destination. A novel approach is presented to resolve the remaining problems of the substitution technique of audio steganography. In the first level of security, encrypted message bits are inserted into the image using the LSB algorithm. In the second level, a secured GA-based LSB (Least Significant Bit) algorithm is used to encode the image data into the audio data. Here image bits are embedded into random and higher LSB layers, resulting in increased robustness against noise addition. The robustness is increased especially against intentional attacks that try to reveal the hidden message, and also against some unintentional attacks such as noise addition. On the other hand, a multi-objective GA is used to reduce distortion
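    The basic substitution step behind LSB steganography can be sketched as follows. This is a minimal illustration only: it assumes unsigned integer audio samples and a fixed bit layer, and it leaves out the encryption, the image stage, and the GA-based layer selection described in the abstract. The function names are hypothetical.

```python
def embed_bits(samples, bits, layer=0):
    """Overwrite bit `layer` of the first len(bits) samples with the payload bits."""
    if len(bits) > len(samples):
        raise ValueError("payload is longer than the host signal")
    mask = 1 << layer
    out = list(samples)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~mask) | (bit << layer)   # clear the bit, then set it
    return out

def extract_bits(samples, count, layer=0):
    """Read back bit `layer` from the first `count` samples."""
    return [(s >> layer) & 1 for s in samples[:count]]

host = [1000, 1001, 65535, 0]          # unsigned 16-bit samples (assumed)
payload = [1, 0, 0, 1]
stego = embed_bits(host, payload)
print(stego, extract_bits(stego, len(payload)))
```

    Embedding in a higher layer (for example `layer=2`) changes each sample by a larger amount, which trades audible distortion for robustness against small additive noise.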

  14. Audio stream classification for multimedia database search

    Science.gov (United States)

    Artese, M.; Bianco, S.; Gagliardi, I.; Gasparini, F.

    2013-03-01

    Search and retrieval in huge archives of multimedia data is a challenging task. A classification step is often used to reduce the number of entries on which to perform the subsequent search. In particular, when new entries are continuously added to the database, a fast classification based on simple threshold evaluation is desirable. In this work we present a CART-based (Classification And Regression Tree [1]) classification framework for audio streams belonging to multimedia databases. The database considered is the Archive of Ethnography and Social History (AESS) [2], which is mainly composed of popular songs and other audio records describing the popular traditions handed down from generation to generation, such as traditional fairs and customs. The peculiarities of this database are that it is continuously updated; the audio recordings are acquired in unconstrained environments; and it is difficult for a non-expert human user to create the ground truth labels. In our experiments, half of all the available audio files have been randomly extracted and used as the training set. The remaining ones have been used as the test set. The classifier has been trained to distinguish among three different classes: speech, music, and song. All the audio files in the dataset have previously been manually labeled into these three classes by domain experts.

  15. Real-Time Conversion of Stereo Audio to 5.1 Channel Audio for Providing Realistic Sounds

    Directory of Open Access Journals (Sweden)

    Chan Jun Chun

    2009-12-01

    Full Text Available In this paper, we address issues associated with the real-time implementation of upmixing stereo audio into 5.1 channel audio in order to improve audio realism. First, we review four different upmixing methods, including a passive surround decoding method, a least-mean-square based upmixing method, a principal component analysis based upmixing method, and an adaptive panning method. After that, we implement a simulator that includes the upmixing methods and audio controls to play both stereo and upmixed 5.1 channel audio signals. Finally, we carry out a MUSHRA test to compare the quality of the upmixed 5.1 channel audio signals to that of the original stereo audio signal. The test shows that the upmixed 5.1 channel audio signals generated by the four different upmixing methods are preferred to the original stereo audio signals.
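    Of the four methods listed, passive surround decoding is the simplest to sketch. The example below uses one common textbook form of a passive matrix, in which the center is the scaled sum and the surround the scaled difference of the stereo channels. The 1/√2 gain, the silent LFE channel, the shared surround signal, and the channel labels are assumptions and may differ from the paper's implementation.

```python
import math

def passive_upmix(left, right):
    """Derive 5.1 channels from stereo with a fixed (passive) sum/difference matrix."""
    g = 1 / math.sqrt(2)
    center = [g * (l + r) for l, r in zip(left, right)]     # in-phase content
    surround = [g * (l - r) for l, r in zip(left, right)]   # out-of-phase content
    return {
        "FL": list(left),
        "FR": list(right),
        "C": center,
        "LFE": [0.0] * len(center),   # assumption: no bass management in this sketch
        "SL": list(surround),
        "SR": list(surround),
    }

channels = passive_upmix([0.5, 0.2, -0.1], [0.5, -0.2, 0.3])
for name, samples in channels.items():
    print(name, samples)
```

    Because the matrix is fixed, content that is identical in both stereo channels goes entirely to the center and produces no surround signal, while anti-phase content does the opposite.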

  16. Segmentation of perivascular spaces in 7T MR image using auto-context model with orientation-normalized features.

    Science.gov (United States)

    Park, Sang Hyun; Zong, Xiaopeng; Gao, Yaozong; Lin, Weili; Shen, Dinggang

    2016-07-01

    Quantitative study of perivascular spaces (PVSs) in brain magnetic resonance (MR) images is important for understanding the brain lymphatic system and its relationship with neurological diseases. One of the major challenges is the accurate extraction of PVSs that have very thin tubular structures with various directions in three-dimensional (3D) MR images. In this paper, we propose a learning-based PVS segmentation method to address this challenge. Specifically, we first determine a region of interest (ROI) by using the anatomical brain structure and the vesselness information derived from eigenvalues of image derivatives. Then, in the ROI, we extract a number of randomized Haar features which are normalized with respect to the principal directions of the underlying image derivatives. The classifier is trained by the random forest model that can effectively learn both discriminative features and classifier parameters to maximize the information gain. Finally, a sequential learning strategy is used to further enforce various contextual patterns around the thin tubular structures into the classifier. For evaluation, we apply our proposed method to the 7T brain MR images scanned from 17 healthy subjects aged from 25 to 37. The performance is measured by voxel-wise segmentation accuracy, cluster-wise classification accuracy, and similarity of geometric properties, such as volume, length, and diameter distributions between the predicted and the true PVSs. Moreover, the accuracies are also evaluated on the simulation images with motion artifacts and lacunes to demonstrate the potential of our method in segmenting PVSs from elderly and patient populations. The experimental results show that our proposed method outperforms all existing PVS segmentation methods. PMID:27046107

  17. A Perceptually Reweighted Mixed-Norm Method for Sparse Approximation of Audio Signals

    DEFF Research Database (Denmark)

    Christensen, Mads Græsbøll; Sturm, Bob L.

    2011-01-01

    In this paper, we consider the problem of finding sparse representations of audio signals for coding purposes. In doing so, it is of utmost importance that when only a subset of the present components of an audio signal are extracted, it is the perceptually most important ones. To this end, we...... propose a new iterative algorithm based on two principles: 1) a reweighted l1-norm based measure of sparsity; and 2) a reweighted l2-norm based measure of perceptual distortion. Using these measures, the considered problem is posed as a constrained convex optimization problem that can be solved optimally...... using standard software. A prominent feature of the new method is that it solves a problem that is closely related to the objective of coding, namely rate-distortion optimization. In computer simulations, we demonstrate the properties of the algorithm and its application to real audio signals....

  18. Evaluation of Perceived Spatial Audio Quality

    Directory of Open Access Journals (Sweden)

    Jan Berg

    2006-04-01

    Full Text Available The increased use of audio applications capable of conveying enhanced spatial quality puts focus on how such a quality should be evaluated. Different approaches to evaluation of perceived quality are briefly discussed and a new technique is introduced. In a series of experiments, attributes were elicited from subjects, tested and subsequently used for derivation of evaluation scales that were feasible for subjective evaluation of the spatial quality of certain multichannel stimuli. The findings of these experiments led to the development of a novel method for evaluation of spatial audio in surround sound systems. Parts of the method were subsequently implemented in the OPAQUE software prototype designed to facilitate the elicitation process. The prototype was successfully tested in a pilot experiment. The experiments show that attribute scales derived from subjects' personal constructs are functional for evaluation of perceived spatial audio quality. Finally, conclusions on the importance of spatial quality evaluation of new applications are made.

  19. Personalized Audio Systems - a Bayesian Approach

    DEFF Research Database (Denmark)

    Nielsen, Jens Brehm; Jensen, Bjørn Sand; Hansen, Toke Jansen;

    2013-01-01

    Modern audio systems are typically equipped with several user-adjustable parameters unfamiliar to most users listening to the system. To obtain the best possible setting, the user is forced into multi-parameter optimization with respect to the user's own objective and preference. To address this......, the present paper presents a general interactive framework for personalization of such audio systems. The framework builds on Bayesian Gaussian process regression in which a model of the user's objective function is updated sequentially. The parameter setting to be evaluated in a given trial is...

  20. Synchronization and comparison of Lifelog audio recordings

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch; Hansen, Lars Kai

    2008-01-01

    We investigate concurrent ‘Lifelog’ audio recordings to locate segments from the same environment. We compare two techniques earlier proposed for pattern recognition in extended audio recordings, namely cross-correlation and a fingerprinting technique. If successful, such alignment can be used...... as a preprocessing step to select and synchronize recordings before further processing. The two methods perform similarly in classification, but fingerprinting scales better with the number of recordings, while cross-correlation can offer sample resolution synchronization. We propose and investigate the benefits...... of combining the two. In particular we show that the combination allows sample resolution synchronization and scalability....
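    Sample-resolution synchronization by cross-correlation can be sketched with a brute-force lag search. This is a minimal illustration under stated assumptions: the correlation is unnormalized, the search range `max_lag` is chosen by the caller, and the function name is hypothetical. The fingerprinting technique from the abstract is not shown.

```python
def best_lag(reference, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `reference`.

    A positive result means `other` is delayed relative to `reference`.
    """
    def score(lag):
        # Unnormalized cross-correlation over the overlapping samples.
        return sum(
            reference[i] * other[i + lag]
            for i in range(len(reference))
            if 0 <= i + lag < len(other)
        )
    return max(range(-max_lag, max_lag + 1), key=score)

reference = [0, 0, 1, 2, 1, 0, 0, 0]
delayed = [0, 0, 0, 0, 1, 2, 1, 0]     # same event, two samples later
print(best_lag(reference, delayed, max_lag=4))
```

    The cost grows with the product of the recording length and the lag range, which is consistent with the abstract's remark that fingerprinting scales better across many recordings.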

  1. Frequency Hopping Method for Audio Watermarking

    Directory of Open Access Journals (Sweden)

    A. Anastasijević

    2012-11-01

    Full Text Available This paper evaluates the degradation of audio content for a perceptible removable watermark. Two different approaches to embedding the watermark in the spectral domain were investigated. The frequencies for watermark embedding are chosen according to a pseudorandom sequence, making the methods robust. Consequently, the lower-quality audio can be used for promotional purposes. For a fee, the watermark can be removed with a secret watermarking key. Objective and subjective testing was conducted in order to measure the degradation level for the watermarked music samples and to examine residual distortion for different parameters of the watermarking algorithm and different music genres.

  2. DWT-Based High Capacity Audio Watermarking

    Science.gov (United States)

    Fallahpour, Mehdi; Megías, David

    This letter suggests a novel high capacity robust audio watermarking algorithm by using the high frequency band of the wavelet decomposition, for which the human auditory system (HAS) is not very sensitive to alteration. The main idea is to divide the high frequency band into frames and then, for embedding, the wavelet samples are changed based on the average of the relevant frame. The experimental results show that the method has very high capacity (about 5.5kbps), without significant perceptual distortion (ODG in [-1, 0] and SNR about 33dB) and provides robustness against common audio signal processing such as added noise, filtering, echo and MPEG compression (MP3).

  3. Near-field Localization of Audio

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Christensen, Mads Græsbøll

    2014-01-01

    Localization of audio sources using microphone arrays has been an important research problem for more than two decades. Many traditional methods for solving the problem are based on a two-stage procedure: first, information about the audio source, such as time differences-of-arrival (TDOAs......, where the desired signal is modeled using TDOAs and GROAs, which are determined by the source location. This facilitates the derivation of one-stage, maximum likelihood methods under a white Gaussian noise assumption that is applicable in both near- and far-field scenarios. Simulations show...

  4. Nonlinear dynamic macromodeling techniques for audio systems

    Science.gov (United States)

    Ogrodzki, Jan; Bieńkowski, Piotr

    2015-09-01

    This paper develops a modelling method and a model identification technique for nonlinear dynamic audio systems. Identification is performed by means of a behavioral approach based on a polynomial approximation. This approach makes use of the Discrete Fourier Transform and the Harmonic Balance Method. A model of an audio system is first created and identified and then it is simulated in real time using an algorithm of low computational complexity. The algorithm consists in real-time emulation of the system response rather than in simulation of the system itself. The proposed software is written in the Python language using object-oriented programming techniques. The code is optimized for a multithreaded environment.

  5. Calibration of an audio frequency noise generator

    DEFF Research Database (Denmark)

    Diamond, Joseph M.

    1966-01-01

    A noise generator of known output is very convenient in noise measurement. At low audio frequencies, however, all devices, including noise sources, may be affected by excess noise (1/f noise). It is therefore very desirable to be able to check the spectral density of a noise source before it is...... a noise bandwidth Bn = π/2 × (3dB bandwidth). To apply this method to low audio frequencies, the noise bandwidth of the low Q parallel resonant circuit has been found, including the effects of both series and parallel damping. The method has been used to calibrate a General Radio 1390-B noise...
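
    The relation Bn = π/2 × (3 dB bandwidth) quoted above can be checked numerically. The sketch below does so for a single-pole low-pass response, which is an assumption chosen because its integral is easy to verify; it is not the low-Q resonant circuit with series and parallel damping treated in the paper. The function names are hypothetical.

```python
import math

def noise_bandwidth(bw_3db_hz):
    """Equivalent noise bandwidth from the relation Bn = (pi / 2) * B_3dB."""
    return math.pi / 2.0 * bw_3db_hz

def integrated_noise_bandwidth(fc_hz, upper_multiple=2000, steps_per_fc=50):
    """Midpoint-rule integral of |H(f)|^2 = 1 / (1 + (f / fc)^2) from 0 to upper_multiple * fc.

    The peak power gain is 1 (at f = 0), so the integral itself is the
    equivalent noise bandwidth of this assumed single-pole response.
    """
    df = fc_hz / steps_per_fc
    total = 0.0
    for k in range(upper_multiple * steps_per_fc):
        f = (k + 0.5) * df
        total += df / (1.0 + (f / fc_hz) ** 2)
    return total
```

    For a 100 Hz 3 dB bandwidth the closed form gives roughly 157 Hz; the numerical integral falls slightly short of that only because it is truncated at a finite upper frequency.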

  6. Information Security using Audio Steganography -A Survey

    Directory of Open Access Journals (Sweden)

    B. Santhi

    2012-07-01

    Full Text Available The most important application of the internet is data transmission. Unfortunately, this is less secure because of advanced hacking technologies. So, for secure data transmission we make use of steganography. This is the art of hiding information such that the existence of the data is unknown. Any medium like music, video, text, speech, etc. can be used. In this study, the selected medium is audio. This study discusses the existing audio steganographic techniques along with their advantages and limitations. An algorithm implementing parity and LSB methods is also proposed. This mitigates the limitations of the existing methods discussed, thus increasing security and reducing computational load and code complexity.
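
    For readers unfamiliar with the LSB method mentioned above, the following is a minimal sketch of plain least-significant-bit embedding in integer PCM samples. It does not reproduce the combined parity and LSB algorithm proposed in the study, whose details are not given here; the function names and the MSB-first bit order are assumptions made for illustration.

```python
def embed_lsb(samples, payload):
    """Hide `payload` (bytes) in the least significant bits of integer samples."""
    # Unpack each payload byte into bits, most significant bit first.
    bits = [(byte >> i) & 1 for byte in payload for i in range(7, -1, -1)]
    if len(bits) > len(samples):
        raise ValueError("payload too large for cover audio")
    stego = list(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego

def extract_lsb(samples, n_bytes):
    """Recover `n_bytes` bytes from the least significant bits of the samples."""
    out = bytearray()
    for b in range(n_bytes):
        byte = 0
        for s in samples[8 * b: 8 * b + 8]:
            byte = (byte << 1) | (s & 1)
        out.append(byte)
    return bytes(out)
```

    Each sample changes by at most one quantization step, which is why plain LSB embedding is hard to hear but also easy to destroy or detect; schemes such as the one surveyed add further steps to address this.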

  7. Music information retrieval in compressed audio files: a survey

    Science.gov (United States)

    Zampoglou, Markos; Malamos, Athanasios G.

    2014-07-01

    In this paper, we present an organized survey of the existing literature on music information retrieval systems in which descriptor features are extracted directly from the compressed audio files, without prior decompression to pulse-code modulation format. Avoiding the decompression step and utilizing the readily available compressed-domain information can significantly lighten the computational cost of a music information retrieval system, allowing application to large-scale music databases. We identify a number of systems relying on compressed-domain information and form a systematic classification of the features they extract, the retrieval tasks they tackle and the degree to which they achieve an actual increase in the overall speed, as well as any resulting loss in accuracy. Finally, we discuss recent developments in the field, and the potential research directions they open toward ultra-fast, scalable systems.

  8. Audio wiring guide how to wire the most popular audio and video connectors

    CERN Document Server

    Hechtman, John

    2012-01-01

    Whether you're a pro or an amateur, a musician or into multimedia, you can't afford to guess about audio wiring. The Audio Wiring Guide is a comprehensive, easy-to-use guide that explains exactly what you need to know. No matter the size of your wiring project or installation, this handy tool provides you with the essential information you need and the techniques to use it. Using The Audio Wiring Guide is like having an expert at your side. By following the clear, step-by-step directions, you can do professional-level work at a fraction of the cost.

  9. Extracting meaning from audio signals - a machine learning approach

    DEFF Research Database (Denmark)

    Larsen, Jan

    2007-01-01

    * Machine learning framework for sound search * Genre classification * Music and audio separation * Wind noise suppression...

  10. Cross-modal retrieval of scripted speech audio

    Science.gov (United States)

    Owen, Charles B.; Makedon, Fillia

    1997-12-01

    This paper describes an approach to the problem of searching speech-based digital audio using cross-modal information retrieval. Audio containing speech (speech-based audio) is difficult to search. Open vocabulary speech recognition is advancing rapidly, but cannot yield high accuracy in either search or transcription modalities. However, text can be searched quickly and efficiently with high accuracy. Script-light digital audio is audio that has an available transcription. This is a surprisingly large class of content including legal testimony, broadcasting, dramatic productions and political meetings and speeches. An automatic mechanism for deriving the synchronization between the transcription and the audio allows for very accurate retrieval of segments of that audio. The mechanism described in this paper is based on building a transcription graph from the text and computing biphone probabilities for the audio. A modified beam search algorithm is presented to compute the alignment.

  11. SONICMAPS: CONNECTING THE RITUAL OF THE CONCERT HALL WITH A LOCATIVE AUDIO URBAN EXPERIENCE

    OpenAIRE

    Pecino, Ignacio / Climent, Ricardo

    2013-01-01

    Physical space is often used as a means for sound organization. Furthermore, our environment can be sonically augmented, highlighting hidden aspects of it or proposing a whole new reinterpretation. On this basis, we present SonicMaps, a new locative audio tool and complete solution for sound geolocation. Using a number of sensors on mobile devices, the SonicMaps application virtually places sounds into real space providing panning and amplitude information. These sounds are played back as we ...

  12. Assessment of spatial audio quality based on sound attributes

    OpenAIRE

    LE BAGOUSSE, Sarah; Paquier, Mathieu; Colomes, Catherine

    2012-01-01

    Spatial audio technologies have become very important in audio broadcast services. However, there is a lack of methods for evaluating spatial audio quality. Standards do not take into account the spatial dimension of sound, and assessments are limited to the overall quality, particularly in the context of audio coding. Through different elicitation methods, a long list of attributes has been established to characterize sound but it is difficult to include them in a listening test. ...

  13. A high performance switching audio amplifier using sliding mode control

    OpenAIRE

    Pillonnet, Gael; Cellier, Rémy; Abouchi, Nacer; Chiollaz, Monique

    2008-01-01

    Switching audio amplifiers are widely used in various portable and consumer electronics due to their high efficiency, but suffer from low audio performance due to inherent nonlinearity. This paper presents an integrated class D audio amplifier with low consumption and high audio performance. It includes a power stage and an efficient control based on the sliding mode technique. This monolithic class D amplifier is capable of delivering up to 1W into 8Ω load at les...

  14. Debugging of Class-D Audio Power Amplifiers

    DEFF Research Database (Denmark)

    Crone, Lasse; Pedersen, Jeppe Arnsdorf; Mønster, Jakob Døllner;

    2012-01-01

    Determining and optimizing the performance of a Class-D audio power amplifier can be very difficult without knowledge of the use of audio performance measuring equipment and of how the various noise and distortion sources influence the audio performance. This paper gives an introduction on how to measu...

  15. Progressive Syntax-Rich Coding of Multichannel Audio Sources

    Directory of Open Access Journals (Sweden)

    Dai Yang

    2003-09-01

    Full Text Available Being able to transmit the audio bitstream progressively is a highly desirable property for network transmission. MPEG-4 version 2 audio supports fine grain bit rate scalability in the generic audio coder (GAC). It has a bit-sliced arithmetic coding (BSAC) tool, which provides scalability in steps of 1 Kbps per audio channel. There are also several other scalable audio coding methods, which have been proposed in recent years. However, these scalable audio tools are only available for mono and stereo audio material. Little work has been done on progressive coding of multichannel audio sources. MPEG advanced audio coding (AAC) is one of the most distinguished multichannel digital audio compression systems. Based on AAC, we develop in this work a progressive syntax-rich multichannel audio codec (PSMAC). It not only supports fine grain bit rate scalability for the multichannel audio bitstream but also provides several other desirable functionalities. A formal subjective listening test shows that the proposed algorithm achieves an excellent performance at several different bit rates when compared with MPEG AAC.

  16. 47 CFR 10.520 - Common audio attention signal.

    Science.gov (United States)

    2010-10-01

    ... 47 Telecommunication 1 2010-10-01 2010-10-01 false Common audio attention signal. 10.520 Section... Equipment Requirements § 10.520 Common audio attention signal. A Participating CMS Provider and equipment manufacturers may only market devices for public use under part 10 that include an audio attention signal...

  17. Switching-mode Audio Power Amplifiers with Direct Energy Conversion

    DEFF Research Database (Denmark)

    Ljusev, Petar; Andersen, Michael Andreas E.

    2005-01-01

    This paper presents a new class of switching-mode audio power amplifiers, which are capable of direct energy conversion from the AC mains to the audio output. They represent an ultimate integration of a switching-mode power supply and a Class D audio power amplifier, where the intermediate DC bus...

  18. 47 CFR 73.403 - Digital audio broadcasting service requirements.

    Science.gov (United States)

    2010-10-01

    ... 47 Telecommunication 4 2010-10-01 2010-10-01 false Digital audio broadcasting service requirements... SERVICES RADIO BROADCAST SERVICES Digital Audio Broadcasting § 73.403 Digital audio broadcasting service requirements. (a) Broadcast radio stations using IBOC must transmit at least one over-the-air digital...

  19. Providing Students with Formative Audio Feedback

    Science.gov (United States)

    Brearley, Francis Q.; Cullen, W. Rod

    2012-01-01

    The provision of timely and constructive feedback is increasingly challenging for busy academics. Ensuring effective student engagement with feedback is equally difficult. Increasingly, studies have explored provision of audio recorded feedback to enhance effectiveness and engagement with feedback. Few, if any, of these focus on purely formative…

  20. An ESL Audio-Script Writing Workshop

    Science.gov (United States)

    Miller, Carla

    2012-01-01

    The roles of dialogue, collaborative writing, and authentic communication have been explored as effective strategies in second language writing classrooms. In this article, the stages of an innovative, multi-skill writing method, which embeds students' personal voices into the writing process, are explored. A 10-step ESL Audio Script Writing Model…

  1. Spatial audio quality perception (part 2)

    DEFF Research Database (Denmark)

    Conetta, R.; Brookes, T.; Rumsey, F.;

    2015-01-01

    location, envelopment, coverage angle, ensemble width, and spaciousness. They can also impact timbre, and changes to timbre can then influence spatial perception. Previously obtained data was used to build a regression model of perceived spatial audio quality in terms of spatial and timbral metrics...

  2. All About Audio Equalization: Solutions and Frontiers

    Directory of Open Access Journals (Sweden)

    Vesa Välimäki

    2016-05-01

    Full Text Available Audio equalization is a vast and active research area. The extent of research means that one often cannot identify the preferred technique for a particular problem. This review paper bridges those gaps, systematically providing a deep understanding of the problems and approaches in audio equalization, their relative merits and applications. Digital signal processing techniques for modifying the spectral balance in audio signals and applications of these techniques are reviewed, ranging from classic equalizers to emerging designs based on new advances in signal processing and machine learning. Emphasis is placed on putting the range of approaches within a common mathematical and conceptual framework. The application areas discussed herein are diverse, and include well-defined, solvable problems of filter design subject to constraints, as well as newly emerging challenges that touch on problems in semantics, perception and human computer interaction. Case studies are given in order to illustrate key concepts and how they are applied in practice. We also recommend preferred signal processing approaches for important audio equalization problems. Finally, we discuss current challenges and the uncharted frontiers in this field. The source code for methods discussed in this paper is made available at https://code.soundsoftware.ac.uk/projects/allaboutaudioeq.
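
    As one concrete instance of a classic equalizer section, the sketch below computes second-order peaking-filter coefficients using the widely circulated Audio EQ Cookbook formulas by Robert Bristow-Johnson. It is offered only as an illustration of modifying spectral balance with a digital filter; it is not taken from the review or its published source code, and the function names are hypothetical.

```python
import cmath
import math

def peaking_eq_coefficients(f0_hz, gain_db, q, fs_hz):
    """Return normalized biquad coefficients (b, a) for a peaking EQ section."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * a_lin, -2.0 * math.cos(w0), 1.0 - alpha * a_lin]
    a = [1.0 + alpha / a_lin, -2.0 * math.cos(w0), 1.0 - alpha / a_lin]
    # Normalize so that a[0] == 1, the usual convention for difference equations.
    return [x / a[0] for x in b], [x / a[0] for x in a]

def gain_db_at(b, a, f_hz, fs_hz):
    """Magnitude response in dB of the biquad at frequency f_hz."""
    z_inv = cmath.exp(-1j * 2.0 * math.pi * f_hz / fs_hz)
    num = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    den = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return 20.0 * math.log10(abs(num / den))
```

    With these formulas the response equals the requested gain at the centre frequency and returns to 0 dB at DC, so a chain of such sections can shape separate bands largely independently.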

  3. Audible Aliasing Distortion in Digital Audio Synthesis

    Directory of Open Access Journals (Sweden)

    J. Schimmel

    2012-04-01

    Full Text Available This paper deals with aliasing distortion in digital audio signal synthesis of classic periodic waveforms with infinite Fourier series, for electronic musical instruments. When these waveforms are generated in the digital domain, aliasing appears due to their unlimited bandwidth. There are several techniques for the synthesis of these signals that have been designed to avoid or reduce the aliasing distortion. However, these techniques have high computing demands. One can say that today's computers have enough computing power to use these methods. However, we have to realize that today’s computer-aided music production requires tens of multi-timbre voices generated simultaneously by software synthesizers, and most of the computing power must be reserved for the hard-disc recording subsystem and real-time audio processing of many audio channels with a lot of audio effects. Trivially generated classic analog synthesizer waveforms are therefore still effective for sound synthesis. We cannot avoid the aliasing distortion, but spectral components produced by the aliasing can be masked with harmonic components and thus made inaudible if a sufficient oversampling ratio is used. This paper deals with the assessment of audible aliasing distortion with the help of a psychoacoustic model of simultaneous masking and compares the computing demands of trivial generation using oversampling with those of other methods.
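
    To make the aliasing mechanism concrete, the sketch below lists which harmonics of a trivially generated waveform exceed the Nyquist frequency and where they fold back to. It only covers the frequency-folding arithmetic; the psychoacoustic masking model used in the paper is not reproduced, and the function names are hypothetical.

```python
def alias_frequency(f_hz, fs_hz):
    """Frequency at which a component at f_hz appears after sampling at fs_hz."""
    f = f_hz % fs_hz
    # Components in the upper half of the band fold back around Nyquist.
    return fs_hz - f if f > fs_hz / 2 else f

def aliased_harmonics(f0_hz, fs_hz, n_harmonics):
    """Return (harmonic number, true frequency, aliased frequency) for harmonics above Nyquist."""
    result = []
    for k in range(1, n_harmonics + 1):
        f = k * f0_hz
        if f > fs_hz / 2:
            result.append((k, f, alias_frequency(f, fs_hz)))
    return result
```

    For a 5 kHz fundamental sampled at 44.1 kHz, the fifth and sixth harmonics fold back to 19.1 kHz and 14.1 kHz, which are not multiples of the fundamental. Doubling the sample rate to 88.2 kHz keeps the first six harmonics below Nyquist, which is the effect oversampling relies on.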

  4. Structuring Broadcast Audio for Information Access

    Science.gov (United States)

    Gauvain, Jean-Luc; Lamel, Lori

    2003-12-01

    One rapidly expanding application area for state-of-the-art speech recognition technology is the automatic processing of broadcast audiovisual data for information access. Since much of the linguistic information is found in the audio channel, speech recognition is a key enabling technology which, when combined with information retrieval techniques, can be used for searching large audiovisual document collections. Audio indexing must take into account the specificities of audio data such as needing to deal with the continuous data stream and an imperfect word transcription. Other important considerations are dealing with language specificities and facilitating language portability. At Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI), broadcast news transcription systems have been developed for seven languages: English, French, German, Mandarin, Portuguese, Spanish, and Arabic. The transcription systems have been integrated into prototype demonstrators for several application areas such as audio data mining, structuring audiovisual archives, selective dissemination of information, and topic tracking for media monitoring. As examples, this paper addresses the spoken document retrieval and topic tracking tasks.

  5. Relevant Research on Audio-Tutorial Methods

    Science.gov (United States)

    Novak, Joseph D.

    1970-01-01

    Reviews two aspects of research related to audio-tutorial instructional methods. First, the learning theory of David P. Ausubel is summarized and applied to instructional procedures. Secondly, learning time for attainment of concept and knowledge levels is discussed. Concludes that studies are needed on designs based on Ausubel's theory,…

  6. Audio-visual integration in schizophrenia

    NARCIS (Netherlands)

    Gelder, B.LM.F. de; Vroomen, J.; Annen, L.; Masthoff, E.D.M.; Hodiamont, P.P.G.

    2003-01-01

    Integration of information provided simultaneously by audition and vision was studied in a group of 18 schizophrenic patients. They were compared to a control group, consisting of 12 normal adults of comparable age and education. By administering two tasks, each focusing on one aspect of audio-visua

  7. Audio-visual integration in schizophrenia.

    NARCIS (Netherlands)

    Gelder, B. de; Vroomen, J.; Annen, L.; Masthof, E.; Hodiamont, P.P.G.

    2003-01-01

    Integration of information provided simultaneously by audition and vision was studied in a group of 18 schizophrenic patients. They were compared to a control group, consisting of 12 normal adults of comparable age and education. By administering two tasks, each focusing on one aspect of audio-visua

  8. Anthropomorphic Coding of Speech and Audio: A Model Inversion Approach

    Directory of Open Access Journals (Sweden)

    W. Bastiaan Kleijn

    2005-06-01

    Full Text Available Auditory modeling is a well-established methodology that provides insight into human perception and that facilitates the extraction of signal features that are most relevant to the listener. The aim of this paper is to provide a tutorial on perceptual speech and audio coding using an invertible auditory model. In this approach, the audio signal is converted into an auditory representation using an invertible auditory model. The auditory representation is quantized and coded. Upon decoding, it is then transformed back into the acoustic domain. This transformation converts a complex distortion criterion into a simple one, thus facilitating quantization with low complexity. We briefly review past work on auditory models and describe in more detail the components of our invertible model and its inversion procedure, that is, the method to reconstruct the signal from the output of the auditory model. We summarize attempts to use the auditory representation for low-bit-rate coding. Our approach also allows the exploitation of the inherent redundancy of the human auditory system for the purpose of multiple description (joint source-channel) coding.

  9. Anthropomorphic Coding of Speech and Audio: A Model Inversion Approach

    Science.gov (United States)

    Feldbauer, Christian; Kubin, Gernot; Kleijn, W. Bastiaan

    2005-12-01

    Auditory modeling is a well-established methodology that provides insight into human perception and that facilitates the extraction of signal features that are most relevant to the listener. The aim of this paper is to provide a tutorial on perceptual speech and audio coding using an invertible auditory model. In this approach, the audio signal is converted into an auditory representation using an invertible auditory model. The auditory representation is quantized and coded. Upon decoding, it is then transformed back into the acoustic domain. This transformation converts a complex distortion criterion into a simple one, thus facilitating quantization with low complexity. We briefly review past work on auditory models and describe in more detail the components of our invertible model and its inversion procedure, that is, the method to reconstruct the signal from the output of the auditory model. We summarize attempts to use the auditory representation for low-bit-rate coding. Our approach also allows the exploitation of the inherent redundancy of the human auditory system for the purpose of multiple description (joint source-channel) coding.

  10. The KUSC Classical Music Dataset for Audio Key Finding

    Directory of Open Access Journals (Sweden)

    Ching-Hua Chuan

    2014-08-01

    Full Text Available In this paper, we present a benchmark dataset based on the KUSC classical music collection and provide baseline key-finding comparison results. Audio key finding is a basic music information retrieval task; it forms an essential component of systems for music segmentation, similarity assessment, and mood detection. Due to copyright restrictions and a labor-intensive annotation process, audio key finding algorithms have only been evaluated using small proprietary datasets to date. To create a common base for systematic comparisons, we have constructed a dataset comprising more than 3,000 excerpts of classical music. The excerpts are made publicly accessible via commonly used acoustic features such as pitch-based spectrograms and chromagrams. We introduce a hybrid annotation scheme that combines the use of title keys with expert validation and correction of only the challenging cases. The expert musicians also provide ratings of key recognition difficulty. Other meta-data include instrumentation. As a demonstration of the use of the dataset, and to provide initial benchmark comparisons for evaluating new algorithms, we conduct a series of experiments reporting the key determination accuracy of four state-of-the-art algorithms. We further show the importance of considering factors such as estimated tuning frequency, key strength or confidence value, and key recognition difficulty in key finding. In the future, we plan to expand the dataset to include meta-data for other music information retrieval tasks.
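
    A common baseline for key finding from chromagram-style features is template correlation against key profiles. The sketch below uses the Krumhansl-Kessler major and minor profiles and a 12-bin chroma vector; it is an illustrative assumption and is not claimed to be one of the four algorithms evaluated in the paper. The function names and the sharp-only pitch names are choices made for this example.

```python
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0.0 or vy == 0.0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def estimate_key(chroma):
    """Return the key label whose rotated profile correlates best with `chroma`.

    `chroma` holds 12 energies indexed by pitch class, with index 0 = C.
    """
    best_r, best_label = float("-inf"), None
    for tonic in range(12):
        for mode, profile in (("major", MAJOR), ("minor", MINOR)):
            # Rotate the profile so that its first entry lands on the tonic.
            rotated = [profile[(pc - tonic) % 12] for pc in range(12)]
            r = pearson(chroma, rotated)
            if r > best_r:
                best_r, best_label = r, f"{NAMES[tonic]} {mode}"
    return best_label
```

    The correlation value of the winning key can also serve as a rough key-strength or confidence measure, one of the factors the abstract highlights.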

  11. Survey Musik und Medien 2012: Audio Media Usage in Germany - Audio Repertoires by birth cohorts

    OpenAIRE

    Lepa, Steffen

    2013-01-01

    Audio Repertoires are widespread patterns regarding the use of audio technologies in everyday life which may also be interpreted as “user types”. They were identified in Survey Musik und Medien 2012 based on the representative Audio Usage Data collected nationwide. Nowadays, people listen to music by means of various different devices, infrastructures and technologies. Furthermore, people often tend to combine those options within their daily routines. Therefore, it is reasonable to analyz...

  12. Audio Oracle: A New Algorithm for Fast Learning of Audio Structures

    OpenAIRE

    Dubnov, Shlomo; Assayag, Gerard; Cont, Arshia

    2007-01-01

    In this paper we present a new method for indexing of audio data in terms of repeating sub-clips of variable length that we call audio factors. The new structure allows fast retrieval and recombination of sub-clips in a manner that assures continuity between splice points. The resulting structure accomplishes effectively a new method for texture synthesis, where the amount of innovation is controlled by one of the synthesis parameters. In the paper we present the new...

  13. ANALYSIS OF MULTIMODAL FUSION TECHNIQUES FOR AUDIO-VISUAL SPEECH RECOGNITION

    Directory of Open Access Journals (Sweden)

    D.V. Ivanko

    2016-05-01

    Full Text Available The paper presents an analytical review covering the latest achievements in the field of audio-visual (AV) fusion (integration) of multimodal information. We discuss the main challenges and report on approaches to address them. One of the most important tasks of AV integration is to understand how the modalities interact and influence each other. The paper addresses this problem in the context of AV speech processing and speech recognition. In the first part of the review we set out the basic principles of AV speech recognition and give a classification of the audio and visual features of speech. Special attention is paid to the systematization of existing techniques and AV data fusion methods. In the second part we provide a consolidated list of tasks and applications that use AV fusion, based on our analysis of the research area. We also indicate the methods, techniques, and audio and video features used. We propose a classification of AV integration and discuss the advantages and disadvantages of different approaches. We draw conclusions and offer our assessment of the future of the field of AV fusion. In further research we plan to implement a system for audio-visual recognition of continuous Russian speech using advanced methods of multimodal fusion.

  14. Semantic Indexing of Multimedia Content Using Visual, Audio, and Text Cues

    Directory of Open Access Journals (Sweden)

    W. H. Adams

    2003-02-01

    Full Text Available We present a learning-based approach to the semantic indexing of multimedia content using cues derived from audio, visual, and text features. We approach the problem by developing a set of statistical models for a predefined lexicon. Novel concepts are then mapped in terms of the concepts in the lexicon. To achieve robust detection of concepts, we exploit features from multiple modalities, namely, audio, video, and text. Concept representations are modeled using Gaussian mixture models (GMM), hidden Markov models (HMM), and support vector machines (SVM). Models such as Bayesian networks and SVMs are used in a late-fusion approach to model concepts that are not explicitly modeled in terms of features. Our experiments indicate promise in the proposed classification and fusion methodologies: our proposed fusion scheme achieves more than 10% relative improvement over the best unimodal concept detector.

  15. Label Propagation Using Sparse Approximated Nearest Feature Space Embedding

    Institute of Scientific and Technical Information of China (English)

    陶剑文; 王士同; 姚奇富

    2014-01-01

    Existing graph-based semi-supervised learning (GSSL) methods suffer from problems such as sensitivity to model parameters and insufficient discriminative information in the data space. To address these issues, this paper proposes a sparse approximated nearest feature space embedding label propagation (SANFSP) algorithm, inspired by the ideas of nearest feature space embedding and sparse representation. SANFSP first sparsely reconstructs data from the original space using its feature space embedding projection images, then measures the similarity between the original data and its sparse approximated nearest feature space embedding projection points, and from this derives a sparse approximated nearest feature space embedding regularizer. Finally, SANFSP completes the label propagation procedure using the classical label propagation algorithm. The study also derives an easy way to extend SANFSP to out-of-sample data. Promising experimental results are obtained on several toy and real-world classification tasks such as face recognition, visual object recognition and digit classification.

  16. Automatic Speech Segmentation Based On Audio and Optical Flow Visual Classification

    Directory of Open Access Journals (Sweden)

    Behnam Torabi

    2014-10-01

    Full Text Available Automatic speech segmentation, an important part of automatic speech recognition (ASR) systems, is highly noise dependent. Noise is caused by changes in the communication channel, background, level of speaking, etc. In recent years, many researchers have proposed noise cancellation techniques and have added visual features from the speaker’s face to reduce the effect of noise on ASR systems. Removing noise from audio signals depends on the type of noise, so it cannot be used as a general solution. Adding visual features mitigates this shortcoming, but advanced methods of this type need manual extraction of visual features. In this paper we propose a completely automatic system which uses optical flow vectors from the speaker’s image sequence to obtain visual features. Then, Hidden Markov Models are trained to segment audio signals from image sequences and audio features based on the extracted optical flow. The resulting segmentation system is fully automatic and more robust to noise.

  17. A Unified Approach to Real Time Audio-to-Score and Audio-to-Audio Alignment Using Sequential Montecarlo Inference Techniques

    OpenAIRE

    Montecchio, Nicola; Cont, Arshia

    2011-01-01

    We present a methodology for the real time alignment of music signals using sequential Montecarlo inference techniques. The alignment problem is formulated as the state tracking of a dynamical system, and differs from traditional Hidden Markov Model - Dynamic Time Warping based systems in that the hidden state is continuous rather than discrete. The major contribution of this paper is addressing both problems of audio-to-score and audio-to-audio alignment within the ...

  18. Survey Musik und Medien 2012: Audio Media Usage in Germany - Audio Sources - Audio Sources used in 2012

    OpenAIRE

    Lepa, Steffen

    2013-01-01

    Where did everyday music come from in 2012? Audio Sources describe those distribution channels by means of which music is purchased, archived and made accessible. This includes physical recordings (CD, LP, MC etc.), electronic services in terms of downloading and streaming of digital music (iTunes, last.fm, Spotify etc.) as well as traditional radio reception and last but not least musical content on websites or digital storage media. How do the Germans listen to music nowadays? Survey Mus...

  19. Survey Musik und Medien 2012: Audio Media Usage in Germany - Audio Sources - Audio Sources used in 2012 - Selective Adopters

    OpenAIRE

    Lepa, Steffen

    2013-01-01

    Where did everyday music come from in 2012? Audio Sources describe those distribution channels by means of which music is purchased, archived and made accessible. This includes physical recordings (CD, LP, MC etc.), electronic services in terms of downloading and streaming of digital music (iTunes, last.fm, Spotify etc.) as well as traditional radio reception and last but not least musical content on websites or digital storage media. Typically born between 1963 and 1980, the Selective ad...

  20. Survey Musik und Medien 2012: Audio Media Usage in Germany - Audio Sources - Audio Sources used in 2012 - Digital Mobilists

    OpenAIRE

    Lepa, Steffen

    2013-01-01

    Where did everyday music come from in 2012? Audio Sources describe those distribution channels by means of which music is purchased, archived and made accessible. This includes physical recordings (CD, LP, MC etc.), electronic services in terms of downloading and streaming of digital music (iTunes, last.fm, Spotify etc.) as well as traditional radio reception and last but not least musical content on websites or digital storage media. Typically born between 1979 and 1998, the Digital Mobi...

  1. Features of motivation of the crewmembers in an enclosed space at atmospheric pressure changes during breathing inert gases.

    Science.gov (United States)

    Komarevcev, Sergey

    Since the 1960s, our psychologists have been experimenting with small groups in isolation. This work was associated with the beginning of spaceflight and the need to study human behavior in conditions different from man's natural habitat. Those who study human behavior in isolation know that it differs markedly from behavior in natural situations, which is associated with the development of new, more adaptive behaviors (1). What are the differences? First of all, isolation is achieved by keeping the group in a closed space. Experiments show that basic personality traits of the crew members, such as motivation, change. Statement of the problem and methods. In our experiment we were interested in changes in the features of human motivation (strength, stability and direction of motivation) in a closed group under modified atmospheric pressure and while breathing inert gases. We were also interested in the external and internal motivation of the individual in these circumstances. To conduct the experiment, we used the experimental barocomplex GVK-250, which housed a group of six men. The task was to spend fifteen days in isolation in the barocomplex breathing an oxygen-xenon mixture, fifteen days in isolation in the same complex breathing an oxygen-helium mixture, and fifteen days in isolation in the same complex breathing normal air. Throughout, the subjects were isolated under conditions of changed atmospheric pressure close to those normally experienced by divers. We assumed that breathing inert mixtures can change the strength and stability, and with them the direction and stability, of motivation. To check our results, we planned to use a battery of psychological techniques: 1. the Schwartz technique, which measures personal values and behavior in society, the DORS procedure (measurement of fatigue, monotony, satiety and stress) and riffs that give the test once a week. Our assumption is

  2. Three-dimensional audio using loudspeakers

    Science.gov (United States)

    Gardner, William G.

    1997-12-01

    3-D audio systems, which can surround a listener with sounds at arbitrary locations, are an important part of immersive interfaces. A new approach is presented for implementing 3-D audio using a pair of conventional loudspeakers. The new idea is to use the tracked position of the listener's head to optimize the acoustical presentation, and thus produce a much more realistic illusion over a larger listening area than existing loudspeaker 3-D audio systems. By using a remote head tracker, for instance based on computer vision, an immersive audio environment can be created without donning headphones or other equipment. The general approach to a 3-D audio system is to reconstruct the acoustic pressures at the listener's ears that would result from the natural listening situation to be simulated. To accomplish this using loudspeakers requires that first, the ear signals corresponding to the target scene are synthesized by appropriately encoding directional cues, a process known as 'binaural synthesis,' and second, these signals are delivered to the listener by inverting the transmission paths that exist from the speakers to the listener, a process known as 'crosstalk cancellation.' Existing crosstalk cancellation systems only function at a fixed listening location; when the listener moves away from the equalization zone, the 3-D illusion is lost. Steering the equalization zone to the tracked listener preserves the 3-D illusion over a large listening volume, thus simulating a reconstructed soundfield, and also provides dynamic localization cues by maintaining stationary external sound sources during head motion. This dissertation will discuss the theory, implementation, and testing of a head-tracked loudspeaker 3-D audio system. Crosstalk cancellers that can be steered to the location of a tracked listener will be described. The objective performance of these systems has been evaluated using simulations and acoustical measurements made at the ears of human subjects. 
Many

  3. Using Audio-Derived Affective Offset to Enhance TV Recommendation

    DEFF Research Database (Denmark)

    Shepstone, Sven Ewan; Tan, Zheng-Hua; Jensen, Søren Holdt

    2014-01-01

    First a user's mood profile is determined using 12-class audio-based emotion classifications. An initial TV content item is then displayed to the user based on the extracted mood profile. The user has the option to either accept the recommendation, or to critique the item once or several times, by...... navigating the emotion space to request an alternative match. The final match is then compared to the initial match, in terms of the difference in the items' affective parameterization. This offset is then utilized in future recommendation sessions. The system was evaluated by eliciting three different...... moods in 22 separate users and examining the influence of applying affective offset to the users' sessions. Results show that, in the case when affective offset was applied, better user satisfaction was achieved: the average ratings went from 7.80 up to 8.65, with an average decrease in the number of...

  4. Stream/Bounce Event Perception Reveals a Temporal Limit of Motion Correspondence Based on Surface Feature over Space and Time

    Directory of Open Access Journals (Sweden)

    Yousuke Kawachi

    2011-06-01

    We examined how stream/bounce event perception is affected by motion correspondence based on the surface features of moving objects passing behind an occlusion. In the stream/bounce display two identical objects moving across each other in a two-dimensional display can be perceived as either streaming through or bouncing off each other at coincidence. Here, surface features such as colour (Experiments 1 and 2) or luminance (Experiment 3) were switched between the two objects at coincidence. The moment of coincidence was invisible to observers due to an occluder. Additionally, the presentation of the moving objects was manipulated in duration after the feature switch at coincidence. The results revealed that a postcoincidence duration of approximately 200 ms was required for the visual system to stabilize judgments of stream/bounce events by determining motion correspondence between the objects across the occlusion on the basis of the surface feature. The critical duration was similar across motion speeds of objects and types of surface features. Moreover, controls (Experiments 4a–4c) showed that cognitive bias based on feature (colour/luminance) congruency across the occlusion could not fully account for the effects of surface features on the stream/bounce judgments. We discuss the roles of motion correspondence, visual feature processing, and attentive tracking in the stream/bounce judgments.

  5. Multi-semantic audio classification method based on tensor neural network

    Institute of Scientific and Technical Information of China (English)

    邢玲; 贺梅; 马强; 朱敏

    2012-01-01

    Research on audio classification has involved various types of vector features. However, the multiple semantics of audio information not only have their own properties but are also correlated with one another, and a simple vector representation cannot fully represent these semantics and ignores their relations. The Tensor Uniform Content Locator (TUCL) is put forward to express the semantic information of audio, and a third-order Tensor Semantic Space (TSS) is constructed from the semantic tensor. Tensor Semantic Dispersion (TSD) can aggregate audio resources with the same semantics, so automatic audio classification can be accomplished by calculating their TSD. A Radial Basis Function Tensor Neural Network (RBFTNN) is constructed and used to train an intelligent learning model. For the problem of multi-semantic audio classification, the experimental results show that the method significantly improves classification precision in comparison with the typical Gaussian Mixture Model (GMM) method, and that the classification precision of the RBFTNN model is clearly better than that of the Support Vector Machine (SVM).

  6. Audio Steganography Techniques-A Survey

    Directory of Open Access Journals (Sweden)

    Navneet Kaur

    2014-06-01

    We can communicate with each other by passing messages, which is not secure, but a communication can be kept secret by embedding the message into a carrier or by using special tools such as invisible ink, microdots, etc. Steganography is the science of communicating secret data in an appropriate carrier, and it has been used for hundreds of years. In the digital age, new techniques for hiding data inside a carrier have been invented, known as digital steganography. Nowadays, the carrier of the message can be an image, audio, video or a text file. In this paper we have proposed a method to enhance the security level in audio steganography and also improve the quality by using 2-level steganography.

  7. Content Classification of Multimedia Documents using Partitions of Low-Level Features

    Directory of Open Access Journals (Sweden)

    Jörg Kindermann

    2007-01-01

    Audio-visual documents obtained from German TV news are classified according to the IPTC topic categorization scheme. To this end usual text classification techniques are adapted to speech, video, and non-speech audio. For each of the three modalities word analogues are generated: sequences of syllables for speech, “video words” based on low-level color features (color moments, color correlogram and color wavelet), and “audio words” based on low-level spectral features (spectral envelope and spectral flatness) for non-speech audio. Such audio and video words provide a means to represent the different modalities in a uniform way. The frequencies of the word analogues represent audio-visual documents: the standard bag-of-words approach. Support vector machines are used for supervised classification in a 1 vs. n setting. Classification based on speech outperforms all other single modalities. Combining speech with non-speech audio improves classification. Classification is further improved by supplementing speech and non-speech audio with video words. Optimal F-scores range between 62% and 94% corresponding to 50% - 84% above chance. The optimal combination of modalities depends on the category to be recognized. The construction of audio and video words from low-level features provides a good basis for the integration of speech, non-speech audio and video.

  8. Digitisation of the CERN Audio Archives

    CERN Multimedia

    Maximilien Brice

    2006-01-01

    From the creation of CERN in 1954 until the mid-1980s, the audiovisual service recorded hundreds of hours of moments of life at CERN on audio tapes. These moments range from inaugurations of new facilities to VIP speeches and general interest cultural seminars. The preservation process started in June 2005. In these pictures, we see Waltraud Hug working on an open-reel tape.

  9. Museum audio guides as an accessibility enhancer

    OpenAIRE

    Martins, Cláudia Susana Nunes

    2012-01-01

    Accessibility to museums is enhanced by various types of cultural mediation, such as the use of audio guides, which consist of a means for innovative mediation put forth to make the museum visit more autonomous and simultaneously replace the traditional guided visit. Their use is integrated in the tendency for museum democratisation felt in Europe between the 60s and the 80s of the 20th century, especially with the development of educational services at museums and their opening to schools. I...

  10. Audible Aliasing Distortion in Digital Audio Synthesis

    OpenAIRE

    J. Schimmel

    2012-01-01

    This paper deals with aliasing distortion in digital audio signal synthesis of classic periodic waveforms with infinite Fourier series, for electronic musical instruments. When these waveforms are generated in the digital domain then the aliasing appears due to its unlimited bandwidth. There are several techniques for the synthesis of these signals that have been designed to avoid or reduce the aliasing distortion. However, these techniques have high computing demands. One can say that today'...
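
    The aliasing problem described above can be made concrete with a short sketch that compares a naively sampled sawtooth with an additive, band-limited one. Additive synthesis is only one well-known way to avoid aliasing and is an assumption here, since the truncated abstract does not name the techniques it studies; both function names are hypothetical.

```python
import numpy as np

def naive_sawtooth(f0, fs, n_samples):
    """Sample the ideal sawtooth directly; harmonics above fs/2 fold back (alias)."""
    phase = (f0 * np.arange(n_samples) / fs) % 1.0
    return 2.0 * phase - 1.0

def bandlimited_sawtooth(f0, fs, n_samples):
    """Sum only the Fourier-series harmonics that lie below the Nyquist frequency."""
    phase = 2.0 * np.pi * f0 * np.arange(n_samples) / fs
    x = np.zeros(n_samples)
    k = 1
    while k * f0 < fs / 2.0:
        # Fourier series of the rising sawtooth: -(2/pi) * sum_k sin(2*pi*k*f0*t) / k
        x -= (2.0 / np.pi) * np.sin(k * phase) / k
        k += 1
    return x

if __name__ == "__main__":
    fs, f0, n = 8000, 1250, 8000          # 1 s of audio, so FFT bin index equals Hz
    for name, gen in (("naive", naive_sawtooth), ("band-limited", bandlimited_sawtooth)):
        spectrum = np.abs(np.fft.rfft(gen(f0, fs, n))) / n
        print(name, "magnitude at 500 Hz:", spectrum[500])
```

    At 500 Hz, which is not a harmonic of 1250 Hz, the naive version shows energy folded back from harmonics above the Nyquist frequency, while the additive version shows essentially none. The cost of the additive version grows with the number of harmonics, which is consistent with the computing demands the abstract mentions.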

  11. C Implementation and comparison of companding and silence audio compression techniques

    OpenAIRE

    Kruti Dangarwala; Jigar Shah

    2010-01-01

    Just about all the newest living room audio-video electronics and PC multimedia products being designed today will incorporate some form of compressed digitized-audio processing capability. Audio compression reduces the bit rate required to represent an analog audio signal while maintaining the perceived audio quality. Discarding inaudible data reduces the storage, transmission and compute requirements of handling high-quality audio files. This paper covers wave audio file format and algorith...
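
    As an illustration of companding, the sketch below assumes the μ-law characteristic with μ = 255 and an 8-bit quantiser. The truncated abstract does not say which companding law or word length the paper uses, so these choices and the function names are assumptions.

```python
import numpy as np

MU = 255.0  # assumed mu-law parameter; the abstract does not specify the law used

def mu_law_compress(x, mu=MU):
    """Map samples in [-1, 1] through the logarithmic mu-law curve."""
    x = np.clip(np.asarray(x, dtype=float), -1.0, 1.0)
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    y = np.asarray(y, dtype=float)
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(mu)) / mu

def quantize(y, bits=8):
    """Uniform quantiser on [-1, 1] with 2**(bits-1) - 1 positive levels."""
    levels = 2 ** (bits - 1) - 1
    return np.round(y * levels) / levels

if __name__ == "__main__":
    quiet = 0.01  # a low-level sample, where companding helps most
    companded = mu_law_expand(quantize(mu_law_compress(quiet)))
    uniform = quantize(quiet)
    print("companded error:", abs(companded - quiet))
    print("uniform error:  ", abs(uniform - quiet))
```

    Compressing before quantising spends more of the available levels on quiet samples, so a low-level sample is reproduced with a smaller error than under uniform quantisation at the same bit depth.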

  12. Le registrazioni audio dell’archivio Luigi Nono di Venezia

    Directory of Open Access Journals (Sweden)

    Luca Cossettini

    2009-11-01

    The audio recordings of the Luigi Nono Archive in Venice: guidelines for preservation and critical edition of audio documents. Studying audio recordings brings us back to ancient source verification problems that are too often thought to have been overcome by the technical reproduction of sound. The audio signal is “fixed” on a specific carrier (tape, disc, etc.) with a specific audio format (speed, number of tracks, etc.); the choice of support and format during the first “memorizing” process and the following copying processes is a subjective and, in the case of copying, an interpretative operation conducted within a continuously evolving audio technology. What we listen to today is the result of a transmission process that unavoidably transforms the original acoustic event and the documents that memorize it. Audio recording is in no way a timeless and immutable fixing process. It is therefore necessary to study the transmission processes and to reconstruct the audio document tradition. The re-recording of the tapes of the Archivio Luigi Nono, conducted by the Audio Labs of the DAMS Musica of the University of Udine, offers clear examples of the technical and musicological interpretative problems one can encounter when working with audio recordings.

  13. A new alley in Opinion Mining using Senti Audio Visual Algorithm

    Directory of Open Access Journals (Sweden)

    Mukesh Rawat

    2016-02-01

    People share their views about products and services over social media, blogs, forums, etc. Anyone willing to spend resources and money on these products and services will want to learn about them from the past experiences of their peers. Opinion mining plays a vital role in tracking the growing interests of a particular community, following social and political events, and shaping business strategies, marketing campaigns, etc. This data exists in unstructured form on the internet but, analyzed properly, can be of great use. Sentiment analysis focuses on detecting the polarity of emotions such as happy, sad or neutral. In this paper we propose an algorithm, Senti Audio Visual, for examining video as well as audio sentiments. A review in the form of video/audio may contain several opinions/emotions; the algorithm classifies the reviews with the help of Bayes classifiers into three classes: positive, negative or neutral. The algorithm uses smiles, cries, gazes, pauses, pitch, and intensity as relevant audio-visual features.

  14. SCALABLE PERCEPTUAL AUDIO REPRESENTATION WITH AN ADAPTIVE THREE TIME-SCALE SINUSOIDAL SIGNAL MODEL

    Institute of Scientific and Technical Information of China (English)

    Al-Moussawy Raed; Yin Junxun; Song Shaopeng

    2004-01-01

    This work is concerned with the development and optimization of a signal model for scalable perceptual audio coding at low bit rates. A complementary two-part signal model consisting of Sines plus Noise (SN) is described. The paper presents essentially a fundamental enhancement to the sinusoidal modeling component. The enhancement involves an audio signal scheme based on carrying out overlap-add sinusoidal modeling at three successive time scales: large, medium, and small. The sinusoidal modeling is done in an analysis-by-synthesis overlap-add manner across the three scales by using psychoacoustically weighted matching pursuits. The sinusoidal modeling residual at the first scale is passed to the smaller scales to allow for the modeling of various signal features at appropriate resolutions. This approach greatly helps to correct the pre-echo inherent in the sinusoidal model. This improves the perceptual audio quality upon our previous work of sinusoidal modeling while using the same number of sinusoids. The most obvious application for the SN model is in scalable, high fidelity audio coding and signal modification.

  15. High-performance combination method of electric network frequency and phase for audio forgery detection in battery-powered devices.

    Science.gov (United States)

    Savari, Maryam; Abdul Wahab, Ainuddin Wahid; Anuar, Nor Badrul

    2016-09-01

    Audio forgery is any act of tampering with, illegally copying, or faking the quality of audio for criminal purposes. In the last decade, audio forgery detection has received increasing attention due to a significant increase in the number of forgeries in different types of audio. Among the available forgery detection methods, electric network frequency (ENF) is one of the most powerful in terms of accuracy. Although ENF is suitably accurate for the majority of plug-in powered devices, its weak accuracy in audio forgery detection for battery-powered devices, especially laptops and mobile phones, can be considered one of its main obstacles. To solve this accuracy problem in battery-powered devices, a method combining ENF and a phase feature is proposed. In the experiments conducted, ENF alone gives 50% and 60% accuracy for forgery detection in mobile phones and laptops respectively, while the proposed method shows 88% and 92% accuracy respectively. The combination of ENF and the phase feature thus leads to higher accuracy for forgery detection. PMID:27442454

  16. Differences in Human Audio Localization Performance between a HRTF- and a non-HRTF Audio System

    DEFF Research Database (Denmark)

    Larsen, Camilla Horne; Lauritsen, David Skødt; Larsen, Jacob Junker;

    2013-01-01

    Spatial audio solutions have been around for a long time in real-time applications, but yielding spatial cues that more closely simulate real life accuracy has been a computational issue, and has often been solved by hardware solutions. This has long been a restriction, but now with more powerful...... computers this is becoming a lesser and lesser concern and software solutions are now applicable. Most current virtual environment applications do not take advantage of these implementations of accurate spatial cues, however. This paper compares a common implementation of spatial audio and a head......-related transfer function (HRTF) system implementation in a study in relation to precision, speed and navigational performance in localizing audio sources in a virtual environment. We found that a system using HRTFs is significantly better at all three performance tasks than a system using panning....

  17. Origin, Development and Trend of Audio Book: Coping Strategies of Library in the Face of New Audio Resources

    Institute of Scientific and Technical Information of China (English)

    张鹏; 王铮

    2016-01-01

    The traditional audio book pattern has changed in the environment of the Internet, digitization and mobile technology. This article first reviews the evolution of the concept of the audio book, then analyzes the history of the audio book, the connection between the audio book and its recording medium, and the popularization, marketization and resource characteristics of the audio book. On this basis, the article analyzes the changes in audio resources brought about by the new audio book pattern, and finally discusses library strategies for coping with new audio resources.

  18. Content-Based Hierarchical Analysis of News Video Using Audio and Visual Information

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    A schema for content-based analysis of broadcast news video is presented. First, we separate commercials from news using audiovisual features. Then, we automatically organize news programs into a content hierarchy at various levels of abstraction via effective integration of video, audio, and text data available from the news programs. Based on these news video structure and content analysis technologies, a TV news video library is generated, from which users can retrieve specific news stories according to their demands.

  19. The Digital Audio Editor as a Teaching and Laboratory Tool

    Science.gov (United States)

    Latta, Gregory

    2001-10-01

    Digital audio editors such as Software Audio Workshop and Cool Edit Pro are powerful tools used in the radio and audio recording fields for editing digital audio. However, they are also powerful tools in the physics classroom and laboratory. During this presentation the author will show how a digital audio editor, combined with a library of audio .wav files produced by the author as part of sabbatical work, can be used to: 1. demonstrate quantitatively and qualitatively the relationship between the decibel, sound intensity, and loudness perception, 2. demonstrate quantitatively and qualitatively the relationship between frequency and pitch perception, 3. perform additive and subtractive sound synthesis, 4. demonstrate comb filtering, 5. demonstrate constructive and destructive interference, and 6. turn the computer into an accurate signal generator (sine wave, square wave, etc.) with a frequency resolution of 1Hz. Availability of the required software and .wav file library will also be discussed.
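
    Several of the demonstrations listed above (a signal generator, the decibel relationship, destructive interference) can be sketched numerically. The code below is an illustrative stand-in for what an audio editor would show, not the author's .wav library; the function names, the 44.1 kHz sample rate and the full-scale reference of 1.0 are assumptions.

```python
import numpy as np

def sine(freq_hz, duration_s, fs=44100, amplitude=1.0, phase=0.0):
    """Generate a sine wave, as a software signal generator would."""
    n = np.arange(int(round(duration_s * fs)))
    return amplitude * np.sin(2.0 * np.pi * freq_hz * n / fs + phase)

def level_db(x, ref=1.0):
    """RMS level of a signal in decibels relative to `ref`."""
    rms = np.sqrt(np.mean(np.square(x)))
    return 20.0 * np.log10(rms / ref)

if __name__ == "__main__":
    full = sine(440.0, 1.0)
    half = sine(440.0, 1.0, amplitude=0.5)
    # Halving the amplitude lowers the level by 20*log10(2), about 6 dB.
    print("level difference (dB):", level_db(full) - level_db(half))

    # Destructive interference: the same tone shifted by half a cycle cancels it.
    inverted = sine(440.0, 1.0, phase=np.pi)
    print("peak of the sum:", np.max(np.abs(full + inverted)))
```

    Changing `freq_hz` in steps of 1 Hz corresponds to the frequency resolution mentioned in the abstract, and summing several calls to `sine` gives a simple form of additive synthesis.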

  20. Audio Classical Composer Identification by Deep Neural Network

    OpenAIRE

    Hu, Zhen; Fu, Kun; Zhang, Changshui

    2013-01-01

    Audio Classical Composer Identification (ACC) is an important problem in Music Information Retrieval (MIR) which aims at identifying the composer for audio classical music clips. The famous annual competition, Music Information Retrieval Evaluation eXchange (MIREX), also takes it as one of the four training&testing tasks. We built a hybrid model based on Deep Belief Network (DBN) and Stacked Denoising Autoencoder (SDA) to identify the composer from audio signal. As a matter of copyright, spon...

  1. Use of Effective Audio in E-learning Courseware

    OpenAIRE

    Ray, Kisor

    2015-01-01

    E-learning uses electronic media and information & communication technologies to provide education to the masses. E-learning delivers hypertext, text, audio, images, animation and videos using standalone desktop computers, local area network based intranets and internet based content. While producing e-learning content or courseware, a major decision factor is whether to use audio for the benefit of the end users. Generally, three types of audio can be used in e-learning: narration, mus...

  2. A ROBUST WAVELET BASED WATERMARKING SCHEME FOR DIGITAL AUDIO

    Directory of Open Access Journals (Sweden)

    Ayad Ibrahim Abdulsada

    2012-06-01

    In this paper, a robust wavelet-based watermarking scheme is proposed for digital audio. A single bit is embedded in the approximation part of each frame. The watermark bits are embedded in two subsets of indexes randomly generated using two keys for security purposes. The embedding process is done adaptively according to the mean of each approximation part. The detection of the watermark does not depend on the original audio. To measure the robustness of the algorithm, different signal processing operations were applied to the watermarked audio. Several experiments were conducted to illustrate the robustness and efficiency of the proposed audio watermarking scheme.

  3. Standardization Promotes the Quality of Meteorological Audio & Video Service

    Institute of Scientific and Technical Information of China (English)

    2011-01-01

    As an important part of the meteorological sector and a critical basis for enhancing the capability of meteorological disaster prevention and mitigation and climate change response, meteorological standardization is a significant support for facilitating the good and quick development of the meteorological sector. Huafeng Group, as a leading enterprise in meteorological audio & video service, has for years attached great importance to employing the standardization of meteorological audio & video service to improve its management level and the quality of its programs, enhance the quality of meteorological audio & video service, build its brand image, cultivate high-level backbone personnel, and facilitate the sustainable development of meteorological audio & video service.

  4. Newnes audio and Hi-Fi engineer's pocket book

    CERN Document Server

    Capel, Vivian

    2013-01-01

    Newnes Audio and Hi-Fi Engineer's Pocket Book, Second Edition provides concise discussion of several audio topics. The book is comprised of 10 chapters that cover different audio equipment. The coverage of the text includes microphones, gramophones, compact discs, and tape recorders. The book also covers high-quality radio, amplifiers, and loudspeakers. The book then reviews the concepts of sound and acoustics, and presents some facts and formulas relevant to audio. The text will be useful to sound engineers and other professionals whose work involves sound systems.

  5. A content-based digital audio watermarking algorithm

    Science.gov (United States)

    Zhang, Liping; Zhao, Yi; Xu, Wen Li

    2015-12-01

    Digital audio watermarking embeds inaudible information into digital audio data for the purposes of copyright protection, ownership verification, covert communication, and/or auxiliary data carrying. In this paper, we present a novel watermarking scheme to embed a meaningful gray image into digital audio by quantizing the wavelet coefficients (using integer lifting wavelet transform) of audio samples. Our audio-dependent watermarking procedure directly exploits temporal and frequency perceptual masking of the human auditory system (HAS) to guarantee that the embedded watermark image is inaudible and robust. The watermark is constructed by utilizing still image compression technique, breaking each audio clip into smaller segments, selecting the perceptually significant audio segments to wavelet transform, and quantizing the perceptually significant wavelet coefficients. The proposed watermarking algorithm can extract the watermark image without the help from the original digital audio signals. We also demonstrate the robustness of that watermarking procedure to audio degradations and distortions, e.g., those that result from noise adding, MPEG compression, low pass filtering, resampling, and requantization.

  6. A Novel Algorithm for Robust Audio Watermarking in Wavelet Domain

    Institute of Scientific and Technical Information of China (English)

    FU Yu; WANG Bao-bao; LI Chun-ru; QUAN Ning-qiang

    2004-01-01

    A novel algorithm for digital audio watermarking in the wavelet domain is proposed. First, an original audio signal is decomposed by discrete wavelet transform at three levels. Then, a discrete watermark is embedded into the coefficients of its intermediate frequencies. Finally, the watermarked audio signal is obtained by wavelet reconstruction. The proposed algorithm makes good use of the multiresolution characteristics of the wavelet transform. The original audio signal is not needed when detecting the watermark by correlation. Simulation results show that the watermark is inaudible and robust to noise, filtering and resampling.
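
    The decompose, embed, reconstruct pipeline described above can be sketched as follows. The sketch makes several assumptions that the abstract does not confirm: it uses the Haar wavelet, treats the level-2 detail band as the "intermediate frequencies", and embeds bits by quantisation-index parity rather than the correlation detector the paper mentions. All function names and the `step` parameter are hypothetical.

```python
import numpy as np

def haar_dwt(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    x = np.asarray(x, dtype=float)
    return (x[0::2] + x[1::2]) / np.sqrt(2.0), (x[0::2] - x[1::2]) / np.sqrt(2.0)

def haar_idwt(a, d):
    """Inverse of haar_dwt."""
    x = np.empty(2 * a.size)
    x[0::2] = (a + d) / np.sqrt(2.0)
    x[1::2] = (a - d) / np.sqrt(2.0)
    return x

def decompose(x, levels=3):
    """Multi-level decomposition; details[0] is the finest band."""
    a, details = np.asarray(x, dtype=float), []
    for _ in range(levels):
        a, d = haar_dwt(a)
        details.append(d)
    return a, details

def reconstruct(a, details):
    for d in reversed(details):
        a = haar_idwt(a, d)
    return a

def embed(x, bits, step=0.05):
    """Hide bits in the level-2 detail band (signal length must be a multiple of 8)."""
    a, details = decompose(x, 3)
    d = details[1].copy()
    for i, b in enumerate(bits):
        q = np.floor(d[i] / step)
        if int(q) % 2 != b:          # make the quantisation index parity equal the bit
            q += 1.0
        d[i] = (q + 0.5) * step      # centre of the cell, for a margin against noise
    details[1] = d
    return reconstruct(a, details)

def extract(xw, n_bits, step=0.05):
    """Blind extraction: only the watermarked signal and the step size are needed."""
    _, details = decompose(xw, 3)
    return (np.floor(details[1][:n_bits] / step).astype(int) % 2).tolist()
```

    A larger `step` makes the hidden bits more tolerant of added noise at the cost of a larger change to the audio, which is the usual robustness versus inaudibility trade-off in watermarking.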

  7. Thermal and neutron-physical features of the nuclear reactor for a power pulsation plant for space applications

    Science.gov (United States)

    Gordeev, É. G.; Kaminskii, A. S.; Konyukhov, G. V.; Pavshuk, V. A.; Turbina, T. A.

    2012-05-01

    We have explored the possibility of creating small-size reactors with a high power output with the provision of thermal stability and nuclear safety under standard operating conditions and in emergency situations. The neutron-physical features of such a reactor have been considered and variants of its designs preserving the main principles and approaches of nuclear rocket engine technology are presented.

  8. Construction features of procedural simulator for objective control expert systems management devices of rocket and space technology ground infrastructure objects

    OpenAIRE

    Pashchenko Dmitry; Sinev Michael; Trokoz Dmitriy; Sineva Mary; Tokarev Andrey; Dubravin Aleksey

    2016-01-01

    The paper addresses the construction of procedural simulators for expert-system control devices of rocket and space technology ground infrastructure objects. It gives a classification of courses that supports the development of skills with both the hardware and the software components of diagnostic expert systems, and a method of calculating an integrated assessment of the student's mastery of the material.

  9. Materials Science Research Hardware for Application on the International Space Station: an Overview of Typical Hardware Requirements and Features

    Science.gov (United States)

    Schaefer, D. A.; Cobb, S.; Fiske, M. R.; Srinivas, R.

    2000-01-01

    NASA's Marshall Space Flight Center (MSFC) is the lead center for Materials Science Microgravity Research. The Materials Science Research Facility (MSRF) is a key development effort underway at MSFC. The MSRF will be the primary facility for microgravity materials science research on board the International Space Station (ISS) and will implement the NASA Materials Science Microgravity Research Program. It will operate in the U.S. Laboratory Module and support U.S. Microgravity Materials Science Investigations. This facility is being designed to maintain the momentum of the U.S. role in microgravity materials science and support NASA's Human Exploration and Development of Space (HEDS) Enterprise goals and objectives for Materials Science. The MSRF as currently envisioned will consist of three Materials Science Research Racks (MSRR), which will be deployed to the ISS in phases. Each rack is being designed to accommodate various Experiment Modules, which comprise processing facilities for peer-selected Materials Science experiments. Phased deployment will enable early opportunities for the U.S. and International Partners, and support the timely incorporation of technology updates to the Experiment Modules and sensor devices.

  10. Instructional Audio Guidelines: Four Design Principles to Consider for Every Instructional Audio Design Effort

    Science.gov (United States)

    Carter, Curtis W.

    2012-01-01

    This article contends that instructional designers and developers should attend to four particular design principles when creating instructional audio. Support for this view is presented by referencing the limited research that has been done in this area, and by indicating how and why each of the four principles is important to the design process.…

  11. Audio Arduino - an ALSA (Advanced Linux Sound Architecture) audio driver for FTDI-based Arduinos

    DEFF Research Database (Denmark)

    Dimitrov, Smilen; Serafin, Stefania

    2011-01-01

    Technology Devices International Ltd [FTDI] company) can be demonstrated to behave as a full-duplex, mono, 8-bit 44.1 kHz soundcard, through an implementation of: a PC audio driver for ALSA (Advanced Linux Sound Architecture); a matching program for the Arduino's ATmega microcontroller - and nothing more...

  12. On Steganography in Lost Audio Packets

    CERN Document Server

    Mazurczyk, Wojciech; Szczypiorski, Krzysztof

    2011-01-01

    The paper presents a new hidden data insertion procedure based on estimated probability of the remaining time of the call for steganographic method called LACK (Lost Audio PaCKets steganography). LACK provides hidden communication for real-time services like Voice over IP. The analytical results presented in this paper concern the influence of LACK's hidden data insertion procedures on the method's impact on quality of voice transmission and its resistance to steganalysis. The proposed hidden data insertion procedure is also compared to previous steganogram insertion approach based on estimated remaining average call duration.

  13. Non Audio-Video gesture recognition system

    DEFF Research Database (Denmark)

    Craciunescu, Razvan; Mihovska, Albena Dimitrova; Kyriazakos, Sofoklis;

    2016-01-01

    Gesture recognition is a topic in computer science and language technology with the goal of interpreting human gestures via mathematical algorithms. Gestures can originate from any bodily motion or state but commonly originate from the face or hand. Current research focus includes on the emotion...... recognition from the face and hand gesture recognition. Gesture recognition enables humans to communicate with the machine and interact naturally without any mechanical devices. This paper investigates the possibility to use non-audio/video sensors in order to design a low-cost gesture recognition device...

  14. Predistortion of a Bidirectional Cuk Audio Amplifier

    DEFF Research Database (Denmark)

    Birch, Thomas Hagen; Nielsen, Dennis; Knott, Arnold;

    2014-01-01

    Some non-linear amplifier topologies are capable of providing a larger voltage gain than one from a DC source, which could make them suitable for various applications. However, the non-linearities introduce a significant amount of harmonic distortion (THD). Some of this distortion could be reduced...... using predistortion. This paper suggests linearizing a nonlinear bidirectional Cuk audio amplifier using an analog predistortion approach. A prototype power stage was built and results show that a voltage gain of up to 9 dB and reduction in THD from 6% down to 3% was obtainable using this approach....

  15. Audio marketing v ČR

    OpenAIRE

    Timanov, Vladimir

    2015-01-01

    The aim of the work is to prepare and evaluate an investment project: the establishment of a firm in the Czech Republic. The line of business is sensory marketing, or audio-visual marketing, a field of marketing that encourages sales by influencing the emotional side of the client. The work comprises market research, an analysis of the competitors in this sphere, and a financial plan. As a result, the work will be stru...

  16. Mixing audio concepts, practices and tools

    CERN Document Server

    Izhaki, Roey

    2013-01-01

    Your mix can make or break a record, and mixing is an essential catalyst for a record deal. Professional engineers with exceptional mixing skills can earn vast amounts of money and find that they are in demand by the biggest acts. To develop such skills, you need to master both the art and science of mixing. The new edition of this bestselling book offers all you need to know and put into practice in order to improve your mixes. Covering the entire process --from fundamental concepts to advanced techniques -- and offering a multitude of audio samples, tips and tricks, this boo

  17. Characteristics of Abductive Inquiry in Earth and Space Science: An Undergraduate Teacher Prospective Case Study

    Science.gov (United States)

    Ramalis, T. R.; Liliasari; Herdiwidjaya, D.

    2016-08-01

    The purpose of this case study was to describe the characteristic features of learning activities in the domain of earth and space science. The context of the study is earth and space learning activities with three groups of prospective student teachers, on the subjects of the shape and size of Earth, land and sea breezes, and the Moon's orbit, respectively. The analysis was conducted qualitatively on activity data from students doing project work, student worksheets, group project report documents, and notes and audio recordings of discussions. The findings identified three types of abduction during the learning process: theoretical-model abduction, factual abduction, and law abduction. Implications for science inquiry learning as well as relevant research are suggested.

  18. Semantic Labeling of Nonspeech Audio Clips

    Directory of Open Access Journals (Sweden)

    Xiaojuan Ma

    2010-01-01

    Full Text Available Human communication about entities and events is primarily linguistic in nature. While visual representations of information are shown to be highly effective as well, relatively little is known about the communicative power of auditory nonlinguistic representations. We created a collection of short nonlinguistic auditory clips encoding familiar human activities, objects, animals, natural phenomena, machinery, and social scenes. We presented these sounds to a broad spectrum of anonymous human workers using Amazon Mechanical Turk and collected verbal sound labels. We analyzed the human labels in terms of their lexical and semantic properties to ascertain that the audio clips do evoke the information suggested by their pre-defined captions. We then measured the agreement with the semantically compatible labels for each sound clip. Finally, we examined which kinds of entities and events, when captured by nonlinguistic acoustic clips, appear to be well-suited to elicit information for communication, and which ones are less discriminable. Our work is set against the broader goal of creating resources that facilitate communication for people with some types of language loss. Furthermore, our data should prove useful for future research in machine analysis/synthesis of audio, such as computational auditory scene analysis, and annotating/querying large collections of sound effects.

  19. TVAR modeling of EEG to detect audio distraction during simulated driving

    Science.gov (United States)

    Dahal, Nabaraj; Nandagopal, D. (Nanda); Cocks, Bernadine; Vijayalakshmi, Ramasamy; Dasari, Naga; Gaertner, Paul

    2014-06-01

    Objective. The objective of our current study was to look for EEG correlates that can reveal the engaged state of the brain while undertaking cognitive tasks. Specifically, we aimed to identify EEG features that could detect audio distraction during simulated driving. Approach. Time-varying autoregressive (TVAR) analysis using a Kalman smoother was carried out on short time epochs of EEG data collected from participants as they undertook two simulated driving tasks. TVAR coefficients were then used to construct an all-pole model, enabling the identification of EEG features that could differentiate normal driving from audio-distracted driving. Main results. Pole analysis of the TVAR model led to the visualization of event-related synchronization/desynchronization (ERS/ERD) patterns in the form of pole displacements in pole plots of the temporal EEG channels in the z-plane, enabling the differentiation of the two driving conditions. ERS in the EEG data has been demonstrated during audio distraction as an associated phenomenon. Significance. Visualizing the ERD/ERS phenomenon in terms of pole displacement is a novel approach. Although ERS/ERD has previously been demonstrated as reliable when applied to motor-related tasks, this is believed to be the first time that it has been applied to investigate human cognitive phenomena such as attention and distraction. Results confirmed that distracted and non-distracted driving states can be identified using this approach, supporting its applicability to cognition research.
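
    How autoregressive coefficients map to poles in the z-plane can be illustrated with a small sketch. It assumes a fixed second-order model x[n] = a1*x[n-1] + a2*x[n-2] + e[n], whose poles are the roots of z^2 - a1*z - a2 = 0; the model order, the sampling rate, and the function names are assumptions for illustration, and the Kalman-smoother estimation of time-varying coefficients described in the abstract is not shown.

```python
import cmath
import math


def ar2_poles(a1, a2):
    """Poles of the all-pole model 1 / (1 - a1*z^-1 - a2*z^-2)."""
    disc = cmath.sqrt(a1 * a1 + 4.0 * a2)
    return (a1 + disc) / 2.0, (a1 - disc) / 2.0


def pole_frequency_hz(pole, fs):
    """Frequency (Hz) corresponding to the pole angle at sampling rate fs."""
    return abs(cmath.phase(pole)) * fs / (2.0 * math.pi)


# Hypothetical usage: coefficients built from a 10 Hz resonance at fs = 128 Hz.
fs, f0, radius = 128.0, 10.0, 0.95
theta = 2.0 * math.pi * f0 / fs
a1, a2 = 2.0 * radius * math.cos(theta), -radius * radius
p_plus, p_minus = ar2_poles(a1, a2)
```

    A pole moving toward the unit circle corresponds to a sharper spectral peak at its angle, which is the kind of displacement a pole plot makes visible when the coefficients are re-estimated over successive epochs.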

  20. Effect of downsampling and compressive sensing on audio-based continuous cough monitoring.

    Science.gov (United States)

    Casaseca-de-la-Higuera, Pablo; Lesso, Paul; McKinstry, Brian; Pinnock, Hilary; Rabinovich, Roberto; McCloughan, Lucy; Monge-Álvarez, Jesús

    2015-01-01

    This paper presents an efficient cough detection system based on simple decision-tree classification of spectral features from a smartphone audio signal. Preliminary evaluation on voluntary coughs shows that the system can achieve 98% sensitivity and 97.13% specificity when the audio signal is sampled at full rate. With this baseline system, we study possible efficiency optimisations by evaluating the effect of downsampling below the Nyquist rate and how the system performance at low sampling frequencies can be improved by incorporating compressive sensing reconstruction schemes. Our results show that undersampling down to 400 Hz can still keep sensitivity and specificity values above 90% despite aliasing. Furthermore, the sparsity of cough signals in the time domain allows performance figures to be kept close to 90% when sampling at 100 Hz using compressive sensing schemes.
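
    The aliasing mentioned in this abstract can be made concrete with a short calculation: when a real signal is sampled at fs without an anti-aliasing filter, a tone at frequency f appears folded into the band from 0 to fs/2. The helper below is a generic textbook sketch, not part of the paper's system, and its name is hypothetical.

```python
def alias_frequency(f, fs):
    """Apparent frequency (Hz) of a real tone at f Hz after sampling at fs Hz."""
    f_mod = f % fs
    return min(f_mod, fs - f_mod)


# Hypothetical usage: where spectral content lands when sampling at 400 Hz.
folded = {f: alias_frequency(f, 400) for f in (150, 250, 450, 1000)}
```

    At 400 Hz every component above 200 Hz is folded back below 200 Hz, so a classifier working on such data sees a mixture of true and aliased spectral content.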

  1. El tratamiento documental del mensaje audiovisual Documentary treatment of the audio-visual message

    Directory of Open Access Journals (Sweden)

    Blanca Rodríguez Bravo

    2005-06-01

    Full Text Available The peculiarities of the audio-visual document and the documentary treatment it undergoes in television broadcasting stations are analyzed. Taking into account the particular features of images that condition their analysis and retrieval, the paper establishes the stages and procedures for representing the audio-visual message with a view to its reuse. Finally, some considerations are offered on the automatic processing of video and on the changes introduced by digital television.

  2. Beyond Podcasting: Creative Approaches to Designing Educational Audio

    Science.gov (United States)

    Middleton, Andrew

    2009-01-01

    This paper discusses a university-wide pilot designed to encourage academics to creatively explore learner-centred applications for digital audio. Participation in the pilot was diverse in terms of technical competence, confidence and contextual requirements and there was little prior experience of working with digital audio. Many innovative…

  3. Effect of Audio vs. Video on Aural Discrimination of Vowels

    Science.gov (United States)

    McCrocklin, Shannon

    2012-01-01

    Despite the growing use of media in the classroom, the effects of using audio versus video in pronunciation teaching have been largely ignored. To analyze the impact of the use of audio or video training on aural discrimination of vowels, 61 participants (all students at a large American university) took a pre-test followed by two training…

  4. Content Discovery from Composite Audio: An unsupervised approach

    NARCIS (Netherlands)

    Lu, L.

    2009-01-01

    In this thesis, we developed and assessed a novel robust and unsupervised framework for semantic inference from composite audio signals. We focused on the problem of detecting audio scenes and grouping them into meaningful clusters. Our approach addressed all major steps in a general process of comp

  5. Using Audio Books to Improve Reading and Academic Performance

    Science.gov (United States)

    Montgomery, Joel R.

    2009-01-01

    This article highlights significant research about what below grade-level reading means in middle school classrooms and suggests a tested approach to improve reading comprehension levels significantly by using audio books. The use of these audio books can improve reading and academic performance for both English language learners (ELLs) and for…

  6. A Case Study on Audio Feedback with Geography Undergraduates

    Science.gov (United States)

    Rodway-Dyer, Sue; Knight, Jasper; Dunne, Elizabeth

    2011-01-01

    Several small-scale studies have suggested that audio feedback can help students to reflect on their learning and to develop deep learning approaches that are associated with higher attainment in assessments. For this case study, Geography undergraduates were given audio feedback on a written essay assignment, alongside traditional written…

  7. Use of Audio Modification in Science Vocabulary Assessment

    Science.gov (United States)

    Adiguzel, Tufan

    2011-01-01

    The purposes of this study were to examine the utilization of audio modification in vocabulary assessment in school subject areas, specifically in elementary science, and to present a web-based key vocabulary assessment tool for the elementary school level. Audio-recorded readings were used to replace independent student readings as the task…

  8. Performance Analysis of Data Hiding in MPEG-4 AAC Audio

    Institute of Scientific and Technical Information of China (English)

    XU Shuzheng; ZHANG Peng; WANG Pengjun; YANG Huazhong

    2009-01-01

    A high-capacity data hiding technique was developed for compressed digital audio. As perceptual audio coding has become the accepted technology for storage and transmission of audio signals, compressed audio information hiding enables robust, imperceptible transmission of data within audio signals, thus allowing valuable information to be attached to the content, such as the song title, lyrics, composer's name, and artist or property-rights-related data. This paper describes simultaneous low-bitrate encoding and information hiding for highly compressed audio signals. The information hiding is implemented in the quantization process of the audio content, which improves robustness, signal quality, and security. The imperceptibility of the embedded data is ensured based on the masking property of the human auditory system (HAS). The robustness and security are evaluated by various attacking algorithms. Tests with an extended MPEG-4 advanced audio coding (AAC) encoder confirm that the method is robust to the regular and singular groups method (RS) and sample pair analysis (SPA) attacks as well as other statistical steganalysis attacks.

  9. Minimizing Crosstalk in Self Oscillating Switch Mode Audio Power Amplifiers

    DEFF Research Database (Denmark)

    Knott, Arnold; Ploug, Rasmus Overgaard

    2012-01-01

    The varying switching frequencies of self oscillating switch mode audio amplifiers have been known to cause interchannel intermodulation disturbances in multi channel configurations. This crosstalk phenomenon has a negative impact on the audio performance. The goal of this paper is to present...... by the implementation presented. Future work could include further refinement of the implementation of the concepts, electromagnetic interference investigations or PCB design....

  10. The Practical Audio-Visual Handbook for Teachers.

    Science.gov (United States)

    Scuorzo, Herbert E.

    The use of audio/visual media as an aid to instruction is a common practice in today's classroom. Most teachers, however, have little or no formal training in this field and rarely a knowledgeable coordinator to help them. "The Practical Audio-Visual Handbook for Teachers" discusses the types and mechanics of many of these media forms and proposes…

  11. IELTS speaking instruction through audio/voice conferencing

    OpenAIRE

    Hamed Ghaemi; Hossein Khodabakhshzade; Hamid R. Kargozari

    2012-01-01

    The current study aims at investigating the impact of audio/voice conferencing, as a new approach to teaching speaking, on the speaking performance and/or speaking band score of IELTS candidates. Experimental group subjects participated in an audio conferencing class, while those of the control group attended a traditional IELTS Speaking class. At the end of the study, all subjects participated in an IELTS examination held on November fourth in Tehran, Iran. To compare the group means for the study, an independent t-test analysis was e...

  12. Parametric Audio Based Decoder and Music Synthesizer for Mobile Applications

    NARCIS (Netherlands)

    Oomen, A.W.J.; Szczerba, M.Z.; Therssen, D.

    2011-01-01

    This paper reviews parametric audio coders and discusses novel technologies introduced in a low-complexity, low-power-consumption audio decoder and music synthesizer platform developed by the authors. The decoder uses a parametric coding scheme based on the MPEG-4 Parametric Audio standard. In order to

  13. Four-quadrant flyback converter for direct audio power amplification

    DEFF Research Database (Denmark)

    Ljusev, Petar; Andersen, Michael Andreas E.

    2005-01-01

    This paper presents a bidirectional, four-quadrant flyback converter for use in direct audio power amplification. When compared to the standard Class-D switching audio power amplifier with a separate power supply, the proposed four-quadrant flyback converter provides a simple solution with better...

  14. Multilevel inverter based class D audio amplifier for capacitive transducers

    DEFF Research Database (Denmark)

    Nielsen, Dennis; Knott, Arnold; Andersen, Michael A. E.

    2014-01-01

    The reduced semiconductor voltage stress makes the multilevel inverters especially interesting, when driving capacitive transducers for audio applications. A ± 300 V flying capacitor class D audio amplifier driving a 100 nF load in the midrange region of 0.1-3.5 kHz with Total Harmonic Distortion...

  16. ESA personal communications and digital audio broadcasting systems based on non-geostationary satellites

    Science.gov (United States)

    Logalbo, P.; Benedicto, J.; Viola, R.

    1993-01-01

    Personal Communications and Digital Audio Broadcasting are two new services that the European Space Agency (ESA) is investigating for future European and Global Mobile Satellite systems. ESA is active in promoting these services in their various mission options including non-geostationary and geostationary satellite systems. A Medium Altitude Global Satellite System (MAGSS) for global personal communications at L and S-band, and a Multiregional Highly inclined Elliptical Orbit (M-HEO) system for multiregional digital audio broadcasting at L-band are described. Both systems are being investigated by ESA in the context of future programs, such as Archimedes, which are intended to demonstrate the new services and to develop the technology for future non-geostationary mobile communication and broadcasting satellites.

  17. The Citation Patterns on the Papers' Feature Space of Being Cited%文献被引特征空间上的引文模式分析

    Institute of Scientific and Technical Information of China (English)

    徐建中; 王名扬

    2013-01-01

    Building on the knowledge-flow properties hidden behind papers' citation counts, a new concept, the feature space of a paper's citations, is proposed. The papers' citation patterns on this feature space are discussed over two time periods: (a) the first five years after publication and (b) the whole citation life cycle. Papers with a wide distribution on the feature space in the early stage after publication were found to have a higher probability of becoming highly cited. The results should help in understanding how highly cited papers form and in predicting future highly cited papers more accurately.

  18. Biomedical image representation approach using visualness and spatial information in a concept feature space for interactive region-of-interest-based retrieval.

    Science.gov (United States)

    Rahman, Md Mahmudur; Antani, Sameer K; Demner-Fushman, Dina; Thoma, George R

    2015-10-01

    This article presents an approach to biomedical image retrieval by mapping image regions to local concepts where images are represented in a weighted entropy-based concept feature space. The term "concept" refers to perceptually distinguishable visual patches that are identified locally in image regions and can be mapped to a glossary of imaging terms. Further, the visual significance (e.g., visualness) of concepts is measured as the Shannon entropy of pixel values in image patches and is used to refine the feature vector. Moreover, the system can assist the user in interactively selecting a region-of-interest (ROI) and searching for similar image ROIs. Further, a spatial verification step is used as a postprocessing step to improve retrieval results based on location information. The hypothesis that such approaches would improve biomedical image retrieval is validated through experiments on two different data sets, which are collected from open access biomedical literature.
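
    The "visualness" measure described in this abstract, the Shannon entropy of pixel values in an image patch, can be sketched as below. The function name and the flat list representation of a patch are assumptions for illustration; how the entropy is then used to weight the concept feature vector is not specified in the abstract and is not shown.

```python
import math
from collections import Counter


def patch_entropy(pixels):
    """Shannon entropy (bits) of the empirical distribution of pixel values."""
    n = len(pixels)
    counts = Counter(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical usage on three 16-pixel grayscale patches.
flat = patch_entropy([7] * 16)              # a single gray level
two_tone = patch_entropy([0, 255] * 8)      # two equally frequent levels
varied = patch_entropy(list(range(16)))     # sixteen distinct levels
```

    A uniform patch carries no information by this measure, while a patch with many equally frequent gray levels scores highest, which matches the intuition that visually busy patches are more distinctive.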

  19. Features of Dongjing’s Commercial Space in the Northern Song Dynasty: An Interpretation Based on Riverside Scene at Qingming Festival

    Institute of Scientific and Technical Information of China (English)

    Zhu Jin; Pan Jiahong; Zhu Xiaofeng; Li Min

    2015-01-01

    Drawing on an analysis of the city image presented in the painting Riverside Scene at Qingming Festival, as well as other relevant documents, this paper explores the factors that caused the market system reform in the Northern Song Dynasty. It also explores the features of the commercial space of Dongjing, the capital city, that the reform prompted, and finds that the growth of the urban population, the rise of the city's commercial status and the emergence of a citizen class were the essential factors contributing to the reform. It concludes that Dongjing's commercial space had the following characteristics: it developed in a linear form; it gradually formed a commercial network by integrating various shops, markets, and warehouses; it expanded to the Outer City to take in the prosperous grassroots markets; and it hosted commercial activities with longer business hours.

  20. Extraction of Subject-Specific Facial Expression Categories and Generation of Facial Expression Feature Space using Self-Mapping

    Directory of Open Access Journals (Sweden)

    Masaki Ishii

    2008-06-01

    Full Text Available This paper proposes a method for generating a subject-specific Facial Expression Map (FEMap) using Self-Organizing Maps (SOM), an unsupervised learning technique, together with Counter Propagation Networks (CPN), a supervised learning technique. The proposed method consists of two steps. In the first step, the topological change of a face pattern during the formation of a facial expression is learned hierarchically using a SOM with a narrow mapping space, and the number of subject-specific facial expression categories and the representative images of each category are extracted. Psychological significance based on the neutral expression and the six basic emotions (anger, sadness, disgust, happiness, surprise, and fear) is assigned to each extracted category. In the second step, the categories and the representative images described above are learned using a CPN with a large mapping space, and a category map that expresses the topological characteristics of facial expression is generated. This paper defines this category map as an FEMap. Experimental results for six subjects show that the proposed method can generate a subject-specific FEMap based on the topological characteristics of facial expression appearing in face images.

  1. 论网络穿越小说的基本特性%On the Features of "Travel through Space-time" Network Fiction

    Institute of Scientific and Technical Information of China (English)

    李玉萍

    2012-01-01

    "Travel through space-time" network fiction is the textual form that the literary motif of "travel through space-time" takes in the era of network media. Starting from an analysis of the features of this literary motif and the characteristics of network media, the paper analyzes the literary-motif features and the network-art features of such fiction, and then its spatial and temporal characteristics. The conclusion is that, in the literary sense, "travel through space-time" network fiction is a brand new form of the novel; in the aesthetic sense, it realizes people's virtual existence in the digital environment to the maximum extent.

  2. Spaces

    Directory of Open Access Journals (Sweden)

    Maziar Nekovee

    2010-01-01

    Full Text Available Cognitive radio is being intensively researched as the enabling technology for license-exempt access to the so-called TV White Spaces (TVWS), large portions of spectrum in the UHF/VHF bands which become available on a geographical basis after digital switchover. In the US and, more recently, in the UK, the regulators have given conditional endorsement to this new mode of access. This paper reviews the state of the art in technology, regulation, and standardisation of cognitive access to TVWS. It examines the spectrum opportunity and commercial use cases associated with this form of secondary access.

  3. A Dither Modulation Audio Watermarking Algorithm Based on HAS

    Directory of Open Access Journals (Sweden)

    Yi-bo Huang

    2012-11-01

    Full Text Available In this study, we propose a dither modulation audio watermarking algorithm based on the human auditory system (HAS). The algorithm first converts the binary watermark image into a one-dimensional digital sequence and transforms that sequence with a Fibonacci transform. The audio is then divided into data segments, a discrete wavelet transform is applied to each segment, and each segment adaptively chooses its quantization step. Finally, the watermark is embedded into the transformed low-frequency coefficients using dither modulation. The watermark is extracted without the original audio, so extraction is blind. The experimental results show that the algorithm is robust against noise addition, compression, low-pass filtering and re-sampling.
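
    The embedding and blind extraction steps can be illustrated with a minimal sketch of binary dither modulation (a form of quantization index modulation). It assumes a fixed quantization step and dither values of 0 and step/2, whereas the abstract describes a step chosen adaptively per segment; the function names and the sample coefficients are hypothetical, and the wavelet transform and Fibonacci transform stages are omitted.

```python
def dm_embed(c, bit, step):
    """Quantize coefficient c onto the lattice that encodes `bit`."""
    dither = 0.0 if bit == 0 else step / 2.0
    return step * round((c - dither) / step) + dither


def dm_extract(c, step):
    """Blind extraction: pick the bit whose lattice lies closest to c."""
    d0 = abs(c - dm_embed(c, 0, step))
    d1 = abs(c - dm_embed(c, 1, step))
    return 0 if d0 <= d1 else 1


# Hypothetical usage: embed four bits, disturb the coefficients, extract.
coeffs = [0.37, -1.42, 2.9, 0.05]
bits = [1, 0, 0, 1]
step = 0.5
marked = [dm_embed(c, b, step) for c, b in zip(coeffs, bits)]
noisy = [m + n for m, n in zip(marked, (0.1, -0.1, 0.1, -0.1))]
recovered = [dm_extract(c, step) for c in noisy]
```

    In this sketch a bit survives any disturbance smaller than a quarter of the step, so a larger step buys robustness at the cost of a larger change to the coefficient.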

  4. Attention to sound improves auditory reliability in audio-tactile spatial optimal integration

    Directory of Open Access Journals (Sweden)

    Tiziana Vercillo

    2015-05-01

    Full Text Available The role of attention in multisensory processing is still poorly understood. In particular, it is unclear whether directing attention toward a sensory cue dynamically reweights cue reliability during the integration of multiple sensory signals. In this study, we investigated the impact of attention on combining audio-tactile signals in an optimal fashion. We used the Maximum Likelihood Estimation (MLE) model to predict audio-tactile spatial localization on the body surface. We developed a new audio-tactile device composed of several small units, each consisting of a speaker and a tactile vibrator independently controllable by external software. We tested subjects in an attentional and a non-attentional condition. In the attention experiment, participants performed a dual-task paradigm: they were required to evaluate the duration of a sound while performing an audio-tactile spatial task. Three unisensory or multisensory stimuli (conflictual or non-conflictual sounds and vibrations arranged along the horizontal axis) were presented sequentially. In the primary task, subjects had to evaluate the position of the second stimulus (the probe) with respect to the others (a space bisection task). In the secondary task, they had to report occasional changes in the duration of the second auditory stimulus. In the non-attentional condition, participants had only to perform the primary task (space bisection). Our results showed enhanced auditory precision (and auditory weights) in the auditory attentional condition with respect to the control non-attentional condition. Interestingly, in both conditions the multisensory results are well predicted by the MLE model. The results of this study support the idea that modality-specific attention modulates multisensory integration.
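
    The standard MLE cue-combination rule that such studies test against can be written out as a short sketch: each cue is weighted by its reliability (inverse variance), and the combined estimate has a variance lower than either cue alone. The function name, argument names, and numbers below are hypothetical; the abstract does not report the values used.

```python
def mle_combine(est_a, sigma_a, est_t, sigma_t):
    """Reliability-weighted combination of an auditory and a tactile estimate.

    Returns (combined estimate, combined standard deviation, auditory weight).
    """
    rel_a = 1.0 / sigma_a ** 2           # reliability = inverse variance
    rel_t = 1.0 / sigma_t ** 2
    w_a = rel_a / (rel_a + rel_t)
    combined = w_a * est_a + (1.0 - w_a) * est_t
    sigma = (1.0 / (rel_a + rel_t)) ** 0.5
    return combined, sigma, w_a


# Hypothetical usage: a more precise auditory cue pulls the estimate toward it.
position, sigma, w_a = mle_combine(10.0, 1.0, 14.0, 2.0)
```

    Under this rule, an increase in auditory precision shows up directly as a larger auditory weight, which is the pattern the abstract reports for the attentional condition.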

  5. Audio/Visual Aids: A Study of the Effect of Audio/Visual Aids on the Comprehension Recall of Students.

    Science.gov (United States)

    Bavaro, Sandra

    A study investigated whether the use of audio/visual aids had an effect upon comprehension recall. Thirty fourth-grade students from an urban public school were randomly divided into two equal samples of 15. One group was given a story to read (print only), while the other group viewed a filmstrip of the same story, thereby utilizing audio/visual…

  6. Horatio Audio-Describes Shakespeare's "Hamlet": Blind and Low-Vision Theatre-Goers Evaluate an Unconventional Audio Description Strategy

    Science.gov (United States)

    Udo, J. P.; Acevedo, B.; Fels, D. I.

    2010-01-01

    Audio description (AD) has been introduced as one solution for providing people who are blind or have low vision with access to live theatre, film and television content. However, there is little research to inform the process, user preferences and presentation style. We present a study of a single live audio-described performance of Hart House…

  7. Subjective and Objective Assessment of Perceived Audio Quality of Current Digital Audio Broadcasting Systems and Web-Casting Applications

    NARCIS (Netherlands)

    Pocta, P.; Beerends, J.G.

    2015-01-01

    This paper investigates the impact of different audio codecs typically deployed in current digital audio broadcasting (DAB) systems and web-casting applications, which represent a main source of quality impairment in these systems and applications, on the quality perceived by the end user. Both subj

  8. Human Motion Synthesis Based on Independent Spatio-Temporal Feature Space

    Institute of Scientific and Technical Information of China (English)

    刘更代; 徐明亮; 张明敏

    2011-01-01

    Interactively generating stylistic human motions that satisfy user-defined constraints is still a challenge in the computer graphics community. Previous works failed to generate stylistic human motions with both geometric and timing features. As a solution to this problem, a motion model called Independent Spatio-temporal Feature Space is presented in this paper. This model encapsulates spatio-temporal features of motion style, which are extracted from a deformable motion model with an unsupervised machine learning technique, independent feature subspace analysis. The spatio-temporal features are described as several low-dimensional subspaces. Motion style editing is subsequently achieved by blending low-dimensional parameters or solving a space-time constrained optimization. Both geometric and timing features are considered in this work and a directly interactive interface is provided. The power and flexibility of this method are demonstrated by taking locomotion as an example. The experimental results show that this method is fast and effective for interactive motion style editing.

  9. On Building Immersive Audio Applications Using Robust Adaptive Beamforming and Joint Audio-Video Source Localization

    Directory of Open Access Journals (Sweden)

    Beracoechea JA

    2006-01-01

    Full Text Available This paper deals with some of the different problems, strategies, and solutions of building true immersive audio systems oriented to future communication applications. The aim is to build a system where the acoustic field of a chamber is recorded using a microphone array and then is reconstructed or rendered again, in a different chamber using loudspeaker array-based techniques. Our proposal explores the possibility of using recent robust adaptive beamforming techniques for effectively estimating the original sources of the emitting room. A joint audio-video localization method needed in the estimation process as well as in the rendering engine is also presented. The estimated source signal and the source localization information drive a wave field synthesis engine that renders the acoustic field again at the receiving chamber. The system performance is tested using MUSHRA-based subjective tests.

  10. Survey Musik und Medien 2012: Audio Media Usage in Germany - Audio Sources - Radio Traditionalists

    OpenAIRE

    Lepa, Steffen

    2013-01-01

    Where did everyday music come from in 2012? Audio Sources describe those distribution channels by means of which music is purchased, archived and made accessible. This includes physical recordings (CD, LP, MC etc.), electronic services in terms of downloading and streaming of digital music (iTunes, last.fm, Spotify etc.) as well as traditional radio reception and last but not least musical content on websites or digital storage media. Radio Traditionalists are represented in various age g...

  11. Scalar Quantization for Audio Data Coding

    CERN Document Server

    Kudryashov, Boris D; Oh, Eunmi L

    2008-01-01

    This paper is concerned with scalar quantization of transform coefficients in an audio codec. The generalized Gaussian distribution (GGD) is used as an approximation of the one-dimensional probability density function of transform coefficients obtained by a modulated lapped transform (MLT) or modified discrete cosine transform (MDCT) filterbank. The rationale of the model is provided in comparison with the theoretically achievable rate-distortion function. The rate-distortion function computed for the random sequence obtained from a real sequence of samples from a large database is compared with that computed for a random sequence obtained by a GGD random generator. A simple algorithm for constructing the Extended Zero Zone (EZZ) quantizer is proposed. Simulation results show that the EZZ quantizer yields a negligible loss in terms of coding efficiency compared to optimal scalar quantizers. Furthermore, we describe an adaptive version of the EZZ quantizer which works efficiently with low bitrate requirements for transmitting side...

  12. Audio visual information materials for risk communication

    International Nuclear Information System (INIS)

    Japan Nuclear Cycle Development Institute (JNC), Tokai Works set up the Risk Communication Study Team in January 2001 to promote mutual understanding between the local residents and JNC. The Team has studied risk communication from various viewpoints and developed new methods of public relations which are useful for the local residents' risk perception toward nuclear issues. We aim to develop more effective risk communication which promotes a better mutual understanding with the local residents, by providing risk information on the nuclear fuel facilities such as a Reprocessing Plant and other research and development facilities. We explain the development process of audio visual information materials which describe our actual activities and devices for risk management in nuclear fuel facilities, and our discussion of the effectiveness measurement. (author)

  13. Enhanced Audio LSB Steganography for Secure Communication

    Directory of Open Access Journals (Sweden)

    Muhammad Junaid Hussain

    2016-01-01

    Full Text Available The ease with which data can be sent across the globe via the Internet has made it an obvious choice of medium for online data transmission and communication. This salient trait, however, is constrained by related issues of privacy, the veracity of the information being exchanged over it, and the legitimacy of its sender, together with its availability when needed. Although cryptography is used to address the confidentiality concern, for many it is somewhat limited in scope because encrypted information is discernible. Further, restrictions imposed by various governments on their citizens' personal use of cryptography have also steered the research arena toward another discipline of information hiding called steganography, whose sole purpose is to make the information being exchanged inaudible. This research is focused on the evolution of a model-based secure LSB steganographic scheme for the digital audio wave file format to withstand passive attack by Warden Wendy.

  14. Particle Filtering on the Audio Localization Manifold

    CERN Document Server

    Ettinger, Evan

    2010-01-01

    We present a novel particle filtering algorithm for tracking a moving sound source using a microphone array. If there are N microphones in the array, we track all $\binom{N}{2}$ delays with a single particle filter over time. Since it is known that tracking in high dimensions is rife with difficulties, we instead integrate into our particle filter a model of the low dimensional manifold that these delays lie on. Our manifold model is based on work on modeling low dimensional manifolds via random projection trees [1]. In addition, we also introduce a new weighting scheme to our particle filtering algorithm based on recent advancements in online learning. We show that our novel TDOA tracking algorithm that integrates a manifold model can greatly outperform standard particle filters on this audio tracking task.
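
To see why the pairwise delays lie on a low dimensional manifold, it helps to compute them for a known geometry. The sketch below is an illustration only, assuming free-field propagation at 343 m/s and hypothetical microphone coordinates; it is not the tracking algorithm from the paper.

```python
from itertools import combinations
import math

SPEED_OF_SOUND = 343.0  # metres per second, assumed constant

def pairwise_tdoas(mics, source):
    """Return the time difference of arrival for every unordered microphone pair."""
    dists = [math.dist(m, source) for m in mics]
    return {
        (i, j): (dists[i] - dists[j]) / SPEED_OF_SOUND
        for i, j in combinations(range(len(mics)), 2)
    }

# Hypothetical square array of 4 microphones: 4 choose 2 = 6 delays.
mics = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
delays = pairwise_tdoas(mics, source=(3.0, 2.0))
```

With four microphones the delay vector has six entries, yet every entry is a function of the two source coordinates, so the set of reachable delay vectors is at most two-dimensional in this planar example.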

  15. A direct broadcast satellite-audio experiment

    Science.gov (United States)

    Vaisnys, Arvydas; Abbe, Brian; Motamedi, Masoud

    1992-03-01

    System studies have been carried out over the past three years at the Jet Propulsion Laboratory (JPL) on digital audio broadcasting (DAB) via satellite. The thrust of the work to date has been on designing power and bandwidth efficient systems capable of providing reliable service to fixed, mobile, and portable radios. It is very difficult to predict performance in an environment which produces random periods of signal blockage, such as encountered in mobile reception where a vehicle can quickly move from one type of terrain to another. For this reason, some signal blockage mitigation techniques were built into an experimental DAB system and a satellite experiment was conducted to obtain both qualitative and quantitative measures of performance in a range of reception environments. This paper presents results from the experiment and some conclusions on the effectiveness of these blockage mitigation techniques.

  16. Audio-visual speech cue combination.

    Directory of Open Access Journals (Sweden)

    Derek H Arnold

    Full Text Available BACKGROUND: Different sources of sensory information can interact, often shaping what we think we have seen or heard. This can enhance the precision of perceptual decisions relative to those made on the basis of a single source of information. From a computational perspective, there are multiple reasons why this might happen, and each predicts a different degree of enhanced precision. Relatively slight improvements can arise when perceptual decisions are made on the basis of multiple independent sensory estimates, as opposed to just one. These improvements can arise as a consequence of probability summation. Greater improvements can occur if two initially independent estimates are summated to form a single integrated code, especially if the summation is weighted in accordance with the variance associated with each independent estimate. This form of combination is often described as a Bayesian maximum likelihood estimate. Still greater improvements are possible if the two sources of information are encoded via a common physiological process. PRINCIPAL FINDINGS: Here we show that the provision of simultaneous audio and visual speech cues can result in substantial sensitivity improvements, relative to single sensory modality based decisions. The magnitude of the improvements is greater than can be predicted on the basis of either a Bayesian maximum likelihood estimate or a probability summation. CONCLUSION: Our data suggest that primary estimates of speech content are determined by a physiological process that takes input from both visual and auditory processing, resulting in greater sensitivity than would be possible if initially independent audio and visual estimates were formed and then subsequently combined.

  17. Semi-fragile Audio Watermarking Scheme Based on the Approximate Components Energy

    Institute of Scientific and Technical Information of China (English)

    宁超魁; 和红杰; 陈帆; 尹忠科

    2013-01-01

    In order to improve the security of semi-fragile audio watermarking, a semi-fragile audio watermarking algorithm based on the approximate components energy (ACE) is proposed. Every audio frame is divided into two sections. One section is used to extract the feature of the audio frame, and the other is used to hide the watermark data of another audio frame. The feature of an audio frame is the logarithm, to the base α, of the ACE of the audio section chosen from that frame. For each audio frame, the feature is encrypted and, based on the secret key, randomly embedded in the hybrid domain of another audio frame. The authenticity of an audio frame is determined by the inconsistency between its watermark and those of its neighboring audio frames. The paper also discusses the value of α from the viewpoint of the robustness and distribution of the audio frame feature. Experimental results demonstrate that the proposed scheme can localize the tampered regions accurately and resist collage attacks effectively.
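
The frame feature in this abstract, a base-α logarithm of the approximate-component energy, can be sketched as follows. The abstract does not name the wavelet or decomposition level, so a single-level Haar approximation is assumed here, and `ace_feature` is a hypothetical name; the section is also assumed to be non-silent so the energy is positive.

```python
import numpy as np

def ace_feature(section, alpha):
    """Base-alpha logarithm of the energy of the (assumed Haar) approximation coefficients."""
    x = np.asarray(section, dtype=float)
    x = x[: len(x) // 2 * 2]                      # drop a trailing odd sample
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # single-level Haar approximation
    energy = float(np.sum(approx ** 2))
    return np.log(energy) / np.log(alpha)         # change of base: log_alpha(energy)
```

A larger α compresses the feature range, so small energy changes from benign processing move the feature less; that is one way to read the abstract's trade-off between robustness and the distribution of the feature.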

  18. Optimized Audio Classification and Segmentation Algorithm by Using Ensemble Methods

    Directory of Open Access Journals (Sweden)

    Saadia Zahid

    2015-01-01

    Full Text Available Audio segmentation is a basis for multimedia content analysis, which is among the most important and widely used applications nowadays. An optimized audio classification and segmentation algorithm is presented in this paper that segments a superimposed audio stream on the basis of its content into four main audio types: pure-speech, music, environment sound, and silence. The proposed algorithm preserves important audio content and reduces the misclassification rate without using a large amount of training data; it handles noise and is suitable for real-time applications. Noise in an audio stream is segmented out as environment sound. A hybrid classification approach is used: bagged support vector machines (SVMs) with artificial neural networks (ANNs). The audio stream is first classified into speech and nonspeech segments by using bagged support vector machines; the nonspeech segment is further classified into music and environment sound by using artificial neural networks; and lastly, the speech segment is classified into silence and pure-speech segments by a rule-based classifier. Minimal data is used for training the classifier; ensemble methods are used for minimizing the misclassification rate, and approximately 98% accurate segments are obtained. A fast and efficient algorithm is designed that can be used with real-time multimedia applications.

  19. An inconclusive digital audio authenticity examination: a unique case.

    Science.gov (United States)

    Koenig, Bruce E; Lacey, Douglas S

    2012-01-01

    This case report sets forth an authenticity examination of 35 encrypted, proprietary-format digital audio files containing recorded telephone conversations between two codefendants in a criminal matter. The codefendant who recorded the conversations did so on a recording system he developed; additionally, he was both a forensic audio authenticity examiner, who had published and presented in the field, and was the head of a professional audio society's writing group for authenticity standards. The authors conducted the examination of the recordings following nine laboratory steps of the peer-reviewed and published 11-step digital audio authenticity protocol. Based considerably on the codefendant's direct involvement with the development of the encrypted audio format, his experience in the field of forensic audio authenticity analysis, and the ease with which the audio files could be accessed, converted, edited in the gap areas, and reconstructed in such a way that the processes were undetected, the authors concluded that the recordings could not be scientifically authenticated through accepted forensic practices.

  20. Portable audio electronics for impedance-based measurements in microfluidics

    International Nuclear Information System (INIS)

    We demonstrate the use of audio electronics-based signals to perform on-chip electrochemical measurements. Cell phones and portable music players are examples of consumer electronics that are easily operated and are ubiquitous worldwide. Audio output (play) and input (record) signals are voltage based and contain frequency and amplitude information. A cell phone, laptop soundcard and two compact audio players are compared with respect to frequency response; the laptop soundcard provides the most uniform frequency response, while the cell phone performance is found to be insufficient. The audio signals in the common portable music players and laptop soundcard operate in the range of 20 Hz to 20 kHz and are found to be applicable, as voltage input and output signals, to impedance-based electrochemical measurements in microfluidic systems. Validated impedance-based measurements of concentration (0.1–50 mM), flow rate (2–120 µL min−1) and particle detection (32 µm diameter) are demonstrated. The prevailing, lossless, wave audio file format is found to be suitable for data transmission to and from external sources, such as a centralized lab, and the cost of all hardware (in addition to audio devices) is ∼10 USD. The utility demonstrated here, in combination with the ubiquitous nature of portable audio electronics, presents new opportunities for impedance-based measurements in portable microfluidic systems. (technical note)

  1. On the relevance of spectral features for instrument classification

    OpenAIRE

    Nielsen, Andreas Brinch; Sigurdsson, Sigurdur; Hansen, Lars Kai; Arenas-García, Jerónimo

    2007-01-01

    Automatic knowledge extraction from music signals is a key component for most music organization and music information retrieval systems. In this paper, we consider the problem of instrument modelling and instrument classification from the rough audio data. Existing systems for automatic instrument classification operate normally on a relatively large number of features, from which those related to the spectrum of the audio signal are particularly relevant. In this paper, we confront two diff...

  2. Mixed-Signal Architectures for High-Efficiency and Low-Distortion Digital Audio Processing and Power Amplification

    Directory of Open Access Journals (Sweden)

    Pierangelo Terreni

    2010-01-01

    Full Text Available The paper addresses the algorithmic and architectural design of digital input power audio amplifiers. A modelling platform, based on a meet-in-the-middle approach between top-down and bottom-up design strategies, allows a fast but still accurate exploration of the mixed-signal design space. Different amplifier architectures are configured and compared to find optimal trade-offs among different cost-functions: low distortion, high efficiency, low circuit complexity and low sensitivity to parameter changes. A novel amplifier architecture is derived; its prototype implements digital processing IP macrocells (oversampler, interpolating filter, PWM cross-point deriver, noise shaper, multilevel PWM modulator, dead time compensator) on a single low-complexity FPGA while off-chip components are used only for the power output stage (LC filter and power MOS bridge); no heatsink is required. The resulting digital input amplifier features a power efficiency higher than 90% and a total harmonic distortion down to 0.13% at power levels of tens of Watts. Discussions towards the full-silicon integration of the mixed-signal amplifier in embedded devices, using BCD technology and targeting power levels of few Watts, are also reported.

  3. Towards Structural Analysis of Audio Recordings in the Presence of Musical Variations

    Directory of Open Access Journals (Sweden)

    Müller Meinard

    2007-01-01

    Full Text Available One major goal of structural analysis of an audio recording is to automatically extract the repetitive structure or, more generally, the musical form of the underlying piece of music. Recent approaches to this problem work well for music where the repetitions largely agree with respect to instrumentation and tempo, as is typically the case for popular music. For other classes of music such as Western classical music, however, musically similar audio segments may exhibit significant variations in parameters such as dynamics, timbre, execution of note groups, modulation, articulation, and tempo progression. In this paper, we propose a robust and efficient algorithm for audio structure analysis, which identifies musically similar segments even in the presence of large variations in these parameters. To account for such variations, our main idea is to incorporate invariance at various levels simultaneously: we design a new type of statistical features to absorb microvariations, introduce an enhanced local distance measure to account for local variations, and describe a new strategy for structure extraction that can cope with the global variations. Our experimental results with classical and popular music show that our algorithm performs successfully even in the presence of significant musical variations.

  4. A Perceptual Model for Sinusoidal Audio Coding Based on Spectral Integration

    Directory of Open Access Journals (Sweden)

    Jensen Søren Holdt

    2005-01-01

    Full Text Available Psychoacoustical models have been used extensively within audio coding applications over the past decades. Recently, parametric coding techniques have been applied to general audio and this has created the need for a psychoacoustical model that is specifically suited for sinusoidal modelling of audio signals. In this paper, we present a new perceptual model that predicts masked thresholds for sinusoidal distortions. The model relies on signal detection theory and incorporates more recent insights about spectral and temporal integration in auditory masking. As a consequence, the model is able to predict the distortion detectability. In fact, the distortion detectability defines a (perceptually relevant) norm on the underlying signal space which is beneficial for optimisation algorithms such as rate-distortion optimisation or linear predictive coding. We evaluate the merits of the model by combining it with a sinusoidal extraction method and compare the results with those obtained with the ISO MPEG-1 Layer I-II recommended model. Listening tests show a clear preference for the new model. More specifically, the model presented here leads to a reduction of more than 20% in terms of number of sinusoids needed to represent signals at a given quality level.

  5. Implementation of ETAS (Embedding Text in Audio Signal) Model To Ensure Secrecy

    Directory of Open Access Journals (Sweden)

    K. GEETHA

    2010-07-01

    Full Text Available Steganography is the art of hiding information, and it is evolving as a new secret communication technology. For a long time, information hiding was done using plain text, still images, video and IP datagrams. Embedding secret messages in audio signals in digital format is now the area of focus. Numerous steganography techniques exist for hiding information in an audio medium. In this work we propose a new model, ETAS (Embedding Text in Audio Signal), that embeds the text like the existing systems but with encryption, gaining the full advantages of cryptography. Using steganography it is possible to conceal the very existence of the original text. The results obtained from the proposed model are compared with other existing techniques and shown to be efficient for textual messages of size beyond 12 KB, as the size of the embedded text is approximately the same as that of the encrypted text. This emphasizes the fact that we are able to ensure secrecy without the additional cost of extra space consumed for the text to be communicated.

  6. A Perceptual Model for Sinusoidal Audio Coding Based on Spectral Integration

    Science.gov (United States)

    van de Par, Steven; Kohlrausch, Armin; Heusdens, Richard; Jensen, Jesper; Jensen, Søren Holdt

    2005-12-01

    Psychoacoustical models have been used extensively within audio coding applications over the past decades. Recently, parametric coding techniques have been applied to general audio and this has created the need for a psychoacoustical model that is specifically suited for sinusoidal modelling of audio signals. In this paper, we present a new perceptual model that predicts masked thresholds for sinusoidal distortions. The model relies on signal detection theory and incorporates more recent insights about spectral and temporal integration in auditory masking. As a consequence, the model is able to predict the distortion detectability. In fact, the distortion detectability defines a (perceptually relevant) norm on the underlying signal space which is beneficial for optimisation algorithms such as rate-distortion optimisation or linear predictive coding. We evaluate the merits of the model by combining it with a sinusoidal extraction method and compare the results with those obtained with the ISO MPEG-1 Layer I-II recommended model. Listening tests show a clear preference for the new model. More specifically, the model presented here leads to a reduction of more than 20% in terms of number of sinusoids needed to represent signals at a given quality level.

  7. An audio-magnetotelluric investigation in Terceira Island (Azores)

    Science.gov (United States)

    Monteiro Santos, Fernando A.; Trota, António; Soares, António; Luzio, Rafael; Lourenço, Nuno; Matos, Liliana; Almeida, Eugénio; Gaspar, João L.; Miranda, Jorge M.

    2006-08-01

    Ten audio-magnetotelluric soundings have been carried out along a profile crossing the Serra do Cume caldera in the eastern part of the Terceira Island (Azores). The main objectives of this investigation were to detect geoelectrical features related with tectonic structures and to characterize regional hydrological and hydrothermal aspects mainly those related to geothermal fluid dynamics. Three-dimensional numerical investigation showed that the data acquired at periods shorter than 1 s are not significantly affected by ocean effect. The data was analysed using the Smith's decomposition method in order to investigate possible distortions caused by superficial structures and to estimate a global regional strike. The results suggest that in general the soundings were not distorted. A regional N55°W strike was chosen for the two-dimensional data inversion. The low-resistivity zones (10-30 ohm-m) displayed in the central part of the 2-D geoelectrical model have been interpreted as caused by hydrothermal circulation. The low-resistivity anomalies at the ends of the profile might be attributed to alteration zones with interaction of seawater intrusion. High-resistivity (> 300 ohm-m) values have been related with less permeable zones in the SW of Cinco Picos and Guilherme Moniz caldera walls.

  8. A New Steganographic Method for Embedded Image In Audio File

    Directory of Open Access Journals (Sweden)

    Mohammed S. Altaei

    2012-04-01

    Full Text Available Because secure transaction of information is increasing day by day, steganography has become very important and uses modern strategies. Steganography is a strategy in which the required information is concealed in other information such that the second information does not change significantly and appears the same as the original. This work presents a new approach for concealing an encrypted mobile image in an audio file. The proposed method replaces the two LSBs of each selected byte in the audio file, and these bytes are chosen at random locations. It becomes very difficult for an intruder to guess that an image is hidden in the audio.
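
The two-LSB replacement at randomly chosen byte positions can be sketched as below. This is a hedged illustration rather than the paper's method: it assumes the cover is a raw byte buffer of audio sample data (no file header handling), uses a seeded pseudo-random generator as a stand-in for the unspecified position selection, and leaves out the image encryption step. The names `embed_2lsb` and `extract_2lsb` are hypothetical.

```python
import random

def embed_2lsb(cover: bytes, payload: bytes, key: int) -> bytearray:
    """Hide payload in the two least significant bits of key-selected cover bytes."""
    # Split every payload byte into four 2-bit chunks, most significant first.
    chunks = [(b >> shift) & 0b11 for b in payload for shift in (6, 4, 2, 0)]
    if len(chunks) > len(cover):
        raise ValueError("cover too small for payload")
    positions = random.Random(key).sample(range(len(cover)), len(chunks))
    stego = bytearray(cover)
    for pos, chunk in zip(positions, chunks):
        stego[pos] = (stego[pos] & 0b11111100) | chunk
    return stego

def extract_2lsb(stego: bytes, n_bytes: int, key: int) -> bytes:
    """Recover n_bytes of payload using the same key to regenerate the positions."""
    positions = random.Random(key).sample(range(len(stego)), n_bytes * 4)
    chunks = [stego[p] & 0b11 for p in positions]
    out = bytearray()
    for i in range(0, len(chunks), 4):
        a, b, c, d = chunks[i:i + 4]
        out.append((a << 6) | (b << 4) | (c << 2) | d)
    return bytes(out)
```

Each modified byte changes by at most 3, and without the key an observer does not know which bytes carry data. The receiver must know both the key and the payload length.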

  9. Robust message authentication code algorithm for digital audio recordings

    Science.gov (United States)

    Zmudzinski, Sascha; Steinebach, Martin

    2007-02-01

    Current systems and protocols for integrity and authenticity verification of media data do not distinguish between legitimate signal transformation and malicious tampering that manipulates the content. Furthermore, they usually provide no localization or assessment of the relevance of such manipulations with respect to human perception or semantics. We present an algorithm for a robust message authentication code (RMAC) to verify the integrity of audio recordings by means of robust audio fingerprinting and robust perceptual hashing. Experimental results show that the proposed algorithm provides both a high level of distinction between perceptually different audio data and a high robustness against signal transformations that do not change the perceived information.

  10. Musical examination to bridge audio data and sheet music

    Science.gov (United States)

    Pan, Xunyu; Cross, Timothy J.; Xiao, Liangliang; Hei, Xiali

    2015-03-01

    The digitalization of audio is commonly implemented for the purpose of convenient storage and transmission of music and songs in today's digital age. Analyzing digital audio for an insightful look at a specific musical characteristic, however, can be quite challenging for various types of applications. Many existing musical analysis techniques can examine a particular piece of audio data. For example, the frequency of digital sound can be easily read and identified at a specific section in an audio file. Based on this information, we could determine the musical note being played at that instant, but what if you want to see a list of all the notes played in a song? While most existing methods help to provide information about a single piece of the audio data at a time, few of them can analyze the available audio file on a larger scale. The research conducted in this work considers how to further utilize the examination of audio data by storing more information from the original audio file. In practice, we develop a novel musical analysis system Musicians Aid to process musical representation and examination of audio data. Musicians Aid solves the previous problem by storing and analyzing the audio information as it reads it rather than tossing it aside. The system can provide professional musicians with an insightful look at the music they created and advance their understanding of their work. Amateur musicians could also benefit from using it solely for the purpose of obtaining feedback about a song they were attempting to play. By comparing our system's interpretation of traditional sheet music with their own playing, a musician could ensure what they played was correct. More specifically, the system could show them exactly where they went wrong and how to adjust their mistakes. In addition, the application could be extended over the Internet to allow users to play music with one another and then review the audio data they produced. This would be particularly
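
The step this abstract mentions, going from a measured frequency to the musical note being played, is commonly done with the equal-temperament relation between frequency and MIDI note number. The sketch below assumes A4 = 440 Hz and twelve-tone equal temperament; it is not taken from the Musicians Aid system, and `freq_to_note` is a hypothetical name.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def freq_to_note(freq_hz, a4=440.0):
    """Map a frequency to the nearest equal-tempered note name with octave."""
    # MIDI note 69 is A4; each semitone is a factor of 2 ** (1 / 12) in frequency.
    midi = round(69 + 12 * math.log2(freq_hz / a4))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"
```

Applying such a mapping to the dominant frequency of each short analysis window, and storing the results instead of discarding them, yields the kind of note list for a whole song that the abstract describes.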

  11. Lattice Vector Quantization Applied to Speech and Audio Coding

    Institute of Scientific and Technical Information of China (English)

    Minjie Xie

    2012-01-01

    Lattice vector quantization (LVQ) has been used for real-time speech and audio coding systems. Compared with conventional vector quantization, LVQ has two main advantages: It has a simple and fast encoding process, and it significantly reduces the amount of memory required. Therefore, LVQ is suitable for use in low-complexity speech and audio coding. In this paper, we describe the basic concepts of LVQ and its advantages over conventional vector quantization. We also describe some LVQ techniques that have been used in speech and audio coding standards of international standards developing organizations (SDOs).
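
The two advantages named in this abstract, fast encoding and low memory, follow from the fact that the nearest lattice point can be computed by rounding rather than by searching a stored codebook. The sketch below uses the D_n lattice (integer vectors with an even coordinate sum) purely as an assumed example; the abstract does not say which lattices the standards use, and `nearest_dn` is a hypothetical name.

```python
import numpy as np

def nearest_dn(x):
    """Return the point of the D_n lattice (integer vectors with even sum) closest to x."""
    x = np.asarray(x, dtype=float)
    rounded = np.round(x)
    if int(rounded.sum()) % 2 == 0:
        return rounded
    # Odd sum: re-round the coordinate with the largest rounding error the other way.
    err = x - rounded
    k = int(np.argmax(np.abs(err)))
    fixed = rounded.copy()
    fixed[k] += 1.0 if err[k] >= 0 else -1.0
    return fixed
```

No codebook table is stored and the cost is linear in the vector dimension, whereas a conventional vector quantizer compares the input against every stored codeword.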

  12. Multi Carrier Modulation Audio Power Amplifier with Programmable Logic

    DEFF Research Database (Denmark)

    Christiansen, Theis; Andersen, Toke Meyer; Knott, Arnold;

    2009-01-01

    for performance and out of band spectral amplitudes. The basic principle in MCM is to use programmable logic to combine two or more Pulse Width Modulated (PWM) audio signals at different switching frequencies. In this way the out of band spectrum will be lowered compared with conventional class D amplifiers...... frequencies entering the audio band. Still, the MS MCM topology with two carrier signals shows a 6 dB reduction of the switching frequency amplitudes as well as THD across the audio band below 1% at 55 W output power open loop....

  13. Aurally Aided Visual Search Performance Comparing Virtual Audio Systems

    DEFF Research Database (Denmark)

    Larsen, Camilla Horne; Lauritsen, David Skødt; Larsen, Jacob Junker;

    2014-01-01

    Due to increased computational power, reproducing binaural hearing in real-time applications, through usage of head-related transfer functions (HRTFs), is now possible. This paper addresses the differences in aurally-aided visual search performance between an HRTF-enhanced audio system (3D) and an...... with white dots. The results indicate that 3D audio yields faster search latencies than panning audio, especially with larger amounts of distractors. The applications of this research could fit virtual environments such as video games or virtual simulations....

  14. Aurally Aided Visual Search Performance Comparing Virtual Audio Systems

    DEFF Research Database (Denmark)

    Larsen, Camilla Horne; Lauritsen, David Skødt; Larsen, Jacob Junker;

    2014-01-01

    Due to increased computational power, reproducing binaural hearing in real-time applications, through usage of head-related transfer functions (HRTFs), is now possible. This paper addresses the differences in aurally-aided visual search performance between an HRTF-enhanced audio system (3D...... with white dots. The results indicate that 3D audio yields faster search latencies than panning audio, especially with larger amounts of distractors. The applications of this research could fit virtual environments such as video games or virtual simulations....

  15. Applications of Wavelets in 3-D Audio Simulation

    Institute of Scientific and Technical Information of China (English)

    2000-01-01

    Wavelets have recently been used as a powerful tool in signal processing and function approximation. This paper presents the application of wavelets to two key problems in 3-D audio simulation. First, we employ the discrete wavelet transform (DWT) combined with vector quantisation (VQ) to compress audio data in order to reduce redundant data storage and transmission times. Second, we use wavelets as the activation functions in neural networks, called feed-forward wavelet networks, to approximate auditory localisation cues (head-related transfer functions (HRTFs) are used here). The experimental results demonstrate that the application of wavelets is more efficient and useful in 3-D audio simulation.

  16. Audio Effects Based on Biorthogonal Time-Varying Frequency Warping

    Directory of Open Access Journals (Sweden)

    Cavaliere Sergio

    2001-01-01

    We illustrate the mathematical background and musical use of a class of audio effects based on frequency warping. These effects alter the frequency content of a signal via spectral mapping. They can be implemented in dispersive tapped delay lines based on a chain of all-pass filters. In a homogeneous line with first-order all-pass sections, the signal formed by the output samples at a given time is related to the input via the Laguerre transform. However, most musical signals require a time-varying frequency modification in order to be properly processed. Vibrato in musical instruments or voice intonation in the case of vocal sounds may be modeled as small and slow pitch variations. Simulation of these effects requires techniques for time-varying pitch and/or brightness modification that are very useful for sound processing. The basis for time-varying frequency warping is a time-varying version of the Laguerre transformation. The corresponding implementation structure is obtained as a dispersive tapped delay line, where each of the frequency-dependent delay elements has its own phase response. Thus, time-varying warping results in a space-varying, inhomogeneous propagation structure. We show that time-varying frequency warping is associated with an expansion over biorthogonal sets generalizing the discrete Laguerre basis. Slow time-varying characteristics lead to slowly varying parameter sequences. The corresponding sound transformation does not suffer from the discontinuities typical of delay lines based on unit delays.
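
    The dispersive tapped delay line described in this abstract can be sketched as a chain of identical first-order all-pass sections A(z) = (z^-1 - b) / (1 - b z^-1). This is a minimal illustration of the homogeneous case only, under the assumption of a single constant warping parameter b; it omits the Laguerre normalisation and the time-varying extension, and the function names are hypothetical rather than taken from the paper.

```python
def allpass(x, b):
    """First-order all-pass section: y[n] = -b*x[n] + x[n-1] + b*y[n-1]."""
    y = []
    x_prev = 0.0
    y_prev = 0.0
    for xn in x:
        yn = -b * xn + x_prev + b * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y


def dispersive_delay_line(x, b, n_sections):
    """Return the signal observed at every tap of a chain of all-pass sections.

    taps[0] is the input; taps[k] is the output of the k-th section.
    With b = 0 each section reduces to an ordinary unit delay.
    """
    taps = [list(x)]
    for _ in range(n_sections):
        taps.append(allpass(taps[-1], b))
    return taps


impulse = [1.0] + [0.0] * 199
taps = dispersive_delay_line(impulse, 0.5, 3)
# Each section changes phase only, so the energy at every tap stays close to 1.
print([round(sum(v * v for v in tap), 6) for tap in taps])
```

    In the time-varying structure the abstract describes, each section would carry its own parameter instead of the shared b used here.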

  17. The effect of reverberation on personal audio devices.

    Science.gov (United States)

    Simón-Gálvez, Marcos F; Elliott, Stephen J; Cheer, Jordan

    2014-05-01

    Personal audio refers to the creation of a listening zone within which a person, or a group of people, hears a given sound program, without being annoyed by other sound programs being reproduced in the same space. Generally, these different sound zones are created by arrays of loudspeakers. Although these devices have the capacity to achieve different sound zones in an anechoic environment, they are ultimately used in normal rooms, which are reverberant environments. At high frequencies, reflections from the room surfaces create a diffuse pressure component which is uniform throughout the room volume and thus decreases the directional characteristics of the device. This paper shows how the reverberant performance of an array can be modeled, knowing the anechoic performance of the radiator and the acoustic characteristics of the room. A formulation is presented whose results are compared to practical measurements in reverberant environments. Due to reflections from the room surfaces, pressure variations are introduced in the transfer responses of the array. This aspect is assessed by means of simulations where random noise is added to create uncertainties, and by performing measurements in a real environment. These results show how the robustness of an array is increased when it is designed for use in a reverberant environment. PMID:24815249

  18. Audio Effects Based on Biorthogonal Time-Varying Frequency Warping

    Science.gov (United States)

    Evangelista, Gianpaolo; Cavaliere, Sergio

    2001-12-01

    We illustrate the mathematical background and musical use of a class of audio effects based on frequency warping. These effects alter the frequency content of a signal via spectral mapping. They can be implemented in dispersive tapped delay lines based on a chain of all-pass filters. In a homogeneous line with first-order all-pass sections, the signal formed by the output samples at a given time is related to the input via the Laguerre transform. However, most musical signals require a time-varying frequency modification in order to be properly processed. Vibrato in musical instruments or voice intonation in the case of vocal sounds may be modeled as small and slow pitch variations. Simulation of these effects requires techniques for time-varying pitch and/or brightness modification that are very useful for sound processing. The basis for time-varying frequency warping is a time-varying version of the Laguerre transformation. The corresponding implementation structure is obtained as a dispersive tapped delay line, where each of the frequency-dependent delay elements has its own phase response. Thus, time-varying warping results in a space-varying, inhomogeneous propagation structure. We show that time-varying frequency warping is associated with an expansion over biorthogonal sets generalizing the discrete Laguerre basis. Slow time-varying characteristics lead to slowly varying parameter sequences. The corresponding sound transformation does not suffer from the discontinuities typical of delay lines based on unit delays.

  19. TNO at TRECVID 2008, Combining Audio and Video Fingerprinting for Robust Copy Detection

    NARCIS (Netherlands)

    Doets, P.J.; Eendebak, P.T.; Ranguelova, E.; Kraaij, W.

    2009-01-01

    TNO has evaluated a baseline audio and a video fingerprinting system based on robust hashing for the TRECVID 2008 copy detection task. We participated in the audio, the video and the combined audio-video copy detection task. The audio fingerprinting implementation clearly outperformed the video fing

  20. FIRST ULTRAVIOLET REFLECTANCE SPECTRA OF PLUTO AND CHARON BY THE HUBBLE SPACE TELESCOPE COSMIC ORIGINS SPECTROGRAPH: DETECTION OF ABSORPTION FEATURES AND EVIDENCE FOR TEMPORAL CHANGE

    International Nuclear Information System (INIS)

    We have observed the mid-UV spectra of both Pluto and its large satellite, Charon, at two rotational epochs using the Hubble Space Telescope (HST) Cosmic Origins Spectrograph (COS) in 2010. These are the first HST/COS measurements of Pluto and Charon. Here we describe the observations and our reduction of them, and present the albedo spectra, average mid-UV albedos, and albedo slopes we derive from these data. These data reveal evidence for a strong absorption feature in the mid-UV spectrum of Pluto; evidence for temporal change in Pluto's spectrum since the 1990s is reported, and indirect evidence for a near-UV spectral absorption on Charon is also reported.

  1. Discussion on the feature of strong earthquake: Orderly distribution in time, space and intensity before the Western Kunlun Mountain Pass M=8.1 earthquake

    Institute of Scientific and Technical Information of China (English)

    张晓东; 张永仙; 吕梅梅; 余素荣

    2003-01-01

    In this paper, the orderly distribution of strong earthquakes in time, space and intensity before the Western Kunlun Mountain Pass M=8.1 earthquake is preliminarily studied. Modulation and triggering factors such as the earth's rotation and earth tides are analyzed. The results show that giant earthquakes of magnitude greater than 8 occurred about every 24 years, and earthquakes of magnitude greater than 7 about every 7 years, in the Chinese mainland. The Western Kunlun Mountain M=8.1 earthquake occurred exactly at the expected time. The spatial distribution shows approximately equal distances between each two swarms. The earth's rotation, earth tide, sun tide and the sun's magnetic field have played a role of modulation and triggering in the intensity. Finally, the conditions for earthquake generation and occurrence are also discussed.

  2. Audio CAPTCHA for SIP-Based VoIP

    Science.gov (United States)

    Soupionis, Yannis; Tountas, George; Gritzalis, Dimitris

    Voice over IP (VoIP) introduces new ways of communication, while utilizing existing data networks to provide inexpensive voice communications worldwide as a promising alternative to the traditional PSTN telephony. SPam over Internet Telephony (SPIT) is one potential source of future annoyance in VoIP. A common way to launch a SPIT attack is the use of an automated procedure (bot), which generates calls and produces audio advertisements. In this paper, our goal is to design appropriate CAPTCHA to fight such bots. We focus on and develop audio CAPTCHA, as the audio format is more suitable for VoIP environments and we implement it in a SIP-based VoIP environment. Furthermore, we suggest and evaluate the specific attributes that audio CAPTCHA should incorporate in order to be effective, and test it against an open source bot implementation.

  3. Effectiveness of 3-D audio for warnings in the cockpit

    NARCIS (Netherlands)

    Oving, A.B.; Veltman, J.A.; Bronkhorst, A.W.

    2004-01-01

    Two flight simulator experiments showed that pilots responded faster to the auditory warnings of the TCAS system in the civil cockpit when these warnings were presented with 3D audio than when they were presented with mono sound.

  4. Can audio recording improve patients' recall of outpatient consultations?

    DEFF Research Database (Denmark)

    Wolderslund, Maiken; Kofoed, Poul-Erik; Axboe, Mette;

    Introduction: In order to give patients the possibility to listen to their consultation again, we have designed a system which gives the patients access to digital audio recordings of their consultations. An Interactive Voice Response platform enables the audio recording and gives the patients access...... to replay their consultation. The intervention is evaluated in a randomised controlled trial with 5,460 patients in order to determine whether providing patients with a digital audio recording of the consultation affects the patients' overall perception of their consultation. In addition to this primary...... objective we want to investigate if replay of the consultations improves the patients’ recall of the information given. Methods: Interviews are carried out with 40 patients whose consultations have been audio recorded. Patients are divided into two groups, those who have listened to their consultation...

  5. Reception of infrasound and audio current in derma nerves

    Institute of Scientific and Technical Information of China (English)

    Jianwen Li; Ziyu Li; Xuezong Ma

    2010-01-01

    Determining the frequency range of derma nerve that responds to audio current is fundamental for the development of skin-hearing technology. Previous studies have shown that the range of derma nerve responding to audio current is 15-15 000 Hz, because audio amplification is not separated from the step-up transformer. Therefore, the present study used a signal generator which directly drives plane electrodes, simplified the original experimental environment for skin-hearing, measured lower limit voltage of frequency for derma nerve receiving pulse current signals, and revealed that the frequency range of human derma nerve response was as wide as 0.1-30 000 Hz. Results demonstrate that human derma nerve receives audio signals and infrasound within a wide frequency range.

  6. Proper Use of Audio-Visual Aids: Essential for Educators.

    Science.gov (United States)

    Dejardin, Conrad

    1989-01-01

    Criticizes educators as the worst users of audio-visual aids and among the worst public speakers. Offers guidelines for the proper use of an overhead projector and the development of transparencies. (DMM)

  7. A Novel Digital Audio Watermarking Scheme in the Wavelet Domain

    Institute of Scientific and Technical Information of China (English)

    WANG Xiang-yang; YANG Hong-ying; ZHAO Hong

    2005-01-01

    We present a novel quantization-based digital audio watermarking scheme in the wavelet domain. By quantizing a host audio's wavelet coefficients (Integer Lifting Wavelet Transform) and utilizing the characteristics of the human auditory system (HAS), the gray image is embedded using our watermarking method. Experimental results show that the proposed watermarking scheme is inaudible and robust against various signal processing operations such as noise addition, lossy compression, low-pass filtering, re-sampling, and re-quantization.
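
    How a bit can be carried by quantizing a coefficient may be easier to see in a small sketch. The abstract does not give the embedding rule, so the following uses generic quantization index modulation as an assumed stand-in: each bit selects one of two interleaved quantizer grids. The coefficient values, the step size and the function names are hypothetical, and no wavelet transform or auditory model is included.

```python
import math


def embed_bit(coeff, bit, step):
    """Move coeff to the nearest point of the grid selected by bit.

    Bit 0 uses multiples of step; bit 1 uses the same grid shifted by step/2.
    """
    offset = bit * step / 2.0
    return math.floor((coeff - offset) / step + 0.5) * step + offset


def extract_bit(coeff, step):
    """Decide which of the two grids lies closer to the received coefficient."""
    d0 = abs(coeff - embed_bit(coeff, 0, step))
    d1 = abs(coeff - embed_bit(coeff, 1, step))
    return 0 if d0 <= d1 else 1


coeffs = [3.7, -12.2, 8.05, 0.4]   # hypothetical wavelet coefficients
bits = [1, 0, 1, 1]                # hypothetical watermark bits
step = 4.0                         # hypothetical quantization step

marked = [embed_bit(c, b, step) for c, b in zip(coeffs, bits)]
recovered = [extract_bit(c, step) for c in marked]
print(marked, recovered)
```

    In this sketch the embedding changes each coefficient by at most step/2, and the bits survive any disturbance smaller than step/4, which is the usual trade-off between inaudibility and robustness that the step size controls.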

  8. Design guidelines for audio presentation of graphs and tables

    OpenAIRE

    Brown, L.M.; Brewster, S.A.; Ramloll, S.A.; Burton, R.; Riedel, B.

    2003-01-01

    Audio can be used to make visualisations accessible to blind and visually impaired people. The MultiVis Project has carried out research into suitable methods for presenting graphs and tables to blind people through the use of both speech and non-speech audio. This paper presents guidelines extracted from this research. These guidelines will enable designers to implement visualisation systems for blind and visually impaired users, and will provide a framework for researchers wishing to invest...

  9. Acoustic Neurinoma With Bilateral Audio Logical Complication; a Case Report

    OpenAIRE

    Saeed Farahani

    1998-01-01

    Many of the CP angle tumors are acoustic neuroma, vestibular schwannoma or 8th nerve tumor. This kind of tumor is histologically benign. Large ones can cause neurological symptoms such as cerebellar imbalance, edema and cranial nerve dysfunction. Acoustic neuroma is mostly unilateral, and audiological findings manifest a unilateral hearing loss. However, large tumors can lead to bilateral audiological symptoms, which can affect the findings of hearing assessment. Here, a 31 year-old pa...

  10. Handreiking multimediaformaten: naar optimale toegang van audio, video en afbeeldingen [Guide to multimedia formats: toward optimal access to audio, video and images]

    OpenAIRE

    Folmer, E.J.A.; Wams, N.; Knubben, B.

    2010-01-01

    Multimedia is increasingly part of the way we express ourselves every day; audio and video now make up the vast majority of internet traffic. In doing so we use all kinds of formats, sometimes without giving it much thought. This guide provides background on the choices you can make when making video and audio available. Open standards are (still) less common here than closed standards, but they are on the rise and moreover contribute...

  11. Audio Analogue的两件旗舰产品 [Two flagship products from Audio Analogue]

    Institute of Scientific and Technical Information of China (English)

    马龙辉

    2003-01-01

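    A few years ago, the Italian company Audio Analogue produced an integrated amplifier named Puccini. Owing to the amplifier's good quality and a large-scale promotional campaign, its name became almost a household word in the audio industry, to the point that people knew of Puccini but not of Audio Analogue.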

  12. Efficiency Optimization in Class-D Audio Amplifiers

    DEFF Research Database (Denmark)

    Yamauchi, Akira; Knott, Arnold; Jørgensen, Ivan Harald Holger;

    2015-01-01

    This paper presents a new power efficiency optimization routine for designing Class-D audio amplifiers. The proposed optimization procedure finds design parameters for the power stage and the output filter, and the optimum switching frequency such that the weighted power losses are minimized under...... lead to around 30 % of efficiency improvement at 1.3 W output power without significant effects on both audio performance and the efficiency at high power levels....

  13. CAVA (human Communication: an Audio-Visual Archive)

    OpenAIRE

    Mahon, M. S.

    2009-01-01

    In order to investigate human communication and interaction, researchers need hours of audio-visual data, sometimes recorded over periods of months or years. The process of collecting, cataloguing and transcribing such valuable data is time-consuming and expensive. Once it is collected and ready to use, it makes sense to get the maximum value from it by reusing it and sharing it among the research community. But unlike highly-controlled experimental data, natural audio-visual data tends t...

  14. Ferrite bead effect on Class-D amplifier audio quality

    OpenAIRE

    Haddad, Kevin El; Mrad, Roberto; Morel, Florent; Pillonnet, Gael; Vollaire, Christian; Nagari, Angelo

    2014-01-01

    This paper studies the effect of ferrite beads on the audio quality of Class-D audio amplifiers. The latter is a switching circuit which creates high-frequency harmonics. Generally, a filter is used at the amplifier output for the sake of electromagnetic compatibility (EMC). Often, in integrated solutions, this filter contains ferrite beads, which are magnetic components and present nonlinear behavior. Time domain measurements and their equivalence in frequency ...

  15. IELTS speaking instruction through audio/voice conferencing

    Directory of Open Access Journals (Sweden)

    Hamed Ghaemi

    2012-02-01

    The current study aims at investigating the impact of audio/voice conferencing, as a new approach to teaching speaking, on the speaking performance and/or speaking band score of IELTS candidates. Experimental group subjects participated in an audio conferencing class, while those of the control group attended a traditional IELTS Speaking class. At the end of the study, all subjects participated in an IELTS examination held on November fourth in Tehran, Iran. To compare the group means for the study, an independent t-test analysis was employed. The difference between the experimental and control groups was considered to be statistically significant (P<0.01). That is, the candidates in the experimental group outperformed the ones in the control group in IELTS Speaking test scores.

  16. A Generic Method of Recognizing Specific Type Audio Stream

    Institute of Scientific and Technical Information of China (English)

    罗森林; 李金玉; 潘丽敏

    2011-01-01

    To meet the security demand of audio information, this paper proposes a generic method, based on Mel-frequency cepstral coefficients (MFCC) and AdaBoost, for recognizing a specific type of audio stream; it can accurately detect and locate the specific audio fragment within the audio stream. The commonalities and differences between subcategories of the audio stream are analyzed to achieve the generic recognition. Multi-type gunshot is considered as the specific type of audio stream. The generic template is obtained by extracting the common features and weakening the differing features of the gunshot audio, so that a single template can support the accurate identification of multiple sub-classes. The experiments show that the recognition accuracy of the proposed method is 87.6% and the recall rate reaches 91.8%.

  17. Guidelines for the design of location-based audio for mobile learning

    OpenAIRE

    FitzGerald, Elizabeth; Sharples, Mike; Jones, Robert; Priestnall, Gary

    2010-01-01

    In this paper, we discuss the value of location-based and movement-sensitive audio for learning. We distinguish three types of audio learning experience: audio vignettes, movement-based guides and mobile narratives. An analysis of projects in these three areas has resulted in the formulation of guidelines for the design of audio experiences. We offer a case study of a novel audio experience, called "A Chaotic Encounter", that delivers an adaptive story based on the pattern of movements of the...

  18. Audio Watermarking Based on HAS and Neural Networks in DCT Domain

    OpenAIRE

    Cheng Ji-Shiung; Yu Pao-Ta; Tsai Hung-Hsu

    2003-01-01

    We propose a new intelligent audio watermarking method based on the characteristics of the HAS and the techniques of neural networks in the DCT domain. The method makes the watermark imperceptible by using the audio masking characteristics of the HAS. Moreover, the method exploits a neural network for memorizing the relationships between the original audio signals and the watermarked audio signals. Therefore, the method is capable of extracting watermarks without original audio signals. Fina...

  19. Lossless Audio Watermarking Based on the Alpha Statistic Modulation

    Directory of Open Access Journals (Sweden)

    Sunita V. Dhavale

    2012-08-01

    In this paper, we propose a high capacity, self-synchronized, lossless audio watermarking algorithm based on the alpha (‘α’) statistic modulation. Here ‘α’ is related to the correlation among any given sequence, i.e., audio samples, and it is modulated according to the watermark bit stream. The embedding scheme is tested in both the time domain and DWT domain. Though the time domain embedding reduces the computational time in searching the synchronization codes, the time-frequency localization capability of DWT provides a good trade-off between the computational complexity and robustness of synchronization codes. In the case of DWT, the ‘α’ related to the 2nd level DWT coarse wavelet components is used for embedding the watermark. The offset value used for embedding is made adaptive to the required SNR for the final watermarked audio signal. After extraction of the embedded watermark using a watermark key, the original audio can be recovered with minimal distortion. The watermarking method presented here does not require the use of the original signal for watermark detection. Also, high embedding capacity is achieved by using small sized audio frames. Experimental results reveal that the proposed watermarking scheme maintains high audio quality and is simultaneously highly robust to pirate attacks, including MP3 compression, cropping, filtering, re-sampling, and re-quantization.

  20. Unsupervised Learning of Structural Representation of Percussive Audio Using a Hierarchical Dirichlet Process Hidden Markov Model

    DEFF Research Database (Denmark)

    Diez Antich, Jose Luis; Paterna, Mattia; Marxer, Richard;

    2016-01-01

    A method is proposed that extracts a structural representation of percussive audio in an unsupervised manner. It consists of two parts: 1) The input signal is segmented into blocks of approximately even duration, aligned to a metrical grid, using onset and timbre feature extraction, agglomerative...... using the Adjusted Rand Index (ARI). As a proof-of-concept, the system segmentation has been tested using two simple Disco-style drum loops, yielding an ARI of 56% for the best stable HDP-HMM parameter setting....
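
    For readers unfamiliar with the evaluation score quoted here, the ARI compares two labelings of the same items and corrects the agreement for chance. The following is a minimal sketch of the standard pair-counting formula; the label sequences are hypothetical and are not data from the paper. It assumes at least two items.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index of two labelings of the same items (needs >= 2 items)."""
    n = len(labels_a)
    # Pairs of items that share a cluster in both labelings.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs that share a cluster in each labeling taken separately.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (together_both - expected) / (maximum - expected)


reference = ["kick", "hat", "snare", "hat", "kick", "hat", "snare", "hat"]  # hypothetical
estimate = [0, 1, 2, 1, 0, 1, 1, 1]                                         # hypothetical
print(adjusted_rand_index(reference, estimate))
```

    A value of 1 means the two segmentations agree up to a renaming of the labels, and values near 0 mean agreement no better than chance.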

  1. Determination of over current protection thresholds for class D audio amplifiers

    DEFF Research Database (Denmark)

    Nyboe, Flemming; Risbo, L; Andreani, Pietro

    2005-01-01

    Monolithic class-D audio amplifiers typically feature built-in over current protection circuitry that shuts down the amplifier in case of a short circuit on the output speaker terminals. To minimize cost, the threshold at which the device shuts down must be set just above the maximum current...... that can flow in the loudspeaker during normal operation. The current required is determined by the complex loudspeaker impedance and properties of the music signals played. This work presents a statistical analysis of peak output currents when playing music on typical loudspeakers for home entertainment....

  2. A ROBUST AUDIO WATERMARKING IN CEPSTRUM DOMAIN COMPOSED OF SAMPLE’S RELATION DEPENDENT EMBEDDING AND COMPUTATIONALLY SIMPLE EXTRACTION PHASE

    Directory of Open Access Journals (Sweden)

    Alok Kumar Chowdhury

    2014-04-01

    Embedding watermark bits in audio signals according to the relative state of the samples in a frame may strengthen the attack-invariant features of an audio watermarking algorithm. In this work, we propose to embed watermarks in an audio signal by considering the relation between the mean values of consecutive groups of samples, which shows robustness by overcoming common watermarking challenges. Here, we divide the host audio signal into equal-sized non-overlapping frames, each of which is in turn split into four equal-sized non-overlapping sub-frames. After transforming these sub-frames into the cepstrum domain, we use the relation between the differences of the first two sub-frames and the last two sub-frames to embed watermarks. Depending on the watermark bit (either 0 or 1) to be embedded, our embedding technique either interchanges or updates the differences between these groups of samples by selectively distorting the sample values in the sub-frames. Thus, watermarks are embedded with little or no distortion of the sub-frames, which helps our scheme to be imperceptible in nature. Moreover, the use of such an embedding technique leads our watermarking scheme to a computationally less complex extraction method. Simulation results also justify our claim that the proposed scheme is both robust and imperceptible.

  3. The perceptual influence of the cabin acoustics on the reproduced sound of a car audio system

    DEFF Research Database (Denmark)

    Kaplanis, Neofytos; Bech, Søren; Sakari, Tervo;

    2015-01-01

    A significant element of audio evaluation experiments is the availability of verbal descriptors that can accurately characterize the perceived auditory events. In terms of room acoustics, understanding the perceptual effects of the physical properties of the space would enable a better...... understanding of its acoustical qualities, and stipulate perceptually relevant ways to compensate for the subsequent degradations. In contrast to concert halls, perceptual evaluation of everyday-sized and less reverberant spaces has been a challenging task, and literature on the subject is limited....... In this study, a sensory evaluation methodology [Lokki et al., J. Acoust. Soc. Am. 132, 3148–2161 (2012)] was employed to identify the most relevant attributes that characterize the influence of the physical properties of a car cabin on the reproduced sound field. A series of in-situ measurements of a high...

  4. Digital Audio Radio Broadcast Systems Laboratory Testing Nearly Complete

    Science.gov (United States)

    2005-01-01

    Radio history continues to be made at the NASA Lewis Research Center with the completion of phase one of the digital audio radio (DAR) testing conducted by the Consumer Electronics Group of the Electronic Industries Association. This satellite, satellite/terrestrial, and terrestrial digital technology will open up new audio broadcasting opportunities both domestically and worldwide. It will significantly improve the current quality of amplitude-modulated/frequency-modulated (AM/FM) radio with a new digitally modulated radio signal and will introduce true compact-disc-quality (CD-quality) sound for the first time. Lewis is hosting the laboratory testing of seven proposed digital audio radio systems and modes. Two of the proposed systems operate in two modes each, making a total of nine systems being tested. The nine systems are divided into the following types of transmission: in-band on-channel (IBOC), in-band adjacent-channel (IBAC), and new bands. The laboratory testing was conducted by the Consumer Electronics Group of the Electronic Industries Association. Subjective assessments of the audio recordings for each of the nine systems were conducted by the Communications Research Center in Ottawa, Canada, under contract to the Electronic Industries Association. The Communications Research Center has the only CCIR-qualified (Consultative Committee for International Radio) audio testing facility in North America. The main goals of the U.S. testing process are to (1) provide technical data to the Federal Communications Commission (FCC) so that it can establish a standard for digital audio receivers and transmitters and (2) provide the receiver and transmitter industries with the proper standards upon which to build their equipment. In addition, the data will be forwarded to the International Telecommunication Union to help in the establishment of international standards for digital audio receivers and transmitters, thus allowing U.S. manufacturers to compete in the

  5. Comparing observer models and feature selection methods for a task-based statistical assessment of digital breast tomosynthesis in reconstruction space

    Science.gov (United States)

    Park, Subok; Zhang, George Z.; Zeng, Rongping; Myers, Kyle J.

    2014-03-01

    A task-based assessment of image quality for digital breast tomosynthesis (DBT) can be done in either the projected or reconstructed data space. As the choice of observer models and feature selection methods can vary depending on the type of task and data statistics, we previously investigated the performance of two channelized-Hotelling observer models in conjunction with 2D Laguerre-Gauss (LG) and two implementations of partial least squares (PLS) channels along with that of the Hotelling observer in binary detection tasks involving DBT projections. The difference in these observers lies in how the spatial correlation in DBT angular projections is incorporated in the observer's strategy to perform the given task. In the current work, we extend our method to the reconstructed data space of DBT. We investigate how various model observers including the aforementioned compare for performing the binary detection of a spherical signal embedded in structured breast phantoms with the use of DBT slices reconstructed via filtered back projection. We explore how well the model observers incorporate the spatial correlation between different numbers of reconstructed DBT slices while varying the number of projections. For this, relatively small and large scan angles (24° and 96°) are used for comparison. Our results indicate that 1) given a particular scan angle, the number of projections needed to achieve the best performance for each observer is similar across all observer/channel combinations, i.e., Np = 25 for scan angle 96° and Np = 13 for scan angle 24°, and 2) given these sufficient numbers of projections, the number of slices for each observer to achieve the best performance differs depending on the channel/observer types, which is more pronounced in the narrow scan angle case.

  6. Tech-Assisted Language Learning Tasks in an EFL Setting: Use of Hand phone Recording Feature

    Directory of Open Access Journals (Sweden)

    Alireza Shakarami

    2014-09-01

    Technology, with its rapid leaps forward, has an undeniable impact on every aspect of our life in the new millennium. It supplies us with new affordances almost daily or, more precisely, in a matter of hours. Technology and computers appear to be a breakthrough in their roles in the twenty-first-century educational system. Examples are numerous, among which CALL, CMC, and virtual learning spaces come to mind instantly. Among today's newly developed gadgets are sophisticated smart hand phones, which have gone far beyond the communication tool they were once designed to be. The development of the hand phone as a widespread multi-tasking gadget has urged researchers to investigate its effect on every aspect of the learning process, including language learning. This study attempts to explore the effects of using the cell phone audio recording feature, by Iranian EFL learners, on the development of their speaking skills. Thirty-five sophomore students were enrolled in a study with a pre-test/post-test design. Data on their English speaking experience using the audio-recording features of their hand phones were collected. At the end of the semester, the performance of both groups, treatment and control, was observed, evaluated, and analyzed; it was then examined qualitatively in the next phase. The quantitative outcome lent support to integrating hand phones as part of the language learning curriculum. Keywords: Hand phone, Recording, Audio, Language learning, Enhancement, EFL

  7. Transform based Visual Features for Bimodal Recognition of Hindi Visemes

    Directory of Open Access Journals (Sweden)

    Priyanka Varshney

    2012-06-01

    Visual information along with audio is important for human machine interface. It not only increases the accuracy of an Audio Speech Recognition (ASR) system but also improves its robustness. This paper presents an overview of different approaches used for viseme recognition and also reports new results for Hindi viseme recognition. The visemes were extracted from a database prepared from continuous sentences uttered by 5 native Hindi speakers. For audio features, mel frequency cepstral coefficients (MFCCs) were used, while discrete wavelet transform (DWT) followed by discrete cosine transform (DCT) was used for visual feature extraction. The extracted features were then given to a discriminant-function-based classifier. The maximum improvement in recognition performance, 10.72%, is achieved at -5 dB signal-to-noise ratio (SNR).

  8. Space Shuttle Wireless Crew Communications

    Science.gov (United States)

    Armstrong, R. W.; Doe, R. A.

    1982-01-01

    The design, development, and performance characteristics of the Space Shuttle's Wireless Crew Communications System are discussed. This system allows Space Shuttle crews to interface with the onboard audio distribution system without the need for communications umbilicals, and has been designed through the adaptation of commercially available hardware in order to minimize development time. Testing aboard the Space Shuttle Orbiter Columbia has revealed no failures or design deficiencies.

  9. The Fungible Audio-Visual Mapping and its Experience

    Directory of Open Access Journals (Sweden)

    Adriana Sa

    2014-12-01

    This article draws a perceptual approach to audio-visual mapping. Clearly perceivable cause and effect relationships can be problematic if one desires the audience to experience the music. Indeed, perception would bias those sonic qualities that fit previous concepts of causation, subordinating other sonic qualities, which may form the relations between the sounds themselves. The question is how an audio-visual mapping can produce a sense of causation and simultaneously confound the actual cause-effect relationships. We call this a fungible audio-visual mapping. Our aim here is to glean its constitution and aspect. We will report a study, which draws upon methods from experimental psychology to inform audio-visual instrument design and composition. The participants are shown several audio-visual mapping prototypes, after which we pose quantitative and qualitative questions regarding their sense of causation, and their sense of understanding the cause-effect relationships. The study shows that a fungible mapping requires both synchronized and seemingly non-related components, that is, sufficient complexity to be confusing. As the specific cause-effect concepts remain inconclusive, the sense of causation embraces the whole.

  10. Talker variability in audio-visual speech perception.

    Science.gov (United States)

    Heald, Shannon L M; Nusbaum, Howard C

    2014-01-01

    A change in talker is a change in the context for the phonetic interpretation of acoustic patterns of speech. Different talkers have different mappings between acoustic patterns and phonetic categories and listeners need to adapt to these differences. Despite this complexity, listeners are adept at comprehending speech in multiple-talker contexts, albeit at a slight but measurable performance cost (e.g., slower recognition). So far, this talker variability cost has been demonstrated only in audio-only speech. Other research in single-talker contexts has shown, however, that when listeners are able to see a talker's face, speech recognition is improved under adverse listening conditions (e.g., noise or distortion) that can increase uncertainty in the mapping between acoustic patterns and phonetic categories. Does seeing a talker's face reduce the cost of word recognition in multiple-talker contexts? We used a speeded word-monitoring task in which listeners make quick judgments about target word recognition in single- and multiple-talker contexts. Results show faster recognition performance in single-talker conditions compared to multiple-talker conditions for both audio-only and audio-visual speech. However, recognition time in a multiple-talker context was slower in the audio-visual condition compared to the audio-only condition. These results suggest that seeing a talker's face during speech perception may slow recognition by increasing the importance of talker identification, signaling to the listener that a change in talker has occurred. PMID:25076919

  11. Lossless Audio Watermarking Based on the Alpha Statistic Modulation

    Directory of Open Access Journals (Sweden)

    Sunita V. Dhavale

    2012-09-01

    In this paper, we propose a high capacity, self-synchronized, lossless audio watermarking algorithm based on the alpha (‘α’) statistic modulation. Here ‘α’ is related to the correlation among any given sequence, i.e., audio samples, and it is modulated according to the watermark bit stream. The embedding scheme is tested in both the time domain and the DWT domain. Though the time domain embedding reduces the computational time in searching the synchronization codes, the time-frequency localization capability of DWT provides a good trade-off between the computational complexity and robustness of synchronization codes. In the case of DWT, ‘α’ related to the 2nd level DWT coarse wavelet components is used for embedding the watermark. The offset value used for embedding is made adaptive to the required SNR for the final watermarked audio signal. After extraction of the embedded watermark using a watermark key, the original audio can be recovered with minimal distortion. The watermarking method presented here does not require the use of the original signal for watermark detection. Also, high embedding capacity is achieved by using small-sized audio frames. Experimental results reveal that the proposed watermarking scheme maintains high audio quality and is simultaneously highly robust to pirate attacks, including MP3 compression, cropping, filtering, re-sampling, and re-quantization.

  12. Joint application of audio spectral envelope and tonality index in an e-asthma monitoring system.

    Science.gov (United States)

    Wiśniewski, Marcin; Zieliński, Tomasz P

    2015-05-01

    This paper presents in detail a recently introduced highly efficient method for automatic detection of asthmatic wheezing in breathing sounds. The fluctuation in the audio spectral envelope (ASE) from the MPEG-7 standard and the value of the tonality index (TI) from the MPEG-2 Audio specification are jointly used as discriminative features for wheezy sounds, while the support vector machine (SVM) with a polynomial kernel serves as a classifier. The advantages of the proposed approach are described in the paper (e.g., detecting weak wheezes, very good ROC characteristics, independence from noise color). Since the method is not computationally complex, it is suitable for remote asthma monitoring using mobile devices (personal medical assistants). The main contribution of this paper consists of presenting all the implementation details concerning the proposed approach for the first time, i.e., the pseudocode of the method and adjusting the values of the ASE and TI parameters after which only one (not two) FFT is required for analysis of a next overlapping signal fragment. The efficiency of the method has also been additionally confirmed by the AdaBoost classifier with a built-in mechanism to feature ranking, as well as a previously performed minimal-redundancy-maximal-relevance test. PMID:25167561

  13. Audio Watermarking Based on HAS and Neural Networks in DCT Domain

    Directory of Open Access Journals (Sweden)

    Cheng Ji-Shiung

    2003-01-01

    We propose a new intelligent audio watermarking method based on the characteristics of the human auditory system (HAS) and the techniques of neural networks in the DCT domain. The method makes the watermark imperceptible by using the audio masking characteristics of the HAS. Moreover, the method exploits a neural network for memorizing the relationships between the original audio signals and the watermarked audio signals. Therefore, the method is capable of extracting watermarks without the original audio signals. Finally, experimental results are included to illustrate that the method is robust against common attacks for the copyright protection of digital audio.

  14. Connotative Feature Extraction For Movie Recommendation

    Directory of Open Access Journals (Sweden)

    N. G. Meshram

    2014-09-01

    It is difficult to assess the emotions subject to the emotional responses to the content of the film by exploring the film's connotative properties. Connotation is used to represent the emotions described by the audiovisual descriptors so that it predicts the emotional reaction of the user. The connotative features can be used for the recommendation of movies. There are various methodologies for the recommendation of movies. This paper gives a comparative analysis of some of these methods. This paper introduces some of the audio features that can be useful in the analysis of the emotions represented in movie scenes. The video features can be mapped to emotions. This paper provides a methodology for mapping audio features to emotional states such as happiness, sleepiness, excitement, sadness, relaxation, anger, distress, fear, tension, boredom, comedy and fight. In this paper a movie's audio is used for connotative feature extraction, which is extended to recognize emotions. This paper also provides a comparative analysis of some of the methods that can be used for the recommendation of movies based on the user's emotions.

  15. Content-based audio search: from fingerprinting to semantic audio retrieval

    OpenAIRE

    Cano Vila, Pedro

    2007-01-01

    This thesis deals with content-based audio search engines. Specifically, it aims to develop technologies that narrow the 'semantic gap' that currently limits the widespread use of content-based search engines. Audio search engines use metadata, mostly generated by editors, to manage audio collections. Although it is an arduous and error-prone task, manual annotation is the most common practice. The methods based ...

  16. ARC Code TI: SLAB Spatial Audio Renderer

    Data.gov (United States)

    National Aeronautics and Space Administration — SLAB is a software-based, real-time virtual acoustic environment rendering system being developed as a tool for the study of spatial hearing. SLAB is designed to...

  17. Objective quality measurement for audio time-scale modification

    Science.gov (United States)

    Liu, Fang; Lee, Jae-Joon; Kuo, C. C. J.

    2003-11-01

    The recent ITU-T Recommendation P.862, known as the Perceptual Evaluation of Speech Quality (PESQ), is an objective end-to-end speech quality assessment method for telephone networks and speech codecs through the measurement of received audio quality. To ensure that certain network distortions will not affect the estimated subjective measurement determined by PESQ, the algorithm takes into account packet loss and short-term and long-term time warping resulting from delay variation. However, PESQ does not work well for time-scale audio modification or temporal clipping. We investigated the factors that impact the perceived quality when time-scale modification is involved. An objective measurement of time-scale modification is proposed in this research, where the cross-correlation values obtained from time-scale modification synchronization are used to evaluate the quality of a time-scaled audio sequence. This proposed objective measure has been verified by a subjective test.
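
    To make the idea of cross-correlation values obtained during synchronization more concrete, the sketch below searches for the lag at which a frame best aligns with a reference segment and returns the peak normalized cross-correlation. This is a minimal sketch under stated assumptions: it assumes an overlap-add style synchronization search, and the function name `best_lag_ncc`, the toy noise signal and the search range are hypothetical. It does not reproduce the quality measure proposed in the abstract.

```python
import numpy as np

def best_lag_ncc(reference, frame, max_lag):
    """Return (lag, peak) maximizing the normalized cross-correlation.

    `frame` is compared with reference[lag : lag + len(frame)] for every
    lag in 0..max_lag; the peak lies in [-1, 1].
    """
    n = len(frame)
    best_lag, best_val = 0, -1.0
    for lag in range(max_lag + 1):
        seg = reference[lag:lag + n]
        denom = np.linalg.norm(seg) * np.linalg.norm(frame)
        val = float(seg @ frame / denom) if denom > 0 else 0.0
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag, best_val

# Toy signal: white noise, with a frame cut out at a known offset.
rng = np.random.default_rng(0)
reference = rng.normal(size=2048)
frame = reference[37:37 + 512].copy()
lag, peak = best_lag_ncc(reference, frame, max_lag=100)
```

    A peak close to 1 indicates that the frame could be spliced with little waveform mismatch; averaging such peaks over all frames is one plausible way to summarize them, though the abstract does not state how the values are combined.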

  18. Can audio recording of outpatient consultations improve patient outcome?

    DEFF Research Database (Denmark)

    Wolderslund, Maiken; Kofoed, Poul-Erik; Axboe, Mette;

    ...the communication is challenged by the fact that patients tend to forget or misunderstand a great deal of the information given. The primary objective of this study is to investigate the effects of providing patients with an audio recording of the consultation. Methods: A randomized controlled trial involving four different departments: Orthopedics, Urology, Internal Medicine and Pediatrics. A total of 5,460 patients will be included from the outpatient clinics. All patients randomized to an intervention group are offered audio recording of their consultation. An Interactive Voice Response platform enables an audio recording of the dialogue between the patient and the clinician via the telephone in the consultation room. This technique ensures minimal time consumption for clinicians and high sound quality. By dialing their social security number in combination with a PIN, patients can hear their consultation again...

  19. Audio system using binaural synthesis for multimodal telepresence applications

    DEFF Research Database (Denmark)

    Madsen, Esben; Markovic, Milos; Olesen, Søren Krarup;

    2013-01-01

    An audio system was developed as part of a multimodal system aiming to go beyond current state of the art in telepresence. This paper provides an overview of how the audio was implemented and documents measurements that were performed on the audio system. The measurements include equalization of microphones, headphones and loudspeakers as well as measurements of network latency and bandwidth requirements of the system. Furthermore, measurements were made to determine whether the level of echo and cross talk cause any issues. The overall system employs multiple modalities to virtually transport a person (the visitor) to a different physical location (the destination). The goal is that both the visitor and the people physically at the destination (the locals) should be provided with a sensation that the visitor is really there. Both the general multimodal system and the auditory part...

  20. Tagging and Linking Lecture Audio Recordings: Goals and Practice

    CERN Document Server

    Gray, Norman; Honeychurch, Sarah; Draper, Steve; Barr, Niall

    2013-01-01

    Making and distributing audio recordings of lectures is cheap and technically straightforward, and these recordings represent an underexploited teaching resource. We explore the reasons why such recordings are not more used; we believe the barriers inhibiting such use should be easily overcome. Students can listen to a lecture they missed, or re-listen to a lecture at revision time, but their interaction is limited by the affordances of the replaying technology. Listening to lecture audio is generally solitary, linear, and disjoint from other available media. In this paper, we describe a tool we are developing at the University of Glasgow, which enriches students' interactions with lecture audio. We describe our experiments with this tool in session 2012--13. Fewer students used the tool than we expected would naturally do so, and we discuss some possible explanations for this.

  1. A novel audio watermarking scheme using multiscale wavelet modulation

    Institute of Scientific and Technical Information of China (English)

    JI Bing; ZHANG De; JI Xiaoyong

    2004-01-01

    A novel audio watermarking scheme to embed robust and inaudible watermarks for the purpose of copyright protection is proposed. The key innovation is to add time-frequency redundancy into watermark signals by multiscale wavelet modulation. In order to maximize the watermarking strength within perceptual constraints, the signals synthesized from different scales are each masked using a frequency auditory model and then integrated to form the final watermark signal. The detection structure is built using the redundancy in watermark signals, and the performance is further enhanced by modeling the statistical behavior of wavelet coefficients as a generalized Gaussian distribution. The original audio signal is not required in watermark detection. The experimental results show that our approach can achieve not only good transparency but also satisfactory robustness to common audio manipulations.

  2. INFORMATION HIDING USING AUDIO STEGANOGRAPHY – A SURVEY

    Directory of Open Access Journals (Sweden)

    Jayaram P

    2011-08-01

    Today's large demand for internet applications requires data to be transmitted in a secure manner. Data transmission in a public communication system is not secure because of interception and improper manipulation by eavesdroppers. An attractive solution to this problem is steganography, which is the art and science of writing hidden messages in such a way that no one, apart from the sender and intended recipient, suspects the existence of the message, a form of security through obscurity. Audio steganography is the scheme of hiding the existence of secret information by concealing it in another medium such as an audio file. In this paper we mainly discuss different types of audio steganographic methods and their advantages and disadvantages.

  3. Technical Evaluation Report 31: Internet Audio Products (3/3)

    Directory of Open Access Journals (Sweden)

    Jim Rudolph

    2004-08-01

    Two contrasting additions to the online audio market are reviewed: iVocalize, a browser-based audio-conferencing software, and Skype, a PC-to-PC Internet telephone tool. These products are selected for review on the basis of their success in gaining rapid popular attention and usage during 2003-04. The iVocalize review emphasizes the product’s role in the development of a series of successful online audio communities – notably several serving visually impaired users. The Skype review stresses the ease with which the product may be used for simultaneous PC-to-PC communication among up to five users. Editor’s Note: This paper serves as an introduction to reports about online community building, and reviews of online products for disabled persons, in the next ten reports in this series. JPB, Series Ed.

  4. Music Identification System Using MPEG-7 Audio Signature Descriptors

    Directory of Open Access Journals (Sweden)

    Shingchern D. You

    2013-01-01

    This paper describes a multiresolution system based on MPEG-7 audio signature descriptors for music identification. Such an identification system may be used to detect illegally copied music circulated over the Internet. In the proposed system, low-resolution descriptors are used to search likely candidates, and then full-resolution descriptors are used to identify the unknown (query) audio. With this arrangement, the proposed system achieves both high speed and high accuracy. To deal with the problem that a piece of query audio may not be inside the system’s database, we suggest two different methods to find the decision threshold. Simulation results show that the proposed method II can achieve an accuracy of 99.4% for query inputs both inside and outside the database. Overall, it is highly possible to use the proposed system for copyright control.
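
    The two-stage search described in this abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the system from the paper: descriptors are stand-in random matrices rather than MPEG-7 audio signatures, the low-resolution version is obtained by simple frame averaging, query and database entries are assumed to be time-aligned and equally long, and the fixed `threshold` is an arbitrary placeholder for the decision threshold that the paper derives by its own methods. All names are hypothetical.

```python
import numpy as np

def downsample(desc, factor):
    """Low-resolution descriptor: average groups of `factor` consecutive frames."""
    n = (len(desc) // factor) * factor
    return desc[:n].reshape(-1, factor, desc.shape[1]).mean(axis=1)

def identify(query, database, factor=4, n_candidates=3, threshold=0.5):
    """Return the matching database key, or None if no entry is close enough."""
    q_low = downsample(query, factor)
    # Stage 1: cheap search over all entries with low-resolution descriptors.
    coarse = {k: np.mean((downsample(d, factor) - q_low) ** 2)
              for k, d in database.items()}
    candidates = sorted(coarse, key=coarse.get)[:n_candidates]
    # Stage 2: full-resolution comparison restricted to the candidates.
    fine = {k: np.mean((database[k] - query) ** 2) for k in candidates}
    best = min(fine, key=fine.get)
    # Reject queries whose best match is still too far away.
    return best if fine[best] < threshold else None

# Stand-in descriptors: 64 frames x 16 coefficients per track (hypothetical).
rng = np.random.default_rng(1)
database = {f"track_{i}": rng.normal(size=(64, 16)) for i in range(10)}
known = database["track_3"] + 0.1 * rng.normal(size=(64, 16))  # distorted copy
unknown = rng.normal(size=(64, 16))                            # not in database
result_known = identify(known, database)
result_unknown = identify(unknown, database)
```

    The coarse pass touches every entry but on a fraction of the data, while the expensive comparison runs only on a handful of candidates, which is where the speed gain of such a multiresolution arrangement comes from.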

  5. Music identification system using MPEG-7 audio signature descriptors.

    Science.gov (United States)

    You, Shingchern D; Chen, Wei-Hwa; Chen, Woei-Kae

    2013-01-01

    This paper describes a multiresolution system based on MPEG-7 audio signature descriptors for music identification. Such an identification system may be used to detect illegally copied music circulated over the Internet. In the proposed system, low-resolution descriptors are used to search likely candidates, and then full-resolution descriptors are used to identify the unknown (query) audio. With this arrangement, the proposed system achieves both high speed and high accuracy. To deal with the problem that a piece of query audio may not be inside the system's database, we suggest two different methods to find the decision threshold. Simulation results show that the proposed method II can achieve an accuracy of 99.4% for query inputs both inside and outside the database. Overall, it is highly possible to use the proposed system for copyright control. PMID:23533359

  6. Real-time Loudspeaker Distance Estimation with Stereo Audio

    DEFF Research Database (Denmark)

    Nielsen, Jesper Kjær; Gaubitch, Nikolay; Heusdens, Richard;

    2015-01-01

    Knowledge on how a number of loudspeakers are positioned relative to a listening position can be used to enhance the listening experience. Usually, these loudspeaker positions are estimated using calibration signals, either audible or psycho-acoustically hidden inside the desired audio signal. In this paper, we propose to use the desired audio signal instead. Specifically, we treat the case of estimating the distance between two loudspeakers playing back a stereo music or speech signal. In this connection, we develop a real-time maximum likelihood estimator and demonstrate that it has a...

  7. Audio Steganography Coding Using the Discrete Wavelet Transforms

    Directory of Open Access Journals (Sweden)

    Siwar Rekik

    2012-02-01

    The performance of an audio steganography compression system using the discrete wavelet transform (DWT) is investigated. Audio steganography coding is the technology of transforming stego-speech into an efficiently encoded version that can be decoded on the receiver side to produce a close representation of the initial (non-compressed) signal. Experimental results prove the efficiency of the compression technique used, since the compressed stego-speech is perceptually intelligible and indistinguishable from the equivalent initial signal, while the initial stego-speech can be recovered with slight degradation in quality.

  8. Audio engineering 101 a beginner's guide to music production

    CERN Document Server

    Dittmar, Tim

    2013-01-01

    Audio Engineering 101 is a real world guide for starting out in the recording industry. If you have the dream, the ideas, the music and the creativity but don't know where to start, then this book is for you! Filled with practical advice on how to navigate the recording world, from an author with first-hand, real-life experience, Audio Engineering 101 will help you succeed in the exciting, but tough and confusing, music industry. Covering all you need to know about the recording process, from the characteristics of sound to a guide to microphones to analog versus digital

  9. An audio file tagging mobile game, mTagATune

    OpenAIRE

    Díaz, Francisco Javier; Queiruga, Claudia Alejandra; Ferraresso, Alejandro; Larghi, José

    2011-01-01

    mTagATune is a mobile game based on TagATune. mTagATune implements the concept of GWAP and takes advantage of the capabilities and wide acceptance of current smartphones. GWAP promotes the creation of computer games that encourage people to do voluntary work. mTagATune implements a game that collects information on audio files to facilitate future searches on them. By means of a collaborative game, mTagATune enables a ubiquitous collection of information on audio files that can later be used in searc...

  10. Class-D audio amplifiers with negative feedback

    OpenAIRE

    Cox, Stephen M.; Candy, B. H.

    2006-01-01

    There are many different designs for audio amplifiers. Class-D, or switching, amplifiers generate their output signal in the form of a high-frequency square wave of variable duty cycle (ratio of on time to off time). The square-wave nature of the output allows a particularly efficient output stage, with minimal losses. The output is ultimately filtered to remove components of the spectrum above the audio range. Mathematical models are derived here for a variety of related class-D amplifier d...
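
    The open-loop principle described in this abstract (a square wave whose duty cycle follows the audio, then a filter that removes everything above the audio range) can be illustrated numerically. The following is a minimal sketch under stated assumptions: all frequencies and the amplitude are arbitrary example values, the comparator and triangle carrier are idealized, a one-carrier-period moving average stands in for the analog output filter, and the negative-feedback loop analyzed in the paper is not modeled.

```python
import numpy as np

fs = 10_000_000      # simulation sample rate in Hz (assumed)
f_audio = 1_000      # audio test tone in Hz (assumed)
f_carrier = 50_000   # switching frequency in Hz (assumed)
t = np.arange(0, 0.005, 1 / fs)

audio = 0.5 * np.sin(2 * np.pi * f_audio * t)
# Triangle carrier sweeping between -1 and +1 once per switching period.
carrier = 2 * np.abs(2 * ((t * f_carrier) % 1) - 1) - 1
# Ideal comparator: output is +1 while the audio exceeds the carrier, else -1,
# so the duty cycle of the square wave tracks the audio level.
pwm = np.where(audio > carrier, 1.0, -1.0)

# Crude low-pass filter: moving average over exactly one carrier period.
window = fs // f_carrier
kernel = np.ones(window) / window
recovered = np.convolve(pwm, kernel, mode="same")

# Largest reconstruction error away from the edges of the simulation.
err = np.max(np.abs(recovered[window:-window] - audio[window:-window]))
```

    Averaging the two-level waveform over one switching period returns roughly the audio value, because the fraction of time spent at +1 is (1 + audio) / 2 for a slowly varying input.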

  11. Entorno de Audio usando la nueva API de HTML 5

    OpenAIRE

    LATORRE PLAYÁN, JAVIER

    2015-01-01

    The aim of this work is to design and program an audio application on top of the new HTML 5 audio API. To do so, we use the SoundCool program, which is owned by the Universidad Politécnica de Valencia, and adapt the modules it implements to the aforementioned language, with the purpose of making it more accessible and visually attractive. To carry this out, we first conducted a research ...

  12. Evaluation of robustness and transparency of multiple audio watermark embedding

    Science.gov (United States)

    Steinebach, Martin; Zmudzinski, Sascha

    2008-02-01

    As digital watermarking becomes an accepted and widely applied technology, a number of concerns regarding its reliability in typical application scenarios come up. One important and often discussed question is the robustness of digital watermarks against multiple embedding. This means that one cover is marked several times by various users with the same watermarking algorithm but with different keys and different watermark messages. In our paper we discuss the behavior of our PCM audio watermarking algorithm when applying multiple watermark embedding. This includes evaluation of robustness and transparency. Test results for multiple hours of audio content ranging from spoken words to music are provided.

  13. Design of a WAV audio player based on K20

    Directory of Open Access Journals (Sweden)

    Xu Yu

    2016-01-01

    The designed player uses Freescale's MK20DX128VLH7 as the core control chip, and its hardware platform is equipped with a VS1003 audio decoder, an OLED display interface, a USB interface and an SD card slot. The player uses the open source embedded real-time operating system μC/OS-II, Freescale USB Stack V4.1.1 and FATFS, and a graphical user interface based on CGUI is developed to improve the user experience. In general, the designed WAV audio player has strong applicability and good practical value.

  14. A Robust Feature Parameter Extraction Algorithm for Language Identification Based on Audio Perception and Sub-Band Compensation Filtering

    Institute of Scientific and Technical Information of China (English)

    黄山奇; 张连海; 屈丹

    2012-01-01

    The feature parameters commonly used in current language identification systems do not make full use of human auditory characteristics and have weak robustness in complex environments. An auditory-based robust feature extraction algorithm is proposed that improves robustness in two ways. First, each sub-band energy of the extracted auditory features is calculated with a Gammachirp filter bank, which better matches human auditory perception, instead of the commonly used triangular filter bank. Second, a compensation filter is designed for each sub-band output using a data-driven strategy: a constrained optimization process jointly minimizes the environmental distortion and the distortion caused by the filter itself. Experimental results show that, in common noisy environments, the proposed feature outperforms the widely used Mel-frequency cepstral coefficients and their derived parameters.

  15. Audio steganalysis based on "negative resonance phenomenon" caused by steganographic tools

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    Research on the impact that different steganographic software tools have on audio statistical features revealed the following phenomenon: when messages are embedded in a WAV file using a certain tool, the variation of statistical features in a WAV file that already contains messages embedded by the same tool is markedly smaller than in a file in which no messages have been embedded. We provisionally call this the "negative resonance phenomenon". Using this phenomenon and Support Vector Machines (SVMs), we can detect the existence of hidden messages and also identify the tools used to hide them. As shown by the experimental results, the proposed method can be used very effectively to detect hidden messages embedded by Hide4PGP, Stegowav and S-Tools4.

  16. Multi-modal gesture recognition using integrated model of motion, audio and video

    Science.gov (United States)

    Goutsu, Yusuke; Kobayashi, Takaki; Obara, Junya; Kusajima, Ikuo; Takeichi, Kazunari; Takano, Wataru; Nakamura, Yoshihiko

    2015-07-01

    Gesture recognition is used in many practical applications such as human-robot interaction, medical rehabilitation and sign language. With increasing motion sensor development, multiple data sources have become available, which has led to the rise of multi-modal gesture recognition. Since our previous approach to gesture recognition depends on a unimodal system, it is difficult to classify similar motion patterns. In order to solve this problem, a novel approach which integrates motion, audio and video models is proposed, using a dataset captured by Kinect. The proposed system can recognize observed gestures by using three models. The recognition results of the three models are integrated by the proposed framework and the output becomes the final result. The motion and audio models are learned using hidden Markov models, and a random forest classifier is used to learn the video model. In the experiments to test the performance of the proposed system, the motion and audio models most suitable for gesture recognition are chosen by varying feature vectors and learning methods. Additionally, the unimodal and multi-modal models are compared with respect to recognition accuracy. All the experiments are conducted on a dataset provided by the competition organizer of MMGRC, which is a workshop for the Multi-Modal Gesture Recognition Challenge. The comparison results show that the multi-modal model composed of three models achieves the highest recognition rate. This improvement in recognition accuracy means that the complementary relationship among the three models improves the accuracy of gesture recognition. The proposed system provides the application technology to understand human actions of daily life more precisely.

  17. Multi-modal Gesture Recognition using Integrated Model of Motion, Audio and Video

    Institute of Scientific and Technical Information of China (English)

    GOUTSU Yusuke; KOBAYASHI Takaki; OBARA Junya; KUSAJIMA Ikuo; TAKEICHI Kazunari; TAKANO Wataru; NAKAMURA Yoshihiko

    2015-01-01

    Gesture recognition is used in many practical applications such as human-robot interaction, medical rehabilitation and sign language. With increasing motion sensor development, multiple data sources have become available, which leads to the rise of multi-modal gesture recognition. Since our previous approach to gesture recognition depends on a unimodal system, it is difficult to classify similar motion patterns. In order to solve this problem, a novel approach which integrates motion, audio and video models is proposed by using dataset captured by Kinect. The proposed system can recognize observed gestures by using three models. Recognition results of three models are integrated by using the proposed framework and the output becomes the final result. The motion and audio models are learned by using Hidden Markov Model. Random Forest which is the video classifier is used to learn the video model. In the experiments to test the performances of the proposed system, the motion and audio models most suitable for gesture recognition are chosen by varying feature vectors and learning methods. Additionally, the unimodal and multi-modal models are compared with respect to recognition accuracy. All the experiments are conducted on dataset provided by the competition organizer of MMGRC, which is a workshop for Multi-Modal Gesture Recognition Challenge. The comparison results show that the multi-modal model composed of three models scores the highest recognition rate. This improvement of recognition accuracy means that the complementary relationship among three models improves the accuracy of gesture recognition. The proposed system provides the application technology to understand human actions of daily life more precisely.

  18. Audio Signal Processing Using Time-Frequency Approaches: Coding, Classification, Fingerprinting, and Watermarking

    Directory of Open Access Journals (Sweden)

    K. Umapathy

    2010-01-01

    Audio signals are information-rich nonstationary signals that play an important role in our day-to-day communication, perception of the environment, and entertainment. Because of their nonstationary nature, time-only or frequency-only approaches are inadequate for analyzing these signals; a joint time-frequency (TF) approach is a better choice for processing them efficiently. In this digital era, compression, intelligent indexing for content-based retrieval, classification, and protection of digital audio content are a few of the areas that encompass the majority of audio signal processing applications. In this paper, we present a comprehensive array of TF methodologies that successfully address applications in all of the above-mentioned areas. A TF-based audio coding scheme with a novel psychoacoustics model, music classification, audio classification of environmental sounds, audio fingerprinting, and audio watermarking are presented to demonstrate the advantages of using time-frequency approaches in analyzing and extracting information from audio signals.

  19. A Content Based Audio Watermarking Against Desynchronization Attacks

    Institute of Scientific and Technical Information of China (English)

    鲍德旺; 杨红颖; 祁薇; 王向阳

    2009-01-01

    Designing a robust audio watermarking scheme that withstands desynchronization attacks is challenging. This paper proposes a content-based digital audio watermarking scheme that is robust against such attacks. First, stable feature points are extracted from the host audio according to its local energy distribution. Then, the candidate audio segments for watermark embedding are located using the feature points as markers. Finally, the digital watermark is embedded into the host audio by a quantization modulation strategy that modulates the statistical average value of the audio samples. During watermark detection, the feature points are extracted by the same technique as in embedding, and the watermark is extracted from the segments they mark; detection does not require the original audio signal. Simulation results show that the proposed scheme is imperceptible and robust against common signal processing such as MP3 compression, low-pass filtering, noise addition, and equalization, as well as desynchronization attacks such as random cropping, amplitude scaling, time-scale modification, and jittering.
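
    The embedding step described in this abstract (modulating the statistical average of the samples in a segment) can be illustrated with quantization index modulation of the segment mean. This is only a minimal sketch under stated assumptions: the function names, the step size `step`, and the one-bit-per-segment layout are hypothetical and are not taken from the paper, and the feature-point search that selects the segments is omitted.

```python
def embed_bit(segment, bit, step=0.1):
    """Shift all samples so the segment mean lands on the lattice for `bit`.

    Assumption: even multiples of `step` encode 0, odd multiples encode 1.
    """
    mean = sum(segment) / len(segment)
    # Nearest lattice point of the form (2k + bit) * step.
    k = round((mean / step - bit) / 2)
    target = (2 * k + bit) * step
    shift = target - mean
    return [x + shift for x in segment]


def extract_bit(segment, step=0.1):
    """Blind extraction: only the (possibly attacked) segment is needed."""
    mean = sum(segment) / len(segment)
    return round(mean / step) % 2
```

    In this sketch the bit survives any change of the segment mean smaller than `step / 2`, which is the usual trade-off between robustness and audibility when choosing the step size.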

  20. Study on the Design of Hall Space with Features of Fitting Place in the Ancient Village

    Institute of Scientific and Technical Information of China (English)

    贺鹏飞; 闫芳; 衡苛

    2016-01-01

    This article proposes a design strategy for the hall space of ancient villages from the perspective of fitting the features of the place, taking the design of the waterfront hall space of Ding Li Bay ancient village in southern Henan as an example. By analyzing the key place features of the village, including the natural water-town elements of Ding Li Bay, the village space elements, and the waterfront hall space elements, the paper develops the design of the waterfront hall space from the extraction of tea culture, the abstraction of mountain forms, and the functional design of the waterfront hall space. It summarizes the general approach and implementation methods for designing village hall space on the basis of fitting the features of the place, in order to provide a reference for the design of hall space in ancient villages.

  1. Responding Effectively to Composition Students: Comparing Student Perceptions of Written and Audio Feedback

    Science.gov (United States)

    Bilbro, J.; Iluzada, C.; Clark, D. E.

    2013-01-01

    The authors compared student perceptions of audio and written feedback in order to assess what types of students may benefit from receiving audio feedback on their essays rather than written feedback. Many instructors previously have reported the advantages they see in audio feedback, but little quantitative research has been done on how the…

  2. Audio Use in E-Learning: What, Why, When, and How?

    Science.gov (United States)

    Calandra, Brendan; Barron, Ann E.; Thompson-Sellers, Ingrid

    2008-01-01

    Decisions related to the implementation of audio in e-learning are perplexing for many instructional designers, and deciphering theory and principles related to audio use can be difficult for practitioners. Yet, as bandwidth on the Internet increases, digital audio is becoming more common in online courses. This article provides a review of…

  3. 47 CFR 73.9005 - Compliance requirements for covered demodulator products: Audio.

    Science.gov (United States)

    2010-10-01

    ... products: Audio. 73.9005 Section 73.9005 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED....9005 Compliance requirements for covered demodulator products: Audio. Except as otherwise provided in §§ 73.9003(a) or 73.9004(a), covered demodulator products shall not output the audio portions...

  4. 77 FR 16890 - Second Meeting: RTCA Special Committee 226, Audio Systems and Equipment

    Science.gov (United States)

    2012-03-22

    ... Federal Aviation Administration Second Meeting: RTCA Special Committee 226, Audio Systems and Equipment... meeting RTCA Special Committee 226, Audio Systems and Equipment. SUMMARY: The FAA is issuing this notice to advise the public of the second meeting of RTCA Special Committee 226, Audio Systems and...

  5. 76 FR 79755 - First Meeting: RTCA Special Committee 226 Audio Systems and Equipment

    Science.gov (United States)

    2011-12-22

    ... Federal Aviation Administration First Meeting: RTCA Special Committee 226 Audio Systems and Equipment... RTCA Special Committee 226, Audio Systems and Equipment. SUMMARY: The FAA is issuing this notice to advise the public of a meeting of RTCA Special Committee 226, Audio Systems and Equipment, for the...

  6. Interactive 3D audio: Enhancing awareness of details in immersive soundscapes?

    DEFF Research Database (Denmark)

    Schmidt, Mikkel Nørgaard; Schwartz, Stephen; Larsen, Jan

    2012-01-01

    Spatial audio and the possibility of interacting with the audio environment is thought to increase listeners' attention to details in a soundscape. This work examines if interactive 3D audio enhances listeners' ability to recall details in a soundscape. Nine different soundscapes were constructed...

  7. 37 CFR 202.22 - Acquisition and deposit of unpublished audio and audiovisual transmission programs.

    Science.gov (United States)

    2010-07-01

    ... unpublished audio and audiovisual transmission programs. 202.22 Section 202.22 Patents, Trademarks, and... REGISTRATION OF CLAIMS TO COPYRIGHT § 202.22 Acquisition and deposit of unpublished audio and audiovisual... and copies of unpublished audio and audiovisual transmission programs by the Library of Congress...

  8. DAB: Multiplex and system support features

    Science.gov (United States)

    Riley, J. L.

    This Report describes the multiplex and system support features of the Eureka 147/DAB digital audio system. It sets out the requirements of all users along the broadcast chain, from service providers and broadcasters through to the listener. The contents of the transmission frame are examined, drawing the distinction between the main service multiplex and the provision of control information in a separate fast data channel. The concept of the DAB service structure is introduced and the inherent system flexibility for altering the service arrangement is explained. A wide range of service information features builds on those provided in earlier systems, such as RDS (Radio Data System), and is intended to make it easier for a listener to find any required service and to add a further dimension to audio broadcasting. The choices available to users in all of these areas are examined.

  9. Real-Time Audio-Visual Analysis for Multiperson Videoconferencing

    Directory of Open Access Journals (Sweden)

    Petr Motlicek

    2013-01-01

    We describe the design of a system consisting of several state-of-the-art real-time audio and video processing components enabling multimodal stream manipulation (e.g., automatic online editing) for multiparty videoconferencing applications in open, unconstrained environments. The underlying algorithms are designed to allow multiple people to enter, interact, and leave the observable scene with no constraints. They comprise continuous localisation of audio objects and its application for spatial audio object coding, detection and tracking of faces, estimation of head poses and visual focus of attention, detection and localisation of verbal and paralinguistic events, and the association and fusion of these different events. Combined, they represent multimodal streams with audio objects and semantic video objects and provide semantic information for stream manipulation systems (like a virtual director). Various experiments have been performed to evaluate the performance of the system. The obtained results demonstrate the effectiveness of the proposed design, the various algorithms, and the benefit of fusing different modalities in this scenario.

  10. Audio Quality Assurance : An Application of Cross Correlation

    DEFF Research Database (Denmark)

    Jurik, Bolette Ammitzbøll; Nielsen, Jesper Asbjørn Sindahl

    2012-01-01

    We describe algorithms for automated quality assurance on content of audio files in context of preservation actions and access. The algorithms use cross correlation to compare the sound waves. They are used to do overlap analysis in an access scenario, where preserved radio broadcasts are used in...
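
    The idea of comparing two sound waves by cross correlation can be sketched as follows. This is a minimal illustration, not the authors' algorithm: `best_lag` and `max_lag` are hypothetical names, the correlation is unnormalized, and the brute-force loop is meant for short sample lists rather than full broadcasts.

```python
def best_lag(a, b, max_lag):
    """Return the shift (in samples) at which `b` lines up best with `a`.

    A positive result means the content of `a` appears that many samples
    later in `b`.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                score += x * b[j]  # raw cross-correlation at this lag
        if score > best_score:
            best, best_score = lag, score
    return best
```

    For overlap analysis, the lag with the highest score indicates where the end of one recording matches the start of the next; a low peak score suggests the two files do not contain the same content.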

  11. Integrated Spacesuit Audio System Enhances Speech Quality and Reduces Noise

    Science.gov (United States)

    Huang, Yiteng Arden; Chen, Jingdong; Chen, Shaoyan Sharyl

    2009-01-01

    A new approach has been proposed for increasing astronaut comfort and speech capture. Currently, the special design of a spacesuit forms an extreme acoustic environment making it difficult to capture clear speech without compromising comfort. The proposed Integrated Spacesuit Audio (ISA) system is to incorporate the microphones into the helmet and use software to extract voice signals from background noise.

  12. Audio Card Systems. Technical Information Bulletin No. 13.

    Science.gov (United States)

    Gasser, P.

    This examination of audio card systems for computers begins by identifying the three information processing systems for sound: sound digitizing, synthesis of text, and word recognition. Specific pedagogical applications of digitized sound are then briefly discussed. The remainder of the document focuses on specifications for the working of vocal…

  13. Deep learning, audio adversaries, and music content analysis

    DEFF Research Database (Denmark)

    Kereliuk, Corey Mose; Sturm, Bob L.; Larsen, Jan

    2015-01-01

    We present the concept of adversarial audio in the context of deep neural networks (DNNs) for music content analysis. An adversary is an algorithm that makes minor perturbations to an input that cause major repercussions to the system response. In particular, we design an adversary for a DNN...

  14. Mediatheque - digitization and preservation of audio content in RTV Slovenia

    Directory of Open Access Journals (Sweden)

    Martin Žvelc

    2011-01-01

    RTV Slovenia’s archives contain large amounts of audio and video materials, various documents and music scores, most of them still in analogue format. Widespread digitization has revolutionized the processes and ways of creating content in digital format, recorded on different media. Such records also require new ways of preservation. The article presents the development and structure of the Mediatheque department at RTV Slovenia and gives an overview of the preservation model for audio content. Due to rapid technological changes, the audio content was the most critical and the first to be digitized. The intensive work in Mediatheque began in 2008, and after two years Radio Slovenia had developed a modern system for permanent storage of audio content. Radio Slovenia’s Digital Archive meets all the standards and regulations applicable to modern archival systems. The article also presents the application of the Mediarc software, which could also be used for digitizing and permanently storing TV Slovenia’s video archives.

  15. An Audio-Visual Lecture Course in Russian Culture

    Science.gov (United States)

    Leighton, Lauren G.

    1977-01-01

    An audio-visual course in Russian culture is given at Northern Illinois University. A collection of 4-5,000 color slides is the basis for the course, with lectures focussed on literature, philosophy, religion, politics, art and crafts. Acquisition, classification, storage and presentation of slides, and organization of lectures are discussed. (CHK)

  16. Audio-visual perception system for a humanoid robotic head.

    Science.gov (United States)

    Viciana-Abad, Raquel; Marfil, Rebeca; Perez-Lorenzo, Jose M; Bandera, Juan P; Romero-Garces, Adrian; Reche-Lopez, Pedro

    2014-01-01

    One of the main issues within the field of social robotics is to endow robots with the ability to direct attention to people with whom they are interacting. Different approaches follow bio-inspired mechanisms, merging audio and visual cues to localize a person using multiple sensors. However, most of these fusion mechanisms have been used in fixed systems, such as those used in video-conference rooms, and thus, they may incur difficulties when constrained to the sensors with which a robot can be equipped. Besides, within the scope of interactive autonomous robots, there is a lack in terms of evaluating the benefits of audio-visual attention mechanisms, compared to only audio or visual approaches, in real scenarios. Most of the tests conducted have been within controlled environments, at short distances and/or with off-line performance measurements. With the goal of demonstrating the benefit of fusing sensory information with a Bayes inference for interactive robotics, this paper presents a system for localizing a person by processing visual and audio data. Moreover, the performance of this system is evaluated and compared via considering the technical limitations of unimodal systems. The experiments show the promise of the proposed approach for the proactive detection and tracking of speakers in a human-robot interactive framework. PMID:24878593
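
    The fusion of audio and visual cues by Bayes inference mentioned in this abstract can be illustrated with a small grid-based example. It is a sketch under stated assumptions, not the system described in the paper: both sensors are assumed to report a speaker azimuth with Gaussian noise, the prior is uniform over a one-degree grid, and every name and parameter below is hypothetical.

```python
import math


def gaussian_log_likelihood(theta, measured, sigma):
    """Log-likelihood (up to a constant) of a measurement given true angle theta."""
    return -0.5 * ((theta - measured) / sigma) ** 2


def fuse_azimuth(audio_deg, audio_sigma, visual_deg, visual_sigma,
                 grid=range(-90, 91)):
    """Posterior over candidate azimuths: uniform prior times both likelihoods."""
    log_post = {
        theta: gaussian_log_likelihood(theta, audio_deg, audio_sigma)
        + gaussian_log_likelihood(theta, visual_deg, visual_sigma)
        for theta in grid
    }
    peak = max(log_post.values())
    # Subtract the peak before exponentiating for numerical stability.
    weights = {t: math.exp(lp - peak) for t, lp in log_post.items()}
    total = sum(weights.values())
    posterior = {t: w / total for t, w in weights.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior
```

    With a coarse audio estimate and a sharper visual estimate, the fused maximum lies close to the visual one, which is the expected behaviour of precision-weighted fusion.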

  17. Multi Carrier Modulator for Switch-Mode Audio Power Amplifiers

    DEFF Research Database (Denmark)

    Knott, Arnold; Pfaffinger, Gerhard; Andersen, Michael Andreas E.

    2008-01-01

    -mode audio power amplifiers while keeping the performance measures to excellent levels is therefore of high general interest. A modulator utilizing multiple carrier signals to generate a two level pulse train will be shown in this paper. The performance of the modulator will be compared in simulation...

  18. Comparative study of Audio-lingual method and CLT

    Institute of Scientific and Technical Information of China (English)

    2013-01-01

    Various teaching methods and approaches have been proposed for language teaching, but no single approach is a one-for-all solution good enough to be used as the standard of teaching. Among the many methods, this paper mainly concerns the audio-lingual method and CLT.

  19. Current-Driven Switch-Mode Audio Power Amplifiers

    DEFF Research Database (Denmark)

    Knott, Arnold; Buhl, Niels Christian; Andersen, Michael A. E.

    2012-01-01

    The conversion of electrical energy into sound waves by electromechanical transducers is proportional to the current through the coil of the transducer. However virtually all audio power amplifiers provide a controlled voltage through the interface to the transducer. This paper is presenting a sw...

  20. Audio-Visual Perception System for a Humanoid Robotic Head

    Directory of Open Access Journals (Sweden)

    Raquel Viciana-Abad

    2014-05-01

    One of the main issues within the field of social robotics is to endow robots with the ability to direct attention to people with whom they are interacting. Different approaches follow bio-inspired mechanisms, merging audio and visual cues to localize a person using multiple sensors. However, most of these fusion mechanisms have been used in fixed systems, such as those used in video-conference rooms, and thus, they may incur difficulties when constrained to the sensors with which a robot can be equipped. Besides, within the scope of interactive autonomous robots, there is a lack in terms of evaluating the benefits of audio-visual attention mechanisms, compared to only audio or visual approaches, in real scenarios. Most of the tests conducted have been within controlled environments, at short distances and/or with off-line performance measurements. With the goal of demonstrating the benefit of fusing sensory information with a Bayes inference for interactive robotics, this paper presents a system for localizing a person by processing visual and audio data. Moreover, the performance of this system is evaluated and compared via considering the technical limitations of unimodal systems. The experiments show the promise of the proposed approach for the proactive detection and tracking of speakers in a human-robot interactive framework.

  1. Audio-visual perception system for a humanoid robotic head.

    Science.gov (United States)

    Viciana-Abad, Raquel; Marfil, Rebeca; Perez-Lorenzo, Jose M; Bandera, Juan P; Romero-Garces, Adrian; Reche-Lopez, Pedro

    2014-01-01

    One of the main issues within the field of social robotics is to endow robots with the ability to direct attention to people with whom they are interacting. Different approaches follow bio-inspired mechanisms, merging audio and visual cues to localize a person using multiple sensors. However, most of these fusion mechanisms have been used in fixed systems, such as those used in video-conference rooms, and thus, they may incur difficulties when constrained to the sensors with which a robot can be equipped. Besides, within the scope of interactive autonomous robots, there is a lack in terms of evaluating the benefits of audio-visual attention mechanisms, compared to only audio or visual approaches, in real scenarios. Most of the tests conducted have been within controlled environments, at short distances and/or with off-line performance measurements. With the goal of demonstrating the benefit of fusing sensory information with a Bayes inference for interactive robotics, this paper presents a system for localizing a person by processing visual and audio data. Moreover, the performance of this system is evaluated and compared via considering the technical limitations of unimodal systems. The experiments show the promise of the proposed approach for the proactive detection and tracking of speakers in a human-robot interactive framework.

  2. Audio and Video Reflections to Promote Social Justice

    Science.gov (United States)

    Boske, Christa

    2011-01-01

    Purpose: The purpose of this paper is to examine how 15 graduate students enrolled in a US school leadership preparation program understand issues of social justice and equity through a reflective process utilizing audio and/or video software. Design/methodology/approach: The study is based on the tradition of grounded theory. The researcher…

  3. A listening test system for automotive audio - listeners

    DEFF Research Database (Denmark)

    Choisel, Sylvain; Hegarty, Patrick; Christensen, Flemming;

    2007-01-01

    A series of experiments was conducted in order to validate an experimental procedure to perform listening tests on car audio systems in a simulation of the car environment in a laboratory, using binaural synthesis with head-tracking. Seven experts and 40 non-expert listeners rated a range...

  4. Towards a universal representation for audio information retrieval and analysis

    DEFF Research Database (Denmark)

    Jensen, Bjørn Sand; Troelsgaard, Rasmus; Larsen, Jan;

    2013-01-01

    A fundamental and general representation of audio and music which integrates multi-modal data sources is important for both application and basic research purposes. In this paper we address this challenge by proposing a multi-modal version of the Latent Dirichlet Allocation model which provides a...

  5. Audio-Visual Aid in Teaching "Fatty Liver"

    Science.gov (United States)

    Dash, Sambit; Kamath, Ullas; Rao, Guruprasad; Prakash, Jay; Mishra, Snigdha

    2016-01-01

    Use of audio visual tools to aid in medical education is ever on a rise. Our study intends to find the efficacy of a video prepared on "fatty liver," a topic that is often a challenge for pre-clinical teachers, in enhancing cognitive processing and ultimately learning. We prepared a video presentation of 11:36 min, incorporating various…

  6. Audio-Described Educational Materials: Ugandan Teachers' Experiences

    Science.gov (United States)

    Wormnaes, Siri; Sellaeg, Nina

    2013-01-01

    This article describes and discusses a qualitative, descriptive, and exploratory study of how 12 visually impaired teachers in Uganda experienced audio-described educational video material for teachers and student teachers. The study is based upon interviews with these teachers and observations while they were using the material either…

  7. The Single- and Multichannel Audio Recordings Database (SMARD)

    DEFF Research Database (Denmark)

    Nielsen, Jesper Kjær; Jensen, Jesper Rindom; Jensen, Søren Holdt;

    2014-01-01

    A new single- and multichannel audio recordings database (SMARD) is presented in this paper. The database contains recordings from a box-shaped listening room for various loudspeaker and array types. The recordings were made for 48 different configurations of three different loudspeakers and four...

  8. Possible technical solutions to reduce energy consumption in audio products

    Energy Technology Data Exchange (ETDEWEB)

    Nielsen, K.; Andersen, M.A.E.

    1999-07-01

    In common audio products nearly all the supplied power is dissipated as heat. The major consumers are, with almost no exception, the power supply and the audio amplifier. This paper is divided into two parts, concentrating on typical efficiency measures for today's concepts and on the possible technical solutions by which the overall efficiency can be considerably improved in the future. Traditional power supplies are made using a transformer operating at the mains frequency followed by a linear regulator. These are bulky and the efficiency is only around 40%. Using high-frequency switch-mode power supplies, the size of the power supply can be reduced and the efficiency can be increased to 80-90%. Construction of amplifiers that are optimal with regard to total energy consumption over their lifetime can only be accomplished by considering both the general volume control distribution and the general spectral amplitude distribution of audio signals. The traditional efficiency measure, specified at the maximum efficiency level, says very little about the real energy consumption of the audio amplifier. As an example, the theoretical efficiency of a traditional class B amplifier is 78%. Using a new efficiency measure defined on the basis of the approximate volume control distribution, a 50 W amplifier example shows an overall efficiency of only 1%. In the paper, possible solutions and guidelines to increase the real amplifier efficiency are given. (au)
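
    The gap between the 78% figure and the real-world efficiency can be illustrated with the textbook model of an ideal class B stage driven by a sine wave, where the efficiency is pi/4 times the ratio of output amplitude to supply voltage. The sketch below is an illustration only, not the efficiency measure defined in the paper: the function names and the `usage` format are hypothetical, and quiescent losses of a real amplifier are ignored.

```python
import math


def class_b_efficiency(k):
    """Ideal class B efficiency for a sine output of amplitude k * Vcc (0 <= k <= 1)."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must be between 0 and 1")
    return math.pi * k / 4.0


def energy_efficiency(usage):
    """Delivered energy divided by supplied energy over a listening profile.

    `usage` is a list of (k, hours) pairs; powers are normalized by Vcc^2 / R.
    """
    delivered = sum(hours * k * k / 2.0 for k, hours in usage)
    supplied = sum(hours * 2.0 * k / math.pi for k, hours in usage)
    return delivered / supplied if supplied else 0.0
```

    At full output (`k = 1`) the model gives about 78.5%, while at an amplitude 40 dB below full scale (`k = 0.01`) it gives under 1%, which shows why a rating at maximum level says little about typical listening conditions.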

  9. Market potential for interactive audio-visual media

    NARCIS (Netherlands)

    Leurdijk, A.; Limonard, S.

    2005-01-01

    NM2 (New Media for a New Millennium) develops tools for interactive, personalised and non-linear audio-visual content that will be tested in seven pilot productions. This paper looks at the market potential for these productions from a technological, a business and a users' perspective. It shows tha

  10. A Power Efficient Audio Amplifier Combining Switching and Linear Techniques

    NARCIS (Netherlands)

    Zee, van der R.A.R.; Tuijl, van A.J.M.

    1998-01-01

    Integrated Class D audio amplifiers are very power efficient, but require an external filter which prevents further integration. Also due to this filter, large feedback factors are hard to realise, so that the load influences the distortion- and transfer characteristics. The amplifier presented in t

  11. Application of Feature Space in Extraction of Asparagus Planting Area Using Remote Sensing

    Institute of Scientific and Technical Information of China (English)

    王猛; 隋学艳; 梁守真; 姚慧敏; 侯学会

    2016-01-01

    Extraction of cash crop planting areas by traditional remote sensing methods has been widely carried out, but these methods are not well suited to extracting the planting area of asparagus. To address the deficiency of existing methods, this paper takes the characteristics of asparagus cultivation into account. Taking Caoxian county (Shandong Province) as the study area, it develops a method for extracting the asparagus planting area from Landsat 8 images. By comparing the NDVI (normalized difference vegetation index) of the asparagus planting area with that of other objects, water and wheat fields are first removed by NDVI threshold segmentation. The two-dimensional feature space of the asparagus planting area, buildings, and roads is then analyzed to find the distribution of the soil line, and thresholds for the asparagus planting area are determined from band-math results. Finally, the asparagus planting area is extracted using the defined thresholds. The results show that the asparagus planting area in Caoxian county is 14626.55 ha, with an overall accuracy of 84.85%.
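
    The NDVI threshold segmentation step can be sketched as below. NDVI is the standard index (NIR - Red) / (NIR + Red); everything else is an assumption made for illustration: the bands are passed as flat lists of reflectance values, and the thresholds `low` and `high` are hypothetical, since the abstract does not report the values used.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def mask_by_ndvi(nir_band, red_band, low, high):
    """Keep pixels whose NDVI lies in [low, high].

    Pixels below `low` (e.g. water) or above `high` (e.g. dense wheat)
    are masked out, mirroring a two-sided threshold segmentation.
    """
    return [low <= ndvi(n, r) <= high for n, r in zip(nir_band, red_band)]
```

    For Landsat 8 OLI data the near-infrared and red inputs would typically be bands 5 and 4; the remaining candidate pixels would then go through the feature-space analysis described in the abstract.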

  12. Space space space

    CERN Document Server

    Trembach, Vera

    2014-01-01

    Space is an introduction to the mysteries of the Universe. Included are Task Cards for independent learning, Journal Word Cards for creative writing, and Hands-On Activities for reinforcing skills in Math and Language Arts. Space is a perfect introduction to further research of the Solar System.

  13. An introduction to audio content analysis applications in signal processing and music informatics

    CERN Document Server

    Lerch, Alexander

    2012-01-01

    "With the proliferation of digital audio distribution over digital media, audio content analysis is fast becoming a requirement for designers of intelligent signal-adaptive audio processing systems. Written by a well-known expert in the field, this book provides quick access to different analysis algorithms and allows comparison between different approaches to the same task, making it useful for newcomers to audio signal processing and industry experts alike. A review of relevant fundamentals in audio signal processing, psychoacoustics, and music theory, as well as downloadable MATLAB files are also included"--

  14. Subjective audio quality evaluation of embedded-optimization-based distortion precompensation algorithms.

    Science.gov (United States)

    Defraene, Bruno; van Waterschoot, Toon; Diehl, Moritz; Moonen, Marc

    2016-07-01

    Subjective audio quality evaluation experiments have been conducted to assess the performance of embedded-optimization-based precompensation algorithms for mitigating perceptible linear and nonlinear distortion in audio signals. It is concluded with statistical significance that the perceived audio quality is improved by applying an embedded-optimization-based precompensation algorithm, both in case (i) nonlinear distortion and (ii) a combination of linear and nonlinear distortion is present. Moreover, a significant positive correlation is reported between the collected subjective and objective PEAQ audio quality scores, supporting the validity of using PEAQ to predict the impact of linear and nonlinear distortion on the perceived audio quality. PMID:27475197

  15. A Single Core Hardware Approach of MPEG Audio Decoder for Real-Time Transmission

    Directory of Open Access Journals (Sweden)

    M.B.I. Reaz

    2012-04-01

    The decoding of the voice audio bit stream is an issue for real-time transmission of high-quality voice audio over the Internet. A stand-alone chip that performs the decoding is a better solution than a software approach. MPEG audio compression provides high compression with minimal loss. This study describes a VHDL model of an MPEG audio layer 1 decoder that performs concurrent processing while receiving a voice-quality audio input bit stream at a constant bit rate and simultaneously producing a stream of 8-bit monopole PCM samples at a constant sampling frequency in real time.

  16. From ITU-T G.722.1 to ITU-T G.722.1 Annex C: A New Low-Complexity 14kHz Bandwidth Audio Coding Standard

    Directory of Open Access Journals (Sweden)

    Minjie Xie

    2007-04-01

    This paper describes the low-complexity 14 kHz bandwidth audio coding algorithm which has been recently standardized by ITU-T as Recommendation G.722.1 Annex C (“G.722.1C”). The algorithm is an extension to ITU-T Recommendation G.722.1 and a doubled form of the G.722.1 algorithm to permit 14 kHz audio bandwidth using a 32 kHz audio sample rate, at 24, 32, and 48 kbit/s. The G.722.1C codec features very high audio quality, extremely low computational complexity, and low algorithmic delay compared to other state-of-the-art audio coding algorithms. This codec is suitable for use in video conferencing, teleconferencing, and Internet streaming applications, as well as a general-purpose 14 kHz audio codec. Subjective test results from the Characterization phase of G.722.1C are also presented in the paper.

  17. Maintaining high-quality IP audio services in lossy IP network environments

    Science.gov (United States)

    Barton, Robert J., III; Chodura, Hartmut

    2000-07-01

    In this paper we present our research activities in the area of digital audio processing and transmission. Currently available teleconference audio solutions lack flexibility, robustness, and fidelity. The quality of audio for IP-based applications needed to be enhanced to guarantee optimal services under varying conditions. Multiple tests and user evaluations have shown that a reliable audio communication toolkit is essential for any teleconference application. This paper summarizes our research activities and gives an overview of the applications developed. As a first step, the parameters that influence audio quality were evaluated. All of these parameters have to be optimized to obtain the best achievable quality, so it was necessary to enhance existing schemes or develop new methods. Applications were developed for Internet telephony, broadcast of live music, and spatial audio for Virtual Reality environments. This paper describes these applications and the issues of delivering high-quality digital audio services over lossy IP networks.

  18. On the relevance of spectral features for instrument classification

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch; Sigurdsson, Sigurdur; Hansen, Lars Kai;

    2007-01-01

    classification operate normally on a relatively large number of features, from which those related to the spectrum of the audio signal are particularly relevant. In this paper, we confront two different models about the spectral characterization of musical instruments. The first assumes a constant envelope...

  19. A robust zero-watermarking algorithm for audio

    Institute of Scientific and Technical Information of China (English)

    崔得龙; 左敬龙; 彭志平

    2011-01-01

    To protect the copyright of digital audio, this paper proposes an audio zero-watermarking scheme based on the normed centre of gravity (NCG) and the lifting wavelet transform, exploiting the short-time stationarity of audio signals and the multi-resolution analysis property of the discrete wavelet transform. First, the audio signal is divided into frames of equal length. Second, a three-level lifting wavelet transform is applied to each frame, the NCG of the low-frequency approximation component is extracted, and a feature vector is generated from the relationship between the NCGs and the mean value vector. Finally, the copyright information representing the original audio is obtained by combining the feature vector with the watermark. Experimental results show that the scheme is robust against common signal processing attacks on audio, while the use of secret keys guarantees the security of the algorithm.
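
The steps described above can be sketched roughly as follows. This is a simplified illustration under several assumptions, not the authors' implementation: a Haar wavelet stands in for the unspecified lifting wavelet, a magnitude-weighted mean index stands in for the NCG (whose definition the abstract does not give), the feature bit is set by comparing each frame's value with the mean over all frames, and XOR is assumed as the operation that combines the feature vector with the watermark. All function names are hypothetical.

```python
import random


def haar_lifting_step(x):
    """One Haar wavelet level via lifting: split, predict, update."""
    even, odd = x[0::2], x[1::2]
    detail = [o - e for e, o in zip(even, odd)]         # predict step
    approx = [e + d / 2 for e, d in zip(even, detail)]  # update step (pair mean)
    return approx, detail


def low_band(frame, levels=3):
    """Low-frequency approximation after `levels` lifting steps."""
    approx = list(frame)
    for _level in range(levels):
        approx, _detail = haar_lifting_step(approx)
    return approx


def centroid(coeffs):
    """Assumed stand-in for the NCG: magnitude-weighted mean index."""
    total = sum(abs(c) for c in coeffs)
    if total == 0:
        return 0.0
    return sum(i * abs(c) for i, c in enumerate(coeffs)) / total


def feature_bits(signal, n_bits, frame_len=64):
    """One bit per frame: 1 if the frame's centroid is at or above the mean."""
    if frame_len % 8 != 0 or len(signal) < n_bits * frame_len:
        raise ValueError("frame_len must be a multiple of 8 and the signal long enough")
    cents = [centroid(low_band(signal[k * frame_len:(k + 1) * frame_len]))
             for k in range(n_bits)]
    mean = sum(cents) / len(cents)
    return [1 if c >= mean else 0 for c in cents]


def make_copyright_key(signal, watermark, frame_len=64):
    """Combine features with the watermark (XOR assumed); the audio is untouched."""
    bits = feature_bits(signal, len(watermark), frame_len)
    return [b ^ w for b, w in zip(bits, watermark)]


def recover_watermark(signal, key, frame_len=64):
    """Recompute the features from the audio and XOR them with the stored key."""
    bits = feature_bits(signal, len(key), frame_len)
    return [b ^ k for b, k in zip(bits, key)]


if __name__ == "__main__":
    rng = random.Random(0)
    audio = [rng.uniform(-1.0, 1.0) for _ in range(64 * 8)]  # synthetic test signal
    mark = [1, 0, 1, 1, 0, 0, 1, 0]
    key = make_copyright_key(audio, mark)
    print(recover_watermark(audio, key) == mark)
```

Because nothing is embedded in the audio itself, the stored key plays the role of the registered copyright information. Robustness then depends on how stable the per-frame features are under attack, which this sketch does not evaluate.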

  20. Applying Spatial Audio to Human Interfaces: 25 Years of NASA Experience

    Science.gov (United States)

    Begault, Durand R.; Wenzel, Elizabeth M.; Godfrey, Martine; Miller, Joel D.; Anderson, Mark R.

    2010-01-01

    From the perspective of human factors engineering, the inclusion of spatial audio within a human-machine interface is advantageous in several respects. Demonstrated benefits include the ability to monitor multiple streams of speech and non-speech warning tones using a cocktail party advantage, and support for aurally guided visual search. Other potential benefits include the spatial coordination and interaction of multimodal events, and the evaluation of new communication technologies and alerting systems using virtual simulation. Many of these technologies were developed at NASA Ames Research Center, beginning in 1985. This paper reviews examples and describes the advantages of spatial sound in NASA-related technologies, including space operations, aeronautics, and search and rescue. The work has involved hardware and software development as well as basic and applied research.